greninja/cuda_programs

About

Welcome! This repository is a collection of my experiments and examples as I learn about GPU programming and parallel computing with CUDA. Here you'll find code exploring everything from simple vector addition to more advanced parallel algorithms.

I am following this book to get a grasp of CUDA fundamentals and more advanced topics:

Programming Massively Parallel Processors: A Hands-on Approach by Wen-mei W. Hwu, David B. Kirk, and Izzat El Hajj

Requirements

  • NVIDIA GPU with CUDA support
  • CUDA Toolkit
  • NVCC (NVIDIA CUDA Compiler), which ships with the CUDA Toolkit
  • A host C++ compiler supported by NVCC (for example GCC, Clang, or MSVC)
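
One way to confirm that these requirements are met is to compile and run a small device query with NVCC. The sketch below is not part of this repository; the helper name `countCudaDevices` is hypothetical, and it uses only CUDA runtime API calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the number of CUDA devices, or -1 on error.
int countCudaDevices() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    return deviceCount;
}

int main() {
    int deviceCount = countCudaDevices();
    if (deviceCount <= 0) {
        std::fprintf(stderr, "No usable CUDA device found.\n");
        return 1;
    }

    // Report the runtime version the program was built against.
    int runtimeVersion = 0;
    cudaRuntimeGetVersion(&runtimeVersion);
    std::printf("CUDA runtime version: %d\n", runtimeVersion);

    // Print the name and compute capability of each device.
    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) == cudaSuccess) {
            std::printf("Device %d: %s (compute capability %d.%d)\n",
                        i, prop.name, prop.major, prop.minor);
        }
    }
    return 0;
}
```

If this compiles with NVCC and lists at least one device, the toolkit, the host compiler, and the GPU driver are all working together.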

Programs

| Program | Description | Page Link |
| --- | --- | --- |
| Vector Addition | Basic CUDA program demonstrating parallel vector addition. Each thread computes one element of the result vector. | README |
| Matrix Multiplication | Matrix multiplication implementation with two versions: naive kernel and tiled kernel using shared memory. Demonstrates key CUDA concepts like shared memory, tiling, and memory coalescing. | README |
| One-Head Attention | Implementation of scaled dot-product attention mechanism using CUDA. Computes Attention(Q, K, V) = softmax(QK^T / √d) × V using multiple optimized CUDA kernels. | README |
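
To illustrate the "one thread per element" idea from the vector addition entry, here is a minimal sketch. It is not the code from this repository: the names `vecAddKernel`, `vecAdd`, and `CUDA_CHECK`, and the block size of 256 threads, are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-checking macro: abort with a message on any CUDA error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Each thread computes one element of the result: c[i] = a[i] + b[i].
__global__ void vecAddKernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {  // guard: the last block may have threads past the end
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: copies inputs to the GPU, launches the kernel,
// and copies the result back. Assumes a and b have the same length.
std::vector<float> vecAdd(const std::vector<float>& a,
                          const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0) return c;  // nothing to do; avoids a zero-block launch

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));
    CUDA_CHECK(cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice));

    int threads = 256;                         // threads per block (assumed)
    int blocks = (n + threads - 1) / threads;  // ceil(n / threads)
    vecAddKernel<<<blocks, threads>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
    return c;
}

int main() {
    const int n = 1000;
    std::vector<float> a(n, 1.0f), b(n, 2.0f);
    std::vector<float> c = vecAdd(a, b);
    std::printf("c[0] = %.1f, c[%d] = %.1f\n", c[0], n - 1, c[n - 1]);
    return 0;
}
```

The grid size is rounded up so that every element gets a thread, which is why the kernel needs the `i < n` bounds check. The naive matrix multiplication kernel uses the same indexing pattern in two dimensions.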

