Project 2: Vismay Churiwala #47

Open
vismaychuriwala wants to merge 15 commits into CIS5650-Fall-2025:main from vismaychuriwala:main

@vismaychuriwala vismaychuriwala commented Sep 17, 2025

  • https://github.com/vismaychuriwala/CUDA-Stream-Compaction
  • Here are some cool features I have implemented:
    • Using shared memory (SM) for efficient memory access on the GPU (as opposed to global memory, which is more than 100× slower).
    • Hardware-level optimization by avoiding shared-memory bank conflicts.
    • Recursive scanning to handle arrays of arbitrary size; tested up to 2^30 (about 1 billion) elements, which took 3470.59 ms.
    • Radix sort built on parallel scan.
    • Naive and CPU-based implementations to compare and benchmark techniques.
    • Customized testing code to collect, average, and plot GPU and CPU timings.
  • Feedback: A tip that multi-block compaction is the crux of the assignment would have helped. I thought I had finished days ago, until I realized on the last day that my code didn't work on large arrays. I probably could have worked that out myself; it just didn't occur to me. Also, making shared memory (SM) compulsory wouldn't be too bad and might be worth considering.
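
For readers unfamiliar with the recursive multi-block scan and the compaction built on it, here is a minimal CPU sketch in plain C++. It is not the code from this PR: the names `kBlockSize`, `exclusiveScan`, and `compact` are hypothetical, the tile size of 4 is arbitrary, and each loop over a block only stands in for what a GPU thread block would do in shared memory. It also assumes compaction means dropping zero elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical tile size; a GPU version would tie this to the thread-block size.
constexpr std::size_t kBlockSize = 4;

// Exclusive prefix sum over an input of any length, processed block by block.
std::vector<int> exclusiveScan(const std::vector<int>& in) {
    const std::size_t n = in.size();
    std::vector<int> out(n, 0);
    if (n == 0) return out;

    const std::size_t numBlocks = (n + kBlockSize - 1) / kBlockSize;
    std::vector<int> blockSums(numBlocks, 0);

    // Step 1: scan each block independently and record the block's total.
    for (std::size_t b = 0; b < numBlocks; ++b) {
        const std::size_t begin = b * kBlockSize;
        const std::size_t end = std::min(begin + kBlockSize, n);
        int running = 0;
        for (std::size_t i = begin; i < end; ++i) {
            out[i] = running;
            running += in[i];
        }
        blockSums[b] = running;
    }

    // Step 2: if there is more than one block, scan the block totals.
    // This is the recursive call; it ends once the totals fit in one block.
    if (numBlocks > 1) {
        const std::vector<int> offsets = exclusiveScan(blockSums);

        // Step 3: add each block's offset to every element in that block.
        for (std::size_t i = 0; i < n; ++i) {
            out[i] += offsets[i / kBlockSize];
        }
    }
    return out;
}

// Stream compaction (assumed here to mean removing zeros):
// flag the kept elements, scan the flags, then scatter.
std::vector<int> compact(const std::vector<int>& in) {
    std::vector<int> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        flags[i] = (in[i] != 0) ? 1 : 0;
    }

    // The scanned flags give each kept element its output index.
    const std::vector<int> indices = exclusiveScan(flags);
    const std::size_t count =
        in.empty() ? 0 : static_cast<std::size_t>(indices.back() + flags.back());

    std::vector<int> out(count);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) out[indices[i]] = in[i];
    }
    return out;
}
```

A single-block scan alone only covers inputs that fit in one block. Steps 2 and 3 are what extend it to large arrays, which matches the failure mode described in the feedback above.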
