Project 2: Yunhao Qian#36

Open
yunhao-qian wants to merge 12 commits into CIS5650-Fall-2025:main from yunhao-qian:main

@yunhao-qian

  • Repo Link
  • Features:
    • A CPU implementation of scan and stream compaction
    • GPU implementations of scan, using both naive and work-efficient methods
    • A GPU implementation of stream compaction based on the work-efficient scan
    • An optimized work-efficient scan that uses shared memory along with several additional optimizations
    • C++ and Python scripts that automate performance measurement
    • Performance analysis comparing the different methods
    • More in-depth performance analysis conducted with Nsight
  • Extra credits: all of Part 5 and Part 6 have been completed.
  • Changes to CMakeLists.txt:
    • An additional executable, measure_time.exe, has been added to the project to support block size tuning, performance benchmarking, and profiling.
    • The files cpu_sort.h, cpu_sort.cu, radix_sort.h, and radix_sort.cu were introduced to implement Extra Credit 1.
    • The files efficient_plus.h and efficient_plus.cu were introduced to implement Extra Credit 2.
  • Function Overloads: To make block size tuning easier, I added overloads of the following functions that accept an additional blockSize parameter. The original overloads remain unchanged and simply forward to the new versions with a tuned default value. These changes are only used by measure_time.exe and do not affect existing code paths.
    • Naive::scan(..., const int blockSize)
    • Efficient::scan(..., const int blockSize)
    • Efficient::compact(..., const int blockSize)
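
The overload pattern described above can be sketched roughly as follows. This is a CPU-only illustration, not the code from this PR: the `Sketch` namespace, the `(n, odata, idata)` parameter list, and the `kDefaultBlockSize` value are all assumptions, since the description elides the original parameters and does not state the tuned default. The scan body is a plain sequential exclusive scan that walks the input in chunks of `blockSize`, standing in for the GPU kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace Sketch {

// Hypothetical default. The PR says the real default is tuned,
// but does not state its value.
constexpr int kDefaultBlockSize = 128;

// New overload: exclusive scan with an explicit block size.
// CPU stand-in for a GPU kernel launch: the input is walked in chunks of
// blockSize elements, carrying the running sum from one chunk to the next.
// The result does not depend on blockSize.
void scan(int n, int *odata, const int *idata, const int blockSize) {
    assert(blockSize > 0);
    int carry = 0;
    for (int start = 0; start < n; start += blockSize) {
        const int end = std::min(n, start + blockSize);
        for (int i = start; i < end; ++i) {
            odata[i] = carry;   // sum of all elements before index i
            carry += idata[i];
        }
    }
}

// Original overload: signature unchanged, forwards with the default.
void scan(int n, int *odata, const int *idata) {
    scan(n, odata, idata, kDefaultBlockSize);
}

// New overload: scan-based stream compaction that drops zero elements.
// Returns the number of elements written to odata.
int compact(int n, int *odata, const int *idata, const int blockSize) {
    if (n <= 0) {
        return 0;
    }
    std::vector<int> flags(n);
    std::vector<int> indices(n);
    for (int i = 0; i < n; ++i) {
        flags[i] = (idata[i] != 0) ? 1 : 0;  // 1 = keep, 0 = discard
    }
    scan(n, indices.data(), flags.data(), blockSize);
    for (int i = 0; i < n; ++i) {
        if (flags[i]) {
            odata[indices[i]] = idata[i];    // scatter to the scanned index
        }
    }
    return indices[n - 1] + flags[n - 1];
}

// Original overload: signature unchanged, forwards with the default.
int compact(int n, int *odata, const int *idata) {
    return compact(n, odata, idata, kDefaultBlockSize);
}

}  // namespace Sketch
```

Under these assumptions, existing callers keep using `Sketch::scan(n, out, in)` unchanged, while a tuning tool such as measure_time.exe could sweep block sizes by calling `Sketch::scan(n, out, in, blockSize)` in a loop.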

@yunhao-qian
Author

Please use the latest "Complete everything" (7b0719c) commit for grading. Thank you!
