Project2: Carlos Lopez Garces#29

Open
carlos-lopez-garces wants to merge 33 commits into CIS5650-Fall-2024:main from carlos-lopez-garces:main


@carlos-lopez-garces carlos-lopez-garces commented Sep 18, 2024

Repo Link

I needed an extra day to complete the performance analysis.

  • Implemented two versions of StreamCompaction::CPU::scan. The first computes the sum sequentially and serves as the reference for performance analysis; the second, used when simulateGPUScan = true, mimics the GPU naive scan algorithm as closely as possible (I wrote it to become familiar with that algorithm).
  • Implemented StreamCompaction::Naive::scan.
    • For extra credit 2, the per-block scan uses shared memory instead of global memory. I did not address bank conflicts, though.
  • Implemented StreamCompaction::Efficient::scan. Used zero-padding to handle non-power-of-2 input sizes; the implementation tries to avoid global memory writes for padding elements during the scan.
  • Implemented StreamCompaction::Thrust::scan.
  • Implemented StreamCompaction::CPU::compactWithoutScan and StreamCompaction::CPU::compactWithScan.
  • Included an execution-time comparison of all the scan implementations.
  • Implemented StreamCompaction::Efficient::compact.
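
For readers unfamiliar with the pieces listed above, here is a minimal host-side C++ sketch of three of them: a sequential exclusive scan, a CPU simulation of a naive (Hillis-Steele style) scan, and compaction built on a scan. It is not the code in this PR. The function names exclusiveScan, simulatedNaiveScan, and compactWithScan are hypothetical, and the sketch assumes that the scan is exclusive and that compaction means dropping zero elements.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sequential exclusive scan: out[i] = in[0] + ... + in[i-1], with out[0] = 0.
std::vector<int> exclusiveScan(const std::vector<int>& in) {
    std::vector<int> out(in.size(), 0);
    for (std::size_t i = 1; i < in.size(); ++i) {
        out[i] = out[i - 1] + in[i - 1];
    }
    return out;
}

// CPU simulation of a naive scan: about log2(n) passes, each reading from one
// buffer and writing to the other, the way one kernel launch per pass would.
std::vector<int> simulatedNaiveScan(const std::vector<int>& in) {
    const std::size_t n = in.size();
    std::vector<int> a(in), b(n, 0);
    for (std::size_t offset = 1; offset < n; offset *= 2) {
        for (std::size_t k = 0; k < n; ++k) {  // one iteration per simulated thread
            b[k] = (k >= offset) ? a[k - offset] + a[k] : a[k];
        }
        std::swap(a, b);  // ping-pong the buffers between passes
    }
    // a now holds the inclusive scan; shift right by one to make it exclusive.
    std::vector<int> out(n, 0);
    for (std::size_t i = 1; i < n; ++i) {
        out[i] = a[i - 1];
    }
    return out;
}

// Compaction via scan (assumption: zeros are the elements to drop).
// Step 1: map to 0/1 flags. Step 2: exclusive scan of flags. Step 3: scatter.
std::vector<int> compactWithScan(const std::vector<int>& in) {
    std::vector<int> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        flags[i] = (in[i] != 0) ? 1 : 0;
    }
    const std::vector<int> indices = exclusiveScan(flags);
    const std::size_t count =
        in.empty() ? 0 : static_cast<std::size_t>(indices.back() + flags.back());
    assert(count <= in.size());
    std::vector<int> out(count);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) {
            out[static_cast<std::size_t>(indices[i])] = in[i];
        }
    }
    return out;
}
```

The simulated version does more additions than the sequential one (on the order of n log n rather than n), which is why the sequential loop is the more natural CPU baseline for timing comparisons.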
