Conversation
|
Hi, I haven't looked super in-depth yet, but ideally in each grid implementation you should only have to do a single sort. Take another look at the diagrams in the instructions page for another approach. You will probably have to add a kernel that we did not spell out for you. I know it's close to the deadline, but this adjustment shouldn't take long! |
|
I just saw that. It actually works pretty well on my end. The sorting is taking place on a reinitialized array for consistency. |
|
Ah okay. Sorry for the mix-up with GitHub notifications. |
Project1 Git Link
redde
Implemented flocking using the naive, sparse grid and coherent grid methods.
Ran tests to compare the methods using Nsight, and used cudaEventRecord, cudaEventSynchronize and cudaEventElapsedTime to compute the average frame time for each method.
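For reference, the event-based timing could look roughly like the sketch below. This is not the code from kernel.cu: `dummyStep` and `averageFrameMs` are hypothetical names, and the kernel is only a stand-in for one simulation step.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical stand-in for one simulation step (not the project's kernel).
__global__ void dummyStep(float *data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] = data[i] * 0.5f + 1.0f;
}

// Launches `frames` steps over n elements and returns the average
// time per step in milliseconds, measured with CUDA events.
float averageFrameMs(int n, int frames) {
  assert(n > 0 && frames > 0);

  float *d = nullptr;
  cudaMalloc(&d, n * sizeof(float));
  cudaMemset(d, 0, n * sizeof(float));

  cudaEvent_t start, stop;
  cudaEventCreate(&start);
  cudaEventCreate(&stop);

  const int block = 128;
  const int grid = (n + block - 1) / block;

  cudaEventRecord(start);                 // marker before the first launch
  for (int f = 0; f < frames; ++f) {
    dummyStep<<<grid, block>>>(d, n);
  }
  cudaEventRecord(stop);                  // marker after the last launch
  cudaEventSynchronize(stop);             // wait until `stop` has been reached

  float totalMs = 0.0f;
  cudaEventElapsedTime(&totalMs, start, stop);

  cudaEventDestroy(start);
  cudaEventDestroy(stop);
  cudaFree(d);
  return totalMs / frames;
}
```

Calling `averageFrameMs(1 << 16, 100)` from `main` gives one number per configuration. Timing many frames between a single pair of events keeps the event overhead out of the per-frame average.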
Speed-wise, the naive method was predictably the slowest. As long as we clamp the minimum cell size to avoid exceeding the maximum GPU memory (see the commented-out line 165 in kernel.cu), the fastest solution is the coherent grid, despite having to sort 2 buffers.
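A possible alternative to sorting 2 buffers, in the spirit of the review comment about adding an extra kernel, is to sort only the boid indices by cell index and then gather positions and velocities into that order. The sketch below is one reading of that idea, not what this PR does; `kernReorder` and `makeCoherent` are hypothetical names, and Thrust is assumed to be available.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>

// Hypothetical gather kernel: slot i of the coherent buffers receives the
// data of the boid that ended up at position i after the sort.
__global__ void kernReorder(int n, const int *sortedBoidIdx,
                            const float3 *pos, const float3 *vel,
                            float3 *posCoherent, float3 *velCoherent) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= n) return;
  int src = sortedBoidIdx[i];
  posCoherent[i] = pos[src];
  velCoherent[i] = vel[src];
}

// Sorts boid indices by cell index (one sort), then gathers pos and vel.
// cellIdx is sorted in place.
void makeCoherent(thrust::device_vector<int> &cellIdx,
                  const thrust::device_vector<float3> &pos,
                  const thrust::device_vector<float3> &vel,
                  thrust::device_vector<float3> &posCoherent,
                  thrust::device_vector<float3> &velCoherent) {
  assert(pos.size() == cellIdx.size() && vel.size() == cellIdx.size());
  const int n = static_cast<int>(cellIdx.size());
  posCoherent.resize(n);
  velCoherent.resize(n);
  if (n == 0) return;

  thrust::device_vector<int> boidIdx(n);
  thrust::sequence(boidIdx.begin(), boidIdx.end());  // 0, 1, ..., n-1
  thrust::sort_by_key(cellIdx.begin(), cellIdx.end(), boidIdx.begin());

  const int block = 128;
  const int grid = (n + block - 1) / block;
  kernReorder<<<grid, block>>>(n,
      thrust::raw_pointer_cast(boidIdx.data()),
      thrust::raw_pointer_cast(pos.data()),
      thrust::raw_pointer_cast(vel.data()),
      thrust::raw_pointer_cast(posCoherent.data()),
      thrust::raw_pointer_cast(velCoherent.data()));
  cudaDeviceSynchronize();
}
```

The trade-off is one extra kernel launch and two extra buffers in exchange for a single key-value sort per frame.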