Add Richardson-Lucy deconvolution benchmark #790
pentschev wants to merge 1 commit into rapidsai:branch-21.12
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##      branch-21.12     #790   +/-  ##
=======================================
  Coverage          ?   90.11%
=======================================
  Files             ?       15
  Lines             ?     1992
  Branches          ?        0
=======================================
  Hits              ?     1795
  Misses            ?      197
  Partials          ?        0
```
```python
def _richardson_lucy(image, psf, im_deconv, psf_mirror):
    conv = _convolve(im_deconv, psf, mode="constant")
    relative_blur = image / conv
    im_deconv *= _convolve(relative_blur, psf_mirror, mode="constant")
    return im_deconv
```
We have a richardson_lucy implementation in cuCIM, which we could use here.

Something worth noting: Richardson-Lucy is an iterative algorithm that converges on a solution, repeatedly entering and leaving Fourier space. So there is a fair bit of computation, which may affect profiling.
Thanks for pointing that out, John! I was trying to reproduce https://github.com/nv-legate/cunumeric/blob/18792f3e988e3240eb10ff6de6d78de7df57d090/examples/richardson_lucy.py#L28-L41, but I now see the mistake I made: instead of iterating on im_deconv, I overwrite it. I'll also take a closer look at the cuCIM implementation and see what I can make of both approaches.
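For reference, the iterative form being discussed can be sketched as follows. This is a NumPy/SciPy analogue (with CuPy one would swap in cupy and cupyx.scipy.ndimage); the function name, iteration count, and the epsilon guard are illustrative, not taken from the PR or from cuCIM.

```python
import numpy as np
from scipy.ndimage import convolve

def richardson_lucy(image, psf, num_iter=10):
    # Flat positive initial estimate, as in the usual formulation.
    im_deconv = np.full(image.shape, 0.5)
    psf_mirror = psf[::-1, ::-1]  # PSF flipped along every axis
    # Iterate on im_deconv rather than computing the update only once.
    for _ in range(num_iter):
        conv = convolve(im_deconv, psf, mode="constant")
        # Small epsilon guards against 0/0 in dark regions.
        relative_blur = image / (conv + 1e-12)
        im_deconv = im_deconv * convolve(relative_blur, psf_mirror, mode="constant")
    return im_deconv

# Tiny demo: blur a point source, then deconvolve it back.
img = np.zeros((16, 16))
img[8, 8] = 1.0
psf = np.ones((3, 3)) / 9.0
blurred = convolve(img, psf, mode="constant")
restored = richardson_lucy(blurred, psf, num_iter=20)
```

Each iteration sharpens the estimate, so after a few iterations the restored image is much more concentrated at the point source than the blurred input.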
Ofc! Yeah, that makes sense. Feel free to grab that code from cuCIM if it helps.

I should add that the convolve call there uses some vendored code, which predates CuPy adding convolve in 9.0.0. So it should be possible to use CuPy directly for that call. Everything else is also straight CuPy, so that should hopefully make it easier to use.
The other interesting thing about this convolve call is that it uses a heuristic to decide whether to do the convolution in Fourier space or real space, depending on which it estimates to be faster. If you determine one is consistently faster for your needs, it may be worth bypassing that autodetection logic and calling the appropriate implementation directly.
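A sketch of pinning the method rather than relying on the heuristic, using SciPy's signal.convolve, whose method= keyword exposes exactly this auto/fft/direct choice (recent CuPy versions offer a similar interface under cupyx.scipy.signal; whether this matches the vendored call in cuCIM is an assumption here):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
image = rng.random((256, 256))
psf = np.ones((5, 5)) / 25.0

# method="auto" lets a heuristic pick; pinning "fft" or "direct"
# skips that estimate when you already know which one wins.
auto = signal.convolve(image, psf, mode="same", method="auto")
fourier = signal.convolve(image, psf, mode="same", method="fft")
direct = signal.convolve(image, psf, mode="same", method="direct")
```

All three produce the same result up to floating-point error; only the execution path differs, which is what matters for a benchmark.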
One last thought: since their benchmark used a warm-up run, we might want to do the same. After all, CuPy creates its kernels on the first run, so it only seems fair to do the same thing here.
This PR has been labeled
No description provided.