Use joblib for robust parallel import scanning #209

seddonym merged 1 commit into python-grimp:joblib-multiprocessing
Conversation
  # This is an arbitrary number, but setting it too low slows down our functional tests considerably.
- MIN_NUMBER_OF_MODULES_TO_SCAN_USING_MULTIPROCESSING = 50
+ MIN_NUMBER_OF_MODULES_TO_SCAN_USING_MULTIPLE_PROCESSES = 64
🐼🐼🐼 My OCD => 64 is a power of two, just feels nicer
Not sure we should include this change here, at least not without something in the changelog as it could have an impact on end users. On balance I'd prefer to concentrate on fixing the issue without making another change at the same time.
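For context, this constant only gates whether multiple processes are used at all. Pieced together from the hunks quoted in this PR (so it may not match the merged code exactly), the decision looks roughly like this:

```python
import joblib

# Threshold under discussion: below it, process start-up overhead outweighs
# any speed-up from scanning the module files in parallel.
MIN_NUMBER_OF_MODULES_TO_SCAN_USING_MULTIPLE_PROCESSES = 64


def _decide_number_of_processes(number_of_module_files: int) -> int:
    if number_of_module_files < MIN_NUMBER_OF_MODULES_TO_SCAN_USING_MULTIPLE_PROCESSES:
        # Don't incur the overhead of multiple processes.
        return 1
    return min(joblib.cpu_count(), number_of_module_files)
```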
  # This is an arbitrary number, but setting it too low slows down our functional tests considerably.
  MIN_NUMBER_OF_MODULES_TO_SCAN_USING_MULTIPROCESSING = 50
MULTIPLE_PROCESSES instead of MULTIPROCESSING, to avoid confusion with the multiprocessing module.
  module_files_tuple = tuple(module_files)

  number_of_module_files = len(module_files_tuple)
  n_chunks = _decide_number_of_of_processes(number_of_module_files)
Typo (should really have put that in a separate commit...)
  # Don't incur the overhead of multiple processes.
  return 1
- return min(multiprocessing.cpu_count(), number_of_module_files)
+ return min(joblib.cpu_count(), number_of_module_files)
joblib.cpu_count() is a thin wrapper around loky.cpu_count(): https://github.com/joblib/loky/blob/d8bb877b94214883c2b69cf85ae375ea53d6cd17/loky/backend/context.py#L78
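A quick way to see the difference (a standalone illustration, not grimp code; the numbers printed depend on the environment):

```python
import os

import joblib

# joblib.cpu_count() delegates to loky, which accounts for CPU affinity and
# container CPU quotas rather than just reporting the machine's core count,
# so it is a more robust basis for sizing a pool of worker processes.
print("os.cpu_count():", os.cpu_count())
print("joblib.cpu_count():", joblib.cpu_count())
```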
- if number_of_processes == 1:
-     # No need to spawn a process if there's only one chunk.
-     [chunk] = chunks
-     return _scan_chunk(import_scanner, exclude_type_checking_imports, chunk)
No need for this special number_of_processes == 1 case anymore - joblib will do this automatically. https://joblib.readthedocs.io/en/stable/generated/joblib.Parallel.html#joblib.Parallel
From the docs: "If 1 is given, no parallel computing code is used at all, and the behavior amounts to a simple python for loop."
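A tiny standalone demonstration of that documented behaviour (not grimp code):

```python
import joblib


def square(value):
    return value * value


# With n_jobs=1, joblib skips the worker-process machinery and runs the tasks
# as a plain loop in the current process, so a hand-rolled single-chunk branch
# is no longer needed.
results = joblib.Parallel(n_jobs=1)(joblib.delayed(square)(i) for i in range(4))
assert results == [0, 1, 4, 9]
```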
CodSpeed Instrumentation Performance Report: merging #209 will not alter performance.
https://joblib.readthedocs.io/en/stable/parallel.html

Joblib takes care of some things for us. Relevant here:

* Robust calculation of the number of available CPUs.
* Sequential execution when n_jobs = 1.

And likely other minor things I don't even understand (joblib has put a lot of thought into multi-processing, so we don't have to).
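Putting those pieces together, the overall shape of the change is roughly the following sketch. The helpers (_create_chunks, _scan_chunk) and the flat return type are hypothetical stand-ins for grimp's internals, not the actual implementation:

```python
import joblib


def _create_chunks(module_files, n_chunks):
    # Hypothetical helper: split the files into n_chunks roughly equal chunks.
    return [module_files[i::n_chunks] for i in range(n_chunks)]


def _scan_chunk(chunk):
    # Hypothetical helper: scan one chunk of module files for imports.
    return [f"imports of {module_file}" for module_file in chunk]


def scan_for_imports(module_files, min_files_for_multiple_processes=64):
    if len(module_files) < min_files_for_multiple_processes:
        n_processes = 1  # Don't incur the overhead of multiple processes.
    else:
        n_processes = min(joblib.cpu_count(), len(module_files))
    chunks = _create_chunks(module_files, n_processes)
    # joblib runs this sequentially when n_jobs=1, and in separate worker
    # processes (via the default loky backend) otherwise.
    results = joblib.Parallel(n_jobs=n_processes)(
        joblib.delayed(_scan_chunk)(chunk) for chunk in chunks
    )
    return [imports for chunk_result in results for imports in chunk_result]
```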
417753f into python-grimp:joblib-multiprocessing

seddonym left a comment
Thanks for this - I've merged into a branch in my repo and will make a couple of tweaks before merging to master.