Don't parallelize scanning for small codebases by seddonym · Pull Request #207 · python-grimp/grimp

seddonym · 2025-04-11T11:12:57Z

The recently introduced multiprocessing work had slowed down the test suite, because we were spawning multiple processes for each functional test, even though those tests scan relatively small packages.

This introduces a constant defining the minimum number of modules before using multiprocessing, returning the test suite to its previous speed.

pytest --benchmark-skip
Before: 13.85s.
After: 4.91s

The benchmarks show a regression for scanning Django with 15 cache misses, which happens to be the same size as the test package that is sped up. I think the reason is that the test package needs very little AST parsing, as its Python modules are almost empty, whereas the Django benchmark includes 15 quite complex modules. We could adjust the threshold a bit (or even use a higher threshold when running tests), but given that scanning 50 modules in a single process is still quite fast, I doubt it will impact users in practice. We can easily revisit this later if need be.
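For readers who want a concrete picture of the approach, here is a minimal sketch of thresholded chunking. It is not the code from this PR: the constant name, its value of 50 (taken from the "50 modules" remark above), the function names, and the stand-in `scan_chunk` are all assumptions made for illustration.

```python
import math
import multiprocessing
from typing import Dict, List, Sequence

# Assumed name and value: the description mentions 50 modules, but the real
# constant is not shown in this thread.
MIN_MODULES_FOR_MULTIPROCESSING = 50


def create_chunks(modules: Sequence[str], max_workers: int) -> List[List[str]]:
    """Split modules into chunks; inputs below the threshold stay in one chunk."""
    if not modules:
        return []
    if len(modules) < MIN_MODULES_FOR_MULTIPROCESSING or max_workers < 2:
        return [list(modules)]
    chunk_size = math.ceil(len(modules) / max_workers)
    return [
        list(modules[i : i + chunk_size])
        for i in range(0, len(modules), chunk_size)
    ]


def scan_chunk(chunk: List[str]) -> Dict[str, int]:
    """Hypothetical stand-in for import scanning: maps each name to its length."""
    return {name: len(name) for name in chunk}


def scan_chunks(chunks: List[List[str]]) -> Dict[str, int]:
    """Scan in the current process for zero or one chunk, else use a pool."""
    if len(chunks) <= 1:
        # No processes are spawned on this path.
        results = [scan_chunk(chunk) for chunk in chunks]
    else:
        with multiprocessing.Pool(processes=len(chunks)) as pool:
            results = pool.map(scan_chunk, chunks)
    merged: Dict[str, int] = {}
    for result in results:
        merged.update(result)
    return merged


if __name__ == "__main__":
    small_package = [f"pkg.mod{i}" for i in range(15)]
    chunks = create_chunks(small_package, max_workers=4)
    print(len(chunks))  # a 15-module package stays in a single chunk
    print(len(scan_chunks(chunks)))
```

Putting the threshold check inside the chunking function means the caller only has to ask "is there more than one chunk?" before deciding whether to spawn processes.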

This significantly speeds up the test suite, which had slowed down when we added scanning parallelization.

codspeed-hq · 2025-04-11T11:23:30Z

CodSpeed Instrumentation Performance Report

Merging #207 will degrade performances by 54.35%

Comparing `small-codebase-build-optimization` (059a5d9) with `master` (30e36bc)

Summary

❌ 1 regressions
✅ 21 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Change |
|---|---|---|---|---|
| ❌ | `test_build_django_from_cache_a_few_misses[15]` | 174.2 ms | 381.6 ms | -54.35% |
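The -54.35% figure can look odd next to a run time that more than doubled. The reported numbers are consistent with the change being measured relative to `HEAD`, that is `BASE / HEAD - 1`. That formula is inferred from the arithmetic here, not taken from CodSpeed's documentation, and the function names below are illustrative only.

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Percentage change measured relative to HEAD (assumed formula)."""
    return (base_ms / head_ms - 1) * 100


def slowdown_factor(base_ms: float, head_ms: float) -> float:
    """How many times longer HEAD takes than BASE."""
    return head_ms / base_ms


# Figures from the benchmark row above.
change = relative_change(174.2, 381.6)
factor = slowdown_factor(174.2, 381.6)

print(f"{change:.2f}%")
print(f"{factor:.2f}x")
```

Read this way, the benchmark takes roughly 2.2 times as long on `HEAD` as on `BASE`.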

seddonym added 4 commits April 11, 2025 10:17

Move chunking decision inside _create_chunks

65237bd

Extract _scan_chunks function

aa705bc

Don't spawn processes if there is only one chunk

e3fdbb4

Don't parallelize scanning for smaller codebases

059a5d9

This significantly speeds up the test suite, which had slowed down when we added scanning parallelization.

seddonym changed the title ~~Small codebase build optimization~~ Don't parallelize scanning for small codebases Apr 11, 2025

seddonym marked this pull request as ready for review April 11, 2025 11:36

seddonym merged commit 4fdf254 into master Apr 11, 2025
17 of 18 checks passed

seddonym deleted the small-codebase-build-optimization branch April 11, 2025 11:42
