Closed
Conversation
CodSpeed Instrumentation Performance ReportMerging #221 will degrade performances by 73.02%Comparing Summary
Benchmarks breakdown
|
b420e06 to
cf7672f
Compare
This necessitates us turning off multiprocessing, but we'll switch to multithreading in a subsequent commit.
cf7672f to
bf68f26
Compare
bf68f26 to
0cfe293
Compare
Collaborator
Author
|
It occurred to me we could probably keep multiprocessing for the time being if, rather than passing the file system as an argument to the jobs, we pulled it from the settings object within each job. Might be worth a try so we can merge this. |
Collaborator
Author
|
Replacing in favour of #229, which doesn't turn off the multiprocessing. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Moves the ImportScanner class to Rust.
In order to do this we also need to define Rust-based swappable fake/real file systems to be passed in, so we can still unit test via Python. To reduce the amount of work involved, I've defined a narrower interface for the filesystem that only implements what is needed by
ImportScanner. We can broaden it later when we come to do the same with caching / module finding.I had trouble getting the ImportScanner to pickle/unpickle correctly, which is needed to do multiprocessing, so I've removed multiprocessing. This slows things down considerably (on a large graph, locally, building the graph goes from 5s -> 15s). So really we need to keep going and move the multiprocessing to threads before this is releasable.
There's also distinctly smelly Rust code that loads the
ModuleandDirectImportdataclasses, rather than defining them in Rust - all in the interests of trying to limit how many changes I needed to make to keep the tests passing.Parallelism next steps
The
ImportScannerclass currently requires the GIL, which means we still have a bit of work to do before we can move to multithreading. I think the best thing to do is abandon the ports-and-adapters approach forImportScanner(we only have one of them anyway) and instead make a function that we can unit test from Python, that does them in bulk along the lines of #222. We should keep the unit tests of import scanner but just adapt them to call the bulk function instead - which at first could be in Python, but then we could push that down to Rust. That would allow us to turn theImportScannerinto a pure Rust class that doesn't need the GIL, and do any mapping to Python classes / exceptions in a wrapper function in Python.Still to do