Skip to content

Experiment with changes to the BiMap library#173

Closed
andir wants to merge 12 commits intopython-grimp:masterfrom
andir:rust-graph
Closed

Experiment with changes to the BiMap library#173
andir wants to merge 12 commits intopython-grimp:masterfrom
andir:rust-graph

Conversation

@andir
Copy link

@andir andir commented Jan 16, 2025

No description provided.

This is the main graph implemented in Rust - but it isn't plumbed into
Python yet.
This adds a wrapper around the Rust-graph in lib.rs which handles the
interface between Python and Rust. At the same time, it changes the
Python ImportGraph so it uses Rust instead.
This is faster, but not cryptographically safe. That shouldn't matter for our
purposes.
@andir andir changed the base branch from rust-graph to master January 16, 2025 11:50
@seddonym seddonym marked this pull request as ready for review January 16, 2025 11:51
@seddonym seddonym changed the base branch from master to rust-graph January 16, 2025 11:53
@seddonym seddonym changed the base branch from rust-graph to master January 16, 2025 11:53
@codspeed-hq
Copy link

codspeed-hq bot commented Jan 16, 2025

CodSpeed Performance Report

Merging #173 will degrade performances by 51.02%

Comparing andir:rust-graph (fdbe510) with master (957ff81)

Summary

⚡ 2 improvements
❌ 1 regressions
✅ 1 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark master andir:rust-graph Change
test_build_django_from_cache 224.4 ms 194.3 ms +15.5%
test_deep_layers_large_graph 1.3 s 2.7 s -51.02%
test_top_level_large_graph 260.8 ms 207.4 ms +25.73%

andir added 6 commits January 17, 2025 10:11
This is using a bit of state to store the potential expensive
recomputation of get_modules between invocations.

The current fallback, in case of concurrent access, is to recalcuate
the data instead of waiting for the lock to be release.
This allows passing a string reference rather than an owned copy. This
can, in some cases, removes additional copies that the compiler might
not get rid of.
We don't need that anymore as the Module::new invocation will take
care of that. Since the String will be stored in an Arc<> that
hopefully leads to the String instance being done in-place.
This reduces the amount of benchmarking noise that might stem from
noise in filesystem access time.
Most of the time the result should be cached. Unfortunately we are
required to keep it in a Sync structure (thus a Mutex + Arc is being
used). To workaround this the MutexGuard that we acquire during the
function call is moved into another struct that supports a std::ops::deref-alike
function which returns either a reference to an owned copy (slow case)
or a reference to the value within the Mutex (fast path).

Thus clones or allocations are only done whenever the get_modules
value was invalidated *or* when there is a lock contention (which
shouldn't currently happen due do the GIL).
@seddonym
Copy link
Collaborator

seddonym commented Feb 7, 2025

Thanks for this. As discussed I'm going to close as there have been a lot of changes to the implementation on the rust-graph branch which seem to have sorted the performance issues out, but this is helpful to have in our back pocket if we need to go further.

@seddonym seddonym closed this Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants