-
Notifications
You must be signed in to change notification settings - Fork 23
Add find_shortest_cycle
#194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
68860cd
f73738d
f29fc2c
1cbefec
43b7c31
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| use crate::errors::GrimpResult; | ||
| use crate::graph::pathfinding; | ||
| use crate::graph::{ExtendWithDescendants, Graph, ModuleToken}; | ||
| use rustc_hash::{FxHashMap, FxHashSet}; | ||
| use tap::Conv; | ||
|
|
||
| impl Graph { | ||
| pub fn find_shortest_cycle( | ||
| &self, | ||
| module: ModuleToken, | ||
| as_package: bool, | ||
| ) -> GrimpResult<Option<Vec<ModuleToken>>> { | ||
| if as_package { | ||
| pathfinding::find_shortest_cycle( | ||
| self, | ||
| &module.conv::<FxHashSet<_>>().with_descendants(self), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| ) | ||
| } else { | ||
| pathfinding::find_shortest_cycle( | ||
| self, | ||
| &module.conv::<FxHashSet<_>>(), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| ) | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,7 @@ | ||
| use crate::errors::{GrimpError, GrimpResult}; | ||
| use crate::graph::{Graph, ModuleToken, EMPTY_MODULE_TOKENS}; | ||
| use indexmap::{IndexMap, IndexSet}; | ||
| use itertools::Itertools; | ||
| use rustc_hash::{FxHashMap, FxHashSet, FxHasher}; | ||
| use slotmap::SecondaryMap; | ||
| use std::hash::BuildHasherDefault; | ||
|
|
@@ -28,7 +29,8 @@ pub fn find_reach( | |
| &seen.into_iter().collect::<FxHashSet<_>>() - from_modules | ||
| } | ||
|
|
||
| pub fn find_shortest_path_bidirectional( | ||
| /// Finds the shortest path, via a bidirectional BFS. | ||
| pub fn find_shortest_path( | ||
| graph: &Graph, | ||
| from_modules: &FxHashSet<ModuleToken>, | ||
| to_modules: &FxHashSet<ModuleToken>, | ||
|
|
@@ -39,6 +41,39 @@ pub fn find_shortest_path_bidirectional( | |
| return Err(GrimpError::SharedDescendants); | ||
| } | ||
|
|
||
| _find_shortest_path( | ||
| graph, | ||
| from_modules, | ||
| to_modules, | ||
| excluded_modules, | ||
| excluded_imports, | ||
| ) | ||
| } | ||
|
|
||
| /// Finds the shortest cycle from `modules` to `modules`, via a bidirectional BFS. | ||
| pub fn find_shortest_cycle( | ||
| graph: &Graph, | ||
| modules: &FxHashSet<ModuleToken>, | ||
| excluded_modules: &FxHashSet<ModuleToken>, | ||
| excluded_imports: &FxHashMap<ModuleToken, FxHashSet<ModuleToken>>, | ||
| ) -> GrimpResult<Option<Vec<ModuleToken>>> { | ||
| // Exclude imports internal to `modules` | ||
| let mut excluded_imports = excluded_imports.clone(); | ||
| for (m1, m2) in modules.iter().tuple_combinations() { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Generally, I dont know Rust at all, I will try to understand the full code and learn a bit of Rust. Maybe then I will add some useful comments to the PR. According to those lines (question about logic, not Rust details): what is the logic in lines 62-64 for? Why do we remove internal imports between modules in
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably the way to understand this is by looking at the test case There is an import between
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does that make sense? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, makes sense. I mean, I understand now why we have those lines. Thanks for the explanation. But should we have them? In the test case I rather think of package "colors" as a set of all descendant modules inside all sub-packages., not only children. In case of finding cycles in the whole project, instead of iterating over every single package in some package tree, I would rather just provide a single package. For example, I would like to fix "colors.red". If grimp does not return any cycle for "colors.red", it means that the whole package and all sub-packages inside are fine when it comes to importing. Maybe we could introduce another parameter called something like "consider_sub_packages: bool = True"? Based on the value we could execute lines 62-64 or not. What do you think?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the function is behaving as I intended it (we're finding the shortest chain from the This behaviour is consistent with e.g. In the above example That said, judging by your reply above and #194 (comment) it sounds like
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes sounds good to me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Coming back to the topic, I guess in the high lvl functionality we can loop over sub-packages of the package in our interest executing For example, having the following dependencies: we would like to find cycles (including internal ones) for "colors". So we execute: We go further and start with a first sub-package of "colors" which is "colors.blue": Wanting to find all cycles, we execute further, selecting the second sub-package "colors.red": We get the semantically equivalent cycle. I wonder if optionally disabling those lines by having a parameter
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be better if the entirety of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. Thinner the Python layer is, faster executions we get. As @Peter554 mentioned in the PR description we can have both, low and high lvl functionality implemented in Grimp. Then we can utilize There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As discussed here: #189 (comment) |
||
| excluded_imports.entry(*m1).or_default().insert(*m2); | ||
| excluded_imports.entry(*m2).or_default().insert(*m1); | ||
| } | ||
|
|
||
| _find_shortest_path(graph, modules, modules, excluded_modules, &excluded_imports) | ||
| } | ||
|
|
||
| fn _find_shortest_path( | ||
| graph: &Graph, | ||
| from_modules: &FxHashSet<ModuleToken>, | ||
| to_modules: &FxHashSet<ModuleToken>, | ||
| excluded_modules: &FxHashSet<ModuleToken>, | ||
| excluded_imports: &FxHashMap<ModuleToken, FxHashSet<ModuleToken>>, | ||
| ) -> GrimpResult<Option<Vec<ModuleToken>>> { | ||
| let mut predecessors: FxIndexMap<ModuleToken, Option<ModuleToken>> = from_modules | ||
| .clone() | ||
| .into_iter() | ||
|
|
@@ -123,3 +158,105 @@ fn import_is_excluded( | |
| .contains(to_module) | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod test_find_shortest_cycle { | ||
| use super::*; | ||
|
|
||
| #[test] | ||
| fn test_finds_cycle_single_module() -> GrimpResult<()> { | ||
| let mut graph = Graph::default(); | ||
| let foo = graph.get_or_add_module("foo").token; | ||
| let bar = graph.get_or_add_module("bar").token; | ||
| let baz = graph.get_or_add_module("baz").token; | ||
| let x = graph.get_or_add_module("x").token; | ||
| let y = graph.get_or_add_module("y").token; | ||
| let z = graph.get_or_add_module("z").token; | ||
| // Shortest cycle | ||
| graph.add_import(foo, bar); | ||
| graph.add_import(bar, baz); | ||
| graph.add_import(baz, foo); | ||
| // Longer cycle | ||
| graph.add_import(foo, x); | ||
| graph.add_import(x, y); | ||
| graph.add_import(y, z); | ||
| graph.add_import(z, foo); | ||
|
|
||
| let path = find_shortest_cycle( | ||
| &graph, | ||
| &foo.into(), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| )?; | ||
| assert_eq!(path, Some(vec![foo, bar, baz, foo])); | ||
|
|
||
| graph.remove_import(baz, foo); | ||
|
|
||
| let path = find_shortest_cycle( | ||
| &graph, | ||
| &foo.into(), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| )?; | ||
| assert_eq!(path, Some(vec![foo, x, y, z, foo])); | ||
|
|
||
| Ok(()) | ||
| } | ||
|
|
||
| #[test] | ||
| fn test_returns_none_if_no_cycle() -> GrimpResult<()> { | ||
| let mut graph = Graph::default(); | ||
| let foo = graph.get_or_add_module("foo").token; | ||
| let bar = graph.get_or_add_module("bar").token; | ||
| let baz = graph.get_or_add_module("baz").token; | ||
| graph.add_import(foo, bar); | ||
| graph.add_import(bar, baz); | ||
|
|
||
| let path = find_shortest_cycle( | ||
| &graph, | ||
| &foo.into(), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| )?; | ||
|
|
||
| assert_eq!(path, None); | ||
|
|
||
| Ok(()) | ||
| } | ||
|
|
||
| #[test] | ||
| fn test_finds_cycle_multiple_module() -> GrimpResult<()> { | ||
| let mut graph = Graph::default(); | ||
|
|
||
| graph.get_or_add_module("colors"); | ||
| let red = graph.get_or_add_module("colors.red").token; | ||
| let blue = graph.get_or_add_module("colors.blue").token; | ||
| let a = graph.get_or_add_module("a").token; | ||
| let b = graph.get_or_add_module("b").token; | ||
| let c = graph.get_or_add_module("c").token; | ||
| let d = graph.get_or_add_module("d").token; | ||
|
|
||
| // The computation should not be confused by these two imports internal to `modules`. | ||
| graph.add_import(red, blue); | ||
| graph.add_import(blue, red); | ||
| // This is the part we expect to find. | ||
| graph.add_import(red, a); | ||
| graph.add_import(a, b); | ||
| graph.add_import(b, blue); | ||
| // A longer path. | ||
| graph.add_import(a, c); | ||
| graph.add_import(c, d); | ||
| graph.add_import(d, b); | ||
|
|
||
| let path = find_shortest_cycle( | ||
| &graph, | ||
| &FxHashSet::from_iter([red, blue]), | ||
| &FxHashSet::default(), | ||
| &FxHashMap::default(), | ||
| )?; | ||
|
|
||
| assert_eq!(path, Some(vec![red, a, b, blue])); | ||
|
|
||
| Ok(()) | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| import pytest | ||
|
|
||
| from grimp.adaptors.graph import ImportGraph | ||
|
|
||
|
|
||
| class TestFindShortestCycle: | ||
| @pytest.mark.parametrize( | ||
| "expected_cycle", | ||
| [ | ||
| ("foo", "bar", "foo"), | ||
| ("foo", "bar", "baz", "foo"), | ||
| ], | ||
| ) | ||
| def test_finds_shortest_cycle_when_exists(self, expected_cycle): | ||
| graph = ImportGraph() | ||
| # Shortest cycle | ||
| for importer, imported in zip(expected_cycle[:-1], expected_cycle[1:]): | ||
| graph.add_import(importer=importer, imported=imported) | ||
| # Longer cycle | ||
| graph.add_import(importer="foo", imported="x") | ||
| graph.add_import(importer="x", imported="y") | ||
| graph.add_import(importer="y", imported="z") | ||
| graph.add_import(importer="z", imported="foo") | ||
|
|
||
| assert graph.find_shortest_cycle("foo") == expected_cycle | ||
|
|
||
| graph.remove_import(importer=expected_cycle[-2], imported=expected_cycle[-1]) | ||
|
|
||
| assert graph.find_shortest_cycle("foo") == ("foo", "x", "y", "z", "foo") | ||
|
|
||
| def test_returns_none_if_no_cycle_exists(self): | ||
| graph = ImportGraph() | ||
| graph.add_import(importer="foo", imported="bar") | ||
| graph.add_import(importer="bar", imported="baz") | ||
| # graph.add_import(importer="baz", imported="foo") # This import is missing -> No cycle. | ||
|
|
||
| assert graph.find_shortest_cycle("foo") is None | ||
|
|
||
| def test_ignores_internal_imports_when_as_package_is_true(self): | ||
| graph = ImportGraph() | ||
| graph.add_module("colors") | ||
| graph.add_import(importer="colors.red", imported="colors.blue") | ||
| graph.add_import(importer="colors.blue", imported="colors.red") | ||
| graph.add_import(importer="colors.red", imported="x") | ||
| graph.add_import(importer="x", imported="y") | ||
| graph.add_import(importer="y", imported="z") | ||
| graph.add_import(importer="z", imported="colors.blue") | ||
|
|
||
| assert graph.find_shortest_cycle("colors", as_package=True) == ( | ||
| "colors.red", | ||
| "x", | ||
| "y", | ||
| "z", | ||
| "colors.blue", | ||
| ) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried the following test: Why do we only find the cycle while using "package_1" as root?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, so basically, providing And in the case of We get None, because there is no any module in |
||
Uh oh!
There was an error while loading. Please reload this page.