Use ruff to parse imports#214
Conversation
CodSpeed Instrumentation Performance ReportMerging #214 will degrade performances by 18.69%Comparing Summary
Benchmarks breakdown
|
4ae107f to
081aa36
Compare
567cefc to
e6d4183
Compare
6bf66d2 to
de9e0a7
Compare
src/grimp/adaptors/importscanner.py
Outdated
| imported_object_dicts = rust.parse_imported_objects_from_code(module_contents) | ||
| imported_objects = {_ImportedObject(**d) for d in imported_object_dicts} | ||
|
|
||
| # TODO - remove these lines once we're confident the rust way is consistent with ast. |
There was a problem hiding this comment.
I didn't find any discrepancies when checking Kraken, so removing this. I'd value a second opinion though.
70f061e to
e8f5f99
Compare
src/grimp/adaptors/importscanner.py
Outdated
| # TODO What to do with the rust.ParseError? | ||
| # TODO The line number and text are part of str(e), but is there a nice way to get them? |
There was a problem hiding this comment.
@seddonym I could do with some help to decide what to do here. The problem is that the line number and text are only available as part of str(e), and not as structured fields.
There was a problem hiding this comment.
Ah I think I found a reasonable solution now, using pyclass(extends=PyException) instead of create_exception! we're able to create a custom exception class with custom fields.
e8f5f99 to
5650e50
Compare
| expected_exception = exceptions.SourceSyntaxError( | ||
| filename=filename, lineno=5, text="fromb . import two\n" | ||
| ) | ||
| expected_exception = exceptions.SourceSyntaxError(filename=filename, lineno=5, text="import") |
There was a problem hiding this comment.
The text has changed here - the ruff parser reports a different text range for the syntax error than the python ast parser. See let text = source_code.slice(e.location); in the import_parsing.rs file.
There was a problem hiding this comment.
Hmm, that's weird - "import" doesn't seem like a very useful error message. I might look into pulling out the full line via the line contents. It's not a blocker though - the syntax error reporting is not part of Grimp's documented API, so I'm less concerned about it changing.
Ruff can be installed via git! You may need to use CARGO_NET_GIT_FETCH_WITH_CLI=true Inspired by RustPython/RustPython#5494
5650e50 to
dc502f3
Compare
seddonym
left a comment
There was a problem hiding this comment.
Awesome stuff - nice work.
I suspect that writing our own parser to parse a subset of Python might ultimately be faster, but I think it's a good practical decision to do it like this at least for the time being. I'm not quite clear on how it interacts with different Python versions though.
| expected_exception = exceptions.SourceSyntaxError( | ||
| filename=filename, lineno=5, text="fromb . import two\n" | ||
| ) | ||
| expected_exception = exceptions.SourceSyntaxError(filename=filename, lineno=5, text="import") |
There was a problem hiding this comment.
Hmm, that's weird - "import" doesn't seem like a very useful error message. I might look into pulling out the full line via the line contents. It's not a blocker though - the syntax error reporting is not part of Grimp's documented API, so I'm less concerned about it changing.
rust/src/import_parsing.rs
Outdated
| fn parse_imports_from_code_without_line_contents( | ||
| code: &str, | ||
| ) -> GrimpResult<Vec<ImportedObjectWithoutLineContents>> { | ||
| let ast = parse_module(code).expect("failed to parse python code"); |
There was a problem hiding this comment.
Do you know if the Python syntax it checks is the same as the version of Python it will be running under?
There was a problem hiding this comment.
Good question. I'm unsure, but I'd suspect the latest python syntax is parsed. Assuming python doesn't release a new major version, then the syntax should always be backwards compatible? So it should be fine I think.
| pub name: String, | ||
| pub line_number: usize, | ||
| pub line_contents: String, | ||
| pub typechecking_only: bool, |
There was a problem hiding this comment.
One optimization we could consider for the future: we don't actually need to keep track of whether an import is within a type checking guard here (it's deliberately not exposed in Grimp's public API). Instead, we could just take exclude_type_checking_imports as an argument and use a (I'm guessing) slightly faster algorithm if it's false.
Ruff can be installed via git!
You may need to use
CARGO_NET_GIT_FETCH_WITH_CLI=trueInspired by RustPython/RustPython#5494
Performance results on my machine (Kraken):