Skip to content

Use ruff to parse imports#214

Merged
seddonym merged 8 commits intopython-grimp:masterfrom
Peter554:ruff-imported-object-parser
May 5, 2025
Merged

Use ruff to parse imports#214
seddonym merged 8 commits intopython-grimp:masterfrom
Peter554:ruff-imported-object-parser

Conversation

@Peter554
Copy link
Collaborator

@Peter554 Peter554 commented May 3, 2025

Ruff can be installed via git!

You may need to use CARGO_NET_GIT_FETCH_WITH_CLI=true

Inspired by RustPython/RustPython#5494

Performance results on my machine (Kraken):

  • AST
    • 1 process: 41s
    • Many processes: 11s
  • Ruff
    • 1 process: 15s
    • Many processes: 6s

@codspeed-hq
Copy link

codspeed-hq bot commented May 3, 2025

CodSpeed Instrumentation Performance Report

Merging #214 will degrade performances by 18.69%

Comparing Peter554:ruff-imported-object-parser (dc502f3) with master (810bb16)

Summary

⚡ 2 improvements
❌ 3 regressions
✅ 17 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
test_deep_layers_large_graph_kept 16.3 ms 20 ms -18.69%
test_no_chain 1.1 ms 1.2 ms -11.14%
test_no_chains 1.1 ms 1.2 ms -11.09%
test_build_django_from_cache_a_few_misses[15] 381.5 ms 161.3 ms ×2.4
test_build_django_from_cache_a_few_misses[2] 167.8 ms 137.8 ms +21.75%

@Peter554 Peter554 force-pushed the ruff-imported-object-parser branch from 4ae107f to 081aa36 Compare May 3, 2025 18:32
@Peter554 Peter554 mentioned this pull request May 3, 2025
@Peter554 Peter554 force-pushed the ruff-imported-object-parser branch 2 times, most recently from 567cefc to e6d4183 Compare May 3, 2025 18:39
@Peter554 Peter554 marked this pull request as ready for review May 3, 2025 18:40
@Peter554 Peter554 force-pushed the ruff-imported-object-parser branch 2 times, most recently from 6bf66d2 to de9e0a7 Compare May 4, 2025 08:39
imported_object_dicts = rust.parse_imported_objects_from_code(module_contents)
imported_objects = {_ImportedObject(**d) for d in imported_object_dicts}

# TODO - remove these lines once we're confident the rust way is consistent with ast.
Copy link
Collaborator Author

@Peter554 Peter554 May 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any discrepancies when checking Kraken, so removing this. I'd value a second opinion though.

@Peter554 Peter554 force-pushed the ruff-imported-object-parser branch from 70f061e to e8f5f99 Compare May 4, 2025 09:18
Comment on lines 54 to 55
# TODO What to do with the rust.ParseError?
# TODO The line number and text are part of str(e), but is there a nice way to get them?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seddonym I could do with some help to decide what to do here. The problem is that the line number and text are only available as part of str(e), and not as structured fields.

Copy link
Collaborator Author

@Peter554 Peter554 May 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think I found a reasonable solution now, using pyclass(extends=PyException) instead of create_exception! we're able to create a custom exception class with custom fields.

@Peter554 Peter554 force-pushed the ruff-imported-object-parser branch from e8f5f99 to 5650e50 Compare May 4, 2025 14:34
expected_exception = exceptions.SourceSyntaxError(
filename=filename, lineno=5, text="fromb . import two\n"
)
expected_exception = exceptions.SourceSyntaxError(filename=filename, lineno=5, text="import")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text has changed here - the ruff parser reports a different text range for the syntax error than the python ast parser. See let text = source_code.slice(e.location); in the import_parsing.rs file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's weird - "import" doesn't seem like a very useful error message. I might look into pulling out the full line via the line contents. It's not a blocker though - the syntax error reporting is not part of Grimp's documented API, so I'm less concerned about it changing.

@seddonym seddonym force-pushed the ruff-imported-object-parser branch from 5650e50 to dc502f3 Compare May 5, 2025 11:15
@seddonym seddonym mentioned this pull request May 5, 2025
@seddonym seddonym self-requested a review May 5, 2025 11:36
Copy link
Collaborator

@seddonym seddonym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff - nice work.

I suspect that writing our own parser to parse a subset of Python might ultimately be faster, but I think it's a good practical decision to do it like this at least for the time being. I'm not quite clear on how it interacts with different Python versions though.

expected_exception = exceptions.SourceSyntaxError(
filename=filename, lineno=5, text="fromb . import two\n"
)
expected_exception = exceptions.SourceSyntaxError(filename=filename, lineno=5, text="import")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's weird - "import" doesn't seem like a very useful error message. I might look into pulling out the full line via the line contents. It's not a blocker though - the syntax error reporting is not part of Grimp's documented API, so I'm less concerned about it changing.

fn parse_imports_from_code_without_line_contents(
code: &str,
) -> GrimpResult<Vec<ImportedObjectWithoutLineContents>> {
let ast = parse_module(code).expect("failed to parse python code");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the Python syntax it checks is the same as the version of Python it will be running under?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I'm unsure, but I'd suspect the latest python syntax is parsed. Assuming python doesn't release a new major version, then the syntax should always be backwards compatible? So it should be fine I think.

pub name: String,
pub line_number: usize,
pub line_contents: String,
pub typechecking_only: bool,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One optimization we could consider for the future: we don't actually need to keep track of whether an import is within a type checking guard here (it's deliberately not exposed in Grimp's public API). Instead, we could just take exclude_type_checking_imports as an argument and use a (I'm guessing) slightly faster algorithm if it's false.

@seddonym seddonym merged commit 8f50d9c into python-grimp:master May 5, 2025
34 of 35 checks passed
@seddonym seddonym mentioned this pull request May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants