Skip to content

Conversation

@netomi
Copy link
Contributor

@netomi netomi commented Aug 13, 2023

This fixes #557 .

Instead of calculating all roots at the start we delay that when evaluating each individual snippet.
The roots data structure is populated with all nodes of all imported jsonnet snippets, so we utilize that also in the importFunc where we try to resolve imports from the list of roots to avoid doing duplicate imports.

However this is a best effort only, as the mechanism is not smart, but is just able to resolve imports in the same path. This could be further improved, but for the test cases that is already sufficient, as all of them are in the same directory.

With this fix, the multiple tests run in around 0.23s compared to 9s before.

Copy link
Collaborator

@johnbartholomew johnbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, apologies for the multi-year delay before anyone looked at this. Thank you very much for the contribution!

linter/linter.go Outdated
vars[importedPath] = info.VarAt
}
for importedPath, rootNode := range roots {
variablesInFile[importedPath] = *findVariables(nodeWithLocation{rootNode, importedPath})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of this is confusing now, because the function-scoped map variablesInFile is being updated gradually in each iteration of the loop over nodes, so we don't have either a complete set of variables collected before looping over nodes, nor fully separate minimal set of variables collected individually for each node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the PR changes the variablesInFile from function-scope to snippet-scope.

Previously, all variables for were evaluated for all snippets as a whole, using them when building the type graph for each snippet which is unnecessary, as most of the time they are not used.

so we try to collect only the vars that are actually used by the snippet to speed up the type graph creation, which is the slow part afaict from profiling.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I made this more confusing by rebasing and adding a commit. But in 8893d8e it actually doesn't change the variablesInFile scope, which is why I commented on it. It still has variablesInFile at function scope, it just updates it in the snippet scope (adding to it each time).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the code history, support for linting multiple files in a single invocation was added in #545, and the review at the time noted that it doesn't really take advantage of being able to collect and process the expression graphs for all the files together. But the review did not note that just partially processing the files together would itself create a performance problem.

It seems like from there we can go in one of two directions: either split processing apart more, so each listed input file is processed mostly independently of the others (which is the direction this PR goes in), or make a deeper change in the linter expression/type graph processing to more efficiently process all the input files with a single expression/type graph. Since the latter is obviously a more invasive change I'm happy to stick with this fix. It should be at least no worse than running the lint command entirely separately on each individual input file.

I do have a few smaller structural comments, added below.

@johnbartholomew johnbartholomew marked this pull request as draft January 27, 2026 18:47
@johnbartholomew johnbartholomew marked this pull request as ready for review January 27, 2026 18:47
@coveralls
Copy link

coveralls commented Jan 27, 2026

Coverage Status

coverage: 44.269% (-0.03%) from 44.297%
when pulling 23a013c on netomi:fix-linter-multiple-snippets
into a52ac8d on google:master.

@johnbartholomew
Copy link
Collaborator

Since this has been waiting for over 2 years, I will work under the assumption that the original author may not be available, so I'll rebase and add an adjustment and then merge this.

@johnbartholomew johnbartholomew force-pushed the fix-linter-multiple-snippets branch from ac93205 to 23a013c Compare January 27, 2026 19:07
@netomi
Copy link
Contributor Author

netomi commented Jan 27, 2026

It's a long time and I tried to get back into the PR to understand what I did there. I did some debugging at the time and some things got evaluated again and again, which the patch avoids though there are certainly better ways to do it.

@johnbartholomew
Copy link
Collaborator

Sadly we don't have any automated benchmark checks that run as part of CI. However, the normal tests do report how long in total the linter test took to run. Here are recent results from the CI workflow on master (before this PR):

  • ok github.com/google/go-jsonnet/linter 16.132s coverage: 96.7% of statements
  • ok github.com/google/go-jsonnet/linter 15.854s coverage: 96.7% of statements
  • ok github.com/google/go-jsonnet/linter 16.219s coverage: 96.7% of statements
  • ok github.com/google/go-jsonnet/linter 15.867s coverage: 96.7% of statements

Here's the result for the latest CI run on this PR:

  • ok github.com/google/go-jsonnet/linter 0.543s coverage: 96.4% of statements

@johnbartholomew
Copy link
Collaborator

It's a long time and I tried to get back into the PR to understand what I did there. I did some debugging at the time and some things got evaluated again and again, which the patch avoids though there are certainly better ways to do it.

@netomi Oh, hi! Glad you're still around. Yeah, this is definitely an improvement and worth merging even if there are other ways.

@netomi
Copy link
Contributor Author

netomi commented Jan 27, 2026

ah now I remember again what happens:

in the original code we create a roots map of all snippets and when we evaluate each individual snippet, all roots are added to the type graph and prepared which takes the majority of time.

With the patch, we try to be clever and only create a map of roots that are referenced via the variables that are found in the snippet.

@johnbartholomew johnbartholomew merged commit 23a013c into google:master Jan 27, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Running linter on multiple files at once suspiciously slow

3 participants