Improve performance of duplicate comparisons#227

Merged
goneall merged 1 commit into master from issue226
Oct 26, 2025

goneall (Member) commented Oct 25, 2025

Fixes #226

Tokenize each license only once when comparing. This improves performance by approximately two orders of magnitude when processing the entire license list. Performance improvements are negligible for single licenses.
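
For illustration, here is a minimal sketch of the tokenize-once idea. It assumes a simple lower-casing whitespace tokenizer and plain token-list equality; the real license comparison is more involved. The names `TokenizeOnceSketch`, `tokenize`, and `findDuplicates` are hypothetical and are not the library's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these names are illustrative only.
class TokenizeOnceSketch {

    // Stand-in tokenizer (assumption): lower-case and split on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Returns pairs of license IDs whose token lists are identical.
    static List<String[]> findDuplicates(Map<String, String> licenseTexts) {
        // Tokenize every text exactly once, before the pairwise loop.
        Map<String, List<String>> tokens = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : licenseTexts.entrySet()) {
            tokens.put(entry.getKey(), tokenize(entry.getValue()));
        }

        List<String> ids = new ArrayList<>(tokens.keySet());
        List<String[]> duplicates = new ArrayList<>();
        // The pairwise loop remains, but it only compares cached token lists.
        for (int i = 0; i < ids.size(); i++) {
            for (int j = i + 1; j < ids.size(); j++) {
                if (tokens.get(ids.get(i)).equals(tokens.get(ids.get(j)))) {
                    duplicates.add(new String[] {ids.get(i), ids.get(j)});
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, String> texts = new LinkedHashMap<>();
        texts.put("A", "Permission is  hereby granted");
        texts.put("B", "permission IS hereby\ngranted");
        texts.put("C", "All rights reserved");
        for (String[] pair : findDuplicates(texts)) {
            System.out.println(pair[0] + " duplicates " + pair[1]);
        }
    }
}
```

The number of comparisons is unchanged; the saving comes from moving the tokenization out of the inner loop, so each text is tokenized once rather than once per pair.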

Add a timeout to the future get calls for cross-reference URL details. During testing, this fetch stalled once; with this change, a stalled fetch produces a warning instead of hanging the application.
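
As a rough sketch of the timeout pattern described above, the following uses the standard `Future.get(long, TimeUnit)` overload. The helper name `getOrDefault`, the fallback value, and the timeout durations are assumptions for illustration and are not taken from the actual change.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.logging.Logger;

// Hypothetical sketch; not the code from this PR.
class FutureTimeoutSketch {
    private static final Logger LOGGER =
            Logger.getLogger(FutureTimeoutSketch.class.getName());

    // Waits up to the given timeout; on a stall, logs a warning and returns the fallback.
    static <T> T getOrDefault(Future<T> future, long timeout, TimeUnit unit, T fallback) {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stalled task
            LOGGER.warning("Timed out waiting for result; using fallback");
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            LOGGER.warning("Task failed: " + e.getCause());
            return fallback;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Simulated stalled fetch (assumption: a real fetch would be a network call).
            Future<String> stalled = executor.submit(() -> {
                Thread.sleep(5000);
                return "details";
            });
            System.out.println(getOrDefault(stalled, 200, TimeUnit.MILLISECONDS, "unavailable"));
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Cancelling the future on timeout is a design choice in this sketch; whether the real code cancels or simply abandons the task is not stated in the PR.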

goneall (Member, Author) commented Oct 25, 2025

@xsuchy - this is the PR fix for performance. It is still O(N!), but the inner loop is much faster so it completes in a reasonable amount of time.

Part of me wants to redo the algorithm to be close to linear, but it's probably good enough with this PR.
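
One possible shape for a closer-to-linear approach, sketched under a strong assumption: two licenses count as duplicates only when their token lists are exactly equal, so they can be grouped by hashing. The real comparison rules may not reduce to exact equality, and `BucketDuplicatesSketch` and `groupDuplicates` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; assumes duplicates are exact token-list matches.
class BucketDuplicatesSketch {

    // Groups license IDs by token list; returns only groups with more than one ID.
    static List<List<String>> groupDuplicates(Map<String, List<String>> tokensById) {
        // List.hashCode/equals are content-based, so equal token lists share a bucket.
        // The token lists must not be mutated while they serve as map keys.
        Map<List<String>, List<String>> buckets = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : tokensById.entrySet()) {
            buckets.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                   .add(entry.getKey());
        }
        List<List<String>> groups = new ArrayList<>();
        for (List<String> ids : buckets.values()) {
            if (ids.size() > 1) {
                groups.add(ids);
            }
        }
        return groups;
    }
}
```

With this grouping, each token list is hashed once, so the expected work grows with the total number of tokens rather than with the number of license pairs.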

goneall (Member, Author) commented Oct 25, 2025

Note that the CI is failing due to #224, which is unrelated to this PR.

goneall merged commit 1166f92 into master on Oct 26, 2025
1 check passed
goneall deleted the issue226 branch on October 26, 2025 04:11
Development

Successfully merging this pull request may close these issues.

Performance improvement suggestion
