perf: Add fast path for Version to Version comparison by skipping _key property#1083
perf: Add fast path for Version to Version comparison by skipping _key property#1083notatallshaw wants to merge 3 commits intopypa:mainfrom
Version to Version comparison by skipping _key property#1083Conversation
_key propertyVersion to Version comparison by skipping _key property
|
Ohhh, interesting. You are just skipping the property lookup. I tried this, but instead I avoided _key entirely and tried to just implement the comparison directly without creating an extra tuple, but it was much slower than tuple comparisons (though maybe I was missing something, I didn't write out all the methods, but sorting only uses one anyway). If we really have to write out a lot more code that also needs to be maintained, we should be pretty sure the performance is worth it. Most changes so far haven't added too much code. |
Agreed, 20% improvement in sorting versions really is worth it though, this is something you need to do often. But I do want to really understand if this is the only way to achieve this performance benefit, or if there's a simpler approach, also if this benefit is consistent across Python versions. |
uv run asv continuous main pull/1083/head --sort default
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY. Okay, not showing 3.12+. is odd, rerunning with
|
|
Actually, I think there's still quite a bit of overhead from the isinstance check, which is maybe where the variation is coming from. I suspect the fast path will be even faster pathier by changing the check from I'll try and play around with that later. |
|
FYI, you can run the benchmarks by checking out #1059 and then using the commands above (or your branch name to test un-pushed changes). |
|
@henryiii thanks, that showed my idea about I ran the benchmarks locally, but updated the sorting benchmark to only measure sorting performance (as I mention in #1059 (comment)) by adding Doing that, this is what I see with this branch: |
88587fb to
4a41224
Compare
|
Sorting only uses one method, |
Yes, I implemented this cold/warm micro benchmark approach I mention in #1059 (comment) and I only need However, I remembered different comparisons are used in specifiers, so I also implemented cold/warm micro benchmarks there as well (both cold/warm versions and cold/warm specifiers). For specifiers I see smaller but significantly significant improvement in filtering prior to Python 3.12, and in that case I need all the methods except Here's my results: |
|
This is probably the change that makes accessing the property much faster on Python 3.12+: python/cpython#93912. |
| def __hash__(self) -> int: | ||
| return hash(self._key) |
There was a problem hiding this comment.
Why is this here (and also in the base class written exactly the same way)?
| def __hash__(self) -> int: | ||
| return hash(self._key) | ||
|
|
||
| # Override comparison methods to use direct _key_cache access |
There was a problem hiding this comment.
| # Override comparison methods to use direct _key_cache access | |
| # Override comparison methods to use direct _key_cache access | |
| # This is faster than property access, especially before Python 3.12 |
|
Strange, I tried a variation of this, by using a method instead: + # Property access is slower, especially on Python < 3.12
+ def _get_key(self) -> CmpKey:
+ if self._key_cache is None:
+ self._key_cache = _cmpkey(
+ self._epoch,
+ self._release,
+ self._pre,
+ self._post,
+ self._dev,
+ self._local,
+ )
+ return self._key_cache
+
def __hash__(self) -> int:
return hash(self._key)
# Override comparison methods to use direct _key_cache access
def __lt__(self, other: _BaseVersion) -> bool:
if isinstance(other, Version):
- if self._key_cache is None:
- self._key_cache = _cmpkey(
- self._epoch,
- self._release,
- self._pre,
- self._post,
- self._dev,
- self._local,
- )
- if other._key_cache is None:
- other._key_cache = _cmpkey(
- other._epoch,
- other._release,
- other._pre,
- other._post,
- other._dev,
- other._local,
- )
- return self._key_cache < other._key_cache
+ return self._get_key() < other._get_key()(and likewise for the other methods). This was slower, except for complex filtering, where it was all over the place:
|
|
Thanks for the review, I will address shortly.
Yeah, my current speculation is that even method look up in a sort adds measurable overhead, and on Python 3.12+ it's this performance that is being saved, not specifically that it's a property look up. |
|
I'm not sure why it's so variable based on Python version on that one benchmark (or why that benchmark seems so variable in the first place, need to look at it). I also tried a free function to avoid descriptors, def _get_key(ver: Version, /) -> CmpKey:
if ver._key_cache is None:
ver._key_cache = _cmpkey(
ver._epoch,
ver._release,
ver._pre,
ver._post,
ver._dev,
ver._local,
)
return ver._key_cacheand results are similar, sorting is much slower (20-30 percent), and |
|
Did you want to address the two comments above? :) |
Yes, but I have very low capacity to work on open source code right now, hopefully be back to doing OSS code soon. |
I was looking at #1082 to see if there were any performance gains to be made by implementing a simple comparison method, but almost all the performance improvements I found were related to skipping the
_keyproperty access.I really don't understand why accessing the
_keyproperty is so slow, butsorted(versions)on a list of Versions I found to be consistently 10-20% faster on Python 3.14.I'm raising for draft right now until I do some more research on why this has such as big impact and if there's a simpler way to get the gains.