Contributor

@CarloMariaProietti CarloMariaProietti commented Dec 19, 2025

Fixed version of #1636

Fixes #1492
The idea is the following:
ValueColumnInternal is an interface for statistic values; keeping them behind this interface means they are not exposed publicly.
Implementations of ValueColumnInternal contain the actual cache.

Two caches were needed for each statistic (for the moment only max), because computing the statistic may give different results depending on the skipNaN Boolean parameter.

I implemented the solution by overloading aggregateSingleColumn. The overload wraps the original aggregateSingleColumn so that the caches can be used.

For the moment there is only max; however, it would be easy to do the same for min, sum, mean and median.
Something similar could be done for percentile and std.
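
To make the description above easier to follow, here is a minimal sketch of the general pattern: an internal interface carries the cache slots, and a wrapper around the uncached computation consults one of two slots depending on skipNaN. This is not the code from this PR. Every name in it (SimpleColumn, CachedStats, SimpleColumnImpl, columnOf, computeMax, cachedMax, computations) is hypothetical, and the real aggregateSingleColumn overload may look quite different.

```kotlin
// Hypothetical illustration only; none of these names come from the DataFrame library.

// Public column type: exposes the values, but no cache.
interface SimpleColumn {
    val values: List<Double>
}

// Internal view holding one cache slot per skipNaN setting (only max here).
// A null slot means "not computed yet".
internal interface CachedStats {
    var maxSkippingNaN: Double?
    var maxKeepingNaN: Double?
}

internal class SimpleColumnImpl(override val values: List<Double>) : SimpleColumn, CachedStats {
    override var maxSkippingNaN: Double? = null
    override var maxKeepingNaN: Double? = null
}

// Factory returning the public type, so callers never see the cache.
fun columnOf(vararg values: Double): SimpleColumn = SimpleColumnImpl(values.toList())

// Counts full computations, so the effect of caching can be observed.
var computations = 0

// The uncached computation. Returns null for an empty input.
fun computeMax(values: List<Double>, skipNaN: Boolean): Double? {
    computations++
    val candidates = if (skipNaN) values.filterNot { it.isNaN() } else values
    if (candidates.any { it.isNaN() }) return Double.NaN
    return candidates.maxOrNull()
}

// Wrapper that reuses the uncached computation and stores its result.
fun cachedMax(column: SimpleColumn, skipNaN: Boolean): Double? {
    // Columns without cache slots fall back to the plain computation.
    val cache = column as? CachedStats ?: return computeMax(column.values, skipNaN)
    val cached = if (skipNaN) cache.maxSkippingNaN else cache.maxKeepingNaN
    if (cached != null) return cached
    val result = computeMax(column.values, skipNaN)
    if (skipNaN) {
        cache.maxSkippingNaN = result
    } else {
        cache.maxKeepingNaN = result
    }
    return result
}
```

In this sketch, a column containing NaN yields different answers for the two skipNaN settings, which is why a single slot would not be enough. A null result (empty column) is not cached here, so it is recomputed on every call.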


@JvmInline
internal value class StatisticResult(val value: Any?)

Contributor Author

@CarloMariaProietti CarloMariaProietti Dec 28, 2025

The ParameterValue class was created so that it could override equals and hashCode. That way, two Map<String, ParameterValue> instances can be compared correctly.

Collaborator

If you comment, please leave the comment at exactly the relevant line :) Looking at this comment, I thought it belonged to StatisticResult.

@JvmInline
internal value class StatisticResult(val value: Any?)

public class ParameterValue(public val parameter: Any?) {
Collaborator

Why is this necessary? Booleans, doubles and integers are primitives. They already have a consistent hashCode and a correctly working equals.

Collaborator

It's also perfectly fine to use a map of primitives as the key of another map, as the equality and hash code of a Map depend solely on its contents.
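
A small sketch of this point, using only the Kotlin standard library. The parameter names ("skipNaN", "percentile") and the stored value are made up for illustration. Two maps built separately, with different insertion order and even different implementations, compare equal and hash the same, so either one can be used to look up an entry stored under the other.

```kotlin
// Two separately built parameter maps with the same entries.
// The keys and values are hypothetical examples.
val first: Map<String, Any> = mapOf("skipNaN" to true, "percentile" to 25.0)
val second: Map<String, Any> = hashMapOf("percentile" to 25.0, "skipNaN" to true)

// A map whose keys are themselves parameter maps.
val cache: MutableMap<Map<String, Any>, Double> = mutableMapOf(first to 42.0)

// Looks up a cached value by parameter map; returns null when nothing is stored.
fun lookup(parameters: Map<String, Any>): Double? = cache[parameters]
```

Here `lookup(second)` finds the entry stored under `first`, while a map with different contents, such as `mapOf("skipNaN" to false)`, finds nothing. No wrapper class with custom equals and hashCode is needed for values like Boolean, Int or Double.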

Contributor Author

@CarloMariaProietti CarloMariaProietti Jan 2, 2026

I understand; that class was actually useless. Fixed in the new commit. I chose a non-nullable Any because each time an entry is created the actual parameter is non-null.

Development

Successfully merging this pull request may close these issues.

Lazy statistics for columns

2 participants