ils: library KPIs by JakobMiesner · Pull Request #104 · inveniosoftware/rfcs

JakobMiesner · 2025-09-15T12:40:23Z

kpsherva · 2025-09-21T16:57:47Z

rfcs/ils-0104-library-kpis.md

+2. Outsider/Stakeholder Dashboard
+The Audience of this Dashboard is not the librarians, but rather the patrons and management.
+It displays simpler KPIs that show if the library is working well, while being less detailed and technical than the internal dashboard.


I am not sure which KPIs would fit in the internal dashboard and which in the external one. Was this voiced as a requirement?

No, but since this RFC only handles the API endpoints, this does not affect its implementation.


kpsherva · 2025-09-21T17:05:44Z

rfcs/ils-0104-library-kpis.md

+          - `after_record_insert`
+          - `after_record_update`
+          - `after_record_delete`
+        - aggregate:


Could you clarify this part? For loans, we already have all the loans indexed with the creation date, so why do we need to generate stats events? Couldn't we just query the loan search and get the answer? Is it because we need it as a number per day?

Because:

1. With the way this will be implemented, we can easily track the creation, update and deletion of all types of records in ILS. While we could also search by the creation date of loans, this is not as easy for updates and deletions. If we also use this stat for loan creations, we stay consistent with how the stat is implemented for the other record types.
2. Deleted records would no longer show up if we just looked at the creation date. While this is no problem for loans, as they cannot be deleted, it is a problem for other record types (e.g. documents).
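
The point about deleted records can be illustrated with a small, self-contained sketch. It does not use invenio-stats or the real `invenio_records` signals; `records`, `events`, and the record IDs are hypothetical stand-ins, assumed only to show why an append-only event log keeps counts that a creation-date query over live records loses after a deletion.

```python
from collections import Counter
from datetime import date

# Hypothetical stand-in for the search index of live records: id -> creation date.
records = {}
# Hypothetical stand-in for the stats events index: append-only (action, day) pairs.
events = []


def insert(record_id, day):
    """Create a record and log an insert event."""
    records[record_id] = day
    events.append(("insert", day))


def delete(record_id, day):
    """Remove a record from the live index and log a delete event."""
    del records[record_id]
    events.append(("delete", day))


def created_per_day_from_records():
    """Count creations using only records that still exist."""
    return Counter(records.values())


def per_day_from_events(action):
    """Count events of one action type per day from the event log."""
    return Counter(day for act, day in events if act == action)


insert("doc-1", date(2025, 9, 1))
insert("doc-2", date(2025, 9, 1))
delete("doc-1", date(2025, 9, 3))

# The live index only sees doc-2; the event log still sees both insertions.
print(created_per_day_from_records()[date(2025, 9, 1)])
print(per_day_from_events("insert")[date(2025, 9, 1)])
print(per_day_from_events("delete")[date(2025, 9, 3)])
```

After the deletion, the creation-date query undercounts the insertions on 2025-09-01, while the event-based counts are unaffected.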

kpsherva · 2025-09-21T17:11:07Z

rfcs/ils-0104-library-kpis.md

+
+### The specific KPIs are implemented as follows
+1. Turnover rate of the Library collection:
+   1. number of new loans / number of loanable items


It would be great to note somewhere what outcome this KPI is expected to measure. As in: why do we divide by the number of loanable items?

This KPI is used to measure the rate of use of the collection.
It is described in ISO 11620:2023 A.2.1.1. I will also mention the ISO standard in the RFC.

Please note down what we decided for the implementation of the number of loanable items per unit of time.
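
As a rough illustration of the ratio discussed in this thread, the sketch below computes a turnover rate for one period. The function name and the sample figures are hypothetical, and how the number of loanable items per unit of time is determined is exactly the open point noted above, so it is simply passed in as a parameter.

```python
def turnover_rate(new_loans: int, loanable_items: int) -> float:
    """Number of new loans in a period divided by the number of loanable items."""
    if loanable_items <= 0:
        raise ValueError("loanable_items must be positive")
    return new_loans / loanable_items


# Hypothetical figures: 1200 new loans in the period, 4800 loanable items.
print(turnover_rate(1200, 4800))  # 0.25
```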

kpsherva · 2025-09-21T17:13:52Z

rfcs/ils-0104-library-kpis.md

+           - aggregate:
+             - count
+             - daily
+             - over composite field `loan_creation_method__document_availability_during_loan_creation`:


Let's go through this section together IRL.

kpsherva · 2025-09-21T17:15:42Z

rfcs/ils-0104-library-kpis.md

+                  - we also add information about the provider to the event for interlibrary loans, so future aggregations can differentiate the waiting time based on the provider
+
+
+4. Number of changes to the Library collections:


I believe this section is missing the curator's ID, which should also be stored: we are frequently asked for stats for an individual curator.

I will ask whether this is wanted.


kpsherva · 2025-09-21T17:39:23Z

rfcs/ils-0104-library-kpis.md

+## Drawbacks
+
+### Periodic Stats
+By implementing periodic stats to be added as events, it is easy to run into situations, where invenio-stats always only aggregates one document per time period.


not sure I understand this part, let's chat


kpsherva · 2025-09-21T17:45:23Z

rfcs/ils-0104-library-kpis.md

+So a loan would also contain the field `waiting_time`.
+This would allow to directly query the records indices for the KPIs, without the need of an additional stats index.
+But this would introduce a lot of fields to the records indices, which are only used for KPIs.
+Additionally, the dashboard would need to perform a lot of queries and aggregations, which might overload the search system.


The stats will also be queried through search; can you explain how this is different?

The stats served by invenio-stats are already partially aggregated (e.g. on a daily basis).
With invenio-stats, a request for the number of new loans in a month leads to an aggregation that has to take a maximum of 31 documents into account.
Without invenio-stats, the query triggers an aggregation over all new loans in those 31 days, which might be a large number of documents.

I updated the RFC to better describe this.
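
The difference in the number of documents touched can be sketched in plain Python. This is only an illustration under assumed data: `raw_loans` and `daily_counts` are hypothetical stand-ins for the loan index and a daily pre-aggregated stats index, and no invenio-stats API is used.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical raw data: one entry per new loan, holding its day of the month.
raw_loans = [random.randint(1, 31) for _ in range(10_000)]

# Daily pre-aggregation: one bucket (document) per day with a loan count.
daily_counts = Counter(raw_loans)

# Monthly total from the raw documents: has to look at every loan.
total_from_raw = len(raw_loans)

# Monthly total from the pre-aggregated buckets: at most 31 documents.
total_from_daily = sum(daily_counts.values())

print(len(raw_loans), len(daily_counts))
print(total_from_raw == total_from_daily)
```

Both paths give the same monthly total, but the second one only reads the daily buckets.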


kpsherva · 2025-09-21T17:53:52Z

rfcs/ils-0104-library-kpis.md

+
+### KPIs
+
+#### KPI 3.1 - Extracting for loan creation method


Let's discuss this part.

kpsherva · 2025-09-21T17:54:36Z

rfcs/ils-0104-library-kpis.md

+
+Alternatively, we could just listen to the signal `after_record_insert` from `invenio_records`, filter for loans and only during event generation or preprocessing extract the creation method. (Unsure if possible)
+
+#### Median vs. Average


Was this discussed with the librarians? Is it possible to easily have both, or to leave it up to the dashboard "client"?

Average was the one requested in the ticket, but we could also add both. This also depends on this discussion.

kpsherva · 2025-09-21T17:55:23Z

rfcs/ils-0104-library-kpis.md

+
+#### Aggregation period
+We aggregate most stats on a daily basis.
+An exception of this are the loan durations and waiting times, which are aggregated monthly.


What is the motivation behind monthly aggregation? What do we gain/lose by aggregating daily too?

While it does not really matter for the average, it matters for the median (discussed here).
For the average, we offer two separate queries that allow the dashboard to compute the average x/y:

- sum of the metric in the index: x
- count of documents in the index: y

We could aggregate both numbers daily, and the dashboard could still display the average for the whole month by computing $\frac{x_1 + x_2 + \dots + x_{31}}{y_1 + y_2 + \dots + y_{31}}$.
For the median, however, we have to decide the granularity during the aggregation, and daily does not really make sense here.
As we might want to add the median, and the StatAggregator in invenio-stats currently allows only one granularity per aggregation, I decided to go for monthly.
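
To make the average/median distinction concrete, here is a minimal sketch with made-up numbers. `durations_by_day` is hypothetical sample data, not output of invenio-stats; the sketch only shows that daily sums and counts are enough to recover the period average, while daily medians are not enough to recover the period median.

```python
from statistics import median

# Hypothetical loan durations (in days), grouped by day of aggregation.
durations_by_day = {
    1: [2, 4, 30],
    2: [7],
    3: [14, 14, 21, 28],
}

# Daily aggregates: sum of the metric (x) and count of documents (y).
daily_sum = {day: sum(values) for day, values in durations_by_day.items()}
daily_count = {day: len(values) for day, values in durations_by_day.items()}

# The period average can be recovered from the daily aggregates alone.
avg_from_daily = sum(daily_sum.values()) / sum(daily_count.values())
all_values = [x for values in durations_by_day.values() for x in values]
avg_from_raw = sum(all_values) / len(all_values)

# The period median cannot: the median of daily medians differs in general.
median_of_daily_medians = median(median(v) for v in durations_by_day.values())
median_from_raw = median(all_values)

print(avg_from_daily, avg_from_raw)
print(median_of_daily_medians, median_from_raw)
```

In this sample the two averages agree, whereas the median of the daily medians differs from the median over all values, which is why the granularity has to be fixed at aggregation time for the median.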

ntarocco · 2025-09-23T12:36:22Z

rfcs/ils-0104-library-kpis.md

+
+By the design of `invenio-stats`, all stats are aggregated.
+Currently, this aggregation is always done over a certain `field` (a field to group the documents in the events index by).
+Some of our KPIs do not have such a `field`, as all documents should be grouped together.


Can you please explain the expected change more clearly? It is unclear what "all documents should be grouped together" means in practice.

Please see the subsection "Global Aggregation - global-aggregation" and the attached PR.
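
One way to read "grouped together" is sketched below in plain Python. This is an assumption for illustration, not the invenio-stats API or the change in the attached PR: `aggregate_by_field` produces one bucket per value of a grouping `field`, while `aggregate_globally` puts every event into a single bucket. The event shape and the `pid_type` values are hypothetical.

```python
from collections import Counter

# Hypothetical stats events; "pid_type" plays the role of the grouping `field`.
events = [
    {"pid_type": "docid"},
    {"pid_type": "docid"},
    {"pid_type": "pitmid"},
    {"pid_type": "loanid"},
]


def aggregate_by_field(events, field):
    """Group events by the value of `field` and count each group."""
    return Counter(event[field] for event in events)


def aggregate_globally(events):
    """Treat all events as one group and return a single count."""
    return len(events)


print(aggregate_by_field(events, "pid_type"))
print(aggregate_globally(events))
```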

JakobMiesner force-pushed the feature/ils-library-kpis branch 3 times, most recently from 868574c to 9471bc3 · September 17, 2025 15:03