Description
When using the GRIP API (https://api.grip.inetintel.cc.gatech.edu/json/events), I found that when many events occur simultaneously, it is not (easily) possible to fetch all of them, because a single query returns at most 10000 results.
To work around this, I initially tried to split my query into smaller queries using the available search filters described in the API specification (https://github.com/InetIntel/grip-api-legacy/blob/master/api-spec.md). In particular, I tried to narrow down the time range (ts_start and ts_end), the event duration (min_duration and max_duration), the event type (event_type), and the suspicion level (min_susp and max_susp). However, even when narrowing each of these filters down to a single value, I still hit the limit of 10000 events.
Concrete example of the issue:
The following query, which fetches moas events for a specific duration (300 s) and suspicion level (80) at time 2024-11-04T11:30:00, returns 10000 events (indicating that more events exist at this point in time): https://api.grip.inetintel.cc.gatech.edu/json/events?event_type=moas&start=0&full=true&min_duration=300&max_duration=300&min_susp=80&max_susp=80&length=1&ts_start=2024-11-04T11:30:00&ts_end=2024-11-04T11:30:00
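For anyone who wants to reproduce this from a script, here is a minimal sketch that rebuilds the query above from its individual filters. The helper name build_events_url is hypothetical and only exists for illustration; the parameter names and values are taken from the URL above. It only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.grip.inetintel.cc.gatech.edu/json/events"


def build_events_url(ts, duration, susp, event_type="moas"):
    """Build a query that pins every filter to a single value.

    Hypothetical helper: it mirrors the example query from this issue,
    where min/max duration, min/max suspicion and ts_start/ts_end are
    each set to the same value.
    """
    params = {
        "event_type": event_type,
        "start": 0,
        "full": "true",
        "min_duration": duration,
        "max_duration": duration,
        "min_susp": susp,
        "max_susp": susp,
        "length": 1,
        "ts_start": ts,
        "ts_end": ts,
    }
    # safe=":" keeps the timestamp colons unescaped, as in the URL above.
    return BASE_URL + "?" + urlencode(params, safe=":")


url = build_events_url("2024-11-04T11:30:00", duration=300, susp=80)
print(url)
```

Since every filter is already narrowed to one value here, there is no further way to split this query with the existing parameters.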
Proposed solution:
Keep the record limit at 10000, but allow the user to fetch events beyond the record limit through the use of the Elasticsearch search_after feature (https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#search-after). If the other issue regarding the nondeterministic sort order (Inconsistent results caused by nondeterministic sort order #1) is addressed, this could be implemented as follows:
Allow the user to specify two optional GET parameters, search_after_view_ts and search_after_id, and then add the following logic at https://github.com/InetIntel/grip-api-v2/blob/main/app/elastic.py#L340:
search_after_view_ts = queryparams.get("search_after_view_ts", type=int)
search_after_id = queryparams.get("search_after_id", type=str)
if search_after_view_ts is not None and search_after_id is not None:
    kwargs["search_after"] = [search_after_view_ts, search_after_id]
Unfortunately, since I have access to neither the database nor the Elasticsearch backend, I cannot test whether this change will solve the issue.
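To illustrate how a client could use the proposed parameters, here is a rough sketch of the paging loop. It rests on several assumptions: that results are sorted deterministically by (view_ts, id), that each returned event exposes fields named view_ts and id (these field names are my guess, derived from the proposed parameter names), and that an empty page signals the end. The in-memory make_fake_backend stands in for the real API and is purely hypothetical.

```python
from typing import Callable, Iterator, Optional

Page = list[dict]
Cursor = Optional[tuple[int, str]]


def iter_all_events(fetch_page: Callable[[Cursor], Page]) -> Iterator[dict]:
    """Yield every event by following a search_after-style cursor.

    fetch_page(cursor) is assumed to return the next page of events that
    sort strictly after the cursor, or an empty list when none are left.
    """
    cursor: Cursor = None
    while True:
        page = fetch_page(cursor)
        if not page:
            return
        yield from page
        last = page[-1]
        # These two values would be sent as search_after_view_ts and
        # search_after_id in the next request (field names are assumed).
        cursor = (last["view_ts"], last["id"])


def make_fake_backend(events: Page, page_size: int) -> Callable[[Cursor], Page]:
    """Hypothetical stand-in for the API, sorted by (view_ts, id)."""
    ordered = sorted(events, key=lambda e: (e["view_ts"], e["id"]))

    def fetch_page(cursor: Cursor) -> Page:
        if cursor is None:
            remaining = ordered
        else:
            remaining = [e for e in ordered if (e["view_ts"], e["id"]) > cursor]
        return remaining[:page_size]

    return fetch_page


# Five events sharing two timestamps, fetched two at a time.
events = [
    {"view_ts": 1730719800, "id": "e3"},
    {"view_ts": 1730719800, "id": "e1"},
    {"view_ts": 1730719800, "id": "e2"},
    {"view_ts": 1730720100, "id": "e5"},
    {"view_ts": 1730720100, "id": "e4"},
]
fetched = list(iter_all_events(make_fake_backend(events, page_size=2)))
print([e["id"] for e in fetched])
```

The id tiebreaker is what makes this work when many events share the same view_ts: without a unique second sort key, events at the page boundary could be skipped or repeated, which is why this depends on the sort order issue being fixed first.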