Description
First of all, thank you for running the GRIP service and providing it to the public for free. It helps researchers such as myself analyze BGP behavior in the wild!
I recently tried to build a script that fetches suspicious BGP events from the GRIP API (https://api.grip.inetintel.cc.gatech.edu/json/events). If I understand correctly, the service running the API is based on this repository.
The issue I encountered is that the returned results are not consistent across multiple identical requests. Looking at this repository, my assumption is that the results are not sorted deterministically. Sorting is based on the view_ts field only, so if multiple events share the same view_ts value, their relative order in the query result is undefined and can differ from one invocation to the next.
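To illustrate the suspected mechanism, here is a minimal Python sketch. It does not use the GRIP code or Elasticsearch: the shuffle merely stands in for whatever unspecified order the backend yields tied documents in, and the third event is made up so that the list contains one non-tied entry. The helper names are hypothetical.

```python
import random

# The first two ids are the ones observed in this issue; the third event is
# invented for illustration and has an older view_ts.
EVENTS = [
    {"id": "submoas-1653825300-11351=42960", "view_ts": 1653825300},
    {"id": "submoas-1653825300-132721=58678", "view_ts": 1653825300},
    {"id": "example-older-event", "view_ts": 1653825000},
]


def first_hit(events, key, rng):
    """Return the id of the top hit after sorting events that arrive in arbitrary order."""
    arrived = list(events)
    rng.shuffle(arrived)  # stands in for the backend's unspecified internal order
    return sorted(arrived, key=key)[0]["id"]


def by_view_ts(event):
    # view_ts descending only; tied events keep their arrival order.
    return -event["view_ts"]


def by_view_ts_then_id(event):
    # Same primary key, with the unique id as an ascending tiebreaker.
    return (-event["view_ts"], event["id"])


rng = random.Random(0)
without_tiebreaker = {first_hit(EVENTS, by_view_ts, rng) for _ in range(200)}
with_tiebreaker = {first_hit(EVENTS, by_view_ts_then_id, rng) for _ in range(200)}

print(sorted(without_tiebreaker))  # can contain both tied ids
print(sorted(with_tiebreaker))     # contains exactly one id
```

With a single sort key, the top hit depends on arrival order; with the id as a second key, the same event wins every time.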
Concrete Example of the issue:
Running the following query 10 times always returns an event with view_ts = 1653825300, but the event id may change between submoas-1653825300-11351=42960 and submoas-1653825300-132721=58678.
for x in $(seq 1 10); do wget --quiet -O - "https://api.grip.inetintel.cc.gatech.edu/json/events?event_type=submoas&start=1999&full=true&min_duration=300&max_duration=300&min_susp=80&max_susp=80&length=1&ts_start=2024-11-04T11:30:00&ts_end=2024-11-04T11:30:00" | python3 -m json.tool - | grep '"id"\|"view_ts"'; done
The script returns the following output for me:
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-132721=58678",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-132721=58678",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-11351=42960",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-132721=58678",
"view_ts": 1653825300
"view_ts": 1653825300
"id": "submoas-1653825300-132721=58678",
"view_ts": 1653825300
"view_ts": 1653825300
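For anyone who wants to repeat this check without eyeballing grep output, the following is a rough Python sketch that counts which ids come back over repeated identical requests. It makes no assumption about the response layout: like the grep above, it simply collects every value stored under an "id" key. The function names are my own, and the `fetch` parameter exists only so the logic can be tried without network access.

```python
import json
import urllib.request
from collections import Counter


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def collect_ids(node):
    """Yield every string or integer stored under an "id" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "id" and isinstance(value, (str, int)):
                yield value
            else:
                yield from collect_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_ids(item)


def count_returned_ids(url, repeats=10, fetch=fetch_json):
    """Issue the same request several times and count the ids that come back."""
    counts = Counter()
    for _ in range(repeats):
        counts.update(collect_ids(fetch(url)))
    return counts
```

Calling `count_returned_ids` with the query URL from the shell loop should yield a counter with a single key if the ordering is stable, and more than one key if it is not.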
Proposed solution:
When defining the sort order for the Elasticsearch backend, also define a tiebreaker (the event id) as a second sort key for events with identical view_ts, as suggested in the Elasticsearch documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#search-after):
"..., we recommend that you include a tiebreaker field in your sort. This tiebreaker field should contain a unique value for each document. If you don’t include a tiebreaker field, your paged results could miss or duplicate hits."
The following change in line https://github.com/InetIntel/grip-api-v2/blob/main/app/elastic.py#L339 should solve the issue:
Replace
'sort': "view_ts:desc"
with
'sort': ["view_ts:desc", "id:asc"]
Unfortunately, since I do not have access to the database and the Elasticsearch backend, I cannot test whether this change will effectively solve the issue or not.
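For reference, the same tiebreaker can also be expressed in the request-body form of Elasticsearch's sort clause. This is only a sketch under assumptions: I do not know how app/elastic.py assembles its query, the function names below are hypothetical, and it presumes that the id field is mapped with a sortable type (for example keyword) in the index.

```python
def build_sort(tiebreaker_field="id"):
    """Sort clause: newest view_ts first, ties broken by a unique field.

    Assumption: `tiebreaker_field` is unique per event and sortable in the index.
    """
    return [
        {"view_ts": {"order": "desc"}},
        {tiebreaker_field: {"order": "asc"}},
    ]


def build_search_body(query, start=0, length=1):
    """Assemble a search request body with deterministic ordering (hypothetical helper)."""
    return {
        "query": query,
        "from": start,    # pagination offset
        "size": length,   # number of hits to return
        "sort": build_sort(),
    }


def sort_query_param(tiebreaker_field="id"):
    """The same ordering as comma-separated field:direction pairs for the URL form."""
    return f"view_ts:desc,{tiebreaker_field}:asc"


body = build_search_body({"match_all": {}}, start=1999, length=1)
print(body["sort"])
print(sort_query_param())
```

Either form gives every document a total order, so a request with a fixed offset (such as start=1999 and length=1 in the example query) should land on the same event each time, provided the index itself does not change between requests.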