Add utilities to convert KQL timespans to pandas timedeltas #874
Merged
Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix KQL timespans not converting to timedeltas" to "Add utility functions to convert KQL timespans to timedeltas" on Jan 21, 2026
Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
ianhelle approved these changes on Jan 21, 2026
Copilot AI changed the title from "Add utility functions to convert KQL timespans to timedeltas" to "Add utilities to convert KQL timespans to pandas timedeltas" on Jan 21, 2026
ianhelle approved these changes on Jan 29, 2026
ianhelle approved these changes on Jan 30, 2026
FlorianBracq requested changes on Jan 31, 2026
- Replace custom parse_timespan implementation with azure.kusto.data.helpers.parse_timedelta
- Simplify ensure_df_timedeltas to require explicit column names
- Add convert_timedeltas method to pandas accessor (df.mp.convert_timedeltas)
- Remove auto-detection logic for timespan columns
- Update tests to match simplified implementation

- Add drop_duplicates(subset=['query']) before merge in get_whois_df to prevent row multiplication from duplicate whois results
- Change net_df fixture scope from module to function for test isolation with random sampling
- Add autouse fixture to clear LRU caches (get_whois_info, _whois_lookup) between tests to prevent state leakage
FlorianBracq approved these changes on Feb 2, 2026
FlorianBracq (Collaborator) left a comment:
That looks OK for me now!
KQL returns timespans as strings: `"hh:mm:ss.fffffff"` for < 1 day, `"d.hh:mm:ss.fffffff"` for >= 1 day. Pandas `to_timedelta()` fails on the latter format with `ValueError: expecting hh:mm:ss format, received: 1.00:00:00`.
Changes
Added to `msticpy.common.data_utils`:
- `parse_timespan(timespan)` - Parses KQL timespan strings to `pd.Timedelta`
- `ensure_df_timedeltas(data, columns=None)` - Converts DataFrame columns to `timedelta64[ns]`
  - `pd.to_timedelta()` first, falls back to element-wise for large timespans
  - `ensure_df_datetimes()` API
Tests:
Usage
Design Notes
Original prompt
This section details the original issue you should resolve
<issue_title>[Bug]: `msticpy` KQL timespans are not implicitly converted to timedeltas</issue_title><issue_description>Describe the bug
KQL datetime types are implicitly converted to a pandas/numpy datetime type, but timespans are not converted and are returned as strings.
To Reproduce
Setup:
Issue:
Output:
Expected behavior
Screenshots and/or Traceback
From a test notebook: msticpy_large_timespan_type_conversion_fails.zip
Environment (please complete the following information):
Additional context
It's arguable if this should be classified as a feature request vs bug. However, given the behavior for datetime64, I argue that the lack of consistency and convenience makes it a bug.
Another problem is that for larger timespan values, pandas does not support the string format when converting to timedeltas.
E.g. a "ValueError: expecting hh:mm:ss format, received: 1.00:00:00" exception occurs for a 1 day timespan.
Demo in notebook: msticpy_large_timespan_type_conversion_fails.zip
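To make the format problem concrete, here is a minimal standard-library sketch that parses both timespan shapes mentioned in this thread ("hh:mm:ss.fffffff" and "d.hh:mm:ss.fffffff"). It is not the wrapper from the issue and not the implementation merged in this PR. The function names and the regular expression are assumptions made for illustration, and the optional leading minus sign is an assumption about how negative timespans are written.

```python
import re
from datetime import timedelta

# Assumed layout: optional "-", optional "d." day prefix, hh:mm:ss, and an
# optional fraction of up to 7 digits (100-nanosecond ticks).
_TIMESPAN_RE = re.compile(
    r"^(?P<sign>-)?"
    r"(?:(?P<days>\d+)\.)?"
    r"(?P<hours>\d{1,2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})"
    r"(?:\.(?P<frac>\d{1,7}))?$"
)


def timespan_to_nanoseconds(value: str) -> int:
    """Convert a timespan string to a signed integer count of nanoseconds."""
    match = _TIMESPAN_RE.match(value.strip())
    if match is None:
        raise ValueError(f"not a recognised timespan string: {value!r}")
    parts = match.groupdict()
    total_seconds = (
        int(parts["days"] or 0) * 86_400
        + int(parts["hours"]) * 3_600
        + int(parts["minutes"]) * 60
        + int(parts["seconds"])
    )
    # Right-pad the fraction to 9 digits so it reads directly as nanoseconds.
    frac_ns = int((parts["frac"] or "").ljust(9, "0"))
    total_ns = total_seconds * 1_000_000_000 + frac_ns
    return -total_ns if parts["sign"] else total_ns


def timespan_to_timedelta(value: str) -> timedelta:
    """Convert a timespan string to datetime.timedelta.

    datetime.timedelta has microsecond resolution, so the seventh
    fractional digit (100 ns) is truncated.
    """
    total_ns = timespan_to_nanoseconds(value)
    result = timedelta(microseconds=abs(total_ns) // 1_000)
    return -result if total_ns < 0 else result
```

When full precision matters, the nanosecond count can be handed to pandas with `pd.Timedelta(ns, unit="ns")`, which avoids the string format that `to_timedelta()` rejects.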
Client wrapper function to help compensate.