
fix(import): crash recovery, stale detection, and InfluxDB v3 timestamps #51

Open
Aliaksei-Kharlap wants to merge 1 commit into main from fix/issues-45-46-47


@Aliaksei-Kharlap
Collaborator

Fix crash recovery, stale import detection, and InfluxDB v3 timestamp compatibility for the import plugin.

Bug fixes

  • Crash recovery: On an unhandled error, the plugin now writes a paused state before exiting, so the import can be resumed once the underlying issue is fixed. Previously, a crash left
    the import stuck in "running" forever.
  • Stale import detection: resume now checks whether a "running" import is actually stale (last update >5 min ago) and, if so, allows the resume instead of returning "already running".
    If no checkpoint exists, the import restarts from the beginning.
  • Missing import_state table on resume: The query is now wrapped in try/except. If the table doesn't exist yet (the plugin crashed before processing any tables), the import restarts.
  • InfluxDB v3 timestamps: paused_at_time extraction now handles both int nanoseconds (v1/v2) and ISO 8601 strings (v3).
  • Row-level errors: Invalid rows in convert_influxql_to_line_protocol are now skipped with a warning instead of crashing the entire batch.
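
As a rough illustration of the timestamp bullet above, handling both shapes might look like the sketch below. The function name `to_epoch_ns` and the regular expression are hypothetical and not taken from the plugin. The sketch assumes v3 returns ISO 8601 strings with up to nine fractional digits, and that strings without an offset are UTC.

```python
import re
from datetime import datetime

# Hypothetical pattern: date, time, optional fraction (1-9 digits), optional offset.
_ISO_RE = re.compile(
    r"^(?P<base>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2})"
    r"(?:\.(?P<frac>\d{1,9}))?"
    r"(?P<tz>Z|[+-]\d{2}:\d{2})?$"
)


def to_epoch_ns(value):
    """Normalize a timestamp to integer nanoseconds since the Unix epoch."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("timestamp must be int or str, not bool")
    if isinstance(value, int):
        # Assumed v1/v2 shape: already integer nanoseconds.
        return value
    if isinstance(value, str):
        match = _ISO_RE.match(value.strip())
        if match is None:
            raise ValueError(f"unrecognized timestamp: {value!r}")
        base = match.group("base").replace(" ", "T")
        tz = match.group("tz") or "Z"  # assumption: no offset means UTC
        if tz == "Z":
            tz = "+00:00"
        # Parse whole seconds only; the fraction is handled as an integer
        # below so nanosecond digits are not lost to float rounding.
        seconds = int(datetime.fromisoformat(base + tz).timestamp())
        frac_ns = int((match.group("frac") or "").ljust(9, "0"))
        return seconds * 1_000_000_000 + frac_ns
    raise TypeError(f"timestamp must be int or str, not {type(value).__name__}")
```

Parsing the fractional part separately matters because `datetime` only keeps microseconds, while the `last_update` values in this PR are nanosecond integers.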

Improvements

  • Added completed state to import_pause_state — pause/cancel/resume now return errors for already-completed imports.
  • Extracted _write_import_pause_state() helper — replaced 8 duplicate LineBuilder patterns; fixed a missing time_ns() call in one resume_import path.
  • Removed duplicate check_pause_state() function (~48 lines) — replaced with existing get_import_pause_state().
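
One way to picture the completed state added above is as a small guard that each control action runs first. The `check_action_allowed` helper, the `TERMINAL_STATES` set, and the error wording are assumptions for illustration only; the response shape mirrors the error JSON that the import endpoint returns elsewhere in this thread.

```python
# Hypothetical set of states that no control action may leave.
TERMINAL_STATES = {"completed"}

CONTROL_ACTIONS = ("pause", "cancel", "resume")


def check_action_allowed(import_id, action, state):
    """Return an error payload if the action is not allowed, else None."""
    if action not in CONTROL_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if state in TERMINAL_STATES:
        # A finished import cannot be paused, cancelled, or resumed.
        return {
            "status": "error",
            "error": f"Import {import_id} is already {state}",
        }
    return None
```

Running the guard before any other check means a finished import reports that it is completed rather than falling through to an "already running" response.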

Docs

  • Updated README: crash recovery section, stale detection (5-min threshold), troubleshooting for "already running" after crash, completed state handling.

@caterryan
Collaborator

Closes #45 #46 #47

@caterryan
Collaborator

caterryan commented Mar 10, 2026

This PR does not fix #46.

curl -H "Authorization: Bearer $INFLUXDB3_TOKEN" \
         -H "Content-Type: application/json" \
         -X GET "http://localhost:$INFLUXDB3_PORT/api/v3/engine/import?action=status&import_id=e60ff739-fe6c-43c4-9a12-437f1de73364"
{"import_id": "e60ff739-fe6c-43c4-9a12-437f1de73364", "overall_status": "completed", "summary": {"total_tables": 1, "completed_tables": 1, "in_progress_tables": 0, "paused_tables": 0, "cancelled_tables": 0, "pending_tables": 0, "total_rows_imported": 5001, "progress_percentage": 100.0}, "timing": {"started_at": 1773179493947118080, "last_updated_at": 1773179494980521984, "duration_seconds": 1.03}, "config": {"source_url": "http://localhost:8381", "source_database": "demo", "dest_database": "imports", "start_timestamp": "", "end_timestamp": "", "import_direction": "oldest_first", "target_batch_size": 5000, "query_interval_ms": 1, "table_filter": "measurement_xs"}, "pause_state": {"is_paused": false, "is_cancelled": false}, "table_details": [{"table_name": "measurement_xs", "status": "completed", "rows_imported": 5001, "last_update": 1773179494980521984, "paused_at_time": null}]}

curl -H "Authorization: Bearer $INFLUXDB3_TOKEN" \
         -H "Content-Type: application/json" \
         -X POST "http://localhost:$INFLUXDB3_PORT/api/v3/engine/import?action=resume&import_id=e60ff739-fe6c-43c4-9a12-437f1de73364" -d '{"source_username": "demo", "source_password": "demo1234" }'
{"status": "error", "error": "Import e60ff739-fe6c-43c4-9a12-437f1de73364 is already running"}

@caterryan
Collaborator

This PR does not fix #47.

curl -H "Authorization: Bearer $INFLUXDB3_TOKEN" \
         -H "Content-Type: application/json" \
         -X GET "http://localhost:$INFLUXDB3_PORT/api/v3/engine/import?action=status&import_id=230eb551-ffa9-47e7-9b20-dbe4d152dd0c"
{"import_id": "230eb551-ffa9-47e7-9b20-dbe4d152dd0c", "overall_status": "running", "summary": {"total_tables": 1, "completed_tables": 0, "in_progress_tables": 1, "paused_tables": 0, "cancelled_tables": 0, "pending_tables": 0, "total_rows_imported": 30006, "progress_percentage": 0.0}, "timing": {"started_at": 1773179821166896128, "last_updated_at": 1773179827367419904, "duration_seconds": 6.2}, "config": {"source_url": "http://localhost:8381", "source_database": "demo", "dest_database": "imports", "start_timestamp": "", "end_timestamp": "", "import_direction": "oldest_first", "target_batch_size": 5000, "query_interval_ms": 1000, "table_filter": "measurement_l"}, "pause_state": {"is_paused": false, "is_cancelled": false}, "table_details": [{"table_name": "measurement_l", "status": "in_progress", "rows_imported": 30006, "last_update": 1773179827367419904, "paused_at_time": null}]}

curl -H "Authorization: Bearer $INFLUXDB3_TOKEN" \
         -H "Content-Type: application/json" \
         -X POST "http://localhost:$INFLUXDB3_PORT/api/v3/engine/import?action=resume&import_id=230eb551-ffa9-47e7-9b20-dbe4d152dd0c" -d '{"source_username": "demo", "source_password": "demo1234" }'
{"status": "error", "error": "Import 230eb551-ffa9-47e7-9b20-dbe4d152dd0c is already running"}

@Aliaksei-Kharlap
Collaborator Author

#47: I added a check on the latest entry in the import_state table (an entry is written for each batch, i.e. each time interval for which data is collected). If the latest entry is less than 5 minutes old, is_running is returned; if it is more than 5 minutes old, the migration can continue. @caterryan How long did you wait before submitting the resume request?
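
The check described in this comment can be sketched roughly as follows. The names `is_stale` and `STALE_AFTER_NS` are hypothetical; only the 5-minute threshold and the nanosecond timestamps come from this PR. Treating an age of exactly 5 minutes as "not stale" is also an assumption.

```python
import time

# 5-minute threshold from the PR description, expressed in nanoseconds.
STALE_AFTER_NS = 5 * 60 * 1_000_000_000


def is_stale(last_update_ns, now_ns=None):
    """Return True if the last import_state entry is older than the threshold."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Assumption: exactly 5 minutes old still counts as running.
    return now_ns - last_update_ns > STALE_AFTER_NS


# Illustrative values: last_update taken from the status output above,
# with resume requests sent 10 seconds and 6 minutes later.
last_update = 1773179827367419904
ten_seconds_later = last_update + 10 * 1_000_000_000
six_minutes_later = last_update + 6 * 60 * 1_000_000_000

print(is_stale(last_update, ten_seconds_later))   # still treated as running
print(is_stale(last_update, six_minutes_later))   # stale, resume may proceed
```

Under this logic, a resume request sent shortly after the last batch was written is rejected as "already running", even if the process has in fact crashed.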

@Aliaksei-Kharlap
Collaborator Author

#46: That's strange. I'll double-check.

@caterryan
Collaborator

> #47: I added a check on the latest entry in the import_state table (an entry is written for each batch, i.e. each time interval for which data is collected). If the latest entry is less than 5 minutes old, is_running is returned; if it is more than 5 minutes old, the migration can continue. @caterryan How long did you wait before submitting the resume request?

Ah, that must be the problem. I didn't wait long enough.
