-
Notifications
You must be signed in to change notification settings - Fork 0
Pilot 9050: fixup the preupload timeout causing resume issue #249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a resumable upload issue caused by network timeouts during pre-upload operations. When timeouts occur, the local manifest can become mismatched with the backend state, treating already-registered items as unregistered.
Changes:
- Updated duplication check API from v1 to v2 endpoint to separately identify ACTIVE vs REGISTERED items
- Fixed manifest key inconsistency (changed from
local_pathtoobject_pathfor unregistered items) - Added logic to reconcile registered items found on backend with local manifest during resume
- Updated Python version support from 3.10-3.11 to 3.10-3.13
- Consolidated imports and updated copyright years to 2026
Reviewed changes
Copilot reviewed 100 out of 101 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Version bump to 3.21.0 and Python version range update |
| poetry.lock | Dependency updates for new Poetry version and Python support |
| app/services/file_manager/file_upload/upload_client.py | API v2 migration, three-way return for duplication check |
| app/services/file_manager/file_upload/file_upload.py | Resume logic to handle registered items mismatch |
| app/services/file_manager/file_upload/exception.py | Added proper exception message to super().init |
| tests/app/services/file_manager/file_upload/test_upload_client.py | Updated tests for new three-way return signature |
| tests/app/services/file_manager/file_upload/test_file_upload.py | Added comprehensive tests for manifest batch processing |
| Multiple files | Copyright year updates and import consolidation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
also tested with batch size 50 with folder size of 10 to 1000. |
* merged * update duplicate logic to check based on item status * update COPYRIGHT * fixup the preupload timeout fails the resumable upload * fixup the resumable upload didn't output correct batches * update version matrix pipeline * update version matrix pipeline * Updated docs/compatible_version.ndjson with PR #249 * fixup the missing index when resume * Updated docs/compatible_version.ndjson with PR #249 * fixup the missing index when resume * Updated docs/compatible_version.ndjson with PR #249 * update version matrix * Updated docs/compatible_version.ndjson with PR #249 * fixup the cli version fetching * Updated docs/compatible_version.ndjson with PR #249 * fixup the cli version fetching * trigger pipelines --------- Co-authored-by: GitHub Actions <github-actions@users.noreply.indocresearch.org>
Summary
After investigation of this issue, it is caused by network timeout during the pre-upload in python version. The resumable json will treat the timeout batch AS unregistered, while backend already registered them(This issue only happened when timeout). Then mismatching happens and block resumable upload. Another PR will increase the timeout in bff-cli service. This PR is meant to fix the mismatching issue:
a. old cli version will still use v1 endpoint.
api updates:
JIRA Issues
Pilot 9050
Type of Change
Please delete options that are not relevant.
Testing
Are there any new or updated tests to validate the changes?
Test Directions
unittest:
manual testing:
manifest.jsonto create mismatching, moving registered items into unregistered fields.Versions