Better curation functionality - removing volumes #3

@ehenneken

Example message from ADSScanExplorerPipeline.ingestor.log:

{"asctime": "2024-10-21T20:16:13.362Z", "name": "ADSScanExplorerPipeline.ingestor", "processName": "ForkPoolWorker-2", "filename": "ingestor.py", "funcName": "identify_journals", "levelname": "ERROR", "lineno": 280, "module": "ingestor", "threadName": "MainThread", "message": "Error checking file hash on top file: /proj/ads/articles/lists/seri/Astr./Astr.0200.top due to [Errno 2] No such file or directory: '/proj/ads/articles/bitmaps/seri/Astr./0200/600'", "timestamp": "2024-10-21T20:16:13.362Z", "hostname": "0653bcda4d2f"}

This is the result of bad data in the back office, where it is resolved by removing the conflicting file (Astr.0200.top). However, the volume remains in the Scan Explorer database, so the next update run will attempt the same check and generate the same error message.
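To see which volumes are affected, the ingestor log can be scanned for this error. The following is a minimal sketch, not part of the pipeline: it assumes the log is one JSON object per line with the fields shown in the example message, and that the volume ID is the `.top` file name without its extension (Astr.0200.top gives Astr.0200). The function name `failing_volume_ids` is hypothetical.

```python
import json
import re
from pathlib import PurePosixPath

# Matches the message text from the example log record above.
TOP_FILE_RE = re.compile(r"Error checking file hash on top file: (\S+\.top) due to")


def failing_volume_ids(log_lines):
    """Return the distinct volume IDs named in hash-check ERROR records.

    Assumption: the volume ID equals the stem of the .top file name.
    """
    ids = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON record
        if not isinstance(record, dict) or record.get("levelname") != "ERROR":
            continue
        match = TOP_FILE_RE.search(record.get("message", ""))
        if match:
            volume_id = PurePosixPath(match.group(1)).stem
            if volume_id not in ids:
                ids.append(volume_id)
    return ids
```

Called as `failing_volume_ids(open("ADSScanExplorerPipeline.ingestor.log"))`, this would list the candidate volumes to remove from the database.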

Proposal: add a command-line option that removes a given volume from the database, perhaps something like

python run.py REMOVE --id=Astr.0200
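A rough sketch of how such a command could be wired up is shown below. It is only an illustration under stated assumptions: the real Scan Explorer schema and the existing structure of run.py are not described here, so the `volume` and `page` tables, their columns, and the functions `build_parser`, `remove_volume`, and `create_demo_schema` are all hypothetical, and sqlite3 merely stands in for the actual database.

```python
import argparse
import sqlite3
import sys


def build_parser() -> argparse.ArgumentParser:
    """Parser for the proposed `REMOVE --id=<volume>` invocation."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("action", choices=["REMOVE"])
    parser.add_argument("--id", dest="volume_id", required=True,
                        help="volume identifier, e.g. Astr.0200")
    return parser


def create_demo_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical two-table schema, used only to make the sketch runnable."""
    conn.executescript(
        """
        CREATE TABLE volume (id TEXT PRIMARY KEY);
        CREATE TABLE page (name TEXT PRIMARY KEY, volume_id TEXT NOT NULL);
        """
    )


def remove_volume(conn: sqlite3.Connection, volume_id: str) -> int:
    """Delete a volume and its dependent rows; return the number of volumes removed."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM page WHERE volume_id = ?", (volume_id,))
        cur = conn.execute("DELETE FROM volume WHERE id = ?", (volume_id,))
    return cur.rowcount


def main(argv, conn: sqlite3.Connection) -> int:
    args = build_parser().parse_args(argv)
    removed = remove_volume(conn, args.volume_id)
    if removed == 0:
        print(f"No volume with id {args.volume_id!r} found", file=sys.stderr)
        return 1
    return 0
```

Removing dependent rows and the volume row in one transaction means a failed removal leaves the database unchanged, and returning a non-zero status for an unknown ID makes typos visible to whoever runs the command.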

Labels: enhancement
