feat(comments): Add runner for comments migration separately #380

Open
sakshamarora1 wants to merge 2 commits into CERNDocumentServer:master from sakshamarora1:feature/comments_migration

sakshamarora1 commented Feb 2, 2026

closes: #286

Steps

  1. Update the collection queries for a collection, retrieve all the comments for the records found, and create a JSON metadata file:

ipython ./scripts/dump_comments_to_migrate.py

Output file: comments_metadata.json
A second output file is generated for the missing users: users_metadata.json

  2. For missing users:
    The users_metadata.json file is read, and then this script (with some tweaks) can be run to find the users that are missing in the new system.
    https://gitlab.cern.ch/cds-team/production_scripts/-/blob/master/cds-rdm/migration/dump_users.py?ref_type=heads

  3. Create those users using: (NOT CONFIRMED)

invenio migration users people-run --filepath cds_migrator_kit/rdm/data/users/missing_users.json --dirpath cds_migrator_kit/rdm/data/users/dump

  4. Place comments_metadata.json in /eos/media/cds/cds-rdm/<env>/migration/<collection>/comments/

  5. SET config vars:

  6. Finally migrate those comments:

invenio migration comments --filepath /eos/media/cds/cds-rdm/<env>/migration/<collection>/comments/comments_metadata.json --dirpath /eos/media/cds/cds-rdm/<env>/migration/<collection>/comments/ --dry-run
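
The final command repeats the same directory twice, so it can help to build it from the two placeholders. A minimal sketch, assuming the command and directory layout shown above; `build_comments_command` is a hypothetical helper, the environment and collection values are placeholders, and treating `--dry-run` as an optional flag is an assumption.

```python
from pathlib import PurePosixPath

# Base directory taken from the steps above; <env> and <collection>
# are supplied by the caller.
EOS_BASE = PurePosixPath("/eos/media/cds/cds-rdm")


def build_comments_command(env, collection, dry_run=True):
    """Return the argv list for the comments migration (hypothetical helper)."""
    comments_dir = EOS_BASE / env / "migration" / collection / "comments"
    argv = [
        "invenio", "migration", "comments",
        "--filepath", str(comments_dir / "comments_metadata.json"),
        # The command in the description passes the directory with a trailing slash.
        "--dirpath", f"{comments_dir}/",
    ]
    if dry_run:
        # Assumption: the flag can simply be left out for a real run.
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    # "sandbox" and "my-collection" are placeholder values, not real names.
    print(" ".join(build_comments_command("sandbox", "my-collection")))
```

The returned list can be passed to `subprocess.run` once the dry run output looks right.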

sakshamarora1 marked this pull request as ready for review February 4, 2026 16:33
self.all_record_versions = {
    str(hit["versions"]["index"]): hit for hit in search_result
}
oldest_version = min(

Wouldn't it be faster to use record._record.versions[-1], instead of scan_versions etc.?
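
For context on the lookup being discussed, here is a minimal sketch of the dictionary-based approach from the diff. The `min(` call is truncated in the snippet, so what it iterates over is not shown; the sketch assumes it selects the smallest version index. The sample hits are made up and `oldest_version_key` is a hypothetical helper. One detail worth checking with string keys: a plain `min()` compares them lexicographically, so `"10"` sorts before `"2"`.

```python
def index_versions(search_result):
    """Map each hit's version index (as a string) to the hit, as in the diff."""
    return {str(hit["versions"]["index"]): hit for hit in search_result}


def oldest_version_key(all_record_versions):
    """Return the key of the lowest version index, compared numerically."""
    return min(all_record_versions, key=int)


# Made-up hits for illustration only; real hits come from the search.
hits = [{"id": f"rec-{i}", "versions": {"index": i}} for i in (10, 2, 3)]
versions = index_versions(hits)

print(oldest_version_key(versions))  # numeric comparison of the keys
print(min(versions))                 # lexicographic comparison of the keys
```

With these sample hits the numeric comparison picks `"2"` while the plain `min()` picks `"10"`.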

elif comment_status == "dm":
    comment_payload["payload"].update(
        {
            "content": "comment was deleted by the moderator.",

In RDM we do not have a "moderator". It would be good to align this with what we display when a comment is deleted in RDM (I don't remember the exact text). Ping @zzacharo for more opinions.
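
One way to make the placeholder text easy to change later is to keep it in a single mapping instead of inline in the branch. A minimal sketch, assuming the `"dm"` status and the string from the diff; `PLACEHOLDER_CONTENT` and `apply_placeholder` are hypothetical names, and since the text RDM displays for deleted comments is not given here, the existing string is kept as a stand-in.

```python
# Status code -> replacement content. Only "dm" appears in the diff; the
# string is the one currently in the PR, pending alignment with RDM.
PLACEHOLDER_CONTENT = {
    "dm": "comment was deleted by the moderator.",
}


def apply_placeholder(comment_payload, comment_status):
    """Overwrite the payload content when the status has a placeholder text."""
    text = PLACEHOLDER_CONTENT.get(comment_status)
    if text is not None:
        comment_payload["payload"].update({"content": text})
    return comment_payload


# Made-up payload for illustration.
payload = {"payload": {"content": "original text"}}
apply_placeholder(payload, "dm")
print(payload["payload"]["content"])
```

Statuses without an entry in the mapping leave the payload unchanged.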

Development

Successfully merging this pull request may close these issues.

Comments migration
