-
Notifications
You must be signed in to change notification settings - Fork 15
youtube-vis: video dataset #486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
youtube-vis: video dataset #486
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: af08e50151
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
/review |
michal-lightly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think we can simplify this PR.
I have not looked at the tests yet.
lightly_studio/src/lightly_studio/examples/example_video_annotations.py
Outdated
Show resolved
Hide resolved
| - Enabled editing of segmentation masks and deletion of annotations in the details view. | ||
| - Allowed users to customize toolbar shortcuts. | ||
| - Added `Sample.add_annotation()` method, adds annotations to samples. | ||
| - Loading videos with annotations from youtube-vis format via `dataset.add_videos_from_youtube_vis`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will also need to add info to docs, I assume that will be in a separate PR?
We will need to highlight that this is not the standard youtubevis format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up PR
| allowed_extensions: An iterable container of allowed video file | ||
| extensions in lowercase, including the leading dot. If None, | ||
| uses default VIDEO_EXTENSIONS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add information about how this is used - this is user-facing interface.
Naively, annotations.json should have all the info about the loaded videos. But there is a reason we need to supply path and allowed_extensions.
What is not clear to me is what happens if the info from annotations.json and the list of video files mismatches.
| def add_videos_from_youtube_vis( | ||
| self, | ||
| annotations_json: PathLike, | ||
| path: PathLike, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: We could call it videos_path.
| ) | ||
|
|
||
|
|
||
| def _collect_video_file_paths( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Consider moving this to add_videos.py too.
|
|
||
| def test_load_video_annotations_from_labelformat( | ||
| db_session: Session, | ||
| patch_collection: None, # noqa: ARG001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why has this file changed to depend on the VideoDataset class? It would be better to keep it independent if possible, then we also don't need the patch.
| session: The database session. | ||
| dataset_id: The ID of the video dataset to load annotations into. | ||
| video_paths: An iterable of file paths to the videos to load. | ||
| input_labels: The labelformat input containing video annotations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here, please add info on how the parameters interact (see the comment in VideoDataset).
| file_path = Path(video_file) | ||
| video_name_to_path[file_path.name] = str(file_path.absolute()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work also with fsspec paths?
| for video_file in video_paths: | ||
| file_path = Path(video_file) | ||
| video_name_to_path[file_path.name] = str(file_path.absolute()) | ||
| if file_path.stem in video_stem_to_path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem with .stem(), it will strip any path to the video. I'd suggest .with_suffix("").
This assumes labelformat might return the video paths with a folder prefix, which I think is the case. Can we add a test with videos in some nested folder structure (e.g. a/vid.mp4, b/vid.mp4)? Can be a follow-up.
| if resolved_path is None: | ||
| raise FileNotFoundError(f"No video file found for '{filename}'.") | ||
| video_paths.append(resolved_path) | ||
| return list(dict.fromkeys(video_paths)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return list(dict.fromkeys(video_paths)) | |
| return video_paths |
| "length": 3, | ||
| }, | ||
| ], | ||
| "annotations": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: We can keep this empty since it is irrelevant for the test.
| embed=True, | ||
| ) | ||
|
|
||
| # Verify embeddings were created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an easier way, from a test with images:
assert len(image1.sample.embeddings) == 1
What has changed and why?
Adding the loading method for youtube-vis format to the video dataset.
How has it been tested?
tests are implemented.
Did you update CHANGELOG.md?