Hi, I am consistently finding a difference of -0.04% in the reported performance on the ActivityNet-QA dataset when using the official evaluation code (https://github.com/MILVLG/activitynet-qa).
Replicating the results on ActivityNet-QA:
- Accuracy reported by the Singularity-Temporal evaluation (n=12 frames, num_temporal_layers=2, ckpt: ft_anet_qa_singularity_temporal_17m.pth): 44.01%
- Accuracy reported by the official ActivityNet-QA evaluation code: 43.97%
Bonus: the ActivityNet-QA evaluation code also provides a breakdown for each question sub-type :)
Accuracy (per question type):
Motion: 32.2500%
Spatial Relation: 22.6250%
Temporal Relation: 4.1250%
Free: 75.7523%
All: 43.9750%
Accuracy of the Free type questions (per answer type):
Yes/No: 75.1194%
Color: 51.3630%
Object: 27.6730%
Location: 39.8964%
Number: 54.4554%
Other: 36.2241%
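For reference, a per-type accuracy breakdown like the one above can be computed with a short sketch along these lines. The function and input format here are hypothetical (dicts keyed by question id), not the actual interface of the official activitynet-qa code, so adapt the field names to the real annotation files:

```python
from collections import defaultdict


def per_type_accuracy(predictions, answers, qtypes):
    """Compute overall and per-question-type accuracy (in percent).

    predictions, answers: dicts mapping question id -> answer string
    qtypes: dict mapping question id -> question type label
    (This input layout is an assumption for illustration; the official
    evaluation code reads its own annotation/prediction file formats.)
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for qid, pred in predictions.items():
        qtype = qtypes[qid]
        total[qtype] += 1
        total["All"] += 1
        if pred == answers[qid]:
            correct[qtype] += 1
            correct["All"] += 1
    # Percentage accuracy per type, plus the overall "All" entry.
    return {t: 100.0 * correct[t] / total[t] for t in total}
```

Comparing such a breakdown between the two codebases (exact string match vs. any answer normalization, and which split file is read) is usually the quickest way to localize a small, consistent gap like -0.04%.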
P.S.: The -0.04% difference is consistent across all my experiments on ActivityNet-QA.
Thanks in advance!