Open
Description
Hi, and thank you for releasing this valuable benchmark! I'm working on implementing the PPO agent reported in the paper, but I found some discrepancies between the code and the paper.
Trimmed action space
As mentioned in #4, the code below does not match the 89 action dimensions described in Appendix G.2.
action_dim=[3, 3, 4, 25, 25, 8],
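To make the gap concrete, here is a quick check of how many logits the list above implies. It assumes each entry of `action_dim` is the size of one categorical head, which is my reading of the code rather than something the paper states.

```python
# action_dim as it appears in the released code (copied from the line above)
action_dim = [3, 3, 4, 25, 25, 8]

# Assumption: each entry is the size of one categorical head,
# so the policy emits sum(action_dim) logits in total.
total_logits = sum(action_dim)

# Action dimension reported in Appendix G.2 of the paper
paper_action_dim = 89

print(f"code: {total_logits}, paper: {paper_action_dim}, "
      f"difference: {paper_action_dim - total_logits}")
# prints "code: 68, paper: 89, difference: 21"
```

So under this reading the released code covers 68 of the 89 dimensions.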
About the compass observation
The paper gives the compass observation a shape of (2,), but the code feeds in an input of shape (4,).
"compass": torch.rand((B, 4), device=device),
Training on MultiDiscrete action space
Is the 89-dimensional action space in the paper a MultiDiscrete action space, like the original MineDojo action space, or do you simply treat it as a single Discrete action space?
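To clarify what I mean by the two readings, here is a minimal sketch. It uses the `action_dim` list from the released code as a stand-in, since I don't know the exact 89-dimensional layout. The helper names `flatten` and `unflatten` are my own and do not come from your repository or from MineDojo.

```python
import math

# Stand-in sizes taken from the released code; the paper's 89-dim layout is unknown to me.
nvec = [3, 3, 4, 25, 25, 8]

# Reading 1: MultiDiscrete. One categorical head per component,
# so the policy emits sum(nvec) logits split into len(nvec) groups.
multi_discrete_logits = sum(nvec)

# Reading 2: one Discrete space over every joint combination,
# so a single categorical over prod(nvec) actions.
joint_actions = math.prod(nvec)


def flatten(action, nvec):
    """Map a MultiDiscrete action to one joint index (mixed-radix encoding)."""
    index = 0
    for a, n in zip(action, nvec):
        if not 0 <= a < n:
            raise ValueError(f"component {a} out of range for size {n}")
        index = index * n + a
    return index


def unflatten(index, nvec):
    """Inverse of flatten: recover the per-component action from a joint index."""
    action = []
    for n in reversed(nvec):
        index, a = divmod(index, n)
        action.append(a)
    return action[::-1]


print(multi_discrete_logits, joint_actions)
```

A third possibility is that the 89 dimensions form a single `Discrete(89)` head, i.e. one categorical over 89 logits with no factorization. The PPO log-probability and entropy terms differ between these cases (summed over heads for MultiDiscrete, a single term for Discrete), which is why I'm asking.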
In addition, could you release the training code for the three task groups in the paper (or share it via my GitHub email)? It would be very helpful for baseline comparisons!