Training details about MineAgent #9

@mansicer

Hi, and thank you for releasing this valuable benchmark! I'm trying to reproduce the PPO agent reported in the paper, but I found some mismatches between the code and the paper.

Trimmed action space

As mentioned in #4, the action dimensions in the code below do not match the 89 action dims described in Appendix G.2.

action_dim=[3, 3, 4, 25, 25, 8],
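To make the mismatch concrete, here is a quick check of how those sizes add up. It assumes the "89 action dims" in the paper counts the total number of logits across the per-component heads (the sum of the sizes); that reading is my assumption, not something the paper states explicitly.

```python
# Component sizes copied from the snippet above.
action_dim = [3, 3, 4, 25, 25, 8]

# Assumption: "action dims" means the total number of logits when each
# component gets its own categorical head, i.e. the sum of the sizes.
total_logits = sum(action_dim)

print(total_logits)  # 68, which is not the 89 given in Appendix G.2
```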

About the compass observation

The paper gives the compass a shape of (2,), but the code uses an input of shape (4,).

"compass": torch.rand((B, 4), device=device),
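One guess that would reconcile the two shapes is that the two raw compass angles are each expanded into a sine and a cosine before being fed to the network. The sketch below only illustrates that guess; `encode_compass` is a hypothetical helper, the ordering of the four values is arbitrary, and I have not confirmed that the released code or the paper does this.

```python
import math

def encode_compass(yaw_deg, pitch_deg):
    """Hypothetical encoding: map two angles in degrees to four features.

    Assumption: the (2,) compass holds yaw and pitch, and the (4,) input is
    their sin/cos expansion. This is a guess, not the confirmed method.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return [math.sin(yaw), math.cos(yaw), math.sin(pitch), math.cos(pitch)]

features = encode_compass(0.0, 0.0)
print(len(features))  # 4
```

If the real encoding is different, it would help to know which four values are expected and in what order.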

Training on MultiDiscrete action space

Is the 89-dimensional action space in the paper a MultiDiscrete action space, like the original MineDojo action space, or do you simply treat it as a single Discrete action space?
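To spell out the distinction I am asking about, here is a minimal sketch using only the standard library and the `action_dim` list from the code. The helper names are mine and purely illustrative. With MultiDiscrete, the policy picks one index per component; with a joint Discrete space, every combination becomes a single index.

```python
import math
import random

# Component sizes copied from the released code.
action_dim = [3, 3, 4, 25, 25, 8]

def sample_multidiscrete(dims, rng):
    """MultiDiscrete: one independent categorical choice per component."""
    return [rng.randrange(n) for n in dims]

def flatten_action(action, dims):
    """Map a MultiDiscrete action to one index of a joint Discrete space."""
    index = 0
    for a, n in zip(action, dims):
        index = index * n + a  # mixed-radix encoding
    return index

def unflatten_action(index, dims):
    """Inverse of flatten_action."""
    action = []
    for n in reversed(dims):
        index, a = divmod(index, n)
        action.append(a)
    return action[::-1]

rng = random.Random(0)
action = sample_multidiscrete(action_dim, rng)
joint_index = flatten_action(action, action_dim)
assert unflatten_action(joint_index, action_dim) == action

# Output sizes a policy head would need under each reading.
num_logits_multidiscrete = sum(action_dim)  # 68 logits split over 6 heads
num_actions_joint = math.prod(action_dim)   # 180000 joint actions
```

Neither count equals 89 for this list, which is why I would like to know how the paper's action space is defined and which formulation PPO was trained on.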

In addition, could you release the training code for the three task groups in the paper (or share it via my GitHub email)? It would be very helpful for baseline comparisons.
