gradient accumulation using optax multisteps #2

Open

apoorvtintin wants to merge 6 commits into main from gradient_accumulation_optax

Conversation

@apoorvtintin (Collaborator)

We are not using optax MultiSteps directly because AxLearn has its own optimizer wrappers and optimizer state classes, which are incompatible with the optax MultiSteps wrapper.

apoorvtintin force-pushed the gradient_accumulation_optax branch from 74d9270 to 183c27a on March 25, 2024 at 23:30
