Open
Labels
enhancement (New feature or request)
Description
Scenario summary
Inference with T5 models is currently slow.
Proposed solution
Investigate and implement a solution to reduce model size and speed up inference. Some ideas to consider:
- fastT5, which advertises reducing T5 model size by 3x and increasing inference speed by up to 5x: https://github.com/Ki6an/fastT5
- How to adapt a multilingual T5 model for a single language: https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90
- Accelerated inference with Hugging Face Optimum: https://huggingface.co/blog/optimum-inference
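
To compare the ideas above against the current baseline, a small timing harness could be a useful first step. The following is a minimal sketch, not tied to any of the linked libraries: `median_latency` and `compare` are hypothetical helper names, and each candidate is assumed to be a zero-argument callable that runs one inference call on a fixed input.

```python
import statistics
import time
from typing import Callable, Dict


def median_latency(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> float:
    """Return the median wall-clock time (seconds) of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (lazy initialization, caches)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare(
    candidates: Dict[str, Callable[[], object]],
    baseline: str,
    warmup: int = 2,
    runs: int = 10,
) -> Dict[str, float]:
    """Return each candidate's speed-up factor relative to the baseline.

    A value of 2.0 means the candidate's median latency is half the baseline's.
    """
    latencies = {
        name: median_latency(fn, warmup=warmup, runs=runs)
        for name, fn in candidates.items()
    }
    base = latencies[baseline]
    return {name: base / latency for name, latency in latencies.items()}
```

In practice each callable would wrap a generation call of the current T5 model or of one optimized variant on the same inputs, so the reported factors can be checked against the advertised speed-ups.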