HuggingFaceModel -> MegatronModel
Translate standard HuggingFace/Transformers model definitions and configurations into Megatron-Core tensor-parallel layers automatically.
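The configuration half of this translation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the mechanism's actual implementation: `MegatronStyleConfig` and `translate_config` are hypothetical names, and the dataclass is a local stand-in rather than a Megatron-Core class. Its field names are chosen to resemble Megatron-Core's transformer config, the input keys follow the Llama-style HuggingFace config layout, and the even-divisibility check is a simplifying assumption about how tensor parallelism splits heads and feed-forward columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MegatronStyleConfig:
    """Hypothetical stand-in for a Megatron-Core transformer config."""
    num_layers: int
    hidden_size: int
    num_attention_heads: int
    num_query_groups: int
    ffn_hidden_size: int
    tensor_model_parallel_size: int


def translate_config(hf_config: dict, tp_size: int) -> MegatronStyleConfig:
    """Map a Llama-style HuggingFace config dict onto Megatron-style fields."""
    heads = hf_config["num_attention_heads"]
    # Configs without grouped-query attention may omit num_key_value_heads.
    kv_heads = hf_config.get("num_key_value_heads", heads)
    ffn = hf_config["intermediate_size"]

    # Assumption of this sketch: every sharded dimension must split evenly
    # across the tensor-parallel ranks.
    for name, value in (
        ("num_attention_heads", heads),
        ("num_key_value_heads", kv_heads),
        ("intermediate_size", ffn),
    ):
        if value % tp_size != 0:
            raise ValueError(
                f"{name}={value} is not divisible by tp_size={tp_size}"
            )

    return MegatronStyleConfig(
        num_layers=hf_config["num_hidden_layers"],
        hidden_size=hf_config["hidden_size"],
        num_attention_heads=heads,
        num_query_groups=kv_heads,
        ffn_hidden_size=ffn,
        tensor_model_parallel_size=tp_size,
    )


if __name__ == "__main__":
    # Toy values for illustration only, not a real model's configuration.
    toy_hf_config = {
        "num_hidden_layers": 2,
        "hidden_size": 64,
        "num_attention_heads": 8,
        "num_key_value_heads": 4,
        "intermediate_size": 256,
    }
    print(translate_config(toy_hf_config, tp_size=2))
```

Failing fast on a dimension that does not divide evenly is a deliberate choice in this sketch: a shape mismatch caught at translation time is easier to diagnose than one that surfaces later while loading sharded weights.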
Problem it solves
Megatron-Core's speedups are hard to adopt because manually porting standard HuggingFace model architectures to it is highly error-prone.
Consumes
Emits