Collected sources and patterns will appear here. Add from search, explore, or the patterns library.
Sequence<Representation> -> Sequence<Representation>
Alternately process sequences using Mamba state-space layers and self-attention layers to achieve linear scaling with global context retrieval.
Problem it solves
Pure attention scales quadratically, while pure state-space models struggle with high-recall long-context retrieval.
Consumes
Emits
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.