Tensor<HiddenStates> -> Tensor<HiddenStates>
Replace dense attention in video diffusion transformers with localized neighborhood attention, so that for a fixed window size the attention cost grows linearly rather than quadratically with the number of spatial-temporal tokens.
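A minimal sketch of the idea, assuming tokens are laid out row-major on a (T, H, W) latent grid and each query attends only to keys inside a small window centered on it. The function name `neighborhood_attention`, the list-of-lists tensor layout, and the edge handling (windows are simply truncated at the grid border) are illustrative assumptions, not taken from any particular project; some implementations may instead shift the window at the border to keep its size constant.

```python
import math

def neighborhood_attention(q, k, v, grid, window):
    """Single-head neighborhood attention over a (T, H, W) token grid.

    q, k, v: lists of N vectors (lists of floats), N = T*H*W, row-major.
    grid:    (T, H, W) grid shape.
    window:  (wt, wh, ww) odd window sizes along each axis.
    """
    T, H, W = grid
    rt, rh, rw = (size // 2 for size in window)  # window radii per axis
    scale = 1.0 / math.sqrt(len(q[0]))
    dim_v = len(v[0])
    out = []
    for t in range(T):
        for h in range(H):
            for w in range(W):
                i = (t * H + h) * W + w  # flat index of the query token
                # Flat indices of keys in the local window (truncated at edges).
                nbrs = [
                    (tt * H + hh) * W + ww
                    for tt in range(max(0, t - rt), min(T, t + rt + 1))
                    for hh in range(max(0, h - rh), min(H, h + rh + 1))
                    for ww in range(max(0, w - rw), min(W, w + rw + 1))
                ]
                # Scaled dot-product scores against the neighbors only.
                scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in nbrs]
                # Numerically stable softmax over the neighborhood.
                m = max(scores)
                weights = [math.exp(s - m) for s in scores]
                z = sum(weights)
                out.append([
                    sum(wgt * v[j][d] for wgt, j in zip(weights, nbrs)) / z
                    for d in range(dim_v)
                ])
    return out
```

With a window that covers the whole grid this reduces to ordinary dense attention; shrinking the window is what bounds the work per query. A production kernel would fuse these loops on the accelerator rather than iterate in Python.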
Problem it solves
Standard self-attention scales quadratically with spatial-temporal context length, causing memory bottlenecks during high-resolution video generation.
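To make the scaling difference concrete, the sketch below counts query-key pairs for dense attention and for a fixed local window. The grid shape (16 x 32 x 32 latent tokens) and the window (3 x 7 x 7) are hypothetical values chosen for illustration, and the neighborhood count is an upper bound because edge tokens may see fewer neighbors.

```python
def dense_pairs(t, h, w):
    """Query-key pairs scored by dense self-attention: N^2."""
    n = t * h * w
    return n * n

def neighborhood_pairs(t, h, w, window):
    """Upper bound on pairs scored with a fixed local window: N * |window|."""
    wt, wh, ww = window
    return t * h * w * wt * wh * ww

# Hypothetical latent grid and window, for illustration only.
window = (3, 7, 7)
print(dense_pairs(16, 32, 32))                 # 268435456
print(neighborhood_pairs(16, 32, 32, window))  # 2408448

# Doubling the frame count quadruples the dense cost but only doubles the local cost.
print(dense_pairs(32, 32, 32) // dense_pairs(16, 32, 32))                                # 4
print(neighborhood_pairs(32, 32, 32, window) // neighborhood_pairs(16, 32, 32, window))  # 2
```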
Consumes
Tensor<HiddenStates>
Emits
Tensor<HiddenStates>
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.