TokenSequence -> DraftTokenProposals
Predict candidate draft tokens by matching the most recent tokens against earlier occurrences of the same sub-sequence in the prompt context, then proposing the tokens that followed the match.
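A minimal sketch of the lookup, assuming token IDs are plain integers and that the longest matching suffix is preferred, with the most recent earlier occurrence winning ties. The function name `propose_draft_tokens` and the parameters `max_ngram` and `num_draft` are hypothetical, chosen for illustration and not taken from any particular project.

```python
from typing import List, Sequence


def propose_draft_tokens(
    tokens: Sequence[int],
    max_ngram: int = 3,   # hypothetical: longest suffix length to try
    num_draft: int = 5,   # hypothetical: maximum number of tokens to propose
) -> List[int]:
    """Propose draft tokens by finding an earlier copy of the trailing n-gram.

    Returns the tokens that followed the match, or an empty list if the
    trailing n-gram never occurred earlier in the context.
    """
    n_tokens = len(tokens)
    # Try the longest suffix first, then fall back to shorter ones.
    for n in range(min(max_ngram, n_tokens - 1), 0, -1):
        suffix = list(tokens[n_tokens - n:])
        # Scan backwards so the most recent earlier occurrence is used.
        # The last valid start is n_tokens - n - 1, which excludes the
        # suffix itself and guarantees at least one following token.
        for start in range(n_tokens - n - 1, -1, -1):
            if list(tokens[start:start + n]) == suffix:
                return list(tokens[start + n:start + n + num_draft])
    return []


# Usage: the context ends in (1, 2, 3), which also appears at the start,
# so the tokens that followed it there become the draft proposal.
draft = propose_draft_tokens([1, 2, 3, 4, 5, 1, 2, 3], num_draft=2)
print(draft)  # [4, 5]
```

The proposals are only guesses: in a speculative decoding setup the target model would still verify them and keep only the prefix it agrees with. An empty result simply means no speculation for that step.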
Problem it solves
Traditional speculative decoding requires a separate draft model, which consumes extra GPU memory and adds runtime overhead.
Consumes
Emits
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.