AudioFeatures, CTCPosteriors -> Transcription
Generate N-best hypothesis candidates via CTC prefix beam search, then rescore them using a shared multi-head attention decoder.
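A minimal sketch of the two passes, under stated assumptions: the first pass is a plain-Python CTC prefix beam search over per-frame log-probabilities, and the second pass re-ranks the resulting N-best list with an externally supplied scorer. The names `ctc_prefix_beam_search`, `rescore`, `attention_score`, and `ctc_weight` are illustrative, not taken from any particular project. `attention_score` stands in for the attention decoder's log-likelihood of a hypothesis, and the weighted sum used to combine the two scores is one common convention, assumed here.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")

def log_add(*xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_prefix_beam_search(log_probs, beam_size=4, blank=0):
    """First pass. log_probs is a list of frames, each a list of
    per-token log-probabilities. Returns [(prefix, ctc_log_prob)], best first."""
    # prefix -> (log-prob of paths ending in blank, ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (pb, pnb) in beams.items():
            for tok, lp in enumerate(frame):
                if tok == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (log_add(b, pb + lp, pnb + lp), nb)
                elif prefix and tok == prefix[-1]:
                    # Repeated token with no blank between: collapses.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, log_add(nb, pnb + lp))
                    # Repeated token after a blank: extends the prefix.
                    ext = prefix + (tok,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, log_add(nb, pb + lp))
                else:
                    ext = prefix + (tok,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, log_add(nb, pb + lp, pnb + lp))
        # Prune to the beam_size most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: log_add(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    nbest = [(p, log_add(pb, pnb)) for p, (pb, pnb) in beams.items()]
    return [(p, s) for p, s in nbest if s > NEG_INF]

def rescore(nbest, attention_score, ctc_weight=0.5):
    """Second pass. attention_score is a hypothetical callable that maps a
    token tuple to the attention decoder's log-likelihood for it."""
    scored = [(hyp, attention_score(hyp) + ctc_weight * s) for hyp, s in nbest]
    return max(scored, key=lambda item: item[1])[0]

# Toy input: two frames, vocabulary {0: blank, 1: "a"}.
frames = [[math.log(0.6), math.log(0.4)]] * 2
nbest = ctc_prefix_beam_search(frames)
best = rescore(nbest, attention_score=lambda hyp: 0.0)
```

In the toy input, blank is the most likely token in every frame, so frame-wise greedy decoding would emit an empty transcript. The prefix search sums the probability of every alignment that collapses to the same prefix and ranks `(1,)` first instead. Because the second pass only scores a handful of complete hypotheses, the attention decoder runs once per candidate rather than at every search step.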
Problem it solves
Streaming CTC decoders suffer from local search errors, but full-sequence attention decoders are too slow for real-time streaming.
Consumes
AudioFeatures, CTCPosteriors
Emits
Transcription
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.