TextTokens + BoundingBoxes -> SpatialTextEmbeddings
Encode 2D spatial coordinates of word bounding boxes into coordinate embeddings and sum them with standard text token embeddings to retain document layout geometry.
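A minimal sketch of the mechanism described above, using only the Python standard library. Everything named here is an illustrative assumption rather than something given by the pattern card: the 0 to 1000 coordinate grid, the `SpatialTextEmbedder` class, the `normalize_box` helper, and the choice to share one x table and one y table across both box corners. The lookup tables are randomly initialized stand-ins for weights that would be learned in a real model.

```python
import random

GRID = 1000  # assumed number of coordinate bins after normalization


def make_table(rows, dim, rng):
    """Random lookup table standing in for learned embedding weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def normalize_box(box, page_width, page_height):
    """Scale an (x0, y0, x1, y1) pixel box to integer bins in [0, GRID]."""
    def to_bin(value, extent):
        return min(GRID, max(0, round(GRID * value / extent)))

    x0, y0, x1, y1 = box
    return (to_bin(x0, page_width), to_bin(y0, page_height),
            to_bin(x1, page_width), to_bin(y1, page_height))


class SpatialTextEmbedder:
    """Hypothetical embedder: token embedding plus four coordinate embeddings."""

    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.token_table = make_table(vocab_size, dim, rng)
        self.x_table = make_table(GRID + 1, dim, rng)  # shared by x0 and x1
        self.y_table = make_table(GRID + 1, dim, rng)  # shared by y0 and y1

    def embed(self, token_ids, boxes, page_width, page_height):
        if len(token_ids) != len(boxes):
            raise ValueError("need exactly one bounding box per token")
        vectors = []
        for token_id, box in zip(token_ids, boxes):
            x0, y0, x1, y1 = normalize_box(box, page_width, page_height)
            parts = (self.token_table[token_id],
                     self.x_table[x0], self.y_table[y0],
                     self.x_table[x1], self.y_table[y1])
            # Element-wise sum keeps the output the same width as the text embedding.
            vectors.append([sum(values) for values in zip(*parts)])
        return vectors


# The same token id at two different page positions.
embedder = SpatialTextEmbedder(vocab_size=100, dim=8)
vectors = embedder.embed(
    token_ids=[5, 5],
    boxes=[(50, 40, 120, 60), (400, 700, 470, 720)],
    page_width=612,
    page_height=792,
)
```

Because the coordinate embeddings are summed rather than concatenated, the output keeps the dimensionality of the text embeddings, so it can feed a downstream encoder unchanged. In this sketch the two occurrences of token 5 receive different vectors only because their boxes differ.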
Problem it solves
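Standard 1D text tokenization flattens a scanned or PDF document into a single token sequence and discards its visual and spatial layout, such as columns, tables, and where each word sits on the page.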
Consumes
TextTokens, BoundingBoxes
Emits
SpatialTextEmbeddings