Collected molecules will appear here. Add from search or explore.
(Dataset, TransformFunction) -> CachedDataset
Apply a mapping function to a dataset and store the output on disk, keyed by a fingerprint hash of the input dataset state and mapping function code.
Problem it solves
Re-running data preprocessing pipelines repeatedly wastes compute and introduces non-deterministic delays.
Consumes
Emits
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.