Collected sources and patterns will appear here. Add from search, explore, or the patterns library.
DatasetIdentifier, LocalCachePath -> DataFrame
Fetch a remote compressed tabular dataset by identifier, store it in a local cache directory if configured, and deserialize it into an in-memory dataframe.
Problem it solves
Redundant network requests and slow runtimes when repeatedly executing machine learning pipelines on remote benchmark datasets.
Consumes
Emits
The real projects this mechanism was found in. Attribution is the point — this is how the best teams actually do it.