RemoteDatasetURI -> Generator<DataRecord>
Iterate sequentially over a remote multi-file dataset by fetching, buffering, and yielding records from shards on-the-fly without a full disk download.
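A minimal sketch of the idea, under stated assumptions: each shard is taken to be a newline-delimited JSON file reachable by URL, and the names `stream_records`, `open_shard`, and the shard names below are hypothetical, chosen only for illustration. The opener is injectable so the example runs without network access; by default it falls back to `urllib.request.urlopen`.

```python
import io
import json
from typing import Any, Callable, Iterable, Iterator
from urllib.request import urlopen


def stream_records(
    shard_urls: Iterable[str],
    open_shard: Callable[[str], Any] = urlopen,  # assumption: returns a binary file-like object
) -> Iterator[dict]:
    """Yield records one at a time, opening each shard only when it is reached."""
    for url in shard_urls:
        # Only one shard stream is open at any moment; nothing is written to disk.
        with open_shard(url) as raw:
            # Iterating a binary stream yields one buffered line at a time.
            for line in raw:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


# Usage with in-memory stand-ins for remote shards (hypothetical names and data).
fake_shards = {
    "shard-0": b'{"id": 1}\n{"id": 2}\n',
    "shard-1": b'\n{"id": 3}\n',
}
opened = []


def fake_open(url: str) -> io.BytesIO:
    opened.append(url)  # record when each shard is actually fetched
    return io.BytesIO(fake_shards[url])


records = stream_records(["shard-0", "shard-1"], open_shard=fake_open)
first = next(records)   # fetches shard-0 only
rest = list(records)    # shard-1 is opened only once shard-0 is exhausted
```

Because the function is a generator, processing can begin as soon as the first line of the first shard arrives. A production version would likely add retries, prefetching of the next shard, and format-specific decoding, none of which are shown here.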
Problem it solves
The dataset is larger than local storage, or downloading it in full adds too much latency before processing can start.
Consumes
Emits