dataflux-pytorch

The Dataflux Accelerated Dataloader for PyTorch with GCS aims to improve ML training efficiency when training datasets are stored in Google Cloud Storage (GCS). Training with the Dataflux Accelerated Dataloader is up to 3X faster when the dataset consists of many small files (e.g., 100-500 KB each).

APACHE-2.0 License

Downloads: 2K
Stars: 23

Issue Statistics

                        Past Year    All Time
Total Pull Requests     132          132
Merged Pull Requests    117          117
Total Issues            1            1
Time to Close Issues    13 days      13 days
Package Rankings
Top 36.94% on PyPI.org
Related Projects