sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

APACHE-2.0 License

Downloads
4.6K
Stars
2K
Committers
49

sparseml - SparseML v1.7.0 Latest Release

Published by jeanniefinks 7 months ago

New Features:

  • Fine-tuning, one-shot, and general compression techniques now support large language models built on top of Hugging Face Transformers, including full FSDP support and model stages for transitioning between training and post-training pathways. (#1834, #1891, #1907, #1902, #1940, #1939, #1897, #1912)
  • SparseML eval pathways have been added with plugins for perplexity and lm-eval-harness specifically for large language model support. (#1834)
  • AutoModel for causal language models, including quantized and sparse quantized support, has been added.
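
For context on the perplexity plugin mentioned above: perplexity is conventionally defined as the exponential of the average negative log-likelihood per token. The following is a minimal sketch in plain Python, assuming the per-token natural-log probabilities have already been computed. The function name `perplexity` is hypothetical, and this is not SparseML's eval code.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every token has a perplexity of about 4.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))
```

Lower values mean the model is less "surprised" by the evaluation text, which is why perplexity is commonly used to compare a compressed model against its dense baseline.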

Changes:

  • Exporting pathways has been simplified across text generation and CV use cases to auto infer previously required arguments, such as task type. (#1858, #1878, #1880, #1883, #1884, #1888, #1889, #1890, #1898, #1908, #1909, #1910)
  • Recipe pathways have been updated to fully support LLMs for model compression techniques. (#1802, #1804, #1819, #1825, #1849)
  • Pruning for models that are partially quantized is now supported. (#1792)
  • OBCQ modifier target_ids argument is now optional. (#1825)
  • sequence_length for transformer exports is now automatically inferred if it is not supplied. (#1826)
  • OBCQ now supports non-CUDA systems. (#1828)
  • Neural Magic's Ultralytics Enterprise License has been updated with a December 2023 amendment as cited. (#2090)

Resolved Issues:

  • KV-cache injections now function accurately with MPT models in DeepSparse and SparseML, where before they crashed on export for MPT models. (#1801)
  • SmoothQuant updated to support proper device forwarding; previously it would not work properly in FSDP setups and would crash. (#1830)
  • With nsamples increased to 512, the stability of OBCQ improved, resulting in a higher likelihood of it converging correctly. (#1812)
  • SmoothQuant NaN values are resolved during computation. (#1872)
  • TypeError with OBCQ when no sequence_length is provided is now resolved. (#1899)

Known Issues:

  • Memory usage is currently high for one-shot and fine-tuning algorithms on LLMs, resulting in the need for GPUs with more memory for model sizes 7B and above.
  • Memory usage is currently high for export pathways for LLMs, resulting in a requirement of large CPU RAM (>150GB) to successfully export for model sizes 7B and above.
  • Currently, exporting models created with quantization through FSDP pathways is failing on reloading the model from disk. The workaround is to perform quantization on a single GPU rather than multiple GPUs. A hotfix is forthcoming.
  • Currently, multi-stage pipelines that include quantization and are running through FSDP will fail after running training and on initialization of the SparseGPT quantization stage. This is due to the FSDP state not being propagated correctly. The workaround is to restart the run from the saved checkpoint after training and pruning are finished. A hotfix is forthcoming.
sparseml - SparseML v1.6.1 Patch Release

Published by jeanniefinks 10 months ago

This is a patch release for 1.6.0 that contains the following changes:

Known Issues:

  • The compile time for dense LLMs can be very slow. Compile time will be addressed in a forthcoming release.
  • Docker images are not currently pushing. A resolution is forthcoming for functional Docker builds. [RESOLVED]
sparseml - SparseML v1.6.0

Published by jeanniefinks 10 months ago

New Features:

  • Version support added:

    • Python 3.11 (#1764)
    • PyTorch 2.0 (#1618, #1635)
    • ONNX 1.14 and Opset 14 (Documentation) (#1627, #1641, #1660, #1767, #1768)
    • NumPy 1.21.6 (#1623)
  • Ultralytics YOLOv8 training and sparsification pipelines added. (Documentation) (#1517, #1522, #1520, #1528, #1521, #1561, #1579, #1597, #1599, #1629, #1637, #1638, #1673, #1686, #1656, #1787)

  • NOTICE updated to reflect now public-facing Ultralytics Enterprise Software License Agreement for YOLOv3/v5/v8.

  • Initial sparsification framework v2 added for better generative AI support and improved functionality and extensibility. (Documentation available in v1.7) (#1713, #1751, #1742, #1763, #1759, #1769)

  • BLOOM, CodeGen, OPT, Falcon, GPTNeo, LLAMA, MPT, and Whisper large language and generative models are supported through transformers training, sparsification, and export pipelines. (Documentation) (#1562, #1571, #1585, #1584, #1616, #1633, #1590, #1644, #1615, #1664, #1646, #1631, #1648, #1683, #1687, #1677, #1692, #1694, #1699, #1703, #1709, #1691, #171, #1720, #1746)

  • QuantizationModifier for PyTorch sparsification pathways implemented to enable cleaner, more robust, and simpler arguments for quantizing models in comparison to the legacy quantization modifier. (Documentation) (#1568, #1594, #1639, #1693, #1745, #1738)

  • CLIP pruning, quantization, and export supported. (Documentation) (#1581, #1626, #1711)

  • INT4 quantization support added for model sparsification and export. (Documentation available in v1.8 with LLM support expansion) (#1670)

  • DDP support added to Torchvision image classification training and sparsification pipelines. (Documentation available in v1.8 with new research paper) (#1698, #1784)

  • SparseGPT, OBC, and OBQ one-shot/post-training pruning and quantization modifiers added for PyTorch pathways. (Documentation) (#1705, #1736, #1737, #1761, #1770, #1781, #1776, #1777, #1758)

Changes:

  • SparseML upgraded for SparseZoo V2 model file structure changes, which expands the number of supported files and reduces the number of bytes that need to be downloaded for model checkpoints, folders, and files. (#1719)

  • Docker builds updated to consistently rebuild for new releases and nightlies. (#1506, #1531, #1543, #1537, #1665, #1684)

  • README and documentation updated to include: Slack Community name change, Contact Us form introduction, Python version changes; corrections for YOLOv5, torchvision, transformers, and SparseZoo broken links; and installation command. (#1536, #1577, #1578, #1610, #1617, #1612, #1602, #1659, #1721, #1725, #1726, #1785)

  • Improved support for large ONNX files to improve loading performance and limit memory performance issues, especially for LLMs. (#1515, #1540, #1514, #1586)

  • Transformers datasets can now be created without a model needing to be passed in. (#1544, #1545)

  • Torchvision training and sparsification pipelines updated to allow patch versions of torchvision (0.14.x) as installable dependencies; previously the version was restricted to 0.14.0. (#1556)

  • Image classification training and sparsification pipelines for torchvision now support arguments for RGB means and standard deviations to be passed in, enabling overriding of the default ImageNet values that were hardcoded. (#1546)

  • YOLOv5 training and sparsification pipelines migrated to install from nm-yolov5 on PyPI and remove the autoinstall from the nm-yolov5 GitHub repository that would happen on invocation of the relevant pathways, enabling more predictable environments. (#1518, #1564, #1566)

  • Transformers training and sparsification pipelines migrated to install from nm-transformers on PyPI and remove the autoinstall from the nm-transformers GitHub repository that would happen on invocation of the relevant pathways, enabling more predictable environments. (#1518, #1553, #1564, #1566, #1730)

  • Deprecated and no longer supported:

    • Keras pathways (#1585, #1607)
    • TensorFlow pathways (#1606, #1607)
    • Python 3.7 (#1611)
    • sparseml.benchmark commands and utilities; may be refactored in a future release (#1625)
    • SSD ResNet models sparsification and model loading; will be removed in a future release (#1739)
  • Pydantic version pinned to <2.0 preventing potential issues with untested versions. (#1645)

  • Automatic link checking added to GitHub actions. (#1525)
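
The RGB mean and standard deviation override noted in the list above amounts to changing the constants used in per-channel normalization. The following is a minimal sketch in plain Python. The function and parameter names are hypothetical and are not SparseML's CLI arguments; the defaults shown are the commonly used ImageNet statistics.

```python
# Commonly used ImageNet per-channel statistics for pixels scaled to [0, 1].
IMAGENET_MEANS = (0.485, 0.456, 0.406)
IMAGENET_STDS = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, means=IMAGENET_MEANS, stds=IMAGENET_STDS):
    """Normalize one RGB pixel (values scaled to [0, 1]) channel by channel."""
    return tuple((value - mean) / std for value, mean, std in zip(rgb, means, stds))


# Default ImageNet statistics.
print(normalize_pixel((0.485, 0.456, 0.406)))
# Overriding with dataset-specific statistics (illustrative values only).
print(normalize_pixel((0.75, 0.5, 0.25), means=(0.5, 0.5, 0.5), stds=(0.25, 0.25, 0.25)))
```

Passing dataset-specific statistics matters when the training images differ substantially from ImageNet, for example grayscale-like or medical imagery.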

Resolved Issues:

  • ONNX export for MobileBERT previously resulted in an exported ONNX model that had poor performance in DeepSparse; this is now resolved. (#1539)

  • OpenCV is now installed for image classification pathways when running pip install sparseml[torchvision]. Before it would crash with a missing dependency error of opencv unless installed. (#1575)

  • Scipy version dependency issues resolved with scikit-image which would result in incompatibility errors on install of scikit-image for computer vision pathways. (#1570)

  • Transformers export pathways for quantized models addressed where the export would improperly crash and not export for all transformers models. (#1654)

  • Transformers data support for jsonl files through the question answering pathways was resulting in a JSONDecodeError; these are now loading correctly. (#1667, #1669)

  • Unit and integration tests updated to remove temporary test files and limit test file creation which were not being properly deleted. (#1609, #1668, #1672, #1696)

  • Image classification pipelines no longer crash with an extra argument error when using CIFAR10 or CIFAR100 datasets. (#1671)

Known Issues:

  • None
sparseml - DeepSparse v1.5.4 Patch Release

Published by jeanniefinks about 1 year ago

This is a patch release for 1.5.0 that contains the following changes:

  • ClearML logging has been enabled for transformers. (#81)
sparseml - SparseML v1.5.3 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.5.0 that contains the following changes:

  • Pinned dependency Pydantic, a data validation library for Python, to < v2.0, to prevent current workflows from breaking. Pydantic upgrade planned for future release. (#1651)
sparseml - SparseML v1.5.2 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.5.0 that contains the following changes:

  • pandas is now restricted to < 2.0 because the latest 1.5-supported transformers datasets are incompatible with pandas 2.0. Future releases will support later datasets versions. (#1634)
sparseml - SparseML v1.5.1 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.5.0 that contains the following changes:

  • Propagated datasets_dir argument in YOLOv8 training command to address missing args error. (#1620)
sparseml - SparseML v1.5.0

Published by jeanniefinks over 1 year ago

New Features:

  • PyTorch 1.13 support (#1143)
  • Enabled patch versions for torchvision 0.14.x (#1557)
  • YOLOv8 sparsification pipelines (view)
  • Per layer distillation support for PyTorch Distillation modifier (#1272)
  • Torchvision training pipelines:
    • Wandb, TensorBoard, and console logging (#1299)
    • DataParallel module (#1332)
    • Distillation (#1310)
  • Product usage analytics tracking; to disable, run the command export NM_DISABLE_ANALYTICS=True (#1487)
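
The analytics opt-out above is an environment variable, so it can also be set from Python. The following is a minimal sketch, assuming the variable is read when sparseml is imported (an assumption not confirmed by these notes), which is why it is set first.

```python
import os

# Mirrors `export NM_DISABLE_ANALYTICS=True` from the shell, but scoped to this
# process and any child processes it starts. Assumption: sparseml reads the
# variable at import time, so set it before importing sparseml.
os.environ["NM_DISABLE_ANALYTICS"] = "True"

print(os.environ["NM_DISABLE_ANALYTICS"])
```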

Changes:

  • Transformers and YOLOv5 integrations migrated from auto install to install from PyPI packages. Going forward, pip install sparseml[transformers] and pip install sparseml[yolov5] will need to be used.
  • Error message updated when utilizing wandb loggers and wandb is not installed in the environment, telling user to pip install wandb. (#1374)
  • Keras and TensorFlow tests have been removed; these are no longer actively supported pathways.
  • sklearn now replaced with scikit-learn to stay current with dependency name changes. (#1294)

Resolved Issues:

  • Using recipes that utilized the legacy PyTorch QuantizationModifier with DDP when restoring weights for sparse transfer no longer crashes. (#1490)
  • Training runs would crash if labels were not set correctly when utilizing a distillation teacher different from the student with token classification pipelines; this no longer occurs. (#1414)
  • Q/DQ folding fixed on ONNX export for quantization nodes occurring before Softmax in transformer graphs; performance issues would result for some transformer models in DeepSparse. (#1343)
  • Inaccurate metrics calculations for torchvision training pipelines led to discrepancies in top1 and top5 accuracies by ~1%. (#1341)
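
For reference, the top1 and top5 metrics mentioned above are instances of top-k accuracy: the fraction of samples whose true label is among the k highest-scoring classes. The following is a minimal sketch in plain Python. The function name `top_k_accuracy` is hypothetical, and this is not the torchvision pipeline's code.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
]
labels = [1, 1, 0]
print(top_k_accuracy(scores, labels, k=1))  # 1 of 3 samples correct
print(top_k_accuracy(scores, labels, k=2))  # 2 of 3 samples correct
```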

Known Issues:

  • None
sparseml - SparseML v1.4.4 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.4.0 that contains the following changes:

  • Support implemented for overriding ONNX inputs with static and dynamic shapes. (#1476)
sparseml - SparseML v1.4.3 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.4.0 that contains the following changes:

  • The auto install for transformers was failing due to the cutover from sklearn to scikit-learn package naming intermittently; this is no longer failing. (#1294)
  • Python sparsification loggers now print out the directory on instantiation. (#1432)
  • ONNX models in YOLOv5 were improperly exported for some shapes; more shapes are now supported for dynamic models. (#1442)
  • SparseML YOLOv5 validation commands were creating folders under the "zoo:" name for the SparseZoo stub; folders are now created under their ids.
  • Image classification training script no longer fails if optional dependency tensorboard is not installed. (#1456)
  • Torchvision sparsification script now properly overrides final layer of torchvision native models. (#1455)
sparseml - SparseML v1.4.2 Patch Release

Published by jeanniefinks over 1 year ago

This is a patch release for 1.4.0 that contains the following changes:

sparseml - SparseML v1.4.0

Published by jeanniefinks over 1 year ago

New Features:

  • OpenPifPaf training prototype support (#1171)
  • Layerwise distillation support for the PyTorch DistillationModifier (#1272)
  • Recipe template API added in PyTorch for simple creation of default recipes (#1147)
  • Ability to create sample inputs and outputs on export for transformers, YOLOv5, and image classification pathways (#1180)
  • Loggers and one-shot support for torchvision training script (#1299, #1300)

Changes:

  • Refactored the ONNX Export pipeline to standardize implementations, adding functionality for more complicated models, and adding better debugging support. (#1192)
  • Refactored the PyTorch QuantizationModifier to expand supported models and operators and simplify the interface. (#1183)
  • YOLOv5 integration upgraded to the latest upstream. (#1322)

Resolved Issues:

  • recipe_template CLI no longer has improper code documentation that impaired operability. (#1170)
  • ONNX export now enforces that all quantized graphs will have uint8 values, fixing issues for some quantized models that were crashing in DeepSparse. (#1181)
  • Changed over to vector_norm for PyTorch pruning modifiers that were leading to crashes in older PyTorch versions. (#1167)
  • Model loading for torchvision script fixed where models were failing on load unless a recipe was supplied. (#1281)

Known Issues:

  • None
sparseml - SparseML v1.3.1 Patch Release

Published by jeanniefinks almost 2 years ago

This is a patch release for 1.3.0 that contains the following changes:

  • NumPy version pinned to <=1.21.6 to avoid deprecation warning/index errors in pipelines.
sparseml - SparseML v1.3.0

Published by jeanniefinks almost 2 years ago

New Features:

  • NLP multi-label training and eval support added.
  • SQuAD v2.0 support provided.
  • Recipe template APIs introduced, enabling easier creation of recipes for custom models with standard sparsification pathways.
  • EfficientNetV2 model architectures implemented.
  • Sample inputs and outputs exportable for YOLOv5, transformers, and image classification integrations.

Changes:

  • PyTorch 1.12 and Python 3.10 now supported.
  • YOLOv5 pipelines upgraded to the latest version from Ultralytics.
  • Transformers pipelines upgraded to latest version from Hugging Face.
  • PyTorch image classification pathway upgraded using torchvision standards.
  • Recipe arguments now support list types.

Resolved Issues:

  • Improper URLs fixed for ONNX export documentation to proper documentation links.
  • Latest transformers version hosted by Neural Magic now installs automatically; previously it would pin to older versions and not receive updates.

Known Issues:

  • None
sparseml - SparseML v1.2.0

Published by jeanniefinks almost 2 years ago

New Features:

  • Document classification training and export pipelines added for transformers integration.

Changes:

  • Refactor of transformers training and export integration code now enables more code reuse across use cases.
  • List of supported quantized nodes expanded to enable more complex quantization patterns for ResNet-50 and MobileBERT enabling better performance for similar models.
  • Transformers integration expanded to enable saving and reloading of optimizer state from trained checkpoints.
  • Deployment folder added for image classification integration which will export to deployment.
  • Gradient accumulation support added for image classification integration.
  • Minimum Python version changed to 3.7 as 3.6 has reached EOL.

Resolved Issues:

  • Quantized checkpoints for image classification models now instantiate correctly, no longer leading to random initialization of weights rather than restore.
  • TrainableParamsModifier for PyTorch now enables and disables params properly so weights are frozen while training.
  • Quantized embeddings no longer cause crashes while training with distributed data parallel.
  • Improper EfficientNet definitions fixed that would lead to accuracy issues due to convolutional strides being duplicated.
  • Protobuf version for ONNX 1.12 compatibility pinned to prevent install failures on some systems.

Known Issues:

  • None
sparseml - SparseML v1.1.1 Patch Release

Published by jeanniefinks about 2 years ago

This is a patch release for 1.1.0 that contains the following changes:

  • Some structurally modified image classification models in PyTorch would crash on reload; they now reload properly.
sparseml - SparseML v1.1.0

Published by jeanniefinks about 2 years ago

New Features:

  • YOLACT Segmentation native training integration made for SparseML.
  • OBSPruning modifier added (https://arxiv.org/abs/2203.07259).
  • QAT now supported for MobileBERT.
  • Custom module support provided for QAT to enable quantization of layers such as GELU.

Changes:

  • Updates made across the repository for new SparseZoo Python APIs.
  • Non-string keys are now supported in recipes for layer and module names.
  • Native support added for DDP training with pruning in PyTorch pathways.
  • YOLOV5p6 models default to their native activations instead of overwriting to Hardswish.
  • Transformers eval pathways changed to turn off AMP (FP16) to give more stable results.
  • TensorBoard logger added to transformers integration.
  • Python setuptools set as required at 59.5 to avoid installation issues with other packages.
  • DDP now works for quantized training of embedding layers where tensors were being placed on incorrect devices and causing training crashes.

Resolved Issues:

  • ConstantPruningModifier propagated None in place of the start_epoch value when start_epoch > 0. It now propagates the proper value.
  • Quantization of BERT models was dropping accuracy improperly by quantizing the identity branches.
  • SparseZoo stubs were not loading model weights for image classification pathways when using DDP training.

Known Issues:

  • None
sparseml - SparseML v1.0.1 Patch Release

Published by jeanniefinks over 2 years ago

This is a patch release for 1.0.0 that contains the following changes:

  • Quantized ONNX graph folding resolution that prevents an extra quant/dequant pair being added into the residuals for BERT style models. This was causing an accuracy drop after exporting to ONNX of up to 1% and is now fixed.
sparseml - SparseML v1.0.0

Published by jeanniefinks over 2 years ago

New Features:

  • One-shot and recipe arguments support added for transformers, yolov5, and torchvision.
  • Dockerfiles and new build processes created for Docker.
  • CLI formats and inclusion standardized on install of SparseML for transformers, yolov5, and torchvision.
  • N:M pruning mask creator deployed for use in PyTorch pruning modifiers.
  • Masked_language_modeling training CLI added for transformers.
  • Documentation additions made across all standard integrations and pathways.
  • GitHub action tests running for end-to-end testing of integrations.
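
To illustrate the N:M pruning mask mentioned in the list above: the idea is to constrain sparsity within fixed-size blocks of weights. The following is a minimal sketch in plain Python, assuming the convention that the N largest-magnitude weights are kept in every block of M consecutive weights (conventions differ on whether N counts kept or pruned values). The function name `n_m_mask` is hypothetical, and this is not SparseML's mask creator.

```python
def n_m_mask(weights, n, m):
    """Return a 0/1 mask keeping the n largest-magnitude values in each block of m."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    if not 0 <= n <= m:
        raise ValueError("n must be between 0 and m")
    mask = []
    for start in range(0, len(weights), m):
        block = weights[start:start + m]
        # Indices of the n largest magnitudes within this block.
        keep = set(sorted(range(m), key=lambda i: abs(block[i]), reverse=True)[:n])
        mask.extend(1 if i in keep else 0 for i in range(m))
    return mask


weights = [0.9, -0.1, 0.05, -0.8, 0.3, 0.2, -0.7, 0.0]
mask = n_m_mask(weights, n=2, m=4)
pruned = [w * bit for w, bit in zip(weights, mask)]
print(mask)    # [1, 0, 0, 1, 1, 0, 1, 0]
print(pruned)
```

Because every block has the same number of surviving weights, this pattern is more regular than unstructured pruning, which is what makes it attractive for hardware that accelerates fixed sparsity ratios.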

Changes:

  • Click as a root dependency added as the new preferred route for CLI invocation and arg management.
  • Provider parameter added for ONNXRuntime InferenceSessions.
  • Moved onnxruntime to optional install extra. onnxruntime no longer a root dependency and will only be imported when using specific pathways.
  • QAT export pipelines improved with better support for QATMatMul and custom operators.

Resolved Issues:

  • Incorrect commands and models updated for older docs for transformers, yolov5, and torchvision.
  • YOLOv5 issues addressed with data files, configs, and datasets not being easily accessible with the new install pathway. They are now included in the sparseml src folder for yolov5.
  • An extra batch no longer runs for the PyTorch ModuleRunner.
  • None sparsity parameter was being improperly propagated for sparsity in the PyTorch ConstantPruningModifier.
  • PyPI dependency conflicts no longer occur with the latest ONNX and Protobuf upgrades.
  • When GPUs were not available, yolov5 pathways were not working.
  • Transformers export was not working properly when neither --do_train nor --do_eval arguments were passed in.
  • Non-string keys now allowed within recipes.
  • Numerous fixes applied for pruning modifiers including improper masks casting, improper initialization, and improper arguments passed through for MFAC.
  • YOLOv5 export formatting error addressed.
  • Missing or incorrect data corrected for logging and recording statements.
  • PyTorch DistillationModifier for transformers was ignoring both "self" distillation and "disable" distillation values; instead, normal distillation would be used.
  • FP16 not deactivating on QAT start for torchvision.

Known Issues:

  • PyTorch > 1.9 quantized ONNX export is broken; waiting on PyTorch resolution and testing.
sparseml - SparseML v0.12.2 Patch Release

Published by jeanniefinks over 2 years ago

This is a patch release for 0.12.0 that contains the following changes:

  • Protobuf is restricted to version < 4.0 as the newer version breaks ONNX.