optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Apache-2.0 License · 946K downloads · 2.1K stars · 84 committers


optimum - v1.18.1: Patch release

Published by JingyaHuang 6 months ago

Fix the installation for Optimum Neuron v0.0.21 release

  • Improve the installation of optimum-neuron through optimum extras #1778

Fix the task inference of stable diffusion

  • Fix infer task for stable diffusion #1793

Full Changelog: https://github.com/huggingface/optimum/compare/v1.18.0...v1.18.1

optimum - v1.18.0: Gemma, OWLv2, MPNet, Qwen2 ONNX support

Published by echarlaix 7 months ago

New architectures ONNX export:

  • OWLv2 by @xenova in #1689
  • Gemma by @fxmarty in #1714
  • MPNet by @nathan-az in #1471
  • Qwen2 by @uniartisan in #1746

Other changes and bugfixes

  • Fix starcoder ORT integration by @fxmarty in #1722
  • Fix use_auth_token with ORTModel by @fxmarty in #1740
  • Fix compatibility with transformers v4.39.0 by @echarlaix in #1764

optimum - v1.17.1: Patch release

Published by regisss 8 months ago

Update Transformers dependency for the release of Optimum Habana v1.10.2

  • Update Transformers dependency in Habana extra #1700

Full Changelog: https://github.com/huggingface/optimum/compare/v1.17.0...v1.17.1

optimum - v1.17.0: Improved ONNX support & many bugfixes

Published by fxmarty 8 months ago

ONNX export from nn.Module

A function is exposed to programmatically export any nn.Module (e.g. models coming from Transformers, but modified). This is useful if you need to modify a model loaded from the Hub before exporting it. Example:

from transformers import AutoModelForImageClassification
from optimum.exporters.onnx import onnx_export_from_model

model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Here one could do any modification on the model before the export.
onnx_export_from_model(model, output="vit_onnx")

ONNX export with static shapes

The Optimum ONNX export CLI allows disabling dynamic shapes for inputs/outputs:

optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes

This is useful if the exported model is to be consumed by a runtime that does not support dynamic shapes. The static shapes can be specified, e.g., with --batch_size 1. See all the shape options in optimum-cli export onnx --help.
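For instance, a hedged variant of the command above that also fixes the batch dimension (the output directory name is illustrative):

optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov_static --no-dynamic-axes --batch_size 1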

BF16 ONNX export

The Optimum ONNX export now supports BF16 export on CPU and GPU. Beware though that ONNX Runtime is most often not able to consume the models, as some operations are not implemented in this data type, although the exported models comply with the ONNX standard. This is useful if you are developing a runtime that consumes BF16 ONNX models.

Example:

optimum-cli export onnx --model bert-base-uncased --dtype bf16 bert_onnx 

ONNX export for new models

You can now export table-transformer and bart (for text-classification) to ONNX.
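For example, a hedged CLI invocation for table-transformer (the checkpoint and output directory names are illustrative):

optimum-cli export onnx --model microsoft/table-transformer-detection table_transformer_onnx/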

Sentence Transformers ONNX export
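A minimal sketch of the CLI usage, assuming a standard sentence-transformers checkpoint is picked up automatically by the exporter (the model name and output directory are illustrative):

optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 st_onnx/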

Timm models support with ONNX Runtime

Timm models can now be run through ONNX Runtime with the class ORTModelForImageClassification:

from urllib.request import urlopen

import timm
import torch
from PIL import Image

from optimum.onnxruntime import ORTModelForImageClassification

# Export the model to ONNX under the hood with export=True.
model = ORTModelForImageClassification.from_pretrained("timm/resnext101_64x4d.c1_in1k", export=True)

# Get model specific transforms (normalization, resize).
data_config = timm.data.resolve_data_config(pretrained_cfg=model.config.pretrained_cfg)
transforms = timm.data.create_transform(**data_config, is_training=False)

img = Image.open(
    urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png")
)
output = model(transforms(img).unsqueeze(0)).logits
top5_probabilities, top5_class_indices = torch.topk(torch.softmax(output, dim=1) * 100, k=5)

Other changes and bugfixes

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.17.0

optimum - v1.16.2: Patch release

Published by echarlaix 9 months ago

optimum - v1.16.1: Patch release

Published by fxmarty 10 months ago

Breaking change: BetterTransformer for llama, falcon, whisper, bart is deprecated

The BetterTransformer features for Llama, Falcon, Whisper and Bart have been upstreamed into Transformers. Please use transformers>=4.36 and torch>=2.1.1 to use PyTorch's scaled_dot_product_attention by default.

More details: https://github.com/huggingface/transformers/releases/tag/v4.36.0
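As a hedged illustration, with transformers>=4.36 and torch>=2.1.1 the attention implementation can also be requested explicitly when loading a model (the checkpoint name is illustrative):

from transformers import AutoModelForCausalLM

# SDPA is the default with recent versions; attn_implementation makes the choice explicit.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", attn_implementation="sdpa"
)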

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.16.1

optimum - v1.16.0: Transformers 4.36 compatibility, extended ONNX support

Transformers 4.36 compatibility

Notably, the ONNX export now exports aten::scaled_dot_product_attention in a standardized way for the compatible models.

Extended ONNX support: timm, sentence-transformers, Phi, ESM

GPTQ for Mixtral

Work in progress.

What's Changed

Full Changelog: https://github.com/huggingface/optimum/compare/v1.15.0...v1.16.0

optimum - v1.15.0: ROCMExecutionProvider support

Published by fxmarty 11 months ago

ROCMExecutionProvider support

The Optimum ONNX Runtime integration is extended to officially support ROCMExecutionProvider. See more details in the documentation.
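A minimal sketch of selecting the provider through an ORTModel class (the checkpoint is illustrative; see the documentation for the full ROCm setup):

from optimum.onnxruntime import ORTModelForSequenceClassification

# export=True converts the checkpoint to ONNX on the fly; provider selects the ROCm execution provider.
model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,
    provider="ROCMExecutionProvider",
)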

Extended ONNX export

Swin2sr, DPT, GLPN and ConvNextv2 are now supported in the ONNX export.

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.14.0...v1.15.0

optimum - v1.14.1: Patch release

Published by echarlaix 11 months ago

optimum - v1.14.0: LCMs, SpeechT5, Falcon, Mistral, decoder refactorization

Published by echarlaix 12 months ago

ONNX

New architectures

Falcon

SpeechT5

Mistral

TrOCR

LCMs

Enable LCMs (available in diffusers since v0.22.0) ONNX export and ORT inference by @echarlaix in https://github.com/huggingface/optimum/pull/1469

from optimum.onnxruntime import ORTLatentConsistencyModelPipeline

pipe = ORTLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images

Also enable ONNX export using the CLI:

optimum-cli export onnx --model SimianLuo/LCM_Dreamshaper_v7 lcm_onnx/

Decoder refactorization

GPTQ

Other changes and bugfixes

New Contributors

optimum - v1.13.3: Patch release

Published by fxmarty 12 months ago

Patch release for transformers==4.34.1 compatibility. We will do a release next week for transformers==4.35 compatibility and new features. Please bear with us!

optimum - v1.13.2: Patch release

Published by echarlaix about 1 year ago

optimum - v1.13.1: Patch release

Published by fxmarty about 1 year ago

Fix ONNX fp16 export that broke in 1.13.0.

What's Changed

optimum - v1.13.0: ONNX weight deduplication, ONNX export and ORT extension

Published by fxmarty about 1 year ago

Deduplicate Embedding / LM head weight in the ONNX export

Workaround for a bug in the PyTorch ONNX export that does not deduplicate the Embedding and LM head shared weight: https://github.com/pytorch/pytorch/issues/108342. For small enough models, this results in up to a 50% decrease in the ONNX serialized model size.

Extended ONNX Runtime support

ONNX Runtime integration now supports Pix2Struct and MPT architectures. Donut now supports IO Binding. Encoder-Decoder models are now supported as well.
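As a hedged example for one of these, Pix2Struct can be loaded through its dedicated ORT class (the checkpoint name is illustrative):

from optimum.onnxruntime import ORTModelForPix2Struct

# export=True runs the ONNX export under the hood before loading with ONNX Runtime.
model = ORTModelForPix2Struct.from_pretrained("google/pix2struct-textcaps-base", export=True)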

Extended ONNX export: MPT, TIMM models, Encoder-Decoder

Additionally, the SAM model is now exported by default as vision_encoder.onnx and prompt_encoder_mask_decoder.onnx.

BetterTransformer supports Falcon

Major bugfix: ability to set GPTQ Exllama kernel maximum length in the transformers integration

The function exllama_set_max_input_length from auto-gptq can now be used with Transformers GPTQ models.
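A minimal sketch, assuming model is a Transformers GPTQ model already loaded with the exllama kernels enabled (the length value is illustrative):

from auto_gptq import exllama_set_max_input_length

# Raise the maximum input length handled by the exllama kernels.
model = exllama_set_max_input_length(model, max_input_length=4096)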

Other changes and bugfixes

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.12.0...v1.13.0

optimum - v1.12.0: AutoGPTQ integration, extended BetterTransformer support

Published by fxmarty about 1 year ago

AutoGPTQ integration

Part of the AutoGPTQ library has been integrated into Optimum, with utilities to ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization
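A hedged sketch of the Optimum-side quantization utility described in that guide (the checkpoint, bit-width and calibration dataset are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import GPTQQuantizer

model_id = "facebook/opt-125m"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Quantize the model weights to 4 bits using GPTQ with a calibration dataset.
quantizer = GPTQQuantizer(bits=4, dataset="c4")
quantized_model = quantizer.quantize_model(model, tokenizer)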

Extended BetterTransformer support

BetterTransformer now supports BLOOM and GPT-BigCode architectures.
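A minimal sketch of applying BetterTransformer to one of the newly supported architectures (the checkpoint name is illustrative):

from transformers import AutoModelForCausalLM
from optimum.bettertransformer import BetterTransformer

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")
# Swap supported modules for their BetterTransformer counterparts.
model = BetterTransformer.transform(model)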

Other changes and bugfixes

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.2...v1.12.0

optimum - v1.11.2: Patch release

Published by regisss about 1 year ago

Remove the Transformers version constraint on optimum[habana].

  • Remove Transformers version constraint on Optimum Habana #1290 by @regisss

Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.1...v1.11.2

optimum - v1.11.1: Patch release

Published by fxmarty about 1 year ago

Minor fix: documentation building for 1.11.

Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.0...v1.11.1

optimum - v1.11.0: Extended ONNX, ONNX Runtime, BetterTransformer support

Published by JingyaHuang about 1 year ago

Extended ONNX and ONNX Runtime support

Add ONNX export and ONNX Runtime inference support for GPT-BigCode (example below).

  • Add ONNX / ONNXRuntime support for StarCoder by @JingyaHuang in #1042
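A hedged sketch of ONNX Runtime inference with a GPT-BigCode checkpoint (the model name and prompt are illustrative):

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "bigcode/gpt_bigcode-santacoder"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True converts the checkpoint to ONNX before loading it with ONNX Runtime.
model = ORTModelForCausalLM.from_pretrained(model_id, export=True)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))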

Extended BetterTransformer support

BetterTransformer now supports Llama 2 and bark.

Training and autocast are now supported for most architectures, please refer to the documentation for more details: https://huggingface.co/docs/optimum/main/en/bettertransformer/overview

Major bugfixes

  • Update ORT training to be compatible with transformers 4.31 by @JingyaHuang in #1227

Other improvements and bugfixes

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.11.0

optimum - v1.10.1: Patch release

Published by echarlaix about 1 year ago

Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.10.1

optimum - v1.10.0: Stable Diffusion XL pipelines

Published by echarlaix about 1 year ago

Stable Diffusion XL

Enable SD XL ONNX export and ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1168

  • Enable SD XL ONNX export using the CLI:

optimum-cli export onnx --model stabilityai/stable-diffusion-xl-base-0.9 --task stable-diffusion-xl ./sd_xl_onnx

  • Add SD XL pipelines for ONNX Runtime inference (supported tasks: text-to-image and image-to-image):

from optimum.onnxruntime import ORTStableDiffusionXLPipeline

model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id, export=True)

prompt = "sailing ship in storm by Leonardo da Vinci"
image = pipeline(prompt).images[0]
pipeline.save_pretrained("onnx-sd-xl-base-0.9")

Stable Diffusion pipelines

Enable image-to-image and inpainting pipelines for ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1121
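A minimal sketch of the image-to-image pipeline (the base checkpoint, input image and prompt are illustrative):

from diffusers.utils import load_image
from optimum.onnxruntime import ORTStableDiffusionImg2ImgPipeline

pipeline = ORTStableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", export=True
)
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png"
init_image = load_image(url).resize((512, 512))
image = pipeline(prompt="beignets in a storm, oil painting", image=init_image, strength=0.7).images[0]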

More examples in documentation

Major bugfixes

What's Changed

New Contributors

Full Changelog: https://github.com/huggingface/optimum/compare/v1.9.0...v1.10.0