🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
Full Changelog: https://github.com/huggingface/optimum/compare/v1.18.0...v1.18.1
Published by echarlaix 7 months ago
Transformers v4.39.0 compatibility by @echarlaix in #1764
Published by regisss 8 months ago
Full Changelog: https://github.com/huggingface/optimum/compare/v1.17.0...v1.17.1
Published by fxmarty 8 months ago
ONNX export from nn.Module
A function is exposed to programmatically export any nn.Module (e.g. models coming from Transformers, but modified). This is useful in case you need to make modifications to a model loaded from the Hub before exporting it. Example:
from transformers import AutoModelForImageClassification
from optimum.exporters.onnx import onnx_export_from_model
model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
# Here one could do any modification on the model before the export.
onnx_export_from_model(model, output="vit_onnx")
The Optimum ONNX export CLI now allows disabling dynamic shapes for inputs/outputs:
optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes
This is useful if the exported model is to be consumed by a runtime that does not support dynamic shapes. The static shape can be specified, e.g. with --batch_size 1. See all the shape options with optimum-cli export onnx --help.
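For instance, both options can be combined for a fully static export (a sketch; --batch_size is one of the shape options listed by --help):
optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes --batch_size 1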
The Optimum ONNX export now supports BF16 export on CPU and GPU. Beware though that ONNX Runtime is most often not able to consume these models, as some operations are not implemented in this data type, although the exported models comply with the ONNX standard. This is useful if you are developing a runtime that consumes BF16 ONNX models.
Example:
optimum-cli export onnx --model bert-base-uncased --dtype bf16 bert_onnx
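As a quick sanity check, the exported weights can be inspected with the onnx package (a minimal sketch, assuming the command above wrote bert_onnx/model.onnx):
import onnx
# Load the exported graph and collect the data types of its weights.
model = onnx.load("bert_onnx/model.onnx")
weight_dtypes = {init.data_type for init in model.graph.initializer}
# True if the weights were serialized as bfloat16.
print(onnx.TensorProto.BFLOAT16 in weight_dtypes)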
You can now export table-transformer, as well as bart for text-classification, to ONNX.
Add trust_remote_code to the sentence transformers export by @xenova in https://github.com/huggingface/optimum/pull/1677
Timm models can now be run through ONNX Runtime with the class ORTModelForImageClassification:
from urllib.request import urlopen
import timm
import torch
from PIL import Image
from optimum.onnxruntime import ORTModelForImageClassification
# Export the model to ONNX under the hood with export=True.
model = ORTModelForImageClassification.from_pretrained("timm/resnext101_64x4d.c1_in1k", export=True)
# Get model specific transforms (normalization, resize).
data_config = timm.data.resolve_data_config(pretrained_cfg=model.config.pretrained_cfg)
transforms = timm.data.create_transform(**data_config, is_training=False)
img = Image.open(
urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png")
)
output = model(transforms(img).unsqueeze(0)).logits
top5_probabilities, top5_class_indices = torch.topk(torch.softmax(output, dim=1) * 100, k=5)
Add input_labels input to SAM model export by @xenova in https://github.com/huggingface/optimum/pull/1638
_export by @JingyaHuang in https://github.com/huggingface/optimum/pull/1652
onnx_export by @fxmarty in https://github.com/huggingface/optimum/pull/1685
Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.17.0
Published by echarlaix 9 months ago
Fix ORT training compatibility for transformers v4.36.0 by @AdamLouly in https://github.com/huggingface/optimum/pull/1586
Fix ONNX export compatibility for transformers v4.37.0 by @echarlaix in https://github.com/huggingface/optimum/pull/1641
Published by fxmarty 10 months ago
The features from BetterTransformer for Llama, Falcon, Whisper and Bart have been upstreamed in Transformers. Please use transformers>=4.36 and torch>=2.1.1 to use PyTorch's scaled_dot_product_attention by default.
More details: https://github.com/huggingface/transformers/releases/tag/v4.36.0
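On the Transformers side this looks as follows (a minimal sketch, assuming transformers>=4.36; SDPA is picked by default for supported architectures and can also be requested explicitly):
from transformers import AutoModelForSpeechSeq2Seq
# attn_implementation="sdpa" explicitly selects PyTorch's
# scaled_dot_product_attention backend.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-tiny", attn_implementation="sdpa"
)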
Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.16.1
Published by fxmarty 10 months ago
Notably, the ONNX export now uses aten::scaled_dot_product_attention in a standardized way for the compatible models.
Work in progress.
Add modules_in_block_to_quantize arg for gptq by @SunMarc in https://github.com/huggingface/optimum/pull/1585
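A minimal sketch of how the argument is used (assuming Optimum's GPTQQuantizer; the module names are illustrative for a Llama-style decoder block):
from optimum.gptq import GPTQQuantizer
# Each inner list of modules is quantized together, in forward-pass order.
quantizer = GPTQQuantizer(
    bits=4,
    dataset="c4",
    modules_in_block_to_quantize=[
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
    ],
)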
Full Changelog: https://github.com/huggingface/optimum/compare/v1.15.0...v1.16.0
Published by fxmarty 11 months ago
The Optimum ONNX Runtime integration is extended to officially support ROCMExecutionProvider. See more details in the documentation.
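For example (a sketch, assuming a ROCm build of ONNX Runtime is installed):
from optimum.onnxruntime import ORTModelForSequenceClassification
# export=True converts the checkpoint to ONNX on the fly; the provider
# argument selects the AMD GPU execution backend.
ort_model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,
    provider="ROCMExecutionProvider",
)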
Swin2SR, DPT, GLPN and ConvNeXt V2 are now supported in the ONNX export.
Add convnextv2 onnx export by @xenova in https://github.com/huggingface/optimum/pull/1560
delete_doc_comment workflows by @regisss in https://github.com/huggingface/optimum/pull/1565
Full Changelog: https://github.com/huggingface/optimum/compare/v1.14.0...v1.15.0
Published by echarlaix 11 months ago
Published by echarlaix 12 months ago
Enable LCMs (available in diffusers since v0.22.0) ONNX export and ORT inference by @echarlaix in https://github.com/huggingface/optimum/pull/1469
from optimum.onnxruntime import ORTLatentConsistencyModelPipeline
pipe = ORTLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images
Also enable ONNX export using the CLI:
optimum-cli export onnx --model SimianLuo/LCM_Dreamshaper_v7 lcm_onnx/
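The exported folder can then be loaded back for ONNX Runtime inference (a sketch, reusing the output directory from the command above):
from optimum.onnxruntime import ORTLatentConsistencyModelPipeline
# No export=True needed here: the directory already contains the ONNX files.
pipe = ORTLatentConsistencyModelPipeline.from_pretrained("lcm_onnx/")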
with-past in the ONNX export by @fxmarty in https://github.com/huggingface/optimum/pull/1358
Published by fxmarty 12 months ago
Patch release for transformers==4.34.1 compatibility. We will do a release next week for transformers==4.35 compatibility and new features. Please bear with us!
Published by echarlaix about 1 year ago
Published by fxmarty about 1 year ago
Fix ONNX fp16 export that broke in 1.13.0.
Published by fxmarty about 1 year ago
Workaround for a bug in the PyTorch ONNX export that does not deduplicate the shared weight between the Embedding and the LM head: https://github.com/pytorch/pytorch/issues/108342. For small enough models, this results in up to a 50% decrease in the serialized ONNX model size.
ONNX Runtime integration now supports Pix2Struct and MPT architectures. Donut now supports IO Binding. Encoder-Decoder models are now supported as well.
Additionally, the SAM model is now by default exported as a vision_encoder.onnx and a prompt_encoder_mask_decoder.onnx.
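For instance, Pix2Struct can now be run through ONNX Runtime (a minimal sketch; the checkpoint name is illustrative):
from optimum.onnxruntime import ORTModelForPix2Struct
# export=True converts the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForPix2Struct.from_pretrained("google/pix2struct-base", export=True)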
[BetterTransformer] Add falcon to BetterTransformer by @younesbelkada in https://github.com/huggingface/optimum/pull/1343
The function exllama_set_max_input_length from auto-gptq can now be used with Transformers GPTQ models.
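A minimal sketch of the call (the checkpoint and length value are illustrative; this assumes the model uses the exllama backend):
from auto_gptq import exllama_set_max_input_length
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GPTQ", device_map="auto")
# Re-allocates the exllama buffers so that longer prompts fit.
model = exllama_set_max_input_length(model, max_input_length=4096)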
Update version to 1.12.1.dev0 following release by @fxmarty in https://github.com/huggingface/optimum/pull/1312
Add GPTQ prefill benchmark by @fxmarty in https://github.com/huggingface/optimum/pull/1313
Precise ORTModel documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1268
Improve BetterTransformer backward compatibility by @fxmarty in https://github.com/huggingface/optimum/pull/1314
Improve ORTModel documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1245
Add bitsandbytes benchmark by @fxmarty in https://github.com/huggingface/optimum/pull/1320
fix typo in log message by @AAnirudh07 in https://github.com/huggingface/optimum/pull/1322
Support customize dtype for dummy generators by @JingyaHuang in https://github.com/huggingface/optimum/pull/1307
Fix opset custom onnx export by @mht-sharma in https://github.com/huggingface/optimum/pull/1331
Replace mpt to ernie custom export by @mht-sharma in https://github.com/huggingface/optimum/pull/1332
Fix BT benchmark script by @fxmarty in https://github.com/huggingface/optimum/pull/1344
Add name_or_path for donut generation by @fxmarty in https://github.com/huggingface/optimum/pull/1345
send both negative prompt embeds to ORT SDXL by @ssube in https://github.com/huggingface/optimum/pull/1339
add vae image processor by @echarlaix in https://github.com/huggingface/optimum/pull/1219
add negative prompt test by @echarlaix in https://github.com/huggingface/optimum/pull/1347
Add GPT BigCode to the BT documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1356
Add BT dummy objects by @fxmarty in https://github.com/huggingface/optimum/pull/1355
Add text2text-generation-with-past test for encoder-decoder model by @mht-sharma in https://github.com/huggingface/optimum/pull/1338
Fix sentence transformer export by @mht-sharma in https://github.com/huggingface/optimum/pull/1366
Full Changelog: https://github.com/huggingface/optimum/compare/v1.12.0...v1.13.0
Published by fxmarty about 1 year ago
Part of the AutoGPTQ library has been integrated into Optimum, with utilities to ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization
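A minimal sketch of the resulting workflow on the Transformers side (assuming a recent transformers with GPTQConfig; the checkpoint is illustrative):
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Quantize to 4 bits while loading, calibrating on the c4 dataset.
quantization_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=quantization_config
)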
BetterTransformer now supports BLOOM and GPT-BigCode architectures.
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.2...v1.12.0
Published by regisss about 1 year ago
Remove the Transformers version constraint on optimum[habana].
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.1...v1.11.2
Published by fxmarty about 1 year ago
Minor fix: documentation building for 1.11.
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.0...v1.11.1
Published by JingyaHuang about 1 year ago
Add ONNX export and ONNX Runtime inference support for GPT BigCode.
BetterTransformer now supports Llama 2 and Bark.
Training and autocast are now supported for most architectures, please refer to the documentation for more details: https://huggingface.co/docs/optimum/main/en/bettertransformer/overview
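A minimal sketch of enabling it (gpt2 here is illustrative; any supported architecture works the same way):
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("gpt2")
# Swaps supported modules for their BetterTransformer equivalents.
model = BetterTransformer.transform(model)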
Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.11.0
Published by echarlaix about 1 year ago
Fix OwlViT exporter by @regisss in https://github.com/huggingface/optimum/pull/1188
Fix SD loading when safetensors weights only by @echarlaix in https://github.com/huggingface/optimum/pull/1232
Fix optimum-intel version requirements by @echarlaix in https://github.com/huggingface/optimum/pull/1234
Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.10.1
Published by echarlaix about 1 year ago
Enable SD XL ONNX export and ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1168
optimum-cli export onnx --model stabilityai/stable-diffusion-xl-base-0.9 --task stable-diffusion-xl ./sd_xl_onnx
from optimum.onnxruntime import ORTStableDiffusionXLPipeline
model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
image = pipeline(prompt).images[0]
pipeline.save_pretrained("onnx-sd-xl-base-0.9")
Enable image-to-image and inpainting pipelines for ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1121
More examples in documentation
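For example, the image-to-image pipeline can be used as follows (a sketch; the checkpoint, input file and call arguments are illustrative):
from PIL import Image
from optimum.onnxruntime import ORTStableDiffusionImg2ImgPipeline
# export=True converts the diffusers checkpoint to ONNX on the fly.
pipe = ORTStableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", export=True
)
init_image = Image.open("input.png").convert("RGB").resize((512, 512))
image = pipe(prompt="a watercolor painting", image=init_image, strength=0.75).images[0]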
attention_mask=None for BetterTransformer in the batched inference case for gpt2 & gpt-neo by @fxmarty in https://github.com/huggingface/optimum/pull/1180
Full Changelog: https://github.com/huggingface/optimum/compare/v1.9.0...v1.10.0