🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools
Full Changelog: https://github.com/huggingface/optimum/compare/v1.18.0...v1.18.1
Published by echarlaix 7 months ago
Transformers v4.39.0 compatibility by @echarlaix in #1764
Published by regisss 8 months ago
Full Changelog: https://github.com/huggingface/optimum/compare/v1.17.0...v1.17.1
Published by fxmarty 8 months ago
ONNX export from nn.Module
A function is exposed to programmatically export any nn.Module (e.g. models coming from Transformers, but modified). This is useful in case you need to make modifications to a model loaded from the Hub before exporting it. Example:
from transformers import AutoModelForImageClassification
from optimum.exporters.onnx import onnx_export_from_model
model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
# Here one could do any modification on the model before the export.
onnx_export_from_model(model, output="vit_onnx")
The Optimum ONNX export CLI now allows disabling dynamic shapes for inputs/outputs:
optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes
This is useful if the exported model is to be consumed by a runtime that does not support dynamic shapes. The static shape can be specified, e.g. with --batch_size 1. See all the shape options with optimum-cli export onnx --help.
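For instance, both options can be combined for a fully static export (a sketch; --batch_size is one of the shape options listed by --help):
optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes --batch_size 1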
The Optimum ONNX export now supports BF16 export on CPU and GPU. Beware though that ONNX Runtime is most often not able to consume these models, as some operations are not implemented in this data type, although the exported models comply with the ONNX standard. This is useful if you are developing a runtime that consumes BF16 ONNX models.
Example:
optimum-cli export onnx --model bert-base-uncased --dtype bf16 bert_onnx
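As a quick sanity check, the exported weights can be inspected with the onnx package (a minimal sketch, assuming the command above wrote bert_onnx/model.onnx):
import onnx
# Load the exported graph and collect the data types of its weights.
model = onnx.load("bert_onnx/model.onnx")
weight_dtypes = {init.data_type for init in model.graph.initializer}
# True if the weights were serialized as bfloat16.
print(onnx.TensorProto.BFLOAT16 in weight_dtypes)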
You can now export table-transformer, as well as bart for text-classification, to ONNX.
Add trust_remote_code to the sentence transformers export by @xenova in https://github.com/huggingface/optimum/pull/1677
Timm models can now be run through ONNX Runtime with the class ORTModelForImageClassification:
from urllib.request import urlopen
import timm
import torch
from PIL import Image
from optimum.onnxruntime import ORTModelForImageClassification
# Export the model to ONNX under the hood with export=True.
model = ORTModelForImageClassification.from_pretrained("timm/resnext101_64x4d.c1_in1k", export=True)
# Get model specific transforms (normalization, resize).
data_config = timm.data.resolve_data_config(pretrained_cfg=model.config.pretrained_cfg)
transforms = timm.data.create_transform(**data_config, is_training=False)
img = Image.open(
urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png")
)
output = model(transforms(img).unsqueeze(0)).logits
top5_probabilities, top5_class_indices = torch.topk(torch.softmax(output, dim=1) * 100, k=5)
Add input_labels input to SAM model export by @xenova in https://github.com/huggingface/optimum/pull/1638
_export by @JingyaHuang in https://github.com/huggingface/optimum/pull/1652
onnx_export by @fxmarty in https://github.com/huggingface/optimum/pull/1685
Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.17.0
Published by echarlaix 9 months ago
Fix ORT training compatibility for transformers v4.36.0 by @AdamLouly in https://github.com/huggingface/optimum/pull/1586
Fix ONNX export compatibility for transformers v4.37.0 by @echarlaix in https://github.com/huggingface/optimum/pull/1641
Published by fxmarty 10 months ago
The features from BetterTransformer for Llama, Falcon, Whisper and Bart have been upstreamed in Transformers. Please use transformers>=4.36 and torch>=2.1.1 to use PyTorch's scaled_dot_product_attention by default.
More details: https://github.com/huggingface/transformers/releases/tag/v4.36.0
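On the Transformers side this looks as follows (a minimal sketch, assuming transformers>=4.36; SDPA is picked by default for supported architectures and can also be requested explicitly):
from transformers import AutoModelForSpeechSeq2Seq
# attn_implementation="sdpa" explicitly selects PyTorch's
# scaled_dot_product_attention backend.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-tiny", attn_implementation="sdpa"
)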
Full Changelog: https://github.com/huggingface/optimum/compare/v1.16.0...v1.16.1
Published by fxmarty 10 months ago
Notably, the ONNX export now uses aten::scaled_dot_product_attention in a standardized way for the compatible models.
Work in progress.
Add modules_in_block_to_quantize arg for gptq by @SunMarc in https://github.com/huggingface/optimum/pull/1585
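A minimal sketch of how the argument is used (assuming Optimum's GPTQQuantizer; the module names are illustrative for a Llama-style decoder block):
from optimum.gptq import GPTQQuantizer
# Each inner list of modules is quantized together, in forward-pass order.
quantizer = GPTQQuantizer(
    bits=4,
    dataset="c4",
    modules_in_block_to_quantize=[
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
    ],
)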
Full Changelog: https://github.com/huggingface/optimum/compare/v1.15.0...v1.16.0
Published by fxmarty 11 months ago
The Optimum ONNX Runtime integration is extended to officially support ROCMExecutionProvider. See more details in the documentation.
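For example (a sketch, assuming a ROCm build of ONNX Runtime is installed):
from optimum.onnxruntime import ORTModelForSequenceClassification
# export=True converts the checkpoint to ONNX on the fly; the provider
# argument selects the AMD GPU execution backend.
ort_model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    export=True,
    provider="ROCMExecutionProvider",
)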
Swin2SR, DPT, GLPN and ConvNeXt V2 are now supported in the ONNX export.
Add convnextv2 onnx export by @xenova in https://github.com/huggingface/optimum/pull/1560
delete_doc_comment workflows by @regisss in https://github.com/huggingface/optimum/pull/1565
Full Changelog: https://github.com/huggingface/optimum/compare/v1.14.0...v1.15.0
Published by echarlaix 11 months ago
Published by echarlaix 12 months ago
Enable LCMs (available in diffusers since v0.22.0) ONNX export and ORT inference by @echarlaix in https://github.com/huggingface/optimum/pull/1469
from optimum.onnxruntime import ORTLatentConsistencyModelPipeline
pipe = ORTLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images
Also enable ONNX export using the CLI:
optimum-cli export onnx --model SimianLuo/LCM_Dreamshaper_v7 lcm_onnx/
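The exported folder can then be loaded back for ONNX Runtime inference (a sketch, reusing the output directory from the command above):
from optimum.onnxruntime import ORTLatentConsistencyModelPipeline
# No export=True needed here: the directory already contains the ONNX files.
pipe = ORTLatentConsistencyModelPipeline.from_pretrained("lcm_onnx/")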
with-past in the ONNX export by @fxmarty in https://github.com/huggingface/optimum/pull/1358
Published by fxmarty 12 months ago
Patch release for transformers==4.34.1 compatibility. We will do a release next week for transformers==4.35 compatibility and new features. Please bear with us!
Published by echarlaix about 1 year ago
Published by fxmarty about 1 year ago
Fix ONNX fp16 export that broke in 1.13.0.
Published by fxmarty about 1 year ago
Workaround for a bug in the PyTorch ONNX export that does not deduplicate the shared weight between the Embedding and the LM head: https://github.com/pytorch/pytorch/issues/108342. For small enough models, this results in up to a 50% decrease in the serialized ONNX model size.
ONNX Runtime integration now supports Pix2Struct and MPT architectures. Donut now supports IO Binding. Encoder-Decoder models are now supported as well.
Additionally, the SAM model is now by default exported as a vision_encoder.onnx and a prompt_encoder_mask_decoder.onnx.
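For instance, Pix2Struct can now be run through ONNX Runtime (a minimal sketch; the checkpoint name is illustrative):
from optimum.onnxruntime import ORTModelForPix2Struct
# export=True converts the PyTorch checkpoint to ONNX on the fly.
model = ORTModelForPix2Struct.from_pretrained("google/pix2struct-base", export=True)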
[BetterTransformer] Add falcon to BetterTransformer by @younesbelkada in https://github.com/huggingface/optimum/pull/1343
The function exllama_set_max_input_length from auto-gptq can now be used with Transformers GPTQ models.
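A minimal sketch of the call (the checkpoint and length value are illustrative; this assumes the model uses the exllama backend):
from auto_gptq import exllama_set_max_input_length
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("TheBloke/Llama-2-7B-GPTQ", device_map="auto")
# Re-allocates the exllama buffers so that longer prompts fit.
model = exllama_set_max_input_length(model, max_input_length=4096)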
Update version to 1.12.1.dev0 following release by @fxmarty in https://github.com/huggingface/optimum/pull/1312
Add GPTQ prefill benchmark by @fxmarty in https://github.com/huggingface/optimum/pull/1313
Precise ORTModel documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1268
Improve BetterTransformer backward compatibility by @fxmarty in https://github.com/huggingface/optimum/pull/1314
Improve ORTModel documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1245
Add bitsandbytes benchmark by @fxmarty in https://github.com/huggingface/optimum/pull/1320
fix typo in log message by @AAnirudh07 in https://github.com/huggingface/optimum/pull/1322
Support customize dtype for dummy generators by @JingyaHuang in https://github.com/huggingface/optimum/pull/1307
Fix opset custom onnx export by @mht-sharma in https://github.com/huggingface/optimum/pull/1331
Replace mpt to ernie custom export by @mht-sharma in https://github.com/huggingface/optimum/pull/1332
Fix BT benchmark script by @fxmarty in https://github.com/huggingface/optimum/pull/1344
Add name_or_path for donut generation by @fxmarty in https://github.com/huggingface/optimum/pull/1345
send both negative prompt embeds to ORT SDXL by @ssube in https://github.com/huggingface/optimum/pull/1339
add vae image processor by @echarlaix in https://github.com/huggingface/optimum/pull/1219
add negative prompt test by @echarlaix in https://github.com/huggingface/optimum/pull/1347
Add GPT BigCode to the BT documentation by @fxmarty in https://github.com/huggingface/optimum/pull/1356
Add BT dummy objects by @fxmarty in https://github.com/huggingface/optimum/pull/1355
Add text2text-generation-with-past test for encoder-decoder model by @mht-sharma in https://github.com/huggingface/optimum/pull/1338
Fix sentence transformer export by @mht-sharma in https://github.com/huggingface/optimum/pull/1366
Full Changelog: https://github.com/huggingface/optimum/compare/v1.12.0...v1.13.0
Published by fxmarty about 1 year ago
Part of the AutoGPTQ library has been integrated into Optimum, with utilities to ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization
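A minimal sketch of the resulting workflow on the Transformers side (assuming a recent transformers with GPTQConfig; the checkpoint is illustrative):
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Quantize to 4 bits while loading, calibrating on the c4 dataset.
quantization_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=quantization_config
)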
BetterTransformer now supports BLOOM and GPT-BigCode architectures.
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.2...v1.12.0
Published by regisss about 1 year ago
Remove the Transformers version constraint on optimum[habana].
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.1...v1.11.2
Published by fxmarty about 1 year ago
Minor fix: documentation building for 1.11.
Full Changelog: https://github.com/huggingface/optimum/compare/v1.11.0...v1.11.1
Published by JingyaHuang about 1 year ago
Add ONNX export and ONNX Runtime inference support for GPT BigCode.
BetterTransformer now supports Llama 2 and Bark.
Training and autocast are now supported for most architectures, please refer to the documentation for more details: https://huggingface.co/docs/optimum/main/en/bettertransformer/overview
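A minimal sketch of enabling it (gpt2 here is illustrative; any supported architecture works the same way):
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("gpt2")
# Swaps supported modules for their BetterTransformer equivalents.
model = BetterTransformer.transform(model)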
Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.11.0
Published by echarlaix about 1 year ago
Fix OwlViT exporter by @regisss in https://github.com/huggingface/optimum/pull/1188
Fix SD loading when safetensors weights only by @echarlaix in https://github.com/huggingface/optimum/pull/1232
Fix optimum-intel version requirements by @echarlaix in https://github.com/huggingface/optimum/pull/1234
Full Changelog: https://github.com/huggingface/optimum/compare/v1.10.0...v1.10.1
Published by echarlaix about 1 year ago
Enable SD XL ONNX export and ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1168
optimum-cli export onnx --model stabilityai/stable-diffusion-xl-base-0.9 --task stable-diffusion-xl ./sd_xl_onnx
from optimum.onnxruntime import ORTStableDiffusionXLPipeline
model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
image = pipeline(prompt).images[0]
pipeline.save_pretrained("onnx-sd-xl-base-0.9")
Enable image-to-image and inpainting pipelines for ONNX Runtime inference by @echarlaix in https://github.com/huggingface/optimum/pull/1121
More examples in documentation
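For example, the image-to-image pipeline can be used as follows (a sketch; the checkpoint, input file and call arguments are illustrative):
from PIL import Image
from optimum.onnxruntime import ORTStableDiffusionImg2ImgPipeline
# export=True converts the diffusers checkpoint to ONNX on the fly.
pipe = ORTStableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", export=True
)
init_image = Image.open("input.png").convert("RGB").resize((512, 512))
image = pipe(prompt="a watercolor painting", image=init_image, strength=0.75).images[0]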
attention_mask=None for BetterTransformer in the batched inference case for gpt2 & gpt-neo by @fxmarty in https://github.com/huggingface/optimum/pull/1180
Full Changelog: https://github.com/huggingface/optimum/compare/v1.9.0...v1.10.0