TurboTransformers

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, decoders, etc.) on CPU and GPU.
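As a minimal sketch of the intended workflow: load a pretrained Bert with Hugging Face `transformers` and hand it to the TurboTransformers runtime. The `turbo_transformers.BertModel.from_torch` call and the shape of the returned outputs are assumptions based on the project's examples and may differ between releases.

```python
# Minimal sketch, assuming the turbo_transformers Python package exposes
# BertModel.from_torch (an assumption; check the examples shipped with
# your release for the exact API).
import torch
import transformers
import turbo_transformers

torch_model = transformers.BertModel.from_pretrained("bert-base-uncased")
torch_model.eval()

# Convert the PyTorch weights into a TurboTransformers model (assumed API).
tt_model = turbo_transformers.BertModel.from_torch(torch_model)

input_ids = torch.tensor([[101, 2054, 2003, 2023, 102]], dtype=torch.long)
with torch.no_grad():
    outputs = tt_model(input_ids)  # runs the optimized CPU/GPU kernels
```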

TurboTransformers v0.5.1 (Latest Release)

Published by fangjiarui almost 4 years ago

The Albert model now uses the model-aware allocator.

TurboTransformers v0.5.0

Published by fangjiarui almost 4 years ago

Added a model-aware allocator for the Bert model.

TurboTransformers v0.4.2

Published by feifeibear about 4 years ago

Added quantized Bert inference using onnxruntime.
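As an illustration of the underlying technique, a generic onnxruntime dynamic-quantization sketch is shown below. This is not TurboTransformers' internal wiring; the ONNX file names are hypothetical and assume the Bert model has already been exported to ONNX.

```python
# Generic sketch of dynamic int8 quantization with onnxruntime; file names
# are hypothetical and this is not TurboTransformers' actual integration.
from onnxruntime.quantization import quantize_dynamic, QuantType

# "bert.onnx" is a hypothetical path to a Bert model exported to ONNX.
quantize_dynamic(
    model_input="bert.onnx",
    model_output="bert.quant.onnx",
    weight_type=QuantType.QInt8,  # quantize weights to signed int8
)
```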

TurboTransformers v0.4.1

Published by feifeibear about 4 years ago

Added onnxruntime-cpu as an alternative CPU backend, in parallel with our own home-grown implementation.
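For reference, running an ONNX-exported Bert on onnxruntime's CPU execution provider looks roughly like the sketch below; the file name and input names are hypothetical and depend on how the model was exported, and TurboTransformers selects its backend internally rather than through this code path.

```python
# Plain onnxruntime CPU inference sketch; "bert.onnx" and the input names
# are hypothetical and depend on the export.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("bert.onnx", providers=["CPUExecutionProvider"])

input_ids = np.array([[101, 2054, 2003, 2023, 102]], dtype=np.int64)
attention_mask = np.ones_like(input_ids)

outputs = session.run(
    None,  # fetch all model outputs
    {"input_ids": input_ids, "attention_mask": attention_mask},
)
```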

TurboTransformers v0.3.0

Published by feifeibear over 4 years ago

Support for the Transformer decoder used in OpenNMT-py.
New GPU memory allocator.
Compatible with PyTorch v1.5.0.

TurboTransformers v0.2.1

Published by feifeibear over 4 years ago

Added BLIS to the available BLAS backends.

TurboTransformers v0.0.1

Published by feifeibear over 4 years ago

Bert acceleration on CPU and GPU.
