Liger-Kernel

Efficient Triton Kernels for LLM Training

BSD-2-CLAUSE License

Liger-Kernel - v0.2.1 Latest Release

Published by yundai424 about 2 months ago

Patch Release

Fixed a bug in the Gemma patch function where FLCE (FusedLinearCrossEntropy) and CE (CrossEntropy) were both enabled by default. Ruh roh.

Full Changelog: https://github.com/linkedin/Liger-Kernel/compare/v0.2.0...v0.2.1

Liger-Kernel - v0.2.0 Release Note

Published by yundai424 about 2 months ago

Opening Thoughts 🫶

Thank You!

We'd love to take this chance to express our sincere gratitude to the community! 2500+ ⭐, 10+ new contributors, 50+ PRs, plus integration into Hugging Face 🤗, axolotl and LLaMA-Factory in less than one week since going open source: all of this is far beyond our expectations. Working together with all the cool people in the community is a joy, and we can't wait for further collaborations down the road!

Looking Ahead

We look forward to deepening our collaboration with the community and working together on a lot of cool stuff: supporting more model families, squeezing out every optimization opportunity in the kernels, and, why not, llama.triton? 😉

Get Involved and Stay Tuned

Please feel free to join our Discord channel, hosted on the CUDA MODE server, and follow our repo's official account on X: https://x.com/liger_kernel !

Welcome Phi3 and Qwen2 🚀

This release adds support for more popular models, including Phi3 and Qwen2. All existing kernels in the Liger repo can now be used to boost your training with models from these families. Please refer to our API guide for usage details.

Even Easier API ❤️

Experimenting with different model families and tired of having if-else everywhere just to switch between kernel patching functions? You can now try out our new model-agnostic API to apply Liger kernels. Still a one-liner, but more elegant :) For example:

from liger_kernel.transformers import AutoLigerKernelForCausalLM

# This AutoModel wrapper class automatically monkey-patches the
# model with the optimized Liger kernels if the model is supported.
model = AutoLigerKernelForCausalLM.from_pretrained(...)
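
To illustrate what a model-agnostic entry point saves you from, here is a minimal sketch of the dispatch-by-model-type pattern. The patch functions and the `PATCHERS` registry below are hypothetical stand-ins that only record that they ran; they are not Liger's actual functions, and this is an assumption about how such a wrapper could be organised rather than a description of Liger's internals.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for per-family patching functions. Each one only
# records that it ran, so the example stays runnable without a GPU.
applied: List[str] = []

def patch_llama() -> None:
    applied.append("llama")

def patch_qwen2() -> None:
    applied.append("qwen2")

def patch_phi3() -> None:
    applied.append("phi3")

# One lookup table replaces a chain of if/elif branches in user code.
PATCHERS: Dict[str, Callable[[], None]] = {
    "llama": patch_llama,
    "qwen2": patch_qwen2,
    "phi3": patch_phi3,
}

def apply_kernels(model_type: str) -> bool:
    """Patch the given model family if supported; return whether it was patched."""
    patcher = PATCHERS.get(model_type)
    if patcher is None:
        return False  # unsupported family: leave the model untouched
    patcher()
    return True
```

With a table like this, switching model families means changing one string instead of editing branching logic, and an unsupported family simply falls through unpatched.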

More Features

  • Support optional bias term in FusedLinearCrossEntropy (#144)
  • Mistral now benefits from the humongous memory reduction of FusedLinearCrossEntropy (#93)
  • Gemma now benefits from the humongous memory reduction of FusedLinearCrossEntropy (#111)
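
As a rough illustration of where the memory reduction comes from, the sketch below computes a linear head plus cross-entropy chunk by chunk, so that only one chunk's logits exist at a time instead of the full tokens-by-vocabulary matrix. This is plain Python for readability, not Liger's Triton kernel; the function name, the `chunk_size` parameter, and the list-based layout are all assumptions made for the example. The optional `bias` argument mirrors the optional bias term mentioned above.

```python
import math
from typing import Optional, Sequence

def chunked_linear_cross_entropy(
    hidden: Sequence[Sequence[float]],       # [tokens][hidden_dim]
    weight: Sequence[Sequence[float]],       # [vocab][hidden_dim]
    targets: Sequence[int],                  # [tokens]
    bias: Optional[Sequence[float]] = None,  # [vocab], optional
    chunk_size: int = 2,
) -> float:
    """Mean cross-entropy of a linear head, computed one chunk at a time."""
    total = 0.0
    for start in range(0, len(hidden), chunk_size):
        chunk = hidden[start:start + chunk_size]
        chunk_targets = targets[start:start + chunk_size]
        # Logits are materialized for this chunk only, then discarded.
        chunk_logits = [
            [
                sum(w_i * h_i for w_i, h_i in zip(row, h))
                + (bias[v] if bias is not None else 0.0)
                for v, row in enumerate(weight)
            ]
            for h in chunk
        ]
        for logits, target in zip(chunk_logits, chunk_targets):
            # Numerically stable log-sum-exp.
            m = max(logits)
            log_z = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_z - logits[target]
    return total / len(hidden)
```

The result does not depend on `chunk_size`; only the peak size of the intermediate logits does.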

Bug Fixes

  • Fixed import error when using triton>=3.0.0 on NGC containers (#79)
  • Fixed the missing offset in Gemma RMSNorm (#85) oops
  • Added back missing dataclass entries in efficiency callback (#116)
  • There was some confusion about which Gemma models we support; we now support all of them! (#125)
  • Fall back to torch-native linear + CrossEntropy when no labels are provided (#128)
  • Match the exact dtype upcasting and downcasting in RMSNorm for Llama & Gemma (#92)
  • Fixed a bug that made RoPE very slow when using dynamic sequence lengths (#149)
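
For context on the RMSNorm offset fix, here is a minimal plain-Python sketch of RMSNorm with an optional offset on the weight. As we understand it, Gemma's reference implementation scales the normalized input by (1 + weight), while Llama scales by the weight alone; the `offset` parameter below is a name chosen for this example to capture that difference. The sketch ignores the dtype casting details and is not Liger's kernel.

```python
import math
from typing import List, Sequence

def rms_norm(
    x: Sequence[float],
    weight: Sequence[float],
    eps: float = 1e-6,
    offset: float = 0.0,
) -> List[float]:
    """Normalize x by its root mean square, then scale by (offset + weight)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * (offset + w) for v, w in zip(x, weight)]

# Llama-style: offset = 0.0, so the scale is the weight itself.
# Gemma-style: offset = 1.0, so a zero weight leaves the normalized value intact.
```

Dropping the offset for a Gemma model would multiply every activation by a weight near zero instead of near one, which is why the missing term matters.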

Full Changelog: https://github.com/linkedin/Liger-Kernel/compare/v0.1.1...v0.2.0

Liger-Kernel - v0.1.1: Add readme on pypi

Published by ByronHsu 2 months ago

Full Changelog: https://github.com/linkedin/Liger-Kernel/compare/v0.1.0...v0.1.1

Liger-Kernel - v0.1.0: First Public Release

Published by shimizust 2 months ago

Full Changelog: https://github.com/linkedin/Liger-Kernel/compare/v0.0.1...v0.1.0

Liger-Kernel - v0.0.1 pre release

Published by ByronHsu 2 months ago

Full Changelog: https://github.com/linkedin/Liger-Kernel/compare/0.0.2...v0.0.1
