Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
A flexible package for multimodal deep learning to combine tabular data with text and images using Wide and Deep models in Pytorch
MegaBlocks, a fast MoE implementation for Pytorch
Implementation of Slot Attention from GoogleAI
Self-contained Pytorch implementation of a Sinkhorn-based router, for mixture of experts or otherwise
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in Pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
Implementation of Feedback Transformer in Pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
A Pytorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
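
Most of the repositories above revolve around the same core idea: a learned gate routes each token to a small subset of expert feedforward networks, so parameter count grows with the number of experts while per-token compute stays roughly constant. The sketch below is a minimal, illustrative top-k gated MoE layer in Pytorch; the class name, dimensions, and the plain softmax-over-top-k gate are assumptions for illustration, not code taken from any repository listed here.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    # illustrative sketch of sparsely-gated MoE routing, not from any repo above
    def __init__(self, dim = 512, num_experts = 8, top_k = 2, hidden_mult = 4):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts, bias = False)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(dim, dim * hidden_mult),
                nn.GELU(),
                nn.Linear(dim * hidden_mult, dim)
            ) for _ in range(num_experts)
        ])

    def forward(self, x):
        # x: (batch, seq, dim) -> flatten tokens for routing
        batch, seq, dim = x.shape
        tokens = x.reshape(-1, dim)

        # gate scores per token, keep only the top-k experts per token
        logits = self.gate(tokens)                        # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim = -1)
        weights = F.softmax(weights, dim = -1)            # renormalize over the chosen experts

        # dispatch each token to its selected experts and combine weighted outputs
        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for expert_idx, expert in enumerate(self.experts):
                mask = indices[:, slot] == expert_idx
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])

        return out.reshape(batch, seq, dim)

# usage
x = torch.randn(2, 16, 512)
moe = TopKMoE()
y = moe(x)   # (2, 16, 512)
```

The repositories above differ mainly in how the gate is computed (soft dispatch in Soft MoE, Sinkhorn-based assignment in the Sinkhorn router, product-key retrieval in PEER) and in the auxiliary losses and capacity constraints used to keep experts balanced; the loop over experts here is written for clarity rather than speed.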