MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

APACHE-2.0 License

Stars
1.9K


MoE-LLaVA - Release v1.0.0 Latest Release

Published by LinB203 9 months ago

  • Supported higher-resolution input by using google/siglip-so400m-patch14-384 as the vision encoder, for more detailed visual understanding.
  • Changed capacity_factor to 1.5 to support a stronger MoE-LLaVA.
  • Added MME benchmark results and the evaluation pipeline.
  • Improved docs.
  • Fixed typos.
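The capacity_factor change above bounds how many tokens each MoE expert may process per batch. A minimal sketch of that relationship, using hypothetical names (the actual MoE-LLaVA config keys may differ):

```python
import math

# Hypothetical config sketch mirroring the release notes above;
# key names are illustrative, not MoE-LLaVA's actual API.
moe_llava_config = {
    # Higher-resolution vision encoder from this release
    "vision_tower": "google/siglip-so400m-patch14-384",
    # Each expert may receive up to capacity_factor * (tokens / num_experts) tokens
    "capacity_factor": 1.5,
}

def expert_capacity(num_tokens: int, num_experts: int, capacity_factor: float) -> int:
    """Tokens each expert may accept before overflow tokens are dropped."""
    return math.ceil(capacity_factor * num_tokens / num_experts)

print(expert_capacity(1024, 4, moe_llava_config["capacity_factor"]))  # 384
```

Raising capacity_factor from 1.0 to 1.5 lets routers send 50% more tokens to a popular expert before overflow, at the cost of extra compute.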

We hope this draws the community's attention to the fact that large vision-language models can also be sparsified, and can even perform better when they are.
