mixture-of-attention

Personal experiments in routing tokens to different autoregressive attention modules, akin to mixture-of-experts
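To make the one-line description concrete, here is a minimal NumPy sketch of the general idea: a router assigns each token to one of several causal attention "experts". It is an illustration under stated assumptions, not the package's implementation or API. The names `routed_attention`, `causal_attention`, and `router_w`, the top-1 routing rule, and the single-head projections are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(x, wq, wk, wv):
    """Single-head causal self-attention over the rows of x."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    # Mask out future positions (strict upper triangle).
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores, axis=-1) @ v

def routed_attention(x, router_w, experts):
    """Hypothetical top-1 routing: each token goes to one attention expert.

    x:        (n, d) token embeddings in sequence order
    router_w: (d, num_experts) router projection (assumed, for illustration)
    experts:  list of (wq, wk, wv) tuples, each matrix of shape (d, d)
    """
    gates = softmax(x @ router_w, axis=-1)   # (n, num_experts)
    choice = gates.argmax(axis=-1)           # expert index per token
    out = np.zeros_like(x)
    for e, (wq, wk, wv) in enumerate(experts):
        idx = np.nonzero(choice == e)[0]     # token positions, still in order
        if idx.size == 0:
            continue
        # Tokens routed to expert e attend causally among themselves only.
        y = causal_attention(x[idx], wq, wk, wv)
        # Scale by the router probability of the chosen expert.
        out[idx] = gates[idx, e][:, None] * y
    return out, choice

rng = np.random.default_rng(0)
n, d, num_experts = 8, 4, 2
x = rng.normal(size=(n, d))
router_w = rng.normal(size=(d, num_experts))
experts = [tuple(rng.normal(size=(d, d)) for _ in range(3))
           for _ in range(num_experts)]
out, choice = routed_attention(x, router_w, experts)
print(out.shape, choice)
```

Because routing is decided per token and each expert applies a causal mask, changing a later token does not change the output at earlier positions in this sketch. Real mixture-of-experts systems usually add load balancing and other refinements that are omitted here.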

MIT License

Downloads: 1.1K
Stars: 101
Committers: 1

Commit Statistics

                              Past Year    All Time
Total Commits                 0            41
Total Committers              0            1
Avg. Commits Per Committer    0.0          41.0
Bot Commits                   0            0

Issue Statistics

                              Past Year        All Time
Total Pull Requests           0                0
Merged Pull Requests          0                0
Total Issues                  2                2
Time to Close Issues          about 2 hours    about 2 hours
Package Rankings
Top 22.15% on PyPI.org
Related Projects