A toolkit is presented for creating cost-effective MOEs from trained models or adapters, offering architecture guidance and a public repository.
We present a toolkit for creating a low-cost Mixture-of-Domain-Experts (MOE) model from trained models. The toolkit can create the mixture either from full models or from adapters. We perform extensive tests and offer guidance on defining the architecture of the resulting MOE using the toolkit. A public repository is available.
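To make the idea of mixing domain experts concrete, here is a minimal sketch of a generic top-k gated mixture written in plain Python. It is not the toolkit's API, and the abstract does not specify the routing scheme, so top-k softmax gating is an assumption chosen for illustration. The names `moe_forward`, `make_linear_expert`, and `router_weights` are hypothetical, and the "experts" are toy functions standing in for trained models or adapters.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear_expert(scale: float) -> Expert:
    """Hypothetical stand-in for a trained domain expert: scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(
    x: Sequence[float],
    experts: Sequence[Expert],
    router_weights: Sequence[Sequence[float]],
    top_k: int = 2,
) -> Vector:
    """Combine expert outputs with top-k softmax gating (an assumed scheme)."""
    # One router score per expert: dot product of x with that expert's row.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k highest-scoring experts.
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Normalise the chosen scores so the gate values sum to 1.
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output


# Three toy experts and a router with one weight row per expert.
experts = [make_linear_expert(s) for s in (1.0, 2.0, 3.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
mixed = moe_forward([1.0, 1.0], experts, router_weights, top_k=2)
```

In this toy setup the first two experts tie for the highest router score, so each receives half of the gate weight. A real mixture would use a learned router and full model layers in place of the scaling functions.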
Get this paper in your agent:
hf papers read 2408.17280
curl -LsSf https://hf.co/cli/install.sh | bash