Abstract: Large language models (LLMs) based on transformers have made significant strides in recent years, with much of this success driven by scaling up model size. Despite their high ...
[2025/11/24] 🔥 We have integrated our model Uni-MoE-2.0-Omni for evaluation within the Lmms-eval framework; see here.

[2025/11/13] 🔥 We release the second version of Uni-MoE-2.0-Omni. It achieves a ...
The Mixture of Experts (MoE) approach dynamically selects and activates only a subset of experts for each input, significantly reducing computational cost while maintaining high performance. However, MoE introduces ...
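To make the routing idea concrete, below is a minimal, self-contained sketch of top-k expert routing in PyTorch. It is illustrative only, assuming a simple linear router over feed-forward experts; the class and parameter names (`TopKMoE`, `num_experts`, `k`) are hypothetical and do not reflect the Uni-MoE implementation.

```python
# Illustrative top-k MoE routing sketch; NOT the Uni-MoE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Routes each token to its top-k experts and mixes their outputs."""

    def __init__(self, dim: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # Each expert is a small feed-forward block (hypothetical sizing).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        # The router produces one score per expert for every token.
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, dim)
        logits = self.router(x)                    # (tokens, experts)
        scores, idx = logits.topk(self.k, dim=-1)  # keep only top-k experts per token
        weights = F.softmax(scores, dim=-1)        # normalize over the selected experts
        out = torch.zeros_like(x)
        # Only the selected experts run on each token: the source of the compute savings.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e           # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    moe = TopKMoE(dim=64)
    tokens = torch.randn(16, 64)
    print(moe(tokens).shape)  # torch.Size([16, 64])
```

In practice, sparse MoE layers bring additional machinery that this sketch omits, such as load-balancing losses and expert-parallel dispatch, which is where the complications mentioned above come in.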