OLMoE Achieves State-of-the-Art Performance Using Fewer Resources and MoE [Video]

A team of researchers from the Allen Institute for AI, Contextual AI, and the University of Washington has released OLMoE (Open Mixture-of-Experts Language Models), a new open-source LLM that achieves state-of-the-art performance while using significantly fewer computational resources than comparable models.

OLMoE utilizes a Mixture-of-Experts (MoE) architecture, allowing it to have 7 billion total parameters while activating only 1.3 billion for each input. This enables OLMoE to match or exceed the performance of much larger models like Llama2-13B while using far less compute during inference.
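To make the "7B total, 1.3B active" idea concrete, the sketch below shows how a generic top-k MoE feed-forward layer routes each token to a small subset of experts, so most parameters sit idle on any given forward pass. This is a minimal illustration, not OLMoE's actual code; the layer sizes, expert count, and top-k value are assumptions chosen for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Minimal sketch of a top-k Mixture-of-Experts feed-forward layer.

    Illustrative only: d_model, d_hidden, n_experts, and top_k are
    placeholder values, not OLMoE's configuration.
    """

    def __init__(self, d_model=1024, d_hidden=4096, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        # Select the top-k experts per token; only those experts run,
        # so the active parameter count is a fraction of the total.
        weights = F.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize over chosen experts

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += top_w[mask, slot:slot + 1] * expert(x[mask])
        return out
```

In a setup like this, the total parameter count grows with the number of experts, but the compute per token depends only on the k experts actually selected, which is the trade-off the article describes.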

Thanks to the Mixture-of-Experts architecture, along with better data and hyperparameters, OLMoE is much more efficient than OLMo 7B: it uses roughly 4x fewer training FLOPs and activates about 5x fewer parameters per forward pass, making both training and inference cheaper.

Importantly, the researchers have open-sourced not just the model weights, but also the training data, code, and logs. This level of transparency is rare for high-performing language models and will allow …
