We're dedicated to advancing the state of the art in language models across the spectrum, from massive foundation models to efficient specialized systems.
Pushing the boundaries of scale and capability with research on large transformer models that demonstrate advanced reasoning, knowledge, and generation abilities.
Developing highly efficient models that maintain impressive capabilities while running on consumer devices. Our small language models (SLMs) are designed for specialized tasks with minimal resource requirements.
Our team has made significant contributions to the field, publishing groundbreaking research and developing novel technologies.
Developing advanced pre-training techniques that significantly improve model efficiency and knowledge retention
Uncovering new scaling relationships that guide efficient architecture design across model sizes
Creating an optimized training framework that achieves near-linear scaling across thousands of GPUs
Advanced methods for aligning model behavior with human values and preferences
Breakthrough optimizations reducing inference compute by 70% while maintaining performance
Published over 30 peer-reviewed papers and released the weights of 5 models under open licenses
Our models exhibit state-of-the-art performance across a wide range of tasks and applications.
Our models demonstrate exceptional reasoning capabilities, with breakthrough performance on complex problem-solving tasks.
Our research extends beyond text to understand and generate content across multiple modalities.
Our models excel at understanding and generating code across multiple programming languages and paradigms.
Our research connects language models with external knowledge sources for enhanced accuracy and verifiability.
Explore our latest breakthroughs and ongoing research initiatives.
Investigating novel techniques that enable language models to iteratively improve their own capabilities and architectures.
Creating specialized small models that retain the capabilities of larger counterparts with 95% less compute.
Extending language models with robust reasoning capabilities across visual, audio, and textual inputs.