@kala ・ Dec 01, 2025

INTELLECT-3, a 100B+ parameter model, sets new marks on math, code, science, and reasoning benchmarks, with its training components open-sourced to foster research in reinforcement learning.
INTELLECT-3 is a 100B+ parameter Mixture-of-Experts model that achieves state-of-the-art performance on math, code, science, and reasoning benchmarks, outperforming many larger models.
The model is trained with PRIME-RL, an asynchronous reinforcement learning framework whose async-only design scales RL to long-horizon tasks without the bottlenecks that stall synchronous trainers.
The training components, including model weights, datasets, and RL environments, have been open-sourced to promote open research.
INTELLECT-3 was trained on a diverse set of RL environments available on the Environments Hub, which hosts hundreds of tasks across various domains.
Plans include scaling agentic RL, expanding the range of RL environments, and developing long-horizon agents capable of managing their own context.
INTELLECT-3 has just made its debut, and it's already turning heads with its 100B+ parameters. This Mixture-of-Experts model is not just another face in the crowd; it outperforms many of its larger counterparts in math, code, science, and reasoning. The secret sauce? It's trained with PRIME-RL, an asynchronous reinforcement learning framework. The framework's async-only design scales reinforcement learning to long-horizon agentic rollouts without the bottlenecks that hit synchronous trainers when rollout lengths vary.
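PRIME-RL's internals aren't reproduced in this digest, but the async actor/learner pattern it's built around is easy to sketch. In the toy below (plain Python, every name illustrative, not PRIME-RL's API), rollout workers keep generating episodes with whatever policy snapshot they last saw, while the learner consumes finished rollouts as they arrive, so one slow long-horizon episode never stalls training:

```python
# Toy of the async actor/learner pattern behind async-only RL trainers.
# Everything here (policy-as-counter, random rewards) is illustrative.
import queue
import random
import threading
import time

rollout_queue: queue.Queue = queue.Queue(maxsize=8)
policy_version = 0                 # the learner bumps this on every update
policy_lock = threading.Lock()
stop = threading.Event()

def actor() -> None:
    """Generate rollouts continuously with the latest weights snapshot."""
    while not stop.is_set():
        with policy_lock:
            version = policy_version            # snapshot of (possibly stale) weights
        time.sleep(random.uniform(0.01, 0.05))  # stand-in for a long rollout
        reward = random.random()                # stand-in for a verifier score
        try:
            rollout_queue.put((version, reward), timeout=0.1)
        except queue.Full:
            continue                            # learner is behind; drop and re-check stop

def learner(num_updates: int) -> None:
    """Consume rollouts as they arrive; never wait for a full batch of actors."""
    global policy_version
    for step in range(num_updates):
        version, reward = rollout_queue.get()
        with policy_lock:
            staleness = policy_version - version  # off-policy gap to correct for
            policy_version += 1                   # "apply" one gradient update
        print(f"update {step}: reward={reward:.2f} staleness={staleness}")
    stop.set()

actors = [threading.Thread(target=actor) for _ in range(4)]
for t in actors:
    t.start()
learner(num_updates=20)
for t in actors:
    t.join()
```

The `staleness` counter is the price of going async: rollouts can arrive from slightly outdated weights, which real frameworks account for with off-policy corrections.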
Now, let's talk about the training setup for INTELLECT-3. It's quite the operation: PRIME-RL handles both supervised fine-tuning and large-scale RL; Verifiers and the Environments Hub provide a unified interface and ecosystem for RL training environments; Prime Sandboxes delivers high-throughput, secure code execution; and a compute orchestration layer runs 512 NVIDIA H200 GPUs across 64 nodes (eight per node). The whole setup is built for deterministic, synchronized runs, with Ansible handling provisioning, Slurm scheduling jobs, and Lustre providing shared storage.
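The orchestration stack itself (Ansible playbooks, Slurm job files) isn't in this digest, but the role Prime Sandboxes plays is worth making concrete. The sketch below is a minimal stand-in, not the Prime Sandboxes API: it runs model-generated code in a fresh interpreter with a hard timeout and converts the outcome into a scalar reward, which is the contract an RL trainer needs from a code-execution backend.

```python
# Minimal stand-in for a code-execution reward backend. A real sandbox
# (e.g. Prime Sandboxes) adds process isolation, resource limits, and
# high throughput; subprocess-with-timeout only sketches the contract.
import os
import subprocess
import sys
import tempfile

def score_candidate(code: str, tests: str, timeout_s: float = 5.0) -> float:
    """Run candidate code plus its tests in a fresh interpreter; 1.0 iff they pass."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + tests + "\n")
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode, ignore user site-packages
            capture_output=True,
            timeout=timeout_s,
        )
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0                         # hung or looping rollouts score zero
    finally:
        os.unlink(path)

candidate = "def add(a, b):\n    return a + b"
tests = "assert add(2, 3) == 5"
print(score_candidate(candidate, tests))   # -> 1.0
```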
Training INTELLECT-3 was a two-stage process. It kicked off with supervised fine-tuning of the GLM-4.5-Air base model, then transitioned to a large-scale RL stage. The model was trained on a diverse set of RL environments, which significantly improved its reasoning and agentic capabilities. These environments are publicly available on the Environments Hub, offering a wide range of tasks across different domains.
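The digest doesn't spell out the RL objective used in the second stage, so treat the following as an assumption rather than INTELLECT-3's actual recipe: one common shape for verifier-reward RL is to sample a group of rollouts per prompt, score each with a verifier, and normalize rewards within the group into advantages (GRPO-style):

```python
# Assumed example of verifier-reward RL, not confirmed as INTELLECT-3's
# method: rewards for one prompt's rollout group are centered and scaled
# into advantages, so no separate value model is needed.
import statistics

def group_advantages(rewards: list[float]) -> list[float]:
    """Center and scale verifier rewards within one prompt's rollout group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # avoid div-by-zero when all rewards tie
    return [(r - mean) / std for r in rewards]

# Four rollouts of the same prompt, each scored 0/1 by a verifier:
print(group_advantages([1.0, 0.0, 0.0, 1.0]))  # -> [1.0, -1.0, -1.0, 1.0]
```

Group-relative baselines like this avoid training a separate value model, which keeps large-scale RL pipelines simpler.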
But the release of INTELLECT-3 is about more than just showcasing a new model; it's about opening doors. The training components are open-sourced to encourage open research in RL. The model's framework and infrastructure are available on the Prime Intellect platform, allowing others to post-train their own models. Looking ahead, there's a push to scale agentic RL even further, expand the range of RL environments, and develop long-horizon agents that can manage their own context.
Key figures: 100B+ parameters in the INTELLECT-3 Mixture-of-Experts model; 512 NVIDIA H200 GPUs used for training; 64 interconnected nodes in the training infrastructure (eight GPUs per node).
Prime Intellect builds the open superintelligence stack and provides the tools used to train frontier models like INTELLECT-3.
PRIME-RL is an asynchronous reinforcement learning framework used to train the INTELLECT-3 model.
The INTELLECT-3 release introduces a 100B+ parameter Mixture-of-Experts model with state-of-the-art performance across math, code, science, and reasoning benchmarks.
Prime Intellect released INTELLECT-3, a 100B+ parameter Mixture-of-Experts model trained with large-scale reinforcement learning, achieving state-of-the-art results in math, code, science, and reasoning.
Prime Intellect open-sourced the full INTELLECT-3 training recipe, including the PRIME-RL framework, RL environments, datasets, and evaluation suites to support open research in large-scale RL.
The release emphasized PRIME-RL, an async-only reinforcement learning framework enabling efficient large-scale, long-horizon agentic RL training.
INTELLECT-3 was trained on a compute cluster consisting of 512 NVIDIA H200 GPUs across 64 interconnected nodes, supported by orchestration tools such as Slurm, Ansible, Lustre, and DCGM monitoring.
Prime Intellect trained INTELLECT-3 using hundreds of verifier-backed RL environments across math, science, reasoning, coding, deep research, and software engineering (a loading sketch follows this list).
Prime Intellect outlined next steps including scaling agentic RL, broadening environment coverage, and developing long-horizon models capable of autonomous context management.
Prime Intellect reaffirmed its mission to democratize advanced AI development by enabling organizations to train and post-train their own frontier models using the INTELLECT-3 stack.
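Those Hub environments are consumed through the open-source Verifiers library. The snippet below assumes the `verifiers` package's documented `load_environment` entry point and uses a hypothetical environment id; check an environment's Environments Hub page for the exact install name and arguments.

```python
# Sketch of loading a Hub environment with the `verifiers` library.
# The environment id is hypothetical; real install names come from the Hub.
import verifiers as vf

# An environment bundles the dataset, rollout logic, and the rubric that
# scores completions, so any compatible trainer can consume it uniformly.
env = vf.load_environment("gsm8k")  # illustrative id, not a guaranteed one
print(type(env))
```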
