MiniMax M2.7 is the latest flagship open-source model from MiniMax, announced on March 18, 2026. It is MiniMax’s first model to actively participate in its own development cycle — generating evaluation data, identifying its own capability gaps, and producing synthetic training examples to close them. This makes M2.7 one of the earliest production demonstrations of AI-assisted AI training at meaningful scale.
Architecture
M2.7 is a sparse Mixture-of-Experts (MoE) model with 230 billion total parameters, of which only about 10 billion (roughly 4%) are activated per token. A router sends each token to the most relevant subset of its 256 expert sub-networks, which keeps per-token compute close to that of a ~10B dense model while retaining the knowledge capacity of a much larger system.
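The routing idea can be illustrated with a minimal sketch in plain Python. This is not MiniMax's implementation: the value of k (`TOP_K = 8`), the choice to renormalise the router scores with a softmax over only the selected experts, and the toy scalar "experts" are all assumptions made for illustration. Only the count of 256 experts comes from the text above.

```python
import math
import random

NUM_EXPERTS = 256  # from the article
TOP_K = 8          # assumption: the article does not state k


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Assumption: weights are renormalised over the selected experts only.
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy stand-ins: each "expert" just scales its input by a fixed factor.
rng = random.Random(0)
experts = [lambda x, s=rng.uniform(0.5, 1.5): s * x for _ in range(NUM_EXPERTS)]
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_top_k(logits, TOP_K)
output = moe_forward(1.0, experts, logits, TOP_K)
print(f"{len(selected)} of {NUM_EXPERTS} experts ran")
```

The remaining experts are never evaluated for this token, which is where the compute saving relative to a dense model of the same total size comes from.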
Key architectural details:
- Multi-head causal self-attention with Rotary Position Embeddings (RoPE) for positional encoding
- QK RMSNorm (Query-Key Root Mean Square Normalisation) for stable attention at scale
- Top-k expert routing — only the most relevant experts activate per token
- 200,000-token context window — sufficient for large codebases, extended agent sessions, and long documents
- Unquantised BF16 (16-bit) deployment requires approximately 460 GB of GPU VRAM for the weights alone (230 billion parameters × 2 bytes each), before KV cache and activations
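The 460 GB figure can be checked with simple arithmetic on the parameter count. The sketch below covers weights only. The 8-bit and 4-bit rows are hypothetical quantisation levels included for comparison, not configurations described in this article, and real deployments also need memory for the KV cache and activations.

```python
TOTAL_PARAMS = 230e9   # from the article
ACTIVE_PARAMS = 10e9   # from the article


def weight_memory_gb(total_params, bytes_per_param):
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return total_params * bytes_per_param / 1e9


bf16_gb = weight_memory_gb(TOTAL_PARAMS, 2)    # 16-bit weights
int8_gb = weight_memory_gb(TOTAL_PARAMS, 1)    # hypothetical 8-bit quantisation
int4_gb = weight_memory_gb(TOTAL_PARAMS, 0.5)  # hypothetical 4-bit quantisation
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"BF16 weights: {bf16_gb:.0f} GB")
print(f"Active per token: {active_fraction:.1%} of parameters")
```

Note that all 230 billion parameters must be resident in memory even though only about 10 billion are used per token, because any expert may be selected by the router.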
Self-Evolution
The defining feature of M2.7 is its recursive self-optimisation framework: a training pipeline where the model generates its own evaluation data, identifies capability gaps, and produces synthetic training examples to address them.
In practice, an M2.7 instance autonomously runs a complete iterative improvement loop:
- Analyse failure trajectories from previous evaluations
- Plan modifications to the training scaffold or data strategy
- Modify scaffold code
- Run evaluations
- Compare results against baseline
- Decide whether to keep or revert changes
MiniMax ran this loop for 100+ rounds without human intervention, achieving a 30% improvement on internal benchmarks. In broader production workflows, M2.7 handles 30–50% of its own training pipeline — including aspects of data curation, evaluation, and iteration.
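The keep-or-revert structure of the loop described above can be sketched as follows. This is a toy illustration, not MiniMax's pipeline: `run_improvement_loop`, `toy_propose`, and `toy_evaluate` are hypothetical names, the "scaffold" is reduced to a list of numeric changes, and the proposal step is random rather than driven by analysis of failure trajectories.

```python
import random


def run_improvement_loop(baseline_score, propose_change, evaluate, rounds):
    """Run a keep-or-revert loop and return (scaffold, best_score, history)."""
    scaffold = []          # changes accepted so far
    best = baseline_score  # score to beat
    history = []           # record of every round, kept or not
    for round_index in range(rounds):
        # Analyse past rounds and plan a modification.
        change = propose_change(scaffold, history)
        # Modify the scaffold (on a copy, so reverting is free).
        candidate = scaffold + [change]
        # Run evaluations on the modified scaffold.
        score = evaluate(candidate)
        # Compare against the current baseline and keep or revert.
        kept = score > best
        if kept:
            scaffold, best = candidate, score
        history.append((round_index, change, score, kept))
    return scaffold, best, history


# Toy stand-ins for the model-driven steps (assumptions for illustration).
rng = random.Random(0)


def toy_propose(scaffold, history):
    return rng.uniform(-1.0, 1.0)


def toy_evaluate(scaffold):
    return 50.0 + sum(scaffold)


scaffold, best, history = run_improvement_loop(
    50.0, toy_propose, toy_evaluate, rounds=100)
kept_rounds = sum(1 for _, _, _, kept in history if kept)
print(f"kept {kept_rounds} of {len(history)} proposed changes")
```

Because a change is only accepted when it beats the current best score, the baseline can never regress, which is the property that makes running many unattended rounds safe in principle.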
Performance
- SWE-Pro (real-world software engineering tasks): 56.22%
- Terminal Bench 2: 57.0%
- Artificial Analysis Intelligence Index: Ranked #1 out of 136 models (score: 50)
- GDPval-AA (agentic knowledge-work tasks): Elo 1,495, the highest among open-weight models
- MLE Bench Lite (22 Kaggle ML competitions): 66.6% medal rate — second only to Claude Opus 4.6 and GPT-5.4
- Matches GPT-5 on several multi-language engineering benchmarks
API pricing: $0.30 per million input tokens — roughly 1/50th the cost of Claude Opus.
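To make the input price concrete, the sketch below computes the cost of a prompt that fills the 200,000-token context window. Only the $0.30 input rate and the 1/50 ratio come from the text. Output-token pricing is not given here, so it is left out, and the comparison price is simply what the stated ratio implies, not a quoted list price.

```python
INPUT_PRICE_PER_MILLION_USD = 0.30  # from the article
CONTEXT_WINDOW_TOKENS = 200_000     # from the article
STATED_PRICE_RATIO = 50             # "roughly 1/50th" in the article


def input_cost_usd(tokens, price_per_million=INPUT_PRICE_PER_MILLION_USD):
    """Cost in US dollars of sending `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_million


full_context_cost = input_cost_usd(CONTEXT_WINDOW_TOKENS)
# Price per million input tokens implied by the article's ratio (derived, not quoted).
implied_comparison_price = INPUT_PRICE_PER_MILLION_USD * STATED_PRICE_RATIO

print(f"Full 200k-token prompt: ${full_context_cost:.2f}")
print(f"Implied comparison price: ${implied_comparison_price:.2f} per million tokens")
```

An agent session that re-sends a long context on every turn pays this input cost on each turn, so per-session cost depends on the number of turns as well as the context length.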
Agentic Capabilities
M2.7 is optimised for complex multi-step agent harnesses. It maintains a 97% skill adherence rate across 40+ complex skills (each exceeding 2,000 tokens), making it reliable in automated pipelines that require following long, structured instruction sets across multiple rounds.
Practical strengths include:
- End-to-end software engineering (project delivery, log-based debugging, code security)
- ML research workflows
- Complex multi-round document editing (Excel, PowerPoint, Word)
- Multi-agent coordination with dynamic tool search
Significance
M2.7 sits at the intersection of two frontier trends: MoE architectures for compute-efficient scaling, and models that reduce their own training cost by automating evaluation and data generation. Its open-source release under a permissive licence, combined with input pricing of $0.30 per million tokens, makes it one of the most cost-effective options for production agentic workloads as of mid-2026.