Audience
Developers, teams, researchers, general users
About Open R1
Open R1 is a community-driven, open-source initiative aimed at replicating the advanced AI capabilities of DeepSeek-R1 through transparent methodologies. You can try Open R1 AI model or DeepSeek R1 free online chat on Open R1.
The project offers a comprehensive implementation of DeepSeek-R1's reasoning-optimized training pipeline, including tools for GRPO training, SFT fine-tuning, and synthetic data generation, all under the MIT license. While the original training data remains proprietary, Open R1 provides the complete toolchain for users to develop and fine-tune their own models.
Other Popular Alternatives & Related Software
Phi-4-mini-reasoning
Phi-4-mini-reasoning is a 3.8-billion parameter transformer-based language model optimized for mathematical reasoning and step-by-step problem solving in environments with constrained computing or latency. Fine-tuned with synthetic data generated by the DeepSeek-R1 model, it balances efficiency with advanced reasoning ability. Trained on over one million diverse math problems spanning multiple levels of difficulty from middle school to Ph.D. level, Phi-4-mini-reasoning outperforms its base model on long sentence generation across various evaluations and surpasses larger models like OpenThinker-7B, Llama-3.2-3B-instruct, and DeepSeek-R1. It features a 128K-token context window and supports function calling, enabling integration with external tools and APIs. Phi-4-mini-reasoning can be quantized using Microsoft Olive or Apple MLX Framework for deployment on edge devices such as IoT, laptops, and mobile devices.
Learn more
DeepCoder
DeepCoder is a fully open source code-reasoning and generation model released by Agentica Project in collaboration with Together AI. It is fine-tuned from DeepSeek-R1-Distilled-Qwen-14B using distributed reinforcement learning, achieving a 60.6% accuracy on LiveCodeBench (representing an 8% improvement over the base), a performance level that matches that of proprietary models such as o3-mini (2025-01-031 Low) and o1 while using only 14 billion parameters. It was trained over 2.5 weeks on 32 H100 GPUs with a curated dataset of roughly 24,000 coding problems drawn from verified sources (including TACO-Verified, PrimeIntellect SYNTHETIC-1, and LiveCodeBench submissions), each problem requiring a verifiable solution and at least five unit tests to ensure reliability for RL training. To handle long-range context, DeepCoder employs techniques such as iterative context lengthening and overlong filtering.
Learn more
DeepScaleR
DeepScaleR is a 1.5-billion-parameter language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B using distributed reinforcement learning and a novel iterative context-lengthening strategy that gradually increases its context window from 8K to 24K tokens during training. It was trained on ~40,000 carefully curated mathematical problems drawn from competition-level datasets like AIME (1984–2023), AMC (pre-2023), Omni-MATH, and STILL. DeepScaleR achieves 43.1% accuracy on AIME 2024, a roughly 14.3 percentage point boost over the base model, and surpasses the performance of the proprietary O1-Preview model despite its much smaller size. It also posts strong results on a suite of math benchmarks (e.g., MATH-500, AMC 2023, Minerva Math, OlympiadBench), demonstrating that small, efficient models tuned with RL can match or exceed larger baselines on reasoning tasks.
Learn more
Phi-4-reasoning
Phi-4-reasoning is a 14-billion parameter transformer-based language model optimized for complex reasoning tasks, including math, coding, algorithmic problem solving, and planning. Trained via supervised fine-tuning of Phi-4 on carefully curated "teachable" prompts and reasoning demonstrations generated using o3-mini, it generates detailed reasoning chains that effectively leverage inference-time compute. Phi-4-reasoning incorporates outcome-based reinforcement learning to produce longer reasoning traces. It outperforms significantly larger open-weight models such as DeepSeek-R1-Distill-Llama-70B and approaches the performance levels of the full DeepSeek-R1 model across a wide range of reasoning tasks. Phi-4-reasoning is designed for environments with constrained computing or latency. Fine-tuned with synthetic data generated by DeepSeek-R1, it provides high-quality, step-by-step problem solving.
Learn more
Pricing
Starting Price:
Free
Pricing Details:
Open source
Free Version:
Free Version available.
Integrations
No integrations listed.
Company Information
Open R1
Founded: 2025
United Kingdom
open-r1.com
Videos and Screen Captures
Other Useful Business Software
QA Wolf | We Write, Run and Maintain Tests
QA Wolf is an AI-native service that delivers 80% automated E2E test coverage for web & mobile apps in weeks not years.
Product Details
Platforms Supported
Cloud
Training
Documentation
Support
Online