Multimodal model achieving SOTA performance
Browser action engine for AI agents. 10× faster, resilient by design
From Images to High-Fidelity 3D Assets
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
An open-source text-to-speech system built by inverting Whisper
MiniMax-M2, a model built for Max coding & agentic workflows
Rust framework for building modular and scalable LLM-powered apps
Run a 1-billion parameter LLM on a $10 board with 256MB RAM
Real-time voice interactive digital human
Build a free commercial AI dialogue environment in 10 minutes
Editing large language models within 10 seconds
An MNIST-like fashion product database
Recognizing biological data from a notebook
DE-based Weight Optimisation for Heterogeneous Ensemble
Simple algorithm for a real-time interactive visual cortex for painting
High-performance MoE model with MLA, MTP, and multilingual reasoning
VaultGemma: 1B DP-trained Gemma variant for private NLP tasks
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens