Benchmarking Multimodal Agents for Open-Ended Tasks
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Plug-and-play library to enable agents to call MCP and UTCP tools
Deep learning library
A security scanner for custom LLM applications
An AI-powered security review GitHub Action using Claude
Data Lake for Deep Learning. Build, manage, and query datasets
A game-theoretic approach to explaining the output of ML models
Machine Learning Engineering Open Book
Lightweight framework for evaluating large language model performance
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
UI-TARS-desktop version that can operate on your local personal device
State-of-the-art (SoTA) text-to-video pre-trained model
Terminal-based LLM chat tool with multi-model and local support
Block Diffusion for Ultra-Fast Speculative Decoding
Self-healing browser harness that enables LLMs to complete any task
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
When LLM Meets Domain Experts
Jupyter notebook tutorials for OpenVINO
Library for OCR-related tasks powered by Deep Learning
20+ high-performance LLMs with recipes to pretrain and finetune at scale
The fastest way to bring multi-agent workflows to production
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model
Supercharge Your LLM Application Evaluations
Chat & pretrained large audio language model proposed by Alibaba Cloud