Uncommon Objects in 3D dataset
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Constrained Value Alignment via Safe Reinforcement Learning
Ensure consistency and alignment between different codebases
Qwen3 is the large language model series developed by the Qwen team
Automatic Speech Recognition with Word-level Timestamps
High-Performance Face Recognition Library on PaddlePaddle & PyTorch
Open-source AI model for generating full songs from lyrics prompts
A dataset consisting of 15,140 ChatGPT prompts from Reddit
Get 10X more out of Claude Code, Codex or any coding agent
Video translation and dubbing tool powered by LLMs
One-stop AI digital human system with video voice synthesis tools
A tool to snap pixels to a perfect grid
Multimodal-Driven Architecture for Customized Video Generation
Recipes to train reward models for RLHF
The Triton Inference Server provides an optimized cloud
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Analyze computation-communication overlap in V3/R1
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Course to get into Large Language Models (LLMs)
Pluggable SOTA multi-object tracking modules for segmentation
Synchronized Translation for Videos
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
SOTA Open Source TTS
HivisionIDPhotos: a lightweight and efficient AI ID photo tool