Uncommon Objects in 3D dataset
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Constrained Value Alignment via Safe Reinforcement Learning
Qwen3 is the large language model series developed by the Qwen team
Automatic Speech Recognition with Word-level Timestamps
High-Performance Face Recognition Library on PaddlePaddle & PyTorch
Open source AI model for generating full songs from lyrics prompts
A dataset consisting of 15,140 ChatGPT prompts from Reddit
Multimodal-Driven Architecture for Customized Video Generation
The Triton Inference Server provides an optimized cloud
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Recipes to train reward model for RLHF
Pluggable SOTA multi-object tracking modules for segmentation
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Synchronized Translation for Videos
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
SOTA Open Source TTS
HivisionIDPhotos: a lightweight and efficient AI ID photo tool
Pretrained (Language) Models for Probabilistic Time Series Forecasting
Semi-Structured Agentic Framework. Workflows build themselves
A Survey of Large Language Models
A Unified Framework for Image Customization
Unsupervised Learning for Image Registration
A trainable PyTorch reproduction of AlphaFold 3
Open-source model for program synthesis