Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Pokee Deep Research Model Open Source Repo
Unified Multimodal Understanding and Generation Models
DeepMind model for tracking arbitrary points across videos & robotics
Global weather forecasting model using graph neural networks and JAX
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
GPT-4V-level open-source multimodal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Renderer for the harmony response format to be used with gpt-oss
The ChatGPT Retrieval Plugin lets you easily find personal documents
Designed for text embedding and ranking tasks
Implementation of the Surya Foundation Model for Heliophysics
Pushing the Limits of Mathematical Reasoning in Open Language Models
A SOTA open-source image editing model
Diversity-driven optimization and large-model reasoning ability
Chinese and English multimodal conversational language model
Repo for the Qwen2-Audio chat and pretrained large audio language models
Tongyi Deep Research, the Leading Open-source Deep Research Agent
A trainable PyTorch reproduction of AlphaFold 3