An alignment auditing agent capable of exploring alignment hypotheses
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
GPT4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Evals is a framework for evaluating LLMs and LLM systems
The ChatGPT Retrieval Plugin lets you easily find personal documents
Designed for text embedding and ranking tasks
Implementation of the Surya Foundation Model for Heliophysics
A modular high-level library to train embodied AI agents
Revolutionizes the way users interact with Autogen
LLM training code for MosaicML foundation models
State-of-the-art diffusion models for image and audio generation
Implementation of Video Diffusion Models
Data Lake for Deep Learning. Build, manage, and query datasets
Synthetic Data Generation for tabular, relational and time series data
Simplest working implementation of StyleGAN2
An MLOps framework to package, deploy, monitor and manage models
Proofs, cases, concept supplements, and reference explanations