Veo 3.1
Veo 3.1 builds on the capabilities of the previous model to enable longer and more versatile AI-generated videos. With this version, users can create multi-shot clips guided by multiple prompts, generate sequences from three reference images, and use frames in video workflows that transition between a start and end image, both with native, synchronized audio. The scene extension feature allows extension of a final second of a clip by up to a full minute of newly generated visuals and sound. Veo 3.1 supports editing of lighting and shadow parameters to improve realism and scene consistency, and offers advanced object removal that reconstructs backgrounds to remove unwanted items from generated footage. These enhancements make Veo 3.1 sharper in prompt-adherence, more cinematic in presentation, and broader in scale compared to shorter-clip models. Developers can access Veo 3.1 via the Gemini API or through the tool Flow, targeting professional video workflows.
Learn more
Veo 3.1 Lite
Veo 3.1 Lite is a cost-effective video generation model developed by Google DeepMind for developers building AI-powered applications. It enables users to create videos from text or images using advanced generative AI capabilities. The model supports multiple formats, including landscape and portrait orientations, as well as HD resolutions like 720p and 1080p. Designed for efficiency, it delivers high-speed performance at a lower cost compared to other models in the Veo family. Developers can customize video duration, allowing flexibility in content creation. Veo 3.1 Lite is accessible through the Gemini API and Google AI Studio. Overall, it makes scalable video generation more affordable and accessible for developers.
Learn more
Seedance
Seedance 1.0 API is officially live, giving creators and developers direct access to the world’s most advanced generative video model. Ranked #1 globally on the Artificial Analysis benchmark, Seedance delivers unmatched performance in both text-to-video and image-to-video generation. It supports multi-shot storytelling, allowing characters, styles, and scenes to remain consistent across transitions. Users can expect smooth motion, precise prompt adherence, and diverse stylistic rendering across photorealistic, cinematic, and creative outputs. The API provides a generous free trial with 2 million tokens and affordable pay-as-you-go pricing from just $1.8 per million tokens. With scalability and high concurrency support, Seedance enables studios, marketers, and enterprises to generate 5–10 second cinematic-quality videos in seconds.
Learn more
Wan2.5
Wan2.5-Preview introduces a next-generation multimodal architecture designed to redefine visual generation across text, images, audio, and video. Its unified framework enables seamless multimodal inputs and outputs, powering deeper alignment through joint training across all media types. With advanced RLHF tuning, the model delivers superior video realism, expressive motion dynamics, and improved adherence to human preferences. Wan2.5 also excels in synchronized audio-video generation, supporting multi-voice output, sound effects, and cinematic-grade visuals. On the image side, it offers exceptional instruction following, creative design capabilities, and pixel-accurate editing for complex transformations. Together, these features make Wan2.5-Preview a breakthrough platform for high-fidelity content creation and multimodal storytelling.
Learn more