AI Models & APIs

Browse and run state-of-the-art AI models via REST API for image generation, video generation, upscaling, and more.

Text to Image API

11 models

Z-Image Turbo

6 billion parameter text-to-image model that generates photorealistic images in sub-second time

$0.011-3s

Seedream V4.5

ByteDance's next-gen text-to-image model optimized for typography - crisper text rendering, stronger prompt adherence, and up to 4K output

$0.0510-30s

Nano Banana Pro

Google's Gemini 3.0 Pro Image - cutting-edge text-to-image model enabling high-res 4K image generation

$0.1610-30s

Nano Banana

Google's cutting-edge text-to-image model that generates images from natural language prompts

$0.055-15s

FLUX.1 Dev

12-billion-parameter rectified-flow transformer for text-to-image generation with inpainting support

$0.025-15s

Grok Imagine

X-AI's Grok Imagine model for precise text-to-image generation with AI-powered quality

$0.035-15s

FLUX.2 Turbo Text-to-Image

FLUX 2 turbo from Black Forest Labs is the speed-optimized text-to-image model for real-time workflows. Generate photoreal images and clean typography with strong prompt adherence and consistent style—ideal for ads, posters, social posts, and rapid iteration. Built for low-latency, high-throughput use.

$0.02Low-latency

ByteDance Seedream V3.1

Seedream V3.1 by ByteDance is a text-to-image model with upgraded visuals, stronger style fidelity, and rich detail from text prompts. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.035-15s

Alibaba WAN 2.6 Text-to-Image

Generates high-quality images from natural-language prompts with strong prompt adherence and clean composition. It supports multiple aspect ratios and size control, seed-based reproducibility, and flexible styles (photorealistic to illustrative) for ads, product shots, and social visuals. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

$0.045-15s

OpenAI GPT Image 1

OpenAI GPT Image-1 generates images from text prompts, ideal for creating visual assets. It combines the reasoning power of GPT-4-Turbo with DALL·E-class visual synthesis, allowing for high-quality, creative, and context-aware image generation across various styles and purposes.

$0.075-15s

OpenAI DALL-E 3 Text-To-Image Generation API

OpenAI DALL·E 3 is OpenAI's most advanced text-to-image system, capable of generating highly detailed, realistic, and creative visuals directly from natural language descriptions. It builds upon OpenAI's extensive world knowledge and artistic training to create images that are accurate, expressive, and aligned with your intent.

$0.095-15s

Image to Image API

13 models

Z-Image Turbo I2I

6 billion parameter model that enhances image quality and applies style transformations in sub-second time

$0.011-3s

Qwen Image Edit 2511

Advanced image editor with strong edit consistency, multi-person identity preservation, built-in LoRA styles, and geometric reasoning

$0.045-15s

GPT Image 1.5 Edit

OpenAI's image editor for precise natural-language edits - add/remove objects, swap backgrounds, retouch, adjust colors, and edit text

$0.035-20s

Grok Imagine Edit

X-AI's Grok Imagine model for precise image editing - transform and modify images using text prompts

$0.035-15s

Nano Banana Edit

Google's image editing model for precise inpainting, outpainting, background replacement, and stylized transformations

$0.055-15s

Nano Banana Pro Edit

Google Gemini 3.0 Pro image editor with 4K-capable output for high-resolution image editing and manipulation

$0.1610-30s

FLUX.2 Klein 9B Edit

FLUX.2 Klein 9B Edit is a high-quality image editing model with 9B parameters, offering precise modifications using natural language instructions. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.0260-120s

Seedream V4 Edit

Seedream V4 Edit is ByteDance's state-of-the-art image editing model that outperforms Nano Banana in fidelity and edit quality. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.03Processing time not specified

Alibaba WAN 2.6 Image-Edit

Alibaba WAN 2.6 Image-Edit turns prompts into precise photo edits—adjusting color and lighting, restyling aesthetics, replacing backgrounds, removing objects, and refining details while preserving subject identity. Built for stable, repeatable image-to-image pipelines. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

$0.045-15s

ByteDance Seedream 4.5 Edit

Seedream 4.5 Edit preserves facial features, lighting, and color tone from reference images, delivering professional, high-fidelity edits up to 4K with strong prompt adherence. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.055-15s

FLUX.2 [pro] Edit

FLUX.2 [pro] Edit delivers production-grade image-to-image editing from Black Forest Labs—apply natural-language instructions and exact hex color control for consistent, studio-quality results. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.07Variable, depending on task

ByteDance Seedream 4.5 Edit Sequential

Seedream 4.5 Edit Sequential performs multi-image editing while locking character and object identity across shots. It detects main subjects, preserves continuity, and applies controlled edits with up to 4K output. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.095-15s

Google Nano Banana Pro Edit Ultra

Nano Banana Pro Edit (Gemini 3.0 Pro Image) is Google’s advanced AI-powered image editing and generation model, designed to make visual transformation as intuitive as describing it in words. Built on Google’s cutting-edge computer vision and generative research, it combines precision, flexibility, and semantic awareness for professional-grade editing.

$0.175-15s

Text to Video API

11 models

MiniMax Video 01

High-compression text-to-video with cinematic quality, smooth motion, and prompt-faithful generation

$0.5660-120s

Kling 2.6 Pro

Top-tier text-to-video with smooth motion, cinematic visuals, strong prompt adherence, and optional native audio

$0.3960-180s

Grok Imagine Video

X-AI's Grok Imagine Video generates high-quality videos from text with customizable duration, aspect ratio, and resolution

$0.3130-120s

WAN 2.6

Alibaba's text-to-video model with coherent cinematic clips, stable motion, and strong instruction-following for ads and explainers

$0.5660-180s

OpenAI Sora 2

State-of-the-art text-to-video with realistic visuals, accurate physics, synchronized audio, and strong steerability

$0.4460-180s

Google Veo3

Google Veo3 is Google's flagship text-to-video model with built-in audio, producing synchronized video and sound from text prompts. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$1.3260-120s

ByteDance Seedance V1 Lite

Seedance V1 Lite produces coherent multi-shot 720p videos with smooth motion and accurate following of detailed text prompts. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.185-15s

Seedance 1.5 Pro Text to Video

Seedance 1.5 Pro generates cinematic, live-action–leaning clips from text with strong prompt adherence, expressive motion, and stable aesthetics. It supports 4–12s duration control, multiple aspect ratios, and reproducible generation via seeds.

$0.2960-120s

WAN 2.1 T2V 720P

WAN 2.1 T2V 720P offers text-to-video 720p generation from prompts, enabling unlimited AI video creation for social and marketing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.335-15s

Vidu Q3 Text-To-Video

Vidu Q3 Text-to-Video turns text prompts into high-quality videos with exceptional visual fidelity and diverse motion. It features multiple styles, resolutions up to 1080p, flexible duration, audio generation, and motion control.

$0.425-15s

Alibaba WAN 2.5 Text-to-Video Model

Alibaba WAN 2.5 makes 480p-1080p text/image-to-video with synced audio and is faster, more affordable than Google Veo3.

$0.565-15s

Image to Video API

15 models

WAN 2.2

Turns a single image into smooth, cinematic motion - ideal for storyboards, mood shots, and product demos

$0.1760-120s

Grok Imagine Video

X-AI's model that transforms images into videos with natural motion, scene continuity, and synchronized audio

$0.3130-120s

Kling 2.6 Pro

Top-tier image-to-video with smooth motion, cinematic visuals, accurate prompt adherence, and optional native audio

$0.3960-180s

Google Veo 3.1 Fast

Google's fast Image-to-Video model with native 1080p output and audio generation

$0.8860-180s

OpenAI Sora 2 Pro

Creates physics-aware, realistic videos from images with synchronized audio and greater steerability

$1.3260-180s

Google Veo 3.1

Google's premium Image-to-Video model with native 1080P output for highest quality videos with audio

$0.88120-300s

RunwayML Gen4 Turbo

RunwayML Gen4 Turbo is an image-to-video model that generates high-quality videos from images. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.565-15s

InfiniteTalk

InfiniteTalk converts one photo + audio into audio-driven talking or singing avatar videos (Image-to-Video), up to 10 minutes, 720p tier.

$0.1710-30s per second of video

LTX-2 19B

LTX-2 19B is the first DiT-based audio-video foundation model with synchronized audio and video, high fidelity, multiple performance modes, and production-ready outputs in one model. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.095-20s

HunyuanVideo-1.5

HunyuanVideo-1.5 (i2v) is a lightweight 8.3B parameter image-to-video model that generates high-quality videos from images with top-tier visual quality and motion coherence. Optimized for fast inference on consumer-grade GPUs. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.445-10s

Vidu Image-to-Video

Vidu Image-to-Video converts images into smooth-transition videos with high visual quality and diverse motion for cinematic results. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.225-15s

Seedance 1.5 Pro Image to Video

Seedance 1.5 Pro Image-to-Video generates cinematic, live-action–leaning clips from a text prompt plus a first-frame image, preserving the image's subject and composition while adding expressive motion and stable aesthetics. It supports 4–12s duration control, adaptive aspect ratio, and reproducible outputs via seeds, ideal for ad creatives and short-drama shots that need a strong visual anchor.

$0.2960-120s

Vidu Q3 Image-to-Video

Vidu Q3 Image-to-Video turns text prompts into high-quality videos with exceptional visual fidelity and diverse motion. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.4260-120s

MiniMax Hailuo 2.3 Pro

An image-to-video model for ultra-clear 1080P output and physics-aware scenes with responsive rendering. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.545 seconds

Google Veo3

Google Veo 3 is Google's flagship image-to-video model that creates audio-enabled videos from images. It transforms still images into cinematic 1080p videos with smooth, realistic motion, consistent lighting, and synchronized native audio.

$1.3260-120s

Motion Control API

4 models

Kling 2.6 Standard Motion Control

Transfer motion from reference videos to animate still images with smooth, realistic results at an affordable price

$0.2460-120s

Kling 2.6 Pro Motion Control

Premium motion transfer with superior identity preservation, temporal consistency, and native-audio option for dance, action, and gesture animations

$0.3760-180s

Wan 2.2 Fun Control

Wan2.2-Fun-Control is an advanced video generation and control model developed by the Alibaba PAI team, designed for precise and creative video synthesis. It uses Control Codes and multi-modal inputs to generate preset-controlled videos up to 120s at 720p, offering features like multi-modal control, high-quality video generation, and intelligent composition.

$0.2260-120s

LTX-2 19B ControlNet

LTX-2 19B ControlNet generates synchronized audio-video (up to 20s) from video input with pose, depth, or canny edge guidance. Supports audio preservation, generation, or removal for flexible video transformation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

$0.2260-120s

Video to Video API

2 models

RunwayML Gen4 Aleph

RunwayML Gen4 Aleph is a Video-to-Video model for editing, transforming, and generating video at $0.18 per second. It utilizes natural language instructions to edit and modify footage—removing objects, changing environments, and applying styles. The API supports context-aware transformations and flexible aspect ratios, ensuring high-quality outputs.

$1.005-15s

InfiniteTalk Fast Video-To-Video

Audio-driven infinitetalk-fast turns one video plus audio into realistic talking or singing videos with lip-sync. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.1710–30s per 1 second of video

Upscaler API

7 models

Real-ESRGAN

Real-ESRGAN delivers high-quality image super-resolution with optional face correction and adjustable upscale factors. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.015-15s

Bria Increase Resolution

Bria Increase Resolution upscales images with a method that preserves original content without regeneration, producing sharper, higher-quality output. Ready-to-use REST API, best performance, no coldstarts, affordable pricing.

$0.055-15s

FlashVSR Video Upscaler

FlashVSR is a fast, high-quality video upscaler that boosts resolution and restores clarity for low-resolution or blurry footage. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.073–20 seconds per second of video

WaveSpeed AI Image Upscaler

AI Image Upscaler that enhances image resolution to 4K or 8K while improving detail and clarity for photos and graphics. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.02Fast & Efficient

Video Upscaler

AI Video Upscaler enhances resolution and clarity to fix blurry outputs and improve low-resolution footage with ML upscaling. It features a ready-to-use REST inference API, optimized performance, and affordable pricing.

$0.035–10 seconds per second of video

Ultimate Image Upscaler

The Ultimate Image Upscaler is a high-performance enhancement model that intelligently increases image resolution while preserving details, sharpness, and natural texture. It uses advanced deep learning upscaling to restore fine features and eliminate blur or compression artifacts.

$0.075-15s

AI Video Upscaler Pro

AI Video Upscaler Pro converts low-resolution videos into crisp 4K footage with seamless motion dynamics and frame consistency. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

$0.1110-30s per second of video

AI Models & APIs

Welcome to VibeDream

Build with API

Popular Models

Text to Image API

Z-Image Turbo

Seedream V4.5

Nano Banana Pro

Nano Banana

FLUX.1 Dev

Grok Imagine

FLUX.2 Turbo Text-to-Image

ByteDance Seedream V3.1

Alibaba WAN 2.6 Text-to-Image

OpenAI GPT Image 1

OpenAI DALL-E 3 Text-To-Image Generation API

Image to Image API

Z-Image Turbo I2I

Qwen Image Edit 2511

GPT Image 1.5 Edit

Grok Imagine Edit

Nano Banana Edit

Nano Banana Pro Edit

FLUX.2 Klein 9B Edit

Seedream V4 Edit

Alibaba WAN 2.6 Image-Edit

ByteDance Seedream 4.5 Edit

FLUX.2 [pro] Edit

ByteDance Seedream 4.5 Edit Sequential

Google Nano Banana Pro Edit Ultra

Text to Video API

MiniMax Video 01

Kling 2.6 Pro

Grok Imagine Video

WAN 2.6

OpenAI Sora 2

Google Veo3

ByteDance Seedance V1 Lite

Seedance 1.5 Pro Text to Video

WAN 2.1 T2V 720P

Vidu Q3 Text-To-Video

Alibaba WAN 2.5 Text-to-Video Model

Image to Video API

WAN 2.2

Grok Imagine Video

Kling 2.6 Pro

Google Veo 3.1 Fast

OpenAI Sora 2 Pro

Google Veo 3.1

RunwayML Gen4 Turbo

InfiniteTalk

LTX-2 19B

HunyuanVideo-1.5

Vidu Image-to-Video

Seedance 1.5 Pro Image to Video

Vidu Q3 Image-to-Video

MiniMax Hailuo 2.3 Pro

Google Veo3

Motion Control API

Kling 2.6 Standard Motion Control

Kling 2.6 Pro Motion Control

Wan 2.2 Fun Control

LTX-2 19B ControlNet

Video to Video API

RunwayML Gen4 Aleph

InfiniteTalk Fast Video-To-Video

Upscaler API

Real-ESRGAN

Bria Increase Resolution

FlashVSR Video Upscaler

WaveSpeed AI Image Upscaler

Video Upscaler

Ultimate Image Upscaler

AI Video Upscaler Pro