fal gives you two ways to work with AI models. If you want to generate images, video, audio, or other media, the Model APIs let you call 1,000+ production-ready models with a single API call. If you have your own model to deploy, Serverless gives you the full lifecycle: develop, test, deploy, and scale on the same infrastructure that powers the marketplace. Both paths start with an API key and take a few minutes. The “consume” path is for calling existing models through fal’s client libraries or HTTP. The “deploy” path is for teams bringing their own models to run on fal’s GPU infrastructure, using the same fal.App framework that powers every model on the platform.

Documentation Index
Fetch the complete documentation index at: https://fal.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
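The index can be fetched programmatically before exploring further. A minimal sketch using only the Python standard library; the URL comes from the line above, and the function name is our own:

```python
import urllib.request

INDEX_URL = "https://fal.ai/docs/llms.txt"

def fetch_doc_index(url: str = INDEX_URL, timeout: float = 10.0) -> str:
    """Download the documentation index as plain text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Each line of the returned text points at a documentation page you can fetch the same way.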
- I want to consume a model
- I want to deploy a model
What do you want to build?
Generate Images
Create images from text prompts with FLUX, Nano Banana 2, and more
Generate Videos
Transform images into videos with Kling 3.0, Sora 2, and other models
Transcribe Audio
Convert speech to text with Whisper
Use LLMs
Build with Llama, Mistral, and other large language models
Fast FLUX
Ultra-fast image generation with optimized FLUX
Build a Workflow UI
Create interfaces for complex AI workflows
Next.js Integration
Build full-stack AI apps with Next.js and fal
Vercel Integration
Deploy AI-powered apps on Vercel with fal
n8n Integration
Automate workflows by connecting fal models to n8n
Quick example
Generate your first image in under a minute.

Set your API key
Get a key from the fal dashboard and set it as an environment variable:
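A minimal sketch, assuming the `fal-client` Python package and the `FAL_KEY` environment variable that fal's clients read; the model id `fal-ai/flux/dev` and the `prompt` argument are illustrative, not prescriptive:

```python
import os

# Illustrative request payload; parameter names vary per model.
arguments = {"prompt": "a cat astronaut floating above Earth, photorealistic"}

# Only call the API once a key is configured, e.g. `export FAL_KEY=...`.
if os.environ.get("FAL_KEY"):
    import fal_client  # pip install fal-client

    # subscribe() queues the request and blocks until the result is ready.
    result = fal_client.subscribe(
        "fal-ai/flux/dev",  # assumed model id; any Model API endpoint works here
        arguments=arguments,
    )
    print(result["images"][0]["url"])
```

The same call shape works for video, audio, and other media endpoints; only the model id and the `arguments` keys change.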
Next Steps
Get Your API Key
Create an API key to authenticate your requests
AI Tools
Use AI coding assistants to build with fal faster
Explore Models
Browse 1,000+ available models