Realtime endpoints use WebSockets for bidirectional communication over a persistent connection. Once a client connects, it can send multiple inputs and receive results without the overhead of establishing a new connection for each request. This makes them ideal for interactive applications like real-time image editing, live camera filters, or game-like experiences where latency between requests needs to be minimal.
You define a realtime endpoint using the @fal.realtime("/realtime") decorator, which uses fal’s binary msgpack protocol for efficient serialization. Callers connect using the realtime() method in the fal client SDKs.
For one-way progressive output from a single request (like showing diffusion steps), use Streaming Endpoints instead. They use SSE and are simpler when you don’t need bidirectional communication.
How Realtime Works
Under a fal.App, the @fal.realtime() decorator makes your endpoint compatible with fal’s real-time clients. It uses fal’s binary msgpack protocol for efficient serialization and avoids connection setup overhead on repeated requests.
Important: The fal_client.realtime() method automatically connects to the /realtime path on your app. If you use @fal.realtime(), you must set the path to /realtime (e.g., @fal.realtime("/realtime")) for the client to connect successfully.
Alternatively, @fal.endpoint can be initialized with the is_websocket=True flag. The underlying function then receives the raw WebSocket connection and can use it however it wants.
Server-Side Implementation
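As a rough sketch of what a raw WebSocket handler might look like, the coroutine below echoes text messages until the client asks to stop. It is not taken from fal's own examples. It assumes the connection object follows the Starlette/FastAPI WebSocket interface (accept, receive_text, send_text, close), and the "/ws" path, the echo_handler name, and the FakeWebSocket test double are all hypothetical. In a real app the coroutine would be a method on a fal.App subclass carrying the @fal.endpoint("/ws", is_websocket=True) decorator. Here it is kept as a plain function so it runs without a deployment.

```python
import asyncio


async def echo_handler(websocket) -> None:
    """Echo each text message back until the client sends "close".

    Assumption: `websocket` exposes Starlette-style async methods.
    In a fal app this would be decorated with
    @fal.endpoint("/ws", is_websocket=True) (the path is hypothetical).
    """
    await websocket.accept()
    while True:
        message = await websocket.receive_text()
        if message == "close":
            break
        await websocket.send_text(f"echo: {message}")
    await websocket.close()


class FakeWebSocket:
    """Hypothetical in-memory stand-in so the handler runs without a server."""

    def __init__(self, incoming):
        self._incoming = list(incoming)
        self.sent = []
        self.accepted = False
        self.closed = False

    async def accept(self):
        self.accepted = True

    async def receive_text(self):
        return self._incoming.pop(0)

    async def send_text(self, data):
        self.sent.append(data)

    async def close(self):
        self.closed = True


fake = FakeWebSocket(["hello", "world", "close"])
asyncio.run(echo_handler(fake))
print(fake.sent)
```

Because the handler owns the connection, it decides the message format and when to close. That flexibility is the main reason to pick a raw endpoint over @fal.realtime().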
A single fal app can expose a regular HTTP endpoint and a WebSocket endpoint side by side.
Client-Side Connection
Connecting to @fal.realtime() Endpoints
For endpoints decorated with @fal.realtime(), use fal_client.realtime() or fal_client.realtime_async(). These methods handle serialization automatically using fal’s binary protocol.
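To illustrate the "connect once, send many inputs" pattern, here is a minimal sketch of a session loop. It assumes the connection object obtained from fal_client.realtime() exposes send() and recv() methods that accept and return plain dictionaries. That shape is an assumption, so check the client documentation before relying on it. The run_session helper, the FakeConnection class, and the "prompt"/"output" field names are hypothetical and exist only so the sketch runs locally.

```python
from typing import Any, Iterable, List


def run_session(connection: Any, inputs: Iterable[dict]) -> List[dict]:
    """Send every input over one open connection, collecting one result each."""
    results = []
    for payload in inputs:
        connection.send(payload)           # assumed method name
        results.append(connection.recv())  # assumed method name
    return results


class FakeConnection:
    """Hypothetical stand-in for a realtime connection; uppercases the prompt."""

    def __init__(self):
        self._pending = []

    def send(self, payload):
        self._pending.append({"output": payload["prompt"].upper()})

    def recv(self):
        return self._pending.pop(0)


results = run_session(FakeConnection(), [{"prompt": "a cat"}, {"prompt": "a dog"}])
print(results)

# Against a deployed app, the wiring might look like this (assumed usage,
# with a placeholder app id):
#   import fal_client
#   with fal_client.realtime("your-username/your-app") as connection:
#       run_session(connection, [{"prompt": "a cat"}])
```

The loop never reconnects between inputs, which is where the latency saving over per-request HTTP calls comes from.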
Connecting to Raw WebSocket Endpoints
For endpoints using is_websocket=True, use fal_client.ws_connect() or fal_client.ws_connect_async() for direct WebSocket access.
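With a raw WebSocket the caller is responsible for serialization. The sketch below shows one way to handle that, using JSON text frames. It assumes the object returned by fal_client.ws_connect() has send() and recv() methods for text frames. The request_over_ws helper, the FakeRawSocket class, the "/ws" path, and the message fields are hypothetical.

```python
import json


def request_over_ws(ws, payload: dict) -> dict:
    """Encode a payload as JSON, send it, and decode the next frame as the reply."""
    ws.send(json.dumps(payload))   # assumed method name
    return json.loads(ws.recv())   # assumed method name


class FakeRawSocket:
    """Hypothetical stand-in that wraps each received message in an "echo" field."""

    def __init__(self):
        self._frames = []

    def send(self, data: str) -> None:
        message = json.loads(data)
        self._frames.append(json.dumps({"echo": message}))

    def recv(self) -> str:
        return self._frames.pop(0)


reply = request_over_ws(FakeRawSocket(), {"prompt": "hi"})
print(reply)

# Against a deployed app, the wiring might look like this (assumed usage,
# with a placeholder app id and path):
#   import fal_client
#   with fal_client.ws_connect("your-username/your-app", path="/ws") as ws:
#       request_over_ws(ws, {"prompt": "hi"})
```

JSON is used here only for readability. A raw endpoint can use any framing that the server handler and the client agree on.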
- For @fal.realtime() endpoints: Use fal_client.realtime(). Serialization is handled automatically.
- For raw is_websocket=True endpoints: Use fal_client.ws_connect() with the path parameter to specify the endpoint path.
WebRTC Transport
For applications that need direct video/audio streaming (webcam feeds, live game rendering), you can use WebRTC as the transport layer on top of fal’s WebSocket infrastructure. WebRTC provides lower latency for media streams compared to sending frames over msgpack. The pattern uses @fal.endpoint("/webrtc", is_websocket=True) to handle WebRTC signaling, while the actual media flows peer-to-peer between the browser and your runner. For a higher-level wrapper that handles the signaling, tracks, and frame batching for you, see the experimental World Model Accelerator (WMA).
Real-time World Model
Deploy a live world model with WebRTC video streaming
Real-time Video-to-Video
Run YOLO detections on a live webcam feed via WebRTC
Next Steps
Streaming Endpoints
Stream progressive results (image previews, video updates) using SSE
Real-time Client Docs
Client library documentation for realtime connections