Once your app is deployed, you can call it using the same client SDKs and patterns used for any model on fal. Your app’s endpoint ID is your-username/your-app-name, and all of the inference methods work identically whether you are calling a marketplace model or your own deployed app.
This page shows quick examples of each calling pattern. subscribe is the simplest option since it handles polling for you and blocks until the result is ready. For production workloads where you need to manage many requests in parallel, submit gives you full control over the request lifecycle. For full details on parameters, response shapes, status polling, and cancellation, see the Inference documentation.
Subscribe
Submits a request to the queue, polls automatically, and returns the result when ready. This is the simplest calling pattern since it handles the request lifecycle for you. Optionally receive progress updates via callbacks.
import fal_client
result = fal_client.subscribe("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
})
print(result)
With progress updates:

import fal_client

def on_queue_update(update):
    if isinstance(update, fal_client.InProgress):
        for log in update.logs:
            print(log["message"])

result = fal_client.subscribe(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset over mountains"},
    with_logs=True,
    on_queue_update=on_queue_update,
)
import { fal } from "@fal-ai/client";
const result = await fal.subscribe("your-username/your-app-name", {
  input: { prompt: "a sunset over mountains" },
});
console.log(result.data);
With progress updates:

import { fal } from "@fal-ai/client";

const result = await fal.subscribe("your-username/your-app-name", {
  input: { prompt: "a sunset over mountains" },
  logs: true,
  onQueueUpdate: (update) => {
    if (update.status === "IN_PROGRESS") {
      update.logs?.map((log) => console.log(log.message));
    }
  },
});
Synchronous Inference
Full details on subscribe, progress updates, and timeout handling
Queue (Async)
For fire-and-forget workflows. Submit a request, get a request ID, and retrieve the result later by polling or via a webhook.
import fal_client
handler = fal_client.submit("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
})
print(f"Request ID: {handler.request_id}")
# Check status
status = handler.status()
print(status)
# Get result when ready
result = handler.get()
print(result)
import { fal } from "@fal-ai/client";
const { request_id } = await fal.queue.submit("your-username/your-app-name", {
  input: { prompt: "a sunset over mountains" },
});
console.log(`Request ID: ${request_id}`);
// Check status
const status = await fal.queue.status("your-username/your-app-name", {
  requestId: request_id,
  logs: true,
});
// Get result when ready
const result = await fal.queue.result("your-username/your-app-name", {
  requestId: request_id,
});
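In practice a caller polls until the request reaches a terminal state rather than checking status once. The sketch below shows one way to structure that loop in Python; `poll_until_complete`, its predicate argument, and the interval/timeout defaults are illustrative helpers, not part of the SDK (the client's `fal_client.Completed` status type is one way to detect completion):

```python
import time

def poll_until_complete(handler, is_complete, interval=1.0, timeout=300.0):
    """Poll handler.status() until is_complete(status) is true, then
    return handler.get(). Raises TimeoutError if the deadline passes.
    Illustrative helper; interval/timeout values are arbitrary defaults."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = handler.status()
        if is_complete(status):
            return handler.get()
        time.sleep(interval)
    raise TimeoutError(f"request did not complete within {timeout}s")

# With the handler from the example above, the predicate could check for
# the client's completed-status type, e.g.:
# result = poll_until_complete(
#     handler, lambda s: isinstance(s, fal_client.Completed)
# )
```

For long-running jobs, a webhook (see below) avoids holding a polling loop open at all.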
Asynchronous Inference
Full details on the queue system, status polling, and REST API reference
Streaming
For apps that produce progressive output via Server-Sent Events (SSE). Your app must define a streaming endpoint at /stream using @fal.endpoint("/stream").
import fal_client
for event in fal_client.stream("your-username/your-app-name", arguments={
    "prompt": "a sunset over mountains"
}):
    print(event)
import { fal } from "@fal-ai/client";
const stream = await fal.stream("your-username/your-app-name", {
  input: { prompt: "a sunset over mountains" },
});
for await (const event of stream) {
  console.log(event);
}
const finalResult = await stream.done();
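The Python client exposes the stream as a plain iterator with no equivalent of the JS `stream.done()`, so buffering is up to the caller. A minimal sketch, assuming (as the `stream.done()` pattern suggests) that the final event carries the complete result; `collect_stream` is an illustrative helper, not an SDK function:

```python
def collect_stream(events):
    """Drain an iterable of stream events, returning (all_events, last_event).

    Assumes the final event carries the complete result, mirroring the
    JS stream.done() behavior; event shapes are defined by your app.
    """
    buffered = []
    for event in events:
        buffered.append(event)
    last = buffered[-1] if buffered else None
    return buffered, last

# Usage with the Python client (hypothetical app):
# events, final = collect_stream(
#     fal_client.stream("your-username/your-app-name",
#                       arguments={"prompt": "a sunset over mountains"})
# )
```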
Building Streaming Endpoints
How to implement SSE streaming in your fal.App
Streaming Inference
Client-side streaming details and REST API
Real-Time (WebSocket)
For bidirectional, low-latency communication over a persistent connection. Your app must define a @fal.realtime("/realtime") endpoint.
import fal_client
with fal_client.realtime("your-username/your-app-name") as connection:
    connection.send({"prompt": "Hello, world!"})
    result = connection.recv()
    print(result)
Async version:

import asyncio
import fal_client

async def main():
    async with fal_client.realtime_async("your-username/your-app-name") as connection:
        await connection.send({"prompt": "Hello, world!"})
        result = await connection.recv()
        print(result)

asyncio.run(main())
import { fal } from "@fal-ai/client";
const connection = fal.realtime.connect("your-username/your-app-name", {
  onResult: (result) => {
    console.log(result);
  },
});
connection.send({ prompt: "Hello, world!" });
Building Realtime Endpoints
How to implement WebSocket endpoints in your fal.App
Real-Time Inference
Client-side real-time details and proxy setup
Webhooks
Submit a request and receive the result at a URL you specify, instead of polling.
import fal_client
handler = fal_client.submit(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset over mountains"},
    webhook_url="https://your-server.com/api/webhook",
)
print(f"Request ID: {handler.request_id}")
import { fal } from "@fal-ai/client";
const { request_id } = await fal.queue.submit("your-username/your-app-name", {
  input: { prompt: "a sunset over mountains" },
  webhookUrl: "https://your-server.com/api/webhook",
});
curl -X POST "https://queue.fal.run/your-username/your-app-name?fal_webhook=https://your-server.com/api/webhook" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a sunset over mountains"}'
When the request completes, fal sends a POST to your webhook URL with the result:
{
  "request_id": "abc-123",
  "status": "OK",
  "payload": { ... }
}
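On the receiving side, a handler only needs the fields shown in the payload above. A minimal sketch in Python; the function name and error handling are illustrative, and a production handler should also verify deliveries as described on the Webhooks API page:

```python
import json

def handle_fal_webhook(body: bytes) -> dict:
    """Extract the result from a fal webhook delivery.

    Relies only on the fields shown above (request_id, status, payload).
    Illustrative sketch; does not perform delivery verification.
    """
    event = json.loads(body)
    if event.get("status") != "OK":
        # Any status other than "OK" is treated as a failed request here.
        raise RuntimeError(
            f"request {event.get('request_id')} failed "
            f"with status {event.get('status')}"
        )
    return event["payload"]
```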
Webhooks API
Full details on webhook payloads, retries, verification, and IP allowlisting
You can pass custom headers with any calling method to control platform behavior:
import fal_client
result = fal_client.subscribe(
    "your-username/your-app-name",
    arguments={"prompt": "a sunset"},
    headers={
        "x-fal-no-retry": "1",
    },
)
import { fal } from "@fal-ai/client";
const result = await fal.subscribe("your-username/your-app-name", {
  input: { prompt: "a sunset" },
  headers: {
    "x-fal-no-retry": "1",
  },
});
See Platform Headers for all available headers, and Retries for retry control. Each inference method page also documents its available SDK parameters.
Next Steps
Inference Documentation
Full details on all calling methods, parameters, status polling, and the request handle
Async Inference (Queue)
Submit, status, result, cancel, webhooks, and streaming status updates
Client Setup
Install and configure the fal client SDK
Handle Inputs & Outputs
Define the input/output schema for your endpoints