fal exposes a Prometheus-compatible metrics endpoint that you can scrape with any monitoring tool. Use it to build custom dashboards, set up alerts on queue depth or error rates, and feed fal metrics into the same observability stack you use for the rest of your infrastructure. The endpoint returns metrics in the Prometheus exposition format, so it works with Grafana, Datadog, New Relic, Splunk, or any other tool that can scrape a Prometheus target. Responses are cached for 10 seconds, so set your scrape interval accordingly.

## Documentation Index
Fetch the complete documentation index at: https://fal.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
## Endpoint
## Available Metrics
| Metric | Labels | Description |
|---|---|---|
| `fal_app_runners` | `application`, `state`, `machine_type` | Number of runners currently allocated |
| `fal_app_queue_size` | `application` | Requests waiting in the queue |
| `fal_app_concurrent_requests` | `application` | Requests being actively processed |
| `fal_app_requests_completed` | `application`, `method`, `status` | Requests completed in the last minute |
| `fal_app_requests_received` | `application`, `method` | Requests received in the last minute |
| `fal_app_request_latency` | `application`, `le` | Completed requests bucketed by latency |
The `fal_app_runners` metric tracks runners across three states: `idle` (warm, waiting for requests), `running` (processing a request), and `pending` (cold start in progress).
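As a sketch, the per-state breakdown can be inspected with PromQL; `my-app` below is a placeholder application name, not a real one:

```promql
# Runners per state for a single application ("my-app" is hypothetical)
sum by (state) (fal_app_runners{application="my-app"})

# Warm capacity only: exclude runners still in a cold start
sum(fal_app_runners{application="my-app", state!="pending"})
```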
## Integration
Add the endpoint as a Prometheus data source in your monitoring tool. The only requirement is passing your API key in the `Authorization: Key ...` header. Set the scrape interval to at least 10 seconds, since responses are cached for that long.
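For a self-hosted Prometheus, a minimal `scrape_config` sketch could look like the following. The host and path placeholders are assumptions and must be replaced with the actual endpoint from the Endpoint section, and the credential with your own API key:

```yaml
scrape_configs:
  - job_name: fal
    scheme: https
    # Placeholder: substitute the real fal metrics host and path
    metrics_path: /metrics
    static_configs:
      - targets: ["<fal-metrics-host>"]
    # Sends "Authorization: Key <YOUR_FAL_API_KEY>" on each scrape
    authorization:
      type: Key
      credentials: <YOUR_FAL_API_KEY>
    # Responses are cached for 10 seconds, so scrape no more often than that
    scrape_interval: 15s
```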
## Example PromQL Queries
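A few illustrative queries, assuming a hypothetical application named `my-app` (the `status` label is assumed here to carry HTTP-style codes):

```promql
# Current queue depth per application
fal_app_queue_size

# Share of completed requests in the last minute that failed with a 5xx status
sum(fal_app_requests_completed{application="my-app", status=~"5.."})
  / sum(fal_app_requests_completed{application="my-app"})

# Approximate p95 latency from the histogram-style buckets
histogram_quantile(0.95, fal_app_request_latency{application="my-app"})
```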
All metrics are gauges. The `fal_app_request_latency` metric uses histogram-style buckets (labeled by `le`) for latency distribution analysis.

## Platform API Reference

Full API specification for the metrics endpoint.