November 25, 2025
Scale your application with the new scaling delay feature
Scale your application with the new scaling delay feature.
- Scaling delay - the number of seconds the system waits for a request to be picked up by a runner before it scales up another runner
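The rule can be illustrated with a minimal sketch. This is not fal's scheduler: the function name and arguments are hypothetical, timestamps are plain seconds, and treating the delay as an inclusive threshold is an assumption.

```python
def should_scale_up(enqueued_at: float, now: float, scaling_delay: float) -> bool:
    """Return True once a request has waited unclaimed for at least scaling_delay seconds.

    Hypothetical helper for illustration only; the platform's real logic may differ.
    """
    waited = now - enqueued_at
    return waited >= scaling_delay


# A request queued at t=0 with a 10-second scaling delay:
still_waiting = should_scale_up(enqueued_at=0.0, now=5.0, scaling_delay=10.0)
delay_elapsed = should_scale_up(enqueued_at=0.0, now=10.0, scaling_delay=10.0)
```

A longer delay tolerates short bursts without adding runners; a shorter one reacts faster at the cost of more scale-ups.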
November 17, 2025
Reduce cold start times with shared compiled PyTorch caches
Dramatically reduce cold start times for torch.compile() models with the new inductor cache utilities.
- Load pre-compiled CUDA kernels in ~2 seconds instead of recompiling for 20-30 seconds on each worker
- GPU-specific caching automatically organized by GPU type (H100, H200, A100)
- Two usage patterns: manual control with load_inductor_cache()/sync_inductor_cache(), or automatic with the synchronized_inductor_cache() context manager
- Persistent shared storage at /data/inductor-caches/<GPU_TYPE>/<cache_key>.zip
- First worker compiles and shares, subsequent workers load instantly
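The storage layout can be sketched as follows. The helper below is hypothetical and only builds the path described in this entry; it does not call the fal cache utilities, whose signatures are not shown here, and the cache key is a made-up example.

```python
from pathlib import PurePosixPath

# Root directory taken from the changelog entry above.
CACHE_ROOT = PurePosixPath("/data/inductor-caches")


def inductor_cache_path(gpu_type: str, cache_key: str) -> PurePosixPath:
    """Hypothetical helper: build <root>/<GPU_TYPE>/<cache_key>.zip."""
    return CACHE_ROOT / gpu_type / f"{cache_key}.zip"


# The same cache key resolves to a different archive per GPU type,
# so kernels compiled on one GPU type are not loaded on another.
h100_path = inductor_cache_path("H100", "my-model-v1")
a100_path = inductor_cache_path("A100", "my-model-v1")
```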
November 14, 2025
Get Slack notifications for serverless app failures
Never miss critical issues with instant Slack alerts for your serverless applications.
- Connect your workspace with one-click OAuth installation
- Choose notification channel from a dropdown of your Slack channels
- Instant alerts for:
  - App startup failures and timeouts
  - Critical platform issues
  - Real-time error notifications
- Team visibility - everyone in the channel sees important updates
- Configure at https://fal.ai/dashboard/notifications/settings
November 4, 2025
Stop and kill runners directly from the dashboard
No more switching to the CLI to manage your runners. You now have full lifecycle control right from the dashboard.
- Graceful shutdown or force kill runners with a single click
- Access at https://fal.ai/dashboard/apps/{username}/{appname}/runners
Stream platform logs to your own endpoint with drains
Integrate fal’s logging with your existing observability stack using the new Serverless Drains feature.
- Automatic log forwarding from apps, runners, and file operations in NDJSON format
- Works with Datadog, Splunk, Elasticsearch, or any HTTP endpoint
- Configure at https://fal.ai/dashboard/drains
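On the receiving side, NDJSON is one JSON object per line. Here is a minimal parsing sketch; the field names in the sample payload are made up for illustration and the real drain schema may differ.

```python
import json


def parse_ndjson(payload: str) -> list[dict]:
    """Parse newline-delimited JSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


# Hypothetical payload; "level" and "message" are assumed field names.
sample = (
    '{"level": "info", "message": "runner started"}\n'
    '{"level": "error", "message": "setup failed"}\n'
)
records = parse_ndjson(sample)
errors = [record for record in records if record["level"] == "error"]
```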
November 2, 2025
Upload larger files with improved timeout handling
We’ve significantly improved the reliability of file uploads from URLs, especially for large datasets and model files.
- Extended timeout to 10 minutes for fal files upload and fal files upload-url
- Upload multi-GB files without timeout errors
- See fal files docs
November 1, 2025
Restart all runners without redeploying
Apply environment changes or recover from bad states instantly with the new fal apps rollout command.
- Restart all runners for an app without creating a new deployment
- Graceful by default (runners finish current requests) or use --force for immediate restart
- Pick up new secrets, environment variables, or clear memory issues
- See fal apps rollout docs
Stop specific runners without affecting others
Target individual runners for maintenance with graceful shutdown via fal runners stop.
- Stop specific runners without affecting others, useful for targeted maintenance
- See fal runners docs
Debug production runners with interactive shell access
Jump directly into any running container to troubleshoot issues in real-time with fal runners shell.
- SSH-like access to inspect files, environment variables, and dependencies
- Debug production issues without redeploying
- See fal runners shell docs
October 31, 2025
See everything happening in your app with the events timeline
Complete activity history for runners, deployments, and config changes in one place.
- Unified timeline of runner events, deployments, and config changes
- Access at https://fal.ai/dashboard/apps/{username}/{appname}/events
October 25, 2025
Get from zero to deployed in minutes with in-app onboarding
New interactive guide walks you through your first serverless deployment step-by-step.
- Step-by-step walkthrough from installation to deployment with copy-paste examples
- Access at https://fal.ai/dashboard/serverless-get-started
October 22, 2025
Delete files from fal storage
Remove files and directories with the new fal files rm command.
- Recursive deletion: fal files rm path/to/file-or-directory
- See fal files docs
October 21, 2025
Platform APIs v1 officially released
Programmatically manage your model deployments with the new Platform APIs.
- Model discovery - search and metadata retrieval for 600+ models
- Pricing and cost estimation - real-time pricing information
- Usage tracking - detailed line items with quantities and prices
- Analytics - request counts, error rates, and latency percentiles
- Available at https://api.fal.ai/v1 - see docs
Get notified when you hit concurrent requests limits
Never wonder why requests are queuing: we now send notifications when you reach your concurrency limit.
- Email and dashboard notifications with smart throttling (immediate, 1h, 1d, weekly)
- Limit value included in 429 responses for programmatic handling
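Programmatic handling might look like the sketch below. The entry does not say where in the 429 response the limit appears, so the "limit" key in a parsed JSON body is purely an assumption, as is the helper name; check the actual response shape before relying on it.

```python
from typing import Optional


def concurrency_limit_from_response(
    status: int, body: dict, key: str = "limit"
) -> Optional[int]:
    """Return the reported concurrency limit for a 429 response, else None.

    The key name is an assumption for illustration, not a documented field.
    """
    if status != 429:
        return None
    value = body.get(key)
    return int(value) if value is not None else None


# Hypothetical parsed responses:
throttled = concurrency_limit_from_response(429, {"limit": 10})
accepted = concurrency_limit_from_response(200, {})
```

A caller could use the returned value to cap its own in-flight requests instead of retrying blindly.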
Debug errors faster with the new errors page
Comprehensive error analytics to identify and resolve issues quickly.
- Server vs client error rates with 4xx/5xx breakdown and sparklines
- Error timeline with status code distribution and endpoint-level breakdown
- Access at https://fal.ai/dashboard/apps/{username}/{appname}/errors
October 20, 2025
Stop or kill individual runners from the command line
Precise control over each runner’s lifecycle without touching the dashboard.
- fal runners stop - gracefully stop a runner, allowing in-flight requests to complete
- fal runners kill - immediately terminate a runner without waiting
- See fal runners docs
October 16, 2025
See exactly how long runners spend starting up
Identify GPU availability bottlenecks and optimize cold start times.
- Pending uptime metrics show how long runners wait before becoming active
- Track PENDING, DOCKER_PULL, and SETUP state durations separately
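The per-state breakdown amounts to differencing state-entry timestamps. Here is a minimal sketch with a made-up timeline; the function and its input shape are hypothetical, not how the dashboard computes the metric.

```python
def state_durations(
    transitions: list[tuple[str, float]], active_at: float
) -> dict[str, float]:
    """Seconds spent in each startup state.

    transitions: (state, entered_at) pairs in time order.
    active_at:   time the runner left the last startup state and became active.
    """
    durations: dict[str, float] = {}
    for i, (state, entered_at) in enumerate(transitions):
        # A state ends when the next one begins, or when the runner becomes active.
        left_at = transitions[i + 1][1] if i + 1 < len(transitions) else active_at
        durations[state] = durations.get(state, 0.0) + (left_at - entered_at)
    return durations


# Hypothetical timeline, in seconds since the runner was requested.
timeline = [("PENDING", 0.0), ("DOCKER_PULL", 12.0), ("SETUP", 40.0)]
durations = state_durations(timeline, active_at=55.0)
```

In this example a long PENDING time would point at GPU availability, while long DOCKER_PULL or SETUP times point at image size or startup code.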
October 15, 2025
Connect fal docs to Cursor with MCP
Access the complete fal documentation directly in Cursor using Model Context Protocol.
- Complete documentation in your IDE with AI-powered suggestions
- Simple setup: add the fal MCP server to your mcp.json - see guide
Personalized dashboard with creator and developer views
The dashboard now adapts to your workflow with two distinct experiences.
- Creator view - gallery-focused with favorite models and visual generation history
- Developer view - metrics-driven with usage stats, error tracking, and API analytics
- Quick stats showing credits, requests, and errors with sparklines
October 13, 2025
Add custom headers to your API requests
Integrate seamlessly with analytics, auth, and middleware by passing custom HTTP headers.
- Add custom headers for analytics, authentication, or middleware integration
- Works with all client libraries
Multi-GPU inference and training with fal.distributed
Scale AI workloads across multiple GPUs with the new fal.distributed module.
- Data parallelism - generate multiple outputs simultaneously (e.g., 4 images on 4 GPUs)
- Model parallelism - split large models across GPUs for faster generation
- Distributed training - synchronized gradient updates with DDP
- Supports 2, 4, or 8 GPU configurations on H100s and A100s
- See distributed docs
October 10, 2025
Dedicated pages for Analytics, Runners, Logs, and Versions
Complete app details redesign gives each deployment aspect its own focused view.
- New Analytics page - runner-focused metrics with date range filtering
- New Runners page - app-scoped runner view with enhanced filters
- New Logs page - dedicated log viewer for debugging
- New Versions page - manage and view app revisions
- Enhanced Overview - endpoint stats and performance metrics at a glance
October 9, 2025
Compare models side-by-side in the new Sandbox
Find the perfect model by testing multiple options in parallel with the same prompt.
- Run multiple models simultaneously with the same prompt
- Available at https://fal.ai/sandbox
October 8, 2025
Manage deployments from Python without async/await
New synchronous client makes serverless management feel just like the CLI.
- Manage apps, runners, and deployments programmatically without async/await
- Same API as CLI: client.apps.*, client.runners.*, client.deploy()
- See Python client docs
October 6, 2025
Bring your own container to any deployment
Full control over your runtime environment with custom Docker images.
- Use ContainerImage.from_dockerfile_str() or ContainerImage.from_dockerfile()
- Install any dependencies, tools, or system packages you need
- See custom containers guide
October 3, 2025
Dynamic auto-scaling with percentage-based buffers
Scale more intelligently by setting concurrency buffers as percentages instead of fixed numbers.
- Configure buffer as a percentage of current concurrency for dynamic scaling
- See scaling docs
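The arithmetic behind a percentage buffer can be sketched as follows. The function name is hypothetical, and rounding the buffer up to a whole runner is an assumption; see the scaling docs for the platform's actual behavior.

```python
import math


def buffered_target(current_concurrency: int, buffer_percent: float) -> int:
    """Current concurrency plus a buffer sized as a percentage of it.

    Rounding the buffer up is an assumption made for this sketch.
    """
    buffer = math.ceil(current_concurrency * buffer_percent / 100)
    return current_concurrency + buffer


# With a 20% buffer, headroom grows with load instead of staying fixed:
low_load = buffered_target(10, 20)   # buffer of 2 runners
high_load = buffered_target(50, 20)  # buffer of 10 runners
```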
Runner logs with streaming and filtering
Real-time log streaming and powerful filtering for faster debugging.
- Stream logs in real-time with fal runners logs --follow
- Filter by time range with --since and --until
- Search logs with the --search parameter
- Scrollable and searchable in the dashboard with SSE-powered updates
- See fal runners logs docs
Include local files in your deployments automatically
Bring configs, utilities, and code from your local machine into serverless apps.
- Specify files with relative or absolute paths to include at runtime
- Works with fal run and fal deploy
- See app files docs
Find what you need faster with reorganized navigation
Clearer dashboard structure groups features by workflow: Generate, Serverless, and Manage.
- Generate group: Sandbox, Model Gallery
- Serverless group: Apps, Logs, Files, Runners
- Manage group: Usage, Billing, API Keys, Webhooks, Team Members
October 2, 2025
Know exactly which version each runner is running
Track deployments better with revision IDs shown on every runner.
- Revision ID displayed on runners to track which version is running
- State renamed: “DEAD” → “TERMINATED” for clarity
October 1, 2025
Filter logs with custom labels and powerful queries
Find what you need instantly with EXACT/CONTAINS matching and multi-condition filters.
- EXACT or CONTAINS matching for label values
- Multiple conditions with OR logic (e.g., status IN ["error", "warning"])
- Available in dashboard and API
- Examples: error_type = "ValidationError", endpoint CONTAINS "/api/v2/"
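The matching semantics can be approximated with a small sketch. This is an illustration of EXACT/CONTAINS matching combined with OR logic, not the platform's query engine; the function names and the tuple-based condition format are hypothetical.

```python
def matches(labels: dict[str, str], field: str, op: str, value: str) -> bool:
    """Check one condition against a log entry's labels."""
    actual = labels.get(field)
    if actual is None:
        return False
    if op == "EXACT":
        return actual == value
    if op == "CONTAINS":
        return value in actual
    raise ValueError(f"unsupported operator: {op}")


def matches_any(labels: dict[str, str], conditions: list[tuple[str, str, str]]) -> bool:
    """OR logic: the entry matches if at least one condition holds."""
    return any(matches(labels, field, op, value) for field, op, value in conditions)


# Hypothetical log entry labels, mirroring the examples above.
entry = {"error_type": "ValidationError", "endpoint": "/api/v2/users"}
exact_hit = matches(entry, "error_type", "EXACT", "ValidationError")
contains_hit = matches(entry, "endpoint", "CONTAINS", "/api/v2/")
or_hit = matches_any(
    entry,
    [("status", "EXACT", "error"), ("endpoint", "CONTAINS", "/api/v2/")],
)
```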
See what runners are doing during startup
Track exactly where runners are in the startup process: pending, pulling images, or setting up.
- fal runners list now shows PENDING, DOCKER_PULL, and SETUP states
- Understand deployment progress in real-time
View all app endpoints and config at a glance
Redesigned app details page surfaces the information you need most.
- Endpoints, configuration, and status all in one place
September 27, 2025
Monitor and clear your request queue from the CLI
Check how many requests are queued and flush them when needed.
- fal queue size app_name - check queue size for an app
- fal queue flush app_name - flush all pending requests
- See fal queue docs
September 10, 2025
View runner history with time-based filtering
See terminated runners and filter by state to debug failures.
- fal runners list --since "1h" - view runners from the last hour (max 24h)
- fal runners list --state dead - filter by state (running, pending, setup, dead)
- Helpful for debugging failed deployments and understanding runner lifecycle
- See fal runners list docs
August 29, 2025
Reorganize files in fal storage without re-uploading
Move and rename files instantly with the new fal files mv command.
- Rename or move files in fal storage: fal files mv source destination
- See fal files docs