
Modal

by Modal Labs

Serverless GPU and CPU compute platform for APAC AI workloads — run Python functions on cloud GPUs without Kubernetes, Docker, or infrastructure management, with automatic scaling and per-second billing.

AIMenta verdict
Recommended
5/5

"Serverless GPU compute — APAC AI teams use Modal to run Python functions on cloud GPUs without infrastructure management, enabling scalable LLM fine-tuning, batch inference jobs, and AI data pipelines with per-second billing."

What it does

Key features

  • Decorator-based: `@app.function(gpu="A100")` runs a Python function on a cloud GPU
  • Per-second billing: pay only for execution time, not for idle GPU instances
  • Container caching: fast cold starts via cached Python environment layers
  • Persistent volumes: checkpoint storage for fine-tuning and data jobs
  • Secrets management: secure API-key injection without exposing credentials in code
  • Scheduled + webhook: cron jobs and event-driven GPU function triggers
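The scheduled and webhook triggers above can be sketched with Modal's documented decorators — a minimal illustration, assuming the current `modal.App` API; the app name, cron expression, and handler bodies are placeholders, not a production setup:

```python
import modal

app = modal.App("example-triggers")  # app name is illustrative

# Scheduled trigger: Modal's Cron accepts standard five-field cron syntax.
@app.function(schedule=modal.Cron("0 2 * * *"))  # daily at 02:00 UTC
def nightly_batch_job():
    print("running scheduled batch inference")

# Webhook trigger: expose a function as an HTTP endpoint.
@app.function()
@modal.web_endpoint(method="POST")
def on_event(payload: dict):
    # Event-driven invocation: each POST spins up (or reuses) a container.
    return {"status": "received", "keys": list(payload)}
```

Deploying with `modal deploy` keeps both triggers live without any server to manage.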
When to reach for it

Best for

  • APAC AI and ML engineering teams who need GPU compute for fine-tuning, batch inference, or data pipelines without managing Kubernetes GPU clusters — particularly teams with variable GPU demand, where reserved instances would sit mostly idle.
Don't get burned

Limitations to know

  • ! US-based primary infrastructure — APAC teams with data-sovereignty requirements should review data residency
  • ! Cold start latency (2–10 s) may be too high for latency-sensitive real-time inference workloads
  • ! Per-second billing becomes more expensive than reserved GPU instances for always-on inference at scale
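To gauge the last point, a rough break-even calculation helps — all prices below are hypothetical placeholders, not Modal's or any cloud's actual rates:

```python
# Rough break-even sketch: per-second serverless GPU vs an always-on
# reserved instance. Both rates are hypothetical placeholders.
SERVERLESS_PER_SECOND = 0.0011  # $/s while a function is actually running
RESERVED_PER_HOUR = 2.00        # $/h for an always-on reserved GPU

def monthly_cost_serverless(busy_hours_per_day: float, days: int = 30) -> float:
    # Serverless bills only the seconds the function runs.
    return busy_hours_per_day * 3600 * SERVERLESS_PER_SECOND * days

def monthly_cost_reserved(days: int = 30) -> float:
    # Reserved capacity bills all 24 hours regardless of utilization.
    return 24 * RESERVED_PER_HOUR * days

# Break-even: daily GPU-busy hours at which serverless spend equals
# the reserved instance's flat cost.
break_even_hours = (24 * RESERVED_PER_HOUR) / (3600 * SERVERLESS_PER_SECOND)
print(round(break_even_hours, 2))
```

Under these assumed rates the costs cross over at roughly 12 busy hours per day; below that utilization per-second billing wins, and above it (e.g. always-on inference) a reserved instance is cheaper.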
Context

About Modal

Modal is a serverless compute platform that runs Python functions on cloud GPUs without infrastructure management — teams define GPU workloads as decorated Python functions and Modal handles provisioning, scaling, dependency installation, and billing. APAC teams use Modal for LLM fine-tuning jobs, batch inference, AI data pipelines, and model serving without operating their own GPU clusters.

Modal's decorator syntax turns ordinary Python functions into cloud-native compute jobs — `@app.function(gpu='A100', timeout=3600)` makes a function run on an A100 GPU in the cloud. Teams run the same function locally (CPU) during development and on a GPU in production without code changes, using `modal run` from the terminal or `modal deploy` for persistent endpoints.
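A minimal sketch of that pattern, based on Modal's documented decorator API — the app name, function body, and hyperparameter are illustrative:

```python
import modal

app = modal.App("finetune-demo")  # illustrative app name

@app.function(gpu="A100", timeout=3600)
def train_step(lr: float = 3e-4) -> str:
    # Runs on a cloud A100 when invoked remotely; the body is ordinary
    # Python, so it can also be exercised locally during development.
    return f"trained with lr={lr}"

@app.local_entrypoint()
def main():
    # `modal run this_file.py` runs main() locally while
    # train_step.remote() executes on a cloud GPU.
    print(train_step.remote())
```

The same function object exposes both `.remote()` (cloud GPU) and `.local()` (current machine), which is what makes the dev/prod split work without code changes.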

Modal's container caching builds Python environments once and reuses the cached layers — a function requiring `torch`, `transformers`, and `datasets` installs them once, and subsequent runs start in seconds rather than minutes. For fine-tuning workflows that run many iterations, this cold-start optimization significantly shortens the edit-run loop versus rebuilding containers from scratch.
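In Modal's API, those dependencies are declared on a `modal.Image` rather than installed at runtime — a sketch under the same assumptions (image contents and function body are illustrative):

```python
import modal

# Dependencies are declared on the image. Modal builds this environment
# once and caches the layers, so later cold starts skip the pip install.
image = (
    modal.Image.debian_slim()
    .pip_install("torch", "transformers", "datasets")
)

app = modal.App("cached-env-demo", image=image)

@app.function(gpu="A100")
def tokenize_and_train(dataset_name: str):
    # Heavy imports happen inside the container, where the cached
    # environment already has them installed.
    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    ...
```

Changing the `pip_install` list invalidates only the affected layer, so dependency tweaks rebuild incrementally rather than from zero.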

Modal's persistent storage provides volumes that survive function invocations — fine-tuning jobs can checkpoint model weights to a Modal volume and resume from those checkpoints without re-transferring data. Modal also provides secrets management for API keys and credentials, keeping sensitive configuration out of function code.
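Volumes and secrets attach to a function through keyword arguments on the decorator — a sketch assuming Modal's documented `Volume`/`Secret` API; the volume name, secret name, and the `HF_TOKEN` environment variable are illustrative assumptions:

```python
import modal

app = modal.App("checkpoint-demo")

# A named volume persists across invocations; checkpoints written here
# survive the function's container being torn down.
checkpoints = modal.Volume.from_name("ft-checkpoints", create_if_missing=True)

@app.function(
    gpu="A100",
    volumes={"/checkpoints": checkpoints},
    # Secrets are injected as environment variables at runtime; the key
    # never appears in the function code or the repository.
    secrets=[modal.Secret.from_name("huggingface-token")],
)
def finetune(resume: bool = False):
    import os
    import pathlib

    token = os.environ["HF_TOKEN"]  # assumed variable name in the secret
    ckpt = pathlib.Path("/checkpoints/latest.pt")
    if resume and ckpt.exists():
        print(f"resuming from {ckpt}")
    # ... training loop writes new checkpoints under /checkpoints ...
    checkpoints.commit()  # persist writes back to the shared volume
```

Because the volume is mounted at a plain filesystem path, existing checkpoint-saving code (e.g. `torch.save` to `/checkpoints`) works unchanged.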
