Model release

Meta AI Releases Llama 4 Scout and Maverick with Frontier Performance at Open-Weight Cost

Meta AI releases Llama 4 Scout and Maverick, open-weight models achieving frontier performance on coding and reasoning benchmarks at lower inference cost. The release accelerates open-source deployment by APAC enterprises as the cost-performance gap with closed models narrows significantly.

By AIMenta Editorial Team

Original source: Meta AI


Meta AI has released Llama 4 Scout (17B active parameters, 16-expert MoE architecture) and Llama 4 Maverick (17B active parameters, 128 experts) under the Llama 4 Community License. Both are open-weight models that achieve performance competitive with GPT-4o and Claude 3.5 Sonnet on standard reasoning, coding, and instruction-following benchmarks, while operating at inference costs 60-80% lower than closed API providers at equivalent parameter counts.

Llama 4's Mixture-of-Experts (MoE) architecture activates only a subset of the model's parameters (17B) on each forward pass, even though the total parameter count is much larger. This enables frontier-class reasoning performance at inference costs closer to those of smaller dense models. For APAC enterprises evaluating open-source AI deployment, Llama 4's performance-cost ratio substantially improves the ROI case for self-hosted inference: running Llama 4 Maverick on dedicated APAC cloud infrastructure (a 4x A100 GPU instance on AWS Singapore) achieves GPT-4o-comparable quality at approximately 30% of the OpenAI API cost at moderate request volumes.
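The MoE routing idea can be seen in a minimal sketch. This is a toy illustration of top-k expert routing in general, not Meta's implementation; the layer sizes, gate, and top_k=1 choice are illustrative assumptions. A learned gate scores all experts, but only the top-scoring expert(s) actually run, so most parameters stay idle per token:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, top_k=1):
    """Toy MoE layer: the gate scores every expert, but only the
    top_k highest-scoring experts execute, so most expert parameters
    are untouched on any given forward pass."""
    logits = x @ gate_w                       # one gate score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the active experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the chosen experts only
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

d_model, num_experts = 8, 16                  # 16 experts, as in Scout
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]
gate_w = rng.standard_normal((d_model, num_experts))
x = rng.standard_normal(d_model)

out, active = moe_forward(x, gate_w, experts, top_k=1)
# With top_k=1, only 1/16 of the expert parameters run per token,
# which is why active-parameter count, not total, drives inference cost.
print(f"active experts: {active}, active fraction: {1/num_experts:.3f}")
```

This is why a model with a large total parameter count can price out like a much smaller dense model: compute per token scales with the 17B active parameters, not the full expert pool.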

For APAC enterprises with data sovereignty requirements — financial services organisations that cannot route customer data through US-hosted API endpoints, healthcare organisations with patient data constraints, government agencies with sovereign AI mandates — Llama 4 delivers frontier-class capability as an open-weight model, enabling APAC infrastructure deployment without the capability sacrifice that previous open-weight generations required. Enterprises running Llama 4 on Singapore-hosted infrastructure can satisfy MAS TRM, PDPC, and APRA data residency requirements without depending on US-hosted model providers.

Llama 4's release shortens the path to commercial open-weight deployment for APAC enterprises by reducing the effort needed to justify open source over closed APIs. The performance gap that APAC AI leaders previously had to explain and defend when recommending self-hosted inference has narrowed to the point where Llama 4 is competitive for the majority of enterprise AI use cases without extensive justification.
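The self-hosting justification ultimately reduces to break-even arithmetic: a fixed monthly infrastructure cost versus per-token API pricing. A minimal sketch, with both input prices as hypothetical placeholders rather than published AWS or OpenAI rates:

```python
def breakeven_tokens_per_month(gpu_cluster_hourly_usd, api_usd_per_million_tokens):
    """Monthly token volume (in millions) above which an always-on
    self-hosted GPU cluster is cheaper than per-token API pricing.
    Both inputs are hypothetical placeholders, not published prices."""
    monthly_infra_cost = gpu_cluster_hourly_usd * 24 * 30   # always-on cluster
    return monthly_infra_cost / api_usd_per_million_tokens

# Hypothetical example: $16/hour for a 4-GPU instance vs $5 per million API tokens.
m = breakeven_tokens_per_month(16.0, 5.0)
print(f"break-even: {m:.0f}M tokens/month")  # prints "break-even: 2304M tokens/month"
```

Above the break-even volume, every additional token is effectively free on the self-hosted cluster (until it saturates), which is why the case strengthens as request volume grows.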

Beyond this story

Cross-reference our practice depth.

News pieces sit on top of working capability. Browse the service pillars, industry verticals, and Asian markets where AIMenta turns these stories into engagements.

Tagged
#meta #llama #open-source #model-release #apac #enterprise-ai
