NUS and NTU Publish APAC-Bench: Open-Source LLM Benchmark for APAC Regulatory and Financial Tasks

NUS and NTU have released APAC-Bench, an open-source LLM benchmark of 12,000 APAC regulatory, legal, and financial tasks. In its evaluation, GPT-4o and Claude Sonnet outperform Chinese models on English tasks but trail them on Chinese regulatory document reasoning.

By AIMenta Editorial Team · Apr 26, 2026

Researchers at the National University of Singapore (NUS) and Nanyang Technological University (NTU) have published APAC-Bench, an open-source benchmark for evaluating large language models on tasks grounded in APAC regulatory frameworks, legal documents, and financial instruments. It targets the gap between Western-centric LLM benchmarks and the requirements of APAC enterprise AI deployments.

APAC-Bench contains 12,000 tasks across six APAC-specific categories: MAS TRM and HKMA regulatory compliance Q&A (English), CSRC and CBIRC Chinese securities and banking regulation (Mandarin), Japanese FSA financial reporting interpretation (Japanese), Southeast Asian consumer protection law analysis (Bahasa Indonesia and Bahasa Malaysia), APAC financial statement extraction and calculation (bilingual), and APAC legal contract clause identification (mixed language).

The APAC-Bench evaluation covered 12 leading LLMs. GPT-4o and Claude Sonnet 3.7 lead the English-language APAC regulatory categories, scoring 8-12 points above Chinese models (Qwen-2.5, Doubao-pro-32k). On Chinese-language regulatory reasoning tasks the ranking reverses: Qwen-2.5-72B outperforms GPT-4o by 14 points and Claude Sonnet by 19 points, a result that supports the commercial case for Chinese foundation models in Mandarin-primary APAC enterprise workflows. The benchmark is released under the Apache 2.0 license on Hugging Face, together with evaluation scripts so that APAC AI engineering teams building foundation model selection frameworks can reproduce the results.
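For teams turning per-category results like these into a model selection rule, the comparison could be sketched roughly as follows. This is a minimal illustration, not the APAC-Bench evaluation scripts: the function name `best_model_per_category`, the category labels, the model names, and every score are hypothetical placeholders rather than published figures.

```python
from collections import defaultdict


def best_model_per_category(scores):
    """Pick the top-scoring model for each category.

    `scores` is an iterable of (model, category, score) tuples.
    Returns {category: (best_model, best_score, margin_over_runner_up)}.
    The margin is None when only one model was scored in a category.
    """
    by_category = defaultdict(list)
    for model, category, score in scores:
        by_category[category].append((score, model))

    selection = {}
    for category, entries in by_category.items():
        ranked = sorted(entries, reverse=True)  # highest score first
        top_score, top_model = ranked[0]
        margin = top_score - ranked[1][0] if len(ranked) > 1 else None
        selection[category] = (top_model, top_score, margin)
    return selection


# Placeholder scores for illustration only; NOT APAC-Bench results.
example_scores = [
    ("model_a", "regulatory_en", 80.0),
    ("model_b", "regulatory_en", 70.0),
    ("model_a", "regulatory_zh", 60.0),
    ("model_b", "regulatory_zh", 75.0),
]

selection = best_model_per_category(example_scores)
for category, (model, score, margin) in sorted(selection.items()):
    print(f"{category}: {model} (score {score}, margin {margin})")
```

Selecting per category rather than by a single overall average keeps a ranking reversal between languages visible, which is the pattern the findings above describe.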

