Llama Guard

by Meta AI

Meta's open-source LLM safety classifier evaluates both user inputs and LLM outputs against a configurable taxonomy of 11 harm categories. It lets APAC engineering teams add production-grade safety guardrails to LLM applications, customized for regional regulatory requirements and cultural content standards, without relying on cloud content moderation APIs.

AIMenta verdict
Recommended
5/5

"Meta Llama Guard for APAC LLM safety — open LLM safety classifier detecting harmful inputs and outputs across 11 harm categories, enabling APAC teams to add customizable safety guardrails to production LLM applications without cloud content moderation APIs."

What it does

Key features

  • 11 harm categories: configurable harm taxonomy with custom category extension
  • Dual checkpoint: both input and output classification for end-to-end safety coverage
  • On-premises: deployment in sensitive APAC industries without cloud moderation APIs
  • Context-aware: understands intent, avoiding the false positives of rule-based filters
  • Audit logging: harm classification records for AI governance compliance
  • Llama 3 based: 8B model for production; distilled variants for high throughput
When to reach for it

Best for

  • APAC engineering teams deploying production LLM applications in regulated industries or consumer-facing contexts; in particular, organizations that need configurable, on-premises safety classification without a cloud moderation API dependency, and teams building AI governance audit trails that require logged safety classifications for regulatory compliance.
Don't get burned

Limitations to know

  • ! Multilingual safety accuracy varies: the primarily English-trained model may miss harm patterns in APAC languages
  • ! The 8B model requires a GPU for production throughput; distilled versions trade accuracy for speed
  • ! Custom harm categories require fine-tuning on labeled APAC examples for reliable classification
Context

About Llama Guard

Llama Guard is Meta AI's open-source LLM safety model — a fine-tuned Llama model trained to classify both user inputs and LLM outputs against a configurable taxonomy of harm categories, providing APAC engineering teams with a production-grade safety guardrail that operates as a judge model in LLM application pipelines. Unlike rule-based content filters or regex blocklists, Llama Guard understands context and intent — distinguishing between a medical professional asking about drug dosages and a malicious actor requesting synthesis instructions.
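As a judge model, Llama Guard replies with a short text verdict: "safe", or "unsafe" followed by the codes of the violated categories. A minimal sketch of handling that verdict in a pipeline (the `parse_verdict` helper and `GuardVerdict` type are illustrative, not part of any official Meta SDK):

```python
from dataclasses import dataclass, field

@dataclass
class GuardVerdict:
    safe: bool
    categories: list = field(default_factory=list)  # e.g. ["S1", "S10"]

def parse_verdict(raw: str) -> GuardVerdict:
    """Parse Llama Guard's text output: 'safe', or 'unsafe' plus
    a comma-separated line of category codes."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if not lines or lines[0].lower() == "safe":
        return GuardVerdict(safe=True)
    codes = lines[1].split(",") if len(lines) > 1 else []
    return GuardVerdict(safe=False, categories=[c.strip() for c in codes])
```

Downstream code can then branch on `verdict.safe` and log `verdict.categories` for the audit trail.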

Llama Guard's taxonomy covers 11 harm categories aligned with major AI safety frameworks: violent crimes, non-violent crimes, sex-related crimes, child sexual exploitation, defamation, specialized advice (medical, legal, financial), privacy, intellectual property, hate speech, suicide and self-harm, and sexual content. APAC teams can customize which categories to enforce and can extend the taxonomy with APAC-specific harm categories — adding APAC regulatory compliance categories (Chinese cybersecurity law prohibitions, Japan Act on Protection of Personal Information violations) or cultural sensitivity categories relevant to specific APAC markets.
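Because the taxonomy is supplied to the model as text, customizing it amounts to editing the category list rendered into the guard prompt. A sketch under assumptions: the category names follow the article's list, while the `S1`..`S12` codes, the rendering format, and the APAC extension example are illustrative rather than Meta's exact template:

```python
# Base taxonomy as described in the article (codes are illustrative).
LLAMA_GUARD_CATEGORIES = {
    "S1": "Violent Crimes",
    "S2": "Non-Violent Crimes",
    "S3": "Sex-Related Crimes",
    "S4": "Child Sexual Exploitation",
    "S5": "Defamation",
    "S6": "Specialized Advice",
    "S7": "Privacy",
    "S8": "Intellectual Property",
    "S9": "Hate Speech",
    "S10": "Suicide & Self-Harm",
    "S11": "Sexual Content",
}

def category_block(categories: dict) -> str:
    """Render the category list that is embedded in the guard prompt."""
    return "\n".join(f"{code}: {name}." for code, name in categories.items())

# Extending with a hypothetical APAC-specific compliance category:
apac_taxonomy = {**LLAMA_GUARD_CATEGORIES,
                 "S12": "APPI/PDPA Personal Data Violations"}
```

Categories can likewise be dropped from the dict to disable enforcement for a given deployment.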

Llama Guard 3 (the current version) is based on Llama 3 and is available in multiple sizes — 8B for production deployments requiring full context understanding, and smaller distilled versions for high-throughput moderation. APAC teams integrate Llama Guard as a pre-processing step (checking user input before sending to LLM) and post-processing step (checking LLM output before returning to user), creating a dual safety checkpoint that catches both prompt injection attacks and harmful model generations.
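The dual checkpoint described above can be sketched as a thin wrapper around the LLM call. Here `llm` and `classify` are injected callables standing in for the real model endpoints (assumed interface: `classify(text)` returns `"safe"` or `"unsafe"`); this is not an official API:

```python
def guarded_chat(user_msg: str, llm, classify) -> str:
    """Run a dual safety checkpoint around an LLM call."""
    # Pre-processing checkpoint: catch harmful or injected inputs
    # before they ever reach the LLM.
    if classify(user_msg) != "safe":
        return "[blocked] input failed safety classification"
    reply = llm(user_msg)
    # Post-processing checkpoint: withhold harmful model generations.
    if classify(reply) != "safe":
        return "[withheld] output failed safety classification"
    return reply
```

Keeping the classifier behind a plain callable makes it easy to swap the 8B model for a distilled variant on high-throughput paths.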

Llama Guard's on-premises deployment capability makes it particularly valuable for APAC regulated industries. Financial institutions, healthcare organizations, and government entities deploying LLMs internally can run Llama Guard locally to classify content without sending conversation data to third-party content moderation cloud APIs, and compliance teams can log and classify all potentially harmful interactions for audit trail purposes under APAC AI governance frameworks.
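For the audit trail, each classification can be persisted as one append-only JSON line. A sketch with illustrative field names (hashing the text so the log never stores raw conversation content is a design choice, not a Llama Guard requirement):

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(role: str, text: str, verdict: str, categories: list) -> str:
    """Build one JSON line recording a safety classification."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "role": role,                  # "user" (input check) or "assistant" (output check)
        "text_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "verdict": verdict,            # "safe" | "unsafe"
        "categories": categories,      # e.g. ["S6"] for specialized advice
    })
```

Storing only a hash plus the verdict keeps the log useful for audits while limiting the sensitive data it holds.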
