What it does

Key features

100+ probes: APAC prompt injection/jailbreak/toxicity/hallucination automated testing
RAG security: APAC indirect prompt injection probes for RAG document poisoning
Scan reports: APAC structured vulnerability evidence for AI governance review
Multi-model: APAC GPT/Claude/Llama/custom APAC-language model compatibility
Before/after: APAC quantified safety improvement measurement post-mitigation
NVIDIA maintained: APAC active development with new probes for emerging attack vectors

When to reach for it

Best for

APAC AI engineering teams performing pre-deployment safety assessment of LLM applications — particularly APAC organizations with AI governance review processes that require objective vulnerability evidence, and teams building RAG applications that need indirect prompt injection testing to ensure retrieved document content cannot compromise LLM application safety.

Don't get burned

Limitations to know

! APAC probe coverage focuses on English attack patterns — APAC-language jailbreaks require custom probe development
! APAC false positives require manual review — not all flagged outputs represent real safety failures
! APAC scan time scales with number of probes × model API calls — full scans take hours on large probe sets

Context

About garak

Garak (Generative AI Red-teaming and Assessment Kit) is an open-source LLM vulnerability scanner maintained by NVIDIA that provides APAC AI engineering teams with a framework for systematically probing LLM applications for failure modes — running 100+ automated attack probes covering prompt injection, jailbreaking, toxic content generation, hallucination, data leakage, and model extraction attacks. APAC teams using garak establish an objective, reproducible baseline of LLM safety posture before deploying in production, analogous to how APAC security teams run vulnerability scanners against web applications before release.

Garak's probe library covers the major categories of LLM failure modes documented by AI safety researchers — direct prompt injection (bypassing system prompt instructions), indirect prompt injection (malicious instructions embedded in retrieved content for RAG applications), encoding-based jailbreaks (base64, leetspeak, emoji encoding to bypass filters), persona-based jailbreaks (roleplay scenarios that elicit harmful outputs), and toxicity triggers (probing for racial slurs, hate speech, and harmful content generation). APAC teams building RAG applications over enterprise documents use garak's indirect prompt injection probes to verify that malicious content embedded in retrieved documents cannot hijack their LLM application.

Garak produces structured scan reports showing which probes triggered successful attacks, the failure rate for each probe category, and which specific attack variations succeeded — enabling APAC engineering teams to prioritize safety mitigations based on actual vulnerability evidence rather than theoretical risk. APAC AI governance teams include garak scan results in LLM deployment risk assessments submitted to internal security review boards and external regulators.

Garak integrates with major LLM APIs and locally deployed models — APAC teams run garak against OpenAI GPT-4o, Anthropic Claude, locally deployed Llama, and custom fine-tuned APAC-language models. Comparing garak probe results before and after implementing safety mitigations (adding Llama Guard, NeMo Guardrails, or custom system prompt hardening) provides APAC teams with quantified evidence of safety improvement.

garak

Key features

Best for

Limitations to know

About garak

Where this category meets practice depth.