AIMenta

Guidance AI

by Microsoft

LLM constrained generation framework — interleaving prompt templates and code to precisely control LLM output structure, enabling APAC teams to guarantee valid JSON, decision branches, and typed responses from local and API-based LLMs.

AIMenta verdict
Decent fit
4/5

"Constrained LLM generation — APAC AI engineers use Guidance to write interleaved prompts and code that constrain LLM outputs to valid structures, enabling reliable JSON extraction, decision trees, and structured data generation in APAC LLM applications."

What it does

Key features

  • Token-level constraints: structurally valid JSON/regex output at generation time, not via post-processing
  • Interleaved programs: prompt, code, and generation in a unified template syntax
  • Local LLM support: llama.cpp/vLLM/Transformers backends for APAC data sovereignty
  • Select constraints: predefined option sets for LLM choice points
  • Decision trees: conditional generation branching on LLM outputs
  • Open-source: MIT licensed for commercial deployment and modification
When to reach for it

Best for

  • AI engineers building structured data extraction and document processing pipelines who need guaranteed output-format compliance — particularly APAC financial services, legal, and healthcare teams extracting specific fields from unstructured documents where JSON parsing failures are unacceptable.
Don't get burned

Limitations to know

  • ! Token-level constraints only work with supported local backends — constraints over the OpenAI API are approximate
  • ! Steeper learning curve than plain prompt engineering for developers new to constrained generation
  • ! Program complexity grows for multi-step generation with many interleaved constraints
Context

About Guidance AI

Guidance AI is a Microsoft open-source framework for constrained LLM generation — allowing APAC developers to write programs that interleave natural language prompts with Python code, branching logic, and output constraints that the LLM must satisfy. Unlike prompt engineering (which asks the LLM to produce valid JSON), Guidance constrains the LLM's token sampling to only produce tokens that remain valid according to the specified structure.

Guidance's generation constraints work at the token level — when generating JSON, Guidance ensures the LLM only produces tokens that are valid continuations of a JSON string, making malformed JSON structurally impossible rather than just unlikely. This token-level constraint is the key difference from output parsers that try to fix LLM JSON after generation: Guidance produces valid structure on the first pass, eliminating retry logic and error handling for malformed outputs.
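The token-masking idea can be sketched without the library. In this illustrative toy (none of the names below — `VOCAB`, `ALLOWED`, `mock_scores` — come from the Guidance API), a mock scoring function stands in for LLM logits, and at each step any token that would break the target structure is masked out before selection:

```python
import json

# Toy token vocabulary; 'neutral' and 'maybe' are not allowed by the schema.
VOCAB = ['{"sentiment": "', 'positive', 'negative', 'neutral', '"}', 'maybe']

# The target structure, expressed here as the set of acceptable full outputs.
ALLOWED = ['{"sentiment": "positive"}', '{"sentiment": "negative"}']

def mock_scores(prefix):
    # Stand-in for LLM logits: naively prefers 'neutral', which the schema forbids.
    return {tok: (2.0 if tok == 'neutral' else 1.0) for tok in VOCAB}

def constrained_generate():
    out = ''
    while out not in ALLOWED:
        scores = mock_scores(out)
        # Token-level mask: keep only tokens that leave `out` a prefix of an allowed string.
        valid = [t for t in VOCAB if any(a.startswith(out + t) for a in ALLOWED)]
        # Greedy pick among the surviving tokens.
        out += max(valid, key=lambda t: scores[t])
    return out

result = constrained_generate()
json.loads(result)  # always parses: malformed output is structurally impossible
```

Even though the mock model scores the forbidden token highest, the mask guarantees the final string is one of the allowed JSON objects — the same property Guidance enforces against a real model's logits.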

Guidance programs use a template syntax that mixes static text, `{{gen}}` LLM generation blocks, and `{{select}}` constrained choice blocks — teams write decision trees where the LLM fills in choices from a predefined option set, generates text within validated format constraints, or branches based on LLM outputs. This enables APAC structured data extraction pipelines where the LLM populates a specific schema without post-processing.
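A `{{select}}`-style decision tree can be mimicked in plain Python: each step restricts the "model" to a predefined option set, and control flow branches on the constrained result. `keyword_pick` is a deterministic stand-in for the LLM, and all function names here are illustrative, not Guidance APIs:

```python
def select(text, options, pick):
    # {{select}}-style constraint: the answer must be one of `options`.
    choice = pick(text, options)
    assert choice in options
    return choice

def keyword_pick(text, options):
    # Mock LLM: returns the first option mentioned in the text, else the first option.
    for opt in options:
        if opt.replace("_", " ") in text.lower():
            return opt
    return options[0]

def route_ticket(ticket, pick=keyword_pick):
    # Decision tree: the first constrained choice decides which option set comes next.
    category = select(ticket, ["billing", "technical"], pick)
    if category == "billing":
        action = select(ticket, ["refund", "invoice_copy"], pick)
    else:
        action = select(ticket, ["restart", "escalate"], pick)
    return {"category": category, "action": action}

route_ticket("Please send an invoice copy for my billing account")
# → {"category": "billing", "action": "invoice_copy"}
```

Because every step is a choice from a closed set, downstream code can branch on the result without defensive parsing — the same guarantee `{{select}}` provides when the picker is a real LLM.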

Guidance supports local LLM backends (llama.cpp, vLLM, Transformers) and API providers (OpenAI, Anthropic) — APAC teams running on-premise LLMs for data sovereignty use Guidance with local models for fully constrained generation without cloud API calls. For APAC financial-services teams extracting structured data from regulatory documents, Guidance eliminates the JSON parsing failure mode that affects prompt-based extraction approaches.
