What it does

Key features

Token-level constraints: the LLM cannot generate tokens that violate the JSON or regex grammar
Zero parse errors: constrained output is guaranteed to match the declared format
JSON schema enforcement: required fields, enums, and types for extraction tasks
Regex patterns: constrain generation to phone numbers, dates, IDs, and other fixed formats
Local model support: llama.cpp and transformers integration for self-hosted inference
Template syntax: generation templates that interleave fixed text, model output, and code (Handlebars-style in early releases)
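To make the regex feature concrete, here is a minimal sketch of the kind of patterns a team might constrain generation to. It uses only Python's standard `re` module and checks strings after the fact; it does not call Guidance. The pattern names and formats (`sg_phone`, `invoice_id`) are illustrative assumptions, not formats taken from the library.

```python
import re

# Illustrative patterns only (assumed formats); real ones vary by country and system.
PATTERNS = {
    "iso_date": r"\d{4}-\d{2}-\d{2}",    # e.g. 2024-03-15
    "sg_phone": r"\+65 \d{4} \d{4}",     # e.g. +65 6123 4567
    "invoice_id": r"INV-\d{6}",          # e.g. INV-004211
}


def conforms(kind: str, text: str) -> bool:
    """Return True if the whole string matches the named pattern."""
    return re.fullmatch(PATTERNS[kind], text) is not None


if __name__ == "__main__":
    for kind, sample in [("iso_date", "2024-03-15"), ("iso_date", "15 March 2024")]:
        print(kind, repr(sample), conforms(kind, sample))
```

With constrained generation the same pattern is applied while the model writes, so a non-matching string is never produced in the first place.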

When to reach for it

Best for

APAC ML teams building structured extraction or generation pipelines where parse errors are unacceptable, particularly for document processing, medical record parsing, and financial data extraction.

Don't get burned

Limitations to know

! Token-level constraints require local models or logprob API access; teams using the OpenAI API get soft enforcement only
! Grammar definitions get complex for teams that need custom structured outputs beyond JSON
! Less active development than PydanticAI or Instructor for structured output use cases

Context

About Guidance

Guidance is a Microsoft open-source library that constrains LLM generation at the token level. Rather than prompting the LLM to return JSON and hoping the output parses, Guidance enforces a grammar or schema during generation, so the model cannot produce tokens that violate the target structure. This eliminates the parse-error retry loop that plagues structured generation pipelines.

Guidance's constraint approach works by intersecting the LLM's token probability distribution with a mask derived from the current schema state: at each generation step, only tokens that could be part of a valid output given the current position are permitted. For a JSON schema like `{"name": string, "age": integer}`, after generating `{"name": "`, only tokens that continue or close a string are allowed; after the closing `"` and `,`, only `"age"` is allowed as the next key.
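The masking step can be sketched with a toy example. This is not Guidance's implementation: the five-token vocabulary, the hand-written prefix checker, and the restriction of names to lowercase letters are all simplifying assumptions made so the idea fits in a few lines of standard-library Python.

```python
import math

# Toy grammar for {"name": "<letters>", "age": <digits>}, split into
# fixed literals ("lit") and one-or-more character slots ("cls").
SEGMENTS = [
    ("lit", '{"name": "'),
    ("cls", "abcdefghijklmnopqrstuvwxyz"),
    ("lit", '", "age": '),
    ("cls", "0123456789"),
    ("lit", "}"),
]


def is_valid_prefix(text: str) -> bool:
    """Return True if text is a complete output or can be extended into one."""
    pos = 0
    for kind, value in SEGMENTS:
        if kind == "lit":
            remaining = text[pos:]
            if len(remaining) < len(value):
                # Text ends inside this literal: valid only if it matches so far.
                return value.startswith(remaining)
            if not remaining.startswith(value):
                return False
            pos += len(value)
        else:
            start = pos
            while pos < len(text) and text[pos] in value:
                pos += 1
            if pos == len(text):
                return True   # text ends inside the slot; more can follow
            if pos == start:
                return False  # slot needs at least one character
    return pos == len(text)   # no trailing text after the closing brace


def mask_logits(prefix: str, logits: dict) -> dict:
    """Set the score of every grammar-breaking token to -inf."""
    return {
        tok: (score if is_valid_prefix(prefix + tok) else -math.inf)
        for tok, score in logits.items()
    }


def greedy_step(prefix: str, logits: dict) -> str:
    """Pick the highest-scoring token that keeps the output valid."""
    masked = mask_logits(prefix, logits)
    return max(masked, key=masked.get)


if __name__ == "__main__":
    # Hypothetical raw scores: the model "wants" to say "Sure" first.
    logits = {"Sure": 5.0, "ann": 2.0, '"': 1.0, "42": 0.5, '", "age": ': 0.1}
    print(greedy_step('{"name": "', logits))
```

In the demo the unconstrained favourite is `Sure`, but it would break the string slot, so the mask removes it and the next-best valid token wins. A real implementation works over the model's full tokenizer vocabulary and a compiled grammar rather than a hand-written checker.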

For APAC teams building information extraction pipelines (extracting structured entities from documents, parsing medical records, extracting financial data from reports), Guidance provides a 100% parse success rate on the constrained fields: the LLM cannot generate malformed JSON, omit required fields, or produce out-of-enum values for categorical fields. The guarantee covers format only; the extracted values themselves can still be wrong.
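The three failure classes named above (missing required fields, wrong types, out-of-enum values) are what an unconstrained pipeline has to check for after generation. A minimal sketch of such a check follows; the schema layout and the field names (`patient_id`, `triage`) are hypothetical and chosen only for illustration, and this is a hand-rolled check rather than a JSON Schema validator.

```python
# Hypothetical schema description for a medical-record extraction task.
SCHEMA = {
    "required": ["patient_id", "triage"],
    "types": {"patient_id": str, "age": int, "triage": str},
    "enums": {"triage": ("low", "medium", "high")},
}


def violations(record: dict, schema: dict = SCHEMA) -> list:
    """List every way the record breaks the schema (empty list means valid)."""
    problems = []
    for field in schema["required"]:
        if field not in record:
            problems.append(f"missing required field: {field}")
    for field, expected in schema["types"].items():
        if field in record and not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}")
    for field, allowed in schema["enums"].items():
        if field in record and record[field] not in allowed:
            problems.append(f"value not in enum for {field}")
    return problems


if __name__ == "__main__":
    print(violations({"patient_id": "P-001", "age": 54, "triage": "high"}))
    print(violations({"age": "54", "triage": "urgent"}))
```

With token-level constraints, records like the second one are never generated, so a check of this kind becomes a safety net rather than a retry trigger.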

Guidance supports local models (via llama.cpp and transformers) as well as API-based LLMs (GPT-4, Claude), though token-level constraint is only possible with local models or API endpoints that expose token logprobs. For teams using GPT-4 via the OpenAI API without logprobs access, Guidance falls back to output parsing with retry, losing the guarantee but keeping the ergonomic template syntax.
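For comparison, the parse-and-retry pattern that applies without token-level constraints looks roughly like the sketch below. It is a generic illustration, not Guidance's internal fallback: `call_model` is a hypothetical callable standing in for any API client, and the retry limit is an arbitrary assumption.

```python
import json


def extract_with_retry(call_model, prompt, required, max_attempts=3):
    """Call the model until its reply parses as a JSON object with all required keys.

    call_model is a hypothetical function: prompt string in, raw reply string out.
    Returns (parsed_dict, attempts_used); raises ValueError if every attempt fails.
    """
    last_error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc}"
            continue
        if not isinstance(data, dict):
            last_error = "reply is not a JSON object"
            continue
        missing = [key for key in required if key not in data]
        if not missing:
            return data, attempt
        last_error = f"missing keys: {missing}"
    raise ValueError(f"no valid output after {max_attempts} attempts ({last_error})")


if __name__ == "__main__":
    # Fake model for the demo: two bad replies, then a good one.
    replies = iter([
        "Sure! Here is the JSON:",
        '{"name": "Ann"}',
        '{"name": "Ann", "age": 42}',
    ])
    result, attempts = extract_with_retry(lambda p: next(replies), "extract", ["name", "age"])
    print(result, attempts)
```

Each failed attempt costs another model call, and the loop can still exhaust its budget, which is the trade-off behind "losing the guarantee".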
