Key features
- Token-level constraints: the LLM cannot generate tokens that violate the JSON or regex grammar
- Zero parse errors: constrained output is guaranteed to match the structured format
- JSON schema enforcement: required fields, enums, and types for extraction tasks
- Regex patterns: constrain generation to phone numbers, dates, IDs, and other fixed formats
- Local model support: llama.cpp and transformers integration for APAC teams running self-hosted inference
- Template syntax: Handlebars-style generation templates with interleaved code
Best for
- APAC ML teams building structured extraction or generation pipelines where parse errors are unacceptable, particularly document processing, medical record parsing, and financial data extraction.
Limitations to know
- Token-level constraints require local models or logprob API access; teams using the OpenAI API get soft enforcement only
- Defining grammars becomes complex for teams that need custom structured outputs beyond JSON
- Less actively developed than PydanticAI or Instructor for structured output use cases
About Guidance
Guidance is a Microsoft open-source library that constrains LLM generation at the token level. Rather than prompting the LLM to return JSON and hoping the output parses, Guidance enforces a grammar or schema during generation, so the model cannot emit tokens that violate the required structure. This eliminates the parse-error retry loop that plagues structured generation pipelines.
Guidance's constraint approach works by intersecting the LLM's token probability distribution with a mask derived from the current schema state: at each generation step, only tokens that could be part of a valid output given the current position are permitted. For a JSON schema with `{"name": string, "age": integer}`, after generating `{"name": "`, only tokens that continue or close a string are allowed; after the closing `"` and `,`, only `"age"` is allowed as the next key.
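The masking step described above can be sketched with a toy vocabulary. This is a minimal illustration, not Guidance's implementation: the vocabulary, the fixed logits, and the names `STEPS`, `mask_logits`, and `constrained_decode` are all hypothetical, and the grammar is simplified to one expected token class per position.

```python
import math

# Toy vocabulary (hypothetical); a real tokenizer has tens of thousands of entries.
VOCAB = ['hello', '{', '}', ':', ',', '"name"', '"age"', '"Ana"', '42']

def is_string(tok):
    # A complete JSON string token: starts and ends with a double quote.
    return len(tok) >= 2 and tok[0] == '"' and tok[-1] == '"'

def is_integer(tok):
    return tok.isdigit()

# One predicate per generation step for the schema {"name": string, "age": integer}.
STEPS = [
    lambda t: t == '{',
    lambda t: t == '"name"',
    lambda t: t == ':',
    is_string,
    lambda t: t == ',',
    lambda t: t == '"age"',
    lambda t: t == ':',
    is_integer,
    lambda t: t == '}',
]

def mask_logits(logits, step):
    # Keep the model's score for allowed tokens; set all others to -inf.
    return [l if STEPS[step](tok) else -math.inf for tok, l in zip(VOCAB, logits)]

def constrained_decode(logits_per_step):
    out = []
    for step, logits in enumerate(logits_per_step):
        masked = mask_logits(logits, step)
        best = max(range(len(VOCAB)), key=lambda i: masked[i])  # greedy pick
        out.append(VOCAB[best])
    return ''.join(out)

# Fake model scores: the unconstrained model would always pick 'hello'.
FAKE_LOGITS = [9.0, 1.0, 1.0, 1.0, 1.0, 2.0, 1.5, 3.0, 0.5]
result = constrained_decode([FAKE_LOGITS] * len(STEPS))
print(result)
```

Even though the fake model prefers an invalid token at every step, the mask leaves only grammar-valid candidates, so the decoded text parses as JSON. A production implementation tracks parser state over partial tokens rather than a fixed per-position list.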
For APAC teams building information extraction pipelines (extracting structured entities from documents, parsing medical records, extracting financial data from reports), Guidance provides a 100% parse success rate on the constrained fields: the LLM cannot generate malformed JSON, omit required fields, or produce out-of-enum values for categorical fields.
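To make those three failure modes concrete, the sketch below shows the kind of post-hoc checks an unconstrained pipeline would have to run on every response. It uses only the Python standard library; the schema layout, the field names (`patient_id`, `age`, `triage`), and the `violations` helper are hypothetical and are not part of Guidance.

```python
import json

# Hypothetical schema for an extraction task (not a Guidance data structure).
SCHEMA = {
    "required": ["patient_id", "triage"],
    "properties": {
        "patient_id": {"type": str},
        "age": {"type": int},
        "triage": {"type": str, "enum": ["low", "medium", "high"]},
    },
}

def violations(raw, schema=SCHEMA):
    """Return a list of problems found in a raw model response."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    problems = []
    for field in schema["required"]:
        if field not in record:
            problems.append(f"missing required field: {field}")

    for field, value in record.items():
        rule = schema["properties"].get(field)
        if rule is None:
            continue  # unknown fields are ignored in this sketch
        # bool is a subclass of int in Python, so reject it for int fields.
        wrong_type = not isinstance(value, rule["type"]) or (
            rule["type"] is int and isinstance(value, bool)
        )
        if wrong_type:
            problems.append(f"wrong type for {field}")
        elif "enum" in rule and value not in rule["enum"]:
            problems.append(f"value not in enum for {field}: {value!r}")
    return problems

print(violations('{"patient_id": "p1", "triage": "urgent"}'))
```

With token-level constraints in force, each of these checks should come back empty for the constrained fields, which is what removes the need for a retry loop.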
Guidance supports local models (via llama.cpp and transformers) as well as API-based LLMs (GPT-4, Claude), though token-level constraint is only possible with local models or API endpoints that expose token logprobs. For teams using GPT-4 via the OpenAI API without logprobs access, Guidance falls back to output parsing with retry, which loses the guarantee but keeps the ergonomic template syntax.
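The parse-and-retry fallback can be pictured with a generic loop like the one below. This is a sketch of the general pattern, not Guidance's internal code: `generate_with_retry` and the `call_llm` callable are hypothetical stand-ins for whatever client function sends a prompt and returns text.

```python
import json

def generate_with_retry(call_llm, prompt, required, max_attempts=3):
    """Call the model until the output parses as a JSON object with all required keys.

    `call_llm` is a hypothetical callable: prompt string in, raw text out.
    """
    last_error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"attempt {attempt}: invalid JSON ({exc.msg})"
            continue
        if not isinstance(data, dict):
            last_error = f"attempt {attempt}: top-level value is not an object"
            continue
        missing = [key for key in required if key not in data]
        if not missing:
            return data
        last_error = f"attempt {attempt}: missing fields {missing}"
    # Unlike token-level constraints, this path can still fail outright.
    raise ValueError(f"no valid output after {max_attempts} attempts; {last_error}")
```

Each failed attempt costs another model call, and the loop can still exhaust its budget, which is the guarantee that is lost without token-level enforcement.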