AIMenta

Guidance

by Microsoft

Microsoft open-source library for constrained LLM generation. It holds LLM outputs to exact JSON schemas, regex patterns, and context-free grammars at the token level for reliable structured generation.

AIMenta verdict
Decent fit
4/5

"Structured LLM generation: APAC ML teams use Microsoft Guidance to constrain LLM outputs to exact grammars, JSON schemas, and regex patterns, eliminating parsing errors in structured generation tasks."

What it does

Key features

  • Token-level constraints: the LLM cannot generate tokens that violate the JSON schema, regex, or grammar
  • Zero parse errors: output is guaranteed to match the specified format while token-level constraints are active
  • JSON schema enforcement: required fields, enums, and types for extraction tasks
  • Regex patterns: constrain generation to phone numbers, dates, IDs, and other fixed formats
  • Local model support: llama.cpp and transformers integration for self-hosted inference
  • Template syntax: Handlebars-style generation templates with interleaved code
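
To make the regex-pattern bullet concrete, here is a minimal sketch of pattern-constrained decoding at the character level. It does not use the Guidance API: `DATE_TEMPLATE`, `allowed_next_chars`, `constrained_decode`, and `fake_propose` are hypothetical names invented for this illustration, and the "model" is a stub that prefers slashes where the pattern requires hyphens.

```python
import string

# Hypothetical fixed-width pattern for YYYY-MM-DD:
# 'd' means "any digit", every other character is a literal.
DATE_TEMPLATE = "dddd-dd-dd"


def allowed_next_chars(prefix, template=DATE_TEMPLATE):
    """Return the set of characters that keep `prefix` valid for the template."""
    if len(prefix) >= len(template):
        return set()  # pattern is complete, nothing more may be emitted
    slot = template[len(prefix)]
    return set(string.digits) if slot == "d" else {slot}


def fake_propose(prefix):
    """Stand-in for a model: ranks characters, most preferred first.

    It "wants" to write 2024/03/15, which violates the pattern.
    """
    target = "2024/03/15"
    return target[len(prefix)] + "-" + string.digits


def constrained_decode(propose, template=DATE_TEMPLATE):
    """Pick the highest-ranked character that the pattern permits at each step."""
    out = ""
    while len(out) < len(template):
        allowed = allowed_next_chars(out, template)
        out += next(ch for ch in propose(out) if ch in allowed)
    return out


print(constrained_decode(fake_propose))  # 2024-03-15
```

The stub's preferred slashes are never emitted because they are not in the allowed set at those positions. A real implementation works on tokens and full regular expressions, not on a fixed-width template.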
When to reach for it

Best for

  • APAC ML teams building structured extraction or generation pipelines where parse errors are unacceptable, particularly for document processing, medical record parsing, and financial data extraction.
Don't get burned

Limitations to know

  • Token-level constraints require local models or logprob API access, so OpenAI API users get soft enforcement only
  • Defining grammars is complex for teams that need custom structured outputs beyond JSON
  • Less actively developed than PydanticAI or Instructor for structured output use cases
Context

About Guidance

Guidance is a Microsoft open-source library that constrains LLM generation at the token level. Rather than prompting the LLM to return JSON and hoping the output parses, Guidance enforces a grammar or schema during generation, so the LLM cannot produce tokens that violate the required structure. This eliminates the parse-error retry loop that plagues structured generation pipelines.

Guidance's constraint approach works by intersecting the LLM's token probability distribution with a mask derived from the current schema state. At each generation step, only tokens that could be part of a valid output given the current position are permitted. For a JSON schema with `{"name": string, "age": integer}`, after generating `{"name": "`, only tokens that continue or close a string are allowed; after the closing `"` and `,`, only `"age"` is allowed as the next key.
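
The masking step can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Guidance's implementation: the vocabulary, the finite `VALID_OUTPUTS` list standing in for a real grammar, and the `fake_logits` stub are all hypothetical. The stub deliberately gives its highest score to an invalid token to show that the mask overrides it.

```python
import json
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{"name": "', "Ann", "Bob", '", "age": ', '", "city": ', "42", "7", "}", "oops"]

# Assumption: a finite list of valid outputs stands in for a schema or grammar.
VALID_OUTPUTS = [
    '{"name": "Ann", "age": 42}',
    '{"name": "Bob", "age": 7}',
]


def allowed_mask(prefix):
    """True for each token that keeps prefix + token a prefix of some valid output."""
    return [any(v.startswith(prefix + tok) for v in VALID_OUTPUTS) for tok in VOCAB]


def fake_logits(prefix):
    """Stand-in for a model: fixed scores that favour 'oops' and the 'city' key."""
    return [1.0, 0.5, 2.0, 1.0, 3.0, 0.2, 1.5, 1.0, 5.0]


def constrained_greedy(max_steps=10):
    """Greedy decoding where disallowed tokens get a score of -infinity."""
    out = ""
    for _ in range(max_steps):
        mask = allowed_mask(out)
        if not any(mask):
            break  # no token can extend the output: generation is complete
        scores = [s if ok else -math.inf for s, ok in zip(fake_logits(out), mask)]
        best = max(range(len(VOCAB)), key=scores.__getitem__)
        out += VOCAB[best]
    return out


result = constrained_greedy()
print(result)              # {"name": "Bob", "age": 7}
print(json.loads(result))  # parses without error
```

Even though the stub scores `oops` and the `city` key highest, neither can be selected, because at no step do they extend a valid prefix.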

For APAC teams building information extraction pipelines (extracting structured entities from documents, parsing medical records, extracting financial data from reports), Guidance provides a 100% parse success rate on the constrained fields: the LLM cannot generate malformed JSON, omit required fields, or emit out-of-enum values for categorical fields.

Guidance supports local models (via llama.cpp and transformers) as well as API-based LLMs (GPT-4, Claude). Token-level constraints, however, are only possible with local models or API endpoints that expose token logprobs. For APAC teams using GPT-4 through the OpenAI API without logprobs access, Guidance falls back to output parsing with retry, which loses the guarantee but keeps the ergonomic template syntax.
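
For contrast, the parse-and-retry pattern looks roughly like the sketch below. It is a generic illustration, not Guidance's own fallback code: `generate_with_retry`, `call_model`, and `fake_model` are hypothetical names, and the stub model returns canned strings in place of a real API call.

```python
import json


def generate_with_retry(call_model, required_keys, max_attempts=3):
    """Call the model until its output parses as a JSON object with all required keys.

    Returns (parsed_object, attempts_used); raises ValueError if every attempt fails.
    """
    last_problem = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        raw = call_model(attempt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_problem = f"malformed JSON: {exc.msg}"
            continue
        if not isinstance(data, dict):
            last_problem = "top-level value is not an object"
            continue
        missing = [key for key in required_keys if key not in data]
        if missing:
            last_problem = f"missing keys: {missing}"
            continue
        return data, attempt
    raise ValueError(f"gave up after {max_attempts} attempts ({last_problem})")


def fake_model(attempt):
    """Stand-in for an API call: two bad responses, then a good one."""
    responses = {
        1: '{"name": "Ann", ',           # truncated, does not parse
        2: '{"name": "Ann"}',            # parses, but "age" is missing
        3: '{"name": "Ann", "age": 42}',
    }
    return responses[attempt]


data, attempts = generate_with_retry(fake_model, ["name", "age"])
print(data, attempts)  # {'name': 'Ann', 'age': 42} 3
```

Each failed attempt costs a full model call, and the loop can still exhaust its attempts without a valid result. Token-level constraints remove that failure mode.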
