AIMenta
Acronym · Intermediate · Generative AI

Chain of Thought (CoT)

Prompting an LLM to produce intermediate reasoning steps before a final answer — reliably improves accuracy on multi-step reasoning tasks.

Chain of Thought (CoT) prompting instructs a language model to produce intermediate reasoning steps before stating a final answer. The canonical Wei et al. (2022) paper showed that including worked examples with explicit reasoning dramatically improved LLM accuracy on arithmetic, commonsense, and symbolic reasoning tasks, especially for larger models where the capability emerges; Kojima et al. (2022) later showed that simply appending "Let's think step by step" recovers much of the benefit with no examples at all. The mechanism is not magic: forcing the model to produce reasoning tokens allows more computation to happen before the answer token, and lets the model condition its answer on its own partial work.
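The two prompting styles above can be sketched as simple prompt construction. This is illustrative only: the helper names, the trigger phrase placement, and the pen-pricing example are assumptions, not tied to any particular model or API.

```python
# Minimal sketch of CoT prompt construction. The helpers and the example
# task are hypothetical; any chat or completion API could consume these
# prompt strings.

ZERO_SHOT_TRIGGER = "Let's think step by step."

def zero_shot_cot(question: str) -> str:
    """Append the zero-shot CoT trigger phrase to a bare question."""
    return f"Q: {question}\nA: {ZERO_SHOT_TRIGGER}"

def few_shot_cot(question: str, worked_examples: list[tuple[str, str]]) -> str:
    """Prefix worked examples whose answers spell out explicit reasoning."""
    demos = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in worked_examples)
    return f"{demos}\n\nQ: {question}\nA:"

example = (
    "A shop sells pens at RM2 each. How much do 3 pens cost?",
    "Each pen costs RM2. 3 pens cost 3 x RM2 = RM6. The answer is RM6.",
)
prompt = few_shot_cot("How much do 5 pens cost?", [example])
```

The few-shot prompt ends at `A:` so the model's continuation begins with reasoning in the same style as the demonstrations.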

The pattern has generated an ecosystem. **Zero-shot CoT** — the minimal "think step by step" phrasing — is the cheapest entry point. **Few-shot CoT** — the prompt contains worked examples with explicit reasoning — usually outperforms zero-shot on structured tasks. **Self-consistency** — sample multiple reasoning chains and take a majority vote on the final answer — significantly improves accuracy at modest cost. **Tree of Thoughts** and similar search-over-reasoning approaches branch and prune over reasoning paths for harder problems. **Reasoning-specialised models** (OpenAI o1 / o3, Claude 3.7 Sonnet with extended thinking, DeepSeek R1, Gemini Thinking) bake extended chain-of-thought into the model itself via reinforcement-learning fine-tuning; these models produce long internal reasoning tokens before responding, at a real cost in inference spend and latency.

For APAC mid-market teams, CoT is the single most reliable prompt engineering pattern for tasks that involve multi-step reasoning — calculations, ambiguous policy interpretations, multi-criteria decisions. The rule: **if a task requires thinking, ask the model to think**. The cost is additional output tokens and latency; the benefit is often a 10-30 percentage-point accuracy lift on reasoning benchmarks and, more importantly, traces that let you debug wrong answers by reading the reasoning chain.
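Those debuggable traces are easiest to get if the prompt asks the model to close with a fixed delimiter, so reasoning and answer can be split mechanically. A minimal sketch; the "Final answer:" delimiter is a convention we assume in the prompt, not a model feature, and the policy example is hypothetical:

```python
# Sketch: separate the reasoning trace from the final answer so wrong
# answers can be debugged by reading the chain. Assumes the prompt asked
# the model to finish with a "Final answer:" line.

def split_trace(output: str, delimiter: str = "Final answer:") -> tuple[str, str]:
    """Return (reasoning, answer); keep the reasoning in debug logs."""
    reasoning, _, answer = output.rpartition(delimiter)
    return reasoning.strip(), answer.strip()

output = (
    "The policy covers claims filed within 30 days. "
    "This claim was filed on day 28, so it qualifies.\n"
    "Final answer: covered"
)
reasoning, answer = split_trace(output)  # answer == "covered"
```

When the answer is wrong, the stored `reasoning` string shows which step failed, which is often more actionable than the wrong answer alone.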

The non-obvious operational note: **CoT reasoning is not always faithful to the underlying computation**. The model can produce a plausible-looking reasoning chain while its actual answer comes from pattern completion, and the chain is a post-hoc rationalisation. Interpretability research has found cases where perturbations to the chain do not change the answer, suggesting the chain was ornamental rather than causal. Treat CoT output as a helpful signal for debugging, not as a faithful transcript of how the model decided.

Where AIMenta applies this

Service lines where this concept becomes a deliverable for clients.

Beyond this term

Where this concept ships in practice.

Encyclopedia entries name the moving parts. The links below show where AIMenta turns these concepts into engagements — across service pillars, industry verticals, and Asian markets.
