Category · 23 terms
Generative AI, defined clearly.
Foundation models, LLMs, diffusion, and the creative output stack.
Chain of Thought (CoT)
Prompting an LLM to produce intermediate reasoning steps before a final answer — reliably improves accuracy on multi-step reasoning tasks.
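A minimal sketch of the idea in Python: a few-shot exemplar demonstrates step-by-step reasoning, then the real question is appended with the same cue. The exemplar and question below are invented for illustration, and in practice the prompt would be sent to an LLM API.

```python
# Chain-of-thought prompting sketch: show the model one worked example with
# explicit intermediate steps, then cue it to reason the same way.

def build_cot_prompt(question: str) -> str:
    # Hypothetical exemplar; real few-shot sets are task-specific.
    exemplar = (
        "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
        "A: Let's think step by step. Speed = distance / time = 60 / 1.5 "
        "= 40 km/h. The answer is 40 km/h.\n\n"
    )
    return exemplar + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt("If 3 pens cost $4.50, how much do 7 pens cost?")
print(prompt)
```

The trailing "Let's think step by step." cue is the zero-shot variant of the same technique; combining it with a worked exemplar is a common pattern.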
Claude
Anthropic's family of large language models, known for long-context reasoning, instruction-following, and Constitutional-AI safety training.
Code Assistant
An LLM-powered helper that lives inside the developer's IDE — completing code, generating tests, explaining errors, and proposing refactors without taking over authorship.
Code Generation
AI-assisted authoring of source code — from inline autocomplete in an IDE to full-app scaffolding from a prompt.
Context Window
The maximum number of tokens a language model can process in a single inference call — its working memory for the current task.
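A hedged sketch of what the limit means in practice: a chat application must keep its prompt under the window, typically by dropping the oldest turns. The 4-characters-per-token heuristic below is a rough rule of thumb for English text; a real tokenizer should be used in production.

```python
# Context-window budgeting sketch: drop oldest messages until the rest fit.

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_to_window(messages, max_tokens):
    kept, used = [], 0
    for msg in reversed(messages):        # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > max_tokens:
            break                         # oldest messages fall out first
        kept.append(msg)
        used += cost
    return list(reversed(kept))           # restore chronological order

history = ["a" * 400, "b" * 400, "c" * 400]      # ~100 tokens each
print(trim_to_window(history, max_tokens=250))   # keeps the two newest
```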
Diffusion Model
A generative-model architecture that learns to reverse a noise-adding process — used in Stable Diffusion, DALL-E, Midjourney, and most modern image/video generators.
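The "noise-adding process" has a simple closed form in the DDPM formulation: a noised sample is a weighted mix of the clean signal and Gaussian noise, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1−ᾱ_t)·ε. A minimal sketch of that forward step (the generator learns the reverse direction):

```python
import math
import random

def forward_diffuse(x0, alpha_bar_t, rng=None):
    # DDPM closed-form forward process: blend signal with Gaussian noise.
    # alpha_bar_t = 1.0 means no noise; alpha_bar_t = 0.0 means pure noise.
    rng = rng or random.Random(0)
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]

clean = [1.0, -0.5, 0.25]
print(forward_diffuse(clean, alpha_bar_t=0.5))  # half-noised sample
```

Training teaches a network to predict (and remove) the added noise; sampling then runs many such removal steps starting from pure noise.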
Foundation Model
A large model pretrained on broad data at scale that can be adapted to many downstream tasks — the base layer under modern generative AI.
Frontier Model
The most capable foundation models at the leading edge of the AI capability curve — typically the top offerings from OpenAI, Anthropic, Google, and their peers.
Gemini
Google DeepMind's multimodal LLM family — Gemini 1.0/1.5/2.0/2.5 — natively trained on text, images, audio, and video together.
GPT (Generative Pre-trained Transformer)
OpenAI's family of autoregressive language models — the architecture and product line that defined the modern LLM era.
Hallucination
When an LLM generates content that is fluent and confident but factually wrong — the core reliability concern in production generative AI.
Image Generation
The capability of producing novel images from a prompt, reference image, or structured input — the field encompassing text-to-image, image-to-image, and controllable generation.
Large Language Model (LLM)
A neural network trained on vast text corpora to predict tokens — the technology behind ChatGPT, Claude, Gemini, and the modern generative-AI wave.
Multimodal Model
A model that processes and/or generates multiple data modalities — text, images, audio, video — within a single architecture.
Prompt Engineering
The craft of designing model inputs to elicit reliable, high-quality outputs. Half art, half empirical iteration.
Reasoning Model
A language model trained or instructed to "think" before answering — generating extended intermediate reasoning that lifts performance on math, code, and multi-step problems.
RLHF (Reinforcement Learning from Human Feedback)
A training technique that uses human-labelled preferences to fine-tune a language model's behaviour — the core technique behind ChatGPT's launch-era helpfulness.
Stable Diffusion
An open-weights text-to-image diffusion model released in 2022 by Stability AI — democratised generative image AI and spawned a massive ecosystem.
Temperature (Sampling)
A sampling hyperparameter that rescales the model's output distribution before a token is drawn: low temperature concentrates probability on the likeliest tokens for focused, near-deterministic output, while high temperature flattens the distribution for more diverse, creative output.
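The mechanism is just a division before the softmax. A self-contained sketch with illustrative logits:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Divide logits by T before softmax: T < 1 sharpens the distribution,
    # T > 1 flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                  # hypothetical next-token scores
print(softmax_with_temperature(logits, temperature=0.2))  # sharply peaked
print(softmax_with_temperature(logits, temperature=2.0))  # closer to uniform
```

At the limit T → 0 this reduces to greedy decoding (always the argmax token), which is why temperature 0 is commonly offered as the "deterministic" setting in LLM APIs.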
Text-to-Image
Producing a new image from a natural-language description — the application that made diffusion models a household concept.
Text-to-Speech (TTS)
Synthesising natural-sounding speech from written text — the inverse of automatic speech recognition (ASR), now approaching human parity for major languages.
Text-to-Video
Synthesising short video clips from a text prompt — the frontier modality where quality is rising fast but temporal coherence remains the hard problem.
Video Generation
Synthesising novel moving footage — from short animated clips to minute-long narrative sequences — using diffusion and autoregressive models extended to the temporal dimension.