AIMenta
intermediate · Foundations & History

Optimization

The mathematical discipline of finding parameter values that minimize or maximize an objective: the engine under every trained machine-learning model.

Optimization is the mathematical discipline of finding parameter values that minimize (or maximize) an objective function subject to constraints. It is the engine under every trained machine-learning model: training *is* optimization of the loss function over the parameter space. The discipline spans convex optimization (a unique global minimum, well-behaved, efficient algorithms), non-convex optimization (multiple local minima, deep learning's playground), constrained optimization (Lagrangians, KKT conditions), combinatorial optimization (scheduling, routing, packing), and stochastic optimization (where the objective can only be estimated from random samples).
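The convex case can be made concrete in a few lines. The sketch below runs plain gradient descent on a one-dimensional quadratic (a hypothetical toy objective, not from the entry); because the function is convex, the iterates walk straight to the unique global minimum.

```python
# Gradient descent on the convex quadratic f(x) = (x - 3)^2,
# whose unique global minimum is at x = 3.
def gradient_descent(grad, x0, lr=0.1, steps=200):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # step against the gradient
    return x

# f'(x) = 2(x - 3)
x_star = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

On a non-convex objective the same loop can stall in whichever local minimum the starting point leads to, which is why initialization and restarts matter there.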

The algorithmic landscape splits roughly by smoothness and scale. **First-order methods** (gradient descent and its variants: SGD, Adam, AdamW, Lion) dominate deep learning: they are cheap per step, scale to billions of parameters, and tolerate noisy gradients. **Second-order methods** (Newton, quasi-Newton, L-BFGS) converge in fewer steps but need the Hessian or an approximation to it, which limits them to moderate scale. **Evolutionary and black-box methods** (CMA-ES, Bayesian optimization) handle problems where gradients are unavailable: hyperparameter search, neural-architecture search, non-differentiable simulators. **Convex-optimization solvers** (modeling layers such as CVXPY over commercial solvers like Gurobi, CPLEX, and MOSEK) handle scheduling, portfolio, and operations-research problems with guaranteed-optimal solutions.
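The first-order/second-order trade-off can be illustrated with a minimal sketch (pure Python, hypothetical one-dimensional objective): Newton's method scales its step by the curvature, so on a quadratic it lands on the minimum in a single step, where gradient descent needs many small ones.

```python
# Newton's method in one dimension: step by gradient / Hessian.
# On the quadratic f(x) = (x - 3)^2 one Newton step is exact.
def newton_1d(grad, hess, x0, steps=1):
    x = x0
    for _ in range(steps):
        x -= grad(x) / hess(x)  # curvature-scaled step
    return x

# f'(x) = 2(x - 3), f''(x) = 2
x_star = newton_1d(lambda x: 2.0 * (x - 3.0), lambda x: 2.0, x0=10.0)
```

The catch is the Hessian itself: in d dimensions it is a d-by-d matrix, which is why quasi-Newton methods such as L-BFGS approximate it from gradient history instead of computing it.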

For APAC mid-market teams, optimization shows up in three business contexts beyond model training. **Operations research** (scheduling delivery fleets, optimizing warehouse layouts, balancing production lines) is often handled by commercial solvers and is usually undersold as an AI opportunity. **Hyperparameter tuning** uses Bayesian-optimization libraries (Optuna, Hyperopt, Ray Tune) to search model configurations efficiently. **Decision systems** (pricing, promotion, inventory) frequently reduce to constrained-optimization problems where a mid-sized LP or MILP produces value that complex ML cannot match.
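A decision-system problem of the kind described above can be sketched with `scipy.optimize.linprog`; the product mix and all the numbers here are hypothetical, chosen only to show the shape of a small production-planning LP (maximize profit subject to capacity limits).

```python
from scipy.optimize import linprog

# Toy production-planning LP (hypothetical numbers): maximize profit
# 3x + 5y subject to capacity limits. linprog minimizes, so the
# objective coefficients are negated.
result = linprog(
    c=[-3, -5],                     # maximize 3x + 5y
    A_ub=[[1, 0], [0, 2], [3, 2]],  # capacity constraints
    b_ub=[4, 12, 18],
    bounds=[(0, None), (0, None)],  # non-negative production
)
x, y = result.x                     # optimal plan
```

An LP solver returns a certified optimum, not a heuristic; once integer decisions enter (a MILP), the same modeling carries over but solve times grow.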

The non-obvious advice: **do not reach for gradient-free optimization until you have confirmed gradients are unavailable**. Teams sometimes default to Bayesian optimization or evolutionary search on problems that have perfectly usable gradients, paying the price of slow convergence for no benefit. Conversely, for genuinely black-box problems (hyperparameters, architecture choices, business-rule tuning), gradient-free methods can find good configurations in far fewer evaluations than random or grid search.
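For a genuinely black-box objective, even a minimal gradient-free method is a reasonable baseline. The sketch below is a (1+1) evolution strategy with a simple step-size adaptation rule; the toy objective and the adaptation constants are assumptions for illustration, not from the entry.

```python
import random

# A (1+1) evolution strategy: mutate the current point, keep the
# mutation only if it improves the objective, and adapt step size.
def one_plus_one_es(f, x0, sigma=1.0, steps=500, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, sigma)
        fc = f(candidate)
        if fc < fx:
            x, fx = candidate, fc
            sigma *= 1.1   # grow step size on success
        else:
            sigma *= 0.95  # shrink it on failure
    return x

# The objective is treated as a black box: values only, no gradients.
best = one_plus_one_es(lambda x: (x - 3.0) ** 2, x0=0.0)
```

If the objective does have usable gradients, a first-order method will typically need orders of magnitude fewer evaluations, which is exactly the point of the advice in this entry.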

Where AIMenta applies this

Service lines where this concept becomes a deliverable for clients.

Beyond this term

Where this concept ships in practice.

Encyclopedia entries name the moving parts. The links below show where AIMenta turns these concepts into engagements — across service pillars, industry verticals, and Asian markets.
