AIMenta

Loss Function (Objective Function)

The function a model tries to minimise during training — defines what "good" means mathematically.

The loss function (also called objective function or cost function) is the mathematical expression of what the model is trying to minimise during training. Given a prediction and a ground-truth label, the loss function returns a scalar that measures how wrong the prediction is; training consists of adjusting parameters to drive that scalar down across the training set. The choice of loss function is one of the most consequential decisions in any ML project because it literally defines what the model treats as "good" — a model will optimise the loss you give it, and if that loss does not match what you actually care about, the model will cheerfully succeed at the wrong objective.
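The "scalar that measures how wrong the prediction is" and the "adjusting parameters to drive that scalar down" can be made concrete in a few lines. A minimal numpy sketch, using MSE and a toy one-parameter model (the data and learning rate are illustrative, not from any real project):

```python
import numpy as np

def mse_loss(y_pred, y_true):
    """Scalar measuring how wrong the predictions are."""
    return float(np.mean((y_pred - y_true) ** 2))

# Toy data whose true relationship is y = 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# Gradient descent on a one-parameter model y_pred = w * x:
# the loss defines "good", and each update drives it down.
w = 0.0
lr = 0.1
for _ in range(50):
    y_pred = w * x
    grad = np.mean(2 * (y_pred - y) * x)  # d(MSE)/dw
    w -= lr * grad

print(round(w, 3))  # converges towards 2.0
```

Training a billion-parameter network is the same loop at scale: a scalar loss, its gradient with respect to the parameters, and an update that lowers it.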

The canonical losses map to task types:

- **Mean Squared Error (MSE)** and **Mean Absolute Error (MAE)** for regression: MSE when large errors matter quadratically, MAE when they matter linearly and you want robustness to outliers.
- **Cross-entropy** (binary or categorical) for classification: the default pairing with a softmax output, with clean gradient properties.
- **Cosine similarity / contrastive losses** (InfoNCE, triplet) for representation learning.
- **Huber loss** for a hybrid that behaves MSE-like near zero and MAE-like for large errors.
- **Focal loss** for heavily imbalanced classification.
- For structured outputs: CTC (speech recognition) and CRF-style losses (sequence labelling); generative models use reconstruction or diffusion losses specific to their architecture.
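To make the regression/classification defaults tangible, here is a minimal numpy sketch of MSE, MAE, Huber, and binary cross-entropy, plus a toy batch showing how one outlier inflates MSE far more than the robust alternatives (the arrays are illustrative):

```python
import numpy as np

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)

def mae(y_pred, y_true):
    return np.mean(np.abs(y_pred - y_true))

def huber(y_pred, y_true, delta=1.0):
    err = np.abs(y_pred - y_true)
    quad = 0.5 * err ** 2              # MSE-like near zero
    lin = delta * (err - 0.5 * delta)  # MAE-like for large errors
    return np.mean(np.where(err <= delta, quad, lin))

def binary_cross_entropy(p, y, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)       # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Three small errors and one outlier: MSE is dominated by the outlier.
y_true = np.array([0.0, 0.0, 0.0, 0.0])
y_pred = np.array([0.1, -0.1, 0.1, 10.0])
print(mse(y_pred, y_true), mae(y_pred, y_true), huber(y_pred, y_true))
```

Framework versions (e.g. `torch.nn.MSELoss`, `torch.nn.HuberLoss`) add batching and autograd, but the quantities they compute are these.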

For APAC mid-market teams, the practical guidance is: **start with the standard loss for your task type**, then customise only when you can point to a specific way the default mis-measures what you want. Custom losses that weight business-relevant errors differently (false-negative-expensive classifiers, cost-sensitive regression) are legitimate and often valuable; custom losses invented to try to improve generic performance are usually a distraction. Every hour spent on data quality and evaluation design beats an hour spent inventing clever losses.

The non-obvious failure mode: **the loss you train on and the metric you evaluate on are often different**, and they can drift apart. You train classification on cross-entropy but evaluate on F1 or on a business metric like revenue-weighted accuracy. When they diverge, you get a model whose training curve looks great but whose user-facing metrics move sideways or worse. Monitor both during training; if they move in opposite directions for more than a few epochs, something is wrong with either the loss, the metric, or how they compose.
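The "monitor both" advice can be wired into a training loop with a small divergence check. A sketch, where the epoch histories are illustrative placeholders (not real runs) and `patience` plays the role of the "more than a few epochs" threshold:

```python
def diverging(loss_history, metric_history, patience=3):
    """True if training loss improved while the eval metric worsened
    for the last `patience` consecutive epochs."""
    if len(loss_history) < patience + 1:
        return False
    recent = range(-patience, 0)
    loss_improving = all(loss_history[i] < loss_history[i - 1] for i in recent)
    metric_worsening = all(metric_history[i] < metric_history[i - 1] for i in recent)
    return loss_improving and metric_worsening

# Illustrative histories: cross-entropy keeps falling while F1 slides.
losses = [0.9, 0.7, 0.6, 0.5, 0.45]
f1s = [0.50, 0.55, 0.53, 0.51, 0.49]
print(diverging(losses, f1s))
```

In practice this would run on a held-out validation set each epoch; a `True` here is the signal to stop and inspect the loss, the metric, or how they compose.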
