NannyML

by NannyML

Open-source ML monitoring library that estimates model performance without ground truth labels via confidence-based performance estimation (CBPE), and detects data drift, including multivariate drift via data reconstruction error.

AIMenta verdict: Watch closely (2/5)

"Post-deployment model monitoring without ground truth labels — APAC ML teams use NannyML to detect APAC model performance degradation using confidence-based estimation (CBPE) when APAC ground truth labels are delayed or unavailable in production."

Features: 6 · Use cases: 1 · Watch outs: 3
What it does

Key features

  • CBPE: confidence-based performance estimation without ground truth labels (see the sketch after this list)
  • Direct loss estimation for models without usable probability outputs, such as regression models
  • Data drift detection: univariate and multivariate feature drift, the latter via data reconstruction error
  • Performance drift detection: estimated vs realized performance tracking
  • Chunk-based analysis: time-windowed monitoring suited to batch pipelines
  • MLflow integration for unified experiment and monitoring tracking
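
A minimal CBPE usage sketch, modeled on NannyML's public quickstart. The class and argument names follow recent NannyML docs and may differ slightly across versions; the file paths and column names ("y_pred_proba", "y_pred", "label", "timestamp") are assumptions about your own reference and analysis data, not anything NannyML fixes.

    import nannyml as nml
    import pandas as pd

    # Reference period: predictions with known labels, used to fit the estimator.
    # Analysis period: production predictions whose labels have not arrived yet.
    reference_df = pd.read_parquet("reference.parquet")   # hypothetical path
    analysis_df = pd.read_parquet("analysis.parquet")     # hypothetical path

    estimator = nml.CBPE(
        y_pred_proba="y_pred_proba",        # predicted probability column
        y_pred="y_pred",                    # predicted class column
        y_true="label",                     # actual label column (reference only)
        timestamp_column_name="timestamp",
        problem_type="classification_binary",
        metrics=["roc_auc", "f1"],
        chunk_period="W",                   # weekly chunks; chunk_size=N also works
    )

    estimator.fit(reference_df)                  # learn the confidence-to-performance mapping
    estimated = estimator.estimate(analysis_df)  # estimate performance without labels

    print(estimated.to_df().head())              # per-chunk estimates with alert flags
    estimated.plot().show()                      # plot estimated performance over time
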
When to reach for it

Best for

  • APAC ML teams operating models where ground truth labels arrive weeks or months after prediction (credit scoring, churn, fraud) and who need to catch performance degradation early instead of waiting for labels.
Don't get burned

Limitations to know

  • CBPE accuracy depends on model calibration quality; poorly calibrated models yield unreliable estimates
  • Newer library with a smaller APAC community and fewer production case studies than Evidently
  • Less suitable for models with very short label-return windows, where realized performance can simply be computed once labels arrive
Context

About NannyML

NannyML is an open-source Python library that addresses one of the most common practical challenges in APAC production ML monitoring: the ground truth label delay problem. Standard model performance monitoring (as in Evidently) requires actual labels to compute accuracy metrics, but in many APAC production scenarios ground truth is delayed or unavailable. A churn prediction model's predictions become verifiable only 30-90 days later, when customers actually churn or stay. NannyML tackles this with Confidence-Based Performance Estimation (CBPE): estimating expected model performance from the model's confidence scores without waiting for labels.

NannyML's CBPE algorithm learns the relationship between the model's confidence scores and its actual performance on a labelled reference period, then uses it to estimate the distribution of production performance. ML teams can detect model degradation days or weeks before ground truth labels arrive, enabling proactive retraining rather than a reactive response to observed accuracy drops.
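
To make the intuition concrete, here is a toy, library-free sketch (my own illustration, not NannyML code): assuming a well-calibrated binary classifier, a prediction with score p is expected to be correct with probability max(p, 1 - p), so the expected accuracy of a batch can be estimated from scores alone. CBPE itself goes further, building an expected confusion matrix per chunk and deriving metrics such as ROC AUC and F1 from it.

    import numpy as np

    def estimated_accuracy(y_pred_proba, threshold=0.5):
        """Expected accuracy from calibrated scores, with no labels needed.

        The predicted class is 1 if p >= threshold else 0, and under the
        calibration assumption it is correct with probability p or 1 - p.
        """
        p = np.asarray(y_pred_proba, dtype=float)
        prob_correct = np.where(p >= threshold, p, 1.0 - p)
        return float(prob_correct.mean())

    # Confident scores imply high estimated accuracy; scores near 0.5 imply low.
    print(estimated_accuracy([0.95, 0.90, 0.10, 0.05]))  # 0.925
    print(estimated_accuracy([0.55, 0.52, 0.48, 0.45]))  # 0.535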

In addition to CBPE, NannyML provides direct loss estimation for models without meaningful probability outputs (such as regression models), plus univariate and multivariate data drift detection for input feature monitoring, the multivariate case using data reconstruction error. The library is particularly valuable for APAC financial services ML models (credit scoring, fraud detection) where ground truth labels arrive weeks to months after the model's prediction.
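
A drift-detection sketch in the same style, again following the public NannyML docs from memory: the class names (UnivariateDriftCalculator, DataReconstructionDriftCalculator) and their arguments may differ between versions, and the feature list and file paths are hypothetical.

    import nannyml as nml
    import pandas as pd

    reference_df = pd.read_parquet("reference.parquet")     # hypothetical path
    analysis_df = pd.read_parquet("analysis.parquet")       # hypothetical path
    features = ["loan_amount", "tenure_months", "country"]  # hypothetical feature columns

    # Univariate drift: per-feature statistical tests and distance metrics per chunk.
    univariate = nml.UnivariateDriftCalculator(
        column_names=features,
        timestamp_column_name="timestamp",
        continuous_methods=["kolmogorov_smirnov", "jensen_shannon"],
        categorical_methods=["chi2", "jensen_shannon"],
        chunk_period="W",
    )
    univariate.fit(reference_df)
    print(univariate.calculate(analysis_df).to_df().head())

    # Multivariate drift: PCA reconstruction error over all features at once,
    # which can catch correlated shifts that univariate tests miss.
    multivariate = nml.DataReconstructionDriftCalculator(
        column_names=features,
        timestamp_column_name="timestamp",
        chunk_period="W",
    )
    multivariate.fit(reference_df)
    print(multivariate.calculate(analysis_df).to_df().head())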

Beyond this tool

Where this category meets practice depth.

A tool only matters in context. Browse the service pillars that operationalise it, the industries where it ships, and the Asian markets where AIMenta runs adoption programs.