Abstract
This paper documents a direct comparison of sequential and non-sequential approaches to drift detection: a large Transformer encoder fails dramatically on the task, while a simpler XGBoost model trained on the same data succeeds.
What it establishes
- More model capacity is not automatically better for drift detection.
- Sequential inductive bias can be a liability when the signal is effectively instantaneous.
- Architecture-task mismatch can produce collapse, not just underperformance.
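To make the second point concrete, here is a minimal sketch (not the paper's setup; the data, statistic, and thresholds are illustrative assumptions) showing that when drift is an instantaneous per-sample distribution shift, a stateless two-sample statistic detects it without any sequence modeling at all:

```python
# Toy illustration: drift as an instantaneous mean shift.
# No temporal ordering is used anywhere -- each sample contributes
# independently, so a sequential inductive bias buys nothing here.
import random
import statistics

random.seed(0)

def sample_window(mean, n=500):
    """Draw n i.i.d. observations; 'drift' = a shift in the mean."""
    return [random.gauss(mean, 1.0) for _ in range(n)]

def mean_shift_score(reference, current):
    """Standardized difference of window means (a z-like statistic)."""
    pooled_sd = statistics.pstdev(reference + current)
    n = len(current)
    diff = abs(statistics.fmean(current) - statistics.fmean(reference))
    return diff / (pooled_sd / (n ** 0.5))

reference = sample_window(0.0)
no_drift = sample_window(0.0)
drifted = sample_window(0.5)  # instantaneous distribution shift

print(mean_shift_score(reference, no_drift))  # small: no drift flagged
print(mean_shift_score(reference, drifted))   # large: drift flagged
```

A model that must first learn which temporal patterns to attend to is solving a harder problem than the task requires, which is one way an architecture-task mismatch turns into outright collapse rather than mild underperformance.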
Why it matters
The paper strengthens a central ABIS argument: the right representation and the right mathematical framing matter more than using the most fashionable model class.