Stochastic Control & Filtering

What to Fix First in Nonlinear Filtering: Model Error vs. Sampling Degeneracy

Nonlinear filtering is a messy business. You inherit a model from the literature, throw in a particle filter, and watch it diverge after twenty steps. Is the model wrong, or is your sampler collapsing? Both look similar: high variance, erratic estimates, occasional blow-ups. But treating the wrong cause wastes time and compute.

This article gives you a diagnostic routine, grounded in practice, to decide where to invest. We'll cover prerequisites, tooling, and the judgment calls that separate a fix from a hack.

1. Who Needs This and What Goes Wrong Without It

A floor lead says teams that document the failure mode before retesting cut repeat errors roughly in half.

The engineer debugging a particle filter on a UAV

You are watching a drone drift on telemetry. The GPS dropout was brief, three seconds, but your particle filter now sprays estimates across the map. Half the particles collapsed into a tight cluster over a false bearing; the rest scattered like static. You try resampling. You try more particles. The cloud re-converges, but onto the wrong hill. That is the core nightmare: is the model of the IMU bias too rigid, or did your particles simply starve for diversity? Most engineers I have worked with reach for resampling first: more particles, systematic resampling, rougher noise injection. That helps the degeneracy. It does nothing for a model that silently mis-predicts the turn radius. You burn an afternoon. The drone still drifts.

The researcher comparing ensemble Kalman filters on chaotic systems

Lorenz-96 with an EnKF. The forecasts look fine for the first five assimilation steps, then the ensemble collapses: ensemble spread plunges, the analysis jumps, and the RMSE blows up. Standard move: apply multiplicative inflation to the covariance, maybe localize it. The catch: inflation often masks the real issue, which is that your forward model is missing a systematic bias in the forcing term. You add 1% inflation, the RMSE drops, you move on. Next week the same system diverges on a different basin. What usually breaks first is not the sampling step; it is the model's mismatch to the true dynamics. I have seen research groups spend three months tuning localization radii before someone checked that the model's diffusion coefficient was off by 20%. According to a senior researcher at a national lab, “inflation hides sins, but the sins compound.”

The quant building a stochastic volatility model

Asset returns are fat-tailed. Your particle filter on the S&P 500 keeps losing track of volatility clusters: the effective sample size drops below 10% after a volatility shock. You try a bootstrap filter, then a resample-move scheme, then an auxiliary particle filter. The variance estimates still lag. The real enemy? The observation model assumes Gaussian measurement errors. Financial data has jumps, correlated errors, non-stationary leverage effects. Your filter degenerates not because the particles collapse, but because the likelihood surface has no well-defined peak for them to find. I have shipped models where we fixed degeneracy by tuning the proposal distribution, only to watch the same filter fail on a different asset class. We had misdiagnosed sampling collapse as the issue; the true failure was a model error in the observation equation. That hurts.

“You cannot debug what you cannot separate. Is the particle cloud thinning, or is the model trying to predict a different reality?”

— rough rule from a colleague who lost a UAV to a mis-specified drag coefficient, then found degeneracy was barely 15%.

Not yet. Most teams skip this diagnostic phase entirely: they jump straight to resampling or inflation, assuming the filter itself is the weak link. The cost of getting the diagnosis backwards is not merely a wasted day; it is systematic underperformance across every condition. A model-error fix that survives degeneracy will hold across ensemble sizes; a degeneracy fix applied to a broken model will break on the first regime shift. That is the audience here: anyone who has felt that sinking moment when the filter smiles at you with tight posteriors, but the true state is half a standard deviation away. You need to know which fight to pick first.

2. Prerequisites: What You Should Have Settled First

State-space model structure and identifiability

A filter is only as honest as the model it eats. Before you blame particle collapse or weight degeneracy, ask yourself: does this state-space formulation actually match the physics? Most teams skip this. They port a textbook model (additive Gaussian noise, linear transition, diagonal covariances), then wonder why the posterior drifts. The catch is that identifiability bites hard: if two different parameter sets produce identical observation likelihoods, your filter will oscillate between them without ever converging. I have seen teams spend three months tuning resampling thresholds, only to discover their process noise was state-dependent, something the baseline model never captured.

Draw your state transition, observation map, and noise structure on a whiteboard. Mark which parameters are free, which are fixed, and whether the system is observable. If you cannot uniquely determine the state from a long sequence of measurements, no particle count will save you. That hurts. Real data usually reveals a mismatch in the first 50 steps: innovation sequences that never whiten, or residuals that cluster at the boundary of the support region.

Baseline filter implementation (e.g., bootstrap PF)

Pick exactly one plain, well-documented filter and make it boringly correct. A bootstrap particle filter with multinomial resampling (systematic resampling if you prefer), but standardised: 10,000 particles, no clever proposals, no Rao-Blackwellisation, just raw, honest Monte Carlo. The bootstrap PF is your dumbest possible filter. That is its power. If this baseline fails with your model, the fault is almost certainly in the model or the measurement likelihood, not in the algorithm. I fixed a production issue once where the filter diverged on synthetic data; it turned out the observation log-likelihood had a sign error in the exponent. The bootstrap PF caught it in one run.
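A baseline of the kind described above can be sketched in a few dozen lines of NumPy. This is a minimal illustration, not a reference implementation: the function names (`bootstrap_pf`, `run_demo`) and the toy scalar model are assumptions made for the example, and the particle count is reduced to keep the demo quick.

```python
import numpy as np

def bootstrap_pf(ys, n_particles, init, transition, loglik, rng):
    """Bootstrap PF: propose from the dynamics, weight by the likelihood,
    resample (multinomial) at every step. Returns posterior means and ESS."""
    x = init(n_particles, rng)
    means, ess = [], []
    for y in ys:
        x = transition(x, rng)                    # blind proposal from the model
        logw = loglik(y, x)
        w = np.exp(logw - logw.max())             # max-shift avoids underflow
        w /= w.sum()
        means.append(np.average(x, axis=0, weights=w))
        ess.append(1.0 / np.sum(w ** 2))
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
    return np.array(means), np.array(ess)

def run_demo(seed=0, n_steps=200, n_particles=2000):
    """Toy scalar model (an assumption for illustration):
    x_t = 0.9 x_{t-1} + N(0, 0.5^2),  y_t = x_t + N(0, 1)."""
    rng = np.random.default_rng(seed)
    xs, ys, x = [], [], 0.0
    for _ in range(n_steps):
        x = 0.9 * x + 0.5 * rng.normal()
        xs.append(x)
        ys.append(x + rng.normal())
    xs, ys = np.array(xs), np.array(ys)
    means, ess = bootstrap_pf(
        ys, n_particles,
        init=lambda n, r: r.normal(0.0, 1.0, size=n),
        transition=lambda p, r: 0.9 * p + 0.5 * r.normal(size=p.shape),
        loglik=lambda y, p: -0.5 * (y - p) ** 2,   # Gaussian, unit variance
        rng=rng,
    )
    rmse_filter = np.sqrt(np.mean((means - xs) ** 2))
    rmse_raw = np.sqrt(np.mean((ys - xs) ** 2))
    return rmse_filter, rmse_raw, ess

if __name__ == "__main__":
    print(run_demo())
```

On synthetic data generated from the same model, the filtered RMSE should sit clearly below the raw-measurement RMSE. If it does not, suspect the likelihood or transition code before anything else.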

Why not start with a more advanced filter? Because sampling degeneracy and model error look identical until you isolate them. An ensemble Kalman filter, for instance, masks poor model structure through covariance inflation and localisation, so you lose the signal. The bootstrap PF shows you exactly where the weight collapse happens: it either dies slowly (model error) or dies instantly (sampling degeneracy). Wrong order. Debugging the model first saves reruns.

Understanding of effective sample size and innovation whiteness

Effective sample size (ESS) is your first diagnostic instrument, but it lies if you trust it blindly. ESS below 1,000 with 10,000 particles? Classic degeneracy: resample harder or raise the particle count. ESS above 5,000 but the filter still oscillates? That is model error wearing a mask. I have watched teams boost particles to 100k and see ESS climb to 8,000 while the filter still drifted, because the innovation sequence wasn't white. Innovation whiteness is the truest test: plot the autocorrelation of your one-step-ahead prediction errors. If you see structure, you missed a state or your noise model is wrong.

Run these diagnostics before touching the algorithm. Most teams do the opposite: they tweak resampling schemes, then wonder why nothing improves. The bootstrap PF implementation takes about 100 lines of code. ESS computation? Another five. A whiteness check (Ljung–Box or a straightforward autocorrelation plot) costs ten lines. That is fifteen minutes of setup that saves days of tuning.
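The two diagnostics named above are indeed short. The following is a hedged sketch using NumPy only; the function names are illustrative, and the Ljung–Box statistic is returned raw so that you can compare it against a chi-square quantile yourself (for 10 lags, the 95% quantile is about 18.3).

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; ranges from 1 to N."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def autocorr(x, max_lag):
    """Sample autocorrelation at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic. Under whiteness it is approximately
    chi-square distributed with max_lag degrees of freedom."""
    n = len(x)
    rho = autocorr(x, max_lag)
    lags = np.arange(1, max_lag + 1)
    return n * (n + 2) * np.sum(rho ** 2 / (n - lags))
```

Feed `ljung_box_q` the one-step-ahead innovations from the baseline filter. A value far above the chi-square quantile means the sequence is not white.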

You cannot fix what you cannot measure. If your baseline filter does not hold up against a whiteness test on simulated data, stop. Your model is broken.

— bench note from a 2023 sensor fusion project: three weeks spent debugging a particle filter that was actually fine; the system's motion model had an unmodelled Coriolis term.

3. Core Workflow: Isolate Model Error vs. Sampling Degeneracy

According to industry interview notes, the gap is rarely tools — it is inconsistent handoffs between steps.

Step 1: Run a bootstrap particle filter with large N

Max out the particle count first, way beyond what you'd deploy. I mean 10x or 20x the practical limit. If the filter still diverges or produces erratic estimates, you have a model issue, not a sampling issue. A degenerate filter with enough particles converges in theory; when it doesn't, the generative assumption is wrong. That sounds fine until you realize most teams skip this step entirely and jump straight to tuning resampling schemes. Wrong order.

Step 2: Check innovation mean and autocorrelation

Pull the innovations (the differences between predicted and actual measurements) and compute their sample mean. Anything outside ±0.3 standard deviations from zero signals persistent bias in the state-space model. Next, lag-1 autocorrelation: values above 0.2 mean the filter is systematically under-predicting or over-predicting in streaks. I have seen teams blame particle collapse for six weeks when the real culprit was a 0.4 bias in the observation noise mean. A quick scatter plot of innovations vs. time would have saved five weeks.
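The two thresholds in this step can be wrapped in one small report. This is a sketch under the rules of thumb stated above (0.3 standard deviations for the mean, 0.2 for lag-1 autocorrelation); the function name and the dictionary keys are illustrative, not part of any library.

```python
import numpy as np

def innovation_report(innovations, mean_tol=0.3, rho_tol=0.2):
    """Flag persistent bias and streaky errors in a 1-D innovation sequence."""
    e = np.asarray(innovations, dtype=float)
    norm_mean = e.mean() / e.std(ddof=1)          # mean in units of its own std
    c = e - e.mean()
    rho1 = float(np.dot(c[:-1], c[1:]) / np.dot(c, c))   # lag-1 autocorrelation
    return {
        "normalized_mean": float(norm_mean),
        "lag1_autocorr": rho1,
        "biased": bool(abs(norm_mean) > mean_tol),
        "streaky": bool(rho1 > rho_tol),
    }
```

For vector measurements, one option is to run the report per measurement channel.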

“If your innovations look like a random walk, your model is lying to you. No amount of particles will fix that.”

— production debugging log, anonymous industrial control team.

Step 3: Compute effective sample size trajectory

ESS drops sharply? That's the classic symptom of sampling degeneracy: particle weights concentrate on a handful of survivors. But here's the twist: a model error can accelerate degeneracy too. A bad likelihood function punishes nearly all particles uniformly, making the resampling step desperate and wasteful. Isolate this by comparing ESS across different resampling schemes: systematic, multinomial, and residual. If ESS stays below 10% of N regardless of scheme, the likelihood is probably misspecified; it is not a particle count issue.

Step 4: Compare with a basic Kalman filter on a linearized model
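The comparison across schemes can be set up as below. This is a sketch for a scalar state: the three resamplers follow their textbook definitions, while `ess_fraction_trajectory`, its standard-normal initial particles, and its callback signatures are assumptions for illustration. ESS is measured before resampling at each step, so the scheme influences it only through the particle set it hands to the next step.

```python
import numpy as np

def multinomial_resample(w, rng):
    return rng.choice(len(w), size=len(w), p=w)

def systematic_resample(w, rng):
    n = len(w)
    positions = (rng.random() + np.arange(n)) / n   # one shared uniform offset
    cs = np.cumsum(w)
    cs[-1] = 1.0                                     # guard against round-off
    return np.searchsorted(cs, positions)

def residual_resample(w, rng):
    n = len(w)
    counts = np.floor(n * w).astype(int)             # deterministic copies
    idx = np.repeat(np.arange(n), counts)
    n_rest = n - counts.sum()
    if n_rest > 0:                                   # draw the remainder at random
        resid = n * w - counts
        resid /= resid.sum()
        idx = np.concatenate([idx, rng.choice(n, size=n_rest, p=resid)])
    return idx

def ess_fraction_trajectory(ys, n, resample, transition, loglik, rng):
    """ESS / N at every step of a bootstrap filter using the given resampler."""
    x = rng.normal(size=n)                           # assumed initial distribution
    out = []
    for y in ys:
        x = transition(x, rng)
        logw = loglik(y, x)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        out.append(1.0 / (n * np.sum(w ** 2)))
        x = x[resample(w, rng)]
    return np.array(out)
```

Run the trajectory once per scheme on the same data and overlay the curves. Curves that all sit under the 10% line point away from the resampler.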

Linearize your dynamics around the current mean, then run an Extended Kalman Filter on the same data. A close match between the EKF and your particle filter suggests the nonlinearity is mild and the real issue is degeneracy from poor proposal design. Big deviations (say, position estimates that differ by more than 2σ) point straight at model mismatch. The EKF acts as a cheap sanity check; it won't capture heavy tails or multimodality, but it will expose gross structural errors in your transition or observation equations. One concrete example: I once watched an engineer spend two weeks tuning 10,000 particles only to find the acceleration noise covariance had been typed as 10⁻³ instead of 10⁻¹. The EKF flagged the discrepancy in twenty minutes.
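A bare-bones EKF for this cross-check might look as follows. It is a sketch: `f`, `h` and their Jacobian callbacks `F_jac`, `H_jac` are placeholders you would supply from your own model, and `disagreement_in_sigmas` is a hypothetical helper for the 2σ comparison described above.

```python
import numpy as np

def ekf_step(m, P, y, f, F_jac, h, H_jac, Q, R):
    """One predict/update cycle of an Extended Kalman Filter."""
    # Predict: push the mean through the dynamics, linearize for the covariance.
    F = F_jac(m)
    m_pred = f(m)
    P_pred = F @ P @ F.T + Q
    # Update: linearize the measurement map around the predicted mean.
    H = H_jac(m_pred)
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T          # K = P_pred H^T S^{-1}
    m_new = m_pred + K @ (y - h(m_pred))
    P_new = (np.eye(len(m)) - K @ H) @ P_pred
    return m_new, P_new

def disagreement_in_sigmas(pf_mean, ekf_mean, ekf_cov):
    """Per-component gap between PF and EKF means, in EKF standard deviations."""
    gap = np.abs(np.asarray(pf_mean) - np.asarray(ekf_mean))
    return gap / np.sqrt(np.diag(ekf_cov))
```

Components where the gap stays above 2 for many consecutive steps are the ones to inspect in the transition or observation equations.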

4. Tools, Setup, and Environment Realities

Python libraries: FilterPy, Pyro, or custom NumPy

FilterPy is the default for most engineers: it works, it's fast, and it ships with a working UKF and particle-filter resampling routines out of the box. I reach for it when the model is the known bad actor and I need to swap a process-noise matrix in thirty seconds. The catch: FilterPy gives you zero help with model-form uncertainty. If your state transition is structurally wrong (say you assumed constant velocity and the system is actually a jerk-driven oscillator), FilterPy will silently degrade. Pyro (PyTorch ecosystem) handles that better: you can bake in learnable parameters and run SVI to adjust the model while filtering. Worth flagging: Pyro's particle filter is slower by a factor of 3–5x than a hand-rolled NumPy version for the same particle count.

Matlab toolboxes and Julia packages

Matlab's System Identification Toolbox and the Nonlinear Filtering Toolbox remain the gold standard for debugging model error in controlled lab conditions. I have seen teams waste two weeks chasing a sampling degeneracy that was actually a stiff ODE solver misstep; Matlab's ODE event detection catches that. Julia's LowLevelParticleFilters.jl and MonteCarloMeasurements.jl are newer but worth the migration if your particle count exceeds 10,000. Most teams skip this: check whether the environment actually supports the library version you require. A dev machine with Python 3.11 and a production server stuck on 3.8 can give you different degeneracy patterns, masked by float-precision quirks in the resampling step.

What usually breaks first is not the algorithm; it's the interface between your sensor driver and the filter's measurement update. Wrong order. You optimized the tuning without verifying that your IMU timestamps land inside the same nanosecond window as your encoder readings. That hurts.

Computational budget: CPU vs GPU for particle count

Particle count below 500? CPU vectorisation is fine. Above 5,000 particles, especially in 6-DOF or higher state spaces, GPU offloading stops being optional. I have watched a grad student fight a degeneracy for three weeks; it turned out his 6-core CPU was giving him effective sample sizes of 40 because the resampling step hit a sequential bottleneck. The fix was a single PyTorch .cuda() call and a switch to systematic resampling on the GPU. That said, throwing GPU hardware at a model-error issue is a trap: if your process noise covariance is off by an order of magnitude, 20,000 particles on a Tesla V100 will still converge to the wrong posterior. Measure first: is the filter diverging after 10 steps or tracking with high variance? Divergence within the first ten steps is almost always model error, not particle count. Spend the budget on profiling the model Jacobian, not the GPU allocation.

'We moved from 1,000 to 100,000 particles and the RMSE dropped by 3%. Then we fixed the drag coefficient and the RMSE dropped by 40%.'

— Lead engineer, autonomous underwater vehicle team, after a two-month filter redesign cycle.

The hardest reality to swallow: your environment might not support the fix you need. Real-time constraints on a Raspberry Pi, for example, force you into low-particle regimes where sampling degeneracy is nearly guaranteed. I have seen teams hack around this by running a single UKF on the edge and a particle filter offline for re-analysis: ugly but stable. It is not yet time to switch frameworks. Profile first, then pick the tool that matches your constraint.

5. Variations for Different Constraints

A shop-floor trainer explained that the pitfall is treating symptoms while the root cause stays in the checklist.

Low-dimensional vs high-dimensional state

State dimension shifts the bottleneck fast. In 2–3 dimensions you usually get away with crude proposal densities and a few hundred particles; model error dominates because the filter has enough samples to represent the posterior but the dynamics are wrong. I have watched groups waste weeks tuning resampling schemes for a 4D system when a simple parameter bias in the transition matrix explained 90% of the drift. High-dimensional state (say 20+) flips the script. Sampling degeneracy crushes you first. The effective sample size collapses before model structure even matters, and no amount of drift correction will save a particle set that lives in a single mode. Fix the proposal or the number of particles before touching the model.

The reasoning is mechanical: in low d the support volume grows slowly, so particles stay plausible across steps; model error shows up as a persistent offset. In high d the volume explodes and particles starve: resampling kills diversity, weights vanish. That hurts. Start with marginal Metropolis-within-Gibbs or a Rao-Blackwellised structure if you can. Only then revisit the transition kernel. Get the order wrong and you diagnose a modeling issue that is really a sampling collapse in disguise.

Online vs batch filtering requirements
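The volume argument is easy to see numerically. The sketch below is a toy setup chosen for illustration, not a model from this article: particles are drawn from a standard normal prior in d dimensions and weighted by a unit-variance Gaussian likelihood centred at the origin. For this particular setup the large-sample ESS fraction shrinks roughly like (√3/2)^d.

```python
import numpy as np

def ess_fraction_vs_dim(dims, n_particles=10_000, seed=0):
    """ESS / N after a single importance-weighting step, per state dimension."""
    rng = np.random.default_rng(seed)
    out = {}
    for d in dims:
        x = rng.normal(size=(n_particles, d))      # particles from a N(0, I) prior
        logw = -0.5 * np.sum(x ** 2, axis=1)       # Gaussian likelihood, y = 0
        w = np.exp(logw - logw.max())
        w /= w.sum()
        out[d] = 1.0 / (n_particles * np.sum(w ** 2))
    return out

if __name__ == "__main__":
    for d, frac in ess_fraction_vs_dim([2, 5, 10, 20, 50]).items():
        print(f"d={d:3d}  ESS/N={frac:.4f}")
```

Nothing is wrong with the model in this experiment, yet the weights still vanish as d grows, which is why the sampler has to be stabilised first in high dimension.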

Online sequential filtering has zero patience. You get one forward pass, with no revisiting past observations to patch a bad proposal or a mis-specified noise covariance. Model error bites early and hard because the filter integrates bias forward without a chance to smooth retrospectively. I have seen an EKF diverge inside ten steps on a nearly linear system because the sensor noise was modelled as Gaussian but the real distribution had modest excess kurtosis. No second chance. Batch settings, however, favour a different priority. You can iterate: run a forward-backward smoother, adjust the state-space structure, re-run. Sampling degeneracy becomes the first thing to attack because heavy resampling in a batch loop amplifies variance across sweeps. Fix the particle count and the proposal efficiency first; model correction can wait until the sampler holds together for at least one pass.

“An online filter punishes model error immediately. A batch smoother punishes sampling degeneracy across every iteration, and the blame compounds.”

— drawn from field notes, not a textbook.

The trade-off is practical: if your application demands real-time decisions (autonomous navigation, trading, process control), put model validation ahead of particle tuning. If you are fitting a scientific model offline with days of compute, invert the order: get the sampler stable first.

Heavy-tailed vs Gaussian measurement noise

Gaussian assumptions hide degeneracy. With thin-tailed noise and a matching likelihood, even a mediocre proposal keeps weights from exploding; a few bad particles get pruned gracefully. Model error is then the louder issue: the filter tracks, but tracks the wrong thing. Heavy-tailed noise changes everything. Think Cauchy or Student-t with low degrees of freedom. One outlier, scored under a thin-tailed measurement model, can assign near-zero weight to every particle that is not directly on the measurement, collapsing the effective sample size to 1 in a single update. The filter hallucinates a spike. Sampling degeneracy becomes the immediate killer, not model error.

The fix is not more particles; the fix is a robust likelihood approximation or a tail-adaptive proposal. I have debugged this: a system that looked unobservable under Gaussian noise ran fine once we swapped the measurement model for a Huberised variant without touching the dynamics at all. That said, do not assume every heavy-tailed issue needs a full particle overhaul. Sometimes the sampling degeneracy is just the symptom of an overconfident measurement model. Correct the tail behaviour, fix the proposal, then reassess whether the state-space structure even needs changes. Most groups skip this and blame the filter, not the noise assumption.
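The effect of the tail assumption on ESS shows up in a single update. The sketch below scores one outlying measurement under a Gaussian likelihood and under a Student-t likelihood with 3 degrees of freedom. The Student-t is used here as a stand-in for a robust measurement model, not as the Huberised variant mentioned above, and all names and numbers are illustrative assumptions.

```python
import numpy as np

def ess(logw):
    w = np.exp(logw - np.max(logw))
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def gaussian_loglik(y, x, scale):
    return -0.5 * ((y - x) / scale) ** 2

def student_t_loglik(y, x, scale, nu=3.0):
    # Student-t log-density up to an additive constant.
    return -0.5 * (nu + 1.0) * np.log1p(((y - x) / scale) ** 2 / nu)

def outlier_demo(seed=0, n=1000, y_outlier=8.0, scale=0.5):
    """ESS after one update when the measurement lies far from every particle."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                         # predicted particles ~ N(0, 1)
    return (ess(gaussian_loglik(y_outlier, x, scale)),
            ess(student_t_loglik(y_outlier, x, scale)))

if __name__ == "__main__":
    g, t = outlier_demo()
    print(f"Gaussian ESS: {g:.1f}   Student-t ESS: {t:.1f}")
```

Same particles, same measurement, same particle count: only the tail of the measurement model differs between the two ESS values.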

6. Pitfalls and Debugging When It Still Fails

Confusing numerical instability with model error

The most expensive mistake I see: a spike in the innovation sequence gets blamed on a bad process noise covariance, when the real culprit is a floating-point underflow in the likelihood computation. You stare at the residual, it looks large, you retune the model, and you waste a day. Meanwhile, the particle weights silently collapsed three steps earlier because log-sum-exp was implemented without a max-shift. That is not model error. That is a numerical seam ready to blow out at the next rare event.

How do you catch it? Re-run the exact same filter with double precision. If the "model error" shrinks by several orders of magnitude, you had a numerics issue, not a physics problem. Worth flagging: PyTorch-based libraries such as pyro.contrib.forecast default to single-precision tensors for speed, but in nonlinear filtering, speed without correctness is just fast garbage. The trade-off: double precision costs 1.5× to 2× runtime, but it pays for itself the first time it stops a false-positive model retune.
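The max-shift is a one-line change. A minimal sketch follows, with a deliberately naive version beside it for contrast; both function names are illustrative.

```python
import numpy as np

def normalize_log_weights(logw):
    """Stable normalization: subtract the maximum before exponentiating."""
    logw = np.asarray(logw, dtype=np.float64)
    w = np.exp(logw - np.max(logw))
    return w / w.sum()

def naive_normalize(logw):
    """Unshifted version, kept only to show the failure mode."""
    w = np.exp(np.asarray(logw, dtype=np.float64))
    with np.errstate(invalid="ignore", divide="ignore"):
        return w / w.sum()

if __name__ == "__main__":
    logw = [-1000.0, -1001.0, -1002.0]   # ordinary values after a sharp likelihood
    print(naive_normalize(logw))         # every exp() underflows to 0, giving 0/0
    print(normalize_log_weights(logw))   # well-defined weights that sum to 1
```

Log-weights near -1000 are not exotic: a handful of tight measurement dimensions is enough to produce them, and the naive path then fails without raising an error.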

I once traced a three-week debugging loop back to a missing clamp on the proposal variance: values drifted to 1e-12, and the filter "worked" on training data but exploded on validation. Everyone blamed the measurement model. Nobody checked the diagonal of the importance density. Check the raw weights before you touch the system dynamics. Every time.

Overfitting the model to filter performance

You tune the process noise covariance to make the particle filter track beautifully on your validation trajectory. Great, right? Wrong. You have now encoded the filter's approximation error into the model. The true dynamics did not change; you just bought a close match by widening the proposal to mask degeneracy. That hurts when you deploy online, because the real noise is tighter, and your over-dispersed model produces diffuse, wandering predictions.

The diagnostic is boring but brutal: hold the model fixed, then vary the particle count, say 50, 200, 1000. If the state estimates shift significantly as N increases, your model is compensating for sampling error. A clean model will produce roughly consistent trajectories regardless of particle count (variance decreases, bias stays flat). If you see slippage, your "model fix" was a degenerate hack. The catch is that most teams skip this. They run once at N=500, declare success, and ship.
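The sweep itself is a short harness around whatever filter you already have. In the sketch below, `run_filter(n, seed)` is a hypothetical callable that runs your filter with `n` particles and returns the estimated trajectory as an array; `toy_filter` is only a stand-in so the harness can be exercised, not a real filter.

```python
import numpy as np

def particle_count_sweep(run_filter, counts=(50, 200, 1000), seeds=range(5)):
    """Across-seed mean trajectory and average spread for each particle count."""
    out = {}
    for n in counts:
        runs = np.array([run_filter(n, seed) for seed in seeds])
        out[n] = {"mean": runs.mean(axis=0),
                  "spread": float(runs.std(axis=0).mean())}
    return out

def mean_shift(sweep, n_small, n_large):
    """Largest gap between the mean trajectories at two particle counts."""
    return float(np.max(np.abs(sweep[n_small]["mean"] - sweep[n_large]["mean"])))

def toy_filter(n, seed):
    """Stand-in: a fixed trajectory plus Monte Carlo noise that shrinks as 1/sqrt(n)."""
    rng = np.random.default_rng(seed)
    truth = np.sin(np.linspace(0.0, 6.0, 40))
    return truth + rng.normal(0.0, 1.0 / np.sqrt(n), size=truth.shape)

if __name__ == "__main__":
    sweep = particle_count_sweep(toy_filter)
    print({n: round(v["spread"], 3) for n, v in sweep.items()})
    print(mean_shift(sweep, 50, 1000))
```

Read the two outputs separately: the spread should fall as N grows, while the mean shift between small and large N should stay within that spread. A mean that keeps moving with N is the slippage described above.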

We fixed this by introducing a holdout check where the filter's effective sample size (ESS) must stay above 30% of N without any model retuning. If retuning is required to meet that ESS threshold, you fix the sampler first, not the model. Reverse the instinct. Your model is probably fine; your proposal scheme is probably starved.

Ignoring resampling bias in degeneracy diagnostics

Standard degeneracy checks, such as tracking the effective sample size, miss a subtle trap: resampling itself introduces bias, especially under systematic resampling with low entropy. You see ESS at 60%, you feel fine, but your particles are now all cousins born from the same ancestor three steps back. The diversity is gone, yet the ESS metric smiles at you. That is not degeneracy in the weight sense. It is genealogical collapse, and it kills long-horizon smoothing.

How to spot it? Track the number of unique ancestors in your particle lineage over a sliding window of 10 time steps. If that number drops below, say, 20% of N while ESS stays above 50%, you have resampling-induced bias masquerading as health. One rhetorical question: would you trust a Monte Carlo estimate where every sample shares a grandparent?
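Counting ancestors only requires that the filter keep the index array from each resampling step. The sketch below assumes a hypothetical `ancestor_history` list, where `ancestor_history[t][i]` is the index of the particle at step t-1 that particle i at step t was copied from.

```python
import numpy as np

def unique_ancestor_fraction(ancestor_history, window=10):
    """Fraction of distinct ancestors `window` resampling steps back."""
    steps = ancestor_history[-window:]
    n = len(steps[-1])
    lineage = np.arange(n)                 # start from the current particles
    for idx in reversed(steps):            # walk the family tree backwards
        lineage = np.asarray(idx)[lineage]
    return len(np.unique(lineage)) / n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 500
    # Ten rounds of multinomial resampling with uniform weights (healthy ESS).
    history = [rng.choice(n, size=n) for _ in range(10)]
    print(unique_ancestor_fraction(history))
```

The demo resamples with perfectly uniform weights, so ESS would read 100% at every step, yet the surviving family tree still narrows. The two metrics measure different things.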

You cannot diagnose a car's fuel problem by only checking the gas gauge if the tank is not even connected to the engine.

— analogy from a control engineer after tracing ancestor collapse in a missile tracking filter.

The fix is not always to increase particle count; that often makes the genealogy worse under systematic resampling. Try stratified resampling with a minimum ancestor count threshold, or switch to a deterministic-ratio resampling scheme that throttles the resampling stage when entropy drops too fast. Do not treat resampling as a black box; open it and count the family tree. Your filter will thank you by not lying about its health.

7. FAQ: Quick Checks Before You Change Anything

Is the state observable before you touch the filter?

Most teams skip this. They blame the particle degeneracy, tweak the resampling scheme, rewrite the proposal, only to find the system never had enough information to begin with. Check observability in the linearized sense first. If your measurement model cannot distinguish two different state trajectories, no amount of clever sampling will fix it.

Do not rush past.

I have seen a team spend three weeks debugging an EnKF when the real problem was a sensor mounted backward. The catch is that observability alone is not enough; you also need detectability for the noise-driven parts. Run a rank test on the Jacobian. If it fails, stop. Fix the sensor placement or add measurements before you touch a single particle.

“Observability is the gate. If the gate is locked, no filter walks through it, no matter how many particles you throw at the problem.”
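For a linearized, time-invariant approximation, the rank test is a few lines. The sketch below assumes you can supply the transition Jacobian `F` and measurement Jacobian `H` at a nominal operating point; the names are illustrative and the constant-velocity example is only a demonstration.

```python
import numpy as np

def observability_rank(F, H):
    """Rank of the observability matrix [H; HF; ...; HF^(n-1)]."""
    n = F.shape[0]
    blocks, M = [], np.atleast_2d(H).astype(float)
    for _ in range(n):
        blocks.append(M)
        M = M @ F
    return int(np.linalg.matrix_rank(np.vstack(blocks)))

def is_observable(F, H):
    return observability_rank(F, H) == F.shape[0]

if __name__ == "__main__":
    dt = 0.1
    F = np.array([[1.0, dt], [0.0, 1.0]])             # constant-velocity model
    print(is_observable(F, np.array([[1.0, 0.0]])))   # position sensor
    print(is_observable(F, np.array([[0.0, 1.0]])))   # velocity-only sensor
```

With only a velocity sensor, absolute position never enters the measurements, so the rank falls short of the state dimension. For a strongly nonlinear system, repeating the test at several operating points is a reasonable precaution.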

— debug log entry, after a 48-hour wild-goose chase with a misaligned accelerometer.

Are you using the right proposal distribution?

The bootstrap filter is the default. Wrong choice for many real problems. It samples blind, with no knowledge of the latest measurement, then weights harshly. That wastes particles on unlikely regions. The optimal proposal (proportional to the transition density times the likelihood) shrinks variance dramatically, but it is rarely tractable. So you approximate. A local linearization, an unscented transform, a Laplace approximation: yes, they cost extra compute, but they keep particles from drifting into dead zones. What usually breaks first is the proposal covariance: too tight and you collapse to a point; too wide and you sample into uninformative tails. Fix the proposal before resampling, because bad proposals produce degeneracy regardless of your particle count.

Did you check for programming bugs first?
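One of the few cases where the optimal proposal is available in closed form is additive Gaussian process noise with a linear Gaussian measurement. The sketch below compares it with the bootstrap proposal for a single update of a scalar model x_t = f(x_{t-1}) + N(0, q²), y_t = x_t + N(0, r²). The model, the function names, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def ess(logw):
    w = np.exp(logw - np.max(logw))
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def compare_proposals(y, f, q, r, n=5000, seed=0):
    """ESS after one update: bootstrap proposal vs closed-form optimal proposal."""
    rng = np.random.default_rng(seed)
    x_prev = rng.normal(size=n)
    fx = f(x_prev)
    # Bootstrap: sample from the dynamics, then weight by the likelihood.
    x_boot = fx + q * rng.normal(size=n)
    logw_boot = -0.5 * ((y - x_boot) / r) ** 2
    # Optimal: p(x_t | x_{t-1}, y_t) is Gaussian for this model, and the weight
    # is p(y_t | x_{t-1}) = N(y_t; f(x_{t-1}), q^2 + r^2), independent of the draw.
    s2 = 1.0 / (1.0 / q ** 2 + 1.0 / r ** 2)
    x_opt = s2 * (fx / q ** 2 + y / r ** 2) + np.sqrt(s2) * rng.normal(size=n)
    logw_opt = -0.5 * (y - fx) ** 2 / (q ** 2 + r ** 2)
    return ess(logw_boot), ess(logw_opt), x_boot, x_opt

if __name__ == "__main__":
    boot, opt, _, _ = compare_proposals(y=2.0, f=np.sin, q=1.0, r=0.1)
    print(f"bootstrap ESS: {boot:.0f}   optimal-proposal ESS: {opt:.0f}")
```

With a sharp measurement (small `r`) the bootstrap proposal throws most particles where the likelihood is negligible, while the optimal proposal places them near the measurement before weighting.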

Embarrassing, but necessary. A sign error in the transition matrix. A reversed sign in the measurement update. A covariance matrix that is not positive definite because you copied the transpose wrong. I once tracked two days of “sampling collapse” to a single off-by-one index in the Cholesky decomposition loop. Run unit tests on the likelihood function with known parameters: does it peak where you expect? Does the gradient point uphill? Compare against a brute-force grid for a 2D slice. This is boring, but the fastest debugging hour you will ever spend. Things blow up when you assume the code is correct. Prove it first with a synthetic data loop: generate truth, simulate measurements, run the filter, check the error against known parameters. If that fails, do not touch model error or proposal design. Fix the code.
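Two of these checks fit in a handful of lines: does the likelihood peak where it should, and does it integrate to one as a density in the measurement? The sketch below uses a scalar Gaussian likelihood as the reference and a deliberately broken copy with a sign error in the exponent; all function names are illustrative.

```python
import numpy as np

def gaussian_loglik(y, x, r_std):
    """Reference log N(y; x, r_std^2); y or x may be an array."""
    return -0.5 * ((y - x) / r_std) ** 2 - np.log(r_std * np.sqrt(2.0 * np.pi))

def buggy_loglik(y, x, r_std):
    """Same expression with a deliberate sign error in the exponent."""
    return +0.5 * ((y - x) / r_std) ** 2 - np.log(r_std * np.sqrt(2.0 * np.pi))

def peaks_at_measurement(loglik, y, r_std, grid, tol):
    """Over candidate states on a grid, the likelihood should peak near y."""
    return bool(abs(grid[np.argmax(loglik(y, grid, r_std))] - y) <= tol)

def integrates_to_one(loglik, x, r_std, grid, tol=1e-3):
    """As a density in y, exp(loglik) should integrate to 1 over a wide grid."""
    dens = np.exp(loglik(grid, x, r_std))
    return bool(abs(np.sum(dens) * (grid[1] - grid[0]) - 1.0) <= tol)

if __name__ == "__main__":
    grid = np.linspace(-10.0, 10.0, 2001)
    for fn in (gaussian_loglik, buggy_loglik):
        print(fn.__name__,
              peaks_at_measurement(fn, 1.5, 0.5, grid, tol=0.01),
              integrates_to_one(fn, 1.5, 0.5, grid))
```

The broken copy fails both checks at once, and neither check needs a filter run.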

Is your particle count hiding a deeper mismatch?

More particles mask but do not fix model error. If increasing N by 10× suddenly stabilizes the filter, you had sampling degeneracy; treat it with better proposals or adaptive resampling. But if doubling N barely moves the RMSE, you are fighting a model mismatch. That is a different budget. Waste no time on annealing schedules; go back to the state-space equations. One concrete anecdote: a colleague kept raising N from 500 to 5000 on a target-tracking problem. The filter was still biased. The fix was adding a jerk term to the motion model, not more particles. The takeaway: track the ratio of likelihood variance to proposal variance. If that ratio grows with particle count, degeneracy is the culprit. If it stays flat, the model is flawed.

Rhetorical question worth asking: why change anything before you verify the measurement noise covariance? Overconfident measurement noise (too small) yields filter divergence that looks like model error. Underconfident (too large) makes every update timid, producing slow drift that mimics sampling collapse.

Do not skip that step.

Tune the noise parameters on a held-out validation sequence before you touch the nonlinear structure. Most filter failures are not nonlinear at all—they are linear problems with wrong scaling. Fix the scaling.
