RobustGait: Robustness Analysis for Appearance-Based Gait Recognition

Reeshoon Sayera, Akash Kumar, Sirshapan Mitra, Prudvi Kamtam, Yogesh S Rawat
University of Central Florida
WACV 2026
RobustGait pipeline overview

In appearance-based gait recognition, silhouettes extracted from RGB video are fed into the recognition model. Silhouette quality varies with the extractor used, which leads to evaluation bias. Corrupting silhouettes directly limits perturbations to basic augmentations, so RobustGait instead applies noise to the raw RGB frames rather than to the silhouette masks.

Abstract

Appearance-based gait recognition has achieved strong performance on controlled datasets, yet systematic evaluation of its robustness to real-world corruptions and silhouette variability remains lacking. We present RobustGait, a framework for fine-grained robustness evaluation of appearance-based gait recognition systems.

RobustGait evaluation spans four dimensions: the type of perturbation (digital, environmental, temporal, occlusion), the silhouette extraction method (segmentation and parsing networks), the architectural capacities of gait recognition models, and various deployment scenarios. The benchmark introduces 15 corruption types at 5 severity levels across CASIA-B, CCPG, and SUSTech1K, with in-the-wild validation on MEVID, and evaluates six state-of-the-art gait systems.

We find: (1) applying noise at the RGB level better reflects real-world degradation and reveals how distortions propagate through silhouette extraction, (2) gait accuracy is highly sensitive to silhouette extractor biases, (3) robustness depends on both perturbation type and model architecture, and (4) noise-aware training and knowledge distillation improve deployment readiness.


Robustness Under Real-World Noise

RobustGait evaluates degradations across digital, environmental, temporal, and occlusion noise. Digital noise and occlusions cause the strongest drops by distorting or removing body structure critical for silhouette extraction. Environmental and temporal noise tend to preserve shape, leading to more moderate degradation.
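As a concrete illustration of RGB-level corruption at graded severities, the sketch below adds severity-scaled Gaussian noise to a frame before silhouette extraction. The sigma values and function names are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

# Illustrative sigma per severity level 1-5 (not the paper's exact values).
SIGMAS = [0.04, 0.08, 0.12, 0.18, 0.26]

def corrupt_rgb(frame: np.ndarray, severity: int) -> np.ndarray:
    """Add severity-scaled Gaussian noise to a float RGB frame in [0, 1]."""
    assert 1 <= severity <= 5
    noise = np.random.normal(0.0, SIGMAS[severity - 1], frame.shape)
    return np.clip(frame + noise, 0.0, 1.0)

frame = np.random.rand(128, 88, 3)  # stand-in for a single video frame
noisy = corrupt_rgb(frame, severity=3)
```

Because the noise is applied before segmentation, any distortion it introduces propagates through the silhouette extractor, mirroring real-world degradation.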

Noise overview

Silhouette Extraction Bias

Different silhouette extractors can drastically change silhouette quality, causing unfair comparisons across datasets. Higher segmentation IoU (mask quality) corresponds to higher Rank-1 recognition accuracy across gait models.
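The mask-quality metric behind this comparison is standard intersection-over-union between a predicted and a reference silhouette; a minimal sketch (helper name is ours):

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary silhouette masks (nonzero = foreground)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / union if union else 1.0
```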

Segmentation comparison
Segmentation vs IoU comparison

Robustness of Gait Models

Robustness varies by architecture: transformers often show stronger overall robustness, while CNNs degrade more under local corruptions; smaller set-based models are more robust to temporal noise than larger CNN models.

Gait model robustness

Deployment Scenarios

Scenario 1: Both probe and gallery are noisy

When both the probe and the gallery contain noise, recognition performance drops significantly. Models trained only on clean data tend to rely on clean silhouette features, so when noise appears in both sets, matching becomes unstable. A clean gallery can partially stabilize recognition, but when noise affects both sides, errors compound and accuracy degrades.
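The probe-vs-gallery matching in this scenario reduces to nearest-neighbor retrieval over embeddings; a minimal Rank-1 sketch using cosine similarity (function and variable names are ours, not the paper's API):

```python
import numpy as np

def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Rank-1: nearest gallery embedding (by cosine) shares the probe's ID."""
    p = probe_feats / np.linalg.norm(probe_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    nearest = (p @ g.T).argmax(axis=1)  # index of most similar gallery entry
    return float((gallery_ids[nearest] == probe_ids).mean())
```

Corrupting the probe perturbs the query side only; corrupting the gallery perturbs the reference embeddings as well, which is why errors compound when both sides are noisy.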

Noisy probe vs gallery

Scenario 2: Cross-extractor evaluation

Gait models are highly sensitive to the silhouette extraction pipeline. If a model is trained using silhouettes from one extractor but evaluated using another, performance drops noticeably. This mismatch reveals hidden evaluation bias and shows that recognition accuracy depends not only on the gait model, but also on how silhouettes are generated.
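Evaluations like this form a grid over (training extractor, evaluation extractor) pairs, with matched pipelines on the diagonal; a small sketch of that protocol (extractor names and the `evaluate` callback are hypothetical placeholders):

```python
# Hypothetical extractor identifiers; real names depend on the pipeline used.
EXTRACTORS = ["segmenter_a", "segmenter_b", "parser_c"]

def cross_extractor_grid(evaluate):
    """Build a {(train_ex, eval_ex): rank1} grid from a user-supplied
    evaluate(train_ex, eval_ex) -> accuracy callback."""
    return {(tr, ev): evaluate(tr, ev)
            for tr in EXTRACTORS for ev in EXTRACTORS}
```

Off-diagonal entries that fall well below the diagonal expose the train/eval extractor mismatch described above.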

Cross-extractor heatmap

Scenario 3: Cross-dataset transfer

Even when using the same silhouette extractor, performance changes across datasets due to differences in environment, camera setup, and data characteristics. An extractor or model that performs best on one dataset may not generalize to another, highlighting the importance of evaluating robustness across domains.

Cross-dataset bars

BibTeX

@inproceedings{sayera2026robustgait,
  title     = {RobustGait: Robustness Analysis for Appearance-Based Gait Recognition},
  author    = {Sayera, Reeshoon and Kumar, Akash and Mitra, Sirshapan and Kamtam, Prudvi and Rawat, Yogesh S.},
  booktitle = {Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026},
  url       = {https://arxiv.org/abs/2511.13065}
}