Appearance-based gait recognition has achieved strong performance on controlled datasets, yet a systematic evaluation of its robustness to real-world corruptions and silhouette variability is still lacking. We present RobustGait, a framework for fine-grained robustness evaluation of appearance-based gait recognition systems.
RobustGait evaluates robustness along four dimensions: the type of perturbation (digital, environmental, temporal, occlusion), the silhouette extraction method (segmentation and parsing networks), the architectural capacity of the gait recognition model, and the deployment scenario. The benchmark introduces 15 corruption types at 5 severity levels across CASIA-B, CCPG, and SUSTech1K, with in-the-wild validation on MEVID, and evaluates six state-of-the-art gait systems.
We find: (1) applying noise at the RGB level better reflects real-world degradation and reveals how distortions propagate through silhouette extraction, (2) gait accuracy is highly sensitive to silhouette extractor biases, (3) robustness depends on both perturbation type and model architecture, and (4) noise-aware training and knowledge distillation improve deployment readiness.
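Finding (1) can be illustrated with a minimal sketch: the corruption is applied to the RGB frame first, and only then is the silhouette extracted, so the distortion propagates through the extractor as it would in deployment. The severity-to-noise mapping and the thresholding "extractor" below are hypothetical stand-ins, not the paper's actual corruption parameters or segmentation network.

```python
import numpy as np

# Hypothetical severity-to-std mapping (RobustGait defines 5 severity
# levels per corruption; these exact values are illustrative only).
SEVERITY_STD = {1: 0.04, 2: 0.08, 3: 0.12, 4: 0.18, 5: 0.26}

def corrupt_rgb(frame: np.ndarray, severity: int, rng=None) -> np.ndarray:
    """Add severity-scaled Gaussian noise to an RGB frame in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.normal(0.0, SEVERITY_STD[severity], frame.shape)
    return np.clip(frame + noise, 0.0, 1.0)

def toy_silhouette(frame: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Toy stand-in for a segmentation network: threshold the mean channel."""
    return (frame.mean(axis=-1) > thresh).astype(np.uint8)

# RGB-level protocol: corrupt the frame, then extract the silhouette.
frame = np.zeros((64, 44, 3))
frame[16:48, 12:32] = 0.9  # synthetic bright "person" region
clean_mask = toy_silhouette(frame)
noisy_mask = toy_silhouette(corrupt_rgb(frame, severity=5))
```

Applying the same noise directly to `clean_mask` instead would skip the extractor entirely, which is the silhouette-level protocol the finding argues against.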
RobustGait evaluates degradations across digital, environmental, temporal, and occlusion noise. Digital noise and occlusions cause the strongest drops by distorting or removing body structure critical for silhouette extraction. Environmental and temporal noise tend to preserve shape, leading to more moderate degradation.
Different silhouette extractors can drastically change silhouette quality, causing unfair comparisons across datasets. Higher segmentation IoU (mask quality) corresponds to higher Rank-1 recognition accuracy across gait models.
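The mask-quality metric behind this correlation is standard intersection-over-union between a predicted and a reference silhouette. A minimal sketch (the toy masks are hypothetical, not dataset silhouettes):

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary silhouette masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union else 1.0

# Two hypothetical extractor outputs for the same frame: the degraded
# one misses the lower body, lowering IoU (and, per the finding above,
# Rank-1 accuracy tends to drop with it).
gt = np.zeros((64, 44), np.uint8)
gt[10:60, 15:30] = 1            # reference silhouette
pred = gt.copy()
pred[45:60] = 0                 # extractor drops the legs
print(round(mask_iou(pred, gt), 3))  # → 0.7
```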
Robustness varies by architecture: transformers often show stronger overall robustness, while CNNs degrade more under local corruptions; smaller set-based models are more robust to temporal noise than larger CNN models.
Scenario 1: Both probe and gallery are noisy
When both the probe and the gallery contain noise, recognition performance drops sharply. Models trained only on clean data rely on clean silhouette features, so matching becomes unstable once noise appears on both sides. A clean gallery can partially stabilize recognition, but when noise affects both probe and gallery, errors compound and accuracy degrades further.
Scenario 2: Cross-extractor evaluation
Gait models are highly sensitive to the silhouette extraction pipeline. If a model is trained using silhouettes from one extractor but evaluated using another, performance drops noticeably. This mismatch reveals hidden evaluation bias and shows that recognition accuracy depends not only on the gait model, but also on how silhouettes are generated.
Scenario 3: Cross-dataset transfer
Even when using the same silhouette extractor, performance changes across datasets due to differences in environment, camera setup, and data characteristics. An extractor or model that performs best on one dataset may not generalize to another, highlighting the importance of evaluating robustness across domains.
@inproceedings{sayera2026robustgait,
title = {RobustGait: Robustness Analysis for Appearance-Based Gait Recognition},
author = {Sayera, Reeshoon and Kumar, Akash and Mitra, Sirshapan and Kamtam, Prudvi and Rawat, Yogesh S.},
booktitle = {Winter Conference on Applications of Computer Vision (WACV)},
year = {2026},
url = {https://arxiv.org/abs/2511.13065}
}