TY - CONF
T1 - Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
AU - Dong, Yongqi
AU - Lu, Xingmin
AU - Li, Ruohan
AU - Song, Wei
AU - van Arem, Bart
AU - Farah, Haneen
PY - 2024
Y1 - 2024
N2 - The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which might mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline to tackle it using state-of-the-art deep learning techniques, especially Transformer models. The pipeline consists of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. The pipeline delivers superior lane rendering image anomaly detection performance; in particular, self-supervised pre-training with MiM greatly improves detection accuracy while significantly reducing the total training time. For example, Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) obtained an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), and the number of fine-tuning epochs was reduced from 280 to 41. An ablation study further shows that techniques to alleviate the data imbalance between normal and abnormal instances enhance model performance.
AB - The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which might mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline to tackle it using state-of-the-art deep learning techniques, especially Transformer models. The pipeline consists of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. The pipeline delivers superior lane rendering image anomaly detection performance; in particular, self-supervised pre-training with MiM greatly improves detection accuracy while significantly reducing the total training time. For example, Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) obtained an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), and the number of fine-tuning epochs was reduced from 280 to 41. An ablation study further shows that techniques to alleviate the data imbalance between normal and abnormal instances enhance model performance.
KW - Anomaly Detection
KW - Lane Rendering Image
KW - Transformer
KW - Self-Supervised Learning
KW - Image Classification
M3 - Poster
T2 - Transportation Research Board 103rd Annual Meeting 2024
Y2 - 7 January 2024 through 11 January 2024
ER -