Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

Yongqi Dong; Xingmin Lu; Ruohan Li; Wei Song; Bart van Arem; Haneen Farah

Yongqi Dong^*, Xingmin Lu, Ruohan Li, Wei Song, Bart van Arem, Haneen Farah

^*Corresponding author for this work

Transport and Planning

Research output: Contribution to conference › Poster › Scientific

Abstract

The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which might mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline to tackle it using state-of-the-art deep learning techniques, especially Transformer models. The pipeline consists of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. It delivers superior lane rendering image anomaly detection performance; in particular, self-supervised pre-training with MiM can greatly improve detection accuracy while significantly reducing the total training time. For example, Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) obtained an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), and the number of fine-tuning epochs was reduced from 280 to 41. An ablation study further shows that techniques to alleviate the data imbalance between normal and abnormal instances enhance model performance.
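
The fine-tuning phase named in the abstract uses cross-entropy loss with label smoothing. The following is a minimal sketch of that loss for a single sample, not the authors' implementation. The function name, the smoothing factor `epsilon=0.1`, and the two-class example logits are illustrative assumptions; the abstract does not state the values used in the paper. The smoothed target places `1 - epsilon` on the true class and spreads `epsilon` uniformly over all classes, which is one common convention.

```python
import math


def label_smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy between softmax(logits) and a label-smoothed target.

    logits  : list of raw class scores for one sample
    target  : index of the true class
    epsilon : smoothing factor (assumed value, not taken from the paper)
    """
    num_classes = len(logits)

    # Numerically stable log-softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]

    # Smoothed target: epsilon / K on every class, plus (1 - epsilon) on the true class.
    smoothed = [
        epsilon / num_classes + (1.0 - epsilon) * (1.0 if k == target else 0.0)
        for k in range(num_classes)
    ]

    return -sum(q * lp for q, lp in zip(smoothed, log_probs))


# Hypothetical two-class case (normal vs. abnormal) with a confident, correct prediction.
plain = label_smoothed_cross_entropy([5.0, -5.0], target=0, epsilon=0.0)
smooth = label_smoothed_cross_entropy([5.0, -5.0], target=0, epsilon=0.1)
print(plain, smooth)
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy. With smoothing, a very confident prediction is penalized more than it would be otherwise, which discourages over-confident outputs.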

Original language	English
Number of pages	1
Publication status	Published - 2024
Event	Transportation Research Board 103rd Annual Meeting 2024 - Walter E. Washington Convention Center, Washington DC, United States
Duration	7 Jan 2024 → 11 Jan 2024
Internet address	https://www.trb.org/AnnualMeeting/AnnualMeeting.aspx

Conference

Conference	Transportation Research Board 103rd Annual Meeting 2024
Abbreviated title	TRB 2024
Country/Territory	United States
City	Washington DC
Period	7 Jan 2024 → 11 Jan 2024
Internet address	https://www.trb.org/AnnualMeeting/AnnualMeeting.aspx

Keywords

Anomaly Detection
Lane rendering image
Transformer
Self-supervised learning
Image classification

Access to Document

Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Final published version, 867 KB
Licence: CC BY

1 Article

Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Dong, Y., Lu, X., Li, R., Song, W., van Arem, B. & Farah, H., 1 Dec 2023, (Submitted) In: Transportation Research Record.
Research output: Contribution to journal › Article › Scientific › peer-review

Cite this

@conference{00b74fa58fef4514a21f877159d58c88,

title = "Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning",

abstract = "The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which might mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline to tackle it using state-of-the-art deep learning techniques, especially Transformer models. The pipeline consists of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. It delivers superior lane rendering image anomaly detection performance; in particular, self-supervised pre-training with MiM can greatly improve detection accuracy while significantly reducing the total training time. For example, Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) obtained an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), and the number of fine-tuning epochs was reduced from 280 to 41. An ablation study further shows that techniques to alleviate the data imbalance between normal and abnormal instances enhance model performance.",

keywords = "Anomaly Detection, Lane rendering image, Transformer, Self-supervised learning, Image classification",

author = "Yongqi Dong and Xingmin Lu and Ruohan Li and Wei Song and {van Arem}, Bart and Haneen Farah",

year = "2024",

language = "English",

note = "Transportation Research Board 103rd Annual Meeting 2024, TRB 2024 ; Conference date: 07-01-2024 Through 11-01-2024",

url = "https://www.trb.org/AnnualMeeting/AnnualMeeting.aspx",

}

TY - CONF

T1 - Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

AU - Dong, Yongqi

AU - Lu, Xingmin

AU - Li, Ruohan

AU - Song, Wei

AU - van Arem, Bart

AU - Farah, Haneen

PY - 2024

Y1 - 2024

N2 - The burgeoning navigation services using digital maps provide great convenience to drivers. However, the lane rendering map images sometimes contain anomalies, which might mislead human drivers and result in unsafe driving. To detect these anomalies accurately and effectively, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline to tackle it using state-of-the-art deep learning techniques, especially Transformer models. The pipeline consists of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy loss with label smoothing, and post-processing. Various experiments verify the effectiveness of the proposed pipeline. It delivers superior lane rendering image anomaly detection performance; in particular, self-supervised pre-training with MiM can greatly improve detection accuracy while significantly reducing the total training time. For example, Swin Transformer with Uniform Masking as self-supervised pre-training (Swin-Trans-UM) obtained an accuracy of 94.77% and an Area Under the Curve (AUC) of 0.9743, compared with an accuracy of 94.01% and an AUC of 0.9498 for the pure Swin Transformer without pre-training (Swin-Trans), and the number of fine-tuning epochs was reduced from 280 to 41. An ablation study further shows that techniques to alleviate the data imbalance between normal and abnormal instances enhance model performance.

KW - Anomaly Detection

KW - Lane rendering image

KW - Transformer

KW - Self-supervised learning

KW - Image classification

M3 - Poster

T2 - Transportation Research Board 103rd Annual Meeting 2024

Y2 - 7 January 2024 through 11 January 2024

ER -
