TY - JOUR
T1 - Boosting field data using synthetic SCADA datasets for wind turbine condition monitoring
AU - Milani, Ali Eftekhari
AU - Zappalá, Donatella
AU - Castellani, Francesco
AU - Watson, Simon
PY - 2024
Y1 - 2024
N2 - State-of-the-art Deep Learning (DL) methods based on Supervisory Control and Data Acquisition (SCADA) system data for the detection and prognosis of wind turbine faults require large amounts of failure data for successful training and generalisation, which are generally not available. This limitation prevents the superior performance of these methods from being exploited, especially in SCADA-based failure prognosis. Data augmentation approaches have been proposed in the literature for generating failure data instances within a SCADA sequence to reduce the imbalance between healthy and faulty state data points, which is relevant to fault detection tasks. However, the successful implementation of DL-based failure prognosis methods requires the availability of multiple run-to-failure SCADA sequences. This paper proposes a data-driven method for generating synthetic run-to-failure SCADA sequences with custom operational and environmental conditions and progression of degradation. An Artificial Neural Network (ANN) is trained with signals that represent these factors to reconstruct the SCADA signals. It is then used to generate synthetic SCADA datasets based on data available from a wind turbine that experienced a gearbox failure. The generated synthetic datasets are evaluated by comparing their signal distributions, the temporal dynamics within each signal, and the temporal dynamics among different SCADA signals with those of similar field datasets. The results show that the generated synthetic datasets are consistent with their field counterparts, albeit with comparatively lower diversity in their dynamic behaviour over time.
AB - State-of-the-art Deep Learning (DL) methods based on Supervisory Control and Data Acquisition (SCADA) system data for the detection and prognosis of wind turbine faults require large amounts of failure data for successful training and generalisation, which are generally not available. This limitation prevents the superior performance of these methods from being exploited, especially in SCADA-based failure prognosis. Data augmentation approaches have been proposed in the literature for generating failure data instances within a SCADA sequence to reduce the imbalance between healthy and faulty state data points, which is relevant to fault detection tasks. However, the successful implementation of DL-based failure prognosis methods requires the availability of multiple run-to-failure SCADA sequences. This paper proposes a data-driven method for generating synthetic run-to-failure SCADA sequences with custom operational and environmental conditions and progression of degradation. An Artificial Neural Network (ANN) is trained with signals that represent these factors to reconstruct the SCADA signals. It is then used to generate synthetic SCADA datasets based on data available from a wind turbine that experienced a gearbox failure. The generated synthetic datasets are evaluated by comparing their signal distributions, the temporal dynamics within each signal, and the temporal dynamics among different SCADA signals with those of similar field datasets. The results show that the generated synthetic datasets are consistent with their field counterparts, albeit with comparatively lower diversity in their dynamic behaviour over time.
UR - http://www.scopus.com/inward/record.url?scp=85196429642&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/2767/3/032033
DO - 10.1088/1742-6596/2767/3/032033
M3 - Conference article
AN - SCOPUS:85196429642
SN - 1742-6588
VL - 2767
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 3
M1 - 032033
T2 - 2024 Science of Making Torque from Wind, TORQUE 2024
Y2 - 29 May 2024 through 31 May 2024
ER -