Assessing the signal quality of electrocardiograms from varied acquisition sources: A generic machine learning pipeline for model generation

Adnan Albaba; Neide Simões-Capela; Yuyang Wang; Richard Hendriks; Walter De Raedt; Chris Van Hoof

doi:10.1016/j.compbiomed.2020.104164

Signal Processing Systems

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Background and objective: Long-term electrocardiogram monitoring comes at the expense of signal quality. During unconstrained movements, the electrocardiogram is often corrupted by motion artefacts, which can lead to inaccurate physiological information. In this situation, automated quality assessment methods are useful to increase the reliability of the measurements. This article presents a generic machine learning pipeline that generates classification models for electrocardiogram quality assessment. The pipeline is tested on signals from varied acquisition sources, with the aim of selecting segments that can be used for heart rate analysis in lifestyle applications.

Methods: Electrocardiogram recordings from traditional, wearable and ubiquitous devices are segmented into 10 s windows and manually labeled by experienced researchers into two quality classes. To capture the electrocardiogram dynamics, a comprehensive set of 43 features is extracted from each segment, based on the time-domain signal, its Fast Fourier Transform, the Autocorrelation function and the Stationary Wavelet Transform. To select the most relevant features for each acquisition source, we employ both a customized hybrid approach and the state-of-the-art Neighborhood Component Analysis method, and compare them. Support Vector Machines (SVM), Decision Trees, K-Nearest-Neighbors and supervised ensemble methods are tested as possible binary classifiers.

Results: The best-performing models on the traditional, wearable and ubiquitous electrocardiogram datasets achieve, respectively: balanced accuracy 89% and F1-score 93% with the Fine Gaussian SVM model and 10 features; balanced accuracy 93% and F1-score 93% with the Fine Gaussian SVM model and 11 features; and balanced accuracy 95% and F1-score 86% with the Fine Gaussian SVM model and 8 features.

Conclusions: According to the results, our generic pipeline can generate classification models tailored to individual acquisition sources, provided that a standard Lead I or Lead II is available. Such models accurately establish whether the electrocardiogram quality is good or bad for heart rate analysis. Furthermore, removing bad-quality segments decreases errors in heart rate calculation.
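
As a rough illustration of two steps named in the abstract, the sketch below splits a recording into 10 s windows and computes balanced accuracy and F1-score for a binary quality classifier. It is not the authors' implementation: the function names, the 100 Hz sampling rate used in the usage note, the non-overlapping windowing, the dropping of a trailing partial window, and the choice of which class counts as positive are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def segment_windows(signal: Sequence[float], fs_hz: float,
                    window_s: float = 10.0) -> List[Sequence[float]]:
    """Split a 1-D signal into consecutive, non-overlapping windows.

    Assumption: a trailing partial window is dropped.
    """
    n = int(round(fs_hz * window_s))  # samples per window
    if n <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def balanced_accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int],
                             positive: int = 1) -> Tuple[float, float]:
    """Return (balanced accuracy, F1-score) for binary labels.

    Balanced accuracy is the mean of sensitivity and specificity;
    F1 is the harmonic mean of precision and recall for `positive`.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return (sensitivity + specificity) / 2, f1
```

For instance, `segment_windows(samples, fs_hz=100.0)` on a 25 s recording sampled at a hypothetical 100 Hz yields two windows of 1000 samples each. Balanced accuracy is reported alongside F1-score because the two quality classes need not be equally represented in a dataset.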

Original language	English
Article number	104164
Pages (from-to)	1-11
Number of pages	11
Journal	Computers in Biology and Medicine
Volume	130
DOIs	https://doi.org/10.1016/j.compbiomed.2020.104164
Publication status	Published - 2021

Bibliographical note

Accepted author manuscript

Keywords

Classification
Electrocardiogram
Feature selection
Motion artefact
Non-contact
Signal quality
Ubiquitous
Wearables

Access to Document

10.1016/j.compbiomed.2020.104164

SQI_27Nov2020_ElevierCompBioMed (Accepted author manuscript, 941 KB; Licence: CC BY-NC-ND)

Cite this

@article{e3c86c6b3aed460fb772773345b65087,

title = "Assessing the signal quality of electrocardiograms from varied acquisition sources: A generic machine learning pipeline for model generation",

abstract = "Background and objective: Long-term electrocardiogram monitoring comes at the expense of signal quality. During unconstrained movements, the electrocardiogram is often corrupted by motion artefacts, which can lead to inaccurate physiological information. In this situation, automated quality assessment methods are useful to increase the reliability of the measurements. A generic machine learning pipeline that generates classification models for electrocardiogram quality assessment is presented in this article. The presented pipeline is tested on signals from varied acquisition sources, towards selecting segments that can be used for heart rate analysis in lifestyle applications. Methods: Electrocardiogram recordings from traditional, wearable and ubiquitous devices, are segmented in 10 s windows and manually labeled by experienced researchers into two quality classes. To capture the electrocardiogram dynamics, a comprehensive set of 43 features is extracted from each segment, based on the time-domain signal, its Fast Fourier Transform, the Autocorrelation function and the Stationary Wavelet Transform. To select the most relevant features for each acquisition source we employ both a customized hybrid approach and the state-of-the-art Neighborhood Component Analysis method and compare them. Support Vector Machines (SVM), Decision Trees, K-Nearest-Neighbors and supervised ensemble methods are tested as possible binary classifiers. Results: The results for the best performing models on traditional, wearable and ubiquitous electrocardiogram datasets are, respectively: balanced-accuracy: 89\%, F1-score: 93\% with the Fine Gaussian SVM model and 10 features; balanced-accuracy: 93\%, F1-score: 93\% with the Fine Gaussian SVM model and 11 features; balanced-accuracy: 95\%, F1-score: 86\%, with the Fine Gaussian SVM model and 8 features.
Conclusions: According to the results, our generic pipeline can generate classification models tailored to individual acquisition sources, provided that a standard Lead I or Lead II is available. Such models accurately establish whether the electrocardiogram quality is good or bad for heart rate analysis. Furthermore, removing bad quality segments decreases errors in heart rate calculation.",

keywords = "Classification, Electrocardiogram, Feature selection, Motion artefact, Non-contact, Signal quality, Ubiquitous, Wearables",

author = "Adnan Albaba and Neide Sim{\~o}es-Capela and Yuyang Wang and Richard Hendriks and {De Raedt}, Walter and {Van Hoof}, Chris",

note = "Accepted author manuscript",

year = "2021",

doi = "10.1016/j.compbiomed.2020.104164",

language = "English",

volume = "130",

pages = "1--11",

journal = "Computers in Biology and Medicine",

issn = "0010-4825",

publisher = "Elsevier",

}

TY - JOUR

T1 - Assessing the signal quality of electrocardiograms from varied acquisition sources

T2 - A generic machine learning pipeline for model generation

AU - Albaba, Adnan

AU - Simões-Capela, Neide

AU - Wang, Yuyang

AU - Hendriks, Richard

AU - De Raedt, Walter

AU - Van Hoof, Chris

N1 - Accepted author manuscript

PY - 2021

Y1 - 2021

AB - Background and objective: Long-term electrocardiogram monitoring comes at the expense of signal quality. During unconstrained movements, the electrocardiogram is often corrupted by motion artefacts, which can lead to inaccurate physiological information. In this situation, automated quality assessment methods are useful to increase the reliability of the measurements. A generic machine learning pipeline that generates classification models for electrocardiogram quality assessment is presented in this article. The presented pipeline is tested on signals from varied acquisition sources, towards selecting segments that can be used for heart rate analysis in lifestyle applications. Methods: Electrocardiogram recordings from traditional, wearable and ubiquitous devices, are segmented in 10 s windows and manually labeled by experienced researchers into two quality classes. To capture the electrocardiogram dynamics, a comprehensive set of 43 features is extracted from each segment, based on the time-domain signal, its Fast Fourier Transform, the Autocorrelation function and the Stationary Wavelet Transform. To select the most relevant features for each acquisition source we employ both a customized hybrid approach and the state-of-the-art Neighborhood Component Analysis method and compare them. Support Vector Machines (SVM), Decision Trees, K-Nearest-Neighbors and supervised ensemble methods are tested as possible binary classifiers. Results: The results for the best performing models on traditional, wearable and ubiquitous electrocardiogram datasets are, respectively: balanced-accuracy: 89%, F1-score: 93% with the Fine Gaussian SVM model and 10 features; balanced-accuracy: 93%, F1-score: 93% with the Fine Gaussian SVM model and 11 features; balanced-accuracy: 95%, F1-score: 86%, with the Fine Gaussian SVM model and 8 features. 
Conclusions: According to the results, our generic pipeline can generate classification models tailored to individual acquisition sources, provided that a standard Lead I or Lead II is available. Such models accurately establish whether the electrocardiogram quality is good or bad for heart rate analysis. Furthermore, removing bad quality segments decreases errors in heart rate calculation.

KW - Classification

KW - Electrocardiogram

KW - Feature selection

KW - Motion artefact

KW - Non-contact

KW - Signal quality

KW - Ubiquitous

KW - Wearables

UR - http://www.scopus.com/inward/record.url?scp=85098465231&partnerID=8YFLogxK

U2 - 10.1016/j.compbiomed.2020.104164

DO - 10.1016/j.compbiomed.2020.104164

M3 - Article

AN - SCOPUS:85098465231

SN - 0010-4825

VL - 130

SP - 1

EP - 11

JO - Computers in Biology and Medicine

JF - Computers in Biology and Medicine

M1 - 104164

ER -
