TY - GEN
T1 - Relating big data and data quality in financial service organizations
AU - Wahyudi, Agung
AU - Farhani, Adiska
AU - Janssen, Marijn
N1 - Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
PY - 2018
Y1 - 2018
N2 - Today’s financial service organizations face a data deluge. A number of V’s are often used to characterize big data, whereas traditional data quality is characterized by a number of dimensions. Our objective is to investigate the complex relationship between big data and data quality. We do this by comparing the big data characteristics with data quality dimensions. Data quality has been researched for decades and its well-defined dimensions were adopted, whereas eleven V’s were used to characterize big data. A literature review and ten cases in financial service organizations were investigated to analyze the relationship between data quality and big data. Whereas big data characteristics and data quality have been viewed as separate domains, our findings show that these domains are intertwined and closely related. Findings from this study suggest that variety is the most dominant big data characteristic, relating to most data quality dimensions, such as accuracy, objectivity, believability, understandability, interpretability, consistent representation, accessibility, ease of operations, relevance, completeness, timeliness, and value-added. Not surprisingly, the most dominant data quality dimension is value-added, which relates to variety, validity, visibility, and vast resources. The most mentioned pair of big data characteristic and data quality dimension is Velocity-Timeliness. Our findings suggest that the term ‘big data’ is misleading, as volume (‘big’) was mostly not an issue and variety, validity and veracity were found to be more important.
AB - Today’s financial service organizations face a data deluge. A number of V’s are often used to characterize big data, whereas traditional data quality is characterized by a number of dimensions. Our objective is to investigate the complex relationship between big data and data quality. We do this by comparing the big data characteristics with data quality dimensions. Data quality has been researched for decades and its well-defined dimensions were adopted, whereas eleven V’s were used to characterize big data. A literature review and ten cases in financial service organizations were investigated to analyze the relationship between data quality and big data. Whereas big data characteristics and data quality have been viewed as separate domains, our findings show that these domains are intertwined and closely related. Findings from this study suggest that variety is the most dominant big data characteristic, relating to most data quality dimensions, such as accuracy, objectivity, believability, understandability, interpretability, consistent representation, accessibility, ease of operations, relevance, completeness, timeliness, and value-added. Not surprisingly, the most dominant data quality dimension is value-added, which relates to variety, validity, visibility, and vast resources. The most mentioned pair of big data characteristic and data quality dimension is Velocity-Timeliness. Our findings suggest that the term ‘big data’ is misleading, as volume (‘big’) was mostly not an issue and variety, validity and veracity were found to be more important.
KW - 11 V
KW - Big data
KW - Data quality
KW - Finance service organization
KW - Value
KW - Variety
UR - http://www.scopus.com/inward/record.url?scp=85055845089&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-02131-3_45
DO - 10.1007/978-3-030-02131-3_45
M3 - Conference contribution
AN - SCOPUS:85055845089
SN - 9783030021306
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 504
EP - 519
BT - Proceedings of 17th IFIP WG 6.11 Conference on e-Business, e-Services, and e-Society, I3E 2018, Proceedings
A2 - Mäntymäki, Matti
A2 - Al-Sharhan, Salah A.
A2 - Simintiras, Antonis C.
A2 - Tahat, Luay
A2 - Moughrabi, Issam
A2 - Ali, Taher M.
A2 - Janssen, Marijn
A2 - Dwivedi, Yogesh K.
A2 - Rana, Nripendra P.
PB - Springer
T2 - 17th IFIP WG 6.11 Conference on e-Business, e-Services, and e-Society, I3E 2018
Y2 - 30 October 2018 through 1 November 2018
ER -