TY - JOUR
T1 - SPO
T2 - A Secure and Performance-aware Optimization for MapReduce Scheduling
AU - Maleki, Neda
AU - Rahmani, Amir Masoud
AU - Conti, Mauro
PY - 2021
Y1 - 2021
N2 - MapReduce is a common framework that effectively processes multi-petabyte data in a distributed manner. Therefore, MapReduce is widely used in heterogeneous environments, such as the cloud, to provide performance adequate for system needs. Despite MapReduce's benefits, tuning the system configuration to achieve maximum performance is still challenging and requires deep expertise. In addition, some new MapReduce security issues, which have not yet been well addressed, have recently been raised. In this paper, we present a performance-aware and secure framework, named SPO, to minimize the makespan of the tasks while considering task security constraints. First, inspired by the HEFT algorithm, we introduce SPO, a two-stage static scheduler for the Map and Reduce phases that minimizes makespan while accounting for network traffic. Second, SPO∗ introduces a mathematical optimization model of the proposed scheduler that estimates system performance under security constraints with an error of less than 2%. The experimental results demonstrate that SPO outperforms Hadoop-stock in terms of makespan and network traffic by 29% and 31%, respectively, for tasks running in heterogeneous environments.
AB - MapReduce is a common framework that effectively processes multi-petabyte data in a distributed manner. Therefore, MapReduce is widely used in heterogeneous environments, such as the cloud, to provide performance adequate for system needs. Despite MapReduce's benefits, tuning the system configuration to achieve maximum performance is still challenging and requires deep expertise. In addition, some new MapReduce security issues, which have not yet been well addressed, have recently been raised. In this paper, we present a performance-aware and secure framework, named SPO, to minimize the makespan of the tasks while considering task security constraints. First, inspired by the HEFT algorithm, we introduce SPO, a two-stage static scheduler for the Map and Reduce phases that minimizes makespan while accounting for network traffic. Second, SPO∗ introduces a mathematical optimization model of the proposed scheduler that estimates system performance under security constraints with an error of less than 2%. The experimental results demonstrate that SPO outperforms Hadoop-stock in terms of makespan and network traffic by 29% and 31%, respectively, for tasks running in heterogeneous environments.
KW - Bigdata
KW - Hadoop
KW - Heterogeneity
KW - Makespan
KW - MapReduce
KW - Optimization model
KW - Scheduling
KW - Security
UR - http://www.scopus.com/inward/record.url?scp=85097580785&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2020.102944
DO - 10.1016/j.jnca.2020.102944
M3 - Article
AN - SCOPUS:85097580785
SN - 1084-8045
VL - 176
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
M1 - 102944
ER -