Performance modelling and analysis of MapReduce/Hadoop workloads

Data centers are the infrastructure for big data processing; they build a computing platform from distributed computers. This paper investigates an analytical model of a big data data center based on queueing theory. The queueing model developed here fits the MapReduce programming model accurately and reveals the nature of that programming model. The utilizations and mean waiting times of the Mapper and the Reducer are obtained. The effect of the workload and of the number of Mapper slots on system performance (i.e., utilization) is shown. The significance of this paper is that it offers theoretical insight into the MapReduce programming model and provides recommendations for the optimal parameters of computing resource configuration.
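The abstract does not state which queueing model is used, so the following is only a rough illustration of the kind of quantities it mentions. It assumes, purely for the sake of example, that the Mapper stage behaves like a textbook M/M/c queue (Poisson arrivals, exponential service times, one server per Mapper slot) and uses the standard Erlang C formula. The function name `mmc_metrics` and all parameter values are hypothetical and are not taken from the paper.

```python
from math import factorial


def mmc_metrics(arrival_rate, service_rate, slots):
    """Utilization and mean queueing delay of an M/M/c queue.

    Assumption (not from the paper): a Mapper stage is modelled as
    M/M/c with `slots` identical servers, each serving at `service_rate`.
    """
    offered_load = arrival_rate / service_rate      # a = lambda / mu
    utilization = offered_load / slots              # rho = a / c
    if utilization >= 1.0:
        raise ValueError("unstable queue: utilization must be below 1")

    # Erlang C: probability that an arriving task has to wait.
    tail = offered_load ** slots / factorial(slots)
    head = sum(offered_load ** k / factorial(k) for k in range(slots))
    p_wait = tail / ((1.0 - utilization) * head + tail)

    # Mean time a task spends waiting before service starts.
    mean_wait = p_wait / (slots * service_rate - arrival_rate)
    return utilization, mean_wait


if __name__ == "__main__":
    # Hypothetical workload: 3 tasks/s arriving, each slot serves 1 task/s.
    for slots in (4, 6, 8):
        rho, wq = mmc_metrics(arrival_rate=3.0, service_rate=1.0, slots=slots)
        print(f"slots={slots} utilization={rho:.3f} mean_wait={wq:.4f}s")
```

Under these assumptions, adding Mapper slots at a fixed workload lowers both the utilization and the mean waiting time, which is the kind of trade-off a slot-count recommendation has to balance.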