FSBD: A Framework for Scheduling of Big Data Mining in Cloud Computing

Cloud computing is seen as an emerging technology for big data mining and analytics. Cloud computing can deliver data mining results in the form of Software as a Service (SaaS). Both performance and quality of mining are fundamental criteria for the use of a data mining application provided by a Cloud computing environment. In this paper, we propose a Cloud computing framework that is responsible for distributing and scheduling a cluster-based data mining application and its data set. The main goal of our proposed Framework for Scheduling of Big Data Mining (FSBD) is to decrease the overall execution time of the application with minimum loss in mining quality. We consider the cluster-based data mining technique as a pilot application for our framework. The results show a significant speedup with a minimum loss in quality of mining. We obtained a ratio of 2 between the normalized actual makespan and the ideal makespan. The quality of mining scales well with the number of clusters and with the increasing size of the data set. The results are promising and encourage the adoption of the framework by Cloud providers.
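To make the makespan ratio reported above concrete, the following minimal Python sketch shows one way such a ratio can be computed. It rests on assumptions that the abstract does not state: the ideal makespan is taken to be the total work divided by the number of workers, and the scheduler is a generic greedy longest-processing-time-first (LPT) heuristic, used here only as a stand-in and not as FSBD's own scheduling policy. The names `schedule_lpt` and `makespan_ratio` are hypothetical.

```python
import heapq


def schedule_lpt(task_costs, num_workers):
    """Assign tasks greedily, longest first, to the least-loaded worker.

    This is a generic LPT heuristic used for illustration only; it is
    NOT claimed to be the scheduling policy of FSBD.
    Returns the list of per-worker total loads.
    """
    # Min-heap of (current load, worker index)
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    loads = [0.0] * num_workers
    for cost in sorted(task_costs, reverse=True):
        load, w = heapq.heappop(heap)
        loads[w] = load + cost
        heapq.heappush(heap, (loads[w], w))
    return loads


def makespan_ratio(task_costs, num_workers):
    """Ratio of the actual makespan to an assumed ideal makespan.

    Assumption: the ideal makespan is perfect load balance with no
    overhead, i.e. total work divided by the number of workers.
    """
    loads = schedule_lpt(task_costs, num_workers)
    actual = max(loads)                    # finish time of the busiest worker
    ideal = sum(task_costs) / num_workers  # assumed definition of "ideal"
    return actual / ideal


if __name__ == "__main__":
    # Hypothetical per-cluster mining costs (arbitrary time units)
    costs = [5.0, 1.0]
    print(makespan_ratio(costs, num_workers=2))
```

Under this assumed definition, a ratio of 1 means the schedule is perfectly balanced, and a ratio of 2 means the application takes twice as long as a perfectly balanced, overhead-free run would.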