Automating sequence dataset generating by using SeqGen

Data preprocessing is any processing performed on raw data to prepare it for a subsequent processing procedure. Commonly used as a preliminary step in data mining, it transforms the data into a format that can be processed more easily and effectively. Sequential pattern mining finds interesting sequential patterns in large databases, but the data acquired from a dataset may not be in sequential form. In this paper, we propose the SeqGen algorithm as a preprocessing step for sequential pattern mining. The main objective of the algorithm is to generate sequences with timestamps for user personalization. The reference attribute is given as a parameter for generating the sequences. Experimental results show that raw data in any form can be easily transformed into a sequence dataset once the reference attribute is given.
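
The abstract does not spell out the steps of SeqGen, so the following is only a minimal sketch of the general idea it describes: grouping raw records by a reference attribute and ordering each group by timestamp. The function name `generate_sequences`, the record layout (a list of dicts), and the field names `user`, `time`, and `page` are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def generate_sequences(records, reference_attr, time_attr, item_attr):
    """Group records by a reference attribute and order each group by time.

    Hypothetical helper: an illustration of sequence generation in general,
    not the SeqGen algorithm as published.
    """
    grouped = defaultdict(list)
    for rec in records:
        # Collect (timestamp, item) pairs under the record's reference value.
        grouped[rec[reference_attr]].append((rec[time_attr], rec[item_attr]))

    sequences = {}
    for key, events in grouped.items():
        # Sort by timestamp only; ties keep their original input order.
        sequences[key] = sorted(events, key=lambda event: event[0])
    return sequences


# Assumed toy data: unordered page visits with "user" as the reference attribute.
raw = [
    {"user": "u2", "time": 3, "page": "cart"},
    {"user": "u1", "time": 2, "page": "search"},
    {"user": "u1", "time": 1, "page": "home"},
    {"user": "u2", "time": 1, "page": "home"},
]

sequences = generate_sequences(raw, "user", "time", "page")
for user, events in sorted(sequences.items()):
    print(user, events)
```

Passing a different reference attribute (for example a session or device identifier, if the raw data had one) would yield a different sequence dataset from the same records, which is the role the abstract assigns to that parameter.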