Application of data mining techniques to study the sex ratio distortion in crossbred dairy cattle

The present investigation was undertaken to compare the efficiency of three data mining techniques for studying sex ratio distortion in crossbred dairy cattle. Information on the date of birth, sex of calf, sire and dam identification numbers, parity number of dam and date of conception of dam for 873 crossbred calves born to 89 bulls at the Cattle Breeding Farm, Thumburmuzhi, Kerala, during the period from 1991 to 2005 was utilized for the study. Three classification techniques, viz. logistic regression, decision tree and artificial neural network, were applied using the Weka software (Witten and Frank, 2005). A confusion matrix was developed for each classifier, and the techniques were evaluated using the kappa statistic, mean absolute error, root mean squared error, relative absolute error and root relative squared error. The results revealed that logistic regression performed best, with the highest percentage of correctly classified instances (58.72); the decision tree (52.75) and artificial neural network (52.29) classifiers had almost similar efficiency. Based on these results, it may be concluded that the sex of a calf can be classified from the attributes studied with an accuracy of about 58.7 per cent. The decision tree classified the data faster than the logistic regression and artificial neural network techniques; however, logistic regression classified the sex of the calf more accurately than the other two techniques. Hence, among the three techniques, logistic regression is recommended for the classification of sex of calf in dairy cattle.
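Two of the evaluation measures named above, the percentage of correctly classified instances and the kappa statistic, can both be derived from a classifier's confusion matrix. The following is a minimal sketch in Python, not the Weka implementation used in the study; the function names and the 2 x 2 matrix of counts are hypothetical and chosen only for illustration, and they are not data from this investigation.

```python
# Illustrative sketch only: the counts below are hypothetical and are
# NOT taken from the study. Rows are actual classes, columns are
# predicted classes (e.g. male, female).

def accuracy(cm):
    """Proportion of correctly classified instances (diagonal / total)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = accuracy(cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal totals of actual and predicted classes.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total ** 2)
    return (observed - expected) / (1 - expected)

# Hypothetical confusion matrix for a two-class (male/female) problem.
hypothetical_cm = [
    [60, 40],  # actual male:   60 predicted male, 40 predicted female
    [45, 55],  # actual female: 45 predicted male, 55 predicted female
]

if __name__ == "__main__":
    print(f"Correctly classified: {100 * accuracy(hypothetical_cm):.2f} per cent")
    print(f"Kappa statistic: {cohen_kappa(hypothetical_cm):.3f}")
```

A kappa near zero indicates that the classifier agrees with the true classes little more than chance would, which is why it is useful to read alongside the percentage of correctly classified instances when the two classes are close to equally frequent.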