Abstract #93

# 93
Early lifetime information enhances calf selection by improving accuracy of predictions with machine learning algorithms and regression.
M. Schmitt¹, F. Maunsell¹, A. De Vries*¹, ¹University of Florida, Gainesville, FL.

Calf information collected early in life (genetics, growth, health) can help dairy farmers raise the best replacement heifers and cull surplus heifers, provided it improves the accuracy of predicting future performance. Better selection decisions increase expected milk sales, but this added revenue is offset by the cost of gathering the information. Therefore, our objective was to find the break-even cost of information, evaluated through the first lactation, when surplus heifers are culled based on predictions from early lifetime information obtained with gradient boosting, random forest, and regression. Survival to first calving and first-lactation milk production up to 305 d, conditional upon first calving, were predicted at 120 d of age for 4,850 calves born between April 2012 and November 2015 on a single farm. The survival and conditional milk predictions were multiplied to obtain the expected milk sales used for heifer selection. The value of marginal milk was set at $0.29/kg. Predictions were generated through 10-fold cross-validation, replicated 5 times with new random splits. The break-even cost of information was defined as the amount that could be spent per calf born. When 30% of heifers were culled, predictions from genomic genetic estimates combined with growth and health data resulted in break-even information costs of $116, $109, and $105 for gradient boosting, random forest, and regression, respectively. When parent average genetic estimates, growth, and health data were used together, gradient boosting generated the highest value of information ($98), followed by random forest ($80) and regression ($70). We conclude that early lifetime information increases prediction accuracy, with gradient boosting performing best, followed by random forest and regression. Genetic information was more valuable than growth and health data for prediction.
This methodology could be extended to predict lifetime cow revenue for a more complete picture of break-even information costs to guide heifer selection decisions.
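
The selection step described above (multiplying predicted survival by conditional milk production, pricing the result at $0.29/kg, and culling the lowest-ranked 30% of heifers) can be illustrated with a minimal Python sketch. The function names, the toy numbers, and in particular the random-retention baseline used to express a value per calf born are assumptions made for illustration; the abstract does not specify how the break-even cost was computed, and the sketch does not reproduce the study's results.

```python
# Illustrative sketch only: toy data and the random-retention baseline are
# assumptions, not data or methods taken from the study.

MILK_PRICE = 0.29     # $/kg of marginal milk, as stated in the abstract
CULL_FRACTION = 0.30  # share of heifers culled, as stated in the abstract


def expected_milk_sales(p_survival, milk_kg, price=MILK_PRICE):
    """Expected sales = P(survival to first calving) * conditional milk * price."""
    return p_survival * milk_kg * price


def select_heifers(predicted_sales, cull_fraction=CULL_FRACTION):
    """Return sorted indices of heifers kept after culling the lowest-ranked ones."""
    n_cull = int(round(len(predicted_sales) * cull_fraction))
    ranked = sorted(range(len(predicted_sales)), key=lambda i: predicted_sales[i])
    return sorted(ranked[n_cull:])


def value_per_born_calf(predicted_sales, realized_sales, cull_fraction=CULL_FRACTION):
    """Hypothetical value of information per calf born.

    Assumption: the gain is the realized sales of the selected heifers minus
    the sales expected if the same number of heifers were retained at random,
    spread over all calves born.
    """
    kept = select_heifers(predicted_sales, cull_fraction)
    informed = sum(realized_sales[i] for i in kept)
    random_baseline = len(kept) * sum(realized_sales) / len(realized_sales)
    return (informed - random_baseline) / len(realized_sales)


if __name__ == "__main__":
    # Toy predictions: (probability of survival, conditional 305-d milk in kg).
    toy = [(0.9, 9000), (0.5, 8000), (0.8, 10000), (0.7, 7000), (0.95, 9500),
           (0.6, 11000), (0.85, 8500), (0.4, 9000), (0.9, 10000), (0.75, 8000)]
    predicted = [expected_milk_sales(p, m) for p, m in toy]
    print("kept heifers:", select_heifers(predicted))
    # Using the predictions as "realized" sales gives the best-case gain.
    print("value per calf born:", value_per_born_calf(predicted, predicted))
```

In this sketch, a prediction method that ranks heifers more accurately yields a larger gap between informed and random retention, which is the sense in which better predictions would justify a higher information cost per calf.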

Key Words: machine learning, prediction, information value