Weekly Milk Prediction on Dairy Goats using Neural Networks

C. Fernández*a, E. Soriab, P. Sánchez-Seiquera, L. Gómez-Chovab, R. Magdalenab, J. D. Martínb, M. J. Navarroc, and A. J. Serranob

a Dep. Producción Animal y Ciencia de los Alimentos. Facultad de Ciencias Experimentales y de la Salud. Universidad Cardenal Herrera CEU. 46113 Moncada, Valencia, Spain.
b Dep. Ingeniería Electrónica. E.T.S.E. Universidad de Valencia. 46100 Burjassot, Valencia, Spain.
c Dep. Tecnología Agroalimentaria. E.T.S.I. Agrónomos. Universidad Miguel Hernández. 03312 Orihuela, Alicante, Spain.

* Corresponding author. E-mail: [email protected]; tel.: +34-961369000; fax: +34-961395272.
Abstract. Artificial neural networks (NN) have been widely used for both prediction and classification tasks in many fields of knowledge; however, few studies are available in dairy science. In this work we use NN models to predict next week's goat milk yield based on current and previous milk production. A total of 35 Murciano-Granadina dairy goats were selected from a commercial farm according to number of lactations, litter size and body weight. Other input variables taken into account were diet, milk production, stage of lactation and days between parturition and first control. Of the 35 goats, 22 were used to build the neural model and 13 were used to validate it. It is important to emphasize that these 13 goats were not used to build the model, in order to demonstrate the generalization capability of the network. Afterwards, the neural models that provided better prediction results were analyzed to determine the relative importance of the input variables of the model. We found that the most important inputs are the present and previous milk yields, followed by the days between parturition and first milk control and the type of diet. Besides, we benchmark NN against other widely used prediction models, such as autoregressive system modelling and naïve prediction. The results obtained with the neural models are better than those of the other models. The best neural model in terms of accuracy provided a root mean square error (RMSE) equal to 0.31 kg/d and a low bias (mean error, ME) equal to -0.05 kg/d. Dairy goat farmers could make management decisions during the current lactation from one week to the next, based on current and/or previous milk production and dairy goat factors, without waiting until the end of lactation.
Keywords: Neural Network, Dairy Goat, Milk Yield Prediction.
1. Introduction

Goat production is located principally in the central and southern regions of Spain. According to Falagán et al.1, approximately 20% of the national goat livestock consists of Murciano-Granadina goats. Murciano-Granadina goats are well adapted to the Mediterranean climate, characterised by semiarid conditions with low rainfall and high temperatures. Although goats are well adapted to semiarid conditions, the availability of pastures in that region is scarce, and high milk production is obtained when dairy farmers keep goats indoors and use a total mixed ration (TMR) to feed their Murciano-Granadina dairy herds.
Breeding programs are based primarily on milk yield and milk composition2. Most Murciano-Granadina goats are machine milked once a day and official milk records for the Herd Book are obtained by monthly testing during lactation. Farmers' income comes from milk production and composition, and the accurate measurement or prediction of milk yield (MY) is essential to their economy1,3. Moreover, not only is accurate prediction desired; it is also interesting to determine and evaluate the most important factors that can affect it, such as season, stage of lactation, litter size, diet, number of lactations and environmental conditions. These factors have been widely studied in dairy cows and less so in sheep and goats4,5,6,7,8. Only a few studies on goats9,10,11 have evaluated the factors that affect milk yield, such as lactation number and litter size. In dairy science, different mathematical models have been used to predict MY and the factors affecting it. These models present the following limitations:
• Some of them consider linear relations between the input variables and the desired output10,12,13.
• Others are static, i.e. they do not consider time as an input variable, which is a fundamental variable in this kind of problem4,7,14.
• Others assume a certain relation for the model, i.e. they assume an a priori relationship between the input and output variables that is not necessarily true15,16,17.
Artificial Neural Networks (NN) are mathematical models that learn nonlinear relationships between two data sets. They have the ability to find complex relationships in data18,19. NN consist of a set of processing elements, also known as neurons or nodes, whose functionality is loosely based on biological neurons. These units are organised in layers that process the input information and pass it to the following layer. The processing ability of the network is stored in the inter-unit connection strengths (or weights) that are obtained through a process of adaptation to a set of training patterns19. Some NN models have been used in dairy science. Heald et al.20 developed a NN model that discriminated among four categories of bacterial organism (contagious, environmental, no significant growth, and others); Paquet et al.21 used a three-layer feedforward NN to model and predict the pH of cheese curd at various stages during the cheese making process; Grzesiak et al.22 used NN to predict 305-day MY based on the MY of the first 4 test days; Fernández et al.23 used self-organized maps to study farm situation characteristics. NN modelling should be a good approach for combining several input variables to predict MY. For management purposes, predicting MY each week is very useful, instead of waiting until the end of lactation to analyse the situation and conditions of the farm and take decisions for the next lactation. Therefore, an interesting application of dynamic mathematical models like NN is to take management decisions during the running lactation period, such as knowing "next week's" milk production based on current and/or previous MY and on present animal conditions (animal factors like body weight, diet, litter size, number of lactations, etc.). As far as we know, there is no literature about NN modelling for MY prediction in dairy goats. The subject of this first study was a tentative approach to test the feasibility of NN analysis for predicting next week's dairy MY based on the current week's MY using individual goat management data. Therefore, the goal is to develop a tentative research NN model taking into account only a small group of goats from a commercial dairy farm.
2. Material and methods
2.1. Animals

A homogenous group of dairy goats within a commercial farm (Excamur, S.L.) was selected for this trial. This farm is a member of ACRIMUR (Murciano-Granadina Goat Breeder Association, Asociación Española de Criadores de la Cabra Murciano-Granadina) and the herd size was 300 goats. Briefly, we describe the standard management practices followed by an intensive Murciano-Granadina goat farm, this farm being a representative sample of such farms in the Southeast of Spain. First parity takes place at 12-17 months of age; average fertility and prolificacy are 0.85 partum per goat and 1.8 kids per partum, respectively. The typical milking routine for the Murciano-Granadina consists of one daily machine milking (8:30 am); kids are weaned at 24-72 h after colostrum intake and reared by artificial feeding. The standard milking period is about 5-7 months, and farmers usually feed goats with TMR. Average goat MY per year is 500 kg and farmers sell milk to the cheese industry at €0.45/liter. Under this situation we selected 35 Murciano-Granadina dairy goats, homogeneous for number of lactations (4.92), litter size (1.9 kids), milk production in the previous lactation (480 kg per lactation and goat) and body weight (43.8 kg). The experimental trial had a lactation period of approximately 5 months, and during this time MY and body weight (BW) (Grupanor-Cercampo electronic scale) were recorded once per week (4 wks · 5.25 months = 21 controls per goat). Due to the small number of goats selected, and in order to have better control of the goats during the trial, they were allocated in individual pens and a portable milking machine was used for milking. Water was freely available at all times. The goats were fed two commercial TMR with the same chemical composition (92±1.5 %DM; 16±0.7 %CP; 32±2.6 %NDF; 18±0.9 MJ GE/kg DM) whose ingredients differed only in the source of protein: soybean meal or sunflower meal.
The TMR ingredients were alfalfa hay, barley, corn, dehydrated beet pulp, beet molasses, cotton seed, soybean or sunflower meal and a vitamin-mineral premix. The balance of the diet was obtained using the values recommended by INRA24 and AFRC25 for energy, protein, fibre, calcium, phosphorus, sodium and chloride. The reason for testing two different TMR is that, in this region of Spain, farmers usually do not make their own diet and generally buy elaborated TMR at the lowest cost, sometimes feeding goats on the same farm with different TMR. Therefore, these two kinds of diet were taken into account in this trial. The amount offered was 3 kg/d per goat, split into two equal portions at 9:00 (after milking) and 15:00. Two groups, homogeneous for number of lactations, litter size, milk production in the previous lactation and body weight, were thus created: 17 goats were fed the TMR based on soybean meal (46% protein) and the other 18 goats the one based on sunflower meal (30% protein). All the goats were housed in a building in which the environment was partially controlled (the temperature varied between 16 and 20ºC). Throughout the trial, the goats were handled in accordance with the guidelines for the care of animals in experimentation published by NRC26.
2.2. Neural networks

The NN used in this work is the multi-layer perceptron (MLP). This model consists of a layered arrangement of individual computation units known as artificial neurons. The neurons of a given layer feed the neurons of the next layer with their outputs. A single neuron is shown in Figure 1. The inputs xi to a neuron are multiplied by adaptive coefficients wi, called synaptic weights, which represent the connectivity between neurons. The activation of a neuron is usually a sigmoid-shaped function ϕ (logistic sigmoid or hyperbolic tangent). The output of the jth neuron is given by
$$o_j = \varphi\left(\sum_{i=0}^{m} w_{ij}\, x_i\right) \qquad (1)$$
where ϕ is a nonlinear function named the activation function. Neurons are grouped together in layers that form a fully connected network. The first layer contains the input nodes, which are usually fully connected to hidden neurons; these are, in turn, connected to the output layer. Figure 2 shows a scheme of a fully-connected multilayer perceptron. In our case, only one output neuron is necessary, since only one variable is predicted at each time. The multilayer perceptron uses the backpropagation learning algorithm19. The objective of this algorithm is to determine the network parameters that best model the relationship between the input and output variables. This objective is fulfilled by minimizing a function that depends on the difference between the output of the neural network and the desired output value. This minimization procedure is common to a wide range of mathematical models27. The process in which the coefficients (synaptic weights) of the MLP are adjusted is known as training. It is convenient to emphasize some of the valuable characteristics of NN for their application in animal science problems18,19:
• No a priori knowledge of the problem is needed. The adjustment between the output and input variables is done without any assumption, avoiding errors produced by false suppositions.
• Since equation (1) is nonlinear, the NN acts as a nonlinear model. Thus, it can find nonlinear relationships between the inputs and the output. In fact, it can be mathematically demonstrated that a multilayer perceptron is a universal regression model, i.e. it can find the relationship between any pair of data sets (provided they are related and represent the problem sufficiently).
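The computation in equation (1) is simple enough to sketch directly. The following minimal Python example (with hypothetical inputs and weights; the actual weights are learned during training) shows how a single neuron maps its inputs to an output using a hyperbolic tangent activation:

```python
import math

def neuron_output(x, w):
    """Output of one neuron, o = tanh(sum_i w_i * x_i).

    By convention x[0] is fixed to 1.0, so w[0] plays the role
    of the bias term shown in Figure 1.
    """
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return math.tanh(weighted_sum)

# Hypothetical example: bias input plus two real inputs
x = [1.0, 0.5, -0.2]   # [bias, x1, x2]
w = [0.1, 0.8, -0.4]   # [bias weight, w1, w2]
print(neuron_output(x, w))
```

A full MLP layer simply applies this computation once per neuron and feeds the resulting outputs to the next layer.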
These two characteristics of the multilayer perceptron have led to an exponential growth of its applications. Nevertheless, these two characteristics also present some drawbacks:
• The participation of an expert in the problem to be solved is necessary in order to define which variables are relevant. As indicated previously, several factors affect milk production in dairy farms, including breed, number of lactations, season of kidding, litter size, stage of lactation, level of production, and environmental and management factors4,10,28,29. In our approach we use a reduced group of goats from one farm. Besides, some important variables that affect goat herds were not taken into account here (like breed, environmental conditions, etc.), because the subject of this study was a first tentative approach to build a NN model based on some animal factors and to assess its applicability to a single problem: milk prediction. Therefore, the goats selected belonged to the same farm and were at a similar stage of lactation, and we studied just one lactation period. We collected the following descriptors from 35 goats: number of lactations, litter size, days between parturition and first control, metabolic body weight (kg), type of diet (two TMR) and MY (kg/d). In our models, we use present and past values of the input variables to predict the next value of MY.
• Since the MLP is a universal regression model, it is extremely flexible and its transfer function could fit any set of data points. However, the acquisition of the data is frequently affected by noise, and a perfect adjustment to these noisy data may be undesirable. In order to avoid this excessive adjustment, the data set is divided into two subgroups: the training set and the validation set19. The validation data set controls the flexibility of the model, preventing model overfitting. It provides a criterion to stop the learning before the model memorizes the training data, since an excessive adjustment of the model to the training data could lead to poor results when new data are presented to the model. This procedure for model selection is known in the NN literature as the early stopping procedure.
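The early stopping procedure described above can be sketched as a generic training loop. This is an illustrative outline, not the authors' MATLAB implementation; `train_one_epoch` and `validation_error` are hypothetical callbacks standing for whatever training pass and error measure the model provides:

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=500, patience=20):
    """Run training epochs, tracking the validation error, and stop
    once it has failed to improve for `patience` consecutive epochs."""
    best_error = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch()           # one pass over the training set
        error = validation_error()  # error on the held-out animals
        if error < best_error:
            best_error, best_epoch = error, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break               # validation error keeps rising: stop
    return best_epoch, best_error
```

In practice one would also save the network weights at `best_epoch` and restore them after the loop, so the returned model is the one with the lowest validation error.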
2.3. Experimental Setup

The training and validation of a MY prediction model is a complex task, and an improper split of the available data could lead to overoptimistic, biased and misleading results. We have circumvented this problem first by randomly splitting the dataset into two subsets (training and validation) and second, by including a balanced number of animals eating each TMR in the training and validation sets. Twenty-two goats were used to build the neural model (11 goats were fed soybean meal and the other 11 sunflower meal) and thirteen goats were used for validation (6 goats were fed soybean meal and 7 goats sunflower meal). This is a usual choice in the NN literature, i.e., to use two thirds of the patterns to obtain the model and the remaining third to validate it19. Table 1 shows the mean and standard deviation for both the training and validation groups; the desired signal is the next week's MY. The same range for the input variables and for the desired signal was observed in the two data sets. Another factor to take into account in the neural model development is the initialisation of the synaptic weights. As mentioned above, the learning of the network consists of the iterative minimisation of a function. The final values of the weights in the network depend on the starting values, because the learning rule finds the minimum point closest to the initial weights. Therefore, the usual procedure is to develop a set of neural models with different initial values of the weights in order to obtain different adjustments. The number of hidden layers and the number of neurons in each layer also affect the modelling and generalization capacity of the NN. It can be demonstrated mathematically that two hidden layers are sufficient to solve any problem19. In this work, NN with one and two hidden layers have been considered. Once NN models with one hidden layer were analysed, we trained models with two hidden layers. In this case, the results (not shown) were poorer than the ones obtained using a single hidden layer. However, this should not mislead the reader because, by using two hidden layers, the modelling capacity of the network is increased but early stopping also becomes more difficult. Sometimes, when we increase the complexity of the model, the model capacity increases but the generalization results do not improve accordingly, which is known as model overfitting. NN can be trained using on-line or batch adaptation. On-line learning consists of modifying the weights for each example, whereas batch learning updates the weights once all the data have been presented to the model. On-line learning is best suited to model variables that present abrupt variations, while batch learning yields models whose outputs are smoother, without abrupt variations. It is remarkable that, in many applications, those variations come from the measurement instruments and are not inherent to the problem itself. For this reason it is advisable to use both learning types; both approaches are considered in this paper. In addition, as our objective is to predict next week's MY, we have developed two kinds of NN models: using the MY of the present week (regression methodology) and using current and previous MY (forecasting methodology). In the first type of model the temporal relationship between variables is not considered (static model), whereas in the second it is (dynamic model). Therefore, four models were developed according to the type of learning and their temporal characteristics. We trained the NN with normalized data; this normalization keeps the NN from giving primacy to some inputs because of their range instead of their importance in solving the problem. In order to study the goodness of the NN models, some performance indexes were taken into account: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean error (ME).
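The four performance indexes, and the z-score normalization applied to the inputs, are standard computations. A short sketch in plain Python (the milk-yield values below are hypothetical):

```python
import math

def performance_indexes(actual, predicted):
    """MSE, RMSE, MAE and ME (bias) between actual and predicted MY."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {"MSE": mse,
            "RMSE": math.sqrt(mse),
            "MAE": sum(abs(e) for e in errors) / n,
            "ME": sum(errors) / n}

def normalize(values):
    """Z-score normalization so no input dominates because of its range."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Hypothetical weekly milk yields (kg/d)
actual = [2.1, 2.4, 2.0, 1.8]
predicted = [2.0, 2.5, 2.2, 1.7]
print(performance_indexes(actual, predicted))
```

Note that ME keeps the sign of the errors, so it measures bias: a model can have a low ME (positive and negative errors cancelling out) and still a large RMSE.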
3. Results and discussion

3.1. Analysis of accuracy

We followed a strategy of exhaustive search over the free parameters, which yielded 8400 trained models. These models were developed using the scientific software package MATLAB®30, which is a world-wide standard for high-level technical computing. The training and validation routines for the neural models were developed by the authors due to the great number of parameters. Figure 3 shows the histograms of the RMSE, an accuracy measure, for the four training configurations on the validation set. The best overall performance was obtained using a NN trained in batch mode that took into account the current and the previous week's MY (RMSE=0.36 kg/d). The accuracy criteria for the neural models are shown in Table 2. In order to assess the goodness of the prediction of the model, a trivial model is included (naïve prediction), MY(t+1)=MY(t). This model considers that the next week's MY will be the same as the current week's MY. We have also included a first order autoregressive, AR(1), model for proper comparison. This model is optimal according to Akaike's criterion31. Four goodness-of-fit parameters are considered in the analysis. Accuracy of the model is measured using MAE, RMSE and the slope of the regression line between the actual and predicted MY (a); the bias of the model is measured using ME and the intercept of the same regression line (b). This last measure is important since some predictions could be accurate but biased. Figure 4 shows the performance indexes for each goat of the training group using the best NN model obtained. This representation is important because the regression and forecasting methodologies yield models that show a different behaviour for each goat, whereas classical population models propose a model for an average of the whole data set. The latter models, which are very common in dairy science, only determine the MY average for all the goats. It can be observed that high accuracy for each goat is obtained by the NN model on the training data set. This approach permits the application of the model to each particular animal and not to the collective group or an average approximation. Another reason for using NN models is their capacity for generalization. Figure 5 shows the goodness of fit of the selected NN model for each goat in the validation data set. This data set was not used to build the NN, but only to control model complexity. It is worth noting that goat #13 from the validation data set shows poorer adjustment than the other 12 goats, which is due to the fact that a low number of records was available. In any case, we cannot conclude that NN would yield good results on different goat farms. Generalization capabilities are always related to achieving good results on data that have never been used in model training, but that belong to the same data distribution. Another way to show the goodness of the selected model is to represent the predictions of each goat's milk yield during the whole lactation period. Figure 6 represents the actual and the best predicted MY for two goats in the training group and another two in the validation group, showing good and poor results in both cases. We can conclude that good generalization capabilities are observed; the model can extrapolate from the training set to a validation set of goats (see Figure 6(a)-(b)).
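The two reference models in Table 2 are straightforward to reproduce. A sketch follows, assuming `my` holds the weekly MY series of one goat (hypothetical data); the AR(1) coefficients are fitted here by ordinary least squares:

```python
def naive_prediction(my):
    """Trivial model: predict MY(t+1) = MY(t)."""
    return my[:-1]   # predictions for weeks 2..n

def fit_ar1(my):
    """Least-squares fit of MY(t+1) = c + phi * MY(t); returns (c, phi)."""
    x, y = my[:-1], my[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return c, phi

# Hypothetical series generated exactly by MY(t+1) = 0.5 + 0.8 * MY(t)
series = [1.0, 1.3, 1.54, 1.732, 1.8856]
print(fit_ar1(series))
```

Because such baselines cost nothing to compute, any proposed model should at least outperform them, as the NN does in Table 2.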
3.2. Interpretability of the model

One of the main problems that some authors indicate is that NN work like "black boxes", meaning that it is not possible to obtain any kind of information from a trained model18,19,32. However, some techniques to circumvent this a priori problem have been proposed and grouped under the name of sensitivity (or saliency) analysis. Sensitivity analysis is used to study the influence of the input variables on the dependent variable and consists of evaluating the change in training error that would result if an input were removed from the model. This measure, commonly known as delta error in the literature, produces a valuable ranking of the relevance of the input variables33,34,35. In order to avoid bias in the conclusions, this methodology was applied to the 10 best NN from the set of 8400 neural models. Tables 3 and 4 show the relative importance of each input for these 10 NN, obtained for the training and validation groups, respectively. These results show that the MY prediction depends mainly on the present (relative value 1) and the previous (relative value 2) values of MY, which indicates the importance of these values. The next most important inputs for the training data set were the days between parturition and the first control, and the type of diet. The least important inputs were metabolic weight, litter size and number of lactations because, as explained in the section "Material and methods", we selected a small number of goats belonging to the same farm and homogeneous in number of lactations, litter size and body weight. So, the NN model obtained can predict MY and detect that previous milk production affects milk production in the next week. The number of days between parturition and first milk control is also taken into account for the MY prediction by the NN model, indicating the importance, under practical conditions, of having the first milk control as soon as possible, because it affects MY predictions. Finally, the type of diet is also detected as an important input by the NN model. The diets were TMR balanced in energy, protein and fibre but with different ingredients. It is important to be aware that NN are capable of detecting changes in diets with similar chemical composition but different ingredients. Although it was not the subject of this study, different ingredients of the diet could influence the palatability and the intake capacity36. The obtained NN detects these changes in MY from one week to the next when the diet, although similar in chemical composition, differs in its ingredients. Similar results for the input units were found for the validation data set (Table 4). Therefore, NN models allow taking management decisions for the next week (within the same lactation period evaluated) knowing the actual farm situation. They also avoid waiting until the end of lactation to get a farm evaluation and take future decisions. These results could also aid in defining suitable protocols and strategies to improve milk production and quality on a goat farm.
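The delta-error idea can be illustrated with a small sketch. Here an input is "removed" by clamping it to its mean value, one common approximation in sensitivity analysis; `model` stands for any trained predictor, and both the function name and the clamping choice are ours, not necessarily the exact procedure followed in the paper:

```python
def sensitivity_ranking(model, X, y):
    """Rank input columns by the increase in MSE when each one
    is clamped to its mean (delta-error sensitivity analysis)."""
    n, m = len(X), len(X[0])

    def mse(predictions):
        return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / n

    base_error = mse([model(row) for row in X])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    deltas = []
    for j in range(m):
        clamped = [row[:j] + [means[j]] + row[j + 1:] for row in X]
        deltas.append(mse([model(row) for row in clamped]) - base_error)
    # A larger delta means the input mattered more; most important first
    return sorted(range(m), key=lambda j: -deltas[j])
```

For a model dominated by its first input, the ranking places column 0 first, mirroring the way present MY ranks first in Tables 3 and 4.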
4. Conclusions

Computer technology advances at an extremely rapid pace. For instance, the incorporation of computerized devices in dairy goat herds to monitor milk performance is becoming more extensively used by dairy farmers. This study has shown that artificial NN are a suitable tool for predicting next week's MY from goat factors recorded on a farm and the present MY. Further work is necessary to improve the robustness of the NN model. Moreover, increasing the number of parameters related to MY and herds should be considered in order to improve the results and to extract knowledge from other variables that were not considered in this study.
References

1. Falagán, A., Guerrero, J. E., Serrano, A., 1995. Systèmes d'élevage caprin dans le sud de l'Espagne. Pages 38-50 in Proc. Goat production systems in the Mediterranean. EAAP Publication Nº 71. Wageningen, The Netherlands: Wageningen Pers.
2. Analla, M., Jiménez-Gamero, I., Muñoz-Serrano, A., Serradilla, J. M., Falagán, A., 1996. Estimation of genetic parameters for milk yield and fat and protein contents of milk from Murciano-Granadina goats. J. Dairy Sci. 79: 1895-1998.
3. Hanigan, M., Bequette, B., Crompton, L., France, J. 2000. Modeling mammary amino acid metabolism. Livest. Prod. Sci. 70(1-2): 63-78.
4. Wilmink, J.B.M. 1987. Studies on test-day and lactation milk, fat and protein yield of dairy cows. PhD. Royal Dutch Cattle Syndicate. Arnhem. 123 pp.
5. Van Tassel, C.P., Jones, L.R., Eicker, S.W. 1995. Production evaluation techniques based on lactation curves. J. Dairy Sci. 78: 457-465.
6. Scott, T.A., Yandell, B., Zepeda, L., Shaver, R.D., Smith, T.R. 1996. Use of lactation curves for analysis of milk production data. J. Dairy Sci. 79: 1885-1894.
7. Cappio-Borlino, A., Portolano, B., Todaro, M., Macciota, N.P., Giaccone, P., Pulina, G. 1997. Lactation curves of Valle del Belice dairy ewes for yield of milk, fat and protein estimated with test day models. J. Dairy Sci. 80: 3023-3029.
8. Fernández, C., Sánchez, A., Garcés, C., 2002. Modeling the lactation curve for test-day milk yield in Murciano-Granadina goats. Small Rumin. Res. 46: 29-41.
9. Pedauye, J., 1989. Lactation curve and milk composition in Murciano-Granadina
goat breed. Anales de Veterinaria, 5: 3-11.
10. Gipson, T.A., Grossman, M. 1990. Lactation curves in dairy goats: a review. Small Rumin. Res. 3: 383-396.
11. Falagán, A., González, C., Pérez, S. J., Goicoechea, A., Romero, C., 1991. Composition and production curve in the goat's milk. Chem. Mikrobiol. Technol. Lebensm. 13: 76-82.
12. Grossman, M., Koops, W.J. 1988. Multiphasic analysis of lactation curves in dairy cattle. J. Dairy Sci. 71: 1598-1608.
13. Beever, D.E., Rook, A.J., France, J., Dhanoa, M.S., Gill, M. 1991. A review of empirical and mechanistic models of lactational performance by the dairy cow. Livest. Prod. Sci. 29: 115-130.
14. Lippmann, R. P., 1987. An introduction to computing with neural nets. IEEE ASSP Magazine. 4: 4-22.
15. Carvalheira, J.G.V., Blake, R.W., Pollak, E.J., Quaas, R.L., Duran-Castro, C.V. 1998. Application of an autoregressive process to estimate genetic parameters and breeding values for daily milk yield in a tropical herd of Lucerna cattle and in United States Holstein herds. J. Dairy Sci. 81: 2738-2751.
16. Pool, M.H., Meuwissen, T.H.E. 1999. Prediction of daily milk yield from a limited number of test days using test day models. J. Dairy Sci. 82: 1555-1564.
17. Macciotta, N.P.P., Cappio-Borlino, A., Pulina, G. 2000. Time series autoregressive integrated moving average modelling of test-day milk of dairy ewes. J. Dairy Sci. 83: 1094-1103.
18. Ripley, B. D., 1996. Pattern Recognition and Neural Networks, Cambridge University Press.
19. Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, Prentice Hall.
20. Heald, C.W., Kim, T., Sischo, W. M., Cooper, J. B., Wolfgang, D. R., 2000. A computerized mastitis decision aid using farm-based records: an artificial neural network approach. J. Dairy Sci. 83: 711-720.
21. Paquet, J., Lacroix, C., Thibault, J., 2000. Modeling of pH and acidity for industrial cheese production. J. Dairy Sci. 83: 2393-2409.
22. Grzesiak, W., Lacroix, R., Wojcik, J., Blaszcyk, P., 2003. A comparison of neural network and multiple regression predictions for 305-day lactation yield using partial lactation records. Can. J. Anim. Sci. 83: 307-310.
23. Fernández, C., Soria, E., Martín, J.D., Serrano, A.J. 2005. Neural networks for animal science applications: two case studies. Expert Systems with Applications (in press).
24. INRA, Institut National de la Recherche Agronomique. 1988. Page 471 in Alimentation des bovins, ovins et caprins (Feeding of cattle, sheep and goats). Paris.
25. AFRC, Agricultural and Food Research Council. 1993. Energy and protein requirements of ruminants. Wallingford, UK: CAB International. Page 151.
26. NRC, National Research Council. 1998. Page 23 in Guide for the care and use of laboratory animals. Publication Nº 85, NIH, Washington, D.C.
27. Luenberger, D. 1984. Linear and Nonlinear Programming, 2nd Edition. Addison Wesley, Reading, Massachusetts.
28. Wood, P.D.P. 1969. Factors affecting the shape of the lactation curve in cattle. Anim. Prod. 11: 307-312.
29. Auran, T., 1973. Studies on monthly and cumulative monthly milk yield records. 1. The effect of age, month of calving, herd and length of the first period. Acta Agric. Scand. 23: 189-199.
30. MATLAB, 1997. The Language of Technical Computing. The MathWorks, Inc.
31. Ljung, L., 1999. System Identification: Theory for the User. 2nd ed. Prentice Hall.
32. Bishop, C.M., 1996. Neural Networks for Pattern Recognition. Clarendon Press.
33. Orr, G. B., Müller, K. R., 1998. Neural Networks: Tricks of the Trade. Springer-Verlag, Berlin, Heidelberg.
34. Refenes, A. N., Zapranis, A., Francis, G., 1994. Stock performance modeling using neural networks: comparative study with regression models. Neural Networks 7(2): 375-388.
35. Sarle, W. S., 2000. How to measure importance of inputs? Available from ftp://ftp.sas.com/pub/neural/importance.html. Accessed Jan. 2003.
36. Fernández, C., Lachica, M., Garcés, C., Aguilera, J. 2004. Necesidades nutritivas del ganado caprino lechero. Ganado Caprino, ed. Agrícola Española. 312 pp.
Table 1. Mean and standard deviation (σ) for the training and validation data sets.

                           Mean     σ
Training group (n=22)
  Number of lactations      4.87   1.12
  Litter size               1.96   0.63
  Number of controls        8.68   2.71
  Metabolic weight, kg     17.20   1.49
  Milk yield, kg/d          2.21   0.82
Validation group (n=13)
  Number of lactations      4.96   1.64
  Litter size               1.81   0.49
  Number of controls        7.92   3.92
  Metabolic weight, kg     16.77   1.14
  Milk yield, kg/d          2.10   0.65
Table 2. Performance indexes for the best NN1 model obtained, a trivial model [MY(t+1)=MY(t)]2 for the regression methodology, and a first order autoregressive model [AR(1)] for the forecasting methodology.

Model            Data set     MAE3   RMSE4   ME5     a6     b6
Best NN          Validation   0.28   0.36   -0.03   0.83   0.41
                 Training     0.25   0.32    0.00   0.73   0.57
MY(t+1)=MY(t)    Validation   0.32   0.42   -0.02   0.79   0.42
                 Training     0.29   0.38    0.03   0.90   0.18
AR(1)            Validation   0.30   0.41   -0.02   0.70   0.46
                 Training     0.28   0.37   -0.02   0.80   0.47

1 NN = Neural network. 2 MY = Milk yield. 3 MAE = Mean absolute error. 4 RMSE = Root mean square error. 5 ME = Mean error. 6 a and b are the slope and intercept of the regression line between the predicted and the actual signal, respectively.
Table 3. Relative importance value1 for the different inputs chosen by the ten most accurate neural models in the training data set.

Neural     Number of   Litter   Days between     Type of   Metabolic   Present   Previous
network2   lactation   size     partum and       diet      weight      MY        MY
                                first control
1          5           6        3                7         4           1         2
2          7           6        4                3         5           1         2
3          6           4        3                7         5           1         2
4          7           5        4                3         6           1         2
5          6           4        5                3         7           1         2
6          6           5        4                3         7           1         2
7          6           5        3                4         7           1         2
8          5           6        4                3         7           1         2
9          7           5        4                3         6           1         2
10         4           5        7                3         6           2         1

1 Value 1: the most important; value 7: the least important.
2 NN are ranked according to their accuracy.
Table 4. Relative importance value1 for the different inputs chosen by the ten most accurate neural models in the validation data set.

Neural     Number of   Litter   Days between     Type of   Metabolic   Present   Previous
network2   lactation   size     partum and       diet      weight      MY        MY
                                first control
1          5           6        3                7         4           1         2
2          5           6        3                4         7           1         2
3          5           6        3                7         4           1         2
4          7           3        5                4         6           1         2
5          5           6        3                4         7           1         2
6          5           6        3                4         7           1         2
7          5           6        3                4         7           1         2
8          5           6        3                4         7           1         2
9          7           5        4                3         6           1         2
10         4           7        6                3         5           1         2

1 Value 1: the most important; value 7: the least important.
2 NN are ranked according to their accuracy.
Fig. 1. Scheme of a neuron; xi are the inputs (with a fixed bias input of +1.0) and wi are the neuron weights.
Fig. 2. Scheme of a multilayer perceptron (inputs, first hidden layer, second hidden layer and output layer).
Fig. 3. RMSE (kg/d) histograms obtained for the validation data set using one-hidden-layer NN models; (a) on-line training using the current MY, (b) on-line training using the current and the previous week's MY, (c) batch training using the current MY, (d) batch training using the current and the previous week's MY.
Fig. 4. Performance indexes (RMSE, MAE, ME, slope a) obtained for each goat in the training group using the best model (batch training using the current and the previous week's MY).
Fig. 5. Performance indexes (RMSE, MAE, ME, slope a) obtained for each goat in the validation group using the best model (batch training using the current and the previous week's MY).
Fig. 6. Next week's MY () and the prediction provided by the best neural network (o) obtained for two goats from the training data set (a and c) and two from the validation data set (b and d), showing good (a and b) and poor (c and d) results.