Genetic Algorithms for Small Enterprises Default Prediction: Empirical Evidence from Italy
Handbook of Soft Computing: Intelligent Algorithms in Engineering, Management, and Technology (P. Vasant, Ed.), IGI Global
Niccolò Gordini, University of Milan-Bicocca, Italy

ABSTRACT
Company default prediction is a widely studied topic, as it has a significant impact on banks and firms. Moreover, in the wake of the global financial crisis, there is a need for ever more advanced methods (such as soft computing techniques) that can detect the signs of financial distress in time to evaluate firms, especially small firms. Thus, we propose a genetic algorithms (GAs) approach, a soft computing technique, and show how GAs can contribute to small enterprise default prediction modeling. We applied GAs to a sample of 6,200 Italian small enterprises three years and one year prior to bankruptcy. Subsequently, a multiple discriminant analysis and a logistic regression (the two main traditional techniques in default prediction modeling) were used to benchmark the GAs. Our results show that the best prediction results were obtained with GAs.

INTRODUCTION
Bankruptcy prediction has been extensively studied since the late 1960s (Altman, 1968, 1993, 2004; Altman et al., 2005; Altman & Sabato, 2005; Beaver, 1967, 1968; Berger, 2006; Berger & Frame, 2007; Blum, 1974). Analysis of the financial position of firms is very useful for firms and banks alike: it underpins credit risk assessment and helps maintain the stability of financial markets and general economic prosperity. The number of defaulting firms is, in fact, an important issue for the economy of every country and can be considered an index of the development and robustness of that economy (Zopounidis & Dimitras, 1998). Entrepreneurs, labor and labor organizations, policy makers, industry participants, investors, creditors, auditors, stockholders, and banks are all interested in bankruptcy prediction because it affects all of them alike (Etemadi et al., 2009; O'Leary, 1998; Ravi et al., 2008). For example, prediction models may be used by shareholders in choosing between the options open to them, such as divestment of the company or merging with, or takeover by, other companies. If the results are not satisfactory, shareholders can choose to sell the company at a price above its real value, while it still has the bargaining power to be sold at such a price. If, instead, shareholders wait in the hope of the company becoming solvent again, they may face substantially lower market values. Moreover, financial distress prediction models can be useful to firms operating in a district or cluster. Since the failure of such a firm would damage the reputation of the entire district and affect investment and credit decisions, the other firms in the district can monitor what is going on through estimation tools and, of course, can take over distressed firms where that would be beneficial for them.


Prediction models can be used by labor organizations to reveal information about the financial health of a company, on the basis of which they can negotiate pay rises and insist on other labor rights. Moreover, the employees of a company predicted to be financially distressed can work harder to help it recover, or seek another job for themselves (Aktan, 2011). Individual and institutional investors can use prediction methods to improve their investment decisions. An individual investor may not have enough time, experience, or information to evaluate investment opportunities; in such a situation, poor foresight leads to wrong decisions, and the expected return on investment cannot be achieved. Failure prediction models give investors a significant advantage, providing the time they need and making up for their experience and information deficits. With the help of prediction tools, investors can identify the poor stocks in their portfolios and sell them before their value evaporates. An investor can also reshape his or her portfolio using prediction tools: if a financially distressed company is expected to recover from its problems, then investing in its low-priced stocks can bring high returns later, when the company recovers. Thus, investors achieve their objectives by investing in the right areas, and companies have the opportunity to become stronger through a rational distribution of funds. Easier access to funding opens up opportunities for new investments and allows companies to grow and gain competitive advantages (Aktan, 2011). Therefore, timely identification of firms' impending failure is highly desirable (Jones, 1987), especially for banks and firms.
On the one hand, banks incur a loss of profit when the capital loaned to a firm is not paid back and, in the long run, they can themselves enter a danger zone if, over the medium to long term, many firms fail to repay their loans. With the help of a company default prediction model, banks can therefore choose the right companies to lend to. Evaluating loan applications with such a model yields faster and more reliable results and helps a bank reject the application of a company that the model predicts to be financially distressed, saving time and funds. Thus, funds are channeled to the right areas, from which both the country's economy and the credit institutions can gain great benefits. On the other hand, companies in distress find it even harder to access credit and will probably be unable to pay all their stakeholders (e.g. suppliers, employees). Consequently, other firms (e.g. supplier firms) are hit by the crisis, employees are laid off, and unemployment grows. This leads to a general deterioration of the economic and social stability and prosperity of the country. These high individual and social costs of corporate bankruptcies have driven many authors (Altman, 1968, 1993; Beaver, 1967, 1968; Berger, 2006; Berger & Frame, 2007; Blum, 1974; Martin, 1977; Ohlson, 1980; Odom & Sharda, 1990) towards a better understanding of, and capability to predict, default. However, these studies mainly focus on medium and large enterprises, which systematically produce detailed financial information, whilst only a small number of authors (Altman & Sabato, 2005; Behr & Güttler, 2007; Carter & Van Auken, 2006; Edmister, 1972; Pompe & Bilderbeek, 2005; Saurina & Trucharte, 2004) have highlighted that specific default prediction models are required to evaluate the risk profiles of small enterprises (SEs).
This is a significant gap, given that SEs have greater difficulty gaining access to credit, have markedly different characteristics from medium and large enterprises, and, at the same time, represent the heart of the economy of virtually every nation (Berger & Frame, 2007; Morrison et al., 2003). In Italy, SEs account for more than 90 percent of all firms and employ over 80 percent of the workforce. On the one hand, SEs can be more flexible and react more quickly to the increasing turbulence of global markets. In an SE, the owner, the entrepreneur, and the manager are often the same person. Thus, SE growth is the result of the clear intentions, actions, skills, aptitudes, and personality of the owner-entrepreneur-manager, driven by the belief that he or she can produce the desired outcomes (Morrison et al., 2003). The entrepreneur's demographic, psychological, and behavioral characteristics, as well as his or her managerial skills and technical know-how, are often cited as the most influential factors related
to the performance of an SE (Cooper & Gascon, 1992; Man et al., 2002). On the other hand, SE entrepreneurs find it difficult to cope with the wide range of managerial skills required in today's complex markets, and it is especially difficult to juggle the wide range of financial skills required. The weak financial structure of SEs is a well-known weakness of these firms. They are characterized by marked undercapitalization and by the fact that they usually have access neither to organized capital markets nor to medium-term bank credit. This leads to a high dependence on short-term bank loans as the predominant source of external financing. For Italian small firms, bank loans as a share of total liabilities ranged from 40 percent to 50 percent in 2012. Because the share of bank loans in total financing tends to decline as company size increases, the relative importance of bank financing is greater for SEs than for medium or large companies, which have access to other sources of external debt financing (e.g. corporate bonds). In addition, the bank-SE relationship is asymmetric, because banks have no incentive to provide SEs with detailed information about the way they calculate an individual SE's probability of default. The importance of bank loans is thus extraordinarily high, underlining the dependence of SEs on their bank relationships for funding. The use of an accurate default prediction model by SEs could reduce this asymmetric information problem: using such a model, an SE can self-assess and, consequently, gain a better and deeper understanding of how banks will evaluate it, of its expected probability of default, and of the likelihood of Type I and Type II errors. These specific characteristics of SEs, and the fundamental role they play in the economy of every country, justify attempts to construct a default prediction model suitable for SEs.
In light of the above, in this study we applied genetic algorithms (GAs), a soft computing technique, to a sample of 6,200 Italian SEs three years and one year before bankruptcy. The large size of the sample is an important strength for the reliability of our findings. We then compare the prediction accuracy rate of GAs to those of traditional methods, namely multivariate discriminant analysis (MDA) and logistic regression (LR), and investigate how the size of the sample influences the accuracy of the prediction model. This study contributes to the body of knowledge on this topic in several ways. Firstly, it starts with a brief overview of the evolution of the studies and methodologies used in bankruptcy prediction, from their origin in the late 1960s to today. Secondly, it applies a new and very promising method (genetic algorithms) to company default prediction modeling and compares it, with its major advantages and disadvantages, to the traditional, well-known statistical techniques of multivariate discriminant analysis and logistic regression. Thirdly, the model is applied to a sample of Italian SEs almost equal to the entire Italian SE population. Focusing the model on such a large sample allows us to tailor it to the real and specific needs of these firms and to strengthen the validity and robustness of the results obtained. Finally, our study provides significant financial advice for SE entrepreneurs. Since they face ever-growing financial difficulties and are constantly looking for funding and better conditions of access to credit, an accurate default prediction model tailored to this kind of firm is crucial. It helps entrepreneurs better understand the real status of their firm and how banks will assess it, making them aware, ex ante, of their creditworthiness and of the expected probability of both Type I and Type II errors.
Thus, using this model, SEs can reduce the asymmetric information problem they have with banks, putting themselves in a better position to access credit or renegotiate loan conditions on the basis of a larger set of information or, according to Behr & Güttler (2007), leading to a "reduction of search costs by increasing the probability of successfully changing bank relationships on better terms, or creating opportunities to tap into alternative sources of financing, thus enlarging the scope of the firm's financing options". The chapter is organized as follows. It begins with an overview of the relevant literature on default prediction modeling. In the following section, the hypotheses and the methodology of our study (the three models, the dataset, and the variable selection process) are briefly described. Next, the forecasting capabilities of the GAs model and those of classical techniques are compared and discussed. Later on, further research directions and managerial implications are
discussed. A conclusion section summarizes the study.

BACKGROUND
After Altman's 1968 study, many other empirical studies (Altman, 1968, 1993; Altman & Sabato, 2005; Beaver, 1967, 1968; Blum, 1974; Edmister, 1972; Ohlson, 1980) showed that financial data are useful for business default prediction modeling. From 1967 to 1980, multivariate discriminant analysis was the dominant technique in these studies (Altman, 1968; Beaver, 1968; Blum, 1974; Deakin, 1976; Edmister, 1972). Altman (1968) used a multiple discriminant analysis technique to assess the effectiveness of ratio analysis in predicting enterprise bankruptcy. He used a sample of 33 manufacturing firms that were declared bankrupt under Chapter X during the period 1946-1965, and a stratified sample of 33 non-failing manufacturing firms. The predictive ability of Altman's function, based on five financial ratios (selected from an original list of 22), was 79% one year before failure. Edmister (1972) examined 19 financial ratios and five methods of analysis, employing MDA to select the set of ratios and analytical methods that would best discriminate between future failure and non-failure of SEs. The results of this study confirmed the value of ratio analysis. Blum (1974) examined a sample of 115 industrial firms that failed in the years 1954-1968 with liabilities greater than $1 million, and a stratified sample of 115 non-failing firms that were similar with respect to industry, annual sales, and employee numbers. The predictive ability of this model was 93-95% for bankruptcy occurring within one year. The accuracy declined to 80% for failures occurring two years later and to 70% for failures three years later. However, MDA suffers from some limitations, such as its assumptions of linearity, normality, and independence among the variables (Barnes, 1982; Hamer, 1983; Karels & Prakash, 1987; McLeay & Omar, 2000).
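As a concrete illustration of such a discriminant function, the sketch below computes Altman's well-known five-ratio Z-score. The coefficients and the distress/grey/safe cut-offs are the published 1968 ones for public manufacturing firms; the firm's ratio values in the example are purely hypothetical.

```python
# Altman (1968) Z-score: Z = 1.2*X1 + 1.4*X2 + 3.3*X3 + 0.6*X4 + 1.0*X5
# X1 = working capital / total assets        X2 = retained earnings / total assets
# X3 = EBIT / total assets                   X4 = market value of equity / total liabilities
# X5 = sales / total assets

def altman_z(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta):
    return (1.2 * wc_ta + 1.4 * re_ta + 3.3 * ebit_ta
            + 0.6 * mve_tl + 1.0 * sales_ta)

def zone(z):
    # Published cut-offs: below 1.81 = distress zone, above 2.99 = safe zone.
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "grey"

# Hypothetical firm with weak profitability and thin capitalization.
z = altman_z(wc_ta=0.05, re_ta=0.10, ebit_ta=0.03, mve_tl=0.40, sales_ta=1.1)
print(round(z, 3), zone(z))  # 1.639, i.e. the distress zone
```

The single cut-off classification shown in `zone` is exactly the feature of MDA-style models that later probabilistic approaches, such as logistic regression, replace with a continuous probability of default.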
Considering that these assumptions are frequently violated by variables such as financial data (Deakin, 1976), which are neither linear nor normally distributed and (most importantly) are not completely independent of one another (Karels & Prakash, 1987; Martin, 1977), this technique can lose effectiveness and validity when the prediction variables are financial data. Moreover, Altman et al. (1981) identified four related problems in the use of MDA for classification: (1) the relative significance of the individual variables, (2) the reduction of dimensionality, (3) the elimination of insignificant variables, and (4) the existence of time series relationships. Despite the recognized limitations of linear classifiers, a common practice has been to accept the results as if the assumptions were satisfied. Hence, a number of studies have tested many other techniques in order to overcome the limitations of MDA and increase the default prediction accuracy rate. These methods can be grouped into two categories: traditional techniques and soft computing techniques. The first consists of logistic regression (Foreman, 2002; Martin, 1977; Ohlson, 1980), probit models (Casey et al., 1986), and linear probability models (Stone & Rasp, 1991; Vranas, 1992). Among these, logistic regression was the most used. It is a technique that does not require the strict distributional assumptions of MDA. Unlike MDA, it does not specify a cut-off point delineating bankrupt firms from non-bankrupt firms, but assigns each firm a probability of bankruptcy. Martin (1977) was the first author to apply it to predicting the probability of bankruptcy, in a study based on data from the Federal Reserve System. Ohlson (1980) employed the logit model to predict firm failure, using data from Moody's Manual, Compustat data tapes, and 10-K financial statements.
The classification accuracy reported by Ohlson was 96.12%, 95.55%, and 92.84% for prediction within one year, within two years, and within one or two years, respectively. Although LR avoids some of the major problems of MDA and can perform better than MDA in many applications, it assumes a logistic distribution, and when the relationships in the system are non-linear its accuracy decreases and the misclassification errors (Type I and Type II errors) increase (Kida, 1980; Coats & Fant, 1993). Aiming to limit the above-mentioned shortcomings of MDA and LR, researchers have begun to apply soft computing techniques. Neural Networks (Odom & Sharda, 1990), Genetic Algorithms (Back
et al., 1996; Etemadi et al., 2009; Kim & Han, 2003; Shin & Han, 1999; Shin & Lee, 2002; Tsakonas et al., 2006; Varetto, 1998), Support Vector Machines (Fan & Palaniswami, 2000; Van Gestel et al., 2003; Vapnik, 2005), Case-Based Reasoning (Bryant, 1997; Buta, 1994; Park & Han, 2002), and Hybrid Models (Ganesan et al., 2011, 2012; Min et al., 2006; Vasant, 2010; Vasant et al., 2009, 2010; Verikas et al., 2010) constitute this second group. As soft computing techniques are less vulnerable to the limitations of MDA and LR, thanks to their capability of identifying and representing non-linear, non-parametric relationships, recent studies have demonstrated that they are powerful tools for corporate bankruptcy prediction. Many authors (Coats & Fant, 1993; Fletcher & Goss, 1993; Lacher et al., 1996; Odom & Sharda, 1990; Tam & Kiang, 1992; Wilson & Sharda, 1994; Zhang et al., 1999) have shown artificial neural networks (ANNs) to be more effective than LR or MDA. The basic idea of ANNs is that they learn from examples, using various constructs and algorithms, much as a human being learns new things. Thus, depending on the network architecture, they are effective in function approximation, forecasting, classification, clustering, and optimization tasks. Odom and Sharda (1990) were the first to use ANNs for default prediction. Their model had an input layer with five nodes, one hidden layer, and one output layer, and used the same five financial ratios as Altman. Their research sample was made up of 65 firms that went bankrupt between 1975 and 1982, and 64 non-bankrupt firms. They compared their results to those obtained by MDA: the ANNs correctly classified 81.8% of the sample, whilst MDA classified only 74.28% correctly. Tam and Kiang (1992) used a set of financial ratios collected from a group of Texan banks (59 failed and 59 non-failed banks between 1985 and 1987).
They showed that an ANN with an MLP architecture trained by back-propagation is generally more accurate at prediction than MDA, LR, k-Nearest Neighbor (k-NN), and Interactive Dichotomizer 3 (ID3). Fletcher and Goss (1993) compared the prediction accuracy of ANNs with an LR model using three financial ratios and a sample of 36 bankrupt and non-bankrupt firms; the ANN model had higher prediction rates than the LR model for almost all risk index cutoff values. Coats and Fant (1993) compared ANNs and MDA, obtaining a classification accuracy in the range of 81.9% to 95.0% for the ANNs (depending on the horizon, from three years ahead to less than a year ahead), and in the range of 83.7% to 87.9% for MDA (also depending on the horizon). Wilson and Sharda (1994) compared a back-propagation-trained neural network (BPNN) with MDA and, using Altman's (1968) financial ratios, concluded that the BPNN outperforms MDA over all test samples. Although many other authors have shown the usefulness of ANNs in prediction studies, there are several drawbacks in building and using such models. It is not easy for the user to comprehend the final rules that the model acquires, and it is very difficult to find the correct ANN model because there are many network architectures, learning methods, and parameters to choose from. Many neural network architectures need a great deal of training data and many training cycles (iterations). In addition, the determination of the various parameters associated with the training algorithms is not straightforward. Consequently, authors have begun to apply other methods, such as Interactive Dichotomizer 3 (ID3), case-based reasoning (CBR), and genetic algorithms, to company default prediction. According to Tam and Kiang (1992), the ID3 method creates a decision tree that properly classifies the training sample (Quinlan 1979, 1983, 1986).
This tree induction method has been applied to credit scoring (Carter & Catlett, 1987) and corporate failure prediction (Messier & Hansen, 1988). Frydman, Altman and Kao (1985) applied a similar technique, called recursive partitioning, to generate a discriminant tree. Both ID3 and recursive partitioning employ a non-backtracking splitting procedure that recursively partitions a set of examples into disjoint subsets; the methods differ in their splitting criteria. ID3 chooses splits that maximize information gain (the reduction in entropy) over the split subsets, while recursive partitioning is designed to minimize the expected cost of misclassification. Jo et al. (1997) compared DA, CBR, and BPNN in predicting the bankruptcy of Korean firms. Their variables were selected using dimension reduction techniques such as stepwise selection and the t-test. The prediction accuracy rates of DA, CBR, and BPNN were 82.22%, 81.52%, and 83.79%, respectively; they therefore concluded that BPNN outperformed DA and CBR. Bryant
(1997) applied a CBR system to default prediction, comparing CBR with Ohlson's (1980) logit model, using 25 financial variables and one-, two-, and three-year data. He concluded that the logit model outperformed CBR in terms of Type I error. Genetic algorithms mimic the Darwinian principles of natural selection and evolution to solve non-linear, non-convex global optimization problems. They have traditionally been used in optimization problems (Armin & Babak, 2011; Gaby et al., 2010; Ganesan et al., 2012; Leng et al., 2012; Pinkey et al., 2011; Provas Kumar & Dharmadas, 2012; Svancara et al., 2012; Vasant, 2012, 2013) but, with a few enhancements, they are also a new and promising method for improving default prediction accuracy (Back et al., 1996; Kingdom & Feldman, 1995; Shin & Han, 1999; Shin & Lee, 2002; Varetto, 1998; Walker et al., 1995). GAs offer several advantages over other techniques. First, GAs are adaptive algorithms (Holland, 1992), capable, in theory, of perpetual innovation. Second, GAs are capable of extracting rules that are easy for users to understand and comprehend, much like expert systems (Shin & Lee, 2002). Third, GAs require only fitness information, not gradient information. According to Mahfoud and Mani (1995), given a non-differentiable or otherwise ill-behaved problem, many traditional optimization techniques are of no use; since GAs require only fitness information, they can be used in such situations. Finally, GAs are good at finding the global optimum of a highly non-linear, non-convex function without becoming trapped in local minima (Ravi Kumar & Ravi, 2007). As Mahfoud and Mani (1995) highlighted, "GAs are designed to search highly nonlinear spaces for global optima. While traditional optimization techniques are likely to converge to a local optimum once they are in its vicinity, GAs conduct search from many points simultaneously, and are therefore more likely to find a global optimum".
The main disadvantages of GAs are that they can take a long time to converge and may not always yield the global optimal solution unless augmented by a suitable direct search method. Back, Laitinen and Sere (1996) compared MDA, LR, and GAs on a sample of 37 Finnish companies that failed between 1986 and 1989 and their non-failed mates. The best prediction results were achieved with GAs. Varetto (1998) compared GA performance with that of linear discriminant analysis (LDA), for (i) one year prior to bankruptcy and (ii) three years prior to bankruptcy. One year prior to bankruptcy, the genetic algorithm function yielded a 92% classification rate for bankrupt companies, while LDA yielded 90.1%. However, three years prior, LDA outperformed the genetic linear function in the case of sound companies. Shin and Han (1999) showed that GAs have a higher prediction accuracy rate (75%) than MDA, ID3, and case-based reasoning (CBR) models. Shin and Lee (2002) applied GAs to a sample of 528 mid-sized manufacturing firms (264 failed and 264 non-failed during the period 1995-1997) and demonstrated that GAs are a useful and promising tool for bankruptcy prediction modelling. Kim and Han (2003) used GAs for bankruptcy prediction on a sample of 772 Korean cases with six qualitative factors; comparisons of the rule-based outcome to similar ones derived from inductive learning and neural networks showed larger coverage of the data. A genetic programming approach to insurance company bankruptcy prediction was proposed by Salcedo-Sanz et al. in 2005, using a sample of 36 bankrupt and 36 non-bankrupt Spanish insurance firms and 21 financial ratios. Not all the ratios are used by the genetic programming approach to form the decision model, yet its accuracy is promising; comparisons are made with rough sets approaches.
Lensberg, Eilifsen and McKee (2006) developed a bankruptcy classification model using genetic programming (GP), a sample of 422 Norwegian firms for the period 1993-1998, and 28 potential variables (six of which proved significant). The results show that GP (81%) outperforms the LR model (77%). Huang, Tzeng and Ong (2006) compared the prediction accuracy rate of two-stage genetic programming (2SGP) with five other methods (ANNs, CART, C4.5, rough sets, and LR), using two real-world data sets compiled from a German sample and an Australian sample. The results of their study suggest that the 2SGP model outperforms the other models. Etemadi et al. (2009) analyzed a sample of 144 bankrupt and non-bankrupt Iranian firms, comparing genetic programming (GP) to the MDA model. GP obtained prediction accuracy rates of 94% and 90% on the training and hold-out samples, while the MDA model achieved 77%
and 73% accuracy rates on the training and hold-out samples, respectively. Bozsik (2010) found that a model based on genetic algorithms, using a two-step freezing method, achieves better classification accuracy than MDA. Recently, GAs have been extensively applied in conjunction with other soft computing techniques such as ANNs, CBR, and support vector machines (SVM). McKee and Lensberg (2002) proposed a hybrid intelligence approach combining genetic programming and rough sets, using a sample of 291 US firms for the period 1991-1997 and 11 variables to describe the cases. The rough set model obtained 100% classification accuracy on the training set and 67% on the hold-out sample, while the genetic programming model yielded 82.6% accuracy on the training set and 80.3% on the hold-out sample. The authors also analysed Type I and Type II errors using a second genetic model, trained with the same GP algorithm except that the hold-out sample of the first GP model was used as its training sample. These two models yielded 81% and 83% accuracy, respectively, over the entire sample. The authors concluded that the hybrid model reaches an accuracy of 80% on the validation set, while the simple rough set model performed considerably worse on the same data (67%). Pendharkar and Rodger (2004) applied a GA-based ANN to default prediction, using a real-value GA to learn the connection weights of an ANN and two datasets, the first taken from the study of Joachimsthaler and Stam (1988) and the second a real-life data set. They demonstrated that the GA-based ANN performed comparably with an ANN on the hold-out sample. Min et al. (2006) applied a hybrid genetic algorithm and support vector machine (GA-SVM) model to the bankruptcy prediction problem using a real dataset from Korean companies.
To evaluate the effectiveness of the GA-SVM model, the authors compared the results of the hybrid model to those obtained by three other models: LR, ANN, and pure SVM. Their results suggest that the GA-SVM model improves prediction accuracy compared to the other models.

HYPOTHESES
Based on the above literature review, we center our analysis around four hypotheses. Firstly, following the arguments briefly highlighted above, we would expect a higher prediction accuracy rate for GAs than for the traditional models (MDA and LR), both one and three years before default. Hence, we derive the following hypotheses:
H1: When compared to traditional methods, GAs give a better contribution to SE default prediction in terms of prediction accuracy rate and misclassification costs, both one and three years before default.
H2: When compared to MDA and LR, the GAs Type II error is significantly lower.
Hypotheses 3 and 4 add that the accuracy rate can be increased (and the misclassification costs decreased) if prediction models are calculated on the basis of SE size, since we would expect the size of the sample to influence the accuracy of a prediction model. Therefore:
H3: When the three models (GAs, MDA, and LR) are separately calculated according to size, they give a significantly higher level of overall prediction accuracy and significantly lower misclassification costs than when applied to the whole sample, and GAs continue to show the highest accuracy rate (and the lowest misclassification costs) compared to MDA and LR.
H4: When the three models are separately calculated according to size, GAs continue to show the greatest reduction of Type II error compared with MDA and LR.
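The error and cost measures referred to in H1-H4 can be sketched as follows. This is an illustrative implementation, not the study's own code: it assumes the usual convention in this literature (Type I error = a defaulting firm classified as healthy; Type II error = a healthy firm classified as defaulting), and the cost weights `c1` and `c2` are made-up placeholders rather than values from the study.

```python
# Labels: 1 = defaulting firm, 0 = healthy firm.

def error_rates(actual, predicted):
    """Return (Type I rate, Type II rate) under the convention above."""
    defaults = sum(1 for a in actual if a == 1)
    healthy = len(actual) - defaults
    type1 = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    type2 = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    return type1 / defaults, type2 / healthy

def misclassification_cost(actual, predicted, c1=10.0, c2=1.0):
    # Type I errors (lending to a future defaulter) are usually weighted
    # far more heavily than Type II errors, hence the asymmetric defaults.
    t1, t2 = error_rates(actual, predicted)
    return c1 * t1 + c2 * t2

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
t1, t2 = error_rates(actual, predicted)
print(t1, t2)  # 0.25 and 0.333...: one missed default, two false alarms
```

Comparing GAs, MDA, and LR on these two rates, rather than on overall accuracy alone, is what H2 and H4 require.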

METHODOLOGY


The aim of this study is to implement a genetic algorithms model that automatically extracts an intelligible classification rule for SE default prediction. To assess the prediction accuracy of the GAs model, we benchmarked it against MDA and LR models, these being the most prevalent methodologies.
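As a sketch of how the two benchmark models differ in their output (these are not the study's fitted models; every coefficient below is invented for illustration), an MDA-style model compares a linear score against a single cut-off, while LR passes the same kind of score through the logistic function to yield a probability of default:

```python
import math

COEFS = [-2.0, 3.5, -1.2]   # hypothetical intercept + weights for two ratios
CUTOFF = 0.0                # hypothetical MDA-style cut-off score

def linear_score(ratios):
    return COEFS[0] + sum(c * x for c, x in zip(COEFS[1:], ratios))

def mda_classify(ratios):
    # MDA: a hard classification at the cut-off point.
    return "bankrupt" if linear_score(ratios) > CUTOFF else "sound"

def lr_probability(ratios):
    # LR: each firm receives a probability of default in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_score(ratios)))

firm = [0.8, 0.5]           # hypothetical leverage and liquidity ratios
print(mda_classify(firm), round(lr_probability(firm), 3))
```

For this hypothetical firm the score is 0.2, so MDA labels it "bankrupt" outright, while LR reports a default probability of about 0.55, leaving the lending decision (and the Type I/Type II trade-off) to the chosen probability threshold.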

GAs Model
GAs, developed by Holland (1975), simulate Darwinian evolution. They are stochastic search techniques able to explore large and complicated spaces, based on ideas from natural genetics and evolutionary principles (Davis, 1991; Etemadi et al., 2009; Holland, 1975; Goldberg, 1989; Shin & Lee, 2002). GAs differ from other non-linear optimization techniques in that they search by maintaining a population (or, in this case, a database) of solutions from which better solutions are created, rather than making incremental changes to a single solution to the problem (Min et al., 2006). In a genetic algorithm, a population of strings (called chromosomes, the genotype of the genome), each encoding a candidate solution (an individual) to the problem, is evolved toward better solutions. The population of candidate solutions is maintained and, generation after generation, evolves until it converges at levels regarded as optimal. The evolution (initialization stage) usually starts from a population of randomly generated individuals and proceeds in generations. Each chromosome is evaluated using a user-defined fitness function; for real-world applications of GAs, choosing the fitness function is the most critical step. In each generation, the fitness of every individual in the population is evaluated, and multiple individuals are stochastically selected from the current population (based on their fitness) and modified (recombined and possibly randomly mutated) to form a new population, which is then used in the next iteration of the algorithm. The algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached. Three fundamental steps are used in most GAs: 1. Selection of better individuals. The initial population of individuals is generated randomly.
The selection process measures the adaptation of individuals in relation to the outside world and it is based on the principle of the individual’s adequacy to the needs of the outside world. According to Darwin (1859) it is not the strongest or the tallest or the fastest who survives but the fittest overall. Thus, in this step, strings with a higher fitness value have more change required to be selected and copied onto the next generation; 2. Crossover. Selection identifies which elements of a population survive to reproduce, a fact that leads to the recombination of genes. Starting from the individuals selected in the selection step, crossover (or genetic recombination) is a process of taking more than one parent solutions and producing a child solution from them. Thus, in this phase, two strings, called parent individual, are selected and a part of one string is combined with a part of another string, obtaining an even better string (called the offspring) after the merge. In this way, we hope to combine the best part of each of the strings to cause the population to evolve and to improve the result of the fitness function; 3. Mutation. This process, analogous to biological mutation, is used to maintain genetic diversity from one generation to the next. Mutation introduces further genetic change in the population or, at least, prevents the loss of it. Under mutation, a gene can obtain a value that did not occur in the population before, or that has been lost due to reproduction (Back et al., 1996). Mutation enriches the variety of individuals present in the population, preventing them from tending to be too uniform, hence losing the capacity to evolve (Varetto, 1998). Thus, in mutation, the solution may change entirely from the previous solution and GAs can arrive at a better solution by using mutation. Figure 1 describes and summarizes the typical steps of a GAs process (Etemadi et al. 2009; Shin & Lee, 2002; Tsakonas, 2006; Varetto, 1998).
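The three steps above can be sketched in a minimal genetic algorithm. This is an illustrative toy, not the chapter's actual model: the chromosome length, population size, mutation rate, and the one-max fitness (counting 1-bits) are invented for demonstration; a real default-prediction GA would replace the fitness with the classification rate of the decoded rule.

```python
import random

random.seed(42)

CHROM_LEN = 16        # bits per chromosome (binary encoding)
POP_SIZE = 20
GENERATIONS = 50
MUTATION_RATE = 0.01

def fitness(chrom):
    # Toy fitness: number of 1-bits. In a default-prediction model this
    # would be the classification performance of the encoded rule.
    return sum(chrom)

def select(pop):
    # Fitness-proportionate (roulette-wheel) selection of two parents.
    weights = [fitness(c) for c in pop]
    return random.choices(pop, weights=weights, k=2)

def crossover(p1, p2):
    # Single-point crossover producing one offspring.
    point = random.randint(1, CHROM_LEN - 1)
    return p1[:point] + p2[point:]

def mutate(chrom):
    # Flip each gene with a small probability.
    return [1 - g if random.random() < MUTATION_RATE else g for g in chrom]

# Initialization stage: random population of binary strings.
pop = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    new_pop = []
    while len(new_pop) < POP_SIZE:
        p1, p2 = select(pop)
        new_pop.append(mutate(crossover(p1, p2)))
    pop = new_pop

best = max(pop, key=fitness)
print(fitness(best))  # fitness of the best evolved individual
```

Note the termination criterion here is simply a maximum number of generations; as the text notes, a satisfactory-fitness stopping rule is an equally common alternative.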


Figure 1 Typical steps of GAs process


In setting up a GAs procedure, there are two essential choices: the genetic representation of solutions to the problem, and the definition of the evaluation (fitness) function. Since GAs operate on symbolic strings, the solutions to the problem to be solved must be represented by a code that the algorithm can manipulate. This genetic coding may take a variety of forms; given the object of our study, we use the binary alphabet, made up of sequences of 0s and 1s.

The fitness function is a way of measuring the performance of individuals in the population. For example, while MDA yields only one function as the result of the estimation process, with GAs we obtain a population of many functions (individuals). The fitness function is therefore designed to evaluate the classification performance of the individuals that make up the population, translating the quality of the solutions proposed by the GAs into numeric values based on their performance (Varetto, 1998).

In this study, we used GAs to find a set of default rules based on the sign and the cut-off value of the selected financial ratios. The type of rule we use is similar to that adopted by Bauer (1994), Mahfoud and Mani (1995), Shin and Lee (2002), and Varetto (1998). We apply GAs to find a threshold (cut-off) value for each of our financial ratios, above or below which a company is considered bankrupt. For example, if the model were based on only one variable, the rule would have the following structure: IF (R ≥ C) THEN Dangerous
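A threshold rule of this kind can be sketched as follows. This is a hedged illustration, not the chapter's implementation: the 8-bit encoding, the toy sample of firms, and all names are invented; the fitness shown is simply the classification rate of the rule at the decoded cut-off.

```python
def decode(chrom, lo, hi):
    # Map a binary string to a cut-off value C in the ratio's range [lo, hi].
    value = int("".join(map(str, chrom)), 2)
    return lo + (hi - lo) * value / (2 ** len(chrom) - 1)

def rule_fitness(chrom, ratios, defaulted, lo, hi):
    # Classification rate of the rule "IF R >= C THEN dangerous"
    # at the cut-off decoded from the chromosome.
    c = decode(chrom, lo, hi)
    correct = sum((r >= c) == d for r, d in zip(ratios, defaulted))
    return correct / len(ratios)

# Toy sample: one financial ratio R and a default flag per firm (invented data).
ratios    = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [False, False, False, False, True, True, True, True]

chrom = [1, 0, 0, 0, 0, 0, 0, 0]   # 8-bit chromosome, decodes near mid-range
print(decode(chrom, 0.0, 1.0))                          # ≈ 0.502
print(rule_fitness(chrom, ratios, defaulted, 0.0, 1.0)) # 1.0 on this toy sample
```

A GA would evolve a population of such chromosomes, using the classification rate as the fitness that drives selection toward better cut-off values.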


where: R is the financial ratio; C is the cut-off value, the threshold numeric value with respect to which the test (≥