Second International Symposium on Information Science and Engineering
Ant colony algorithm used for bankruptcy prediction
Shuihua Wang1*, Lenan Wu1, Yudong Zhang1,2,3, Zhengyu Zhou2,3
(1 School of Information Science and Engineering, Southeast University, Nanjing, China)
(2 Brain Image Lab, Division of Child Psychiatry, New York State Psychiatric Institute, New York, USA)
(3 Medical Center, Columbia University, New York, USA)
shortcomings in building and using the model. First, finding an appropriate NN model that reflects the characteristics of the problem is an art, because there are numerous network architectures, learning methods, and parameters. Second, the user cannot readily comprehend the final rules that NN models acquire; for this reason NNs are often referred to as 'black boxes'. Thus, a novel rule extraction method is proposed in this paper to overcome these shortcomings, and the ant colony algorithm (ACA) is used to find the optimal rule.

ACA is a recently developed algorithm [10]. It mimics the techniques employed by real ants to rapidly establish the shortest route between a food source and their nest. Ants start searching the area surrounding their nest in a random manner. When an isolated ant comes across some food, it deposits a quantity of pheromone at that location. Other ants in the neighborhood can detect this marked pheromone trail. More ants follow the pheromone-rich trail, and the probability of the trail being followed by other ants is further enhanced by the increased pheromone deposition. It is this autocatalytic process, characterized by a positive feedback mechanism, that helps the ants establish the shortest route.

II. INTRODUCTION OF ACA
The ACA is inspired by the observation of real ant colonies. Ants are social insects living in colonies with interesting foraging behavior. In particular, ants can find the shortest path between a food source and the nest. While walking between food sources and the nest, ants deposit on the ground a substance called pheromone, forming a pheromone trail. Ants can smell pheromone and, when choosing their way, tend to choose paths marked by strong pheromone concentrations.

A. Initialization
Consider an undirected graph G=(N,A), where N is the set of nodes and A is the set of arcs connecting the nodes.
The density of the nodes determines the precision of a solution, but also the memory and computation time demands of the algorithm. All arcs in A are initialized with a small amount of pheromone τ0. Our target is to find the shortest path from the source node N1 to the destination node N2.

B. Probability Calculation
In the second step, C ants are sequentially launched from N1, where C is the number of ants in the colony. Each ant walks pseudo-randomly from node to node via connecting arcs until N2 or a dead end is reached.
Abstract—Bankruptcy prediction is a hot topic. Traditional methods consist of univariate models and multivariate models such as neural networks (NNs). However, NNs cannot extract effective rules. Thus, a novel approach is proposed in this paper to extract rules. First, the t-test method is used to select 5 features from 55 original features. Second, the rule encoding is constructed. Third, the ant colony algorithm is utilized to find the optimal rule. Experiments on 200 firms demonstrate that the proposed algorithm is effective and rapid.
Keywords-Bankruptcy prediction; ant colony optimization
I. INTRODUCTION
Corporate bankruptcy is a very important economic phenomenon. The health and success of firms are of widespread concern to policy makers, industry participants, investors, and managers [1]. Bankruptcy is also a problem that affects the economy of a country, and it can be considered an index of the development and robustness of the economy. The high individual, economic, and social costs incurred in corporate failures or bankruptcies have spurred searches for better understanding and prediction capability [2]. Prediction of corporate bankruptcy is of increasing interest to investors, creditors, borrowing firms, and governments alike. Timely identification of a firm's impending failure is indeed desirable [3].

Several methods have been used for predicting bankruptcy. Early research focused primarily on one-variable models based on individual financial ratios. The ratios were used individually, and a cutoff score was calculated for each ratio on the basis of minimizing misclassification. Despite their considerable results, the one-variable methods were later criticized because the ratios are correlated with one another and different ratios can give conflicting signals for the same firm [4]. Later research turned to multivariable models that used statistical techniques such as multiple discriminant analysis (MDA) [5], logit [6], and probit [7]. Recently, however, much research has shown that artificial intelligence methods such as neural networks (NNs) can be an alternative methodology for classification problems to which traditional statistical methods have long been applied [8][9]. Although many studies and experiments have demonstrated the usefulness of NNs, there are some

978-0-7695-3991-1/09 $26.00 © 2009 IEEE DOI 10.1109/ISISE.2009.11
Tab. 1 Encoding mechanism
| Antecedent Element 1 | ... | Antecedent Element n | Consequent Element |
|---|---|---|---|
| CV1min, CV1max | ... | CVnmin, CVnmax | Bankrupt |

When deciding which node j to move to from a specific node i, the probability Pij is assigned as follows:

$$P_{ij} = \frac{\tau_{ij}(t)^{\alpha}\,\eta_{ij}(t)^{\beta}}{\sum_{j} \tau_{ij}(t)^{\alpha}\,\eta_{ij}(t)^{\beta}} \qquad (1)$$
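As an illustration of Eq. (1), the following minimal Python sketch computes the transition probabilities for the arcs leaving one node and samples the next node. The function names, the dictionary layout, and the roulette-wheel sampling are assumptions made for this example; the paper does not specify an implementation.

```python
import random


def transition_probabilities(tau, eta, alpha, beta):
    """Return P_ij for each candidate next node j, following Eq. (1).

    tau and eta map each candidate node j to the pheromone level and the
    heuristic visibility of the arc (i, j).
    """
    weights = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in tau}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next_node(probabilities, rng=random):
    """Sample the next node by roulette-wheel selection (assumed scheme)."""
    nodes = list(probabilities)
    weights = [probabilities[j] for j in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Hypothetical example: node i has three neighbours a, b and c.
tau = {"a": 1.0, "b": 2.0, "c": 1.0}
eta = {"a": 1.0, "b": 1.0, "c": 2.0}
probs = transition_probabilities(tau, eta, alpha=1, beta=3)
next_node = choose_next_node(probs)
```

With β larger than α, as in this example, the visibility term dominates, so neighbour c is favoured even though arc b carries more pheromone.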
This operator randomly flips some of the bits in a chromosome. Mutation can occur at each bit position in a string with some probability, usually very small.

In Eq. (1), τij(t) is the amount of pheromone available at time t on the arc from node i to node j, ηij(t) is the heuristic function value (visibility), and α and β control the relative importance of trail and visibility, respectively.

C. Global Update for Pheromone Trails
The pheromones of all arcs are updated according to the following rule:

$$\tau_{ij}(t+1) = \begin{cases} (1-\rho)\,[\tau_{ij}(t) + Q], & (i,j) \in \text{BestRoute} \\ (1-\rho)\,\tau_{ij}(t), & (i,j) \notin \text{BestRoute} \end{cases} \qquad (2)$$

where ρ is the pheromone evaporation rate and Q is the amount of pheromone added to the arcs of the best route. At each iteration, the best route is selected from the C routes. The pheromone on the best route is then reinforced, while the pheromone on all other arcs evaporates. Note that some models also apply a local update; however, models with a local update cannot guarantee convergence.

D. Termination
Steps B and C are repeated for a fixed number of iterations or until the desired solution is reached. After the algorithm terminates, the stored solution of the very best ant indicates the shortest path between N1 and N2.

III. RULE EXTRACTION USING ACA
Assume the classification problem contains c classes in an n-dimensional pattern space, and there are p vectors Predictioni = [xi1, xi2, ..., xin] (i = 1, 2, ..., p).
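Looking back at the global update of Section II.C, Eq. (2) can be sketched in Python as follows. The function name, the arc-keyed dictionary, the tiny four-arc graph, and the initial pheromone value are hypothetical; ρ = 0.1 and Q = 500 are the values the paper lists in Tab. 2.

```python
def global_update(tau, best_route, rho, q):
    """Apply Eq. (2) to every arc and return the new pheromone table.

    tau maps an arc (i, j) to its pheromone level; best_route is the list
    of nodes visited by the best ant of the current iteration.
    """
    best_arcs = set(zip(best_route, best_route[1:]))
    # The graph is undirected, so (i, j) and (j, i) denote the same arc.
    best_arcs |= {(j, i) for i, j in best_arcs}
    return {
        arc: (1 - rho) * (level + q) if arc in best_arcs else (1 - rho) * level
        for arc, level in tau.items()
    }


tau0 = 0.1  # hypothetical initial pheromone level
tau = {
    ("N1", "x"): tau0,
    ("x", "N2"): tau0,
    ("N1", "y"): tau0,
    ("y", "N2"): tau0,
}
tau = global_update(tau, best_route=["N1", "x", "N2"], rho=0.1, q=500)
```

Arcs on the best route receive the deposit Q before evaporation is applied, while all other arcs only evaporate, which is what concentrates later ants on the best route found so far.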
B. Fitness Evaluation of Rules
To measure the prediction (classification) accuracy of a rule, the fitness function is taken from reference [13]:

$$f = \frac{\text{Number of bankrupt firms predicted correctly}}{\text{Number of all bankrupt firms}} \qquad (3)$$

Note that the fitness value is to be maximized.

C. Parameter setting
The parameter settings of the ACA were obtained by trial and error and are listed in Tab. 2.

Tab. 2 Parameter settings
| Parameter | Value |
|---|---|
| Population size | 100 |
| Stagnation limit | 10 |
| Generation limit | 200 |
| α | 1 |
| β | 3 |
| ρ | 0.1 |
| Q | 500 |

IV. EXPERIMENTS AND CONCLUSION
A. Data preprocessing
The data set contains 200 externally audited mid-sized manufacturing firms, 100 of which filed for bankruptcy and 100 of which did not during the period 2006-2008. We then select five financial variables using the t-test method [14] from 55 variables chosen by factor analysis for refinement, namely net income to stockholders' equity, quick ratio, retained earnings to total assets, stockholders' equity to total assets, and financial expenses to assets. The data set is split into two subsets, a training set and a validation set, containing 90% and 10% of the entire data, respectively. The training data are used for learning rules, and the validation data, which have not been used to develop the system, are used to test the results.

B. Rule generation
Our ant colony search process finally extracts the bankruptcy prediction rule. The generated rule and its description are given in Tab. 3.
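The preprocessing described under A. Data preprocessing can be sketched as below. This is only an illustration under stated assumptions: the paper says a t-test is used but gives no details, so the choice of Welch's t statistic, the ranking by absolute value, the random shuffling, and all function names are assumptions of this example.

```python
import random
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    se2 = variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / se2 ** 0.5


def select_features(bankrupt, healthy, k):
    """Return the indices of the k features with the largest |t|."""
    scores = []
    for f in range(len(bankrupt[0])):
        t = welch_t([row[f] for row in bankrupt], [row[f] for row in healthy])
        scores.append((abs(t), f))
    return sorted(f for _, f in sorted(scores, reverse=True)[:k])


def split_data(rows, train_fraction=0.9, seed=0):
    """Shuffle the rows and split them into training and validation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(train_fraction * len(rows)))
    return rows[:cut], rows[cut:]


# Hypothetical toy data: feature 0 separates the groups, feature 1 does not.
bankrupt = [[0.10, 5.0], [0.20, 5.1], [0.15, 4.9]]
healthy = [[0.90, 5.0], [1.00, 5.2], [0.95, 4.8]]
selected = select_features(bankrupt, healthy, k=1)
train, validation = split_data(range(200))
```

With 200 firms and a training fraction of 0.9, the split yields the 180 training cases and 20 test cases reported in the paper.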
A. Encoding mechanism
An IF-THEN rule is represented as follows:

Bankruptcy Rule: IF (CV1min ≤ xi1 ≤ CV1max) AND (CV2min ≤ xi2 ≤ CV2max) AND ... AND (CVnmin ≤ xin ≤ CVnmax) THEN the firm will go bankrupt

where n is the number of attributes, and CVkmin and CVkmax are the minimum and maximum bounds of the kth attribute xik, respectively. The rule is then encoded as shown in Tab. 1.
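A minimal sketch of how such an interval rule could be evaluated, together with the fitness of Eq. (3), is given below. Representing a rule as a list of (CVkmin, CVkmax) pairs, the function names, and the toy data are assumptions for illustration only.

```python
def rule_fires(rule, firm):
    """Return True if every attribute lies within its [CVkmin, CVkmax] bounds."""
    return all(lo <= x <= hi for (lo, hi), x in zip(rule, firm))


def fitness(rule, firms, labels):
    """Eq. (3): bankrupt firms predicted correctly / all bankrupt firms."""
    bankrupt = [firm for firm, is_bankrupt in zip(firms, labels) if is_bankrupt]
    if not bankrupt:
        return 0.0
    return sum(rule_fires(rule, firm) for firm in bankrupt) / len(bankrupt)


# Hypothetical two-attribute rule and four firms (True marks a bankrupt firm).
rule = [(0.1, 0.5), (0.2, 0.8)]
firms = [(0.3, 0.4), (0.6, 0.4), (0.2, 0.9), (0.3, 0.3)]
labels = [True, True, True, False]
score = fitness(rule, firms, labels)
```

In this toy case the rule fires for only one of the three bankrupt firms, so the fitness is 1/3. Note that Eq. (3) counts bankrupt firms only, so a rule that fires on non-bankrupt firms is not penalized by this measure.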
Tab. 3 Rule and meaning
| | Antecedent Element 1 | Antecedent Element 2 | Antecedent Element 3 | Antecedent Element 4 | Antecedent Element 5 |
|---|---|---|---|---|---|
| | CV1min, CV1max | CV2min, CV2max | CV3min, CV3max | CV4min, CV4max | CV5min, CV5max |
| Rule | 0.132, 0.660 | 0.220, 0.803 | 0.103, 0.687 | 0.389, 0.683 | 0.234, 0.579 |

Meaning: IF net income to stockholders' equity is between 0.012 and 0.560 AND quick ratio is between 0.220 and 0.803 AND retained earnings to total assets is between 0.103 and 0.687 AND stockholders' equity to total assets is between 0.389 and 0.683 AND financial expenses to sales is between 0.234 and 0.579, THEN the firm will go bankrupt.

[Fig. 1 Classification accuracy against epoch. Horizontal axis: epoch, 0 to 30. Vertical axis: classification accuracy, 0 to 0.8.]

Fig. 1 plots the objective score against the epoch for the ACA. The ACA finds the optimal rule in fewer than 30 generations.

C. Rule validation

Tab. 4 Performance of the derived rule
| Data source | Classification accuracy |
|---|---|
| Train (180 cases) | 79.1% |
| Test (20 cases) | 76.3% |

The accuracy of the bankruptcy prediction calculated from the simulation results is summarized in Tab. 4. In Tab. 4, the train set entry is the rate of correct prediction when the rule is fired, while the test set entry is the overall prediction accuracy on that set.

V. CONCLUSION
We applied ACA to extract rules that can predict corporate failure. The derived rule reaches a classification accuracy of 79.1% on the training set and 76.3% on the test set, which shows that rule extraction via ACA is a workable approach to bankruptcy prediction. Future work is to improve the prediction accuracy and to speed up the algorithm.

REFERENCE
[1] D. E. O'Leary. "Using neural networks to predict corporate failure". International Journal of Intelligent Systems in Accounting, Finance and Management, vol. 7, pp. 187-197, 1998.
[2] F. Noorbakhsh. "A modified human development index". World Development, vol. 26, pp. 517-528, 1998.
[3] T. E. McKee and T. Lensberg. "Genetic programming and rough sets: A hybrid approach to bankruptcy classification". European Journal of Operational Research, vol. 138, pp. 603-614, 2002.
[4] B. Boardman and J. J. Perry. "Access to gambling and declaring personal bankruptcy". Journal of Socio-Economics, vol. 36, pp. 789-801, 2007.
[5] H. Etemadi, A. A. A. Rostamy, and H. F. Dehkordi. "A genetic programming model for bankruptcy prediction: Empirical evidence from Iran". Expert Systems with Applications, vol. 36, pp. 3199-3207, 2009.
[6] T. Johnsen and R. W. Melicher. "Predicting corporate bankruptcy and financial distress: Information value added by multinomial logit models". Journal of Economics and Business, vol. 46, pp. 269-286, 1994.
[7] C. Lennox. "Identifying failing companies: A re-evaluation of the logit, probit and DA approaches". Journal of Economics and Business, vol. 51, pp. 347-364, 1999.
[8] R. Bauer, A. Agarwal, and R. Leach. "Predicting the outcome following bankruptcy filing: A three-state classification using neural networks". Intelligent Systems in Accounting, Finance and Management, vol. 6, pp. 177-194, 1997.
[9] T. Bell. "Neural nets or the logit model: A comparison of each model's ability to predict commercial bank failures". Intelligent Systems in Accounting, Finance and Management, vol. 6, pp. 249-264, 1997.
[10] M. Dorigo, G. Di Caro, and L. M. Gambardella. "Ant algorithms for discrete optimization". Artificial Life, vol. 5, pp. 137-172, 1999.
[11] F. M. Zhu and S. U. Guan. "Cooperative co-evolution of GA-based classifiers based on input decomposition". Engineering Applications of Artificial Intelligence, vol. 10, pp. 01-09, 2008.
[12] P. S. Shelokar, V. K. Jayaraman, and B. D. Kulkarni. "An ant colony classifier system: Application to some process engineering problems". Computers & Chemical Engineering, vol. 28, pp. 1577-1584, 2004.
[13] M. H. Tseng, S. J. Chen, G. H. Hwang, and M. Y. Shen. "A genetic algorithm rule-based approach for land-cover classification". Photogrammetry & Remote Sensing, vol. 63, pp. 202-212, 2008.
[14] C. F. Tsai. "Feature selection in bankruptcy prediction". Knowledge-Based Systems, vol. 22, pp. 120-127, 2009.