A New Multi-Stage Feature Selection and Classification Approach: Bank Customer Credit Risk Scoring
Journal of Industrial Engineering International
Volume 17, Issue 1, June 2021, Pages 78-87
Article type: Original Article
DOI: 10.30495/jiei.2021.1919589.1079
Author
Farshid Abdi*
Islamic Azad University
Abstract
A great deal of customer information is stored in bank databases. These databases can be used to assess credit risk. Feature selection is a well-known technique for reducing the dimensionality of such databases. In this paper, a multi-stage feature selection approach is proposed to reduce the dimensionality of an Iranian bank's database, which contains 50 features. The first stage removes correlated features. The second stage selects the important features using a genetic algorithm. The third stage weights the variables using different filtering methods. The fourth stage selects features through a clustering algorithm. Finally, the selected features are fed into the K-nearest neighbor (K-NN) and Decision Tree (DT) classification algorithms. The aim of the paper is to predict the likelihood of risk for each customer based on an effective and optimal subset of the available customer features.
Keywords
Clustering; Credit risk prediction; Filtering method; Genetic algorithm; Hybrid feature selection
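To make the pipeline in the abstract more concrete, the following is a minimal sketch of only two of its steps: removing correlated features (the first stage) and classifying with K-NN (the final step). It is not the author's implementation. The 0.9 correlation threshold, the greedy keep-first rule, the function names, and the toy data are all assumptions made for illustration, and the genetic algorithm, filter weighting, clustering, and Decision Tree stages are omitted.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def drop_correlated(rows, threshold=0.9):
    """Return indices of columns kept after greedy correlation filtering.

    A column is kept only if its absolute correlation with every
    previously kept column is at most `threshold` (assumed value).
    """
    columns = list(zip(*rows))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


def project(rows, kept):
    """Keep only the selected column indices in every row."""
    return [[row[j] for j in kept] for row in rows]


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data invented for illustration: column 1 is exactly 2 x column 0.
X = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [8.0, 16.0, 2.0],
    [9.0, 18.0, 6.0],
    [10.0, 20.0, 3.0],
]
y = ["low", "low", "low", "high", "high", "high"]

kept = drop_correlated(X)          # redundant column 1 is dropped
X_reduced = project(X, kept)
prediction = knn_predict(X_reduced, y, [8.5, 3.0], k=3)
```

In this sketch the query must be expressed in the reduced feature space, so a new customer record would first be passed through `project` with the same `kept` indices before classification.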