Developing Financial Distress Prediction Models Based on Imbalanced Dataset: Random Undersampling and Clustering Based Undersampling Approaches

Razavi Ghomi, Seyed Behrooz; Mehrazin, Alireza; Shourvarzi, Mohammad Reza; Masih Abadi, Abolghasem

doi:10.22034/amfa.2022.1956898.1743

Advances in Mathematical Finance and Applications
Articles in Press, Accepted Manuscript, Available Online from 19 July 2022 (28 Tir 1401)
Article type: Research Paper
Digital Object Identifier (DOI): 10.22034/amfa.2022.1956898.1743
Authors
Seyed Behrooz Razavi Ghomi¹; Alireza Mehrazin*¹; Mohammad Reza Shourvarzi¹; Abolghasem Masih Abadi²
¹Department of Accounting, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran
²Department of Accounting, Sabzevar Branch, Islamic Azad University, Sabzevar, Iran
Abstract
So far, distress prediction models have been based on balanced data; such sampling is not consistent with the actual population of companies. If the data are balanced, sample selection bias may lead to an underestimation of the Type I error and an overestimation of the Type II error of the models. Although imbalanced data-based models are consistent with reality, they have a higher Type I error than balanced data-based models, and the cost of a Type I error matters more to beneficiaries than the cost of a Type II error. In this study, random and clustering-based undersampling were used to reduce the Type I error of imbalanced data-based models. The test data included 760 companies from the period 2007-2007 with four different degrees of imbalance. The results of testing H1 to H3 show that, in all cases, the Type I error of balanced data-based models was lower and their Type II error higher than those of imbalanced data-based models; in most cases, the geometric mean of balanced data-based models was also higher. The results of testing H4 to H6 show that, in most cases, models based on modified imbalanced data had a lower Type I error, a higher Type II error, and a higher geometric mean than models based on imbalanced data. In other words, applying undersampling methods to imbalanced training data decreased the Type I error and increased both the Type II error and the geometric mean. As a result, models based on modified imbalanced data are recommended to beneficiaries.
Keywords
Imbalanced datasets; Undersampling; Financial distress prediction models; Financial ratios; Machine learning
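
The two undersampling approaches and the three evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: label 1 marks a distressed firm and label 0 a healthy one; a Type I error is a distressed firm classified as healthy and a Type II error is a healthy firm classified as distressed; the geometric mean is the square root of (1 - Type I) times (1 - Type II); and clustering-based undersampling is taken to mean running k-means on the healthy (majority) firms and keeping the real firm nearest to each centroid. The function names and the toy financial ratios are hypothetical.

```python
import math
import random


def error_rates(y_true, y_pred):
    """Return (type_i, type_ii, g_mean). Assumes 1 = distressed, 0 = healthy."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    type_i = fn / n_pos    # distressed firms missed (assumed definition)
    type_ii = fp / n_neg   # healthy firms flagged (assumed definition)
    g_mean = math.sqrt((1 - type_i) * (1 - type_ii))
    return type_i, type_ii, g_mean


def random_undersample(X, y, seed=0):
    """Keep all distressed firms and an equally sized random draw of healthy firms."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    kept = rng.sample(majority, len(minority))  # needs len(majority) >= len(minority)
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def _dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def cluster_undersample(X, y, seed=0, n_iter=20):
    """Cluster healthy firms with a small k-means (k = number of distressed firms)
    and keep the real healthy firm nearest to each centroid (one assumed variant)."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    k = len(minority)
    centroids = [list(X[i]) for i in rng.sample(majority, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for i in majority:
            nearest = min(range(k), key=lambda c: _dist2(X[i], centroids[c]))
            groups[nearest].append(i)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(centroids[c])
                centroids[c] = [
                    sum(X[i][d] for i in members) / len(members) for d in range(dim)
                ]
    # Two centroids may share a nearest firm, so at most k healthy firms are kept.
    kept = {min(majority, key=lambda i: _dist2(X[i], centroids[c])) for c in range(k)}
    idx = sorted(minority + list(kept))
    return [X[i] for i in idx], [y[i] for i in idx]


# Toy data (made up): two financial ratios per firm, 2 distressed vs 6 healthy.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9],
     [0.85, 0.75], [0.7, 0.95], [0.95, 0.7], [0.75, 0.85]]
y = [1, 1, 0, 0, 0, 0, 0, 0]

X_rand, y_rand = random_undersample(X, y)
X_clus, y_clus = cluster_undersample(X, y)
print("random undersampling labels:", y_rand)
print("cluster-based undersampling labels:", y_clus)
print("error rates:", error_rates([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

In this sketch only the training data would be resampled; the error rates would still be computed on an untouched, imbalanced test set, which matches the abstract's description of applying undersampling to imbalanced training data.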

