Dimension Reduction of Big Data and Deleting Noise and Its Efficiency in the Decision Tree Method and Its Use in Covid 19 | ||
International Journal of Mathematical Modelling & Computations | ||
Article 4, Volume 12, Issue 3 (Summer) - Serial Number 47, Azar 2022, Pages 183-190 | ||
Article Type: Full Length Article | ||
DOI: 10.30495/ijm2c.2022.1947200.1239 | ||
Authors | ||
Fazel Badakhshan Farahabadi 1; Kianoush Fathi Vajargah* 2; Rahman Farnoosh 3 | ||
1Department of Statistics, Islamic Azad University, Science and Research Branch, Tehran, Iran | ||
2Department of Statistics, Islamic Azad University, Tehran North Branch | ||
3School of Mathematics, Iran University of Science and Technology, Tehran, 16844, Iran | ||
Abstract | ||
With the advance of science and technology, data are generated at high speed, and as their size and volume grow, analysts often face a great deal of extraneous, redundant, and noisy data that makes analysis difficult. Reducing the dimension of the data without losing useful information is therefore an important step in preparing data for data mining, and it can increase the speed and even the accuracy of the analysis. In this research, we present a dimension reduction method based on a copula function, which reduces the dimension of the data by identifying the dependence between variables. The copula function provides a good model of dependence for comparing multivariate distributions and so helps identify the relationships in the data. Specifically, by fitting an appropriate copula function to the data and estimating its parameter, we measure the structural correlation of the variables and eliminate variables that are highly structurally correlated with each other. As a result, the method presented in this study uses the copula function to identify noisy data and data with many common features and to remove them from the original data. | ||
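The elimination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian copula whose correlation parameter is estimated from normal scores of the ranks, and the function names `gaussian_copula_corr` and `select_columns`, the 0.9 threshold, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.stats import norm, rankdata


def gaussian_copula_corr(X):
    """Estimate a Gaussian copula correlation matrix from normal scores.

    Assumption: a rank-based (normal scores) estimator, chosen for
    simplicity; the paper may estimate the parameter differently.
    """
    n = X.shape[0]
    # Pseudo-observations in (0, 1): column-wise ranks scaled by n + 1.
    U = rankdata(X, axis=0) / (n + 1)
    # Map to standard-normal scores and take their Pearson correlation.
    Z = norm.ppf(U)
    return np.corrcoef(Z, rowvar=False)


def select_columns(X, threshold=0.9):
    """Greedily keep columns whose absolute copula correlation with
    every already-kept column is below `threshold` (hypothetical value)."""
    R = np.abs(gaussian_copula_corr(X))
    kept = []
    for j in range(X.shape[1]):
        if all(R[j, k] < threshold for k in kept):
            kept.append(j)
    return kept


# Synthetic data: two independent columns plus a monotone transform
# of the first one, which carries no new dependence information.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, b, np.exp(a)])

kept = select_columns(X)
print(kept)
```

Because the estimate depends only on ranks, the third column is flagged as redundant even though its relationship to the first is nonlinear, which is the practical appeal of a copula-based dependence measure over a plain Pearson correlation on the raw values.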
Keywords | ||
Copula function; Gaussian copula function (normal); Decision tree; C4.5; Covid 19 | ||