dc.contributor.author | Parlak, Bekir | |
dc.contributor.author | Uysal, Alper Kursat | |
dc.date.accessioned | 2024-03-12T19:29:57Z | |
dc.date.available | 2024-03-12T19:29:57Z | |
dc.date.issued | 2023 | |
dc.identifier.issn | 0165-5515 | |
dc.identifier.issn | 1741-6485 | |
dc.identifier.uri | https://doi.org/10.1177/0165551521991037 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12450/2446 | |
dc.description.abstract | Because the high dimensionality of textual data limits classification accuracy, it is essential to apply feature selection (FS) methods as a dimension reduction step in the text classification (TC) domain. Most FS methods for TC involve several probabilities. In this study, we propose a new FS method named Extensive Feature Selector (EFS), which uses both corpus-based and class-based probabilities in its calculations. The performance of EFS is compared with nine well-known FS methods, namely, Chi-Squared (CHI2), Class Discriminating Measure (CDM), Discriminative Power Measure (DPM), Odds Ratio (OR), Distinguishing Feature Selector (DFS), Comprehensively Measure Feature Selection (CMFS), Discriminative Feature Selection (DFSS), Normalised Difference Measure (NDM) and Max-Min Ratio (MMR), using Multinomial Naive Bayes (MNB), Support-Vector Machine (SVM) and k-Nearest Neighbour (KNN) classifiers on four benchmark data sets: Reuters-21578, 20-Newsgroup, Mini 20-Newsgroup and Polarity. The experiments were carried out for six feature sizes: 10, 30, 50, 100, 300 and 500. Experimental results show that EFS outperforms the other nine methods in most cases according to micro-F1 and macro-F1 scores. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Sage Publications Ltd | en_US |
dc.relation.ispartof | Journal of Information Science | en_US |
dc.rights | info:eu-repo/semantics/closedAccess | en_US |
dc.subject | Dimension reduction | en_US |
dc.subject | feature selection | en_US |
dc.subject | text classification | en_US |
dc.title | A novel filter feature selection method for text classification: Extensive Feature Selector | en_US |
dc.type | article | en_US |
dc.department | Amasya Üniversitesi | en_US |
dc.authorid | Uysal, Alper Kursat/0000-0002-4057-934X | |
dc.identifier.volume | 49 | en_US |
dc.identifier.issue | 1 | en_US |
dc.identifier.startpage | 59 | en_US |
dc.identifier.endpage | 78 | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.identifier.scopus | 2-s2.0-85104287630 | en_US |
dc.identifier.doi | 10.1177/0165551521991037 | |
dc.department-temp | [Parlak, Bekir] Amasya Univ, Dept Comp Engn, Fac Technol, Yesilirmak Campus, TR-05100 Amasya, Turkey; [Uysal, Alper Kursat] Canakkale Onsekiz Mart Univ, Fac Engn, Dept Comp Engn, Canakkale, Turkey | en_US |
dc.identifier.wos | WOS:000641912500001 | en_US |
dc.authorwosid | Uysal, Alper Kursat/P-3089-2019 | |
dc.authorwosid | PARLAK, Bekir/IXM-9534-2023 | |