A HYBRID FEATURE SELECTION APPROACH USING WEIGHTED SCORING IN HIGH-DIMENSIONAL GENE EXPRESSION DATA FOR BINARY CLASSIFICATION
DOI:
https://doi.org/10.63075/abbed627
Keywords:
high dimension; feature selection; trees; Wilcoxon; classification
Abstract
Analyzing data with multiple dimensions presents a challenge for machine learning and data mining researchers, and for engineers working with various forms of data such as videos, photos, text, and voice. The high dimensionality of such data hinders decision-making processes. To address this issue, dimensionality reduction techniques, specifically feature selection, have gained significance. This paper introduces a novel hybrid method that combines the characteristics of the Random Forest and the Wilcoxon Rank Sum test. The proposed method consists of two stages. In the first stage, important features are identified using the Random Forest algorithm and sorted in ascending order of magnitude. In the second stage, the Wilcoxon Rank Sum test is used to select additional important features. These selected features are then combined to create a new hybrid model, which uses the chosen genes to train the classifiers. To evaluate the effectiveness of the proposed method, six high-dimensional datasets are used, and its performance is compared against other feature selection methods, namely Proportional Overlapping Score (POS), Robust Proportional Overlapping Score (RPOS), Wilcoxon, Sigmoid Function (SigF), Genomic Clustering (Gclust), and Minimal-Redundancy-Maximal-Relevance (mRmR). The proposed method gives improved or comparable results relative to the other techniques in most cases.
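The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes scikit-learn's RandomForestClassifier and SciPy's ranksums as stand-ins for the two stages. The helper name hybrid_select, the cut-offs k_rf and k_w, the union rule used to combine the two gene lists, and the synthetic data are all hypothetical choices made for illustration. The weighting scheme named in the title is not specified in the abstract and is not reproduced here.

```python
import numpy as np
from scipy.stats import ranksums
from sklearn.ensemble import RandomForestClassifier


def hybrid_select(X, y, k_rf=10, k_w=10, seed=0):
    """Hypothetical helper: sorted column indices picked by either stage."""
    # Stage 1: rank genes by Random Forest impurity-based importance
    # and keep the k_rf most important ones.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X, y)
    rf_top = np.argsort(rf.feature_importances_)[::-1][:k_rf]

    # Stage 2: per-gene Wilcoxon rank-sum test between the two classes;
    # keep the k_w genes with the smallest p-values.
    pvals = np.array([
        ranksums(X[y == 0, j], X[y == 1, j]).pvalue
        for j in range(X.shape[1])
    ])
    w_top = np.argsort(pvals)[:k_w]

    # Combination rule (an assumption): union of the two gene lists.
    return np.union1d(rf_top, w_top)


# Synthetic stand-in for a gene expression matrix: 60 samples, 200 genes,
# with the first 5 genes shifted upward in class 1.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 200))
X[y == 1, :5] += 3.0

selected = hybrid_select(X, y)

# Train a classifier on the selected genes only.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[:, selected], y)
```

Taking the union keeps any gene that either stage ranks highly, so the selected set holds between max(k_rf, k_w) and k_rf + k_w genes. A real evaluation would estimate classifier error on held-out samples, not on the training data used here.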
Published
2025-01-24
Section
Articles
How to Cite
A HYBRID FEATURE SELECTION APPROACH USING WEIGHTED SCORING IN HIGH-DIMENSIONAL GENE EXPRESSION DATA FOR BINARY CLASSIFICATION. (2025). Review Journal of Neurological & Medical Sciences Review, 3(1), 553-572. https://doi.org/10.63075/abbed627