An under-sampling method with support vectors in multi-class imbalanced data classification

Arafat, MY, Hoque, S, Xu, S ORCID: 0000-0003-0597-7040 and Farid, DM 2019 , 'An under-sampling method with support vectors in multi-class imbalanced data classification', in Proceedings of the 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2019) , Institute of Electrical and Electronics Engineers, United States, pp. 1-6 , doi: 10.1109/SKIMA47702.2019.8982391.

Full text not available from this repository.

Abstract

Multi-class imbalanced data classification is one of the most challenging research problems in supervised machine learning for data mining applications. Although computational intelligence researchers have introduced several data sampling methods over the past decades for handling imbalanced data, learning from imbalanced data remains a challenging task and an active focus of research. Traditional machine learning algorithms are usually biased towards the majority class instances and tend to ignore the minority class instances, which can reduce the prediction accuracy of classifiers. Under-sampling and over-sampling methods are commonly used with single-model classifiers or ensemble learning to deal with imbalanced data. In this paper, we introduce an under-sampling method with support vectors for classifying imbalanced data. The proposed approach selects the most informative majority class instances based on the support vectors that help to form the decision boundary. We tested the performance of the proposed method with single classifiers (C4.5 decision tree and naïve Bayes) and ensemble classifiers (Random Forest and AdaBoost) on 13 benchmark imbalanced datasets. The experimental results show that the proposed method achieves high accuracy when classifying both minority and majority class instances compared with existing methods.
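
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of under-sampling with support vectors, not the authors' implementation. It assumes a one-vs-rest reduction (the most frequent class against all others), a linear soft-margin SVM fitted by plain subgradient descent, and a hypothetical tolerance `tol` for deciding which majority instances lie on or inside the margin. All function names and parameter values here are illustrative assumptions.

```python
from collections import Counter


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=2000):
    """Fit a linear soft-margin SVM by batch subgradient descent on the
    L2-regularized hinge loss. Labels in y must be +1 or -1.
    lam, lr and epochs are illustrative defaults, not tuned values."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def undersample_with_support_vectors(X, labels, tol=0.1):
    """Return sorted indices of the instances to keep: every non-majority
    instance, plus the majority instances that lie on or inside the SVM
    margin (an approximation of the majority-class support vectors)."""
    majority = Counter(labels).most_common(1)[0][0]
    # One-vs-rest encoding (assumption): majority class -> -1, all others -> +1.
    y = [-1 if lab == majority else 1 for lab in labels]
    w, b = train_linear_svm(X, y)
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        near_margin = yi * (dot(w, xi) + b) <= 1 + tol
        if yi == 1 or near_margin:
            keep.append(i)
    return keep


# Toy 1-D data: six majority instances ("a") and two minority instances ("b").
X_demo = [[-6.0], [-5.0], [-4.0], [-3.0], [-2.0], [-1.0], [1.0], [2.0]]
labels_demo = ["a"] * 6 + ["b"] * 2
kept = undersample_with_support_vectors(X_demo, labels_demo)
print(kept)
```

The reduced training set would then be passed to any downstream classifier, such as the decision tree, naïve Bayes, Random Forest or AdaBoost models named in the abstract. In practice a library SVM that exposes its support vectors directly would replace the hand-written trainer; it is written out here only to keep the sketch free of dependencies.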

Item Type: Conference Publication
Authors/Creators: Arafat, MY and Hoque, S and Xu, S and Farid, DM
Keywords: data sampling methods, ensemble learning, imbalanced data, over-sampling, under-sampling
Journal or Publication Title: Proceedings of the 13th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2019)
Publisher: Institute of Electrical and Electronics Engineers
DOI / ID Number: 10.1109/SKIMA47702.2019.8982391
Copyright Information: Copyright 2019 IEEE
