Imbalanced Data Classification Based on Cost Sensitivity Penalized AdaBoost Algorithm
Authors: LU Shuxia, ZHANG Zhenlian, ZHAI Junhai
Affiliation: College of Mathematics and Information Science, Hebei Province Key Laboratory of Machine Learning and Computational Intelligence, Hebei University, Baoding 071002, China
Corresponding author: LU Shuxia, female, professor, E-mail: cmclusx@126.com
CLC number: TP391
Fund project: Key Research and Development Project of the Hebei Province Science and Technology Program (19210310D); Natural Science Foundation of Hebei Province (F2021201020)

    Abstract:

    Improving the classification accuracy of minority instances is a hot topic in machine learning research. To address the problem of imbalanced data classification, a penalized AdaBoost algorithm based on cost sensitivity is proposed. A new adaptive cost-sensitive function is introduced into the penalized AdaBoost algorithm; it assigns higher cost values to minority instances and to misclassified minority instances. A penalty mechanism is also introduced, which increases the average margin of the instances. The weighted support vector machine (SVM) optimization model is used as the base classifier, and the stochastic variance reduced gradient (SVRG) method is used to solve the optimization model. Comparative experiments show that the proposed algorithm is significantly superior to other algorithms in terms of geometric mean (G-mean) and area under the ROC curve (AUC), and that it also obtains a larger average margin, which demonstrates its effectiveness on imbalanced data classification problems.
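    The abstract names three ingredients that a small example can make concrete: a cost-weighted SVM objective, an SVRG solver, and the G-mean metric. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It uses a linear SVM without a bias term, a fixed class-ratio cost in place of the paper's adaptive cost-sensitive function (which the abstract does not specify), and hinge-loss subgradients inside SVRG. The AdaBoost loop and the penalty mechanism are omitted. All function names (`g_mean`, `sample_grad`, `svrg_weighted_svm`, `predict`) and the toy data are hypothetical.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of true-positive and true-negative rates (labels in {+1, -1})."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    return math.sqrt((tp / pos) * (tn / neg))


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def sample_grad(w, x, y, c, lam):
    """Subgradient at w of lam/2 * ||w||^2 + c * max(0, 1 - y * <w, x>)."""
    active = y * dot(w, x) < 1.0  # hinge term contributes only inside the margin
    return [lam * wj - (c * y * xj if active else 0.0) for wj, xj in zip(w, x)]


def svrg_weighted_svm(X, y, costs, lam=0.01, eta=0.01, epochs=20, seed=0):
    """SVRG on a cost-weighted linear SVM (assumed form, no bias term)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        # Per-sample subgradients at the snapshot (the current w) and their mean.
        snap_grads = [sample_grad(w, X[i], y[i], costs[i], lam) for i in range(n)]
        mu = [sum(g[j] for g in snap_grads) / n for j in range(d)]
        for _ in range(n):
            i = rng.randrange(n)
            g_cur = sample_grad(w, X[i], y[i], costs[i], lam)
            # Variance-reduced step: g_i(w) - g_i(snapshot) + full snapshot gradient.
            w = [wj - eta * (gc - gs + m)
                 for wj, gc, gs, m in zip(w, g_cur, snap_grads[i], mu)]
    return w


def predict(w, X):
    return [1 if dot(w, x) > 0 else -1 for x in X]


# Toy 1-D data: 3 minority (+1) and 12 majority (-1) points on opposite sides of 0.
X = [[1.5], [2.0], [2.5]] + [[-1.5 - 0.1 * k] for k in range(12)]
y = [1] * 3 + [-1] * 12
ratio = y.count(-1) / y.count(1)  # assumed fixed cost for the minority class
costs = [ratio if label == 1 else 1.0 for label in y]

w = svrg_weighted_svm(X, y, costs)
gm = g_mean(y, predict(w, X))
all_majority = g_mean(y, [-1] * len(y))  # predictor that ignores the minority class
print(gm, all_majority)
```

    On this toy set, a predictor that always outputs the majority class is correct on 12 of 15 points, yet its G-mean is zero because it misses every minority instance. This is why G-mean, rather than plain accuracy, is the natural metric for imbalanced problems.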

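Cite this article

LU Shuxia, ZHANG Zhenlian, ZHAI Junhai. Imbalanced data classification based on cost sensitivity penalized AdaBoost algorithm[J]. Journal of Nanjing University of Aeronautics & Astronautics, 2023, 55(2): 339-346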

History
  • Received: 2021-07-26
  • Last revised: 2022-07-29
  • Published online: 2023-04-28