Improving depression prediction using a novel feature selection algorithm coupled with context-aware analysis

Document type: Foreign-language journal

First author: Dai, Zhijun

Authors: Dai, Zhijun; Zhou, Heng; Ba, Qingfang; Li, Guochen; Zhou, Yang; Wang, Lifeng

Author affiliations:

Keywords: Depression prediction; Feature selection; Context-aware analysis; Maximal information coefficient; Support vector machine

Journal: JOURNAL OF AFFECTIVE DISORDERS (Impact factor: 6.533; Five-year impact factor: 6.569)

ISSN: 0165-0327

Year/volume/issue: 2021, Volume 295

Pages:

Indexed in: SCI

Abstract: Background: Developing a machine-learning-based depression prediction method that uses information from long-term recordings is both important and challenging for the clinical diagnosis of depression. Methods: We developed a novel two-stage feature selection algorithm and applied it to the high-dimensional feature set (over thirty thousand features) constructed by a context-aware analysis of the DAIC-WOZ data set, including audio, video, and semantic features. The prediction performance was compared with seven reference models. The preferred topics and feature categories related to the retained features were also analyzed. Results: The proposed method selected parsimonious subsets (tens of features) in each prediction case. We obtained the best performance in depression classification, with an F1-score of 0.96 (0.67), Precision of 1.00 (0.63), and Recall of 0.92 (0.71) on the development set (test set). We also achieved promising results in depression severity estimation, with an RMSE of 4.43 (5.11) and an MAE of 3.22 (3.98), differing only marginally from the best reference model (random forest with 'Selected-Text' features). The five topics most important to depression were revealed. Audio features predominated over the other feature categories in depression classification, while the three feature categories contributed almost equally to severity estimation. Limitations: The database we used should be extended with more depression samples. The second stage of feature selection is relatively time-consuming. Conclusion: This depression recognition pipeline, together with the preferred topics and feature categories, is expected to be useful in supporting the diagnosis of psychological distress conditions.
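
As a quick consistency check on the classification figures quoted in the abstract, the F1-score can be recomputed from precision and recall with the standard harmonic-mean definition. The sketch below is illustrative only: the function name `f1_score` and the dictionary layout are assumptions made for this example and do not come from the paper, and the only inputs are the precision and recall values stated above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the abstract.
reported = {
    "development": (1.00, 0.92),
    "test": (0.63, 0.71),
}

# Recompute F1 per split, rounded to two decimals like the abstract.
f1_by_split = {
    split: round(f1_score(p, r), 2) for split, (p, r) in reported.items()
}
print(f1_by_split)
```

Rounded to two decimals, the recomputed values agree with the F1-scores given in the abstract (0.96 on the development set, 0.67 on the test set).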

Classification number:
