ki:elements

Two-minute automated speech analysis: comparable performance to Beck Depression Inventory-II in distinguishing depressed and healthy individuals

Felix Menne, Felix Dörr, Julia Schräder, Ute Habel, Johannes Tröger, Alexandra König & Lisa Wagels

* Poster presented at 37th ECNP Congress, Italy

Abstract

Background: Compared to other areas of medicine, psychiatry lacks objective biomarkers that do not rely solely on subjective assessments and clinical observations. Novel precision medicine approaches may help with issues such as predicting response to lithium, resistance to antidepressants, or outcome [1]. In recent years, a growing number of digital biomarkers have emerged to assess behavioral or biological information for psychiatric conditions more objectively [2,3]. Among these, speech analysis presents significant opportunities, as psychiatric symptoms often manifest in speech [4,5].

Aims: This project aims to identify speech features able to differentiate between patients with major depressive disorder (MDD) and healthy controls (HC), based on a symptom severity measure.

Methods: A total of 44 MDD patients and 52 HC were recruited from the Psychiatry Department, University Hospital Aachen, Germany. Participants underwent a comprehensive clinical and neuropsychological assessment. This included the German version of the Structured Clinical Interview for DSM-5 (SCID-5) to establish the diagnosis and the Beck Depression Inventory-II (BDI-II) to gauge depression severity. Participants were prompted to narrate two stories, one about a positive and one about a negative life event, each lasting approximately one minute. These were recorded for automated speech analysis. Transcribed audio recordings underwent feature extraction based on the “Sigma” library defined by our working group [6,7]. Several classification models were computed to differentiate between MDD and HC: one baseline model consisting of demographic data (age, gender, years of education) and clinical data (Trail-Making Tests A/B, Digit Span Forwards/Backwards, and the Multiple-choice vocabulary intelligence test (MWT-B)); and one model including the BDI-II. In a leave-one-out cross-validation approach, these models were compared to a speech model consisting of 10 linguistic and acoustic features that were extracted from the storytelling task and selected based on mutual information.
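The evaluation procedure described in Methods can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the authors' pipeline: the data are synthetic, the nearest-centroid classifier and the median-split mutual-information estimate are stand-ins (the abstract names neither the classifier nor the estimator), and all function names are hypothetical. The sketch repeats feature selection inside every leave-one-out fold so that the held-out participant cannot influence which features are chosen; whether the original analysis did this is not stated.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels):
    """Mutual information (nats) between a median-split feature and binary labels."""
    median = sorted(values)[len(values) // 2]
    bins = [int(v >= median) for v in values]
    n = len(values)
    joint, n_bin, n_lab = Counter(zip(bins, labels)), Counter(bins), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (n_bin[b] * n_lab[t]))
        for (b, t), c in joint.items()
    )


def select_top_k(X, y, k):
    """Indices of the k features with the highest mutual information with y."""
    n_features = len(X[0])
    mi = [mutual_information([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: mi[j], reverse=True)[:k]


def centroid(rows, cols):
    """Mean of the selected columns over the given rows."""
    return [sum(r[j] for r in rows) / len(rows) for j in cols]


def loocv_scores(X, y, k):
    """Leave-one-out scores; higher means closer to the positive-class centroid."""
    scores = []
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        cols = select_top_k(X_train, y_train, k)  # selection uses training data only
        pos = centroid([r for r, t in zip(X_train, y_train) if t == 1], cols)
        neg = centroid([r for r, t in zip(X_train, y_train) if t == 0], cols)
        x = [X[i][j] for j in cols]
        d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
        scores.append(d_neg - d_pos)
    return scores


def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered pairs."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n_pos=44, n_neg=52, n_features=30, n_informative=5, seed=0):
    """Synthetic stand-in data: only the first n_informative features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_pos), (0, n_neg)):
        for _ in range(n):
            X.append([
                rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
                for j in range(n_features)
            ])
            y.append(label)
    return X, y


X, y = make_synthetic()
print(f"LOOCV AUC on synthetic data: {auc(loocv_scores(X, y, k=10), y):.3f}")
```

The group sizes (44 and 52) and the number of selected features (10) mirror the abstract; every other number in the sketch is arbitrary, and the printed AUC says nothing about the study's results.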

Results: The two groups (HC and MDD) did not differ significantly in their demographic data. Age was 26.17±7.1 years for HC and 26.14±6.2 years for MDD (p = 0.88). Years of education were 14.1±2.2 years for HC and 12.4±2 years for MDD (p = 0.2). 63.5% of HC and 56.8% of MDD were female (p = 0.22). Mean BDI-II scores were 3.9±3.1 for HC and 26.4±10.4 for MDD (p < 0.01). The results of the classification models can be seen in Table 1. Our results demonstrate that the speech model (AUC = 0.960) outperformed the baseline model (AUC = 0.697) in all performance metrics. Moreover, it performed similarly to the model using only the BDI-II (AUC = 0.99).

Conclusion: Our results show the utility of speech features derived from a two-minute assessment to differentiate between healthy and depressed individuals. A model consisting of selected speech features proved superior to one including demographic and clinical data and performed almost as well as the BDI-II, a validated pen-and-paper test of depressive symptom severity. In the future, these findings may shape voice-based biomarkers, enhancing clinical diagnosis and MDD severity monitoring while being economical and remotely accessible.

References:
1 Manchia M, Pisanu C, Squassina A, Carpiniello B. Challenges and Future Prospects of Precision Medicine in Psychiatry. Pharmacogenomics Pers. Med. 13, 127–140 (2020).
2 Jacobson NC, Weingarden H, Wilhelm S. Digital biomarkers of mood disorders and symptom change. Npj Digit. Med. 2(1), 1–3 (2019).
3 Schultebraucks K, Yadav V, Galatzer-Levy IR. Utilization of Machine Learning-Based Computer Vision and Voice Analysis to Derive Digital Biomarkers of Cognitive Functioning in Trauma Survivors. Digit. Biomark. 16–23 (2020).
4 Dikaios K, Rempel S, Dumpala SH, Oore S, Kiefte M, Uher R. Applications of Speech Analysis in Psychiatry. Harv. Rev. Psychiatry 31(1), 1 (2023).
5 Kim AY, Jang EH, Lee S-H, Choi K-Y, Park JG, Shin H-C. Automatic Depression Detection Using Smartphone-Based Text-Dependent Speech Signals: Deep Convolutional Neural Network Approach. J. Med. Internet Res. 25(1), e34474 (2023).
6 Lindsay H, Tröger J, König A. Language impairment in Alzheimer’s disease: robust and explainable evidence for AD-related deterioration of spontaneous speech through multilingual machine learning. Front. Aging Neurosci. 13 (2021).
7 König A, Linz N, Zeghari R et al. Detecting apathy in older adults with cognitive disorders using automatic speech analysis. J. Alzheimers Dis. 69(4), 1183–1193 (2019).
