
Speech pauses: biomarkers for differentiating depressed patients from healthy controls

Felix Menne, Elisa Mallick, Johannes Tröger, Alexandra König, Janna Schulze, Diana Immel, Simon Barton & René Hurlemann

* Poster presented at the 37th ECNP Congress, Italy

Abstract

Background: Compared to other areas of medicine, psychiatry lacks reliable, objective biomarkers that do not rest solely on patients’ and clinicians’ subjective assessments. To move towards precision medicine in psychiatry, new approaches are needed to customize treatments for individual patients. Currently, psychiatric conditions are typically evaluated using questionnaire-based scales that pertain to specific symptomatic domains and may be prone to bias. Consequently, identifying objective markers of psychiatric disease states, including behavioral phenotypes, is imperative to bolster transdiagnostic, dimensional approaches to disease classification. Among various emerging digital biomarkers [1], automated speech and audio analysis has shown promising results for detecting affective states in psychiatric patients [2,3].

Aims: This project aims to investigate which speech features differ between depressed patients and healthy controls and to examine the extent to which depression and its symptoms affect patients’ speech.

Methods: Nineteen patients diagnosed with major depressive disorder (MDD) according to DSM-5 and 21 healthy controls (HC) were recruited at the Karl-Jaspers Clinic, University Hospital Oldenburg, Germany. Participants underwent an extensive clinical assessment, including the Beck Depression Inventory (BDI-II) to gauge depression severity. Three free speech tasks (positive, negative, and neutral storytelling) were performed and recorded in the clinic at two time points (T1, T2) 14 days apart. The audio recordings were processed and analyzed to extract speech and language features based on the “Sigma” library developed by our working group [4,5]. To control for age, gender, and word count, these variables were regressed out of the speech features using linear regression. Group comparisons were performed using repeated-measures ANOVA with diagnosis group (MDD vs. HC) as the between-subjects factor and time point as the within-subjects factor.
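The “Sigma” pipeline and the actual analysis code are not part of this abstract. As a minimal sketch only, the snippet below illustrates how covariates could be regressed out of a speech feature and how a mixed repeated-measures ANOVA could then be run; the column names (subject, group, timepoint, age, gender, word_count, n_pauses, mean_pause_duration) and the use of scikit-learn and pingouin are assumptions, not the authors’ method.

```python
# Illustrative sketch only: column names and libraries are assumptions,
# not the authors' actual Sigma pipeline or analysis code.
import pandas as pd
import pingouin as pg
from sklearn.linear_model import LinearRegression

# Hypothetical long-format table: one row per participant and time point.
df = pd.read_csv("speech_features.csv")  # assumed columns: subject, group, timepoint,
                                         # age, gender, word_count, n_pauses, mean_pause_duration

# Encode covariates (gender is assumed to be a string label and is dummy-coded;
# numeric covariates stay as-is).
X = pd.get_dummies(df[["age", "gender", "word_count"]], drop_first=True)

def residualize(y: pd.Series) -> pd.Series:
    """Regress a speech feature on the covariates and keep only the residuals."""
    model = LinearRegression().fit(X, y)
    return y - model.predict(X)

for feature in ["n_pauses", "mean_pause_duration"]:
    df[f"{feature}_resid"] = residualize(df[feature])

# Mixed ANOVA: diagnosis group as between-subjects factor, time point within subjects.
aov = pg.mixed_anova(data=df, dv="n_pauses_resid", within="timepoint",
                     subject="subject", between="group")
print(aov)
```

Residualizing before the group comparison, as assumed here, keeps the ANOVA independent of age, gender, and verbosity differences between participants.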

Results: Patients had a mean age of 41.37±13.92 years and HC of 40.05±10.5 years; 57.9% of patients and 47.6% of HC were female. The BDI-II total score was 22.1±10.4 for patients and 2.7±2.5 for HC at T1, and 17.3±13.8 for patients and 2.9±2.6 for HC at T2. Group comparisons of the speech features revealed significant effects for the positive and neutral stories, but not for the negative story. Detailed results for the group differences are presented in Table 1. Several features, namely the number of pauses and pause duration, differed significantly across both stories, with depressed individuals displaying more pauses and longer pause durations.
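The exact pause definitions used by the Sigma library are not reproduced here. As a rough illustration under assumed settings (librosa’s energy-based silence detection, a 30 dB threshold, and a 0.25 s minimum pause length), the sketch below shows one way a pause count and pause durations could be derived from a recording.

```python
# Illustrative sketch only: the Sigma feature definitions are not reproduced here;
# librosa, the 30 dB threshold, and the 0.25 s minimum pause length are assumptions.
import numpy as np
import librosa

def pause_features(path: str, top_db: float = 30.0, min_pause: float = 0.25) -> dict:
    """Derive a pause count and pause durations from a speech recording."""
    y, sr = librosa.load(path, sr=16000, mono=True)
    # Non-silent (speech) intervals in samples; the gaps between them are candidate pauses.
    speech = librosa.effects.split(y, top_db=top_db)
    gaps = [(start - end) / sr for end, start in zip(speech[:-1, 1], speech[1:, 0])]
    pauses = [g for g in gaps if g >= min_pause]  # ignore very short gaps
    return {
        "n_pauses": len(pauses),
        "total_pause_duration": float(np.sum(pauses)),
        "mean_pause_duration": float(np.mean(pauses)) if pauses else 0.0,
    }

# Example: features = pause_features("participant_positive_story.wav")
```

Features of this kind, computed per story, would then feed into the covariate regression and group comparison described in the Methods.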

Conclusion: We found significant differences between healthy and depressed individuals in several temporal speech features derived from a two-minute assessment. The number of pauses and pause duration yielded robust findings with medium effect sizes across both assessments. These findings are in line with the existing literature [6]. In the future, they may aid in developing voice-based biomarkers that enhance clinical diagnosis and MDD severity monitoring. At the same time, automated speech analysis allows for remote and economical assessment.

References:
1 Jacobson NC, Weingarden H, Wilhelm S. Digital biomarkers of mood disorders and symptom change. npj Digit. Med. 2(1), 1–3 (2019).
2 Dikaios K, Rempel S, Dumpala SH, Oore S, Kiefte M, Uher R. Applications of Speech Analysis in Psychiatry. Harv. Rev. Psychiatry 31(1), 1 (2023).
3 Cummins N, Dineley J, Conde P et al. Multilingual markers of depression in remotely collected speech samples: A preliminary analysis. J. Affect. Disord. 341, 128–136 (2023).
4 Lindsay H, Tröger J, König A. Language Impairment in Alzheimer’s Disease—Robust and Explainable Evidence for AD-Related Deterioration of Spontaneous Speech Through Multilingual Machine Learning. Front. Aging Neurosci. 13 (2021).
5 König A, Linz N, Zeghari R et al. Detecting apathy in older adults with cognitive disorders using automatic speech analysis. J. Alzheimers Dis. 69(4), 1183–1193 (2019).
6 Yamamoto M, Takamiya A, Sawada K et al. Using speech recognition technology to investigate the association between timing-related speech features and depression severity. PLOS ONE 15(9), e0238726 (2020).
