Measuring Stress in Health Professionals Over the Phone Using Automatic Speech Analysis During the COVID-19 Pandemic: Observational Pilot Study

Alexandra König1, BSc, MSc, PhD; Kevin Riviere2*; Nicklas Linz3; Hali Lindsay4*, MSc; Julia Elbaum2; Roxane Fabre2; Philippe Robert5

1Stars Team, Institut national de recherche en informatique et en automatique, Valbonne, France
2Département de Santé Publique, Centre Hospitalier Universitaire de Nice, Université Côte d’Azur, Nice, France
3ki elements, Saarbrücken, Germany
4German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
5CoBteK (Cognition-Behaviour-Technology) Lab, La Fédération de Recherche Interventions en Santé, Université Côte d’Azur, Nice, France
*these authors contributed equally


Background: During the COVID-19 pandemic, health professionals have been directly confronted with the suffering of patients and their families. As main actors in the management of this health crisis, they have been exposed to various psychosocial risks (stress, trauma, fatigue, etc). Paradoxically, stress-related symptoms are often underreported in this vulnerable population but are potentially detectable through passive monitoring of changes in speech behavior.
Objective: This study aims to investigate the use of rapid and remote measures of stress levels in health professionals working during the COVID-19 outbreak. To do so, we analyzed participants’ speech behavior during a short phone conversation, in particular during positive, negative, and neutral storytelling tasks.
Methods: Speech samples from 89 health care professionals were collected over the phone during positive, negative, and neutral storytelling tasks; various voice features were extracted and compared with classical stress measures via standard questionnaires. Additionally, a regression analysis was performed.
Results: Certain speech characteristics correlated with stress levels in both genders; mainly, spectral (ie, formant) features, such as the mel-frequency cepstral coefficients, and prosodic characteristics, such as the fundamental frequency, appeared to be sensitive to stress. Overall, for both male and female participants, regression using vocal features from the positive tasks yielded the most accurate predictions of stress scores (mean absolute error 5.31).
Conclusions: Automatic speech analysis could help with early detection of subtle signs of stress in vulnerable populations over the phone. Combined with timely intervention strategies, this technology could contribute to preventing burnout and the development of comorbidities, such as depression or anxiety.
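
As a reading aid for the mean absolute error reported in the Results, the following minimal sketch shows how that metric is generally computed: the average absolute difference between questionnaire-based stress scores and the scores predicted by a regression model. The function name and all numeric values are hypothetical illustrations and are not taken from the study's data or code.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted scores."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical questionnaire scores and model predictions (not study data).
observed_scores = [20.0, 35.0, 28.0, 41.0]
predicted_scores = [24.0, 30.0, 29.0, 47.0]

# Absolute errors are 4, 5, 1, and 6, so their mean is 4.0.
mae = mean_absolute_error(observed_scores, predicted_scores)
print(mae)
```

A lower value means the predicted scores sit closer, on average, to the questionnaire scores; the error is expressed in the same units as the stress scale itself.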

Full paper below: