Today, clinical research on early Alzheimer’s disease (AD) dementia relies on invasive procedures and lengthy cognitive examinations, both for screening and within trials. With decentralized clinical trials on the horizon and a growing focus on real-world evidence, there is a need for sensitive, non-invasive measures that can detect cognitive impairment remotely and at scale. Recent technological innovations have made it feasible to use speech as a source of reliable cognition biomarkers. To address these needs, ki:elements (ki:e) draws on recent advances in machine learning and sets out to build and validate a remote, fully automated speech-based digital biomarker that assesses cognition at the mild cognitive impairment (MCI) stage. To that end, ki:e has conducted new experiments on the verification and validation of the ki:e speech biomarker for cognition (ki:e SB-C). This work is part of a larger effort that follows the V3 framework, which provides a guide for evaluating digital sensing products, and the recently published playbook for digital biomarker validation in clinical trials from the DiMe society. Here we report a subset of the experiments needed for that evaluation, demonstrating:
- precision of the automatic extraction of speech features obtained through a remote, fully automated telephone assessment
- analytical validity of the automatically extracted speech metrics as measures of cognitive function
- discrimination between healthy individuals and individuals with MCI
The results bring us closer to the overarching mission of developing digital remote speech-based biomarkers that can be used in diagnostic or monitoring settings to identify (subtle) cognitive impairments due to AD.
Automated speech recognition quality check
The speech input is generated during a fully automated telephone call. The subject performs a multi-trial verbal list-learning task, the Rey Auditory Verbal Learning Test (RAVLT), which includes an immediate and a delayed recall and thus captures learning and memory capabilities. During the delay, the subject performs a semantic verbal fluency (SVF) task, which allows the measurement of executive function. To address the first goal, a performance test examined whether the sensor technology captures task-specific input data with high accuracy and precision. For that, automatically generated transcripts were compared with manually corrected transcripts of the task protocol conducted via telephone. The word error rate was calculated for each task and for two languages (Dutch and English). The mean word error rate ranges between 14.04% and 14.93% for English and between 21.80% and 23.53% for Dutch.
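The word error rate used in this comparison can be illustrated with a minimal sketch. It assumes the standard definition (word-level edit distance divided by the number of words in the reference transcript) and simple lowercasing and whitespace tokenization; the function name and the example transcripts are hypothetical and do not reproduce the ki:e pipeline or its data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    `reference` stands in for a manually corrected transcript and
    `hypothesis` for the automatically generated one (assumed roles).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up recall transcripts: one substitution and one deletion
# against five reference words.
manual = "drum curtain bell coffee school"
automatic = "drum certain bell coffee"
print(f"WER = {word_error_rate(manual, automatic):.2%}")
```

A per-task mean, as reported above, would then be the average of this value over all recordings of that task in one language.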
Correlation with the gold standard for cognition
Beyond the technological interface used for the assessment, the core of the ki:e SB-C is a machine-learning (ML) algorithm that calculates a composite cognition score from around 100 speech features. The algorithm was developed and refined on a ki:elements speech database and uses features that reflect learning and memory abilities, strategy use, and semantic and temporal characteristics of inhibition, mental flexibility and switching. To validate that this score captures cognitive abilities, we compared it against a traditional, manually administered measure of cognition in old age: the Mini-Mental State Examination (MMSE).
The MMSE is a gold-standard clinical scale that also yields a composite cognitive score. The analysis reveals a significant correlation between the predicted biomarker scores and the MMSE scores (r = -0.41, p < 0.05).
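As an illustration of how such a coefficient is obtained, the following sketch computes a correlation between paired scores. It assumes that r denotes Pearson's product-moment coefficient, which the text does not state explicitly; the function name and the score values are invented for the example and are not study data.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired scores, one pair per participant.
biomarker_scores = [0.2, 0.5, 0.4, 0.9, 0.7, 0.3]
mmse_scores = [29, 27, 28, 23, 26, 28]
print(f"r = {pearson_r(biomarker_scores, mmse_scores):.2f}")
```

A negative r, as in the reported result, means that higher biomarker scores go together with lower MMSE scores.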
To demonstrate that the algorithm also quantifies a meaningful aspect of health, we tested a model that was fit to best separate MCI and SCI patients. A Kruskal-Wallis test revealed a significant difference in biomarker scores between MCI and SCI patients (H = 28.9, p < .001). The algorithm was therefore able to successfully differentiate between the two clinical groups in a test data set.
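The group comparison can be sketched as follows. The code computes the Kruskal-Wallis H statistic with the usual tie correction and, for the two-group case only, the p-value from the chi-square distribution with one degree of freedom. The function names and the example scores are hypothetical; this is a textbook implementation under those assumptions, not the analysis code used in the study.

```python
import math
from collections import Counter


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups: list[float]) -> float:
    """Tie-corrected Kruskal-Wallis H for two or more groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_two_groups(h: float) -> float:
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(h / 2))


# Hypothetical biomarker scores for two groups.
group_a = [0.81, 0.74, 0.92, 0.66, 0.88]
group_b = [0.35, 0.52, 0.41, 0.60, 0.29]
h = kruskal_wallis_h(group_a, group_b)
print(f"H = {h:.2f}, p = {p_value_two_groups(h):.4f}")
```

With two groups the statistic has one degree of freedom, so an H of 28.9 corresponds to a p-value well below .001, in line with the reported result.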
Taken together, the results show that the ki:e speech biomarker for cognition is a valid measure of cognition, particularly for remotely assessing AD-related cognitive impairment at scale. The ki:e SB-C withstands validation in a healthy and a clinical sample, correctly detects cognitive abilities, and distinguishes between two clinical groups.