
09 October 2023

How can speech recorded with smartphones help monitor depression?

Researchers from the Institute of Psychiatry, Psychology & Neuroscience (IoPPN) at King’s College London are using novel methods to analyse speech recorded in the RADAR-CNS project and applying insights gained from the project to shape new research in this growing field.


In a new study, published in the Journal of Affective Disorders, researchers analysed RADAR-CNS data collected over three years through a smartphone app from 461 participants with a diagnosis of depression.

The study identified features that could indicate fluctuations in symptoms, and found that people tend to talk slower and quieter when their depression becomes more severe. This was replicated across three different languages: English, Spanish and Dutch. 
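The paper does not publish its feature-extraction code, but the two features it highlights (speaking more quietly and more slowly) can be illustrated with a minimal, self-contained numpy sketch. The function names, frame sizes, and threshold below are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np

def rms_intensity_db(signal, frame_len=1024, hop=512):
    """Frame-wise RMS intensity in dB: a simple proxy for how loudly someone speaks."""
    frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    return 20 * np.log10(rms + 1e-10)

def active_fraction(signal, frame_len=1024, hop=512, threshold_db=-30):
    """Fraction of frames above an energy threshold: a crude speech-activity proxy
    related to speaking rate (more pausing -> lower fraction)."""
    db = rms_intensity_db(signal, frame_len, hop)
    return float(np.mean(db > threshold_db))

# Synthetic 1-second signal at 16 kHz: tone bursts standing in for speech.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
speech = np.sin(2 * np.pi * 220 * t) * (np.sin(2 * np.pi * 3 * t) > 0)
quieter = 0.1 * speech  # the same utterance, spoken more quietly

louder_db = rms_intensity_db(speech).mean()
quieter_db = rms_intensity_db(quieter).mean()
print(f"mean intensity drop: {louder_db - quieter_db:.1f} dB")
```

Tracking such frame-level statistics over repeated recordings is the kind of signal that can be correlated with self-reported symptom fluctuations.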

Our findings are significant as it's the first time we have been able to compare changes in our voices with depression across different languages. Equally important, these findings were based on data collected remotely from people going about their daily lives, highlighting the role speech analysis could play in aiding the management of chronic conditions such as depression. These results really highlight the value of large international collaborations like RADAR-CNS; without it, we wouldn't have these important insights.

Dr Nicholas Cummins, lead author on the study and Lecturer in AI for speech analysis at King's IoPPN

Making monitoring more personalised

Using the same dataset from the RADAR-CNS project, researchers have also taken the first steps in creating personalised models that account for individual differences in speaking style. Presented at the INTERSPEECH conference by PhD student Edward Campbell, a visiting researcher at the IoPPN, the work trained personalised models on an individual's data from one time period, then applied them to classify the same person's speech from a separate, later period. Incorporating individual speaking styles in this way improved the models' accuracy in detecting depression severity.

Dr Nicholas Cummins said: “In this work we explored combinations of different speech characteristics and machine learning techniques with the aim of classifying if a speech sample is reflective of low or high depression symptom severity.

“Our findings highlight the advantage of personalising such models to increase their discriminative ability. Continued research is needed to refine these models to make them more robust before they can be considered for use in self-monitoring applications.”
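The train-on-one-period, classify-a-later-period setup described above can be sketched in a few lines. The features, their values, and the nearest-centroid classifier below are all illustrative assumptions (the published work explored several feature sets and machine learning techniques); the sketch only shows the within-person enrolment-then-classification structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-recording features (speaking rate in syllables/s, intensity in dB);
# lower values are simulated for high-severity periods, in line with the study's finding
# that people speak slower and quieter when depression is more severe.
def simulate_period(n, severity):
    base = np.array([4.0, 60.0])               # illustrative baseline for one speaker
    shift = np.array([-0.8, -6.0]) * severity  # severity pulls both features down
    return base + shift + rng.normal(0.0, [0.2, 1.5], size=(n, 2))

# Period 1: enrolment recordings for ONE individual with known low/high labels.
X_train = np.vstack([simulate_period(20, 0), simulate_period(20, 1)])
y_train = np.array([0] * 20 + [1] * 20)

# Nearest-centroid classifier personalised to this speaker.
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def classify(x):
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

# Period 2: new recordings from the same individual.
X_test = np.vstack([simulate_period(10, 0), simulate_period(10, 1)])
y_test = np.array([0] * 10 + [1] * 10)
acc = float(np.mean([classify(x) == y for x, y in zip(X_test, y_test)]))
print(f"within-person accuracy: {acc:.2f}")
```

Because the centroids are fitted to one speaker's own baseline, the decision boundary adapts to that person's habitual speaking style rather than a population average; a real pipeline would also standardise the features so no single dimension dominates the distance.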

Investigating the effects of acoustics on speech monitoring

Speech researchers from the IoPPN are also investigating the effects of room reverberation on smartphone speech recordings, with the aim of improving monitoring in the future. The research was inspired by work on the RADAR-CNS project, which highlighted how much the acoustics of the speaker's surroundings, as well as the time of day, can influence how their voice sounds in a recording.

Researchers compared recordings of people in rooms that were either empty or filled with soft furnishings and acoustic foam. This showed that features of speech can vary markedly depending on where the recording is made. The findings highlight a need for future smartphone monitoring applications to take recording surroundings into account. The research was presented at the INTERSPEECH conference by Dr Jude Dineley, Postdoctoral Research Associate in Speech for mHealth Applications at King's IoPPN.
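Why room acoustics matter for extracted features can be shown with a toy simulation: convolving a short "utterance" with a synthetic room impulse response smears energy into the silence after the sound stops, which shifts energy-based measurements. Everything here (the impulse responses, decay times, and the tail-energy measure) is an illustrative assumption, not the study's method:

```python
import numpy as np

rng = np.random.default_rng(1)
sr = 16000

# A 100 ms noise burst followed by silence, standing in for a short utterance.
burst = rng.normal(0.0, 1.0, sr // 10)
dry = np.concatenate([burst, np.zeros(sr // 2)])

# Toy room impulse response: a direct-sound spike plus an exponentially
# decaying noise tail. A slower decay mimics a bare, hard-surfaced room;
# a faster decay mimics a room treated with soft furnishings or foam.
def make_ir(decay_len):
    tail = rng.normal(0.0, 1.0, decay_len) * np.exp(-np.linspace(0.0, 6.0, decay_len))
    tail[0] = 1.0  # direct sound
    return tail

furnished = np.convolve(dry, make_ir(sr // 20))[:len(dry)]  # ~50 ms decay
bare_room = np.convolve(dry, make_ir(sr // 2))[:len(dry)]   # ~500 ms decay

def tail_energy(x):
    """Energy after the burst ends; reverberation smears energy into this region."""
    return float(np.sum(x[sr // 10:] ** 2))

ratio = tail_energy(bare_room) / tail_energy(furnished)
print(f"tail energy ratio (bare / furnished): {ratio:.1f}")
```

The same utterance thus yields different feature values in the two rooms, which is exactly the confound a monitoring application would need to control for.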

Investigating how recordings of our speech on smartphones can vary depending on when, where and what someone uses to record themselves is a critical aspect of developing reliable speech tools for monitoring health in research and clinical practice. When we observe a change in someone's speech, we need to be confident it is due to a change in their health, rather than a result of where or when they are recording, or even what device they are using. Room acoustics is one specific area: someone recording their voice in a bathroom at work will sound different from how they sound recording at home on their sofa.

Dr Jude Dineley, Postdoctoral Research Associate in Speech for mHealth Applications at King’s IoPPN

The researchers have also been investigating how people's voices vary at different times of day and on different days of the week. Early results confirm marked differences between recordings made in the morning and those made later in the day: the so-called 'morning voice' effect.

Dr Jude Dineley’s work was funded by the MRC and EPSRC IAA fund, the NIHR Maudsley BRC, the UK Acoustics Network/EPSRC and an IPEM Innovation Award.
