A research collaboration between EET, Sonde Health and the Black Dog Institute will investigate whether vocal analysis could be the Holy Grail for monitoring depression and other mental health conditions.

People who suffer from depression may find it hard to recognise the onset of their symptoms. However, their smartphone may soon be able to hear signs of depression in their voice and provide the welcome nudge to seek help.

“My team has recently been awarded an ARC Linkage grant to develop a diagnostic aid to monitor depression using voice analysis software on a smartphone,” says Associate Professor Julien Epps from UNSW Electrical Engineering and Telecommunications.

“We’ll be working on the project with our longstanding partners at Sonde Health, who are located in Boston in the US, and the Black Dog Institute, a medical research institute affiliated with UNSW Medicine.”

The three-year, $500,000 project seeks to understand how artificial intelligence, specifically machine learning, can help measure, analyse and understand characteristic changes in the voice related to a person’s emotional and mental state. The resulting platform aims to generate and aggregate this health information to support a clinical and/or individualised approach to monitoring and managing depression.

Non-depressed vs Depressed Speech

Epps has been studying emotion and mental state recognition from speech since 2007 and says vocal analysis has a number of benefits as an indicator of depression. The data is easy and cheap to collect, the method is non-invasive, and it can be monitored remotely. There are also a number of known vocal hallmarks of depressed speech. For example, depression often slows thought and movement, a symptom known as psychomotor retardation, and this can affect how a person speaks.

“Speech carries a significant amount of information that all plays together in the way it sounds. There is word choice and the speaker’s unique vocal signature, but embedded in that will be the speaker’s emotional state and information like fatigue level and cognitive load,” he says.

Epps says that in many ways the project is already well advanced. “Sonde Health has already developed a speech collection smartphone application and gathered large volumes of data. What they want to do next is improve the methods used to assess the level of depression from the speech they’ve captured,” he explains.

“Using short speech samples on everyday devices to assess mental state might sound like it’s in the realm of science fiction, but we're on the verge of turning this idea into reality,” continues Dr Michael Chen from Sonde Health.

Sonde Health is a digital medicine company focused on voice-based technology with the potential to transform the way mental and physical health is diagnosed and monitored. Chen says this Linkage Project is exciting because much of the work in this field so far has been laboratory-based.

“There is a need to move this research into ‘real-world’ environments and this project will allow us to push the boundaries of acoustic health analysis and enhance health management,” Chen says.

The idea of an app for self-monitoring health is not new, and Epps says the Black Dog Institute is a global technological pioneer in this field.

“The Black Dog Institute already have a range of apps designed to help people track their mental state over time, so you can see why they were interested in sharing their expertise and supporting how things connect up,” says Epps.

Dr Mark Larsen from the Black Dog Institute explains that this project is part of their Digital Dog research program, where researchers are using new technologies to find markers for mental health risk and harnessing these to link people with evidence-based digital interventions for depression, anxiety and suicidality.

“Historically, one of the challenges has been identifying markers that can be collected robustly across different phones and handsets – but recording speech samples is something that nearly every phone can do, which is what makes this project so promising,” says Larsen.

It is hard to estimate when the research might be ready for use clinically or by individuals. However, Epps says strong interest from industry is fast-tracking progress. “Industry has seen the potential of this technology early in the R&D process and there are a variety of start-up companies using behavioural indicators, collected from everyday devices, to provide automated assessment of mental disorders,” he says.

“Depression is a global problem that has so far been hard to monitor and measure. The opportunity here is to use behavioural signals, such as vocal analysis, to extract clinically meaningful health information from a patient in much the same way that biomedical devices extract information from physiological signals. An analogy that is the Holy Grail for this kind of project, is the creation of an ‘ECG for mental health.’”

Chen acknowledges inherent difficulties with this kind of project but is confident they have assembled a dream team to tackle these obstacles. “There are many challenges in translating this research into software that runs on everyday devices, but Julien and his colleagues have embraced this challenge head-on,” he says.

“The incredible expertise at UNSW, especially in signal processing and acoustic modelling, allows us to scale our R&D to tackle these high-risk problems and create a long-term research alliance. We’d particularly like to acknowledge the funding and support of the Australian Research Council in enabling this partnership.”


Written by: Penny Jones