Description of field of research

Despite significant advances in speech-based AI systems, their capabilities still fall short of human abilities. For instance, most people have no trouble holding a conversation in a noisy restaurant, but would not expect the speech input on their smartphone to work in the same environment. We do not yet fully understand how the human auditory system achieves this level of performance, but we have hints: the human brain uses prior knowledge of the structure of speech sounds to help decode speech even when the signal-to-noise ratio is low.

The aim of this project is the preliminary investigation and development of a Bayesian spectral analysis system that can track formant frequencies in speech under noisy conditions. If successful, this could be a first step towards the next generation of speech-based AI systems.

Research Area

Speech processing | Signal processing | Statistical modelling | Bayesian statistics

The student will work within the UNSW Speech Processing Lab in the School of Electrical Engineering and Telecommunications. The group is home to 4 full-time academic staff members, 8 research students, and 3 postdoctoral research fellows, all of whom are actively engaged in research related to speaker verification, speech recognition, anti-spoofing countermeasures, emotion recognition, disordered speech, and mental state detection.

On completing the project, the student will have a good understanding of speech signal processing and statistical modelling. They will also have significantly strengthened their technical and research skills, as well as their programming skills in MATLAB or Julia. It is anticipated that by the end of the project the student will have developed and validated a statistical algorithm for speech spectral analysis and implemented it as an easy-to-use, documented toolbox in MATLAB or Julia. If the topic were extended into an honours thesis, further development would be possible.

Contact or drop by EE442 to discuss the topic. Enquiries are encouraged.

Suggested reading: