Supervisor: Patrick A. Naylor (www.ee.ic.ac.uk/naylor)
Description:
Acoustic digital signal processing (ADSP) is at the heart of many new and emerging consumer and business applications of information technology that involve voice signals.
A particular example is automatic meeting transcription, in which an ad hoc array of acoustic sensors picks up the sound during a business meeting and subsequently merges the audio signals into a format suitable for speech recognition (speech to text) and diarization (what’s happening when). The sophistication of each sensor may range from a simple microphone and transmitter through to a fully operational smartphone. The technical challenges in this scenario include investigating the trade-offs between distributed processing and centralised processing in the ad hoc network. Where each sensor has considerable processing power, it is intuitively advantageous to do more of the processing in the sensor node. On the other hand, the level of processing available will likely vary from one node to the next, and therefore asymmetric schemes need to be considered.
From the point of view of the ADSP, talker localisation algorithms are needed in order to diarize the signals (what’s happening when), and speech enhancement algorithms (including beamforming) are needed to provide enhanced audio capture. A key challenge of this research is to identify the optimal partitioning of the algorithmic processing among the sensor nodes – in reality, some algorithmic elements need to fuse information from the various sensor nodes whereas others operate independently. A target of this research is therefore a scalable set of signal processing algorithmic solutions that can be applied in, as an example, the meeting transcription task. The topic of automatic speech recognition, applied downstream of the audio capture, will only be considered as a turn-key operation in this context, such that the majority of the research work will be based on the distributed capture, fusion and enhancement of the raw signals, as well as their re-synchronisation after network transmission.
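As a rough illustration of the re-synchronisation problem mentioned above, the sketch below estimates the relative delay between two sensor nodes by a brute-force cross-correlation search. It is only a toy example under stated assumptions: the function name estimate_lag, the simulated white-noise source, the 7-sample offset and the search range are all hypothetical, and a real ad hoc network would also have to cope with noise, reverberation and clock drift.

```python
import random


def estimate_lag(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` lags behind `reference`.
    Hypothetical helper for illustration only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two signals at this candidate lag.
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Simulated data (assumption): one white-noise source seen by two nodes,
# with node B receiving it 7 samples later than node A.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(400)]
true_lag = 7
node_a = source
node_b = [0.0] * true_lag + source[:-true_lag]

estimated = estimate_lag(node_a, node_b, max_lag=20)
print("estimated lag of node B relative to node A:", estimated)
```

Once the lag is known, one node's stream can be shifted by that many samples before the signals are fused. Whether such an estimate is computed in each node or centrally is exactly the kind of partitioning question this project addresses.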
The key skill sets include a working knowledge of digital signal processing (such as a 3rd or 4th year UG module as a minimum); applied mathematics, in particular the fundamental principles of linear algebra; the principles and applications of distributed computation; the ability to design and implement scientifically rigorous testing and evaluation of distributed signal processing algorithms using tools such as MATLAB; and a thirst for creative solutions for real-world applications.
For further details please contact Patrick Naylor via email: p.naylor@imperial.ac.uk