How scalograms improve acoustic scene analysis

A recent paper by audEERING CEO Prof. Björn Schuller and his colleagues from the University of Augsburg has shown how acoustic scene analysis can be improved by computing scalograms of the recorded audio. The paper was published in the IEEE/CAA Journal of Automatica Sinica (JAS) and was recently summarized by the science news platform EurekAlert.

Prof. Schuller explains that “real-world audio is usually a highly blended mix of different sound sources”, so the goal of machine listening is to recognize audio holistically, just as a human characterizes speech, music, and other sound events to get “the whole picture in the audio”.

Prof. Schuller’s paper demonstrates significant progress in acoustic scene analysis by combining spectrograms (visual representations of audio signals over time and frequency) with scalograms, the analogous representations produced by the wavelet transform. These visual images of sound are then classified with convolutional neural networks. Because the short-time Fourier transform behind a spectrogram uses a single fixed analysis window, its time-frequency resolution is the same at every frequency; the wavelet transform behind a scalogram instead trades resolution across scales, giving finer time resolution at high frequencies and finer frequency resolution at low frequencies, which yields a more faithful visual representation of many acoustic scenes. For more details, see the EurekAlert summary; the full paper is available from the journal.
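To make the contrast concrete, here is a minimal sketch, not the authors' code, that computes both representations for a synthetic test signal in Python. It assumes NumPy, SciPy, and PyWavelets are installed; the Morlet wavelet, the scale range, and the STFT window length are illustrative choices, not parameters taken from the paper.

```python
# Sketch: spectrogram (fixed time-frequency resolution) vs.
# scalogram (resolution varying with scale) for the same signal.
# Library choices and parameters are illustrative assumptions.
import numpy as np
import pywt
from scipy.signal import spectrogram

fs = 16000                                    # sampling rate in Hz
t = np.linspace(0, 1.0, fs, endpoint=False)
# Synthetic test signal: a rising chirp plus a brief click,
# standing in for a blended real-world acoustic scene.
x = np.sin(2 * np.pi * (100 + 1900 * t) * t)
x[8000:8016] += 2.0

# Spectrogram: STFT magnitude with one fixed window length,
# hence identical time-frequency resolution at every frequency.
f, frames, Sxx = spectrogram(x, fs=fs, nperseg=512, noverlap=384)

# Scalogram: continuous wavelet transform magnitude; the effective
# analysis window shrinks at small scales (high frequencies) and
# widens at large scales (low frequencies).
scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(x, scales, 'morl', sampling_period=1 / fs)
scalogram = np.abs(coeffs)

print(Sxx.shape)        # (frequency bins, time frames)
print(scalogram.shape)  # (scales, samples)
```

Either magnitude array can then be treated as a single-channel image and passed to a CNN classifier, which is the general pipeline the article describes.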
