Artificial robot audition involves several signal processing tasks, mainly the localization, tracking and separation of multiple sound sources. However, noise and reverberation severely degrade the performance of signal processing algorithms, so robust software and hardware solutions are needed to allow robots to process different types of sounds.
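As a minimal illustration of one building block for these tasks, the sketch below shows a simple delay-and-sum beamformer, which time-aligns and averages multi-microphone signals to emphasize a source from a given direction. The integer sample delays are assumed to be known (e.g. supplied by a localization stage); this is a toy sketch, not a production design:

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Delay-and-sum beamforming over a multi-microphone recording.

    signals: 2-D array (n_mics, n_samples); delays: integer sample
    delays that time-align each channel toward the steering direction.
    Returns the aligned average, which reinforces the steered source.
    """
    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for sig, d in zip(signals, delays):
        out += np.roll(sig, -d)  # advance each channel by its known delay
    return out / n_mics
```

Because the channels are averaged after alignment, uncorrelated noise on each microphone is attenuated while the steered source adds coherently.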
Acoustic monitoring in cities is usually performed using only sound pressure levels (SPLs). However, besides SPLs, there are other important factors that affect the perception of sound in a given environment. For example, the subjective annoyance caused by noise pollution can be better predicted by using psychoacoustic parameters that take into account the temporal and spectral characteristics of sound. Intelligent acoustic monitoring takes all these factors into account to develop meaningful models aimed at providing a better description of sound in urban areas.
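As a starting point, SPL itself is computed from the RMS of the captured pressure waveform relative to the standard 20 µPa reference. A minimal sketch:

```python
import numpy as np

P_REF = 20e-6  # standard reference pressure: 20 micropascals

def spl_db(pressure):
    """Sound pressure level in dB (re 20 uPa) of a pressure waveform in Pa."""
    p_rms = np.sqrt(np.mean(np.square(pressure)))
    return 20.0 * np.log10(p_rms / P_REF)
```

For reference, a tone with 1 Pa RMS corresponds to 94 dB SPL, a common calibration level; psychoacoustic descriptors such as loudness go further by weighting the signal's temporal and spectral content rather than its overall energy alone.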
Sound is an important part of everyday life, being our primary means of personal communication. Humans use their two ears to analyze the spatial dimension of sound, localizing sound sources in space and gathering important information about the environment. Capturing and synthesizing the spatial features of sound is not an easy task and usually requires multiple microphones and loudspeakers. In this context, the field of spatial audio has evolved from stereophonic recording and reproduction to complex sound field synthesis techniques, such as Wave Field Synthesis (WFS) or Higher Order Ambisonics (HOA). Spatial audio is therefore a wide research topic covering multiple aspects of the spatial dimension of sound.
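While WFS and HOA require dense loudspeaker arrays, the simplest spatialization technique is stereo amplitude panning. The sketch below implements the standard constant-power (sine/cosine) pan law, assuming a mono input and a pan position in [-1, 1]:

```python
import numpy as np

def constant_power_pan(mono, pan):
    """Pan a mono signal to stereo; pan in [-1 (full left), +1 (full right)].

    The sine/cosine law keeps the summed power of the two channels
    constant, so perceived loudness stays roughly even across positions.
    """
    theta = (pan + 1.0) * np.pi / 4.0  # map [-1, 1] -> [0, pi/2]
    return np.cos(theta) * mono, np.sin(theta) * mono
```

At the center position both channels receive a gain of 1/√2, so the left and right powers always sum to the input power.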
Sound source localization is an important task for developing innovative acoustic technology. One of the major application areas of source localization is acoustic-based surveillance, where intelligent systems making use of multiple distributed microphones can detect, recognize and locate acoustic events in both outdoor and indoor environments. These systems can be used to automatically steer video surveillance cameras when an unexpected noise occurs or to alert the user to suspicious activity.
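A core building block of many localization systems is estimating the time difference of arrival (TDOA) between microphone pairs, commonly done with the generalized cross-correlation with phase transform (GCC-PHAT). A minimal sketch, assuming a single source and low reverberation:

```python
import numpy as np

def gcc_phat(x, y):
    """Delay (in samples) of y relative to x via GCC-PHAT.

    A positive result means y lags x. Single-source, low-reverberation
    conditions are assumed.
    """
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = np.conj(X) * Y
    cross /= np.abs(cross) + 1e-12  # phase transform: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = len(x)
    # reorder so index max_shift corresponds to zero lag
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return int(np.argmax(cc)) - max_shift
```

Discarding the magnitude of the cross-spectrum makes the peak sharp and robust to spectral coloration; the estimated TDOA, together with the known microphone geometry and the speed of sound, yields a direction of arrival.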
Signal processing algorithms aimed at improving speech quality are of major importance in many communication environments and applications, ranging from mobile phones to teleconferencing and hearing aids. The objective is to increase intelligibility, a factor common to all these application areas. Enhancement algorithms make it possible to build high-quality speech communication channels and provide robustness to automatic speech recognition systems.
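A classic enhancement method is magnitude spectral subtraction, which removes an estimated noise magnitude spectrum from each frame while keeping the noisy phase. The sketch below assumes the noise magnitude estimate is supplied externally (e.g. averaged over speech-free frames):

```python
import numpy as np

def spectral_subtract(noisy, noise_mag, floor=0.01):
    """One-frame magnitude spectral subtraction.

    noisy: time-domain frame; noise_mag: estimated noise magnitude
    spectrum for that frame size. The noisy phase is reused as-is.
    """
    spec = np.fft.rfft(noisy)
    mag = np.abs(spec)
    phase = np.angle(spec)
    # spectral floor keeps a small residual to limit musical noise
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy))
```

In a full system this would run frame by frame with overlap-add and a continuously updated noise estimate; the floor parameter trades residual noise against the "musical noise" artifacts typical of hard subtraction.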
Music signals have a very particular structure that requires specific methods for analyzing their content or retrieving information from them. Signal processing methods for musical analysis target parameters such as rhythm, structure and genre. Moreover, pitch detection methods can be used to retrieve the pitch or notes of one or several instruments within a musical piece. Source localization and separation algorithms can also be specialized to music signals in order to obtain information about the different instruments that make up a given sound segment.
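As an illustration, a basic monophonic pitch detector can be built from the autocorrelation of a signal frame: the lag of the strongest peak within a plausible pitch range gives the period. A minimal sketch, assuming a single clean instrument:

```python
import numpy as np

def detect_pitch(frame, fs, fmin=50.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of a frame by autocorrelation."""
    frame = frame - np.mean(frame)
    # keep only non-negative lags of the full autocorrelation
    ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    lo = int(fs / fmax)  # shortest plausible period, in samples
    hi = int(fs / fmin)  # longest plausible period, in samples
    lag = lo + np.argmax(ac[lo:hi])
    return fs / lag
```

Restricting the search to [fs/fmax, fs/fmin] avoids the trivial zero-lag peak; polyphonic music requires far more elaborate multi-pitch methods, but the same periodicity principle underlies many of them.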
Traditionally, the presentation of data to users has focused on visual media. However, human beings live and interact in a world full of different sounds. From the ecological sounds that characterize our surroundings to voice, the basic element of human-to-human communication, sound provides humans with valuable information about their environment. Yet, despite its importance, auditory information is still underused in human-machine interfaces. The acoustic characterization of application environments, as well as the development of new auditory displays and sound-based interaction methods, is also an active research field in acoustic technology.
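A simple form of auditory display is sonification, i.e. mapping data values to sound parameters. The hypothetical sketch below (names and parameter choices are illustrative, not from any standard API) maps a numeric series to tone frequencies and renders a sequence of short sine tones:

```python
import numpy as np

def sonify(values, fs=8000, tone_dur=0.2, fmin=220.0, fmax=880.0):
    """Render a data series as a sequence of sine tones.

    Each value is linearly mapped to a frequency in [fmin, fmax] and
    played for tone_dur seconds; the tones are concatenated.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    span = hi - lo if hi > lo else 1.0  # avoid divide-by-zero on flat data
    freqs = fmin + (values - lo) / span * (fmax - fmin)
    t = np.arange(int(fs * tone_dur)) / fs
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
```

Rising values become rising pitches, letting a listener track a trend without looking at a screen; real auditory displays also exploit loudness, timbre and spatial position as display dimensions.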
Computer-simulated environments making use of immersive acoustic technology provide a heightened sense of realism. Analysis of the heart rate and respiration rate of people playing video games has shown that those playing games with audio had a higher level of arousal (a combination of heart rate and respiration rate), demonstrating the immersive capabilities of audio in games. Therefore, immersive audio, delivered through proper loudspeaker setups or binaural technology, should be considered an important factor in the design of virtual reality and 3D gaming systems.
A wireless sensor network (WSN) is a network of spatially distributed sensors aimed at monitoring different physical or environmental parameters. In the context of acoustic signal processing, the use of low-power, low-cost acoustic sensors with computing capabilities offers new opportunities for the development of new applications and processing strategies beyond those of conventional microphone array arrangements. The most significant advantage of a WSN over a traditional (wired) array is that it enables increased spatial coverage by distributing sensors over a larger volume, providing a scalable, easy-to-deploy structure with better signal-to-noise ratio properties.
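The SNR benefit of combining many distributed sensors can be illustrated with a toy simulation: averaging N channels whose noise is independent improves the SNR by roughly 10·log10(N) dB (about 12 dB for N = 16). A minimal sketch under that independence assumption:

```python
import numpy as np

def snr_db(clean, observed):
    """SNR in dB of an observation against the known clean signal."""
    noise = observed - clean
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))

rng = np.random.default_rng(1)
s = np.sin(2 * np.pi * 5 * np.arange(4096) / 4096)  # clean test signal
# 16 sensors, each observing the signal plus independent noise
channels = [s + 0.5 * rng.standard_normal(s.size) for _ in range(16)]
avg = np.mean(channels, axis=0)  # combine all sensor observations

# independent noise averages down: expected gain ~ 10*log10(16) ~ 12 dB
gain = snr_db(s, avg) - snr_db(s, channels[0])
```

In a real WSN the noise is not perfectly independent and the sensors observe delayed, attenuated copies of the source, so the achievable gain is lower, but the scaling argument still motivates dense spatial deployments.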