Hearing aids have improved significantly with the integration of advanced
digital signal processing. This improvement will continue as hearing aids
become intelligent and individualized, integrating top-down (knowledge-based)
and bottom-up (signal-based) approaches and drawing on research from
cognitive science, the interdisciplinary study of mind and intelligence
that brings together disciplines such as Artificial Intelligence,
Cognitive Psychology, and Neuroscience.
This thesis focuses on three subjects within cognitive science related to
hearing. First, a novel method for automatic speech recognition (ASR) using
binary features derived from binary masks is discussed. The robustness of
binary features to noise is compared with that of the state-of-the-art ASR
features, mel-frequency cepstral coefficients (MFCCs).
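As a rough illustration of the idea, the sketch below (not the thesis
implementation; the SNR threshold, frame length, and signals are placeholder
assumptions) derives binary features from an ideal binary mask, in which each
time-frequency unit is set to 1 when the local speech-to-noise ratio exceeds
a threshold:

```python
# Minimal sketch: binary features from an ideal binary mask (IBM).
# Threshold and frame parameters are illustrative assumptions.
import numpy as np
from scipy.signal import stft

def ideal_binary_mask(speech, noise, fs=16000, nperseg=400, theta_db=0.0):
    """IBM: 1 where the local speech-to-noise ratio exceeds theta_db."""
    _, _, S = stft(speech, fs, nperseg=nperseg)
    _, _, N = stft(noise, fs, nperseg=nperseg)
    local_snr_db = 10 * np.log10((np.abs(S) ** 2) / (np.abs(N) ** 2 + 1e-12) + 1e-12)
    return (local_snr_db > theta_db).astype(np.float32)  # (freq_bins, frames)

# Each column of the mask is a binary feature vector for one frame;
# these can be fed to an ASR back end in place of MFCCs.
fs = 16000
rng = np.random.default_rng(0)
speech = rng.standard_normal(fs)        # placeholder clean signal
noise = 0.5 * rng.standard_normal(fs)   # placeholder noise
features = ideal_binary_mask(speech, noise, fs).T  # (frames, freq_bins)
```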
Second, human top-down auditory attention is studied. A computational
top-down attention model is presented, and behavioral experiments are
carried out to investigate the role of top-down, task-driven attention in
the cocktail party problem.
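A minimal sketch of one way such top-down modulation can be formalized
(illustrative only, and not the model presented in the thesis): a
task-defined spectral template of the attended talker is converted into
per-channel gains that enhance target-dominant frequency channels in the
mixture:

```python
# Minimal sketch, not the thesis model: top-down attention as a
# multiplicative gain on frequency channels, biased by a task-defined
# spectral template of the attended talker. All values are illustrative.
import numpy as np
from scipy.signal import stft

def attend(mixture, template, fs=16000, nperseg=400):
    """Apply template-derived per-channel gains to the mixture spectrogram."""
    _, _, Z = stft(mixture, fs, nperseg=nperseg)
    gains = template / (template.max() + 1e-12)  # normalized to [0, 1] per channel
    return gains[:, None] * np.abs(Z)            # attended magnitude spectrogram

fs = 16000
rng = np.random.default_rng(0)
mixture = rng.standard_normal(fs)      # placeholder two-talker mixture
template = rng.uniform(size=201)       # placeholder target profile (nperseg//2 + 1 bins)
attended = attend(mixture, template, fs)
```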
Finally, automatic emotion recognition from speech is studied using a
dimensional approach, with a focus on integrating semantic and acoustic
features. To evaluate the proposed method, an emotional speech corpus is
prepared that consists of short movie clips with audio and text, rated by
human subjects along two affective dimensions (arousal and valence).
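One plausible realization of such integration (hypothetical, since the
abstract does not specify the features or the regressor) is early fusion of
the two feature streams followed by a separate regression model per
affective dimension:

```python
# Minimal sketch, not the thesis system: early fusion of acoustic and
# semantic features for dimensional (arousal/valence) regression.
# Feature choices and the ridge regressor are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

def fuse(acoustic, semantic):
    """Early fusion: concatenate per-clip acoustic and semantic features."""
    return np.hstack([acoustic, semantic])

rng = np.random.default_rng(0)
n_clips = 40
acoustic = rng.standard_normal((n_clips, 12))  # e.g., pitch/energy statistics
semantic = rng.standard_normal((n_clips, 8))   # e.g., affective word scores
arousal = rng.uniform(-1, 1, n_clips)          # placeholder human ratings
valence = rng.uniform(-1, 1, n_clips)

X = fuse(acoustic, semantic)
arousal_model = Ridge(alpha=1.0).fit(X, arousal)
valence_model = Ridge(alpha=1.0).fit(X, valence)
```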