Shammur Absar Chowdhury, Giuseppe Riccardi, Firoj Alam
Workshop on Speech, Language and Audio in Multimedia (SLAM 2014), Penang, Malaysia
Publication year: 2014


We are interested in understanding speech overlaps and their function in human conversations. Previous studies on speech overlaps have relied on supervised methods, small corpora and controlled conversations. Characterizing overlaps by timing, semantics and discourse function requires analysis over a very large feature space. In this study, we discover and characterize speech overlaps using unsupervised techniques. Overlapping segments of human-human spoken conversations are extracted and transcribed using a large-vocabulary Automatic Speech Recognizer (ASR). Each overlap instance is automatically projected onto a high-dimensional space of acoustic and lexical features. We then use unsupervised clustering to discover distinct and well-separated clusters that may correspond to different discourse functions (e.g., competitive and non-competitive overlap). We evaluate the recognition and clustering algorithms on a large set of real human-human spoken conversations. The automatic system separates two classes of speech overlaps. The clusters are compared in terms of their feature distributions and the contribution of those features to the automatic classification of the clusters.
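As a rough illustration of the pipeline described above (feature projection followed by unsupervised clustering into two groups), the following is a minimal sketch only. The abstract does not name the clustering algorithm or the exact features, so two-cluster k-means is used here as a stand-in, and the three features (overlap duration, mean energy, ASR word count) and all values are hypothetical toy data.

```python
import math

def standardize(rows):
    """Z-score each feature column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans2(points, iters=50):
    """Two-cluster k-means (an assumed stand-in, not the paper's stated method).

    Deterministic init: the first point, and the point farthest from it.
    """
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))
    centroids = [list(c0), list(c1)]
    labels = None
    for _ in range(iters):
        new_labels = [0 if dist2(p, centroids[0]) <= dist2(p, centroids[1]) else 1
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Hypothetical overlap instances: (duration in s, mean energy in dB, ASR word count).
overlaps = [
    (0.30, 55.0, 1), (0.40, 57.0, 1), (0.35, 54.0, 2),  # short, quiet overlaps
    (1.80, 72.0, 6), (2.10, 75.0, 7), (1.60, 70.0, 5),  # long, loud overlaps
]

labels = kmeans2(standardize(overlaps))
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The cluster indices carry no meaning by themselves. Whether a cluster corresponds to competitive or non-competitive overlap would have to be judged afterwards, for example by comparing feature distributions across clusters as the abstract describes.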
