Workshop

Shammur Absar Chowdhury, Giuseppe Riccardi, Firoj Alam

Workshop on Speech, Language and Audio in Multimedia (SLAM 2014), Penang, Malaysia

Publication year: 2014

Abstract

We are interested in understanding speech overlaps and their function in human conversations. Previous studies on speech overlaps have relied on supervised methods, small corpora and controlled conversations. Characterizing overlaps by timing, semantic and discourse function requires an analysis over a very large feature space. In this study, we discover and characterize speech overlaps using unsupervised techniques. Overlapping segments of human-human spoken conversations were extracted and transcribed using a large-vocabulary Automatic Speech Recognizer (ASR). Each overlap instance was automatically projected onto a high-dimensional space of acoustic and lexical features. We then used unsupervised clustering to discover distinct and well-separated clusters that may correspond to different discourse functions (e.g., competitive, non-competitive overlap). We have evaluated the recognition and clustering algorithms over a large set of real human-human spoken conversations. The automatic system separates two classes of speech overlaps. The clusters have been comparatively evaluated in terms of feature distributions and their contribution to the automatic classification of the clusters.
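The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the exact features, so everything here is an assumption made for illustration: plain k-means with k = 2 stands in for the unsupervised method, and the three-dimensional vectors are invented, already-normalized stand-ins for acoustic and lexical features (for example overlap duration, energy, and word count). None of the values come from the paper.

```python
import math
import random


def kmeans(points, k=2, iterations=20, seed=0):
    """Plain k-means over tuples of floats; returns (labels, centroids).

    Assumption: k-means is used here only as a stand-in for the
    unsupervised clustering mentioned in the abstract.
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct data points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical, invented feature vectors for six overlap instances.
# Each tuple is (duration, energy, word_count), scaled to [0, 1].
overlaps = [
    (0.2, 0.1, 0.1),
    (0.3, 0.2, 0.1),
    (0.1, 0.2, 0.2),
    (0.9, 0.8, 0.9),
    (0.8, 0.9, 0.7),
    (0.9, 0.9, 0.8),
]

labels, centroids = kmeans(overlaps, k=2)
print(labels)
```

On this toy input the first three and last three instances fall into separate clusters. Whether such clusters correspond to competitive and non-competitive overlaps is the kind of question the paper evaluates through feature distributions; the sketch itself makes no such claim.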

Shammur Absar Chowdhury

Qatar Computing Research Institute

Unsupervised Recognition and Clustering of Speech Overlaps in Spoken Conversations
