Order statistics for voice activity detection in VoIP

R Muralishankar, RR Venkatesha Prasad, S Vijay, HN Shankar

    Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

    6 Citations (Scopus)

    Abstract

    Real-time voice communication over the Internet has rapidly gained popularity. Reducing total bandwidth consumption is essential for using the available bandwidth efficiently, both for subscribers with low-speed connectivity and for others. In this paper we introduce a novel technique, well suited to VoIP calls, for identifying the voice and silence regions of a speech stream. We use an entropy measure based on the spacings of order statistics of speech frames to differentiate silence zones from speech zones. We developed an algorithm that uses adaptive thresholding to minimize misdetection. The performance of our approach is compared with the built-in VAD of the AMR codec. Our approach yields comparatively better bandwidth savings while maintaining good quality of the speech streams. Further, the proposed approach improves voice detection compared with the AMR schemes under noisy conditions. The ideas presented in this paper were identified as novel during the WIPO international patent search.
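
    The abstract names two ingredients: an entropy measure built from spacings of order statistics, and an adaptive threshold. The sketch below illustrates that general idea only. It assumes a Vasicek-type m-spacing entropy estimator and a simple noise-floor tracker; the paper's actual estimator, threshold rule, and parameters are not given in this record. All names (`spacing_entropy`, `detect_voice`, `synthetic_frames`) and all values (`m`, `margin`, `alpha`, `warmup`, 160-sample frames) are hypothetical.

```python
import math
import random


def spacing_entropy(frame, m=None):
    """Vasicek-type m-spacing estimate of differential entropy, in nats.

    Assumption: this estimator stands in for the paper's entropy measure.
    """
    n = len(frame)
    if n < 2:
        raise ValueError("frame needs at least two samples")
    if m is None:
        m = max(1, int(round(math.sqrt(n))))  # common heuristic, assumed
    x = sorted(frame)  # order statistics x[0] <= ... <= x[n-1]
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]        # clamp indices at the lower edge
        hi = x[min(i + m, n - 1)]    # clamp indices at the upper edge
        # Guard against log(0) when samples are tied (e.g. digital silence).
        total += math.log(max((hi - lo) * n / (2.0 * m), 1e-12))
    return total / n


def detect_voice(frames, margin=1.0, alpha=0.05, warmup=5):
    """Return one boolean per frame: True for speech, False for silence.

    Hypothetical adaptive rule: the first `warmup` frames are assumed to be
    silence and set the entropy floor; later frames count as speech when
    their entropy exceeds floor + margin, and the floor is updated only on
    frames classified as silence.
    """
    if warmup < 1:
        raise ValueError("warmup must be at least 1")
    flags = []
    floor = 0.0
    for k, frame in enumerate(frames):
        h = spacing_entropy(frame)
        if k < warmup:
            floor = (floor * k + h) / (k + 1)  # running mean of the floor
            flags.append(False)
            continue
        is_speech = h > floor + margin
        if not is_speech:
            floor = (1.0 - alpha) * floor + alpha * h  # slow floor tracking
        flags.append(is_speech)
    return flags


def synthetic_frames(sigmas, n=160, seed=0):
    """Gaussian test frames, one per standard deviation in `sigmas`.

    160 samples corresponds to 20 ms at 8 kHz, an assumed frame size.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, s) for _ in range(n)] for s in sigmas]
```

    Calling `detect_voice(synthetic_frames([0.01] * 10 + [0.3] * 10))` runs the detector on ten low-level frames followed by ten louder ones. Real speech is not Gaussian, so this only exercises the mechanics; it does not reproduce the paper's comparison with the AMR VAD.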
    Original language: English
    Title of host publication: Order statistics for voice activity detection in VoIP
    Place of Publication: Cape Town
    Publisher: IEEE Society
    Pages: 1-6
    Number of pages: 6
    ISBN (Print): 978-1-4244-6402-9
    Publication status: Published - 2010
    Event: Communications (ICC), 2010 IEEE International Conference - Cape Town
    Duration: 23 May 2010 → 27 May 2010

    Publication series

    Name
    Publisher: IEEE

    Conference

    Conference: Communications (ICC), 2010 IEEE International Conference
    Period: 23/05/10 → 27/05/10

    Keywords

    • conference contrib. refereed
    • Professional publ., other scientific, > 3 pages
