Intelligibility Enhancement Based on Mutual Information

Seyran Khademi; Richard C. Hendriks; W. Bastiaan Kleijn

doi:10.1109/TASLP.2017.2714424

Corresponding author: Seyran Khademi

Research output: Contribution to journal › Article › Scientific › peer-review

16 Citations (Scopus)

Abstract

Speech intelligibility enhancement is considered for multiple-microphone acquisition and single loudspeaker rendering. This is based on the mutual information measured between the message spoken at far-end environment and the message perceived by a listener at near-end. We prove that the joint optimal processing can be decomposed into far-end and near-end processing. The former is a minimum variance distortionless response beamformer that reduces the noise in the talker environment and the latter is a post-filter that redistributes the power over the frequency bands. Disjoint processing is optimal provided that the post-filtering operation is aware of the residual noise from the beamforming operation. Our results show that both processing steps are necessary for the effective conveyance of a message and, importantly, that the second step must be aware of the remaining noise from the beamforming operation in the first step. In addition, we study the use of the mutual information applied on the perceptually more relevant powers per critical band.

Original language	English
Article number	7946152
Pages (from-to)	1694-1708
Number of pages	15
Journal	IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume	25
Issue number	8
DOIs	https://doi.org/10.1109/TASLP.2017.2714424
Publication status	Published - 2017

Keywords

Minimum variance distortionless response (MVDR) beamformer
mutual information
multi-microphone
speech intelligibility enhancement

Access to Document

10.1109/TASLP.2017.2714424

http://ens.ewi.tudelft.nl/pubs/seyran17taslp.pdf

Cite this

@article{e4f6065b6b604f8bb6c5e00e4545246e,

title = "Intelligibility Enhancement Based on Mutual Information",

abstract = "Speech intelligibility enhancement is considered for multiple-microphone acquisition and single loudspeaker rendering. This is based on the mutual information measured between the message spoken at far-end environment and the message perceived by a listener at near-end. We prove that the joint optimal processing can be decomposed into far-end and near-end processing. The former is a minimum variance distortionless response beamformer that reduces the noise in the talker environment and the latter is a post-filter that redistributes the power over the frequency bands. Disjoint processing is optimal provided that the post-filtering operation is aware of the residual noise from the beamforming operation. Our results show that both processing steps are necessary for the effective conveyance of a message and, importantly, that the second step must be aware of the remaining noise from the beamforming operation in the first step. In addition, we study the use of the mutual information applied on the perceptually more relevant powers per critical band.",

keywords = "Minimum variance distortionless response (MVDR) beamformer, mutual information, multi-microphone, speech intelligibility enhancement",

author = "Khademi, Seyran and Hendriks, {Richard C.} and Kleijn, {W. Bastiaan}",

year = "2017",

doi = "10.1109/TASLP.2017.2714424",

language = "English",

volume = "25",

pages = "1694--1708",

journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",

issn = "2329-9290",

publisher = "IEEE",

number = "8",

}

TY - JOUR

T1 - Intelligibility Enhancement Based on Mutual Information

AU - Khademi, Seyran

AU - Hendriks, Richard C.

AU - Kleijn, W. Bastiaan

PY - 2017

Y1 - 2017

N2 - Speech intelligibility enhancement is considered for multiple-microphone acquisition and single loudspeaker rendering. This is based on the mutual information measured between the message spoken at far-end environment and the message perceived by a listener at near-end. We prove that the joint optimal processing can be decomposed into far-end and near-end processing. The former is a minimum variance distortionless response beamformer that reduces the noise in the talker environment and the latter is a post-filter that redistributes the power over the frequency bands. Disjoint processing is optimal provided that the post-filtering operation is aware of the residual noise from the beamforming operation. Our results show that both processing steps are necessary for the effective conveyance of a message and, importantly, that the second step must be aware of the remaining noise from the beamforming operation in the first step. In addition, we study the use of the mutual information applied on the perceptually more relevant powers per critical band.

AB - Speech intelligibility enhancement is considered for multiple-microphone acquisition and single loudspeaker rendering. This is based on the mutual information measured between the message spoken at far-end environment and the message perceived by a listener at near-end. We prove that the joint optimal processing can be decomposed into far-end and near-end processing. The former is a minimum variance distortionless response beamformer that reduces the noise in the talker environment and the latter is a post-filter that redistributes the power over the frequency bands. Disjoint processing is optimal provided that the post-filtering operation is aware of the residual noise from the beamforming operation. Our results show that both processing steps are necessary for the effective conveyance of a message and, importantly, that the second step must be aware of the remaining noise from the beamforming operation in the first step. In addition, we study the use of the mutual information applied on the perceptually more relevant powers per critical band.

KW - Minimum variance distortionless response (MVDR) beamformer

KW - mutual information

KW - multi-microphone

KW - speech intelligibility enhancement

UR - http://www.scopus.com/inward/record.url?scp=85021717875&partnerID=8YFLogxK

U2 - 10.1109/TASLP.2017.2714424

DO - 10.1109/TASLP.2017.2714424

M3 - Article

AN - SCOPUS:85021717875

SN - 2329-9290

VL - 25

SP - 1694

EP - 1708

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

IS - 8

M1 - 7946152

ER -
