Joint Maximum Likelihood Estimation of Microphone Array Parameters for a Reverberant Single Source Scenario

Changheng Li; Jorge Martinez; Richard Christian Hendriks

doi:10.1109/TASLP.2022.3231706

Changheng Li^*, Jorge Martinez, Richard Christian Hendriks

^*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Estimation of acoustic-scene-related parameters such as relative transfer functions (RTFs) from source to microphones, source power spectral densities (PSDs), and PSDs of the late reverberation is essential but challenging. Existing maximum likelihood estimators typically consider only subsets of these parameters and use each time frame separately. In this paper we focus explicitly on the single-source scenario and first propose a joint maximum likelihood estimator (MLE) that estimates all parameters jointly using a single time frame. Since the RTFs are typically invariant over a number of consecutive time frames, we also propose a joint MLE using multiple time frames, which achieves estimation performance similar to that of a recently proposed reference algorithm called simultaneous confirmatory factor analysis (SCFA), but at a much lower complexity. Moreover, we present experimental results demonstrating that our proposed joint MLE outperforms existing MLE-based approaches that use only a single time frame in estimation accuracy, noise reduction, speech quality, and speech intelligibility.

Original language	English
Pages (from-to)	695-705
Number of pages	11
Journal	IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume	31
DOIs	https://doi.org/10.1109/TASLP.2022.3231706
Publication status	Published - 2023

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Dereverberation
maximum likelihood estimation
microphone array signal processing
PSD estimation
RTF estimation

Access to Document

10.1109/TASLP.2022.3231706

Joint_Maximum_Likelihood_Estimation_of_Microphone_Array_Parameters_for_a_Reverberant_Single_Source_Scenario: Final published version, 751 KB

Cite this

@article{5b4d00c892f3411f91c0909b6dc1fc92,

title = "Joint Maximum Likelihood Estimation of Microphone Array Parameters for a Reverberant Single Source Scenario",

abstract = "Estimation of acoustic-scene-related parameters such as relative transfer functions (RTFs) from source to microphones, source power spectral densities (PSDs), and PSDs of the late reverberation is essential but challenging. Existing maximum likelihood estimators typically consider only subsets of these parameters and use each time frame separately. In this paper we focus explicitly on the single-source scenario and first propose a joint maximum likelihood estimator (MLE) that estimates all parameters jointly using a single time frame. Since the RTFs are typically invariant over a number of consecutive time frames, we also propose a joint MLE using multiple time frames, which achieves estimation performance similar to that of a recently proposed reference algorithm called simultaneous confirmatory factor analysis (SCFA), but at a much lower complexity. Moreover, we present experimental results demonstrating that our proposed joint MLE outperforms existing MLE-based approaches that use only a single time frame in estimation accuracy, noise reduction, speech quality, and speech intelligibility.",

keywords = "Dereverberation, maximum likelihood estimation, microphone array signal processing, PSD estimation, RTF estimation",

author = "Li, Changheng and Martinez, Jorge and Hendriks, {Richard Christian}",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.",

year = "2023",

doi = "10.1109/TASLP.2022.3231706",

language = "English",

volume = "31",

pages = "695--705",

journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",

issn = "2329-9304",

publisher = "IEEE",

}

TY - JOUR

T1 - Joint Maximum Likelihood Estimation of Microphone Array Parameters for a Reverberant Single Source Scenario

AU - Li, Changheng

AU - Martinez, Jorge

AU - Hendriks, Richard Christian

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2023

Y1 - 2023

AB - Estimation of acoustic-scene-related parameters such as relative transfer functions (RTFs) from source to microphones, source power spectral densities (PSDs), and PSDs of the late reverberation is essential but challenging. Existing maximum likelihood estimators typically consider only subsets of these parameters and use each time frame separately. In this paper we focus explicitly on the single-source scenario and first propose a joint maximum likelihood estimator (MLE) that estimates all parameters jointly using a single time frame. Since the RTFs are typically invariant over a number of consecutive time frames, we also propose a joint MLE using multiple time frames, which achieves estimation performance similar to that of a recently proposed reference algorithm called simultaneous confirmatory factor analysis (SCFA), but at a much lower complexity. Moreover, we present experimental results demonstrating that our proposed joint MLE outperforms existing MLE-based approaches that use only a single time frame in estimation accuracy, noise reduction, speech quality, and speech intelligibility.

KW - Dereverberation

KW - maximum likelihood estimation

KW - microphone array signal processing

KW - PSD estimation

KW - RTF estimation

UR - http://www.scopus.com/inward/record.url?scp=85147548848&partnerID=8YFLogxK

U2 - 10.1109/TASLP.2022.3231706

DO - 10.1109/TASLP.2022.3231706

M3 - Article

SN - 2329-9304

VL - 31

SP - 695

EP - 705

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

ER -