Low-complexity computer simulation of multichannel room impulse responses

Jorge Martinez

doi:10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced

Multimedia Computing

Research output: Thesis › Dissertation (TU Delft)

Abstract

The "telephone" model has been the basis of modern telecommunications for the last one hundred thirty years, with virtually no change to its fundamental concept. The rise of smaller and more powerful computing devices has opened new possibilities, for example building systems that give the user the illusion of talking to the remote party as if both were in the same place. Many challenges must still be overcome to achieve this. This thesis treats one part of the acoustical signal processing problem. Acoustically creating the illusion of presence requires fast and accurate control over the sound field in a room. The sound field generated by one or more sources is subject to acoustical phenomena such as reflection and diffraction, so modeling or estimating the sound field in a room is in general a difficult task. Acoustical reflection in particular poses an important challenge. Sound reflects off the walls, ceiling, and floor; a moment later those reflections reflect again, and so on. This recursive process makes the number of reflections grow over time, in general at a geometric rate. To synthesize an artificial sound field in real time, these reflections must be modeled quickly and accurately enough. This thesis proposes a fast algorithm to model the sound field in box-shaped rooms. Part one begins with an introduction to the topic: the acoustical phenomena of interest are explained and the concept of the room impulse response (RIR) is introduced. The RIR is defined as the time-domain signal sensed at a receiver position when a point source emits an impulse. Under a linear time-invariant (LTI) model, if the point source emits an arbitrary signal rather than an impulse, the sound field at a given observation location can be modeled as the convolution of the source signal with the RIR.
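The convolution relation just stated can be illustrated with a minimal sketch. The toy RIR and source values below are hypothetical and chosen only for readability, and the function name `render_at_receiver` is an assumption, not something taken from the thesis.

```python
import numpy as np

def render_at_receiver(source, rir):
    """Signal observed at the receiver under the LTI model: source convolved with the RIR."""
    return np.convolve(source, rir)

# Hypothetical toy RIR: direct sound at sample 0 and one attenuated echo at sample 3.
rir = np.array([1.0, 0.0, 0.0, 0.5])

# Arbitrary (hypothetical) source signal.
source = np.array([1.0, 2.0, 0.0, -1.0])

# The observed signal is the source plus a half-amplitude copy delayed by 3 samples.
observed = render_at_receiver(source, rir)
```

Feeding a unit impulse to `render_at_receiver` returns the RIR itself, which matches the definition of the RIR given above.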
Moreover, because the model is linear, the sound field generated by an arbitrary number of point sources emitting arbitrary signals can be computed easily once the RIRs from the source locations to the observation locations are known. Efficient computation of the RIR is therefore of both theoretical and practical interest, and part one concludes with a summary of the most prominent algorithms for simulating the RIR. Part two contains the papers that make up this work. The analysis is given first for fully reflective walls. In this case all acoustical reflections can be modeled by a set of virtual sources arranged periodically over a lattice. The whole set of virtual sources is generated by repeating a small set of sources called the "mother sources". Separately, the Poisson summation formula establishes the relation between periodicity and discretization under the Fourier transform. Relating these concepts, it is shown that carefully discretizing the spectral representation of the free-field RIR of the mother sources yields the exact periodic structure that makes up the sound field in a room. This is the key idea behind the proposed method. By carefully discretizing all domains and using the fast Fourier transform (FFT), a fast multichannel RIR simulation method is obtained. Unfortunately, this idea only works for fully reflective walls. When the walls are allowed to have constant complex-valued reflection coefficients (that is, to model absorption and phase shift at the walls), the sound field of the set of virtual sources is no longer periodic. A generalization of the Fourier transform is therefore introduced. First, a generalized Poisson summation formula is derived, which relates discretization in the generalized Fourier domain to a geometrically weighted periodic summation in the reciprocal domain.
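The link between spectral discretization and periodicity can be checked numerically in one dimension. The sketch below is only a 1-D discrete analogue of the Poisson summation formula, not the multidimensional method of the thesis: sampling the spectrum of a sequence at `n_bins` frequencies and inverting with an FFT should return the sum of all copies of the sequence shifted by multiples of `n_bins`. The function names and the test sequence are assumptions made for illustration.

```python
import numpy as np

def sampled_spectrum(x, n_bins):
    """Sample the discrete-time Fourier transform of x at n_bins equally spaced frequencies."""
    n = np.arange(len(x))
    k = np.arange(n_bins)[:, None]
    return (x * np.exp(-2j * np.pi * k * n / n_bins)).sum(axis=1)

def periodic_summation(x, period):
    """Sum all shifts of x by multiples of `period`, keeping one period."""
    n_periods = -(-len(x) // period)  # ceiling division
    padded = np.zeros(n_periods * period)
    padded[:len(x)] = x
    return padded.reshape(n_periods, period).sum(axis=0)

# Hypothetical decaying sequence, longer than the number of spectral samples.
x = 0.8 ** np.arange(10)

# Discretizing the spectrum (4 samples) and inverting gives a time-aliased signal ...
aliased = np.fft.ifft(sampled_spectrum(x, 4)).real

# ... which should match the explicit periodic summation with period 4.
folded = periodic_summation(x, 4)
```

In this analogue the sequence `x` plays the role of the free-field response, and the folded copies play the role of the contributions of the periodically repeated virtual sources.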
Basic properties of this transform are derived, its application to linear convolution without zero-padding is shown, and a fast implementation, called the generalized fast Fourier transform (GFFT), is given. The proposed method is then extended to account for walls with constant complex-valued reflection coefficients. It is shown that by separating the sound field of the mother sources into its orthant-sided parts (the analogue of the single-sided parts of a function of a scalar variable), the sound field inside a room can be expressed as a sum of geometrically weighted sound fields generated by the periodic set of virtual sources. This summation is then related to a sampling condition on the generalized spectrum of the orthant-sided parts of the sound field of the mother sources. Using the GFFT, the method simulates the RIR from a given source to a dense set of spatial positions with very low complexity. The experiments include a comparison with the mirror image source method (MISM). In one scenario, the time the MISM would take to compute the RIR at a dense set of positions is estimated at about one and a half years, whereas the newly proposed method computes the RIR at all positions in only forty-eight minutes, a speed-up of roughly four orders of magnitude. This difference in computational complexity makes the new method an important step toward simulating realistic sound fields in real time.
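A geometrically weighted periodic summation of the kind mentioned in the abstract can be sketched in one dimension. The construction below is an assumption made for illustration and is not the GFFT as defined in the thesis: the sequence is pre-weighted by `alpha**(n/N)`, its spectrum is sampled at `N` points and inverted with an ordinary FFT, and the pre-weighting is then undone. The result should equal the sum over `m` of `alpha**m * x[n + m*N]`, where the hypothetical complex factor `alpha` stands in for a reflection coefficient applied once per period.

```python
import numpy as np

def weighted_periodic_sum_direct(x, period, alpha):
    """Direct evaluation of sum_m alpha**m * x[n + m*period] for n = 0..period-1."""
    n_periods = -(-len(x) // period)  # ceiling division
    padded = np.zeros(n_periods * period, dtype=complex)
    padded[:len(x)] = x
    weights = alpha ** np.arange(n_periods)
    return (weights[:, None] * padded.reshape(n_periods, period)).sum(axis=0)

def weighted_periodic_sum_fft(x, period, alpha):
    """Same sum via pre-weighting, spectral sampling, and an inverse FFT."""
    idx = np.arange(len(x))
    modulated = x * alpha ** (idx / period)  # y[n] = x[n] * alpha**(n/N)
    k = np.arange(period)[:, None]
    # Sampled spectrum of y, evaluated by direct summation for clarity.
    spectrum = (modulated * np.exp(-2j * np.pi * k * idx / period)).sum(axis=1)
    aliased = np.fft.ifft(spectrum)  # equals sum_m y[n + m*N]
    return aliased * alpha ** (-np.arange(period) / period)  # undo the pre-weighting

# Hypothetical test data: a decaying sequence and a complex weight with |alpha| < 1.
x = 0.9 ** np.arange(12)
alpha = 0.7 * np.exp(0.3j)

direct = weighted_periodic_sum_direct(x, 4, alpha)
via_fft = weighted_periodic_sum_fft(x, 4, alpha)
```

With `alpha = 1` both functions reduce to an ordinary periodic summation, which corresponds to the fully reflective case described earlier in the abstract.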

Original language	English
Qualification	Doctor of Philosophy
Awarding Institution	Delft University of Technology
Supervisors/Advisors	Lagendijk, R.L. (Supervisor); Heusdens, R. (Advisor)
Award date	22 Nov 2013
Print ISBNs	978-94-6108-553-5
DOIs	https://doi.org/10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced
Publication status	Published - 2013

Keywords

Doctoral dissertation at TU Delft (Diss. prom. aan TU Delft)

Access to Document

10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced

Cite this

@phdthesis{4b7a2c3667164f91858dc6bfe1c6ba08,

title = "Low-complexity computer simulation of multichannel room impulse responses",

keywords = "Diss. prom. aan TU Delft",

author = "Jorge Martinez",

year = "2013",

doi = "10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced",

language = "English",

isbn = "978-94-6108-553-5",

type = "Dissertation (TU Delft)",

school = "Delft University of Technology",

}

TY - THES

T1 - Low-complexity computer simulation of multichannel room impulse responses

AU - Martinez, Jorge

PY - 2013

Y1 - 2013

KW - Diss. prom. aan TU Delft

U2 - 10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced

DO - 10.4233/uuid:a3366a3f-1a76-4614-a4cf-fba7a0808ced

M3 - Dissertation (TU Delft)

SN - 978-94-6108-553-5

ER -
