

**Delft University of Technology** 

#### Pitch-Matched Integrated Transceiver Circuits for High-Resolution 3-D Neonatal Brain Monitoring

Guo, P.

**DOI** 10.4233/uuid:a356d348-add2-4995-9392-1b16daa8dbfa

**Publication date** 2023

**Document Version** Final published version

**Citation (APA)** Guo, P. (2023). *Pitch-Matched Integrated Transceiver Circuits for High-Resolution 3-D Neonatal Brain Monitoring.* [Dissertation (TU Delft), Delft University of Technology]. https://doi.org/10.4233/uuid:a356d348-add2-4995-9392-1b16daa8dbfa

#### Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

**Copyright** Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

**Takedown policy** Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology. For technical reasons the number of authors shown on this cover page is limited to a maximum of 10.

## PITCH-MATCHED INTEGRATED TRANSCEIVER CIRCUITS FOR HIGH-RESOLUTION 3-D NEONATAL BRAIN MONITORING

Dissertation

for the purpose of obtaining the degree of doctor at Delft University of Technology, by the authority of the Rector Magnificus, prof. dr. ir. T.H.J.J. van der Hagen, chair of the Board for Doctorates, to be defended publicly on Wednesday 27 September 2023 at 12:30 o'clock

by

#### Peng GUO

Master of Science in Electronics Science and Technology, Zhejiang University, China, born in Huzhou, China

This dissertation has been approved by the promotors.

Composition of the doctoral committee:

| Rector Magnificus,           | chairperson                                         |
|------------------------------|-----------------------------------------------------|
| Dr. ir. M. A. P. Pertijs     | Delft University of Technology, promotor            |
| Prof. dr. ir. N. de Jong     | Delft University of Technology, promotor            |
| Independent members:         |                                                     |
| Prof. dr. P. Tortoli         | University of Florence, Italy                       |
| Prof. dr. K. A. A. Makinwa   | Delft University of Technology                      |
| Prof. dr. L. C. N. de Vreede | Delft University of Technology                      |
| Dr. ir. P. J. A. Harpe       | Eindhoven University of Technology, The Netherlands |
| Dr. ir. H. J. Vos            | Erasmus University Medical Center, The Netherlands  |
| Prof. dr. P. French          | Delft University of Technology, reserve member      |

This research is part of the project MIFFY with project number 15293 of the Open Technology Programme, which is financed by the Dutch Research Council (NWO).

ISBN: 978-94-6469-514-4

Copyright © 2023 by Peng GUO

All rights reserved. No part of this publication may be reproduced, or distributed in any form or by any other means, or stored in a database or retrieval system, without the prior written permission of the author.

Printed by Proefschrift Maken | www.proefschriftmaken.nl

To my parents and my family (MinWen, QingYan)

All that depart are like water in this river, parting ceaselessly by day and night.

Confucius

## **CONTENTS**

- 1 Introduction
  - 1.1 Motivation
  - 1.2 Transfontanelle Ultrasonography Overview
  - 1.3 Wearable TFUS Device
  - 1.4 Design Challenges
    - 1.4.1 3-D TFUS Design Requirements
    - 1.4.2 Transducer-ASIC Process Integration
    - 1.4.3 Transceiver ASIC Design Considerations
    - 1.4.4 Conclusion
  - 1.5 Thesis Organization
- 2 A Pitch-Matched Low-Noise AFE With Accurate TGC for High-Density Ultrasound […]
  - 2.1 Introduction
  - 2.2 […]
  - 2.3 Architecture design
    - 2.3.1 Two-Stage Interpolating VGA
    - 2.3.2 […]
    - 2.3.3 […]
    - 2.3.4 Current Amplifier with Complementary Current-Steering Network
  - 2.4 […]
    - 2.4.1 Hardware Sharing TIA
    - 2.4.2 Bandwidth-Control Circuit
    - 2.4.3 Class-AB Current Amplifier
    - 2.4.4 Noise Analysis
  - 2.5 […]
    - 2.5.1 ASIC Prototype
    - 2.5.2 Electrical Characterization
    - 2.5.3 Acoustical Characterization
  - 2.6 Conclusion
- 3 A 1.2mW/Channel Pitch-Matched Transceiver ASIC Employing a Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3-D Ultrasound Imaging
  - 3.1 Introduction
  - 3.2 […]
  - 3.3 […]
    - 3.3.1 Boxcar-integration-based micro-beamformer
  - 3.4 Circuit design
    - Analog front-end
    - Micro-beamformer
    - Output Buffer
    - Transmitter
  - 3.5 Experimental results
    - ASIC Prototype
    - Electrical Characterization
    - Acoustical Characterization
  - 3.6 Conclusion
- 4 A 125 µm-Pitch-Matched Transceiver ASIC with Micro-Beamforming ADC and Multi-Level Signaling for 3-D Transfontanelle Ultrasonography
  - 4.1 Introduction
  - 4.2 Architecture Design
    - System Overview
    - AFE
    - Passive boxcar-integration-based µBF ADC
    - Data Transmitter (D-TX)
    - […]X Architecture
  - 4.3 Circuit Design
    - Dynamic Comparator of the SAR ADC
    - Charge-mode Reference Generation
    - […]put Common-mode Feedback
    - PAM-16 DAC
    - HV Pulser
  - 4.4 Experimental results
    - ASIC Prototype
    - Electrical Characterization
    - Ultrasound B-mode Imaging
  - 4.5 Conclusion
- 5 Conclusion
  - 5.1 Main Contributions
  - 5.2 […] Findings
  - […] Work
- 6 Summary
- 7 Samenvatting
- 8 List of Publications

# 1

### INTRODUCTION

#### **1.1.** MOTIVATION

According to a recent study, an estimated 10.6% of all live births were preterm in 2014 [1]. Globally, this amounts to 15 million babies per year. The survival rate of preterm infants has improved significantly in high-income countries. However, preterm newborns can still face severe consequences in later life, particularly in low-income and middle-income countries, where newborn care is often insufficient [1]. For instance, preterm babies often exhibit neuro-developmental problems linked to inadequate brain perfusion during and after delivery, leading to disorders of neuro-cognitive development, motor disabilities and psychiatric diseases that may persist into adulthood [2].

Transfontanelle ultrasonography (TFUS) provides invaluable information for the diagnosis of neonates and the prediction of neuromotor outcomes [3]. However, conventional cart-based ultrasound imaging systems have relatively low portability and require intervention by a sonographer, which significantly limits their accessibility to patients. In contrast, a wearable ultrasound device has added value, particularly for high-risk neonates. Such a device can be worn on the preterm baby's head and can provide long-term monitoring to continuously assess cerebral circulation and brain perfusion [2], [3]. A wearable device has been reported in [4], where a soft elastic hat containing a probe holder and a single-element ultrasound probe was placed on the neonate's head to measure blood flow velocity at several depths via Doppler imaging. However, due to the fixed position of the single-element probe on the neonate's head, the device has a limited field of view and is incapable of generating volumetric 3-D images [5]. Another wearable device has been reported in [6], where 3-D volumetric data is obtained by mechanically steering a linear ultrasound probe anchored in a custom 3-D printed headset. However, the required rotational motor prevents further miniaturization, reduces the stability and robustness of the device, and leads to high power consumption. These issues can be solved by integrating an ultrasound transducer array and the associated electronics into a small chip, which replaces the mechanical motor with electronic beam steering, thus miniaturizing the device and reducing its power consumption.

In this thesis, we focus on the development of pitch-matched transceiver chips integrated with high-density 2-D transducer arrays, which pave the way for wearable ultrasound imaging devices that enable timely diagnosis at the bedside via TFUS. As will be elaborated in Section 1.2, the targeted wearable device will assess the preterm baby's brain through the fontanel, also known as the "soft spot", by transmitting ultrasound signals and receiving the echoes reflected by the brain tissue.

#### **1.2.** TRANSFONTANELLE ULTRASONOGRAPHY OVERVIEW

As illustrated in Fig. 1.1(a), TFUS is a diagnostic imaging technique that employs ultrasound waves to produce images of the brain through the fontanels of a newborn's skull. The fontanels provide an anatomical window [see Fig. 1.1(b)] through which ultrasound waves can pass, as the bones in these regions have not yet fused together.



Fig. 1.1: (a) An ultrasound diagnosis via TFUS [7]. (b) Anterior and posterior fontanels [8].

Various imaging modalities have been used to assess neonatal brain perfusion, e.g., magnetic resonance imaging (MRI), which detects the radio signals excited in the body by strong magnetic fields to produce high-resolution images [3]; positron emission tomography (PET), which uses a PET camera to detect high-energy gamma rays emitted by radiolabeled tracers [9]; and near-infrared spectroscopy (NIRS), which measures changes in the concentration of oxygenated and deoxygenated hemoglobin in tissues by using the near-infrared region of the electromagnetic spectrum [10]. TFUS has several benefits compared with these modalities, including:

- Safe: Unlike PET and other nuclear imaging modalities, TFUS is noninvasive and regarded as very safe, since it eliminates the need for exposure to ionizing radiation [3].
- Sedation-free: TFUS can be performed on infants without the need for sedation, unlike MRI, for which the infant may need to be sedated to remain still during the process [2].
- Cost-effective: TFUS is generally less expensive than MRI and PET, making it a more cost-effective option for routine imaging of infants [11], [12].
- Repeatable: TFUS can be performed as often as needed without significant risks or side effects, allowing for frequent monitoring of the infant's brain [12].
- Portable: The TFUS system can be made very small, so that it can be easily transported to various locations, such as the neonatal intensive care unit, enabling bedside monitoring during critical care [13].
- High-resolution: TFUS can provide high-quality images with a sub-millimeter spatial resolution, which is significantly better than NIRS, which only achieves a spatial resolution of a few centimeters [14].

In short, TFUS is a very safe and cost-effective imaging modality that utilizes the naturally available anatomical windows, allowing for a bedside evaluation of the brain perfusion.

There are, however, a number of limitations associated with traditional TFUS. It requires expertise in operating the hardware and understanding its functionality, which necessitates long-term training and additional resources. This results in extra costs and hinders the widespread use of the technique, especially in low-income countries [11], [15]. Besides, traditional TFUS is only capable of generating 2-D images, which show a cross-sectional view of the area being imaged, and it is highly operator-dependent, as the image quality is influenced by the skill and experience of the sonographer [16].

#### **1.3.** WEARABLE TFUS DEVICE

In contrast, a point-of-care ultrasound system capable of generating 3-D ultrasound images at the bedside has the potential to address these limitations. As depicted in Fig. 1.2(a), a wearable ultrasound patch with an embedded 3-D ultrasound probe can be attached to the infant's head, allowing an ultrasound scan to be performed without the intervention of a sonographer. Meanwhile, the data can be uploaded via a high-speed wireless data link without the need for a cord, which increases the comfort and mobility of the infant during the scan and makes long-term monitoring possible. Afterward, the ultrasound data is forwarded to a remote monitor workstation as depicted in Fig. 1.2(b), providing volumetric information and 3-D views of the brain to facilitate proper diagnosis by clinicians.

Fig. 1.2(c) shows the system architecture of the envisioned wearable ultrasound patch, consisting of a Wi-Fi module, an FPGA, a power management module, random-access memory (RAM) units, a battery module and an ultrasound chip. The FPGA controls the data communication between the Wi-Fi module, the RAM units and the ultrasound chip, while the power management module manages and controls the power supply in the system, e.g., power cycling the ultrasound chip to save power. The core of such a wearable device is a 2-D transducer array and the associated transceiver ASIC, which illuminate the tissue, receive the reflected echo signals and enable high-frame-rate, high-resolution 3-D imaging. Implementing these functions in a small package imposes stringent requirements on the transceiver ASIC design, in terms of on-chip signal conditioning to improve signal quality, sufficient channel-count reduction to enable packaging, power dissipation management to meet safety standards, etc.

#### **1.4.** DESIGN CHALLENGES

#### **1.4.1.** 3-D TFUS DESIGN REQUIREMENTS

In conventional TFUS, 1-D curvilinear and linear array probes are recommended [17]. For evaluating the tiny structures in the infant brain, a high transducer frequency of about 7.5–11 MHz is preferable to obtain high spatial resolution [17]. For the same reason, the scan depth in TFUS can be relatively shallow (e.g., 3.5 cm in our application) compared with other applications, such as transesophageal echocardiography (TEE), where a depth of about 6–12 cm is generally required [18], [19]. The transducer frequency and the scan depth required for conventional TFUS remain applicable to a 3-D TFUS probe using a 2-D transducer array. Moreover, new requirements also need to be fulfilled, such as the implementation of transmit/receive beamforming that enables steering of the ultrasound beam over elevation angles, eliminating the need to manipulate the probe. A high frame rate is also required, which is very important for obtaining detailed information on the microvasculature in the neonatal brain [20], [21]. Besides, a 3-D TFUS probe needs to cover the infant's fontanel, leading to an aperture size of 20×10 mm<sup>2</sup> based on studies of the anterior fontanel size in preterm infants [22], [23].

Fig. 1.2: A bedside ultrasound system consisting of (a) a wireless ultrasound patch attached to a preterm baby's head, and (b) a monitor workstation directly accessible to the clinician in a control center. (c) The system architecture of the envisioned wearable ultrasound patch.

Various technologies can be used to manufacture the required 2-D transducer arrays, such as bulk piezoelectric transducers and micromachined ultrasound transducers (MUTs). In the last few decades, piezoelectric materials based on lead zirconate titanate (PZT) ceramic have become the most popular materials, due to their good piezoelectric properties, chemical inertness, and physical strength [24], whereas MUTs like capacitive micromachined ultrasound transducers (CMUTs) have become an emerging type in recent years due to their ability to be produced in volume at low cost [25]. Although these technologies have been extensively utilized in the production of 1-D transducer arrays, the small pitch and larger aperture required in a 3-D TFUS probe, as well as the associated high electrical impedance, lead to a series of challenges both in transducer manufacturing and the interfacing ASIC design.

As an example, the required pitch of a 2-D transducer array with a center frequency of 7.5 MHz for TFUS is about 100 µm to avoid strong grating lobes and the resulting image artifacts, which would reduce the spatial resolution, narrow the field of view, and lower the image quality [26]. As a result, the 100-µm pitch leads to approximately 20,000 transducer elements in the aperture of a TFUS probe, making the interconnection between the transducer array and the ASIC, as well as the bonding process during chip packaging, extremely difficult. In addition, the high electrical impedance associated with the small elements results in stronger attenuation of the ultrasound signals and a reduced signal-to-noise ratio due to the interconnection loading, in particular compared to the elements of a 1-D array. Moreover, the power dissipation of a 3-D TFUS probe also raises safety concerns if long-term monitoring without the intervention of sonographers is required.
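The pitch and element-count figures above can be checked with a short calculation. The sketch below assumes the commonly used half-wavelength pitch rule for suppressing grating lobes and a typical soft-tissue speed of sound of 1540 m/s; neither value is stated in the text, and the function names are hypothetical.

```python
# Assumed value: a typical speed of sound in soft tissue, not taken from the thesis.
SPEED_OF_SOUND_M_PER_S = 1540.0


def half_wavelength_um(center_freq_hz, c=SPEED_OF_SOUND_M_PER_S):
    """Return half the acoustic wavelength in micrometres."""
    return c / center_freq_hz / 2.0 * 1e6


def element_count(aperture_x_mm, aperture_y_mm, pitch_um):
    """Return (nx, ny, total) for a square-grid array filling the aperture."""
    nx = round(aperture_x_mm * 1000.0 / pitch_um)
    ny = round(aperture_y_mm * 1000.0 / pitch_um)
    return nx, ny, nx * ny


if __name__ == "__main__":
    # Half-wavelength pitch at the 7.5 MHz center frequency quoted in the text.
    print(f"lambda/2 at 7.5 MHz: {half_wavelength_um(7.5e6):.1f} um")
    # Element count for the 20 x 10 mm^2 aperture at a 100 um pitch.
    nx, ny, total = element_count(20.0, 10.0, 100.0)
    print(f"{nx} x {ny} = {total} elements")
```

Under these assumptions the half wavelength at 7.5 MHz is close to 103 µm, consistent with the pitch of about 100 µm, and a 20×10 mm<sup>2</sup> aperture holds 200 × 100 = 20,000 elements.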

#### **1.4.2.** TRANSDUCER-ASIC PROCESS INTEGRATION

A pitch-matched ASIC directly integrated onto a transducer array is an optimal solution to address the interconnection issue between the transducer elements and the ASIC. The term "pitch-matched" refers to the fact that the ASIC is designed to match the element pitch of the transducer array, thereby enabling the direct integration. Several techniques have been developed for realizing this, such as using flex circuits as an interposer between the PZT transducer and the ASIC [27], employing wafer-level bonding to establish electrical connections between CMUT and ASIC wafers [28], and utilizing a PZT-on-CMOS integration process to create connections between the PZT and ASIC, via a post-processing metal interconnection layer [19]. The last integration process is adopted in the development of our ultrasound chips, because it is well suited for prototyping and small-volume production without the need for a clean room, and also because it is easily accessible to us. Although a high-voltage (HV) BCD technology was adopted in our design, the same integration process steps are still valid since there is almost no change in the back-end process steps of the BCD technology, such as the metal deposition and oxidation, compared with the CMOS technology used in [19].



Fig. 1.3: (a) The cross-section and top view of a pitch-matched ultrasound transceiver chip utilizing the PZT-on-CMOS integration process [19]. (b) A typical system architecture of the pitch-matched transceiver ASIC.

The cross-section and top view of a pitch-matched ultrasound chip made using the direct PZT-on-CMOS integration process are illustrated in Fig. 1.3(a). The transceiver ASIC can be partitioned into two regions as depicted in the top view: a pitch-matched region where a transducer (TD) bonding pad array matching the pitch of the transducer array is fabricated using the top metal layer, after the associated ASIC circuits are built through the CMOS/BCD technology; and a peripheral region where the miscellaneous blocks like data link and digital communication circuits are implemented, along with the ASIC bonding pads used for system-level Inputs/Outputs (IOs).

Several post-processing steps are carried out after the ASIC chip is fabricated. The final step in the ASIC process includes passivation etching on top of the TD bonding pads, by which the bonding pads are exposed for the subsequent interconnection. Afterward, a metallic interconnection layer is welded over the transducer bonding pads of the ASIC, aligned with the transducer array pattern, followed by the deposition of a non-conductive epoxy layer, which fills the gaps between the metal. This epoxy layer is then ground down to expose the metal, thereby creating electrical contacts for the PZT layer above. The epoxy layer also serves as an electrical isolation layer between neighboring TD elements and as a mechanical dicing buffer to protect the ASIC chip. Following this step, a layer of PZT material is attached to the ground epoxy layer via a layer of electrically conductive glue, which creates an electrical interconnection between the contacts and the backside electrode of the PZT ceramic. This layer also serves to minimize acoustic ringing and crosstalk of the elements. A conductive matching layer is applied on top of the piezoelectric layer for acoustic matching purposes. After this, the stack is saw-diced to build up the 2-D transducer array, in which the elements are isolated by the resulting dicing kerfs. Although the overall PZT-on-CMOS process steps seem straightforward, producing a successful prototype is a non-trivial task. For instance, a considerable amount of effort and iteration has gone into overcoming challenges in our small-pitch ultrasound chip, such as the stress concentration during dicing, which could crack the stack, and controlling the uniformity of the PZT layer, as variations can affect the transducer performance.

#### **1.4.3.** TRANSCEIVER ASIC DESIGN CONSIDERATIONS

Other challenges lie in the pitch-matched transceiver ASIC design. A typical system architecture of such an ASIC interfacing an *N*-element transducer array is depicted in Fig. 1.3(b). Corresponding to the ASIC layout arrangement shown in Fig. 1.3(a), the associated circuit design is also divided into two parts: the pitch-matched region, where the circuit design needs to be compact and low-power due to the strictly limited area and its major contribution to the overall power consumption; and the peripheral region, where the area restriction is relaxed, but power consumption remains a concern.

In the pitch-matched region, the circuits typically comprise TX and RX circuitry, as well as T/R switches. During TX, the T/R switches connect the transducer elements to *N*-channel HV pulsers, and isolate the low-voltage (LV) RX circuitry from the high voltage, allowing for generating the intended ultrasound waves based on the TX beamforming settings, e.g., focused wave, diverging wave and plane wave [29]. Different kinds of pulser designs have been implemented in the prior art, such as unipolar pulsers [30]–[32], bipolar pulsers [33], and multi-level pulsers [34]. Meeting the HV requirement typically requires the use of an HV process, such as the Bipolar-CMOS-DMOS (BCD) technology. However, the bulky HV devices used in the pulsers can often conflict with the area constraints, which becomes a serious problem in our design due to the small transducer pitch. Other concerns that need to be considered include the effective distribution of the beamforming settings.

During RX, the T/R switches connect the transducer elements to low-noise amplifiers (LNAs), followed by time-gain compensation (TGC) and the subsequent back-end RX signal processing. A range of LNA designs have been implemented in the prior art, which can be categorized into voltage-mode amplifiers (VA) and trans-impedance amplifiers (TIA). Each category includes numerous variations of amplifiers, e.g., single-ended common-source VA [35], single-ended inverter-based VA [36], differential inverter-based VA [37], single-ended common-source TIA [30], etc. The design requirements that apply to LNAs in other applications also apply to LNAs in ultrasound, including low noise, low power consumption, and good noise figure. The need for the subsequent time-gain compensation block arises from the attenuation of the ultrasound wave as it propagates in the medium, which follows an exponential trajectory as a function of scan depth. Handling the large dynamic range associated with the attenuation is a challenge in the design of the RX signal chain, whereas the TGC block compensates for the attenuation by adjusting the gain as a function of time, thereby reducing the complexity of the subsequent back-end circuitry. Conventionally, the TGC function is implemented with an explicit variable gain amplifier (VGA) [38] or a programmable gain amplifier (PGA) [39]–[41], following the LNA, whose gain often follows an exponential trajectory controlled by an analog signal or a digital code. Recently, an emerging type of TGC, which is made of an interpolating VGA to reduce the linear-in-dB gain error, has been reported in [42]. It also combines the LNA with the TGC function into one stage, and utilizes a dynamic biasing scheme to optimize the power consumption. In contrast to PGAs and VGAs used in other applications [43], TGCs used in ultrasound imaging generally require an accurate gain that is insensitive to process/voltage/temperature (PVT) variation, and good power control to prevent overheating issues.

There are also special design considerations to take into account for the LNA and TGC that are associated with image artifacts. In ultrasound imaging, an image artifact is any structure or feature in the ultrasound image that does not present a valid object within the tissue [44]. Image artifacts can result from a variety of sources, many of which are related to the transducer structure and the imaged object. For example, reverberation artifacts can occur when the ultrasound beam reflects back and forth between a strongly reflecting interface and the transducer surface [45], whereas some artifacts can be traced back to transceiver ASIC design. For example, TGC artifacts [45] may appear in the image due to the mismatch between the applied gain compensation and the actual ultrasound attenuation rate in the target tissues. Other artifacts may arise from inappropriate switching transients that inject energy to the transducers during RX, leading to second-time reflection caused by parasitic ultrasound transmission, such as the gain-switching artifacts associated with the switching transients between adjacent gain steps in the AFE [42], and the TX/RX switching artifacts associated with the switching transients of the T/R switches [46]. During the design of the RX analog front-end (AFE), including the LNA and TGC, it is important to consider the need for mitigating such artifacts, in addition to the requirements for compact size, low noise, and low power consumption.

At this point, a solution to the interconnection issue between a transducer array and a related transceiver ASIC has been proposed, and the resulting challenges associated with transducer manufacturing and part of the ASIC design have also been discussed. However, it should be noted that the number of required ASIC outputs and pads remains high, as they need to match the number of transducer elements if the analog front-end circuits are directly buffered and output to the pads. For a transducer array with over 10,000 elements, chip packaging of such an ASIC is impractical and unrealistic. Consequently, channel-reduction techniques must be introduced in either the RX backend circuit, the peripheral signal processing circuit, or both. A variety of channel-reduction schemes have been proposed in the prior art, including time-division multiplexing (TDM) [31], [37], [47], which enables different AFEs to share an output pad by assigning a dedicated time slot to each AFE; micro-beamforming [40], [48], which applies predefined delays to AFEs corresponding to neighboring transducer elements, followed by circuitry that sums the delayed signals into one output signal; and on-chip digitization [39], [49], which converts the analog outputs of the AFEs to digital signals in the pitch-matched region, followed by concatenating several digitized signals into one digital output at the periphery. As will be elaborated in the following chapters, these techniques have both advantages and disadvantages, while a significant reduction in channel count is necessary in our application. For instance, a 128-fold reduction is required to reduce the channel count below 200, thus reducing the complexity in chip packaging. Factoring in various design constraints, such as the high frame rate, near-field image resolution and very limited area, achieving a 128-fold reduction in channel count requires an innovative and effective channel-reduction technique.
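The delay-and-sum principle behind micro-beamforming can be illustrated with a minimal Python sketch. It delays the sampled signals of a few neighboring elements by whole numbers of samples and sums them into a single output. The function name `delay_and_sum`, the integer sample delays, and the toy pulse data are assumptions made purely for illustration; they do not describe the circuit-level implementation presented in this thesis.

```python
def delay_and_sum(channels, delays):
    """Delay each channel by an integer number of samples and sum them.

    channels: list of equally long sample lists, one per transducer element.
    delays:   list of non-negative integer delays (in samples), one per channel.
    Samples shifted in from before the start of a record are treated as zero.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        if len(samples) != n:
            raise ValueError("all channels must have the same length")
        for k in range(d, n):
            out[k] += samples[k - d]
    return out


# Toy data (hypothetical): the same echo arrives one sample later on each element.
pulse_at = [2, 3, 4, 5]
channels = [[1.0 if k == p else 0.0 for k in range(10)] for p in pulse_at]

# Earlier arrivals are delayed more, so that all echoes line up at sample 5.
delays = [max(pulse_at) - p for p in pulse_at]
summed = delay_and_sum(channels, delays)
```

In this toy case four element signals are reduced to one output, which corresponds to a 4-fold channel-count reduction for a 4-element sub-array.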

#### **1.4.4.** CONCLUSION

The ultrasound chip designed for 3-D TFUS requires a direct integration between the transducer and ASIC. The design challenges are not only limited to the electronics, but are also influenced by the transducer, and the two are closely intertwined. The art of balance can only be found by factoring in different design aspects, and by a system-level optimization. In this thesis, two generations of ultrasound transceiver ASICs integrated with PZT transducer arrays for the TFUS application are presented. In the first generation, a novel AFE design that combines an LNA with the continuous TGC function is realized in a bid to mitigate the gain-switching and T/R switching artifacts. Besides, a new current-mode micro-beamforming (µBF) design based on boxcar integration (BI) is also implemented to reduce the channel count within a compact layout. In the second generation, the AFE is derived from the first version, while the design focuses on RX backend circuitry and channel-count reduction, including a passive BI-based µBF merged with a charge-sharing SAR ADC, which digitizes the delayed-and-summed (DAS) signals, and a subsequent multi-level data link, which concatenates outputs of four ADCs. In total, a 128-fold reduction in channel count is finally achieved.

#### **1.5.** Thesis Organization

The thesis is organized as follows.

Chapter 2 presents a compact analog front-end circuit for ultrasound receivers with linear-in-dB continuous gain control for time-gain compensation. The AFE consists of two variable-gain stages, both of which employ a novel complementary current-steering network as the interpolator to realize continuously-variable gain. The first stage is a trans-impedance amplifier (TIA) with a hardware-sharing inverter-based input stage to save power and area. The TIA's output couples capacitively to the second stage, which is a class-AB current amplifier. The AFE is integrated into an application-specific integrated circuit in a 180-nm high-voltage BCD technology and assembled with a 100-µm-pitch PZT transducer array of 8×8 elements.

Chapter 3 presents the complete 1st-generation transceiver ASIC for 3-D TFUS, mainly focusing on a novel µBF receiver architecture. The µBF employs current-mode summation and boxcar integration to realize delay-and-sum on an *N*-element sub-array using *N*× fewer capacitive memory elements than conventional µBF implementations, thus reducing the hardware overhead associated with the memory elements. The boxcar integration also obviates the need for explicit anti-alias filtering in the analog front-end, thus further reducing die area. These features facilitate the use of µBF in smaller-pitch applications, as demonstrated by a prototype transceiver ASIC employing µBF on sub-arrays of *N*=4 elements, targeting a wearable ultrasound device that monitors brain perfusion in preterm infants via the fontanel.

Chapter 4 presents the 2nd-generation pitch-matched transceiver ASIC for wearable ultrasound devices used in TFUS. The ASIC interfaces with a 16×16 2-D transducer array with 125-µm pitch and 9-MHz central frequency, including element-level unipolar pulsers with transmit beamforming, and receive circuitries that combine 8-fold multiplexing, 4-channel micro-beamforming (µBF) and sub-array-level digitization to achieve a 128-fold channel-count reduction. The µBF is based on passive boxcar integration, merged with a 10-bit 40 MS/s SAR ADC in the charge domain, thus obviating the need for explicit anti-alias filtering and power-hungry ADC drivers. A compact and low-power reference generator employs an area-efficient MOS capacitor as a reservoir to quickly set a reference for the ADC in the charge domain. A low-power multi-level data link, based on 16-level pulse-amplitude modulation, further concatenates outputs of four ADCs, leading to an aggregate 3.84 Gb/s data rate.

Chapter 5 concludes the thesis by summarizing the main contributions and findings, and providing recommendations for future research.

#### REFERENCES

- [1] S. Chawanpaiboon, J. P. Vogel, A.-B. Moller, P. Lumbiganon, M. Petzold, D. Hogan, S. Landoulsi, N. Jampathong, K. Kongwattanakul, M. Laopaiboon, C. Lewis, S. Rattanakanokchai, D. N. Teng, J. Thinkhamrop, K. Watananirun, J. Zhang, W. Zhou, and A. M. Gülmezoglu, "Global, regional, and national estimates of levels of preterm birth in 2014: A systematic review and modelling analysis," *The Lancet Global Health*, vol. 7, no. 1, e37–e46, Jan. 2019.
- [2] J. Baranger, O. Villemain, M. Wagner, M. Vargas-Gutierrez, M. Seed, O. Baud, B. Ertl-Wagner, and J. Aguet, "Brain perfusion imaging in neonates," *NeuroImage: Clinical*, vol. 31, p. 102 756, Jan. 2021.
- [3] M. Proisy, S. Mitra, C. Uria-Avellana, M. Sokolska, N. Robertson, F. Le Jeune, and J.-C. Ferré, "Brain Perfusion Imaging in Neonates: An Overview," *AJNR Am J Neuroradiol*, vol. 37, no. 10, pp. 1766–1773, Oct. 2016.
- [4] S. D. Vik, H. Torp, T. Follestad, R. Støen, and S. A. Nyrnes, "NeoDoppler: New ultrasound technology for continuous cerebral circulation monitoring in neonates," *Pediatr Res*, vol. 87, no. 1, pp. 95–103, Jan. 2020.
- [5] S. S. Ødegård, H. Torp, T. Follestad, M. Leth-Olsen, R. Støen, and S. A. Nyrnes, "Low frequency cerebral arterial and venous flow oscillations in healthy neonates measured by NeoDoppler," *Front Pediatr*, vol. 10, p. 929 117, Nov. 2022.
- [6] J. Baranger, C. Demene, A. Frerot, F. Faure, C. Delanoë, H. Serroune, A. Houdouin, J. Mairesse, V. Biran, O. Baud, and M. Tanter, "Bedside functional monitoring of the dynamic brain connectivity in human neonates," *Nat Commun*, vol. 12, no. 1, p. 1080, Feb. 2021.
- [7] *Cranial Ultrasound*, https://www.radiologyinfo.org/en/info/ultrasound-cranial.
- [8] Cranial sutures and fontanels, https://www.mayoclinic.org/diseases-conditions.
- [9] D. I. Altman and J. J. Volpe, "Positron Emission Tomography in Newborn Infants," *Clinics in Perinatology*, Newer Technologies and the Neonate, vol. 18, no. 3, pp. 549–562, Sep. 1991.
- [10] S. Hyttel-Sorensen, A. Pellicer, T. Alderliesten, T. Austin, F. van Bel, M. Benders, O. Claris, E. Dempsey, A. R. Franz, M. Fumagalli, C. Gluud, B. Grevstad, C. Hagmann, P. Lemmers, W. van Oeveren, G. Pichler, A. M. Plomgaard, J. Riera, L. Sanchez, P. Winkel, M. Wolf, and G. Greisen, "Cerebral near infrared spectroscopy oximetry in extremely preterm infants: Phase II randomised clinical trial," *BMJ*, vol. 350, no. jan05 2, g7635–g7635, Jan. 2015.
- [11] D. A. Nzeh, S. A. Erinle, S. A. Saidu, and S. D. Pam, "Transfontanelle Ultra-Sonography: An Invaluable Tool in the Assessment of the Infant Brain," *Trop Doct*, vol. 34, no. 4, pp. 226– 227, Oct. 2004.

- [12] M. Guillot, V. Chau, and B. Lemyre, "Routine imaging of the preterm neonatal brain," *Paediatrics & Child Health*, vol. 25, no. 4, pp. 249–255, Jun. 2020.
- [13] J. H. Squires, N. H. Beluk, V. K. Lee, T. D. Yanowitz, S. Gumus, S. Subramanian, and A. Panigrahy, "Feasibility and Safety of Contrast-Enhanced Ultrasound of the Neonatal Brain: A Prospective Study Using MRI as the Reference Standard," *American Journal of Roentgenology*, vol. 218, no. 1, pp. 152–161, Jan. 2022.
- [14] J.-K. Choi, M.-G. Choi, J.-M. Kim, and H.-M. Bae, "Efficient Data Extraction Method for Near-Infrared Spectroscopy (NIRS) Systems With High Spatial and Temporal Resolution," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 7, no. 2, pp. 169–177, Apr. 2013.
- [15] A. S. Ginsburg, Z. Liddy, P. T. Khazaneh, S. May, and F. Pervaiz, "A survey of barriers and facilitators to ultrasound use in low- and middle-income countries," *Sci Rep*, vol. 13, no. 1, p. 3322, Feb. 2023.
- [16] M. Necas, "The clinical ultrasound report: Guideline for sonographers," *Australasian Journal of Ultrasound in Medicine*, vol. 21, no. 1, pp. 9–23, 2018.
- [17] G. Meijler and S. J. Steggerda, *Neonatal Cranial Ultrasonography*. Cham: Springer International Publishing, 2019.
- [18] J. S. Shanewise, A. T. Cheung, S. Aronson, W. J. Stewart, R. L. Weiss, J. B. Mark, R. M. Savage, P. Sears-Rogan, J. P. Mathew, M. A. Quiñones, M. K. Cahalan, and J. S. Savino, "ASE/SCA Guidelines for Performing a Comprehensive Intraoperative Multiplane Transesophageal Echocardiography Examination: Recommendations of the American Society of Echocardiography Council for Intraoperative Echocardiography and the Society of Cardiovascular Anesthesiologists Task Force for Certification in Perioperative Transesophageal Echocardiography," *Journal of the American Society of Echocardiography*, vol. 12, no. 10, 1999.
- [19] C. Chen, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, M. A. P. Pertijs, S. B. Raghunathan, Z. Yu, M. Shabanimotlagh, Z. Chen, Z.-y. Chang, S. Blaak, C. Prins, and J. Ponte, "A Prototype PZT Matrix Transducer With Low-Power Integrated Receive ASIC for 3-D Transesophageal Echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 63, no. 1, pp. 47–59, Jan. 2016.
- [20] J. Kortenbout, S. Costerus, J. de Graaff, G. McLaughlin, N. de Jong, J. Dudink, H. Vos, and J. Bosch, "Automatic High Frame Rate Spectral Envelope Detection to Calculate Parameter Maps of Neonatal Brain Perfusion," in 2022 IEEE International Ultrasonics Symposium (IUS).
- [21] S. A. Costerus, A. J. Kortenbout, H. J. Vos, P. Govaert, D. Tibboel, R. M. H. Wijnen, N. de Jong, J. G. Bosch, and J. C. de Graaff, "Feasibility of Doppler Ultrasound for Cortical Cerebral Blood Flow Velocity Monitoring During Major Non-cardiac Surgery of Newborns," *Frontiers in Pediatrics*, vol. 9, 2021.
- [22] G. A. Popich and D. W. Smith, "Fontanels: Range of normal size," *The Journal of Pediatrics*, vol. 80, no. 5, pp. 749–752, May 1972.
- [23] G. Duc and R. H. Largo, "Anterior Fontanel: Size and Closure in Term and Preterm Infants," *Pediatrics*, vol. 78, no. 5, pp. 904–908, Nov. 1986.

- [24] Q. Yue, D. Liu, W. Wang, W. Di, D. Lin, X. Wang, and H. Luo, "Fabrication of a PMN-PT Single Crystal-Based Transcranial Doppler Transducer and the Power Regulation of Its Detection System," *Sensors*, vol. 14, no. 12, pp. 24 462–24 471, Dec. 2014.
- [25] B. T. Khuri-Yakub and O. Oralkan, "Capacitive micromachined ultrasonic transducers for medical imaging and therapy," *J Micromech Microeng*, vol. 21, no. 5, pp. 54004–54014, May 2011.
- [26] C. T. Lancée, R. Daigle, D. J. Sahn, and J. M. Thijssen, "Transducer applications in echocardiology," *Ultrasonics*, vol. 23, no. 5, pp. 199–205, Sep. 1985.
- [27] D. Wildes, W. Lee, B. Haider, S. Cogan, K. Sundaresan, D. M. Mills, C. Yetter, P. H. Hart, C. R. Haun, M. Concepcion, J. Kirkhorn, and M. Bitoun, "4-D ICE: A 2-D Array Transducer With Integrated ASIC in a 10-Fr Catheter for Real-Time 3-D Intracardiac Echocardiography," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 63, no. 12, pp. 2159–2173, Dec. 2016.
- [28] N. Sanchez, K. Chen, C. Chen, D. McMahill, S. Hwang, J. Lutsky, J. Yang, L. Bao, L. K. Chiu, G. Peyton, H. Soleimani, B. Ryan, J. R. Petrus, Y.-J. Kook, T. S. Ralston, K. G. Fife, and J. M. Rothberg, "34.1 An 8960-Element Ultrasound-on-Chip for Point-of-Care Ultrasound," in *2021 IEEE International Solid-State Circuits Conference (ISSCC)*, vol. 64, Feb. 2021, pp. 480–482.
- [29] L. Demi, "Practical Guide to Ultrasound Beam Forming: Beam Pattern and Image Reconstruction Analysis," *Applied Sciences*, vol. 8, no. 9, p. 1544, Sep. 2018.
- [30] G. Gurun, C. Tekes, J. Zahorian, T. Xu, S. Satir, M. Karaman, J. Hasler, and F. L. Degertekin, "Single-chip CMUT-on-CMOS front-end system for real-time volumetric IVUS and ICE imaging," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 61, no. 2, pp. 239–250, Feb. 2014.
- [31] G. Jung, C. Tekes, M. W. Rashid, T. M. Carpenter, D. Cowell, S. Freear, F. L. Degertekin, and M. Ghovanloo, "A reduced-wire ICE catheter ASIC with tx beamforming and rx time-division multiplexing," *IEEE Trans. Biomed. Circuits Syst.*, vol. 12, no. 6, pp. 1246–1255, Dec. 2018.
- [32] E. Kang, Q. Ding, M. Shabanimotlagh, P. Kruizinga, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Reconfigurable Ultrasound Transceiver ASIC With $24\times40$ Elements for 3-D Carotid Artery Imaging," *IEEE J. Solid-State Circuits*, vol. 53, no. 7, pp. 2065–2075, Jul. 2018.
- [33] M. Tan, E. Kang, J. An, Z. Chang, P. Vince, T. Matéo, N. Sénégond, and M. A. P. Pertijs, "A 64-Channel Transmit Beamformer With ±30-V Bipolar High-Voltage Pulsers for Catheter-Based Ultrasound Probes," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 7, pp. 1796–1806, Jul. 2020.
- [34] K. Chen, H.-S. Lee, and C. G. Sodini, "A Column-Row-Parallel ASIC Architecture for 3-D Portable Medical Ultrasonic Imaging," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 738– 751, Mar. 2016.

- [35] M. Sautto, A. S. Savoia, F. Quaglia, G. Caliano, and A. Mazzanti, "A Comparative Analysis of CMUT Receiving Architectures for the Design Optimization of Integrated Transceiver Front Ends," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control,* vol. 64, no. 5, pp. 826–838, May 2017.
- [36] C. Chen, Z. Chen, D. Bera, S. B. Raghunathan, M. Shabanimotlagh, E. Noothout, Z. Chang, J. Ponte, C. Prins, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A front-end ASIC with receive sub-array beamforming integrated with a 32 × 32 PZT matrix transducer for 3-D transesophageal echocardiography," *IEEE J. Solid-State Circuits*, vol. 52, no. 4, pp. 994–1006, Apr. 2017.
- [37] M. Tan, C. Chen, Z. Chen, J. Janjic, V. Daeichin, Z.-Y. Chang, E. Noothout, G. van Soest, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Front-End ASIC With High-Voltage Transmit Switching and Receive Digitization for 3-D Forward-Looking Intravascular Ultrasound Imaging," *IEEE J. Solid-State Circuits*, vol. 53, no. 8, pp. 2284–2297, Aug. 2018.
- [38] T. Halvorsrod, O. Birkenes, and C. Eichrodt, "A Low-Power Method Adding Continuous Variable Gain to Amplifiers," in 2005 IEEE International Symposium on Circuits and Systems, Kobe, Japan: IEEE, 2005, pp. 1593–1596.
- [39] C. Chen, Z. Chen, D. Bera, E. Noothout, Z.-Y. Chang, M. Tan, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Front-End ASIC With Integrated Subarray Beamforming ADC for Miniature 3-D Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3050–3064, Nov. 2018.
- [40] Zili Yu, S. Blaak, Zu-yao Chang, Jiajian Yao, J. G. Bosch, C. Prins, C. T. Lancée, N. de Jong, M. A. P. Pertijs, and G. C. M. Meijer, "Front-end receiver electronics for a matrix transducer for 3-D transesophageal echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 59, no. 7, pp. 1500–1512, Jul. 2012.
- [41] J. Lee, K.-R. Lee, B. E. Eovino, J. H. Park, L. Y. Liang, L. Lin, H.-J. Yoo, and J. Yoo, "A 36-Channel Auto-Calibrated Front-End ASIC for a pMUT-Based Miniaturized 3-D Ultrasound System," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 6, pp. 1910–1923, Jun. 2021.
- [42] E. Kang, M. Tan, J.-S. An, Z.-Y. Chang, P. Vince, N. Sénégond, T. Mateo, C. Meynier, and M. A. P. Pertijs, "A Variable-Gain Low-Noise Transimpedance Amplifier for Miniature Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 55, no. 12, pp. 3157–3168, Dec. 2020.
- [43] J. Xiao, I. Mehr, and J. Silva-Martinez, "A High Dynamic Range CMOS Variable Gain Amplifier for Mobile DTV Tuner," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 292–301, Feb. 2007.
- [44] J. L. Prince and J. M. Links, *Medical Imaging Signals and Systems*, 2nd ed. Boston: Pearson, 2015.
- [45] P. R. Hoskins, K. Martin, and A. Thrush, Eds., *Diagnostic Ultrasound: Physics and Equipment* (Cambridge Medicine), 2nd ed. Cambridge, UK; New York: Cambridge University Press, 2010.

- [46] T. Kim, F. Fool, D. S. dos Santos, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "Design of an Ultrasound Transceiver ASIC with a Switching-Artifact Reduction Technique for 3D Carotid Artery Imaging," *Sensors*, vol. 21, no. 1, p. 150, Jan. 2021.
- [47] G. Jung, M. W. Rashid, T. M. Carpenter, C. Tekes, D. M. J. Cowell, S. Freear, F. L. Degertekin, and M. Ghovanloo, "Single-chip reduced-wire active catheter system with programmable transmit beamforming and receive time-division multiplexing for intracardiac echocardiography," in *2018 IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2018, pp. 188–190.
- [48] B. Savord and R. Solomon, "Fully sampled matrix transducer for real time 3D ultrasonic imaging," in *IEEE Symposium on Ultrasonics, 2003*, vol. 1, Oct. 2003, pp. 945–953.
- [49] M.-C. Chen, A. Peña Perez, S.-R. Kothapalli, P. Cathelin, A. Cathelin, S. S. Gambhir, and B. Murmann, "A Pixel Pitch-Matched Ultrasound Receiver for 3-D Photoacoustic Imaging With Integrated Delta-Sigma Beamformer in 28-nm UTBB FD-SOI," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 11, pp. 2843–2856, Nov. 2017.

## **2** A Pitch-Matched Low-Noise AFE With Accurate TGC for High-Density Ultrasound Transducer Arrays

This chapter is based on the publication "A Pitch-Matched Low-Noise Analog Front-End With Accurate Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," in IEEE Journal of Solid-State Circuits, vol. 58, no. 6, pp. 1693-1705, June 2023, doi: 10.1109/JSSC.2022.3200160.

#### **2.1.** INTRODUCTION

Application-specific integrated circuits (ASICs) play a key role in the miniaturization of 3-D ultrasound imaging devices. ASICs minimize interconnection wires, reduce power consumption, and improve image quality in various ultrasound systems. For instance, they have been successfully employed in catheter-based ultrasound devices that enable real-time 3-D image-guided interventions through intracardiac echocardiography (ICE) [1], [2] and intravascular ultrasound imaging (IVUS) [3]. ASICs are also required for emerging wearable ultrasound devices, such as the device targeted in this work: a cap-like wearable device with built-in integrated circuits providing high-resolution 3-D ultrasound images through transfontanelle ultrasonography (TFUS) for monitoring brain activity in premature neonates [4], [5]. For real-time 3-D ultrasound imaging, a high-density 2-D transducer array needs to be integrated in a pitch-matched fashion with analog front-end (AFE) circuits that process the echo signals received by the transducer elements [6].

A key challenge in ultrasound AFE design is to handle the large dynamic range (DR) of the echo signals due to propagation attenuation. The amplitude of an ultrasound wave decreases as it propagates through the medium being imaged, i.e., the human body. Not only the transmitted pulse but also the returning echo signals are attenuated [7]. The amplitude $A_z$ of an ultrasound wave can be expressed as

$$A_z = A_0 e^{-\mu_0 z}$$



Fig. 2.1: (a) Block diagram of the AFE with TGC function followed by an ADC. (b) Input and output signals of the AFE with TGC function. (c) Evolution of dynamic range as a function of time with TGC function.

where $A_0$ is the initial amplitude at the surface of the ultrasound transducer, $\mu_0$ is the amplitude attenuation factor, and $z$ is the propagation distance of the acoustic wave in the medium. The equation indicates that the ultrasound wave decays exponentially. Besides, it decays faster at higher frequency because $\mu_0$ is proportional to the frequency of the ultrasound wave. In a typical medium, e.g., the human brain, the attenuation factor is about 0.435 dB·cm<sup>-1</sup>·MHz<sup>-1</sup> [8], implying that a 10-MHz ultrasound wave is attenuated by 35 dB after traveling a round-trip distance of 8 cm, corresponding to the desired 4-cm imaging depth in the mentioned neonatal brain-monitoring application. This leads to an overall dynamic range of 75 dB if an instantaneous dynamic range of about 40 dB is required at any depth. Taking the wide bandwidth of the ultrasound transducer into account, this poses considerable challenges to the circuit design and would lead to a power-hungry analog-to-digital converter (ADC) if the ultrasound signal were directly digitized.
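The numbers quoted above can be reproduced with a few lines of Python. The sketch below simply multiplies the attenuation factor by frequency and round-trip distance; the function and variable names are illustrative assumptions, and the 35 dB and 75 dB figures in the text are the rounded versions of the results.

```python
def round_trip_attenuation_db(alpha_db_per_cm_mhz, freq_mhz, depth_cm):
    """Attenuation in dB accumulated on the way to a given depth and back."""
    round_trip_cm = 2.0 * depth_cm
    return alpha_db_per_cm_mhz * freq_mhz * round_trip_cm


# Values taken from the text: 0.435 dB/cm/MHz, 10 MHz, 4 cm imaging depth.
attenuation_db = round_trip_attenuation_db(0.435, 10.0, 4.0)  # about 34.8 dB

# Instantaneous dynamic range required at any depth (from the text).
instantaneous_dr_db = 40.0
overall_dr_db = attenuation_db + instantaneous_dr_db  # about 74.8 dB

# The same attenuation expressed as an amplitude ratio A_z / A_0.
amplitude_ratio = 10.0 ** (-attenuation_db / 20.0)

print(f"round-trip attenuation: {attenuation_db:.1f} dB")
print(f"overall dynamic range:  {overall_dr_db:.1f} dB")
print(f"amplitude ratio:        {amplitude_ratio:.4f}")
```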

Time-gain compensation (TGC) is a technique to compensate for the attenuation by adjusting the gain of the AFE as a function of time, as illustrated in Fig. 2.1(a). Besides providing low-noise amplification, the AFE also provides TGC by providing gain that increases exponentially with time, as depicted in Fig. 2.1(b). The aforementioned 75 dB dynamic range is thus reduced to about 40 dB after the AFE with TGC function as shown in Fig. 2.1(c). This is essential to relax the complexity of the back-end circuitry, e.g., the ADC.
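To make the relation between time and the required gain concrete, the following Python sketch computes an idealized TGC gain ramp. It assumes a speed of sound of 1540 m/s, a commonly used value for soft tissue that is not stated in the text, and it reuses the attenuation factor and frequency quoted above. The names `tgc_gain_db` and `SPEED_OF_SOUND_CM_PER_US` are hypothetical.

```python
# Assumption (not from the text): speed of sound of 1540 m/s = 0.154 cm/us.
SPEED_OF_SOUND_CM_PER_US = 0.154


def tgc_gain_db(t_us, alpha_db_per_cm_mhz=0.435, freq_mhz=10.0):
    """Ideal compensation gain (dB) for an echo received t_us after transmit.

    At time t the echo has travelled a round-trip distance of c * t, so the
    gain that cancels the attenuation grows linearly in dB with time, which
    is an exponential increase of the linear gain.
    """
    round_trip_cm = SPEED_OF_SOUND_CM_PER_US * t_us
    return alpha_db_per_cm_mhz * freq_mhz * round_trip_cm


# Arrival time of an echo from the 4-cm imaging depth (8-cm round trip).
t_end_us = 8.0 / SPEED_OF_SOUND_CM_PER_US
gain_end_db = tgc_gain_db(t_end_us)   # matches the round-trip attenuation
slope_db_per_us = tgc_gain_db(1.0)    # required gain slope in dB per microsecond

print(f"echo from 4 cm arrives after {t_end_us:.1f} us")
print(f"gain needed at that time:    {gain_end_db:.1f} dB")
print(f"gain slope:                  {slope_db_per_us:.3f} dB/us")
```

Under these assumptions the gain has to ramp by roughly 35 dB within about 52 µs, which is the range the TGC must cover to bring the 75 dB overall dynamic range down to the 40 dB instantaneous range.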

Image artifacts may appear when the TGC function does not match the attenuation rate in the medium [8]. For example, in B-mode imaging, inappropriate compensation gain would result in image-brightness variations in uniform tissue. This implies that a well-designed TGC should have linear-in-dB gain control with a gain error as small as possible throughout the overall gain range. Other types of image artifacts associated with switching transients of the interfacing electronics should also be minimized in the AFE design, such as the gain-switching artifacts [9], and the transmission/reception (TX/RX) switching artifacts [10].

In this chapter, we present a compact, pitch-matched analog front-end combined with time-gain-compensation function that achieves a >2× better linear-in-dB gain error than prior designs [9], [11], [12], while consuming less power and occupying less area [13]. The AFE is equipped with a novel complementary current-steering network that interpolates between discrete gain steps, and employs a hardware-sharing topology to reduce the area and power consumption. Thus, the gain of the AFE can be controlled in a linear-in-dB fashion without any abrupt gain changes by a continuously-ramping analog control voltage. The AFE biases the transducer to ground level, which prevents voltage changes on the transducer between pulse transmission (TX), where a unipolar high-voltage pulse is imposed on the transducer, and echo reception (RX), thus reducing the possibility of imaging artifacts associated with TX/RX switching [10]. As a proof of concept targeting the mentioned neonatal brain-monitoring application, a 100-µm-pitch 10-MHz transducer array consisting of 8×8 elements has been built on top of the pitch-matched AFE to demonstrate its functionality in a miniaturized ultrasound probe.

This chapter is organized as follows. Section 2.2 reviews the prior art and categorizes different TGC architectures according to the way an exponential gain curve is approximated. Section 2.3 describes the architecture design of our AFE and presents the theoretical basis of the novel complementary current-steering network (CCSN). Section 2.4 presents the detailed circuit implementation. Section 2.5 describes the fabricated prototype, as well as the electrical and acoustic measurement results. This chapter ends with conclusions.

#### **2.2.** Prior Art

Various amplifier topologies that vary their gain linearly in dB can be used to implement TGC. These amplifiers are widely used in different applications such as mobile TV [15], wireless communication systems [14], and ultrasound imaging systems [9]. These designs can be further divided into three categories, i.e., discrete-time programmable gain amplifiers (PGAs) [6], [16], analog variable-gain amplifiers (VGAs) [11], [14], [17], and interpolating variable-gain amplifiers [9], [15], [18], [19].

Fig. 2.2: Circuits to implement TGC: (a) discrete-time PGA [6]; (b) analog VGA [14]; (c) interpolating VGA [15].

A discrete-time PGA utilizes a passive or active feedback network to generate discrete gain steps controlled by a digital signal, as illustrated in Fig. 2.2(a). In modern integrated circuits, these gain steps are generally defined by matched devices, and can thus be precisely distributed on an exponential trajectory. However, for an ultrasound imaging system, the switching transients between adjacent steps would lead to imaging artifacts if these steps are spaced far apart. Finer gain steps mitigate such artifacts but likely require a large chip area [20], and therefore are hard to realize in a pitch-matched AFE design with small pitch size.

A continuous TGC function can be implemented with an analog VGA, and various techniques can be used to approximate the exponential gain curve, e.g., using multiple open-loop stages as depicted in Fig. 2.2(b) [14]. In each stage, a first-order pseudo-exponential function is generally implemented, e.g., $\frac{1+x}{1-x}$ [17], [21], which results in a gain error of less than 0.5 dB within an overall gain range of 15 dB. A wider gain range can be obtained by cascading gain stages, resulting in a $(\frac{1+x}{1-x})^n$ curve, at the cost of lower signal-chain bandwidth. However, these open-loop structures are usually vulnerable to temperature and process variation.
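The accuracy of the first-order pseudo-exponential function can be checked numerically. The Python sketch below evaluates the gain $\frac{1+x}{1-x}$ in dB over a symmetric control range and reports its largest deviation from a straight line in dB. Defining the error against the line through the two end points of the gain range is an assumption made here for illustration; the cited works may define the gain error differently. All function names are hypothetical.

```python
import math


def pseudo_exp_gain_db(x):
    """Gain of the first-order pseudo-exponential function (1+x)/(1-x) in dB."""
    return 20.0 * math.log10((1.0 + x) / (1.0 - x))


def max_linear_in_db_error(range_db, points=2001):
    """Peak deviation (dB) from an ideal linear-in-dB curve.

    The control variable x is swept symmetrically so that the gain spans
    range_db in total. The ideal curve is taken as the straight line through
    the two end points (an assumed error definition).
    """
    # Linear gain at the upper end of the range, and the x that produces it.
    r = 10.0 ** (range_db / 40.0)
    x_max = (r - 1.0) / (r + 1.0)
    slope_db = (range_db / 2.0) / x_max
    worst = 0.0
    for i in range(points):
        x = -x_max + 2.0 * x_max * i / (points - 1)
        worst = max(worst, abs(pseudo_exp_gain_db(x) - slope_db * x))
    return worst


err_15db = max_linear_in_db_error(15.0)
err_30db = max_linear_in_db_error(30.0)
print(f"peak error over a 15 dB range: {err_15db:.2f} dB")
print(f"peak error over a 30 dB range: {err_30db:.2f} dB")
```

With this error definition the deviation stays below 0.5 dB over a 15 dB range and grows quickly when a single stage is stretched to a wider range, which is consistent with the motivation for cascading stages.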

An interpolating VGA makes a good trade-off between the aforementioned solutions. As illustrated in Fig. 2.2(c), it establishes a few exponentially-spaced gain steps and employs analog interpolation to smooth the gain curve between adjacent gain steps. This solution is less sensitive to temperature and process variation, as the gain steps are accurately defined by a matched network, e.g., a C-2C capacitor ladder [9], [15]. It also has no abrupt gain transition between two steps, and hence mitigates the artifact issue to some extent. Nevertheless, the interpolation process still deviates from the ideal exponential trajectory, and the interpolation error even dominates the overall linear-in-dB gain error [9].

As will be explained in Section 2.3, we introduce a novel complementary current-steering network (CCSN) that interpolates the gain steps along a second-order pseudo-exponential trajectory, resulting in a very small gain error over the overall gain range. In combination with the scalable feedback network and a hardware-sharing technique among different channels, we build a compact and accurate linear-in-dB time-gain compensation analog front-end circuit.

#### **2.3.** ARCHITECTURE DESIGN

#### **2.3.1.** Two-Stage Interpolating VGA

As mentioned above, a precise linear-in-dB gain can be obtained with a multi-stage VGA that realizes a higher-order pseudo-exponential function. In this work, we propose a two-stage interpolating VGA, as depicted in Fig. 2.3(a). The first stage consists of a trans-impedance amplifier (TIA) with a feedback capacitor array $C_{in}$, followed by a capacitor array $C_{out}$ that couples to the virtual ground of the second stage and thus turns the TIA's output voltage $v_{out1}$ into a current signal $i_{out1}$. The large loop gain of the TIA and the low input impedance of the second stage ensure that $i_{out1}$ is a precisely scaled version of the transducer signal current $i_{TD}$. The ratio of $i_{out1}$ to $i_{TD}$ is defined by the two capacitor arrays, $C_{in}$ and $C_{out}$, and the complementary current-steering network (CCSN), as elaborated below.

The second stage is a class-AB current-mirror-based current amplifier (CA), in which the discrete gain steps are precisely defined by a series of current branches  $CB_{in}$ ,  $CB_1$ , ...,  $CB_n$ ,  $CB_{out}$ . The ratio of the CA's output current  $i_{out2}$  to  $i_{out1}$  is defined by the ratio of the number of branches connected to its input and the number connected to the output. The output current signal  $i_{out2}$  is then converted to a voltage via a load resistor  $R_L$ .

The *n* gain steps of both stages follow a  $(\frac{1+x}{1-x})$  trajectory resulting in a  $(\frac{1+x}{1-x})^2$  function that is an accurate approximation of the required exponential gain-curve with a theoretical gain error below ±0.36 dB in an overall 36-dB range. The CCSNs interpolate between these discrete gain steps by steering the signal current between the adjacent steps such that the interpolation also follows a  $(\frac{1+x}{1-x})^2$  function, and therefore it significantly reduces the linear-in-dB gain error.
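As a rough numerical check of the approximation quality discussed above, the following Python sketch compares the $(\frac{1+x}{1-x})^n$ function with a straight line in dB. The helper names, the grid sizes, and the choice of a minimax line through the origin as the reference are assumptions made for illustration only. The error obtained this way depends on how the reference line is fitted, so it need not coincide exactly with the figures quoted in the text.

```python
import numpy as np


def pseudo_exp_db(x, stages=1):
    """Gain in dB of ((1 + x) / (1 - x)) ** stages for |x| < 1."""
    return stages * 20.0 * np.log10((1.0 + x) / (1.0 - x))


def x_limit(range_db, stages=1):
    """Largest |x| needed so that the gain spans range_db in total."""
    ratio = 10.0 ** (range_db / (2.0 * stages) / 20.0)  # per-stage gain at x_max
    return (ratio - 1.0) / (ratio + 1.0)


def linear_in_db_error(range_db, stages=1, points=1001, slopes=1001):
    """Peak deviation (dB) from the best straight line through the origin."""
    x_max = x_limit(range_db, stages)
    x = np.linspace(-x_max, x_max, points)
    gain = pseudo_exp_db(x, stages)
    tangent = stages * 40.0 / np.log(10.0)  # slope of the curve at x = 0
    secant = gain[-1] / x_max               # slope through the end points
    candidates = np.linspace(tangent, secant, slopes)
    # Peak absolute error of every candidate line; keep the smallest one.
    peak = np.abs(gain[None, :] - candidates[:, None] * x[None, :]).max(axis=1)
    return float(peak.min())


if __name__ == "__main__":
    print(f"1 stage, 15 dB range: {linear_in_db_error(15.0, 1):.2f} dB")
    print(f"2 stages, 36 dB range: {linear_in_db_error(36.0, 2):.2f} dB")
    print(f"1 stage, 36 dB range: {linear_in_db_error(36.0, 1):.2f} dB")
```

Comparing the last two cases shows why the second-order function is attractive: for the same overall range, each stage only needs to cover half of it, which keeps $x$ in the region where the curve stays close to a straight line in dB.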



Fig. 2.3: (a) Two-stage AFE with complementary current-steering networks (CCSNs); (b) realized pseudo-exponential interpolation with error relative to a linear-in-dB gain curve.

#### **2.3.2.** ANALYSIS OF CURRENT-STEERING PAIR

As depicted in Fig. 2.4(a), an NMOS current-steering pair consisting of NMOS transistors  $M_1$  and  $M_2$  steers the current signal flowing through  $M_0$  from one output to another in accordance with the applied control voltage  $V_c$ . The AC current  $i_0$  and the DC current  $I_0$  are both continuously directed from  $M_1$  to  $M_2$  controlled by a linear ramp-up signal  $V_c$ . The total currents flowing through  $M_1$  and  $M_2$  can be expressed in two parts, the large-signal currents  $I_{n1,2}$  [22] and the small-signal currents  $i_{n1,2}$  [15]:

$$I_{n1,2} \approx \frac{1}{2} I_0 \mp \frac{1}{2} \sqrt{\beta I_0 - \frac{1}{4} (\beta V_c)^2} \cdot V_c$$
(2.1)

$$i_{n1,2} \approx \left(\frac{1}{2} \mp \frac{\sqrt{\beta}\, V_c}{4\sqrt{I_0 - \frac{1}{4}\beta V_c^2}}\right) \cdot i_0$$
 (2.2)

where $i_0$ and $I_0$ are the AC current and DC current flowing through M<sub>0</sub>, $\beta = \mu_0 C_{ox}(W/L)_{1,2}$, and $(W/L)_{1,2}$ is the aspect ratio of $M_1$ and $M_2$. The currents $i_{n1}$ and $i_{n2}$ are approximately linear functions of $i_0$ and the control voltage $V_c$ if the absolute value of $V_c$ is small, while $I_{n1,2}$ are non-linear components that should be cancelled in the circuit.

Fig. 2.4: (a) NMOS current-steering pair with inset showing the control voltage as a function of time. (b) AC currents of NMOS pair as a function of control voltage V<sub>c</sub>.
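The behaviour of the current-steering pair can be explored numerically with the short Python sketch below. It evaluates the large-signal split of (2.1) and the small-signal split obtained by differentiating (2.1) with respect to $I_0$, i.e. with $\sqrt{\beta}\,V_c$ in the numerator so that the split is linear in $V_c$ for small $|V_c|$. The function names and the example values ($I_0$ = 100 µA, $\beta$ = 1 mA/V²) are hypothetical and only serve as an illustration of the square-law model.

```python
import math


def large_signal_currents(i_dc, beta, v_c):
    """DC currents (I_n1, I_n2) through M1 and M2, following Eq. (2.1)."""
    root = math.sqrt(beta * i_dc - 0.25 * (beta * v_c) ** 2)
    delta = 0.5 * root * v_c
    return 0.5 * i_dc - delta, 0.5 * i_dc + delta


def small_signal_fractions(i_dc, beta, v_c):
    """Fractions i_n1/i_0 and i_n2/i_0, i.e. the derivative of Eq. (2.1) w.r.t. I_0."""
    k = math.sqrt(beta) * v_c / (4.0 * math.sqrt(i_dc - 0.25 * beta * v_c ** 2))
    return 0.5 - k, 0.5 + k


def full_steering_voltage(i_dc, beta):
    """|V_c| at which the square-law model steers all current to one side."""
    return math.sqrt(2.0 * i_dc / beta)


if __name__ == "__main__":
    i_dc, beta = 100e-6, 1e-3  # hypothetical bias current (A) and beta (A/V^2)
    v_full = full_steering_voltage(i_dc, beta)
    for v_c in (-0.5 * v_full, 0.0, 0.5 * v_full):
        i_n1, i_n2 = large_signal_currents(i_dc, beta, v_c)
        f_n1, f_n2 = small_signal_fractions(i_dc, beta, v_c)
        print(f"V_c={v_c:+.3f} V  I_n1={i_n1 * 1e6:6.2f} uA  I_n2={i_n2 * 1e6:6.2f} uA  "
              f"i_n1/i_0={f_n1:.3f}  i_n2/i_0={f_n2:.3f}")
```

In this model the two DC currents always add up to $I_0$ and the two AC fractions always add up to one, which is the property the steering network relies on to redistribute the signal without losing it.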

#### **2.3.3.** TRANSIMPEDANCE AMPLIFIER WITH COMPLEMENTARY CURRENT-STEERING NETWORK

A simplified circuit diagram of the TIA with CCSN is shown in Fig. 2.5(a). A current-steering bias network (CSBN) converts the TGC control voltage  $V_c$  into a series of gate-control voltages for the CCSN, which are  $V_{cn1}, V_{cn2}, \ldots, V_{cnn}$  and  $V_{cp1}, V_{cp2}, \ldots, V_{cpn}$ , as depicted in Fig. 2.5(b). The TIA employs an inverter-based amplifier formed by M<sub>1</sub> and M<sub>2</sub> with DC bias current  $I_0$  and AC signal current  $i_0$ . The complementary current-steering network (CCSN) consists of n branches B<sub>1</sub>–B<sub>n</sub>, formed by PMOS steering array M<sub>p1</sub>–M<sub>pn</sub> and NMOS steering array M<sub>n1</sub>–M<sub>nn</sub>, and steers the AC signal currents  $\pm i_0$  to the virtual ground of TIA and the virtual ground of the next stage via the feedback capacitor array (C<sub>in</sub>) and the feed-forward capacitor array (C<sub>out</sub>), respectively.

At the beginning of the receive interval (Fig. 2.5(b)), the AC currents $\pm i_0$ are steered into the first capacitive divider of the capacitor arrays $C_{in}$ and $C_{out}$ via the only active steering branch $B_1$, corresponding to the lowest gain from $i_{TD}$ to $i_{out}$. Throughout the receive interval, the CCSN steers the currents from branch to branch, interpolating between the exponentially-spaced capacitive divider ratios, thus gradually increasing the gain. At the end of the receive interval, all currents go to the final branch $B_n$, corresponding to the highest gain. As derived in the Appendix, this interpolation process follows a pseudo-exponential trajectory, leading to an accurate linear-in-dB gain sweep.

Fig. 2.5: (a) Simplified circuit diagram of the TIA with CCSN; (b) bias voltages of the CCSN as a function of time.

Note that this operation is different from the current-steering reported in [9], where a non-complementary current-steering network interpolates between the exponentially-spaced gain steps of a capacitive ladder network. This interpolation is non-exponential, resulting in larger errors compared to a linear-in-dB sweep. Moreover, in contrast with the capacitor values in a C-2C ladder network, the unit capacitors $C_{u1}-C_{un}$ in our capacitive dividers can be sized independently. The unit capacitors associated with the low-gain steps need to be sized large enough to avoid saturating the amplifier, while those associated with the high-gain steps, at which the signal-current amplitude is smaller due to the exponentially-decaying nature of the input signal, can be scaled down to save area.





Fig. 2.6: (a) Current amplifier with complementary current-steering network (CCSN); (b) bias voltages of CCSN as a function of time.

A similar CCSN is used in the second stage of the AFE, which is a class-AB current amplifier (CA) with a local feedback loop [23], as illustrated in Fig. 2.6(a). Instead of using capacitor arrays, as in the first stage, the CA employs current mirrors to define the gain steps. These mirrors consist of an input current branch (CB<sub>in</sub>), an output current branch (CB<sub>out</sub>) and *j* unit current branches (CB<sub>1</sub>–CB<sub>j</sub>), where *j* = *m* − *n*. The input and output branches both comprise *n* unit branches. At the beginning of the RX phase, all unit current branches attach to the input branch via the CCSN, resulting in a current gain of $\frac{n}{m}$. The CCSN adjusts the current-mirror ratio across all the intermediate gain steps $\frac{n+1}{m-1}$, $\frac{n+2}{m-2}$, ..., $\frac{m-1}{n+1}$ throughout the RX phase until all unit branches attach to the output branch, resulting in a final gain of $\frac{m}{n}$. An analysis similar to the one presented in the Appendix can be applied to the interpolation of all the gain steps of the CA. The resulting current-gain trajectory is still pseudo-exponential if the gate bias voltages are properly applied, as illustrated in Fig. 2.6(b). The local negative feedback loop formed by the amplifier, the DC level-shifter capacitor (C<sub>dc</sub>) and the input branch (CB<sub>in</sub>) effectively reduces the input impedance of the current amplifier and operates the current mirror in class-AB mode.

#### **2.4.** CIRCUIT DESIGN

#### **2.4.1.** HARDWARE-SHARING TIA

A variety of amplifier topologies have been used in different ultrasound applications, e.g., the conventional two-stage amplifier [24], the single-ended inverter-based amplifier [6] and the differential current-reuse amplifier [3], [9]. A figure-of-merit called the noise efficiency factor (NEF) was introduced in [25] to compare the power/noise efficiency of different types of amplifiers. The conventional two-stage amplifier and the differential current-reuse amplifier have relatively poor theoretical NEFs of 2 and 1.4, respectively, while the NEF of the single-ended inverter-based amplifier is close to 0.7 [26], which makes it an attractive candidate. Nevertheless, the single-ended inverter-based amplifier is sensitive to interference from the power supply and ground. Voltage regulators have been introduced to suppress this interference [6], but their additional power consumption reduces the power efficiency.

Because the ultrasound system has multiple input channels, we can reduce area and power consumption by sharing hardware among the channels, e.g., the ground/supply regulators [6]. We propose a hardware-sharing TIA as the AFE's first stage, which is illustrated in Fig. 2.7(a). The power supply and ground voltage regulators (Reg<sub>n,p</sub>) are shared among four TIA channels. The mesh [27] and its biasing circuit control the bias current of the voltage regulators and the inverter-based amplifiers: a bandwidth-control circuit dynamically adjusts the bandwidth of the TIA via the mesh devices in accordance with the exponentially-growing gain, as will be discussed in Section 2.4.2. The local feedback loops of the two voltage regulators effectively decouple the four channels and also attenuate interference from the power supply and ground.

Four high-efficiency inverter-based amplifiers process the ultrasound signals from four transducers. The amplifier of the first channel is formed by $M_1$, $M_2$ and $C_1$, $C_2$. The DC level-shifting capacitors $C_1$ and $C_2$ serve two purposes. They allow $M_1$ and $M_2$ to be biased at optimal gate voltages $V_{R1}$ and $V_{R2}$, independent of the input bias level, thus maximizing the output swing of the inverter-based amplifier [6]. Moreover, they allow the amplifier's input to be biased at a well-defined ground level during the receive phase. The latter plays a crucial role in preventing image artifacts when switching from transmit to receive.

The ratios of the five capacitive dividers are $\frac{3}{11}, \frac{5}{9}, \ldots, \frac{11}{3}$, respectively, corresponding to the gain steps described by (2.6) with parameters *m*=11 and *n*=3. Proper scaling of the unit capacitors was applied to those dividers, leading to 30% less area compared to a design with identical unit capacitors in the dividers. A 5-tap CCSN interpolates between these gain steps, resulting in a total gain range of 22.6 dB.

During the TX phase ($\Phi_{tx}$), high-voltage switch S<sub>0</sub> is closed and high-voltage switch S<sub>1</sub> is open, and a unipolar high-voltage pulse (0 V–36 V) is applied to the transducer to transmit an acoustic pulse. Meanwhile, the level-shifting capacitors $C_1$ and $C_2$ are reset via switches $S_2$–$S_4$, and all capacitive dividers are reset via switches $S_5$–$S_9$. At the beginning of the RX phase ($\Phi_{rx}$), the transducer is connected to the TIA via the switch $S_1$. Because the voltage on the top plate of the transducer is kept at ground level between the end of the TX phase and the RX phase, imaging artifacts associated with the transition from TX to RX are minimized. The CCSN of the TIA then begins to traverse the five capacitive dividers along the pseudo-exponential trajectory.

Fig. 2.7: (a) Circuit diagram of the hardware-sharing TIA with CCSN; (b) the current-steering bias network (CSBN) of the TIA.

The current-steering bias network (CSBN), which generates the bias voltages for the CCSN (see Fig. 2.5), is based on the circuit reported in [9], as depicted in Fig. 2.7(b). As the TGC control voltage $V_c$ is swept along a linear ramp-up curve, the tail current $I_{t0}$ is steered from the rightmost diode-connected NMOS transistor to the leftmost, creating five gate bias voltages ($V_{cn1}$–$V_{cn5}$) for the NMOS transistors of the CCSN. These five voltages are also mirrored to five NMOS transistors which direct the tail current $I_{t1}$ from the leftmost diode-connected PMOS transistor to the rightmost, resulting in the bias voltages ($V_{cp1}$–$V_{cp5}$) for the PMOS transistors of the CCSN, as shown in Fig. 2.5(b). $I_{t1}$ is a scaled version of the static bias current $I_0$ of the inverter-based amplifier in Fig. 2.5(a). All these NMOS/PMOS transistors are properly scaled to guarantee that the NMOS and PMOS current-steering pairs of the CCSN divide the DC bias current $I_0$ and the AC current $i_0$ at the same rate. Voltage regulators similar to those in Fig. 2.7(a) are used to generate $V_{reg3}$ and $V_{reg4}$ in Fig. 2.7(b), which define the output swing of the inverter-based amplifier. The CSBN is shared among four channels to save area and power consumption without introducing noticeable noise into the main signal chain.

#### **2.4.2.** BANDWIDTH-CONTROL CIRCUIT



Fig. 2.8: (a) CSBN of the bandwidth-control circuit; (b) Dynamic current mirror with diagrams showing the TGC control voltage and the bias voltages as a function of time.

The closed-loop bandwidth of the TIA should stay above the –3-dB bandwidth of the transducer, which is about 14 MHz in this design, across the wide current-gain range. The closed-loop bandwidth BW<sub>CL</sub>, derived from the current transfer function $i_{out1}/i_{TD}$, can be expressed as

$$BW_{CL} \approx \frac{g_{mn} + g_{mp}}{C_p} \cdot \frac{1}{1 + \frac{C_{out}}{C_{in}} + \frac{C_{out}}{C_p}}$$
(2.3)

$$C_{\text{in}}, C_{\text{out}} = \left(\frac{m+n}{2} \mp \frac{m-n}{2} \cdot S_{ra}\right) \cdot C_{\text{u}}$$
$$S_{ra} \in [-1, 1]$$
(2.4)

where $g_{mn}$ and $g_{mp}$ are the transconductances of the input NMOS transistor $M_1$ and PMOS transistor $M_2$, respectively. $C_p$, $C_{in}$, and $C_{out}$ are the transducer's capacitance, the feedback capacitance and the output capacitance of the TIA, respectively, as shown in Fig. 2.5(a). $C_u$ is the unit capacitance, and $S_{ra}$ is the steering ratio, a linear function of the TGC control voltage $V_c$ defined over the whole gain range. Substituting (2.4) into (2.3) and assuming $C_u \ll C_p$ gives

$$BW_{CL} \approx \frac{g_{mn} + g_{mp}}{C_p} \cdot \frac{1 - \frac{m - n}{m + n} \cdot S_{ra}}{2}$$
(2.5)

Equation (2.5) implies that the bandwidth of the TIA is a linear function of the steering ratio $S_{ra}$, resulting in a bandwidth reduction by a factor of $m/n = 11/3 \approx 3.67$ during the RX phase. This wide variation of the bandwidth poses a challenge to the TIA loop design in terms of stability, power consumption and noise performance unless gain-dependent bandwidth compensation is applied. Therefore, we propose a dynamic-biasing scheme in which the transconductance of the inverter-based amplifier is continuously adjusted by changing the bias current to match the change of the steering ratio $S_{ra}$.
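The bandwidth variation predicted by (2.3)–(2.5) can be reproduced with a few lines of Python. The sketch below works with normalised quantities ($g_m/C_p$ = 1 by default), and the function names as well as the idealised transconductance scale factor are assumptions for illustration; in the actual design the mirror ratios are selected by simulation rather than from this closed-form expression.

```python
def capacitor_arrays(m, n, s_ra, c_unit=1.0):
    """C_in and C_out from Eq. (2.4) for a steering ratio s_ra in [-1, 1]."""
    mid, half = 0.5 * (m + n), 0.5 * (m - n)
    return (mid - half * s_ra) * c_unit, (mid + half * s_ra) * c_unit


def bandwidth_exact(m, n, s_ra, gm_over_cp=1.0, cu_over_cp=0.0):
    """Eq. (2.3) in normalised form; cu_over_cp is the ratio C_u / C_p."""
    c_in, c_out = capacitor_arrays(m, n, s_ra)  # expressed in units of C_u
    return gm_over_cp / (1.0 + c_out / c_in + c_out * cu_over_cp)


def bandwidth_approx(m, n, s_ra, gm_over_cp=1.0):
    """Eq. (2.5), valid when C_u is much smaller than C_p."""
    return gm_over_cp * 0.5 * (1.0 - (m - n) / (m + n) * s_ra)


def gm_scale_for_constant_bw(m, n, s_ra):
    """Ideal g_m increase, relative to s_ra = -1, that keeps Eq. (2.5) constant."""
    return bandwidth_approx(m, n, -1.0) / bandwidth_approx(m, n, s_ra)


if __name__ == "__main__":
    m, n = 11, 3  # TIA parameters used in this chapter
    for s_ra in (-1.0, -0.5, 0.0, 0.5, 1.0):
        c_in, c_out = capacitor_arrays(m, n, s_ra)
        print(f"S_ra={s_ra:+.1f}  C_out/C_in={c_out / c_in:.3f}  "
              f"BW(2.5)={bandwidth_approx(m, n, s_ra):.3f}  "
              f"gm scale={gm_scale_for_constant_bw(m, n, s_ra):.2f}")
```

With $C_u/C_p$ set to zero, the exact and approximate expressions coincide, and the ratio between the bandwidths at $S_{ra}=-1$ and $S_{ra}=1$ equals $m/n$.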

As depicted in Fig. 2.8(a), the TGC control voltage $V_c$ is compared to the same reference voltages $V_{ref1}$–$V_{ref4}$ (see Fig. 2.7(b)) by four PMOS differential pairs with diode-connected NMOS transistors as their load. Thus, the circuit generates a series of gate-control voltages $V_{bn1}$–$V_{bn4}$ and $V_{bp1}$–$V_{bp4}$, which continuously adjust the current-mirror ratio and the bias current of the TIA via four NMOS current-steering pairs, as shown in Fig. 2.8(b). The adjustment of the bias current changes the transconductance of the TIA, which compensates for the variation of the closed-loop bandwidth in (2.5). The dynamic-current-mirror ratios at the beginning and the end of the RX phase were carefully selected based on simulation to guarantee sufficient loop stability and a roughly constant closed-loop bandwidth.

#### **2.4.3.** CLASS-AB CURRENT AMPLIFIER

As depicted in Fig. 2.9(a), the variable-gain current amplifier (CA) is based on a class-AB current mirror [23] of which the current-mirror ratio is continuously tuned by the CCSN. The CA consists of four unit current branches CB<sub>1</sub>–CB<sub>4</sub> and the associated input/output current branch. A 4-tap CCSN adjusts the current mirror ratio across five gain steps  $\frac{2}{6}$ ,  $\frac{3}{5}$ ,...,  $\frac{6}{2}$  throughout the RX phase resulting in a total gain range of 19.1 dB. These five gain steps again can be described by (2.6) with parameters *m*=6 and *n*=2. The bandwidth of the local feedback loop is designed to be at least a factor of two higher than the TIA's bandwidth throughout the overall gain range. Thanks to the class-AB operation and the fast local feedback loop, the current amplifier only has minor impact on overall bandwidth. The output current *i*<sub>out2</sub> of the CA is converted into a voltage *v*<sub>out2</sub> via a load resistor R<sub>L</sub> and fed to an output driver to drive an off-chip load.
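The gain steps and gain ranges quoted for the two stages follow directly from the parameters *m* and *n*. The Python sketch below lists them with exact fractions; the function names and the `intervals` argument (the number of moves between the first and last step) are illustrative assumptions.

```python
import math
from fractions import Fraction


def gain_steps(m, n, intervals):
    """Ratios from n/m to m/n, moving (m - n) / intervals units per step."""
    unit = Fraction(m - n, intervals)
    return [Fraction(n + k * unit, m - k * unit) for k in range(intervals + 1)]


def gain_range_db(m, n):
    """Range between the first (n/m) and the last (m/n) gain step, in dB."""
    return 20.0 * math.log10((m / n) ** 2)


if __name__ == "__main__":
    for name, m, n in (("TIA", 11, 3), ("CA", 6, 2)):
        steps = ", ".join(str(s) for s in gain_steps(m, n, 4))
        print(f"{name}: steps {steps}; range {gain_range_db(m, n):.1f} dB")
```

Note that `Fraction` reduces each ratio to lowest terms, so the steps 2/6 and 6/2 of the CA are printed as 1/3 and 3.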

A current-steering bias network (CSBN), shown in Fig. 2.9(b), generates the gate-control voltages for the CCSN of the CA. The TGC control voltage $V_c$ is compared to a series of reference voltages $V_{ref1}$–$V_{ref4}$ via four NMOS differential pairs with diode-connected PMOS loads, similar to the bandwidth-control circuit shown in Fig. 2.8(a). The generated CCSN control voltages $V_{cna1}$–$V_{cna4}$ and $V_{cnb1}$–$V_{cnb4}$ are converted to $V_{cpa1}$–$V_{cpa4}$ and $V_{cpb1}$–$V_{cpb4}$ via four NMOS differential pairs with diode-connected PMOS loads. By properly sizing the NMOS differential pairs and the PMOS loads of the CSBN, the DC components of the CCSN are largely cancelled by the complementary structure and thus do not saturate the output branch CB<sub>out</sub>. The residual DC components introduce a low-frequency signal to the output voltage $v_{out2}$ that changes at the same speed as the TGC control voltage $V_c$. Fortunately, the frequency of $V_c$ is outside the signal bandwidth, and therefore the low-frequency signal associated with it can be easily filtered out. Voltage regulators similar to those in Fig. 2.7(a), shared by four current amplifiers, generate the local supply rails $V_{regc1,2,3}$ and thus attenuate interference from the power supply and ground.

#### **2.4.4.** Noise Analysis

The first TIA stage and the second CA stage contribute differently to the total input-referred noise at different AFE gains. At the highest gain setting, the noise of the TIA dominates the input-referred noise. The noise of the CA only has a small impact due to the high gain of the first TIA stage. More specifically, the major noise sources in the TIA are the MOS devices in the inverter-based amplifier and the hardware-sharing regulators, while the high-voltage TX/RX switch only has a minor impact, as it was sized such that it contributes only 10% of the total noise. The feedback capacitors do not contribute thermal noise, whereas the CCSNs of the TIA contribute part of the noise during the interpolation. The noise of the TIA decreases as the TGC function moves towards the high-gain region due to the dynamic-biasing scheme of the bandwidth-control circuit, thus significantly improving the power efficiency of the first TIA stage compared to a constant-biasing solution. At the low gain setting, the second CA stage dominates the final noise level due to the lower gain of the preceding stage and the larger DC bias current that results from more unit current branches attaching to the input current branch of the CA. Nevertheless, the amplitude of the input signal also becomes larger, and the signal-to-noise ratio (SNR) even improves because the signal increases faster than the noise level, as will be elaborated in Section 2.5.

Fig. 2.9: (a) Circuit diagram of the current amplifier with CCSN; (b) the current-steering bias network (CSBN) of the CA.

#### **2.5.** EXPERIMENTAL RESULTS

#### **2.5.1.** ASIC PROTOTYPE



Fig. 2.10: Micrograph of the 64-channel transceiver ASIC, with inset showing the layout of 4 AFE channels with shared circuitry (PADs removed).

An ASIC prototype chip was fabricated in a 180-nm high-voltage BCDMOS process (Fig. 2.10). The chip contains 8 AFE channels, divided into two groups of four that share hardware and connected via multiplexers to the 64 transducer elements, as well as element-level pulsers capable of driving the elements with 36-V unipolar pulses. Four AFE channels with shared hardware occupy $200 \times 500 \ \mu\text{m}^2$, corresponding to 0.025 mm<sup>2</sup> per channel. The 64 transducer channels are arranged into an 8×8 array, which enables direct integration between the ASIC and PZT transducers in a pitch-matched fashion.

#### **2.5.2.** Electrical Characterization

For electrical characterization, an input current was generated by applying an external voltage signal to an on-chip 1-pF capacitor that mimics a transducer element. Fig. 2.11(a) shows the AFE's gain measured at different frequencies as a function of the TGC control voltage $V_c$.



Fig. 2.11: (a) Measured AFE gain at three frequencies as a function of $V_c$. (b) Calculated linear-in-dB gain errors.

Sinusoidal current inputs at 8 MHz, 10 MHz, and 12 MHz were used in the measurement, corresponding to a 40% transducer bandwidth. As expected, the gains of the AFE at different frequencies are approximately dB-linear functions of the control voltage $V_c$, and thus the TGC function is obtained by applying a linear ramp-up control voltage. The gain errors with respect to ideal linear-in-dB curves are extracted and depicted in Fig. 2.11(b); they are all below ±0.4 dB within a 36.1-dB overall gain range. Fig. 2.12 shows the transient response of the AFE obtained by applying a 10-MHz sinusoidal current with a peak-to-peak amplitude that decays exponentially in 48 µs from 64 µA to 1 µA (Fig. 2.12(a)). This corresponds to an overall 36.1-dB input amplitude change. The same linear ramp-up voltage was applied to the TGC control inputs of three test samples (Fig. 2.12(b)), resulting in three output waveforms measured at each AFE's output, as shown in Fig. 2.12(c). The extracted envelopes (Fig. 2.12(d)) show that the AFEs adequately compensate for the decay, with gain errors all below ±0.4 dB.
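The principle behind the transient test can be illustrated with an idealised model: an exponentially decaying envelope multiplied by a gain that rises linearly in dB yields a constant output amplitude. The Python sketch below uses the amplitudes and duration quoted above; the function name, the sampling grid and the perfectly linear-in-dB gain are assumptions, so the sketch does not reproduce the measured ±0.4-dB error.

```python
import numpy as np


def tgc_compensation(i_start=64e-6, i_end=1e-6, duration=48e-6, points=481):
    """Ideal linear-in-dB gain ramp applied to an exponentially decaying envelope."""
    t = np.linspace(0.0, duration, points)
    decay_db = 20.0 * np.log10(i_start / i_end)   # total amplitude change in dB
    envelope = i_start * 10.0 ** (-decay_db * t / duration / 20.0)
    gain_db = decay_db * t / duration             # linear ramp, 0 dB at t = 0
    compensated = envelope * 10.0 ** (gain_db / 20.0)
    return decay_db, envelope, compensated


if __name__ == "__main__":
    decay_db, envelope, compensated = tgc_compensation()
    print(f"input amplitude change: {decay_db:.1f} dB")
    print(f"required gain slope: {decay_db / 48.0:.2f} dB/us")
    print(f"output amplitude spread: {np.ptp(compensated):.3e} A")
```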

Fig. 2.13(a) shows the transfer function measured at different TGC control voltages ( $V_c$ ), as well as a –3-dB bandwidth curve across the overall control voltage range. As depicted in Fig. 2.13(b), the measured –3-dB bandwidth at different TGC control voltages changes between 17.5 MHz and 25.4 MHz, resulting in a less than ±20% variation around 21.7 MHz across the gain range. Thanks to the proposed bandwidth-control circuit, the –3-dB bandwidth changes only slightly compared to the factor of about 100 gain variation of the AFE.

Fig. 2.12: (a) Exponentially decaying input current; (b) applied TGC control signal; (c) measured AFE output voltages of three test samples; (d) corresponding linear-in-dB gain errors.

The output noise density of the AFE was measured by connecting the same on-chip 1-pF capacitor at the input of the TIA to ground and sweeping the TGC control voltage $V_c$. The input-referred noise density was calculated by dividing the measured output noise density by the measured gain, resulting in a series of input-referred noise spectra at different TGC control voltages, as shown in Fig. 2.14(a). The noise floor averaged from 6 MHz to 14 MHz is shown in Fig. 2.14(b) as a function of the TGC control voltage. The noise decreases for higher gains and reaches 1.31 pA/$\sqrt{\text{Hz}}$ at a TGC control voltage of about 1.1 V, where the gain error still falls below ±0.4 dB. At the lowest gain where the gain error still satisfies the boundary condition of ±0.4 dB, the noise floor is about 47 pA/$\sqrt{\text{Hz}}$. The noise floor grows by about 31 dB, but still falls behind the input signal, which grows by about 36 dB in the overall TGC gain range. Therefore, the SNR improves as the TGC gain moves towards the low-gain region. The trend of the SNR improvement towards the low-gain region is verified by a dynamic range (DR) measurement as depicted in Fig. 2.15(a), where the SNR was measured as a function of the input current at different TGC control voltages. The dynamic range of the proposed AFE is about 78 dB, measured as the ratio of the highest input signal level at the 1-dB compression point and the lowest input signal level at which the input signal power equals the noise power.
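The SNR argument above is a matter of simple dB arithmetic, sketched below in Python. The function names are illustrative, and the 36-dB signal growth is taken from the text as a round number.

```python
import math


def ratio_db(high, low):
    """Amplitude ratio expressed in dB."""
    return 20.0 * math.log10(high / low)


def snr_change_db(noise_low_gain, noise_high_gain, signal_growth_db):
    """SNR change when moving from the high-gain end to the low-gain end."""
    return signal_growth_db - ratio_db(noise_low_gain, noise_high_gain)


if __name__ == "__main__":
    # Noise densities in A/sqrt(Hz) as quoted for the two ends of the gain range.
    noise_growth = ratio_db(47e-12, 1.31e-12)
    print(f"noise floor growth: {noise_growth:.1f} dB")
    print(f"SNR change towards low gain: {snr_change_db(47e-12, 1.31e-12, 36.0):.1f} dB")
```

Because the signal grows by more than the noise floor does, the difference is positive, i.e. the SNR is better at the low-gain end of the range.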

Fig. 2.13: (a) Measured gain transfer function, (b) –3-dB bandwidth as a function of TGC control voltage $V_c$.

Fig. 2.14: (a) Measured input-referred noise spectra, (b) measured in-band noise density as a function of TGC control voltage $V_c$.

The power-supply rejection (i.e., the attenuation from the supply to the output) was measured to be 22.6 dB at 10 MHz, sufficient to prevent noise and interference from the supply from limiting the performance of our prototype. Simulations show that this is limited by the $V_{cm}$ buffer in our prototype (Fig. 2.3) and can readily be improved.

Fig. 2.15(b) shows the measured power consumption as a function of the TGC control voltage. As expected, the power consumption varies along the measured curve to compensate for the bandwidth variation during the RX phase, resulting in an average power consumption of 0.8 mW from a 1.8-V supply, provided that the TGC control voltage is swept so as to generate a 36-dB TGC gain range.



Fig. 2.15: (a) Measured SNR as a function of the input current at different TGC control voltages. (b) Measured power consumption as a function of TGC control voltage ($V_c$).

#### **2.5.3.** ACOUSTICAL CHARACTERIZATION



Fig. 2.16: (a) Micrograph of the ASIC with PZT transducers on top and a cross-sectional view showing PZT-on-CMOS integration. (b) Measurement setup of the acoustical characterization of the TGC function.

Fig. 2.16(a) shows the micrograph of a fabricated prototype with PZT transducers on top of the ASIC fabricated using the piezoelectric layer (PZT)-on-CMOS technology described in [28].

An 8×8 transducer array with a 100-µm pitch, surrounded by a ring of protecting dummy elements, was directly built on the ASIC by means of mechanical dicing. The bottom electrodes of the transducer elements connect to the ASIC's bond pads via gold bumps, leading to minimized parasitic capacitance. The common ground electrodes of the transducers were connected to the PCB through the ground foil (not shown in Fig. 2.16(a)).

For characterization of the TGC function, a small water tank was mounted on top of the prototype, with an external single-element ultrasound transmitter submerged in the water (Fig. 2.16(b)). A series of exponentially-decaying pulses was generated by an arbitrary waveform generator (AWG) to excite the transmitter, which generates the corresponding exponentially-decaying ultrasound waves. The acoustic signal is converted into an electrical signal by the PZT elements. The ASIC applies the TGC function to the received current and outputs the compensated signal to an off-chip buffer connected to an oscilloscope. An FPGA controls the operation of the ASIC, e.g., switching between the TX and RX phases, and also triggers the oscilloscope to record the received ultrasound data.

Because the attenuation of ultrasound waves in water is negligible, the 10-MHz single-element transmitter was excited by a train of 7 pulses modulated by an exponentially-decaying envelope to emulate the attenuation, with a decay rate tuned to match an attenuation factor of 0.4 dB·cm<sup>-1</sup>·MHz<sup>-1</sup>. As depicted in Fig. 2.17(a), the ultrasonic-pulse train was first received by the AFE with its TGC function deactivated by keeping the TGC control voltage at a constant level during the RX phase, leading to a 36-dB signal attenuation within 60 µs at the AFE's output, which is in line with the preset attenuation rate. In the following measurement, the AFE processed the same incoming ultrasound pulses with the TGC function activated, compensating for the exponentially-decaying envelope and resulting in a voltage output with an amplitude variation smaller than ±0.36 dB, as shown in Fig. 2.17(b).
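The emulated attenuation can be cross-checked with a short calculation. The Python sketch below assumes that the attenuation accumulates over the total propagation path travelled in a given time and that the speed of sound is 1540 m/s, a commonly used soft-tissue value; neither the function name nor this speed is taken from the measurement description, so the result is only an order-of-magnitude check.

```python
def attenuation_db(time_s, freq_mhz=10.0, alpha_db_cm_mhz=0.4, speed_m_s=1540.0):
    """Attenuation accumulated over the propagation path travelled in time_s."""
    path_cm = speed_m_s * time_s * 100.0  # assumed speed of sound, path in cm
    return alpha_db_cm_mhz * freq_mhz * path_cm


if __name__ == "__main__":
    total = attenuation_db(60e-6)
    print(f"attenuation after 60 us: {total:.1f} dB")
    print(f"equivalent decay rate: {total / 60.0:.2f} dB/us")
```

Under these assumptions the attenuation after 60 µs comes out close to the 36 dB observed at the AFE's output.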



Fig. 2.17: (a) Uncompensated ASIC output with the TGC function disabled, (b) compensated ASIC output with the TGC function enabled.

The same test bench was reconfigured for an imaging experiment, as shown in Fig. 2.18(a), in which the single-element transmitter was replaced by 3 needle reflectors positioned at 6 mm, 8 mm, and 10 mm from the transducer surface. A plane wave was transmitted by driving all the elements with 30-V pulses. As shown in Fig. 2.18(b), a B-mode image was reconstructed from the data acquired at each AFE's output, clearly showing the needle positions even with the relatively small aperture size.

Fig. 2.18: (a) Setup for imaging experiment. (b) B-mode image showing the position of the needles.

The performance and characteristics of our work and the prior art are summarized in Table 2.1 for comparison. Compared to AFEs with discrete-time TGC [3], [24], comparable performance has been achieved, but with a wider gain range and without introducing gain-switching and virtual-ground-switching transients that could lead to image artifacts. Compared to AFEs with interpolating [9], [12] and analog TGCs [11], a >2× better linear-in-dB gain error is obtained in a wider effective gain range with less area and power consumption, as demonstrated both in the electrical and acoustic measurements.

#### **2.6.** CONCLUSION

This chapter has presented a pitch-matched AFE with continuous TGC function. The presented CCSN interpolates exponentially-spaced gain steps with a pseudo-exponential interpolation scheme, leading to a small linear-in-dB gain error and fewer passive devices than in solutions based on many small discrete gain steps. The hardware-sharing topology and inverter-based amplifiers used in the AFE further reduce the area and power consumption. The dynamic biasing scheme smoothly compensates for the bandwidth variation caused by the gain change of the TGC function without deteriorating the signal-to-noise ratio, thus providing a nearly constant gain-bandwidth product and better power efficiency. To the authors' best knowledge, the prototype is the first reported work that combines a continuous TGC function with a pitch-matched layout at a 100-µm-level pitch. All these features make the solution promising for next-generation miniaturized 3-D ultrasound imaging devices.

|                                     | This work            | [3]                  | [24]                 | [9]                  | [11]                 | [12]                |
|-------------------------------------|----------------------|----------------------|----------------------|----------------------|----------------------|---------------------|
| Process                             | 180-nm BCD           | 180-nm BCD           | 180-nm HVCMOS        | 180-nm BCD           | 180-nm CMOS          | /                   |
| Pitch-matched                       | Yes (100 µm)         | No                   | No                   | No                   | No                   | No                  |
| AFE Type                            | TIA                  | TIA                  | TIA                  | TIA                  | VGA <sup>(2)</sup>   | Voltage amp.        |
| TGC Type                            | Interpolating        | Discrete             | Discrete             | Interpolating        | Analog               | Interpolating       |
| -3-dB BW                            | 17.5 MHz             | 16 MHz               | 10 MHz               | 7.1 MHz              | 3.1 MHz              | 50 MHz              |
| Maximum gain                        | 102 dBΩ              | 119 dBΩ              | 116 dBΩ              | 107 dBΩ              | 37 dB                | 38 dB               |
| Effective gain range <sup>(1)</sup> | 36 dB                | 12 dB                | 12 dB                | 33 dB                | 37 dB                | 37 dB               |
| Gain error                          | ±0.4 dB              | ±3 dB                | ±3 dB                | ±1 dB                | ±1.4 dB              | ±0.9 dB             |
| Input-referred noise density        | 1.31 pA/√Hz @10 MHz  | 2.0 pA/√Hz @13 MHz   | 0.4 pA/√Hz @5 MHz    | 1.7 pA/√Hz @5 MHz    | 8.6 nV/√Hz @2 MHz    | 4.1 nV/√Hz @5 MHz   |
| Transducer capacitance              | 1 pF                 | 0.7 pF               | 2 pF                 | 15 pF                | -                    | -                   |
| Power consumption                   | 0.8 mW               | 0.79 mW              | 1.4 mW               | 5.2 mW               | 0.96 mW              | 52 mW               |
| Area                                | 0.025 mm<sup>2</sup> | 0.027 mm<sup>2</sup> | 0.028 mm<sup>2</sup> | 0.12 mm<sup>2</sup>  | 0.025 mm<sup>2</sup> | -                   |

Table 2.1: PERFORMANCE COMPARISON

<sup>(1)</sup> Gain range that satisfies the specified gain error.

<sup>(2)</sup> Not including the LNA which is required before the VGA.

#### Appendix

In this Appendix, the gain trajectory of the TIA is derived first. For simplicity, we assume that only two adjacent steering branches are activated simultaneously in every time interval, e.g., only steering branches  $B_1$  and  $B_2$  are activated in the first time interval *interval*<sub>1</sub> (Fig. 2.5b). The current-gain steps between  $i_{out}$  and  $i_{in}$  can be expressed by a unified pseudo-exponential function

$$\frac{i_{out}}{i_{in}} = \frac{1 + \frac{m-n}{m+n} \cdot y}{1 - \frac{m-n}{m+n} \cdot y}, \qquad y = \left\{-1, -1 + \frac{2}{m-n}, \dots, 1\right\}$$
(2.6)

where the expression of discrete variable *y* corresponds to the beginning of each time interval *interval*<sub>1</sub>*-interval*<sub>n</sub> (Fig. 2.5b), e.g.,  $y = \{-1, -1 + \frac{2}{m-n}\}$  corresponds to the first two gain steps of the time *interval*<sub>1,2</sub>, *i*<sub>out</sub> is the feed-forward current coupling to the virtual ground of the next stage, *i*<sub>in</sub> is the feedback current identical to the transducer's input current *i*<sub>TD</sub> as a result of the negative feedback loop, and *m* and *n* are the number of unit capacitors associated with steering branch B<sub>1</sub>.

The CCSN interpolates between these two exponentially-spaced gain steps. Provided the trans-conductances of  $M_1$  and  $M_2$  (Fig. 2.5a) are designed to be equal and the gate bias voltages change properly as depicted in time *interval*<sub>1</sub> in Fig. 2.5(b), the PMOS current-steering pair  $(M_{p1}/M_{p2})$  divides the currents flowing through  $M_2$  at the same ratio as the NMOS current-steering pair  $(M_{n1}/M_{n2})$  divides the currents flowing through  $M_1$ . The large-signal currents flowing through the CCSN in (2.1) can be nearly cancelled and the first two small-signal output currents  $i_1$  and  $i_2$  of the CCSN branches  $B_1$  and  $B_2$  can be expressed as

$$i_{1,2} = (1 \mp \alpha \cdot V_{c1}) \cdot i_0 \qquad \alpha \cdot V_{c1} \in (-1, 1)$$
 (2.7)

based on (2.2), where  $V_{c1} = V_{cn2} - V_{cn1}$  is the TGC control voltage in *interval*<sub>1</sub>, and  $\alpha$  is a constant determined by the MOS transistors. The currents  $i_{1,2}$  are linear functions of the control voltage  $V_{c1}$, and the ratio of  $i_2$  to  $i_1$  is of the form  $\frac{1+x}{1-x}$. These currents divide between the output capacitors and the feedback capacitors connected to steering branches B<sub>1</sub> and B<sub>2</sub>. As a result, the total current flowing to the input  $i_{in}$, which due to the feedback in the TIA equals the transducer's signal current  $i_{TD}$, as well as the total current flowing to the output  $i_{out}$, is a linear combination of  $i_1$  and  $i_2$. The current gain during interpolation of the first two gain steps can be expressed as

$$\frac{i_{out}}{i_{in}} = \frac{ni_1 + (n+1)i_2}{mi_1 + (m-1)i_2}$$
(2.8)

Substituting (2.7) into (2.8) gives

$$\frac{i_{out}}{i_{in}} = \frac{1 + \frac{\alpha \cdot V_{c1} - (m-n-1)}{m+n}}{1 - \frac{\alpha \cdot V_{c1} - (m-n-1)}{m+n}}$$
(2.9)

Now a steering ratio  $(S_{r1})$  can be defined as

$$S_{r1} = \frac{\alpha \cdot V_{c1} - (m - n - 1)}{m - n}, \qquad S_{r1} \in \left(-1, -1 + \frac{2}{m - n}\right)$$
(2.10)

Substituting (2.10) into (2.9) yields an equation of the same form as (2.6), except that instead of the discrete variable *y*, the steering ratio  $S_{r1}$  is a continuous linear function of the TGC control voltage  $V_{c1}$, corresponding to the interpolation within the first time interval *interval*<sub>1</sub>. Therefore the current gain during interpolation, which is the ratio of  $i_{out}$  to  $i_{TD}$, is also of the form  $\frac{1+x}{1-x}$. The same analysis can be repeated for the remaining time intervals *interval*<sub>2</sub>–*interval*<sub>n</sub> to show that the overall gain curve is a pseudo-exponential function of the TGC control voltage.

A similar analysis can be applied to the interpolation of the CA. E.g., at the beginning of each time interval, the gain steps can be expressed with the same function (2.6). During the interpolation, the current gain  $\frac{i_{out2}}{i_{out1}}$  still conforms with (2.8), (2.9) and  $(\frac{1+x}{1-x})$ .
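The substitution steps from (2.7) and (2.8) to (2.9) and (2.10) can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original derivation: the function names and the unit-capacitor counts `m = 12`, `n = 4` are hypothetical and chosen only for illustration, and `a` stands for the product $\alpha \cdot V_{c1}$.

```python
from fractions import Fraction


def gain_from_currents(m, n, a):
    """Eq. (2.8) with i1 = (1 - a)*i0 and i2 = (1 + a)*i0 from Eq. (2.7); i0 cancels."""
    i1, i2 = 1 - a, 1 + a
    return (n * i1 + (n + 1) * i2) / (m * i1 + (m - 1) * i2)


def gain_from_steering_ratio(m, n, a):
    """Eq. (2.6) with the discrete variable y replaced by S_r1 of Eq. (2.10)."""
    s_r1 = (a - (m - n - 1)) / Fraction(m - n)
    x = Fraction(m - n, m + n) * s_r1
    return (1 + x) / (1 - x)


m, n = 12, 4  # hypothetical unit-capacitor counts, for illustration only
test_points = [Fraction(-1), Fraction(-1, 2), Fraction(0), Fraction(1, 3), Fraction(1)]
for a in test_points:
    # Both routes must give exactly the same current gain.
    assert gain_from_currents(m, n, a) == gain_from_steering_ratio(m, n, a)

# The end points reproduce the two adjacent discrete gain steps n/m and (n+1)/(m-1).
print(gain_from_currents(m, n, Fraction(-1)), gain_from_currents(m, n, Fraction(1)))
```

At the two ends of the control range the interpolated gain coincides with the adjacent discrete steps $n/m$ and $(n+1)/(m-1)$, which is what allows the intervals to be chained into one continuous gain curve.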

# REFERENCES

- [1] G. Jung, M. W. Rashid, T. M. Carpenter, C. Tekes, D. M. J. Cowell, S. Freear, F. L. Degertekin, and M. Ghovanloo, "Single-chip reduced-wire active catheter system with programmable transmit beamforming and receive time-division multiplexing for intracardiac echocardiography," in *2018 IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2018, pp. 188–190.
- [2] D. Wildes, W. Lee, B. Haider, S. Cogan, K. Sundaresan, D. M. Mills, C. Yetter, P. H. Hart, C. R. Haun, M. Concepcion, J. Kirkhorn, and M. Bitoun, "4-D ICE: A 2-D Array Transducer With Integrated ASIC in a 10-Fr Catheter for Real-Time 3-D Intracardiac Echocardiography," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 63, no. 12, pp. 2159–2173, Dec. 2016.
- [3] M. Tan, C. Chen, Z. Chen, J. Janjic, V. Daeichin, Z.-Y. Chang, E. Noothout, G. van Soest, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Front-End ASIC With High-Voltage Transmit Switching and Receive Digitization for 3-D Forward-Looking Intravascular Ultrasound Imaging," *IEEE J. Solid-State Circuits*, vol. 53, no. 8, pp. 2284–2297, Aug. 2018.
- [4] J. Baranger, C. Demene, A. Frerot, F. Faure, C. Delanoë, H. Serroune, A. Houdouin, J. Mairesse, V. Biran, O. Baud, and M. Tanter, "Bedside functional monitoring of the dynamic brain connectivity in human neonates," *Nat Commun*, vol. 12, no. 1, p. 1080, Feb. 2021.
- [5] S. P. Miller, C. C. Cozzio, R. B. Goldstein, D. M. Ferriero, J. C. Partridge, D. B. Vigneron, and A. J. Barkovich, "Comparing the Diagnosis of White Matter Injury in Premature Newborns with Serial MR Imaging and Transfontanel Ultrasonography Findings," *American Journal of Neuroradiology*, vol. 24, no. 8, pp. 1661–1669, Sep. 2003.
- [6] C. Chen, Z. Chen, D. Bera, S. B. Raghunathan, M. Shabanimotlagh, E. Noothout, Z. Chang, J. Ponte, C. Prins, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A front-end ASIC with receive sub-array beamforming integrated with a 32 × 32 PZT matrix transducer for 3-D transesophageal echocardiography," *IEEE J. Solid-State Circuits*, vol. 52, no. 4, pp. 994–1006, Apr. 2017.
- [7] J. L. Prince and J. M. Links, *Medical Imaging Signals and Systems*, 2nd ed. Boston: Pearson, 2015.
- [8] P. R. Hoskins, K. Martin, and A. Thrush, Eds., *Diagnostic Ultrasound: Physics and Equipment* (Cambridge Medicine), 2nd ed. Cambridge, UK; New York: Cambridge University Press, 2010.
- [9] E. Kang, M. Tan, J.-S. An, Z.-Y. Chang, P. Vince, N. Sénégond, T. Mateo, C. Meynier, and M. A. P. Pertijs, "A Variable-Gain Low-Noise Transimpedance Amplifier for Miniature Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 55, no. 12, pp. 3157–3168, Dec. 2020.

- [10] T. Kim, F. Fool, D. S. dos Santos, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "Design of an Ultrasound Transceiver ASIC with a Switching-Artifact Reduction Technique for 3D Carotid Artery Imaging," *Sensors*, vol. 21, no. 1, p. 150, Jan. 2021.
- [11] J.-Y. Um, "A Compact Variable Gain Amplifier With Continuous Time-Gain Compensation Using Systematic Predistorted Gain Control," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 69, no. 2, pp. 274–278, Feb. 2022.
- [12] "VCA2617," Texas Instruments, Dallas, TX, USA, Datasheet, Aug. 2005.
- [13] P. Guo, Z. Chang, E. Noothout, H. Vos, J. Bosch, N. de Jong, M. Verweij, and M. Pertijs, "A Pitch-Matched Analog Front-End with Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," in *ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC)*, Sep. 2021, pp. 163–166.
- [14] I. Choi, H. Seo, and B. Kim, "Accurate dB-Linear Variable Gain Amplifier With Gain Error Compensation," *IEEE J. Solid-State Circuits*, vol. 48, no. 2, pp. 456–464, Feb. 2013.
- [15] J. Xiao, I. Mehr, and J. Silva-Martinez, "A High Dynamic Range CMOS Variable Gain Amplifier for Mobile DTV Tuner," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 292–301, Feb. 2007.
- [16] H. O. Elwan and M. Ismail, "Digitally programmable decibel-linear CMOS VGA for low-power mixed-signal applications," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 47, no. 5, pp. 388–398, May 2000.
- [17] H. Liu, X. Zhu, C. C. Boon, and X. He, "Cell-Based Variable-Gain Amplifiers With Accurate dB-Linear Characteristic in 0.18 μm CMOS Technology," *IEEE J. Solid-State Circuits*, vol. 50, no. 2, pp. 586–596, Feb. 2015.
- [18] H. Elwan, A. Tekin, and K. Pedrotti, "A Differential-Ramp Based 65 dB-Linear VGA Technique in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2503–2514, Sep. 2009.
- [19] B. Gilbert, "A Low-noise Wideband Variable-gain Amplifier Using An Interpolated Ladder Attenuator," in 1991 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, Feb. 1991, pp. 280–281.
- [20] P. Wang and T. Ytterdal, "Low noise, -50 dB second harmonic distortion single-ended to differential switched-capacitive variable gain amplifier for ultrasound imaging," *IET Circuits, Devices & Systems*, vol. 10, no. 3, pp. 173–180, 2016.
- [21] R. Harjani, "A low-power CMOS VGA for 50 Mb/s disk drive read channels," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 42, no. 6, pp. 370–376, Jun. 1995.
- [22] P. R. Gray, Ed., *Analysis and Design of Analog Integrated Circuits*, 5th ed. New York: Wiley, 2009.
- [23] F. Esparza-Alfaro, A. Lopez-Martin, J. Ramirez-Angulo, and R. Carvajal, "High-performance micropower class AB current mirror," *Electron. Lett.*, vol. 48, no. 14, p. 823, 2012.

- [24] K. Chen, H.-S. Lee, and C. G. Sodini, "A Column-Row-Parallel ASIC Architecture for 3-D Portable Medical Ultrasonic Imaging," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 738–751, Mar. 2016.
- [25] M. S. J. Steyaert and W. M. C. Sansen, "A micropower low-noise monolithic instrumentation amplifier for medical purposes," *IEEE Journal of Solid-State Circuits*, vol. 22, no. 6, pp. 1163–1168, Dec. 1987.
- [26] L. Shen, N. Lu, and N. Sun, "A 1-V 0.25-μW Inverter Stacking Amplifier With 1.07 Noise Efficiency Factor," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 3, pp. 896–905, Mar. 2018.
- [27] K.-J. De Langen and J. Huijsing, "Compact low-voltage power-efficient operational amplifier cells for VLSI," *IEEE J. Solid-State Circuits*, vol. 33, no. 10, pp. 1482–1496, Oct. 1998.
- [28] C. Chen, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, M. A. P. Pertijs, S. B. Raghunathan, Z. Yu, M. Shabanimotlagh, Z. Chen, Z.-y. Chang, S. Blaak, C. Prins, and J. Ponte, "A Prototype PZT Matrix Transducer With Low-Power Integrated Receive ASIC for 3-D Transesophageal Echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 63, no. 1, pp. 47–59, Jan. 2016.

# **3** A 1.2mW/Channel Pitch-Matched Transceiver ASIC Employing a Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3-D Ultrasound Imaging

This chapter is based on the publication "A 1.2mW/Channel Pitch-Matched Transceiver ASIC Employing a Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3-D Ultrasound Imaging," in IEEE Journal of Solid-State Circuits, in press, doi:10.1109/JSSC.2023.3271270.

## **3.1.** INTRODUCTION

Trans-fontanelle ultrasonography (TFUS) is a favorable approach for monitoring brain perfusion in neonates. In contrast with other imaging techniques such as positron emission tomography (PET) and magnetic resonance imaging (MRI), it allows for bedside monitoring of the neonatal brain and avoids exposure to ionizing radiation and the need for sedation [1]. Moreover, it offers better spatial resolution than near-infrared spectroscopy (NIRS) [2].

Preterm infants regularly show inadequate brain perfusion during and after the delivery, resulting in brain injuries and neuro-developmental problems [3]. A wearable neuro-monitoring device that can continuously assess brain perfusion and brain development of preterm infants would enable timely treatment of brain perfusion abnormalities [4]. Fig. 3.1 shows the device envisioned in this work, which uses pulse-echo ultrasound through the fontanel to generate high-resolution images of the brain and Doppler techniques to image blood flow in the brain. To be able to assess the relevant part of the brain, a 2-D transducer array is required that allows for beamforming in 3-D space. To visualize sub-millimeter brain vessels, a small wavelength is required, resulting in a high ultrasound frequency and small array pitch. A higher ultrasound frequency leads to faster attenuation in brain tissue [5] and thus shortens the penetration depth. As a trade-off, a 2-D transducer array with 10-MHz central frequency and 100-µm pitch was chosen in our application to ensure sufficient resolution and penetration depth. As depicted in Fig. 3.1, an aperture size of 20×10 mm<sup>2</sup> is suitable according to studies of anterior fontanel size in preterm infants [6], [7], resulting in a vast 2-D transducer array of 20,000 elements for the selected 100-µm pitch size in such a wearable device.

Fig. 3.1: A wearable neuro-monitoring device for TFUS with inset showing the envisioned pitch-matched ASIC with the transducer array built on top.
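The element count quoted above follows directly from the aperture and the pitch. The short sketch below reproduces that arithmetic and relates the pitch to the acoustic wavelength. The speed of sound of 1540 m/s is an assumed typical soft-tissue value that does not appear in the text, and all variable names are illustrative.

```python
APERTURE_MM = (20.0, 10.0)    # aperture size from the text
PITCH_UM = 100.0              # element pitch from the text
CENTER_FREQ_HZ = 10e6         # central frequency from the text
SPEED_OF_SOUND_M_S = 1540.0   # ASSUMED soft-tissue value, not from the text


def elements_per_side(aperture_mm, pitch_um):
    """Number of elements that fit along one side of the aperture."""
    return round(aperture_mm * 1000.0 / pitch_um)


cols = elements_per_side(APERTURE_MM[0], PITCH_UM)
rows = elements_per_side(APERTURE_MM[1], PITCH_UM)
total_elements = cols * rows

wavelength_um = SPEED_OF_SOUND_M_S / CENTER_FREQ_HZ * 1e6
pitch_in_wavelengths = PITCH_UM / wavelength_um

print(f"{cols} x {rows} = {total_elements} elements")
print(f"wavelength = {wavelength_um:.0f} um, pitch = {pitch_in_wavelengths:.2f} wavelengths")
```

Under the assumed speed of sound, the 100-µm pitch corresponds to roughly two thirds of a wavelength at the central frequency.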

This high number of elements and small array pitch make pulse transmission (TX) and echo reception (RX) very challenging, and beamforming plays a central role in this. The most common RX beamforming approach is the delay-and-sum (DAS) algorithm as depicted in Fig. 3.2(a). After pulse transmission, the echoes from a focal point arrive at the elements of the transducer array at slightly different times, and are converted into electrical signals with corresponding time shifts by the transducer array. These signals can be coherently detected by applying correct delays to each channel and summing the resulting signals. Thanks to modern integrated circuit technology, DAS can be performed by high-speed processors like graphics processing units (GPUs) [8]. Nevertheless, this requires element-level digitization beforehand. Direct RF digitization was realized in [9], where an element-level  $\Delta\Sigma$  modulator directly digitizes the ultrasound RF signal from each transducer element, followed by an on-chip digital beamformer to post-process the digitized data in the chip's periphery. Noting that each RX channel in the reported work occupies nearly all element-level area of  $250 \times 250 \ \mu m^2$ , the element-level digitization poses fundamental challenges to our circuit design when taking the array pitch of 100 µm and the need to include element-level high-voltage (HV) pulsers into account.
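The delay-and-sum principle described above can be illustrated with a few lines of code. This is a minimal sketch with integer sample delays and idealized impulse echoes. The channel count, the arrival times and the function name `delay_and_sum` are hypothetical and serve only to show the coherent summation.

```python
def delay_and_sum(channels, delays):
    """Shift channel i by delays[i] samples (integer, >= 0) and sum all channels."""
    length = len(channels[0])
    out = [0.0] * length
    for samples, delay in zip(channels, delays):
        for t in range(delay, length):
            out[t] += samples[t - delay]
    return out


# Four hypothetical channels; the echo from the focal point reaches channel i at sample 3 + i.
n_channels, length = 4, 16
arrival = [3 + i for i in range(n_channels)]
channels = [[1.0 if t == arrival[i] else 0.0 for t in range(length)]
            for i in range(n_channels)]

# Channels that receive the echo early are delayed more, so all echoes align
# with the latest arrival before the summation.
delays = [max(arrival) - a for a in arrival]
beamformed = delay_and_sum(channels, delays)
unfocused = delay_and_sum(channels, [0] * n_channels)

print("peak with correct delays:", max(beamformed))
print("peak without delays:     ", max(unfocused))
```

With the correct delays the four echoes add up at a single sample, whereas without delays they remain spread over four samples and the peak amplitude is that of a single channel.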

An alternative is to divide the transducer array into sub-arrays of *N* elements each, and to combine the signals in each sub-array locally by means of an analog DAS operation, also referred to as micro-beamforming (µBF) [10], [11], as depicted in Fig. 3.2(b). Further beamforming of the µBF outputs can be done either off chip or on the periphery of the chip. The channel count is thus reduced by a factor of *N*, allowing for a pitch-matched layout at the sub-array level and much less signal routing compared to the element-level digitization scheme. However, prior micro-beamformer implementations employ per-element capacitive memory to realize the delay [12], [13], making them cumbersome and unfavorable for the desired 100-µm-pitch array.

Fig. 3.2: (a) Beamforming process employing delay-and-sum (DAS). (b) Micro-beamforming process employing two-step DAS.

In this chapter, we present a compact, pitch-matched prototype ASIC employing a boxcar-integration-based RX micro-beamformer. In contrast to conventional RX micro-beamformers, which perform delay before the summation [12], the boxcar-integration-based beamformer sums the signals from different elements in the current domain before the delay operation takes place, leading to a reduction in the number of required memory cells, the associated dynamic switches and signal routing. In addition, the boxcar integration provides built-in anti-alias filtering (AAF), obviating the need for an explicit AAF in-between the analog front-end (AFE) and the digitizer and relaxing the design requirements for the AFE. In the ASIC's TX circuitry, a novel row-level beamforming scheme is deployed that moves the bulky HV MOS transistors to the chip's periphery, in contrast with a conventional push-pull structure [13], and only uses one high-voltage isolation diode per element. All these result in a compact transceiver ASIC meeting stringent restrictions on die area.

This chapter is organized as follows. Section 3.2 reviews the prior art and compares voltage-mode and current-mode beamformers. Section 3.3 describes the architecture design of the boxcar-integration-based µBF and provides the system overview. Section 3.4 presents the detailed circuit implementation. Section 3.5 describes the fabricated prototype, as well as the electrical and acoustic measurement results. The chapter ends with conclusions.

#### **3.2.** Prior Art

Various receive (RX) beamformer topologies have been used to realize the DAS process. Direct RF digitization with subsequent beamforming in the digital domain [9], [14], as mentioned, requires a relatively large die area and power consumption and will not be discussed in more detail. Other beamformer topologies can be broadly classified into three categories: beamformers based on phase rotation applied to I/Q demodulated RF signals [15]; beamformers based on continuous-time delay lines in which the delays are tuned continuously [16], [17]; and beamformers based on discrete-time delay lines in which the delays are discretized conforming to a clock signal [11], [12], [18]–[20].

The phase-rotation-based beamformer as shown in Fig. 3.3(a) applies direct I/Q demodulation to the band-pass-filtered RX signals followed by digitization. The digitized I/Q signals are fed to a series of phase rotators in the digital domain to approximate time delays before the summation takes place. In [15], a direct sampling I/Q demodulation is employed, and analog-to-digital conversion is required for each I/Q channel. The solution is attractive for narrowband ultrasound signals, for which phase shifting is a good approximation for delay, and which can be multiplexed on a high-speed ADC, thus saving area. However, echo signals in ultrasound imaging generally have a very wide bandwidth, e.g., 80% bandwidth around a 10-MHz central frequency in our application, leading to too many high-speed ADCs in the system, therefore making the solution unattractive for our application.

As depicted in Fig. 3.3(b), a continuous-time delay cell employs current-mirror-based circuits to construct an all-pass filter with the transfer function $\frac{1-s \cdot T_0}{1+s \cdot T_0}$, where $T_0$ is defined by the transconductance of $M_0$ and the parasitic capacitance $C_p$. The transfer function is a first-order approximation of an ideal delay of $T_0$, tuned by the bias currents of the current mirror network. A delay line can be constructed by cascading multiple such delay cells [16]. The implementation has no passive devices and is potentially small. However, it suffers from a non-linear phase response, resulting in different delays for different frequency components in a wide-band ultrasound signal, and from high sensitivity to temperature and process drift.

A discrete-time delay line using bucket-brigade devices (BBDs) [18] is shown in Fig. 3.3(c). A series of bucket capacitors  $C_{B1-4}$  stores the information as charge packets transported by means of MOS transistors [21]. The speed of the charge transportation is controlled by two complementary clock signals, making the delay step equal to the clock period. Although the BBD delay line uses a minimum number of transistors, potentially occupying a small area, the charge transfer between two adjacent bucket capacitors has an error term that is dependent on the charge being transferred, resulting in a charge-transfer gain less than one and poor linearity [22].

Other types of discrete-time beamformers are based on switched-capacitor [12], [19] or switched-current delay lines [20] as depicted in Fig. 3.3(d), in which the capacitor array (e.g.,  $C_{11-1N}$ ) samples the voltage/current signals of all channels ( $CH_{1-N}$ ), controlled by a series of write clock signals (e.g.,  $W_{11-1N}$ ), and stores the information for a certain amount of time until the read/summation (assuming they happen simultaneously) takes place controlled by a read clock signal (R<sub>1</sub>). Thus, the delays between adjacent channels (e.g.,  $\tau_1$ ) are accurately defined by the falling edges of the write clock signals (e.g.,  $W_{11}$  and  $W_{12}$ ). Multiple time-interleaved channels (TI<sub>1-k</sub>) are needed to prevent information loss during DAS, since the required maximum delay ( $\tau_{max}$ ) is generally much longer than the sampling period ( $T_s$ ). The total required number of time-interleaved channels *k* can be expressed as [19]

$$k = \frac{T_w + T_r + \tau_{max}}{T_s} \tag{3.1}$$

and the total required number of memory cells  $N_{mem}$ , as well as the number of switches  $N_{sw}$ , can be expressed as

$$N_{mem} = k \cdot N \tag{3.2}$$

$$N_{sw} = 2 \cdot k \cdot N$$

where  $T_w$,  $T_r$  and  $\tau_{max}$  are the write cycle time, the read cycle time and the maximum RX beamforming delay, respectively, and *N* is the total number of ultrasound channels. The information can be delayed and summed in different forms, such as the voltage domain [19], the current domain [20] and the charge domain [12]. However, a large number of memory cells, the associated switches and interconnections are required in all cases, making these solutions less favorable for our application.
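To give a feeling for the scaling expressed by (3.1) and (3.2), the following sketch evaluates the number of time-interleaved channels, memory cells and switches of a conventional discrete-time beamformer. The timing values in the example are hypothetical and are not taken from any of the cited designs. Rounding *k* up to the next integer is also an assumption made here for the case in which the ratio is not a whole number.

```python
import math


def conventional_beamformer_resources(t_write, t_read, tau_max, t_sample, n_channels):
    """Evaluate Eq. (3.1) and Eq. (3.2) for a delay-then-sum beamformer.

    All times must use the same unit. k is rounded up (assumption) so that a
    fractional result still yields a whole number of time-interleaved channels.
    """
    k = math.ceil((t_write + t_read + tau_max) / t_sample)
    n_mem = k * n_channels       # one memory cell per channel and TI channel
    n_sw = 2 * k * n_channels    # switch count as given in the text
    return k, n_mem, n_sw


# Hypothetical example values in nanoseconds, for illustration only.
k, n_mem, n_sw = conventional_beamformer_resources(
    t_write=25, t_read=25, tau_max=50, t_sample=25, n_channels=4)
print(f"k = {k}, memory cells = {n_mem}, switches = {n_sw}")
```

Both the memory-cell count and the switch count grow with the product of *k* and *N*, which is the scaling that makes these topologies unattractive for a 100-µm pitch.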

#### **3.3.** Architecture design

#### **3.3.1.** BOXCAR-INTEGRATION-BASED MICRO-BEAMFORMER

As shown in Fig. 3.3(d), each time-interleaved channel (TI) contains *N* memory cells, and these memory cells cannot be shared with other time-interleaved channels during the delay process of the DAS, since the stored information can only be flushed once the following read/summation cycle completes. An effective way to reduce the number of memory cells is reversing the DAS process and using "sum and delay" instead, as depicted in Fig. 3.4(a). The proposed micro-beamformer operates on current-mode input signals  $I_{1-N}$, and successively integrates these currents on the memory cells, e.g.,  $C_1$  of the first time-interleaved channel TI<sub>1</sub>. These input currents are generated by AFEs that have high-impedance output stages, as will be elaborated in Section 3.4, allowing summation to take place in the current domain before delays are applied. A similar clock scheme is used, which accurately sets the delay between two adjacent channels by the delay (e.g.,  $\tau_1$ ) between the falling edges of two write cycles (e.g.,  $W_{11}$  and  $W_{12}$ ). The total number of TI channels is the same as given by eq. (3.1), but there is only one memory cell in every TI channel, thus reducing the total number of memory cells by a factor of *N* compared to the aforementioned beamformers. The required total number of switches is reduced by about 2×.
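The saving can be made concrete by counting resources for both architectures. The sketch below is a rough illustration only: the memory-cell counts follow the text, whereas the switch count of the sum-and-delay architecture is an assumed model (N write switches plus one read switch per TI channel) that is not stated in the text and is merely chosen to be consistent with the "about 2×" reduction. Function and architecture names are hypothetical.

```python
def memory_cells(k, n_channels, architecture):
    """Number of memory cells for k TI channels and N input channels."""
    if architecture == "delay_and_sum":
        return k * n_channels        # Eq. (3.2)
    if architecture == "sum_and_delay":
        return k                     # one memory cell per TI channel (from the text)
    raise ValueError(architecture)


def switches(k, n_channels, architecture):
    """Number of switches; the sum-and-delay model is an ASSUMPTION."""
    if architecture == "delay_and_sum":
        return 2 * k * n_channels    # expression given in the text
    if architecture == "sum_and_delay":
        return k * (n_channels + 1)  # assumed: N write + 1 read switch per TI channel
    raise ValueError(architecture)


for n in (4, 16, 64):
    k = 4  # hypothetical number of TI channels
    mem_ratio = memory_cells(k, n, "delay_and_sum") / memory_cells(k, n, "sum_and_delay")
    sw_ratio = switches(k, n, "delay_and_sum") / switches(k, n, "sum_and_delay")
    print(f"N = {n:2d}: memory cells reduced {mem_ratio:.0f}x, switches reduced {sw_ratio:.2f}x")
```

Under this assumed switch model the reduction factor is $2N/(N+1)$, which approaches 2 as the sub-array size grows, while the memory-cell reduction equals *N* exactly.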

The integration of input currents on the memory capacitors is an implementation of boxcar integration [23], which is an effective technique to suppress out-of-band noise from the preceding AFE, and acts as an implicit anti-aliasing filter (AAF) for sampling in the beamformer. The normalized magnitude response of boxcar integration can be expressed as [24]

$$|H(f)| = \left| \frac{\sin(\pi f T_{\text{int}})}{\pi f T_{\text{int}}} \right| \tag{3.3}$$



Fig. 3.4: (a) The boxcar-integration-based µBF. (b) Frequency response of the boxcar integration.

where  $T_{int}$  is the integration time. Fig. 3.4(b) illustrates the normalized magnitude response; the notch frequency of the boxcar integration is inversely proportional to the integration time. Therefore, an integration time equal to the sampling period  $T_s$  is selected in our design to maximize the filtering effect and minimize aliasing, as shown in Fig. 3.4(a). In contrast to the continuous-time delay line employing the all-pass filter, boxcar integration has a linear phase response and does not cause frequency-dependent delay for the wide-band ultrasound signal. The -3-dB corner of the boxcar integration filter is about  $0.44 \cdot f_s$  in our design, which is higher than the required signal bandwidth and thus causes insignificant in-band attenuation, as a sampling frequency of 4× the transducer resonance frequency is selected in our design.
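The corner frequency quoted above can be reproduced numerically from (3.3). The sketch below evaluates the boxcar response for an integration time equal to one period of the 40-MHz sampling clock, finds the -3-dB corner by bisection, and evaluates the attenuation at the 10-MHz centre frequency. The function names are illustrative and the bisection is just one convenient way to locate the corner.

```python
import math


def boxcar_magnitude(f_hz, t_int_s):
    """Normalized magnitude response |sin(pi f T) / (pi f T)| of Eq. (3.3)."""
    x = math.pi * f_hz * t_int_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def corner_frequency(t_int_s, level=1 / math.sqrt(2)):
    """Bisect between DC and the first notch for the frequency where |H| equals `level`."""
    lo, hi = 0.0, 1.0 / t_int_s   # the response falls monotonically from 1 to 0 on this span
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if boxcar_magnitude(mid, t_int_s) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


F_S = 40e6           # sampling rate: 4x the 10-MHz transducer centre frequency
T_INT = 1.0 / F_S    # integration time equal to the sampling period

f_3db = corner_frequency(T_INT)
loss_db_at_10mhz = -20 * math.log10(boxcar_magnitude(10e6, T_INT))

print(f"-3-dB corner: {f_3db / 1e6:.2f} MHz ({f_3db / F_S:.3f} * f_s)")
print(f"attenuation at 10 MHz: {loss_db_at_10mhz:.2f} dB")
print(f"response at the first notch (f_s): {boxcar_magnitude(F_S, T_INT):.1e}")
```

The first notch falls at the sampling frequency itself, so components around $f_s$ that would alias to low frequencies are the most strongly suppressed.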

As derived in the Appendix, the theoretical total capacitance required to achieve a given output SNR is similar for the conventional switched-capacitor and the proposed boxcar-integration-based beamformer. However, in practice, the latter still significantly reduces chip area, taking into account that the conventional beamformer requires  $N \times$  more distributed memory capacitors, twice as many switches and the associated interconnections. In addition, the boxcar-integration-based beamformer obviates the need for an explicit AAF, which also saves area. All these factors result in a very compact beamformer implementation.

A similar current-mode beamformer was introduced in [25]. In contrast with our proposal, this design employs a voltage-mode AFE with element-level sampling capacitors and explicit voltage-to-current converters at the input of the beamformer, resulting in large die area and additional power consumption. Moreover, either an explicit AAF or an AFE with limited bandwidth is needed to avoid thermal noise folding. Finally, a clocking scheme was used in which one half-cycle is used for sampling and the other half-cycle for integrating, leading to weaker anti-aliasing filtering compared to our proposed full-cycle integration. These issues are mitigated in our design by the boxcar-integration-based micro-beamformer, leading not only to a compact circuit topology but also significantly relaxing the design requirements of the AFE, as will be further elaborated in Section 3.4.2.

#### **3.3.2.** System overview

As depicted in Fig. 3.5, the prototype ASIC interfaces with an 8×8 transducer array and can be divided into two regions: the element-level region in which the pitch-matched layout is strictly limited by the pitch of the transducer, and a peripheral region in which the area is less constrained. High-voltage (HV) pulsers consisting of peripheral pulsers and element-level diode isolators drive all elements to generate ultrasound pressure, with the ability to define time delays at the row- or column-level to steer the resulting TX beam to different angles. As a proof of concept, only the row-level TX beamforming is implemented in our design. During the TX phase, unipolar square-wave pulses are generated for each transducer element in the prototype, with an amplitude up to 36 V, a 50-ns duration and a minimum delay step of 12.5 ns. In contrast to a push-pull transmitter [13] where a pair of bulky HV PMOS/NMOS transistors generates the needed square wave, the element-level isolator only consists of an area-efficient HV diode and thus significantly reduces the required element-level die area. This also allows us to accommodate an RX circuit with more complicated functions in the pitch-matched layout area, such as an AFE with continuous time-gain compensation.
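To illustrate what the 12.5-ns delay step means for row-level TX steering, the sketch below computes ideal per-row delays for a steered plane wave and quantizes them to the available step. This is only an illustration under stated assumptions: the plane-wave delay law (pitch · sin θ / c), the speed of sound of 1540 m/s and the 30° example angle are not taken from the text, and the function name is hypothetical.

```python
import math

PITCH_M = 100e-6          # array pitch from the text
DELAY_STEP_S = 12.5e-9    # minimum TX delay step from the text
SPEED_OF_SOUND = 1540.0   # ASSUMED soft-tissue value in m/s, not from the text


def row_delays(n_rows, steer_deg):
    """Ideal and quantized per-row delays (seconds) for an assumed plane-wave steering law."""
    dt = PITCH_M * math.sin(math.radians(steer_deg)) / SPEED_OF_SOUND
    ideal = [r * dt for r in range(n_rows)]
    offset = min(ideal)                       # shift so that all delays are non-negative
    ideal = [d - offset for d in ideal]
    quantized = [round(d / DELAY_STEP_S) * DELAY_STEP_S for d in ideal]
    return ideal, quantized


ideal, quantized = row_delays(n_rows=8, steer_deg=30.0)
steps = [round(q / DELAY_STEP_S) for q in quantized]
worst_error_ns = max(abs(i - q) for i, q in zip(ideal, quantized)) * 1e9

print("delay in clock steps per row:", steps)
print(f"worst-case quantization error: {worst_error_ns:.2f} ns")
```

By construction the quantization error never exceeds half a delay step, i.e. 6.25 ns, which is a small fraction of the 100-ns period of a 10-MHz pulse.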

For echo reception, the 8×8 array is divided into sub-arrays of 2×2 elements. After the transmission, HV Transmit/Receive (T/R) switches and multiplexers connect two of these sub-arrays to the receive circuitry inside the element-level region. The signal currents from the four elements of the selected sub-array are amplified by 4 AFEs and then fed to the RX micro-beamformer consisting of profile multiplexers that set the element delays, write switches, and boxcar memory cells that store the summed-and-delayed current signals. Four time-interleaved memory cells, each of which comprises an active integrator, are followed by a sample-and-hold stage (S/H) that joins time-interleaved signals together. An output buffer drives the S/H output off the chip.

Fig. 3.6: Circuit diagram of the analog front-end consisting of a TIA, a CA and a coupling capacitor in between.

## **3.4.** CIRCUIT DESIGN

#### **3.4.1.** ANALOG FRONT-END

The AFEs are based on the design presented in [26]. As illustrated in Fig. 3.6, each AFE channel consists of a capacitive-feedback trans-impedance amplifier (TIA), the output of which is capacitively coupled to the input of a current amplifier (CA). The CA provides a high-impedance output to drive the boxcar integrator in the $\mu$BF with an amplified version of the transducer's signal current. The AFE's gain can be continuously tuned over a range of 36 dB by an external voltage $V_{TGC}$ to provide time-gain compensation, i.e., to compensate for the stronger attenuation of echoes that arrive later. The AFE provides less than ±0.4 dB gain error and has a 1.31 pA/$\sqrt{Hz}$ input-referred noise density within the 6 MHz to 14 MHz band at its maximum gain ($V_{TGC} = 1.1$ V).

#### **3.4.2.** MICRO-BEAMFORMER

Fig. 3.7(a) shows the detailed implementation of the boxcar-integration-based $\mu$BF. As mentioned, it has *N*=4 input channels and four TI channels (TI<sub>1-4</sub>). In contrast with the concept shown earlier in Fig. 3.4, the inputs are not connected directly to the boxcar integrators, but through two layers of switches, i.e., the $\mu$BF profile multiplexer and the write switches $W_{1-6}\langle 1:4\rangle$. First, to set the delay profile, a $\mu$BF profile multiplexer connects each of the input currents *I*<sub>1-4</sub> to one of 6 summation nodes, which correspond to 6 possible delay values. These 6 summation nodes are cyclically connected to the 4 boxcar integrators via the write switches ($W_{1-6}\langle 1:4\rangle$), orchestrated by an 80-MHz clock CLK<sub>D</sub>, which runs at twice the output sampling rate of 40 MHz and gives a delay step of 12.5 ns. Active boxcar integrators are implemented instead of passive integrators to improve the linearity in the write cycles ($\bar{R}_{1-4}$). These are reconfigured to voltage buffers driving the following S/H stage in the read cycle ($R_{1-4}$). Fig. 3.7(a) also shows an example output waveform ($V_{BF1}$) of the first TI channel (TI<sub>1</sub>) receiving four sinusoidal current inputs ($I_{1-4}$) with a uniform delay of 12.5 ns between each. These current inputs are connected to the first four summing nodes (nodes 1–4), respectively, corresponding to an equally spaced delay profile of 12.5 ns, and are sequentially integrated on the feedback capacitor $C_{BF1}$ via switches $W_{1-4}\langle 1 \rangle$.

An important advantage of the two layers of switches is that the $\mu$BF profile multiplexer switches remain static during the RX period, while the switches connecting to the boxcar integrators can be driven by a simple profile-independent clock generator. This provides accurate timing with low clock skew, which is important for generating accurate steering angles during RX beamforming [27]. The clock control signals of the 24 write switches only have eight different phases, e.g., write switches $W_2\langle 4 \rangle$, $W_4\langle 3 \rangle$ and $W_6\langle 2 \rangle$ can share the same control signal $\Phi_8$, as illustrated in Fig. 3.7(a). Therefore, an 8-phase digital clock generator is implemented using a 1-bit shift register (SR) and generates 8 clock signals ($\Phi_{1-8}$), each of which drives identical loads (i.e., 3 write switches). In the single-layer switching scheme of [28], a multibit SR and the subsequent $\mu$BF delay-profile multiplexers are both implemented in the digital domain to generate a clock signal for every switch. This results in a large number of clock signals that must be routed across the digital and analog domains, with routing lengths that are difficult to match. In comparison, the clocking scheme used here reduces both the number of logic gates in the clock signal path and the number of clock signal connections to the write switches, making it less demanding to equalize the propagation delays and thus effectively minimizing the clock skew. Thanks to the current-mode operation, the associated switches can be made small, with an insignificant area overhead.
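
The sharing of clock phases among the write switches can be illustrated with a short sketch. The assignment rule below is an assumption, not taken from Fig. 3.7(a): it offsets the phase by one CLK<sub>D</sub> period per summation node and by two periods per integrator, which is merely one rule that reproduces the example given in the text ($W_2\langle 4 \rangle$, $W_4\langle 3 \rangle$ and $W_6\langle 2 \rangle$ sharing $\Phi_8$) and loads every phase with 3 switches. The function names are hypothetical.

```python
from collections import defaultdict

N_NODES = 6        # summation nodes (delay taps), from the text
N_INTEGRATORS = 4  # time-interleaved boxcar integrators, from the text
N_PHASES = 8       # clock phases Phi_1..Phi_8, from the text


def write_phase(node: int, integrator: int) -> int:
    """Return the assumed phase index (1..8) driving switch W_node<integrator>.

    Hypothetical rule: one CLK_D period of offset per node, two per integrator.
    """
    if not (1 <= node <= N_NODES and 1 <= integrator <= N_INTEGRATORS):
        raise ValueError("switch index out of range")
    return (node + 2 * integrator - 3) % N_PHASES + 1


def switches_per_phase() -> dict:
    """Group all 24 write switches by the phase that drives them."""
    groups = defaultdict(list)
    for node in range(1, N_NODES + 1):
        for integrator in range(1, N_INTEGRATORS + 1):
            groups[write_phase(node, integrator)].append((node, integrator))
    return dict(groups)


if __name__ == "__main__":
    for phase, switches in sorted(switches_per_phase().items()):
        print(f"Phi_{phase}: {switches}")
```

Under this assumed rule, each of the 8 phases drives exactly 3 of the 24 switches, which is the equal-load property that the text relies on to keep the clock skew low.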

The active boxcar integrators alternate between integration ($R_i = 0$) and readout ($R_i = 1$) phases. During the latter, the accumulated charge is transferred to one of two S/H stages, which operate in a ping-pong fashion and alternately drive the output, controlled by a 40-MHz clock CLK<sub>SH</sub>, as illustrated by the example waveforms in Fig. 3.7(a). In these S/H stages, the accumulated charge of $C_{BF1,3}$ and $C_{BF2,4}$ is transferred to hold capacitors $C_{H1}$ and $C_{H2}$ during read phases $R_{1,3}$ and $R_{2,4}$, respectively. During phases $S_1$ and $S_2$, they serve as intermediate buffer stages and alternately drive the output $V_{SHO}$, which is loaded by the following output buffer. During phases $Q_1$ and $Q_2$, the hold capacitors are reset in preparation for the next cycle.

The clock signal CLK<sub>D</sub> sets the minimum delay step to 12.5 ns and the delay range to 62.5 ns, allowing for steering the 2×2 sub-array to an angle within the range of -74° to 74°. CLK<sub>SH</sub> independently sets the output sampling period to 25 ns. The boxcar integration time ($T_{int}$) is 25 ns, which provides effective anti-alias filtering for the AFE. In this work, the delay profile is predefined and can only be updated every pulse-echo cycle, but the proposed µBF architecture can be extended to other beamforming schemes (e.g., dynamic RX focusing) by adding additional digital control.
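
The numbers in this paragraph can be cross-checked with a small calculation. The sketch below assumes the usual far-field relation $\sin\theta = c\,\tau/d$ between the delay $\tau$ applied across two adjacent elements and the steering angle $\theta$, a speed of sound of 1540 m/s (the value used later for the directivity test), the 100-µm pitch, and an ideal boxcar integrator with a sinc-shaped magnitude response. The function names are illustrative only.

```python
import math

SPEED_OF_SOUND = 1540.0  # m/s, assumed (same value as in the directivity test)
PITCH = 100e-6           # m, element pitch
DELAY_STEP = 12.5e-9     # s, one period of the 80-MHz clock CLK_D
N_DELAY_TAPS = 6         # summation nodes, i.e. selectable delay values
T_INT = 25e-9            # s, boxcar integration time


def steering_angle_deg(delay_s: float, pitch_m: float = PITCH,
                       c: float = SPEED_OF_SOUND) -> float:
    """Steering angle for a given delay between two adjacent elements."""
    return math.degrees(math.asin(c * delay_s / pitch_m))


def boxcar_gain(freq_hz: float, t_int: float = T_INT) -> float:
    """Magnitude response of an ideal boxcar integrator, normalized to DC."""
    x = math.pi * freq_hz * t_int
    return 1.0 if x == 0 else abs(math.sin(x) / x)


max_delay = (N_DELAY_TAPS - 1) * DELAY_STEP  # 6 taps span 5 steps
max_angle = steering_angle_deg(max_delay)    # largest steering angle
first_notch_hz = 1.0 / T_INT                 # first null of the sinc response

if __name__ == "__main__":
    print(f"delay range     : {max_delay * 1e9:.1f} ns")
    print(f"max steering    : +/-{max_angle:.1f} deg")
    print(f"first notch     : {first_notch_hz / 1e6:.0f} MHz")
    print(f"gain at 10 MHz  : {boxcar_gain(10e6):.3f}")
```

With these assumptions the first null of the boxcar response falls at the 40-MHz output sampling rate, which is why the integration acts as an anti-alias filter for signals that would otherwise fold back to DC.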

The operational transconductance amplifiers (OTAs) in the boxcar integrators and S/H stages are implemented using inverter-based amplifiers [Fig. 3.7(b)] with current-reuse supply- and ground-regulators [26] that suppress interference and are shared at the sub-array level to save area. Two capacitive level shifters ($C_1/C_2$) are used to enlarge the dynamic range of the OTAs. These are reset to the associated common-mode voltages ($V_{bp}$, $V_{bn}$ and $V_{cm}$) during the TX period ($\Phi_{TX}$) and hold the DC bias points during the RX period ($\Phi_{RX}$).

Fig. 3.8: Circuit diagram of the class-AB output stage with the quasi-floating biasing scheme.

#### 3.4.3. OUTPUT BUFFER

A class-AB output buffer, as depicted in Fig. 3.8, is adopted to provide sufficient drive capability for off-chip loads, such as parasitic interconnect capacitance [29]. Instead of the pseudo-resistor used in [29], a dynamic biasing scheme sets the static current of the class-AB output stage: the quasi-floating gate is reset during the TX phase ($\Phi_{TX}$) and kept floating during the RX phase, which improves the noise performance of the output buffer by isolating the noise originating from the bias network. The bandwidth, noise and harmonic distortion of the output buffer were designed to be negligible compared to those of the preceding stages, and therefore have no noticeable impact on the measurements.

#### 3.4.4. TRANSMITTER

A row-level transmission scheme is adopted in our design as a proof of concept. It employs row-level push-pull pulsers in the periphery and element-level HV diodes to provide the necessary isolation between transducer elements, as depicted in Fig. 3.9. The scheme allows for row-level TX beamforming controlled by the digital delay line, which is also located in the periphery. During transmission, HV pulses are generated by charging the transducer elements of a row via a peripheral HV PMOS and element-level HV diodes. The PMOS is turned on via a level-shifter controlled by clock signal $\Phi_1$. The elements are then discharged via a peripheral HV NMOS, an element-level HV NMOS which also serves as a T/R switch, and a low-voltage NMOS ($\Phi_{2,3} = 1$). The metal interconnect between the peripheral circuitry and the elements is dimensioned to ensure that the associated propagation delay is negligible. During reception, the transducer elements, isolated from each other by reverse biasing the HV diodes, connect to the AFEs via the T/R switch and a low-voltage multiplexer switch controlled by $\Phi_{EN}$. The passive HV diode is generally smaller than the HV MOS transistors widely adopted in other designs [13],[30],[31], due to the lack of active structure, thus making the transmitter very compact and reserving more room for the RX electronics.

Fig. 3.9: Circuit diagram of the proposed transmitter employing HV diodes as isolation.

### **3.5.** EXPERIMENTAL RESULTS

#### **3.5.1.** ASIC PROTOTYPE

The ASIC has been fabricated in a 180-nm HV BCD process. The chip size of the prototype is $1.98 \times 1.88 \text{ mm}^2$ [see Fig. 3.10(a)], in which the peripheral TX circuitry occupies $0.8 \times 0.35 \text{ mm}^2$ as shown in Fig. 3.10(b). The top-half floor plan of the element-level TX/RX circuitry is shown in Fig. 3.10(c), which includes 4 RX channels interfacing 8 sub-arrays of $2 \times 2$ elements via the multiplexer and has the area breakdown shown in Fig. 3.11(a). Per channel, the RX circuitry occupies $0.04 \text{ mm}^2$ of which the µBF plus the S/H occupies only $0.005 \text{ mm}^2$, when not taking the 8× multiplexing factor into account, i.e., dividing the area by a factor of 4. The power breakdown is shown in Fig. 3.11(b). The power consumption was measured on a chip with the transducer array built on top, and the measured total power consumption is 4.8 mW, i.e., 1.2 mW/channel, of which 0.8 mW is consumed by the AFE and biasing, 0.33 mW by the µBF and the S/H, 0.06 mW by the output buffer and digital, while the TX consumes only $32 \mu$W/channel at a pulse repetition frequency of 10 kHz, i.e., transmitting one 50-ns 30-V pulse in every 100 $\mu$s.

Fig. 3.10: (a) Micrograph of the transceiver ASIC. (b) Inset showing the peripheral TX circuitry. (c) Inset showing the element-level TX/RX circuitry. (d) A prototype of the transceiver ASIC with PZT array on top.

Fig. 3.10(d) shows a prototype with an 8×8 central PZT transducer array built on top, connected to the ASIC via the transducer bonding pads [see Fig. 3.10(c)], and surrounded by an outer-ring of dummy transducer elements. The array was built using a fabrication scheme similar to that described in [32]. The prototype was wire-bonded to a daughter PCB, which was then mounted on a mother PCB, for the following measurements. The mother PCB contains an FPGA which controls the TX and RX functionality via the ASIC's SPI interface (see Fig. 4.1) and also synchronizes the data acquisition between the ASIC and an oscilloscope.

#### **3.5.2.** Electrical Characterization

Fig. 3.11: (a) Area breakdown and (b) power breakdown of 8 sub-arrays of 2×2 elements.

Test currents can be injected into the inputs of a sub-array by applying voltage signals to four 1-pF capacitors on the chip, which emulate $2\times2$ transducer elements. By measuring the output voltage of the output buffer, the transfer function was extracted at different TGC control voltages ($V_{\text{TGC}}$), as shown in Fig. 3.12(a). From this, the gain at 10 MHz was extracted, as depicted in Fig. 3.12(b), which reveals that the measured RX gain in dB is a linear function of $V_{TGC}$ from 0.5 V to 1.1 V, leading to a total gain range of 36 dB. A –3-dB bandwidth curve across the overall gain range was also derived from the transfer function, indicating the bandwidth variation at different AFE gains. The overall bandwidth at different gains is slightly smaller than that of the AFE alone [26], due to the additional filtering effect of the boxcar integration, but the minimum measured bandwidth, i.e., 14.7 MHz, still meets the requirement of our application. The measured bandwidth variation is less than ±10% around 16.3 MHz across the gain range, which is also smaller than the variation measured at the AFE's output [26], because the –3-dB corner of the boxcar integration filter is accurately controlled by the clock regardless of the AFE gain.
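
The linear-in-dB gain control lends itself to a simple conversion between control voltage and gain. The sketch below is an idealization built only from the end points quoted above (0.5 V to 1.1 V, 36 dB); it ignores the measured gain error, and the function names are hypothetical.

```python
V_TGC_MIN = 0.5       # V, lower end of the measured linear-in-dB range
V_TGC_MAX = 1.1       # V, upper end (maximum gain)
GAIN_RANGE_DB = 36.0  # dB, total measured gain range

# Idealized control slope, roughly 60 dB/V.
SLOPE_DB_PER_V = GAIN_RANGE_DB / (V_TGC_MAX - V_TGC_MIN)


def relative_gain_db(v_tgc: float) -> float:
    """Gain relative to the minimum-gain setting, assuming an exactly linear-in-dB law."""
    if not V_TGC_MIN <= v_tgc <= V_TGC_MAX:
        raise ValueError("V_TGC outside the characterized range")
    return SLOPE_DB_PER_V * (v_tgc - V_TGC_MIN)


def v_tgc_for_gain(gain_db: float) -> float:
    """Control voltage that gives the requested relative gain in dB."""
    if not 0.0 <= gain_db <= GAIN_RANGE_DB:
        raise ValueError("requested gain outside the 36-dB range")
    return V_TGC_MIN + gain_db / SLOPE_DB_PER_V


if __name__ == "__main__":
    for v in (0.5, 0.8, 1.1):
        print(f"V_TGC = {v:.1f} V -> {relative_gain_db(v):.1f} dB above minimum gain")
```

Because the gain is linear in dB, a linear voltage ramp on $V_{TGC}$ during reception would produce an exponentially increasing gain, which is the shape needed to offset attenuation that grows with echo arrival time.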

Fig. 3.13(a) shows input-referred noise spectra for a set of TGC control voltages covering the full gain range. The output noise density was first measured at the output of the µBF via the output buffer by connecting the on-chip 1-pF capacitors at the inputs of the AFEs to ground, and the input-referred noise was then calculated by dividing the measured output noise density by the measured gains shown in Fig. 3.12(a). The noise contribution of the output buffer was designed to be negligible. The input-referred noise of a single AFE channel was measured in the same way by internally bypassing the µBF. In Fig. 3.13(b), the noise density averaged from 6 to 14 MHz is plotted as a function of the TGC control voltage both for the complete signal path (i.e., 4 AFEs + µBF) and for the AFE only. The figure also shows the ratio between them. An averaged input-referred noise of 0.67 pA/$\sqrt{Hz}$ is measured at the µBF's output when $V_{TGC} = 1.1$ V, which is close to half of the input-referred noise (1.31 pA/$\sqrt{Hz}$) measured at a single-channel AFE's output. This factor agrees well with the theoretical noise reduction of 2 (i.e., $\sqrt{4}$) expected from a µBF of 4 elements, without noticeable noise-folding effects, and is stably maintained across the full gain range [see Fig. 3.13(b)].
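
The expected factor of 2 follows from summing uncorrelated channels: the signal adds in amplitude (factor *N*) while the noise adds in power (factor $\sqrt{N}$). A minimal numerical check using the two measured values quoted above, assuming equal and uncorrelated channel noise, could look as follows; the function name is illustrative.

```python
import math


def summed_input_referred_noise(per_channel_noise: float, n_channels: int) -> float:
    """Input-referred noise density after summing n equal, uncorrelated channels.

    Noise adds in power (sqrt(n)), signal adds in amplitude (n), so the
    input-referred density drops by sqrt(n).
    """
    return per_channel_noise * math.sqrt(n_channels) / n_channels


AFE_NOISE = 1.31  # pA/sqrt(Hz), single AFE channel at V_TGC = 1.1 V (from the text)
UBF_NOISE = 0.67  # pA/sqrt(Hz), measured at the uBF output (from the text)
N_CHANNELS = 4    # elements per sub-array

expected = summed_input_referred_noise(AFE_NOISE, N_CHANNELS)
measured_ratio = AFE_NOISE / UBF_NOISE

if __name__ == "__main__":
    print(f"expected uBF noise : {expected:.3f} pA/sqrt(Hz)")
    print(f"measured ratio     : {measured_ratio:.2f} (ideal: {math.sqrt(N_CHANNELS):.0f})")
```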

Fig. 3.12: (a) Measured gain transfer function. (b) Extracted –3-dB bandwidth as a function of TGC control voltage V<sub>TGC</sub>.

Fig. 3.13: (a) Measured input referred noise spectra. (b) Input referred in-band noise density measured at AFE's and $\mu$BF's output, respectively, as well as the ratio between them as a function of TGC control voltage V<sub>TGC</sub>.

Fig. 3.14(a) shows the SNR as a function of the input current for different TGC control voltages, which was also measured via the final output buffer. It demonstrates a measured dynamic range (DR) of 82 dB, which is the ratio of the maximum input signal level at the 1-dB compression point and the minimum detectable input signal level at which the input power is equal to the noise power, allowing for measuring an acoustic pressure within the range of about 3.4 Pa to 42 kPa.
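
The quoted pressure range is consistent with the 82-dB dynamic range, as a one-line calculation shows. The sketch assumes that pressure maps linearly onto input current, so that the ratio of the two approximate pressure limits equals the amplitude ratio expressed by the DR.

```python
import math

P_MIN_PA = 3.4   # Pa, approximate minimum detectable pressure (from the text)
P_MAX_PA = 42e3  # Pa, approximate pressure at the 1-dB compression point


def dynamic_range_db(max_level: float, min_level: float) -> float:
    """Dynamic range in dB for two amplitude-like quantities."""
    return 20.0 * math.log10(max_level / min_level)


dr_db = dynamic_range_db(P_MAX_PA, P_MIN_PA)

if __name__ == "__main__":
    print(f"DR from pressure limits: {dr_db:.1f} dB")
```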

For characterization of the TX circuitry, two of the transducer bonding pads were wire-bonded to the daughter board and were connected to two off-chip 1-pF capacitive loads emulating two transducer elements, respectively. The HV pulsers successfully produce unipolar 7-cycle pulses with up to a 36-V amplitude and a minimum delay resolution of 12.5 ns, as depicted in Fig. 3.14(b).

Fig. 3.14: (a) Measured SNR as a function of the input current for different TGC control voltages. (b) Measured outputs of two TX pulsers with inset showing a minimum delay of 12.5 ns and 36-V amplitude.

The RX beamforming performs spatial filtering of incoming acoustic waves, resulting in spatial directivity for different beamformer steering angles. As depicted in Fig. 3.15(a), this spatial directivity was first evaluated through an electrical test by applying four time-shifted 10-MHz sinusoidal inputs via the on-chip capacitors to the AFEs, thus emulating acoustic waves arriving at different angles, and comparing the $\mu$BF response with the theoretical spatial directivity for different $\mu$BF steering angles. A 6.25-ns time shift step was applied to the four sinusoidal inputs during the measurement, which emulates an angle of incidence of an incoming acoustic wave that changes by 5° to 7° in each step, assuming the ultrasound speed is 1540 m/s. The $\mu$BF was configured to four different steering angles from 0° to 35°. As shown in Fig. 3.15(b), the $\mu$BF response is in very good agreement with the theoretical directivity curve.
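
The mapping from time shift to emulated arrival angle, and the shape of an ideal directivity curve, can be sketched as follows. The code assumes the far-field relation $\sin\theta = c\,\tau/d$ with the 100-µm pitch, and it models the four inputs as a uniform linear progression of delays (as in the example of Fig. 3.7(a)); both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

C = 1540.0      # m/s, speed of sound assumed in the text
PITCH = 100e-6  # m, element pitch
F0 = 10e6       # Hz, test-signal frequency
STEP = 6.25e-9  # s, time-shift step used in the electrical test


def emulated_angle_deg(delay_s: float) -> float:
    """Arrival angle emulated by a given time shift between adjacent inputs."""
    return math.degrees(math.asin(C * delay_s / PITCH))


def das_response(residual_delay_s: float, n_inputs: int = 4,
                 freq_hz: float = F0) -> float:
    """Normalized delay-and-sum amplitude for n equal sinusoids.

    residual_delay_s is the uncompensated delay between neighbouring inputs,
    i.e. the emulated time shift minus the delay applied by the beamformer.
    """
    phi = math.pi * freq_hz * residual_delay_s
    denominator = n_inputs * math.sin(phi)
    if abs(denominator) < 1e-12:
        return 1.0  # inputs add fully in phase
    return abs(math.sin(n_inputs * phi) / denominator)


# Emulated angles for 0, 1, ..., 6 steps of 6.25 ns and the change per step.
angles = [emulated_angle_deg(k * STEP) for k in range(7)]
increments = [b - a for a, b in zip(angles, angles[1:])]

if __name__ == "__main__":
    for k, angle in enumerate(angles):
        print(f"{k * STEP * 1e9:6.2f} ns -> {angle:5.1f} deg")
    print("response with 12.5 ns residual delay:", round(das_response(12.5e-9), 3))
```

With these assumptions, multiples of the 12.5-ns beamformer delay step correspond to roughly 11°, 23° and 35°, which matches the steering and incidence angles used in the measurements.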

#### **3.5.3.** ACOUSTICAL CHARACTERIZATION

Fig. 3.15: (a) Spatial directivity measurement emulating acoustic waves arriving at different angles by means of four time-shifted sinusoidal inputs. (b) Normalized measured μBF response as a function of emulated arrival angles, for different μBF steering angles.

The $\mu$BF directivity was also evaluated through an acoustic test by using the measurement setup shown in Fig. 3.16(a). A small water tank was mounted on top of the chip. An unfocused single-element probe was immersed in the water and was driven by a continuous 10-MHz sinusoidal wave to transmit acoustic waves to the chip at different incidence angles of 0°, 11° and 23°, aligned with the azimuth direction of the chip. The single-element probe was accurately positioned on a circular trajectory centered at the location of the PZT transducer array, by using a 3-D printed rotating holder (not shown). The $\mu$BF of a 2×2 sub-array was configured to different steering angles ranging from -50° to 50°, while its output amplitude was recorded and normalized, as depicted in Fig. 3.16(b). The theoretical ideal directivity curves are also plotted, and show good agreement with the measured data. As depicted in Fig. 3.16(c), at a specific incidence angle of 0°, the change in output amplitude for different $\mu$BF steering angles clearly shows the expected µBF directivity and the consequential spatial filtering effect.

The same test bench was also used for an imaging experiment, by replacing the single-element probe with 3 needle reflectors positioned at about 8 mm above the chip [33]. The ASIC was used to generate 3-cycle 30V pulses for pulse transmission and to capture the returning echo signals. An image was generated using delay-and-sum beamforming with coherence factor weighting [34]. The resulting B-mode image in an azimuthal plane is shown in Fig. 3.17(a), which clearly shows the positions of the needles, while a rendered 3-D image based on the same recording is shown in Fig. 3.17(b), where the needle heads are clearly distinguishable from the background even with the small aperture size of 0.8×0.8 mm<sup>2</sup>.

The performance and characteristics of our work and the prior art are summarized in Table 3.1. Compared to prior $\mu$BF designs, this work achieves the smallest array pitch, the highest center frequency and the smallest $\mu$BF area per channel.

#### **3.6.** CONCLUSION

This chapter has presented a 100-$\mu$m-pitch-matched transceiver ASIC with a boxcar-integration-based RX $\mu$BF and an element-level TX. For *N* input channels, the presented $\mu$BF reduces the total number of memory cells by a factor of *N* compared to a conventional $\mu$BF employing discrete-time delay lines, thus saving area, and combines boxcar integration with built-in AAF, thus relaxing the design requirements of the AFE and further reducing the area. Active boxcar integration has been implemented to make the $\mu$BF less sensitive to parasitic capacitance, followed by a ping-pong S/H and a high-efficiency output buffer. A novel TX scheme has been implemented allowing area-hungry devices, such as the HV PMOS/NMOS, to be moved to the periphery of the chip, and only using a small-sized HV diode as the element-level isolation plus an HV NMOS as the T/R switch. All these features make the design very compact and allow us to integrate more complicated functions, such as an AFE with continuous TGC, on the chip in a pitch-matched fashion.

Fig. 3.16: (a) Acoustic measurement setup for μBF directivity test. (b) Measured μBF directivity at different probe incidence angles of 0°, 11° and 23° as a function of μBF steering angles, as well as the theoretical directivity. (c) Measured μBF outputs at different μBF steering angles with probe incidence angle of 0°, and inset showing the details of the amplitude change.

Fig. 3.17: (a) A B-mode image in elevation plane. (b) Rendered 3-D image.

#### Appendix

In this appendix, we derive the total capacitance required to obtain a given output SNR, for both a µBF based on switched-capacitor delay lines (as in Fig. 3.3d) and for the proposed boxcar-based µBF (as in Fig. 3.4a). Fig. 3.18(a) depicts a simplified switched-capacitor µBF for noise calculation, which consists of *N* voltage-mode LNAs with a gain of  $A_0$ , *N* AAFs with a bandwidth of  $B_0$ , sampling switches controlled by clock signals  $W_{1-N}$  and  $R_1$ , and sampling capacitors  $C_0$ .  $V_{in,1-N}$ , and  $\bar{v}_{n,1-N}$  represent the input voltage and the equivalent input-referred noise of the LNAs and AAFs, respectively. All noise sources are independent white noise with power spectral density (PSD) of  $S_0$ . We assume that the maximum peak output swing at all AAFs' outputs is  $V_{max}$ . During a write phase, delays are applied to each channel and accurately controlled by the write clock  $W_{1-N}$ , followed by the summation in the following read phase. At the end of the write phase, the amplified and band-limited noise of each LNA and AAF, as well as the noise of the sampling switch  $W_i$ , is sampled on the capacitor  $C_0$ , and the associated output noise power  $\overline{v_{nout,i}^2}$  can be expressed as

$$\overline{v_{\text{nout},i}^2} = A_0^2 \cdot S_0 \cdot B_0 + \frac{k \cdot T}{C_0} \qquad (1 \le i \le N) \tag{3.4}$$

Table 3.1: PERFORMANCE SUMMARY AND COMPARISON WITH THE PRIOR ART

|                      | This work                    | [12]                         | [13]         | [35]        | [9]                          |
|----------------------|------------------------------|------------------------------|--------------|-------------|------------------------------|
| µBF type             | Boxcar integration           | Voltage S/H                  | Voltage S/H  | Voltage S/H | Digital                      |
| Process              | 180-nm BCD                   | 180-nm BCD                   | 180-nm SOI   | 130-nm CMOS | 28-nm CMOS                   |
| Sub-array size       | 2×2                          | 3×3                          | 4×6          | 8           | 4×4                          |
| Transducer type      | PZT                          | PZT                          | PZT          | CMUT        | CMUT                         |
| Pitch                | 100 µm                       | 150 µm                       | 300 µm       | –           | 250 µm                       |
| Center frequency     | 10 MHz                       | 5 MHz                        | <5 MHz       | 3 MHz       | 5 MHz                        |
| Sampling rate        | 40 MS/s                      | 30 MS/s                      | 40 MS/s      | 40 MHz      | 20 MS/s                      |
| µBF delay resolution | 12.5 ns                      | 33.3 ns                      | 25 ns        | 6.25 ns     | 8.33 ns                      |
| µBF maximum delay    | 62.5 ns                      | 233.3 ns                     | 750 ns       | 1000 ns     | 940 ns                       |
| µBF area/channel     | 0.005 mm<sup>2</sup> <sup>(\*)</sup> | 0.011 mm<sup>2</sup> <sup>(§)</sup> | 0.03 mm<sup>2</sup> | 0.03 mm<sup>2</sup> | 0.041 mm<sup>2</sup> <sup>(†)</sup> |
| µBF power/channel    | 0.33 mW <sup>(\*)</sup>      | 0.17 mW <sup>(§)</sup>       | 0.19 mW      | 12.4 mW     | 17.5 mW <sup>(†)</sup>       |
| AFE type             | LNA + TGC                    | LNA + PGA                    | LNA + PGA    | LNA + PGA   | LNA + PGA                    |
| RX area/channel      | 0.04 mm<sup>2</sup>          | 0.026 mm<sup>2</sup>         | 0.09 mm<sup>2</sup> | 0.132 mm<sup>2</sup> | 0.088 mm<sup>2</sup> |
| RX power/channel     | 1.2 mW                       | 0.91 mW                      | 0.43 mW      | 18 mW       | 33 mW                        |
| Peak SNR             | 54 dB                        | 52 dB                        | –            | –           | 60 dB                        |
| Input DR             | 82 dB                        | 85 dB                        | 85 dB        | –           | –                            |
| Element-level TX     | Y                            | N                            | Y            | N           | N                            |
| TX voltage           | 36 V                         | –                            | 138 V        | –           | –                            |

<sup>\*</sup> Including S/H stage. <sup>§</sup> Including sub-array ADC. <sup>†</sup> Including element-level ADC.




Fig. 3.18: (a) A simplified switched-capacitor-based µBF for noise calculation. (b) A simplified boxcar-integration-based µBF for noise calculation.

The noise of the *N* channels is averaged in the following read phase, leading to a maximum signal to noise ratio  $SNR_{sc}$  expressed as

$$SNR_{sc} = \frac{V_{max}^2 \cdot N}{2 \cdot \overline{v_{\text{nout},i}^2}} = \frac{V_{max}^2 \cdot N}{2 \cdot \left(A_0^2 \cdot S_0 \cdot B_0 + \frac{k \cdot T}{C_0}\right)} \tag{3.5}$$

A similar analysis can be applied to the  $\mu$ BF based on boxcar integration, as depicted in Fig. 3.18(b). The voltage-mode LNAs are replaced by transconductance amplifiers with a transconductance of *G*<sub>m</sub>, and a short reset phase is required to clear the memory capacitor *C*<sub>1</sub> after each read phase R<sub>1</sub>, giving rise to *kT/C* noise. At the end of the write phase, the output noise power  $\overline{v_{nout,bc}^2}$  can be expressed as

$$\begin{aligned}
\overline{v_{\text{nout,bc}}^2} &= N \cdot A_1^2 \cdot S_0 \cdot B_1 + \frac{k \cdot T}{C_1} \\
A_1 &= \frac{G_m}{C_1} \cdot T_0 \\
B_1 &= \frac{1}{2 \cdot T_0}
\end{aligned} \tag{3.6}$$

where $A_1$ and $B_1$ are the DC gain and the equivalent noise bandwidth of the boxcar integration, respectively; $T_0$ is the integration time, and $S_0$ is the input-referred PSD, which is identical to the input-referred noise PSD of the switched-capacitor µBF. In the following read phase, the maximum signal to noise ratio $SNR_{bc}$ can be expressed as

$$SNR_{bc} = \frac{V_{max}^2}{2 \cdot \overline{v_{\text{nout,bc}}^2}} = \frac{V_{max}^2}{2 \cdot \left(N \cdot A_1^2 \cdot S_0 \cdot B_1 + \frac{k \cdot T}{C_1}\right)} \tag{3.7}$$

Note that  $A_1$  should comply with

$$A_1 = \frac{A_0}{N} \tag{3.8}$$

to achieve the same maximum output swing $V_{max}$. Substituting (3.8) into (3.6) and (3.7), and letting $C_1 = N \cdot C_0$ and $B_1 = B_0$, results in the same SNR for both µBF structures. We can conclude that the µBF based on boxcar integration requires the same total capacitance as the conventional µBF to achieve a given output SNR. In practice, the proposed µBF architecture still occupies less area when factoring in the area needed for the AAF, the required space between memory capacitors, the number of switches and the associated interconnections.
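
For completeness, the substitution can be written out explicitly. This is only a restatement of (3.5)–(3.8) under the stated choices $C_1 = N \cdot C_0$ and $B_1 = B_0$:

$$\begin{aligned}
\overline{v_{\text{nout,bc}}^2} &= N \cdot \frac{A_0^2}{N^2} \cdot S_0 \cdot B_0 + \frac{k \cdot T}{N \cdot C_0} = \frac{1}{N} \left(A_0^2 \cdot S_0 \cdot B_0 + \frac{k \cdot T}{C_0}\right) \\
SNR_{bc} &= \frac{V_{max}^2}{2 \cdot \overline{v_{\text{nout,bc}}^2}} = \frac{V_{max}^2 \cdot N}{2 \cdot \left(A_0^2 \cdot S_0 \cdot B_0 + \frac{k \cdot T}{C_0}\right)} = SNR_{sc}
\end{aligned}$$

The switched-capacitor µBF uses *N* sampling capacitors of $C_0$ each, i.e., $N \cdot C_0$ in total, which equals the single memory capacitor $C_1 = N \cdot C_0$ of the boxcar-based µBF.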

# REFERENCES

- [1] M. Proisy, S. Mitra, C. Uria-Avellana, M. Sokolska, N. Robertson, F. Le Jeune, and J.-C. Ferré, "Brain Perfusion Imaging in Neonates: An Overview," *AJNR Am J Neuroradiol*, vol. 37, no. 10, pp. 1766–1773, Oct. 2016.
- [2] J.-K. Choi, M.-G. Choi, J.-M. Kim, and H.-M. Bae, "Efficient Data Extraction Method for Near-Infrared Spectroscopy (NIRS) Systems With High Spatial and Temporal Resolution," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 7, no. 2, pp. 169–177, Apr. 2013.
- [3] J. J. Volpe, "Brain injury in premature infants: A complex amalgam of destructive and developmental disturbances," *Lancet Neurol*, vol. 8, no. 1, pp. 110–124, Jan. 2009.
- [4] E. Macé, G. Montaldo, I. Cohen, M. Baulac, M. Fink, and M. Tanter, "Functional ultrasound imaging of the brain," *Nat Methods*, vol. 8, no. 8, pp. 662–664, Aug. 2011.
- [5] F. W. Kremkau, R. W. Barnes, and C. P. McGraw, "Ultrasonic attenuation and propagation speed in normal human brain," *The Journal of the Acoustical Society of America*, vol. 70, no. 1, pp. 29–38, Jul. 1981.
- [6] G. A. Popich and D. W. Smith, "Fontanels: Range of normal size," *The Journal of Pediatrics*, vol. 80, no. 5, pp. 749–752, May 1972.
- [7] G. Duc and R. H. Largo, "Anterior Fontanel: Size and Closure in Term and Preterm Infants," *Pediatrics*, vol. 78, no. 5, pp. 904–908, Nov. 1986.
- [8] J. P. Asen, J. I. Buskenes, C.-I. C. Nilsen, A. Austeng, and S. Holm, "Implementing capon beamforming on a GPU for real-time cardiac ultrasound imaging," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 61, no. 1, pp. 76–85, Jan. 2014.
- [9] M.-C. Chen, A. Peña Perez, S.-R. Kothapalli, P. Cathelin, A. Cathelin, S. S. Gambhir, and B. Murmann, "A Pixel Pitch-Matched Ultrasound Receiver for 3-D Photoacoustic Imaging With Integrated Delta-Sigma Beamformer in 28-nm UTBB FD-SOI," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 11, pp. 2843–2856, Nov. 2017.
- [10] J. D. Larson, "2-d phased array ultrasound imaging system with distributed phasing," US5229933A, Jul. 1993.
- [11] B. Savord and R. Solomon, "Fully sampled matrix transducer for real time 3D ultrasonic imaging," in *IEEE Symposium on Ultrasonics, 2003*, vol. 1, Oct. 2003, pp. 945–953.
- [12] C. Chen, Z. Chen, D. Bera, E. Noothout, Z.-Y. Chang, M. Tan, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Front-End ASIC With Integrated Subarray Beamforming ADC for Miniature 3-D Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3050–3064, Nov. 2018.

- [13] Y. Igarashi, S. Kajiyama, Y. Katsube, T. Nishimoto, T. Nakagawa, Y. Okuma, Y. Nakamura, T. Terada, T. Yamawaki, T. Yazaki, Y. Hayashi, K. Amino, T. Kaneko, and H. Tanaka, "Single-Chip 3072-Element-Channel Transceiver/128-Subarray-Channel 2-D Array IC With Analog RX and All-Digital TX Beamformer for Echocardiography," *IEEE J. Solid-State Circuits*, vol. 54, no. 9, pp. 2555–2567, Sep. 2019.
- [14] K. Kaviani, O. Oralkan, P. Khuri-Yakub, and B. Wooley, "A multichannel pipeline analog-to-digital converter for an integrated 3-D ultrasound imaging system," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 7, pp. 1266–1270, Jul. 2003.
- [15] K. Ranganathan, M. Santy, T. Blalock, J. Hossack, and W. Walker, "Direct sampled I/Q beamforming for compact and very low-cost ultrasound imaging," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 51, no. 9, pp. 1082–1094, Sep. 2004.
- [16] G. Gurun, J. S. Zahorian, A. Sisman, M. Karaman, P. E. Hasler, and F. L. Degertekin, "An Analog Integrated Circuit Beamformer for High-Frequency Medical Ultrasound Imaging," *IEEE Trans. Biomed. Circuits Syst.*, vol. 6, no. 5, pp. 454–467, Oct. 2012.
- [17] J. Talman, S. Garverick, and G. Lockwood, "Integrated circuit for high-frequency ultrasound annular array," in *Proceedings of the IEEE 2003 Custom Integrated Circuits Conference, 2003.*, San Jose, CA, USA: IEEE, 2003, pp. 477–480.
- [18] Y. Mo, T. Tanaka, S. Arita, A. Tsuchitani, K. Inoue, and Y. Suzuki, "Pipelined delay-sum architecture based on bucket-brigade devices for on-chip ultrasound beamforming," *IEEE J. Solid-State Circuits*, vol. 38, no. 10, pp. 1754–1757, Oct. 2003.
- [19] J.-Y. Um, Y.-J. Kim, S.-E. Cho, M.-K. Chae, B. Kim, J.-Y. Sim, and H.-J. Park, "A Single-Chip 32-Channel Analog Beamformer With 4-ns Delay Resolution and 768-ns Maximum Delay Range for Ultrasound Medical Imaging With a Linear Array Transducer," *IEEE Trans. Biomed. Circuits Syst.*, vol. 9, no. 1, pp. 138–151, Feb. 2015.
- [20] B. Stefanelli, I. O'Connor, L. Quiquerez, A. Kaiser, and D. Billet, "An analog beam-forming circuit for ultrasound imaging using switched-current delay lines," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 202–211, Feb. 2000.
- [21] F. Sangster, "Integrated MOS and bipolar analog delay lines using bucket-brigade capacitor storage," in 1970 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, Philadelphia, PA, USA: IEEE, 1970, pp. 74–75.
- [22] M. Anthony, E. Kohler, J. Kurtze, L. Kushner, and G. Sollner, "A process-scalable low-power charge-domain 13-bit pipeline ADC," in 2008 IEEE Symposium on VLSI Circuits, Jun. 2008, pp. 222–223.
- [23] D. Ware and P. Mansfield, "High Stability "Boxcar" Integrator for Fast NMR Transients in Solids," *Review of Scientific Instruments*, vol. 37, no. 9, pp. 1167–1171, Sep. 1966.
- [24] A. Mirzaei, S. Chehrazi, R. Bagheri, and A. A. Abidi, "Analysis of first-order anti-aliasing integration sampler," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, no. 10, pp. 2994–3005, Nov. 2008.
- [25] J.-Y. Um, "Current-mode ultrasound beamformer with multiple-input–sequential-output memory structure," *Electronics Letters*, vol. 54, no. 9, pp. 545–546, May 2018.

- [26] P. Guo, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A Pitch-Matched Low-Noise Analog Front-End With Accurate Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," *IEEE J. Solid-State Circuits*, vol. 58, no. 6, pp. 1693–1705, Jun. 2023.
- [27] B. Steinberg, "Digital beamforming in ultrasound," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 39, no. 6, pp. 716–721, Nov. 1992.
- [28] C. Chen, Z. Chen, D. Bera, S. B. Raghunathan, M. Shabanimotlagh, E. Noothout, Z. Chang, J. Ponte, C. Prins, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A front-end ASIC with receive sub-array beamforming integrated with a 32 × 32 PZT matrix transducer for 3-D transesophageal echocardiography," *IEEE J. Solid-State Circuits*, vol. 52, no. 4, pp. 994–1006, Apr. 2017.
- [29] A. J. Lopez-Martin, J. Ramírez-Angulo, R. G. Carvajal, and L. Acosta, "Power-efficient class AB CMOS buffer," *Electronics Letters*, vol. 45, no. 2, pp. 89–90, Jan. 2009.
- [30] M. Tan, E. Kang, J. An, Z. Chang, P. Vince, T. Matéo, N. Sénégond, and M. A. P. Pertijs, "A 64-Channel Transmit Beamformer With ±30-V Bipolar High-Voltage Pulsers for Catheter-Based Ultrasound Probes," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 7, pp. 1796–1806, Jul. 2020.
- [31] G. Jung, M. W. Rashid, T. M. Carpenter, C. Tekes, D. M. J. Cowell, S. Freear, F. L. Degertekin, and M. Ghovanloo, "Single-chip reduced-wire active catheter system with programmable transmit beamforming and receive time-division multiplexing for intracardiac echocardiography," in 2018 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA: IEEE, Feb. 2018, pp. 188–190.
- [32] C. Chen, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, M. A. P. Pertijs, S. B. Raghunathan, Z. Yu, M. Shabanimotlagh, Z. Chen, Z.-y. Chang, S. Blaak, C. Prins, and J. Ponte, "A Prototype PZT Matrix Transducer With Low-Power Integrated Receive ASIC for 3-D Transesophageal Echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 63, no. 1, pp. 47–59, Jan. 2016.
- [33] P. Guo, F. Fool, E. Noothout, Z.-Y. Chang, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A 1.2mW/channel 100µm-Pitch-Matched Transceiver ASIC with Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3D Ultrasound Imaging," in 2022 IEEE International Solid-State Circuits Conference (ISSCC), vol. 65, Feb. 2022, pp. 496–498.
- [34] K. Hollman, K. Rigby, and M. O'Donnell, "Coherence factor of speckle from a multirow probe," in 1999 IEEE Ultrasonics Symposium. Proceedings. International Symposium (Cat. No.99CH37027), vol. 2, Oct. 1999, 1257–1260 vol.2.
- [35] J.-Y. Um, E.-W. Song, Y.-J. Kim, S.-E. Cho, M.-K. Chae, J. Song, B. Kim, S. Lee, J. Bang, Y. Kim, K. Cho, B. Kim, J.-Y. Sim, and H.-J. Park, "24.8 An analog-digital-hybrid single-chip RX beamformer with non-uniform sampling for 2D-CMUT ultrasound imaging to achieve wide dynamic range of delay and small chip area," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), San Francisco, CA, USA: IEEE, Feb. 2014, pp. 426–427.

# **4** A 125µm-Pitch-Matched Transceiver ASIC with Micro-Beamforming ADC and Multi-Level Signaling for 3-D Transfontanelle Ultrasonography

This chapter is based on a draft paper "A 125µm-Pitch-Matched Transceiver ASIC with Micro-Beamforming ADC and Multi-Level Signaling for 3-D Transfontanelle Ultrasonography," in preparation for submission to IEEE Journal of Solid State Circuits.

## **4.1.** INTRODUCTION

Inadequate brain perfusion, regularly observed in preterm infants, exposes the developing brain to injury that could have severe consequences in later life [1]. Bedside monitoring of brain perfusion via trans-fontanelle ultrasonography (TFUS) has an added benefit, especially for high-risk neonates [2], [3]. To match the fontanelle size and generate high-resolution 3-D images for visualizing the sub-millimeter vessels of neonates, a wearable probe with a 2-D array of >10,000 transducer elements would be required, featuring a small pitch and a high center frequency. This poses significant challenges for the electronics design of such a probe, such as interconnecting the transducer elements to power-efficient front-end circuits that interface with the elements to transmit acoustic pulses (TX) and receive the resulting high-frequency echo signals (RX). A pitch-matched application-specific integrated circuit (ASIC) directly integrated with a 2-D transducer array is a proven solution to the interconnection problem [4]. However, techniques are still required that can effectively reduce the RX output channel count to much less than the number of transducer elements.

Various techniques have been reported to reduce the channel count. Time-division multiplexing (TDM), applied at the input of an RX analog front-end (AFE), allows the AFE to interface with multiple transducer elements in successive pulse-echo cycles at the cost of reduced frame rate [5]. When applied at the RX output, TDM allows multiple RX channels to share a single output by assigning each channel a dedicated time slot. However, the limited bandwidth of the output channel (e.g., PCB trace or cable) leads to inter-symbol interference and deteriorates the signal-to-noise ratio (SNR) [6]. Frequency-division multiplexing (FDM) [7] also allows multiple RX channels to share an output by modulating the signal of each channel to a different frequency band, but it also suffers from crosstalk. Micro-beamforming (µBF) [8], [9] divides the transducer array into sub-arrays of *N* elements, and combines the echo signals received in each sub-array by means of an analog delay-and-sum (DAS) operation, effectively reducing the channel count by a factor of *N*, at the cost of reduced frame rate and somewhat degraded image quality [10]. On-chip digitization of the echo signals allows for concatenating multiple RX outputs to a serialized output, thus reducing the channel count, at the cost of increased power consumption [11], [12]. State-of-the-art designs often combine several of the aforementioned approaches to achieve an optimal balance between channel-count reduction, frame rate, image quality and power consumption [11], [13], [14].

The performance of on-chip digitization and subsequent digital processing is limited by the relatively mature high-voltage (HV) technology required to implement the HV pulsers needed to generate sufficient ultrasound pressure in TX [15], [16], since mainstream HV technology nodes still remain above 90 nm [17] and are not optimized for complex digital signal processing. Advanced packaging techniques, e.g. [18], allow for integrating an HV transceiver ASIC with a high-speed data processing unit (DPU), realized in a deeper sub-micron technology, in a single package. However, a low-power and compact on-chip digitization scheme and a power-efficient data link are still prerequisites for such a transceiver ASIC.

An element-level digitization scheme was reported in [19], where $\Delta\Sigma$ modulators directly digitize the echo signals from each transducer element, followed by digital beamforming in the chip's periphery, requiring $I^2$ ADCs for an $I \times I$ transducer array. The scheme is associated with high power consumption as well as large area. Alternatively, the digitization can be done at the output of a µBF [11], [12], which effectively reduces the total channel count, the number of ADCs and the associated power consumption by a factor of *N*, where *N* is again the number of elements in a sub-array. The outputs of *D* ADCs can be concatenated by a following data link, e.g., employing low-voltage differential signaling (LVDS) [11], [12], realizing an additional *D*× channel-count reduction. However, the LVDS data link consumes excessive power, especially in HV technologies; e.g., roughly 50% of the per-channel power is consumed by the LVDS data link in a 180-nm BCD technology in [11]. An alternative is multi-level signaling (MLS) [20], which transmits and receives multiple bits of data per symbol by using a multi-level signal. Compared to LVDS, MLS compresses the bandwidth and the signal level, allowing for a decrease in dynamic power consumption, which is proportional to $fCV^2$.

In this work, we present a pitch-matched transceiver ASIC prototype for TFUS that is directly integrated with a $16 \times 16$ transducer array with 125-µm pitch and 9-MHz center frequency, as depicted in Fig. 4.1. A novel low-power RX architecture is implemented to reduce the number of RX channels by a factor of 128. The ASIC can be divided into two regions: a pitch-matched region where the ASIC layout needs to strictly match the transducer pitch, and a peripheral region where the layout requirement is more relaxed. Fig. 4.1 also shows that an envisioned DPU, consisting of a data RX and the following digital processor, is connected to the transceiver ASIC via bonding wires, while other advanced packaging techniques like through-silicon vias (TSVs) are also a possible choice. In contrast with other ultrasound applications, such as catheter-based probes, in which the data transmitter (D-TX) drives a long coaxial cable [5], our application aims at processing all the ultrasound data inside the wearable device and transmitting the image volume data via a wireless link. Therefore, only a short-range data link is required in the transceiver ASIC.

This chapter is organized as follows. Section 4.2 describes the proposed system architecture. Section 4.3 presents the detailed circuit implementation. Section 4.4 describes the fabricated transceiver prototype, as well as the electrical and acoustic measurement results. This chapter ends with conclusions.

## **4.2.** ARCHITECTURE DESIGN

#### **4.2.1.** System Overview

As depicted in Fig. 4.2, the ASIC is divided into two regions: a pitch-matched region, which contains element-level HV pulsers, sub-group-level RX circuits and a local controller; and a peripheral region, which contains D-TXs and a system controller. The system controller receives commands externally and produces synchronized clock and data signals. It provides timing and configuration data to the local controllers that configure TX/RX beamforming profiles and manage TX/RX operations accordingly.

During TX, each transducer element is isolated from the low-voltage RX circuitry by a transmit/receive (T/R) switch, and pulsed by a unipolar square-wave pulser that provides up to 20-V amplitude and configurable delays to perform the TX beamforming. During the following RX, the transducer array is divided into 8 sub-groups as shown in Fig. 4.2, each of which is associated with a sub-group-level RX circuit. Each sub-group comprises 8 sub-arrays of $2\times 2$ elements, one of which is selected by an 8:1 multiplexer and connected to four single-ended analog front ends (AFEs) via the T/R switches. The AFEs operate in current mode with built-in continuous time-gain compensation (TGC) [21] and provide high-impedance outputs for the following µBF ADC, which incorporates a boxcar-integration-based (BI-based) µBF [22] and a sub-group-level SAR ADC operating in the charge domain. The BI-based µBF comprises delay multiplexers that set up the µBF delays for each RX channel, and 5 memory capacitors that integrate the summed current signals in a time-interleaved manner to avoid information loss during the DAS operation.

After the digitization, the output data of the eight ADCs are concatenated into two data streams of 4-bit width and fed to two D-TXs with differential outputs at the periphery, thus achieving an overall 128-fold channel-count reduction. Although the 8:1 multiplexing at the input of the AFEs reduces the frame rate, the resulting frame rate still satisfies the system requirement, owing to the relatively shallow scanning depth required in our application.
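The 128-fold figure can be checked with simple bookkeeping. The sketch below only multiplies out the numbers quoted in this section; the constant and function names are illustrative and not part of the design.

```python
# Channel-count bookkeeping for the RX path (all figures taken from the text).
ELEMENTS = 16 * 16              # transducer elements in the array
SUB_GROUPS = 8                  # one sub-group-level RX circuit (one ADC) each
SUB_ARRAYS_PER_GROUP = 8        # selected one at a time by the 8:1 multiplexer
ELEMENTS_PER_SUB_ARRAY = 2 * 2  # combined by the micro-beamformer
ADCS_PER_DTX = 4                # ADC outputs concatenated per data transmitter


def channel_reduction():
    """Return (number of output data links, overall reduction factor)."""
    # The three partitioning levels must account for every element.
    assert SUB_GROUPS * SUB_ARRAYS_PER_GROUP * ELEMENTS_PER_SUB_ARRAY == ELEMENTS
    adcs = SUB_GROUPS
    links = adcs // ADCS_PER_DTX
    return links, ELEMENTS // links


links, factor = channel_reduction()
print(links, factor)  # 2 128
```

The factor decomposes as 8 (multiplexing) × 4 (µBF) × 4 (ADC concatenation), which makes explicit that only the multiplexing step costs frame rate.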

#### 4.2.2. AFE

As illustrated in Fig. 4.3, the AFE amplifies the current signal $I_{in}$ of a transducer element (TD) and drives the following µBF ADC with an amplified current $I_{out}$. Its two-stage architecture, described in detail in [21], consists of a trans-impedance amplifier (TIA) with capacitive feedback that is capacitively coupled to a current amplifier (CA). The TIA provides the necessary low input impedance and low noise to sense the TD's signal current. The CA, which is formed by a current-mirror structure, provides the high output impedance required to drive the BI-based µBF. An external slope voltage ($V_{TGC}$) continuously tunes the AFE gain to perform TGC by adjusting the feedback/coupling capacitance and the current-mirror ratio. A dynamic-biasing circuit, also controlled by $V_{TGC}$, adjusts the bias current in proportion to the AFE gain, thus reducing the power consumption and ensuring a nearly constant AFE bandwidth [21].

Fig. 4.3: AFE architecture.

#### **4.2.3.** Passive boxcar-integration-based µBF ADC

Fig. 4.4(a) shows a circuit diagram of the BI-based  $\mu$ BF ADC. Similar to the BI-based  $\mu$ BF described in [22], it implements a sum-and-delay operation, in which the output currents from 4 AFEs are cyclically steered into one of 5 integration capacitors  $C_{INT1-5}$  to realize boxcar integration. The  $\mu$ BF delays are defined through the relative timing of the switch control signals  $D_{1-5}\langle 1:4 \rangle$ . For instance, the delay between channel 1 (CH<sub>1</sub>) and channel 2 (CH<sub>2</sub>) is set by the time-shift  $\tau_1$  between  $D_{1-5}\langle 1 \rangle$  and  $D_{1-5}\langle 2 \rangle$ . The integration capacitors are cyclically connected to the ADC by read switches  $R_{1-5}$ .

The switch timing is orchestrated by an 80-MHz clock (CK<sub>RX</sub>), providing a delay resolution of 12.5 ns and a boxcar integration time of 25 ns, corresponding to a sampling rate of 40 MS/s. To accommodate a maximum delay ( $\tau_{max}$ ) of 62.5 ns, five time-interleaved (TI) channels are used.
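To make the timing concrete, the following behavioural sketch models the boxcar-integrating delay-and-sum on the 12.5-ns clock grid. It is an idealized illustration, not the circuit: the rotation over the five capacitors is abstracted into one output sample per 25-ns window, the capacitor value `c_int` and the function name are assumptions, and the input currents are treated as piecewise constant per clock step.

```python
F_CK_RX = 80e6            # uBF clock (from the text)
T_STEP = 1.0 / F_CK_RX    # 12.5 ns delay resolution
STEPS_PER_SAMPLE = 2      # 25 ns boxcar window -> 40 MS/s
MAX_DELAY_STEPS = 5       # 62.5 ns maximum delay


def boxcar_das(currents, delay_steps, c_int):
    """Delay, boxcar-integrate and sum per-channel currents.

    currents    -- one list of current samples per channel, on the 12.5 ns grid
    delay_steps -- per-channel delay in clock steps (0..MAX_DELAY_STEPS)
    c_int       -- integration capacitance (hypothetical value)
    Returns one integrated voltage per 25 ns output sample.
    """
    if any(not 0 <= d <= MAX_DELAY_STEPS for d in delay_steps):
        raise ValueError("delay outside the supported range")
    n_ticks = min(len(ch) for ch in currents)
    n_out = (n_ticks - MAX_DELAY_STEPS) // STEPS_PER_SAMPLE
    out = []
    for n in range(n_out):
        charge = 0.0
        for ch, d in zip(currents, delay_steps):
            start = n * STEPS_PER_SAMPLE + d   # window shifted by the channel delay
            charge += sum(ch[start:start + STEPS_PER_SAMPLE]) * T_STEP
        out.append(charge / c_int)
    return out


# A 25 ns pulse that reaches each channel one clock step later than the previous one.
delays = [0, 1, 2, 3]
echo = [[1.0 if 4 + d <= t < 6 + d else 0.0 for t in range(16)] for d in delays]
steered = boxcar_das(echo, delays, c_int=T_STEP)
unsteered = boxcar_das(echo, [0, 0, 0, 0], c_int=T_STEP)
print(max(steered), max(unsteered))
```

With matching delays the four pulses land in the same integration window and add coherently, whereas with all delays set to zero the pulse energy is spread over several output samples.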

In contrast with the active integrators used in [22], our design employs passive boxcar integration to save area and power, while retaining the merits of boxcar integration such as the built-in anti-alias filtering (AAF). Moreover, this passive-integration topology allows capacitors $C_{INT1-5}$ to be connected directly to the following charge-sharing SAR ADC during the read phase $R_{1-5}$ [11]. To convert the single-ended signal to a differential input for the ADC, a set of dummy capacitors $C_{DMY1-5}$ is added, and an input common-mode feedback (ICMFB) circuit senses the common-mode (CM) voltage at the input of the ADC through capacitors $C_{SNS}$ and regulates it to a CM reference $V_{cm}$ through capacitors $C_{CM1-5}$, thus creating a differential signal charge on $C_{INT1-5}$ and $C_{DMY1-5}$.

The ADC consists of a capacitive DAC (CDAC), a comparator and SAR logic. It digitizes the signal charge by neutralizing it in a successive approximation process, in which the CDAC capacitors are connected in parallel or anti-parallel to the ADC's input [23]. During the MSB conversion, as depicted in Fig. 4.4(b), the SAR comparator first decides the polarity of the differential voltage ($\Delta V$) on the integration capacitors (e.g., $C_{INT1}$ and $C_{DMY1}$). Meanwhile, during $\Phi_{REF}$, the CDAC array is quickly charged to $V_{REF}$ by sharing charge with a pre-charged reservoir capacitor $C_{RSV}$. This charge-mode operation allows for very fast settling of the reference voltage within the MSB conversion, without adding extra time to the AD conversion. Afterward, the binary-scaled unit capacitors ($C_u$) are sequentially connected to the input of the SAR comparator ($V_{IP}$, $V_{IN}$) in either parallel or anti-parallel configuration for the next comparison, depending on the previous decision made by the comparator, until the next 9 bits plus 1 redundant bit [24] have all been decided, providing a total of 11 bits in one conversion cycle of the ADC. The redundant bit provides immunity to settling errors, such as those of the ICMFB and the CDAC. The associated correction is carried out in the background.
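The benefit of the redundant bit can be illustrated with a small behavioural model of the charge-neutralizing search. This is a sketch under assumptions: the capacitor weights below (a repeated weight of 16) are hypothetical, since the text does not give the actual CDAC scaling, and a settling error is modelled crudely as one wrong comparator decision.

```python
# Hypothetical CDAC weights in LSB units; the repeated 16 is the redundant step.
REDUNDANT_WEIGHTS = [256, 128, 64, 32, 16, 16, 8, 4, 2, 1]
BINARY_WEIGHTS = [256, 128, 64, 32, 16, 8, 4, 2, 1]


def sar_convert(signal, weights, flip_at=None):
    """Neutralize `signal` by adding or subtracting the weights in sequence.

    Each comparison decides the polarity of the remaining residue; the next
    weight is then applied with that polarity (parallel or anti-parallel).
    `flip_at` forces a wrong decision at that step to mimic a settling error.
    Returns (signed output code, final residue); code + residue == signal.
    """
    residue = signal
    code = 0
    for i, w in enumerate(weights):
        decision = 1 if residue >= 0 else -1
        if i == flip_at:
            decision = -decision          # injected decision error
        residue -= decision * w
        code += decision * w
    return code, residue


# A wrong first decision on a small input: recovered with redundancy, not without.
print(sar_convert(10, REDUNDANT_WEIGHTS, flip_at=0))
print(sar_convert(10, BINARY_WEIGHTS, flip_at=0))
```

In this model the redundant weight lets the later steps absorb an early error of up to 16 LSB, so the final residue stays within 1 LSB, while the purely binary search is left with a residue equal to the input plus one.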

Finally, the CDAC array is reset ($\Phi_{RST}$) in preparation for the next AD conversion. After digitization, the residual differential charge on the capacitors is nulled during reset phase $S_{1-5}$. Asynchronous SAR logic is employed to avoid distributing a high-frequency clock over the whole chip. This implementation combines the BI-based µBF with a SAR ADC in the charge domain, thereby eliminating the need for both a power-hungry ADC driver and an explicit AAF.

#### 4.2.4. DATA TRANSMITTER (D-TX)

Realization of an efficient data transmitter in a mainstream BCD technology is challenging due to the relatively low cut-off frequencies of the MOS transistors. For example, in a 180-nm process, they are about 60/30 GHz for NMOS/PMOS, respectively [25], resulting in a quick drop of the data-link energy efficiency, defined as the ratio of power consumption to data rate, for an LVDS transceiver operating at sub-10 GHz frequencies [26]. MLS, e.g., *M*-level pulse amplitude modulation (PAM-M), transmits/receives *n*-bit data per symbol and compresses the required bandwidth by a factor of *n*, where *n* equals $\log_2(M)$, significantly improving the energy efficiency at a specific data rate. However, higher-order PAM reduces the signal level and the related SNR per bit, defined as the ratio of signal energy per bit to channel noise density ($E_b/N_0$), resulting in a poorer bit error rate (BER) [27]. Insufficient channel bandwidth further reduces $E_b/N_0$ and the associated eye width, and causes inter-symbol interference (ISI). All these restrict the use of high-order PAM for high-quality, middle/long-range communications. For example, a BER < 10<sup>-12</sup> is required for a long distance of > 25 m, according to the 1000BASE-X standard [28]. As a result, PAM-4/8 are more extensively used than PAM-16 or even higher orders in the prior art [26], [29], [30].

However, PAM-16 is adopted in this work because: (a) the data-link channel in our application, i.e., the inter-chip communication within a single package, is formed by very short interconnections such as bonding wires, which only inflict slight attenuation and ISI on the transmitted signals [31]; (b) a relatively poor BER objective of 10<sup>-6</sup> is quite tolerable for ultrasound imaging applications [32], thereby easing the requirement on the SNR per bit; (c) a PAM-16 receiver with the reported energy efficiency of 1.5 pJ/bit could be realized by leveraging more advanced technologies [33], [34], thus obtaining an overall data-link power optimization.

Fig. 4.5: (a) Data transmitter architecture consisting of CDR, FIFO and a 4-bit DAC. (b) The associated timing diagram.

As depicted in Fig. 4.5(a), the differential outputs of four ADCs are transmitted to a D-TX at the chip's periphery, where clock data recovery (CDR) circuits first recover the clock and data for each ADC, followed by 4 first-in, first-out buffers (FIFOs) that synchronize the data with the MLS clock signal (CK<sub>MLS</sub>) and insert a pilot bit (e.g., Pilot$\langle 1 \rangle$) periodically for every 11 ADC data bits, before concatenating four such bitstreams into 4-bit data [see Fig. 4.5(b)]. The pilot pattern facilitates a fast recognition of the data header on the data RX (D-RX) side. A 4-bit current-steering DAC then converts the 4-bit data into a 16-level current signal and drives a 100-$\Omega$ load resistor, allowing an external PAM-16 data RX to recover the 4-bit data, as will be elaborated in Section 4.3.5. DC-coupled transmission is chosen instead of AC coupling, avoiding the need for a dedicated DC-balance encoder [35] and the related digital circuitry. Unlike an LVDS data link, which would require a power-hungry PLL or DLL to generate the clock for an LVDS transmitter [11], [36], the PAM-16 D-TX is directly driven by a low-frequency, 480-MHz clock (CK<sub>MLS</sub>), thus making it a compact and low-power data-link scheme.
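The framing and the resulting symbol rate can be sketched as follows. The pilot position (first in each 12-bit frame), the pilot value, the MSB-first bit order and the mapping of ADC *k* to bit *k* of the symbol are all assumptions made for illustration; the text only states that one pilot bit accompanies every 11 data bits and that four bitstreams form one 4-bit symbol stream.

```python
ADC_BITS = 11        # bits per conversion (from the text)
F_SAMPLE = 40e6      # ADC sampling rate (from the text)
ADCS_PER_DTX = 4     # bitstreams combined per D-TX (from the text)


def frame_symbols(words, pilot=1):
    """Turn one 11-bit word per ADC into twelve PAM-16 symbols (0..15).

    Assumed layout: each stream is [pilot, MSB, ..., LSB] and ADC k supplies
    bit k of every symbol.
    """
    if len(words) != ADCS_PER_DTX:
        raise ValueError("expected one word per ADC")
    streams = []
    for w in words:
        data = [(w >> (ADC_BITS - 1 - i)) & 1 for i in range(ADC_BITS)]
        streams.append([pilot] + data)
    symbols = []
    for t in range(ADC_BITS + 1):
        sym = 0
        for k, bits in enumerate(streams):
            sym |= bits[t] << k
        symbols.append(sym)
    return symbols


symbol_rate = (ADC_BITS + 1) * F_SAMPLE      # 12 symbols per 25 ns conversion
bit_rate = symbol_rate * ADCS_PER_DTX        # 4 bits per PAM-16 symbol
print(symbol_rate / 1e6, bit_rate / 1e9)     # 480.0 1.92
```

Under these assumptions, 11 data bits plus one pilot bit per 25-ns conversion give 480 MS/s, which is consistent with the 480-MHz CK<sub>MLS</sub>; a binary link carrying the same payload would need to run four times faster.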



Fig. 4.6: A quadrant of the daisy-chained shift register that stores the beamforming delays for 8×8 transducer elements, and the associated element-level TX architecture.

#### **4.2.5.** TX ARCHITECTURE

As depicted in Fig. 4.6, to generate 20-V pulses with TX beamforming, an element-level TX comprising an 8-bit shift register, an 8-bit delay counter, a 3-bit pulse-number counter, and an HV pulser is employed. Controlled by a 72-MHz clock signal $CK_{TX}$, it achieves a minimum delay resolution of 13.9 ns, a maximum of 255 delay steps, and a programmable number of pulses from 1 to 8. The 8-bit shift registers are daisy-chained within each quadrant of the pitch-matched layout, corresponding to 8×8 transducer elements, and can be programmed using a clock ($CK_{PRG}$) with a frequency of up to 120 MHz, resulting in a total programming time of 4.8 µs for loading the TX beamforming delay profile for all four quadrants. At the start of the TX phase, the 8-bit delay counter loads the delay profile from the shift register, and sends an enable signal ($EN_{PNC}$) to the pulse-number counter after the counting ends. Controlled by a 9-MHz clock signal derived from $CK_{TX}$, the pulse-number counter generates a number of pulses to drive the HV pulser, and provides a pulse duration of 55.6 ns. This TX scheme allows for transmitting various TX beam patterns, such as plane-wave, diverging-wave, or focused-wave patterns.
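As an illustration of how a focused-wave delay profile maps onto the 8-bit delay counters, the sketch below quantizes geometric focusing delays to $CK_{TX}$ steps for a 16 × 16 array with 125-µm pitch. The speed of sound (1540 m/s), the on-axis focus and the function name are assumptions introduced here; the text does not describe how the delay profiles are computed.

```python
import math

PITCH_M = 125e-6        # element pitch (from the text)
N_SIDE = 16             # 16 x 16 array (from the text)
F_CK_TX = 72e6          # TX delay clock, 13.9 ns per step (from the text)
MAX_DELAY_STEPS = 255   # range of the 8-bit delay counter (from the text)
SOUND_SPEED = 1540.0    # m/s, assumed soft-tissue value


def focused_delay_steps(focus_depth_m):
    """Per-element delay counts for an on-axis focus at the given depth.

    The element farthest from the focus fires first (0 steps); closer
    elements wait so that all wavefronts arrive at the focus together.
    """
    centre = (N_SIDE - 1) / 2
    dist = [[math.sqrt(((col - centre) * PITCH_M) ** 2
                       + ((row - centre) * PITCH_M) ** 2
                       + focus_depth_m ** 2)
             for col in range(N_SIDE)]
            for row in range(N_SIDE)]
    farthest = max(max(r) for r in dist)
    steps = [[round((farthest - d) / SOUND_SPEED * F_CK_TX) for d in r]
             for r in dist]
    if max(max(r) for r in steps) > MAX_DELAY_STEPS:
        raise ValueError("profile exceeds the 8-bit delay range")
    return steps


profile = focused_delay_steps(10e-3)
print(profile[0][0], profile[7][7])   # corner element, central element
```

For such a small aperture the required delays span only a few clock steps at a 10-mm focus, so the 255-step range leaves ample room for steered or shallower foci.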

## **4.3.** CIRCUIT DESIGN

#### **4.3.1.** PASSIVE AMPLIFICATION OF THE µBF ADC

Fig. 4.7: Passive amplification and the associated timing diagram.

Fig. 4.8: (a) Circuit diagram of the dynamic comparator. (b) The associated waveforms as a function of time.

Capacitors $C_{INT1-5}$ and $C_{DMY1-5}$ shown in Fig. 4.4(a) are all varactors built using NMOS and PMOS capacitors [24]. As depicted in Fig. 4.7, in each TI channel (TI<sub>*i*</sub>), these capacitors are driven into their inversion region during µBF by connecting the source/drain (S/D) of the NMOS to ground, while connecting the S/D of the PMOS to the supply, thus maximizing their capacitances and effectively reducing the voltage swing during the boxcar integration. In the following read phase R<sub>*i*</sub>, when the capacitors are connected to the input nodes of the SAR comparator (V<sub>IP</sub>, V<sub>IN</sub>), they are driven into their depletion region by reversing the S/D voltages, thus reducing the capacitances by a factor of about 3. Due to charge conservation, this causes the associated differential voltage on the capacitors to be amplified [37], which effectively relaxes the noise requirement of the SAR comparator [24].
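As a rough first-order check of the passive gain, assume ideal charge conservation and neglect parasitic and coupling capacitances (a simplification not made in the text). With $C_{inv}$ and $C_{dep}$ denoting the varactor capacitance in inversion and in depletion (symbols introduced here for illustration), and $\Delta V_{int}$ and $\Delta V_{read}$ the differential voltages at the end of the integration and during the read phase, the signal charge $Q$ satisfies

$$Q = C_{inv} \, \Delta V_{int} = C_{dep} \, \Delta V_{read} \quad \Rightarrow \quad \Delta V_{read} = \frac{C_{inv}}{C_{dep}} \, \Delta V_{int} \approx 3 \, \Delta V_{int}$$

so the comparator sees roughly three times the voltage that was present on the capacitors during integration, without any active gain stage.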

#### **4.3.2.** Dynamic Comparator of the SAR ADC

Fig. 4.8(a) shows that the dynamic comparator, consisting of a preamplifier and a regenerative latch, decides the polarity during the AD conversion, and operates in a two-step fashion: first, amplification of the differential input signal ( $V_{IP}$ - $V_{IN}$ ) and then regeneration of the amplified signal ( $V_{AP}$ - $V_{AN}$ ) via a latch structure.

The preamplifier is based on a dynamic integrator [38] that employs a cascoded structure, realizing a double integration in contrast with a single-integration-based preamplifier, such as that used in [39], thus effectively increasing the gain of the preamplifier. Moreover, in contrast with [39], a tail current I<sub>0</sub> is employed in our design, which, in combination with the aforementioned ICMFB, rejects CM disturbances, such as those from the power supply.

As depicted in Fig. 4.8(b), the nodes $V_{AP,AN}$ and $V_{BP,BN}$ are reset to ground at the beginning, followed by a double integration which amplifies the input signal. During the first integration, the currents from the differential input pair charge the parasitic capacitors $C_{p1}$ at the source nodes of $M_{3,4}$ until the voltages of these nodes reach the threshold voltage $V_{thp}$ and drive $M_{3,4}$ into the saturation region, hence lowering the impedance seen at the nodes $V_{BP,BN}$, and diverting the currents towards nodes $V_{AP,AN}$. Afterward, the second integration happens at the source nodes of $M_{5,6}$, where the currents charge the parasitic capacitors $C_{p2}$ until $V_{AP,AN}$ reach the threshold voltage of $M_{5,6}$, followed by a latch regeneration [39]. The gain of the preamplifier $G_{PA}$ can be expressed as

$$G_{PA} \approx \frac{2 g_m}{I_0} \left( V_{thn} + \frac{C_{p1}}{C_{p2}} V_{thp} \right) \tag{4.1}$$

where $g_m$ and $I_0$ are the transconductance of the input differential pair and the tail current, respectively. The resulting large gain mitigates the non-idealities of the regenerative latch, such as its offset, noise and metastability [40]. In addition, the extended voltage headroom allows for a large tail current to charge in a specific time period that is strictly limited by the AD conversion speed, thus effectively reducing the input-referred noise of the comparator.

#### **4.3.3.** Charge-mode Reference Generation

The ADC reference can be generated externally by using an off-chip voltage buffer [23], [24], but distributing such a reference voltage to many ADCs throughout the whole chip is demanding. Instead, the voltage reference can be generated locally by a fast but power-hungry voltage buffer [16] to achieve a very short pre-charge time for the CDAC array. The power consumption can be reduced by employing two time-interleaved CDAC arrays [11], allowing for precharging one of the arrays to a reference voltage while using the other to perform AD conversion, but at the expense of a 2× larger CDAC area.

Fig. 4.9(a) shows a charge-mode reference scheme that solves the power and area problems by employing an area-efficient, precharged MOS capacitor  $C_{RSV}$  as a reservoir to quickly set a reference for the subsequent AD conversion. A servo loop is activated during the TX period to calibrate the charge current  $I_{CHG}$  for the subsequent RX period. As shown in Fig. 4.9(b), during the charge phase  $\Phi_{CHG}$  of the TX period,  $I_{CHG}$  charges the parallel-connected CDAC and the reservoir NMOS capacitor  $C_{RSV}$  before a latch comparator compares the resulting voltage on these capacitors with a reference voltage  $V_{REF}$ , controlled by a clock signal  $CK_{SL}$ . In the subsequent phase  $\Phi_{RST2}$ , the servo loop calibrates the current  $I_{CHG}$  by tuning the overdrive voltage of PMOS transistor  $M_{CHG}$  via a charge pump comprised of 125-nA sourcing and sinking current sources and a PMOS capacitor  $C_{SH}$ , in accordance with the decision made by the comparator. Meanwhile, CDAC and  $C_{RSV}$  are reset in readiness for the next charging. This process repeats until the voltage on these capacitors equals  $V_{REF}$ . At the end of the TX period, the total amount of charge  $Q_{REF}$  stored on these capacitors can be expressed as

$$Q_{REF} = (C_{DAC} + C_{RSV}) \cdot V_{REF} = I_{CHG} \cdot T_{CHG} \tag{4.2}$$

where $C_{DAC}$ and $T_{CHG}$ are the total capacitance of the CDAC array and the charging time accurately controlled by $\Phi_{CHG}$, respectively.

Fig. 4.9: (a) Circuit diagram of the charge-mode reference generator and (b) the associated timing diagram.

Throughout the RX period, the overdrive required to generate $I_{CHG}$ is held on the PMOS capacitor $C_{SH}$, so that the same amount of charge is delivered to $C_{RSV}$ during $\Phi_{CHG}$. During $\Phi_{REF}$, at the start of each AD conversion, this charge is redistributed with the CDAC. The charge as expressed in (4.2) is conserved during the charging and redistribution process, providing a well-defined reference voltage at the end of $\Phi_{REF}$.
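The calibration and the subsequent charge redistribution can be mimicked with a small numerical model built on (4.2). The capacitor values, reference voltage, starting current and step size below are hypothetical, and the charge pump acting on the gate of $M_{CHG}$ is simplified to a fixed current step per comparator decision; only $T_{CHG}$ = 12.5 ns is taken from the text.

```python
C_DAC = 1e-12      # total CDAC capacitance (hypothetical value)
C_RSV = 4e-12      # reservoir capacitance (hypothetical value)
V_REF = 0.8        # target reference voltage (hypothetical value)
T_CHG = 12.5e-9    # charging time set by the charge phase (from the text)


def calibrate_current(i_start, i_step, cycles):
    """Bang-bang servo used in the TX period: charge CDAC and C_RSV in parallel,
    compare the result with V_REF, then nudge the charge current up or down."""
    i_chg = i_start
    for _ in range(cycles):
        v = i_chg * T_CHG / (C_DAC + C_RSV)
        i_chg += i_step if v < V_REF else -i_step
    return i_chg


def rx_reference(i_chg):
    """RX period: charge C_RSV alone, then share with a CDAC assumed discharged."""
    q_ref = i_chg * T_CHG
    v_rsv = q_ref / C_RSV               # voltage on the reservoir before sharing
    v_shared = q_ref / (C_DAC + C_RSV)  # reference after charge redistribution
    return v_rsv, v_shared


i_ideal = (C_DAC + C_RSV) * V_REF / T_CHG
i_cal = calibrate_current(i_start=200e-6, i_step=1e-6, cycles=400)
v_rsv, v_shared = rx_reference(i_cal)
print(i_ideal, i_cal, v_rsv, v_shared)
```

Because the same charge is delivered in both periods, the shared voltage returns to the calibrated reference, while the reservoir alone momentarily sits at a higher voltage, which is the reason for sizing $C_{RSV}$ to avoid clamping.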

A charging time $T_{CHG}$ of 12.5 ns is adopted, which effectively filters out the high-frequency noise of the current source $M_{CHG}$ owing to a boxcar filtering mechanism similar to that of the µBF, thereby reducing the associated reference noise. $C_{RSV}$ is sized to avoid voltage clamping during $\Phi_{CHG}$, while it still occupies a very small area compared to the CDAC, which is made of metal-oxide-metal (MOM) capacitors in our design, thanks to the much higher capacitance density of MOS capacitors. The fast charge-mode operation allows the charge sharing between the CDAC and $C_{RSV}$ to complete within the MSB conversion, when the CDAC array is disengaged from the ADC's input, hence without affecting the total required AD conversion time.



Fig. 4.10: (a) Circuit diagram of the ICMFB and (b) the related timing details.

The active blocks in the servo loop are all deactivated (EN<sub>CAL</sub>=0) during RX. These features lead to a very low-power, low-noise and area-efficient reference scheme.

#### **4.3.4.** INPUT COMMON-MODE FEEDBACK

As discussed in Section 4.3.1, the single-ended output of the µBF on capacitors C<sub>INT1-5</sub> is converted into a differential input to the ADC by means of dummy capacitors C<sub>DMY1-5</sub>, which would result in large CM variations. Fig. 4.10(a) depicts the circuit diagram of the ICMFB, which cancels out the CM variations through a capacitive feedback loop formed by C<sub>CMi</sub>, C<sub>SNS</sub> and an operational transconductance amplifier (OTA).

Fig. 4.10(b) shows the timing diagram of TI<sub>*i*</sub>. The integration capacitor  $C_{INTi}$ , the dummy capacitor  $C_{DMYi}$  and the coupling capacitors  $C_{CMi}$  are first reset during phase  $S_i$ , followed by the µBF boxcar integration on  $C_{INTi}$ . During the boxcar integration, the CM voltage on  $C_{INTi}$  and  $C_{DMYi}$  gradually deviates from  $V_{cm}$ . During the subsequent read phase  $R_i$ , the SAR ADC starts to digitize TI<sub>*i*</sub>, and the CM voltage detected through capacitors  $C_{SNS}$  is fed to the negative input of the OTA, which adjusts the CM voltage on  $C_{INTi}$  and  $C_{DMYi}$  through the coupling capacitors  $C_{CMi}$  until the CM voltage equals  $V_{cm}$ . Once the AD conversion is completed, capacitors  $C_{SNS}$  are reset in  $\Phi_{RST}$  as preparation for the CM cancellation of the next TI channel. Notably, the use of the 1-bit redundancy in the AD conversion significantly relaxes the settling requirement of the ICMFB loop to about one ADC conversion period, i.e., 25 ns. As a result, the power consumption of the OTA is also reduced.



Fig. 4.11: Circuit diagram of the 4-bit current-steering DAC with insets showing a TSPC DFF used in the decoders and PRWS, and the details of a unit current cell.

#### **4.3.5.** PAM-16 DAC

As depicted in Fig. 4.11, a 4-bit current-steering DAC is used as the output driver of the D-TX, which receives the 4-bit data from the FIFO (as discussed in Section 4.2.4) or, for test purposes, from a 4-bit pseudo-random number generator (PRNG). The DAC produces a differential output current ($DO_P/DO_N$) to drive an off-chip 100-$\Omega$ load resistor, providing a 16-level voltage signal at the output. The PRNG, based on [41], generates the 2<sup>10</sup>-1 random number sequence needed for a BER measurement, as will be discussed in Section 4.4.2.

The DAC consists of a matrix of $4 \times 4$ unit current cells, and row/column decoders which convert the received binary code (B<sub>1-4</sub>) into two thermometer codes (ROW<sub>1-4</sub> and COL<sub>1-4</sub>). The current cells comprise a complementary output stage and a local decoder which changes the output polarity in a thermometer fashion, according to the corresponding codes from the row/column decoders [42]. The logic cells of the decoders and PRNG are specially optimized for low-power operation and small area. For example, all D flip-flops (DFFs) are built from true single-phase-clock (TSPC) dynamic latches [43], which can operate reliably at a 1.2-V supply with much less power and area than a regular master-slave DFF.

The DA conversion is synchronized with a 480-MHz clock (CK<sub>MLS</sub>), and the input latch of the current cells improves timing accuracy by speeding up the toggling of the differential switch [44], which diverts the currents of the cascoded current sources to the load resistor. Each current source provides a current I<sub>0</sub> of about 200 $\mu$A, resulting in a maximum peak-to-peak amplitude of 600 mV with 40 mV per LSB step, at a 1.2-V power supply.
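The quoted LSB step and peak-to-peak amplitude can be cross-checked with a short calculation. The sketch below is only an idealized model: it assumes that one code step moves one unit current from one output to the other (a differential change of 2·I<sub>0</sub>) and that this difference develops across the 100-Ω load. The variable names are illustrative and do not come from the design.

```python
# Idealized check of the PAM-16 output levels (assumed model, not the actual circuit).
I_UNIT_A = 200e-6     # unit current I_0, about 200 uA (from the text)
R_LOAD_OHM = 100.0    # off-chip load resistor (from the text)
N_LEVELS = 16         # PAM-16

# Assumption: steering one unit cell changes the differential current by 2*I_0.
lsb_step_v = 2 * I_UNIT_A * R_LOAD_OHM

# 16 levels are separated by 15 LSB steps.
peak_to_peak_v = (N_LEVELS - 1) * lsb_step_v

print(f"LSB step: {lsb_step_v * 1e3:.1f} mV")
print(f"Peak-to-peak: {peak_to_peak_v * 1e3:.1f} mV")
```

Under these assumptions the result agrees with the 40-mV step and 600-mV maximum amplitude stated above.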



Fig. 4.12: (a) Circuit diagram of a HV pulser and (b) the associated control clock signals and an output waveform of 3-cycle pulses.

#### **4.3.6.** HV PULSER

An element-level HV pulser consisting of high-side/low-side MOS transistors $M_{HS}$ and $M_{LS}$, a T/R switch $M_{TR}$, and the associated HV level shifter is shown in Fig. 4.12(a), where $M_{HS}$ and $M_{TR}$ are laterally-diffused metal-oxide semiconductor (LDMOS) transistors with a 20-V breakdown voltage between drain and source, and a 5-V breakdown voltage between gate and source. The HV level shifter is powered by an external power supply VDD<sub>HV</sub> and an internal source follower located at the periphery, providing a -5-V supply VSS<sub>HV</sub> relative to VDD<sub>HV</sub> [45]. The TX pulse signal $\Phi_{PU}$ couples to two 5-V inverters with the cross-coupling to the sources of their NMOS transistors via two level-shifting capacitors $C_{SL}$, followed by a set-reset (SR) latch that holds the on/off state for the high-side PMOS $M_{HS}$ and introduces hysteresis to reject disturbances, thus making the pulser less sensitive to transients on VDD<sub>HV</sub>, e.g., the large switching transients caused by other pulsers toggling simultaneously during the TX beamforming.

Since the low-side  $M_{LS}$  is always on during TX, the HV MOS transistors  $M_{HS}$  and  $M_{TR}$  behave like an inverter controlled by the pulse signal  $\Phi_{PU}$ , hence generating the 20-V unipolar pulse to drive the transducer, as depicted in Fig. 4.12(b). In the following RX period, the T/R switch  $M_{TR}$  is kept on while  $M_{LS}$  is off, allowing the current signal of the transducer to be fed into the virtual ground of the AFE via the multiplexer  $M_{MUX}$  for subsequent signal processing.

## **4.4.** EXPERIMENTAL RESULTS

#### **4.4.1.** ASIC PROTOTYPE

Fig. 4.13(a) shows a transceiver ASIC prototype fabricated in a 180-nm BCD process, in the center of which a 16×16 pad array surrounded by two outer rings of dummy pads is located for bonding with a PZT transducer array, using a similar technique as described in [4]. Fig. 4.13(b) reveals the floor plan of the RX/TX circuitry that interfaces with one of the sub-groups, occupying an area of $500 \times 1000 \,\mu\text{m}^2$ with the area breakdown shown in Fig. 4.14(a). A tiling of eight such sub-group circuitries makes up the pitch-matched region for a $16 \times 16$ transducer array, while two D-TXs, each serving 4 sub-groups, are situated at the periphery of the ASIC.

Fig. 4.13: (a) Micrograph of the transceiver ASIC showing the pitch-matched and the peripheral region, with (b) inset showing the pitch-matched TX/RX circuitry for 8 sub-arrays of 2×2. (c) A prototype of the transceiver ASIC with PZT array built on top.

Fig. 4.14: (a) Area breakdown and (b) power breakdown of 8 sub-arrays of 2×2 elements.

Per element, the RX circuitry occupies $0.048 \text{ mm}^2$, of which the µBF and ADC occupy $0.0045 \text{ mm}^2$ and $0.0054 \text{ mm}^2$, respectively, revealing a compact solution for channel-count reduction. For an objective comparison with state-of-the-art works that do not apply multiplexing, the 8:1 multiplexing has not been taken into account, so the per-element area and power are obtained by dividing the sub-group figures by 4.

At the maximum AFE gain, the RX circuitry of a sub-group consumes 6.64 mW, while the power is reduced to an average of 5.7 mW when the AFE is dynamically biased to perform TGC over a 36-dB gain range [21]. The power breakdown of the sub-group RX circuitry is shown in Fig. 4.14(b), in which the $\mu$BF and ADC consume 0.72 mW and 1.82 mW, respectively, both operating at 1.8-V supplies. Compared to the RX circuitry, the power consumption of the TX circuitry is negligible as it operates at a very low duty cycle. As will be elaborated later, each D-TX consumes 6.4 mW at a 1.2-V power supply, resulting in an overall power consumption of 2.06 mW/element at maximum AFE gain, and 1.83 mW/element when performing the TGC.
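The per-element power figures follow from the numbers given in this section. As a minimal sketch, assuming that each D-TX is shared equally by the 4 sub-groups it serves and that the sub-group total is divided by 4 (the 8:1 multiplexing is not counted), the calculation reads as follows. The function and constant names are illustrative.

```python
# Per-element power from the sub-group figures quoted in the text.
RX_POWER_MAX_GAIN_MW = 6.64   # sub-group RX power at maximum AFE gain
RX_POWER_TGC_MW = 5.7         # average sub-group RX power with TGC
DTX_POWER_MW = 6.4            # power of one D-TX
SUBGROUPS_PER_DTX = 4         # each D-TX serves 4 sub-groups
DIVISION_FACTOR = 4           # 8:1 multiplexing not taken into account


def power_per_element_mw(rx_power_mw: float) -> float:
    """Sub-group RX power plus its D-TX share, divided by 4."""
    dtx_share_mw = DTX_POWER_MW / SUBGROUPS_PER_DTX
    return (rx_power_mw + dtx_share_mw) / DIVISION_FACTOR


p_max_gain = power_per_element_mw(RX_POWER_MAX_GAIN_MW)  # about 2.06 mW
p_tgc = power_per_element_mw(RX_POWER_TGC_MW)            # about 1.825 mW, quoted as 1.83 mW

print(f"Max gain: {p_max_gain:.3f} mW/element, with TGC: {p_tgc:.3f} mW/element")
```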

A prototype of the transceiver ASIC with a PZT array built on top is shown in Fig. 4.13(c); the two are connected via the aforementioned transducer bonding pads. The chip was wire-bonded to a daughter board via the peripheral pads for both electrical and acoustic measurements. The daughter board was mounted on a mother board containing an FPGA, which controls the TX/RX of the ASIC during the measurements, and also synchronizes data transmission and reception between the ASIC and an oscilloscope. Rather than a dedicated D-RX, as a proof of concept, an oscilloscope equipped with an active differential probe was used to acquire the PAM-16 output of the D-TX. The D-TX output was connected to the probe via 1-mm bond wires to the daughter board and a 32-mm FR4 PCB trace, terminated by a $100-\Omega$ resistor. The acquired data was uploaded to a PC for subsequent data processing, such as data decoding and image reconstruction.

#### **4.4.2.** ELECTRICAL CHARACTERIZATION

The gain transfer function of a sub-group RX channel was measured by applying voltage signals to four 1-pF on-chip capacitors that emulate 4 transducer elements. The 16-level data of the D-TX captured by the oscilloscope was then uploaded to a PC for decoding and subsequent data processing. To quantify the transfer of the AFE, the readout of the ADC was converted to equivalent AFE output voltages using the 0.5-V reference voltage set by the internal charge-mode reference generator. Thus, the transfer function for different TGC control voltages ($V_{TGC}$) was extracted as depicted in Fig. 4.15(a), together with a -3-dB curve showing the bandwidth variation for different TGC gains. The -3-dB bandwidth as a function of $V_{TGC}$ is plotted separately in Fig. 4.15(b), showing a minimum bandwidth of 14.3 MHz at a $V_{TGC}$ of 1.1 V, and a maximum bandwidth of 15.8 MHz at a $V_{TGC}$ of 0.5 V. Since the bandwidth of the sub-group RX is about 120% around 9 MHz, which is much wider than the bandwidth of the transducer used in our application, the attenuation of the ultrasound signals is minimal. Fig. 4.15(b) also reveals that the RX gain at 9 MHz is a linear-in-dB function of $V_{TGC}$ from 0.5 V to 1.1 V, leading to a positive slope of 60 dB/V within the total 36-dB gain range. To ensure accurate TGC, the TGC control voltage is restricted to between 0.5 V and 1.1 V in our design (e.g., the maximum AFE gain is achieved at $V_{TGC} = 1.1 \text{ V}$).
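The linear-in-dB relation between the control voltage and the gain can be written down directly from the quoted slope and voltage limits. The sketch below expresses the gain relative to the minimum-gain setting, since the absolute gain is not restated here; the function name and the choice to reject out-of-range voltages are assumptions made for illustration.

```python
# Linear-in-dB TGC gain relative to the minimum-gain setting (illustrative model).
SLOPE_DB_PER_V = 60.0   # measured slope at 9 MHz (from the text)
V_TGC_MIN = 0.5         # lower limit of the TGC control voltage in volts
V_TGC_MAX = 1.1         # upper limit of the TGC control voltage in volts


def relative_gain_db(v_tgc: float) -> float:
    """Gain in dB above the minimum gain for a control voltage within the allowed range."""
    if not V_TGC_MIN <= v_tgc <= V_TGC_MAX:
        raise ValueError("V_TGC is restricted to 0.5 V .. 1.1 V")
    return SLOPE_DB_PER_V * (v_tgc - V_TGC_MIN)


gain_range_db = relative_gain_db(V_TGC_MAX)
print(f"Gain range: {gain_range_db:.1f} dB")
```

With the 60-dB/V slope over the 0.6-V control range, this reproduces the total 36-dB gain range.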

Fig. 4.15: (a) Measured RX gain transfer function. (b) Extracted bandwidth as a function of TGC control voltage V<sub>TGC</sub>.

Fig. 4.16(a) shows the normalized output spectrum obtained by measuring the output of a sub-group RX channel at the maximum AFE gain. The measurement was taken by feeding the four AFEs with 8.9-MHz sinusoidal currents, each with a peak value of 1 pA, via the same dummy capacitors mentioned earlier. A peak SNR of 50.7 dB was attained over a bandwidth of 5–13 MHz, representing an 89% bandwidth centered at 9 MHz. Two tones, located at $\frac{2}{5}f_s - f_{sig}$ and $\frac{1}{5}f_s$, are observable, where $f_s$ is the sampling frequency of the µBF ADC, and $f_{sig}$ is the frequency of the input signals. The presence of the $\frac{1}{5}f_s$ tone suggests there is a mismatch in the five integration capacitors of the µBF, whereas the $\frac{2}{5}f_s - f_{sig}$ tone is caused by intermodulation between the input signal and the disturbance induced by this capacitor mismatch. Because of the low power of these tones, they have a negligible impact on image quality.
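The location of the two spurious tones and the fractional bandwidth can be checked numerically. The sketch below assumes a sampling frequency of 40 MS/s, as listed for this work in the comparison table; the variable names are illustrative.

```python
# Spurious-tone locations and fractional bandwidth (f_s = 40 MS/s is taken from the comparison table).
F_S_HZ = 40e6         # sampling frequency of the uBF ADC
F_SIG_HZ = 8.9e6      # input signal frequency
BAND_LO_HZ = 5e6      # lower edge of the evaluated band
BAND_HI_HZ = 13e6     # upper edge of the evaluated band
F_CENTER_HZ = 9e6     # center frequency

mismatch_tone_hz = F_S_HZ / 5                    # tone attributed to integration-capacitor mismatch
intermod_tone_hz = 2 * F_S_HZ / 5 - F_SIG_HZ     # intermodulation with the input signal
fractional_bw = (BAND_HI_HZ - BAND_LO_HZ) / F_CENTER_HZ

print(f"f_s/5 tone: {mismatch_tone_hz / 1e6:.1f} MHz")
print(f"2f_s/5 - f_sig tone: {intermod_tone_hz / 1e6:.1f} MHz")
print(f"Fractional bandwidth: {fractional_bw * 100:.0f}%")
```

Both tones (8 MHz and 7.1 MHz under this assumption) fall inside the 5–13 MHz band, which is why they are visible in the in-band spectrum.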

Fig. 4.16: (a) Measured output spectrum of the RX sub-array at the maximum TGC gain. (b) Measured input-referred noise spectra with 8-MHz tones removed. (c) Input-referred in-band noise as a function of $V_{TGC}$. (d) Measured SNR as a function of the input current for different $V_{TGC}$.

To calculate the input-referred noise, the noise density at the ADC output was first measured while grounding the four 1-pF capacitors connected to the inputs of the AFEs. The resulting output noise density was then divided by the measured gain transfer function to obtain the input-referred noise spectra shown in Fig. 4.16(b). It should be noted that the 8-MHz tone, resulting from the mentioned mismatch, has been removed to provide a clearer view of all the curves. Fig. 4.16(c) shows the input-referred in-band noise as a function of $V_{TGC}$, which was derived by averaging over the frequency range of 5 MHz to 13 MHz. The sub-group RX achieves a noise density of 0.7 pA/$\sqrt{Hz}$ and 31 pA/$\sqrt{Hz}$ at the maximum and minimum TGC control voltage, respectively. The SNR as a function of input current for different TGC control voltages is shown in Fig. 4.16(d), revealing a dynamic range (DR) of 83 dB, defined as the ratio of the maximum input signal level at the 1-dB compression point (P1dB) to the minimum detectable input signal level, at which the SNR drops to 0 dB.
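To give a feel for these noise densities, the sketch below converts them into an integrated in-band noise current. It assumes, purely for illustration, that the density is flat across the 5–13 MHz band and equal to the quoted in-band average; the measured spectra are not necessarily flat.

```python
import math

# Integrated in-band noise from the averaged densities (flat-density assumption).
BAND_LO_HZ = 5e6
BAND_HI_HZ = 13e6
DENSITY_MAX_GAIN = 0.7e-12   # A/sqrt(Hz) at the maximum TGC control voltage
DENSITY_MIN_GAIN = 31e-12    # A/sqrt(Hz) at the minimum TGC control voltage


def integrated_noise_a(density_a_per_rt_hz: float) -> float:
    """RMS noise current over the band, assuming a flat noise density."""
    return density_a_per_rt_hz * math.sqrt(BAND_HI_HZ - BAND_LO_HZ)


noise_max_gain_a = integrated_noise_a(DENSITY_MAX_GAIN)
noise_min_gain_a = integrated_noise_a(DENSITY_MIN_GAIN)
noise_ratio_db = 20 * math.log10(DENSITY_MIN_GAIN / DENSITY_MAX_GAIN)

print(f"Max gain: {noise_max_gain_a * 1e9:.2f} nA rms")
print(f"Min gain: {noise_min_gain_a * 1e9:.1f} nA rms")
print(f"Noise ratio between settings: {noise_ratio_db:.1f} dB")
```

Under the flat-density assumption the input-referred noise at the two extremes differs by roughly 33 dB, which is of the same order as the 36-dB TGC gain range.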



Fig. 4.17: (a) Eye diagram of D-TX at 1.92 Gb/s. (b) Measured outputs of four TX pulsers with uniform delays of 13.8 ns, and inset showing a delay of 13.8 ns and a maximum amplitude of 20 V.

To characterize the PAM-16 D-TX, the output signal of the D-TX was recorded using the oscilloscope for data analysis on a PC. As depicted in Fig. 4.17(a), the eye diagram, measured at a data rate of 1.92 Gb/s, distinctly shows 16 voltage levels and 15 eyes with a height of > 20.1 mV and a width of > 0.91 ns, where the eye height is extracted at the sampling point at which the largest height can be found in the worst case of the 15 eyes. The measured peak-to-peak amplitude is about 560 mV. A BER of < $10^{-10}$ was measured by selecting the PRNG (see Fig. 4.11) as the data source of the D-TX. The PRNG operates at 480 MHz and generates a sequence of $2^{10}$-1 pseudo-random 4-bit words.

Table 4.1 summarizes the performance of the PAM-16 D-TX and gives a comparison with the prior art, all fabricated in similar 180-nm processes. The PAM-16 D-TX consumes 6.4 mW at a 1.92-Gb/s data rate and occupies 0.061 mm<sup>2</sup>, resulting in an energy efficiency of 3.3 pJ/bit, which is > 2.8× better than the other designs. As discussed in Section 4.2.4, this efficiency advantage partly relates to the relatively low data rate of our design, which still provides sufficient channel-count reduction while consuming lower overall power than a solution based on datalinks with a higher data rate.

|                        | This work             | JSSC'09 [46]          | JSSC'04 [26]         | JSSC'04 [26]         | TCSI'13 [47]            |
|------------------------|-----------------------|-----------------------|----------------------|----------------------|-------------------------|
| Process                | 180 nm                | 180 nm                | 180 nm               | 180 nm               | 180 nm                  |
| Modulation             | PAM-16                | PAM-2                 | PAM-4                | PAM-4                | PAM-10                  |
| Data Rate              | 1.92 Gb/s             | 5 Gb/s                | 7 Gb/s               | 10 Gb/s              | 10 Gb/s                 |
| D-TX Power             | 6.36 mW               | 57 mW                 | 66 mW                | 120 mW               | 235 mW                  |
| D-TX Energy Efficiency | 3.3 pJ/bit            | 11.4 pJ/bit           | 9.4 pJ/bit           | 12 pJ/bit            | 23.5 pJ/bit             |
| Area                   | 0.061 mm<sup>2</sup>  | 0.017 mm<sup>2</sup>  | 0.16 mm<sup>2</sup>  | 0.16 mm<sup>2</sup>  | \*0.057 mm<sup>2</sup>  |
| Supply                 | 1.2 V                 | 1.8 V                 | 1.7 V                | 2.0 V                | 2 V                     |
| BER                    | < 10<sup>-10</sup>    | < 10<sup>-12</sup>    | -                    | -                    | < 10<sup>-12</sup>      |

Table 4.1: D-TX PERFORMANCE COMPARISON WITH PRIOR ART

\* Estimated from the layout.
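The data rate and energy-efficiency figures of the D-TX comparison can be reproduced from the symbol rate, the number of bits per PAM-16 symbol, and the power numbers. The following sketch uses the values from Table 4.1; the dictionary keys are just labels chosen for this example.

```python
import math

# Data rate and energy per bit of the PAM-16 D-TX, compared with the prior art in Table 4.1.
SYMBOL_RATE_HZ = 480e6                    # DAC clock
BITS_PER_SYMBOL = int(math.log2(16))      # PAM-16 carries 4 bits per symbol

data_rate_bps = SYMBOL_RATE_HZ * BITS_PER_SYMBOL


def energy_per_bit_pj(power_mw: float, rate_gbps: float) -> float:
    """Energy per bit in pJ; mW divided by Gb/s yields pJ/bit."""
    return power_mw / rate_gbps


this_work_pj = energy_per_bit_pj(6.36, data_rate_bps / 1e9)

# (power in mW, data rate in Gb/s) as listed in Table 4.1
prior_art = {
    "JSSC'09 [46]": (57.0, 5.0),
    "JSSC'04 [26], 7 Gb/s": (66.0, 7.0),
    "JSSC'04 [26], 10 Gb/s": (120.0, 10.0),
    "TCSI'13 [47]": (235.0, 10.0),
}
prior_art_pj = {name: energy_per_bit_pj(p, r) for name, (p, r) in prior_art.items()}
best_ratio = min(prior_art_pj.values()) / this_work_pj

print(f"Data rate: {data_rate_bps / 1e9:.2f} Gb/s, energy: {this_work_pj:.2f} pJ/bit")
print(f"Improvement over the best prior design: {best_ratio:.2f}x")
```

The smallest ratio comes from the 7-Gb/s design, which is consistent with the stated improvement of more than 2.8×.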

To evaluate the TX pulsers and TX beamforming, 8 adjacent transducer bonding pads were wire-bonded to the daughter board. Off-chip 1-pF capacitors were connected as loads to emulate the transducer capacitance during the measurement. Fig. 4.17(b) shows the outputs of four neighboring TX pulsers, which were programmed at a uniform delay of 13.8 ns, confirming that the TX beamforming accurately produces these delays. In addition, the pulsers successfully produce 8-cycle unipolar pulses with an amplitude of 20 V.

#### **4.4.3.** RX µBF CHARACTERIZATION

The RX beamforming filters incoming acoustic waves in terms of their arrival angles to selectively receive signals in a particular direction while minimizing interference from other directions. Fig. 4.18(a) shows an example of the beamforming, in which the RX  $\mu$ BF is steered to an angle of 0° by applying DAS on the data received by four AFEs, leading to a maximum sensitivity for signals arriving at an angle of 0° while suppressing those from other angles.

The RX $\mu$BF was evaluated by both electrical and acoustic measurements. To emulate the arrival of acoustic waves at different angles in an electrical measurement, four time-shifted sinusoidal inputs were applied to the AFEs via the on-chip dummy capacitors, and the outputs for different $\mu$BF steering angles were recorded and compared with the theoretical directivity. During the electrical measurement, the $\mu$BF was steered to four angles (0°, 9°, 18° and 28°), corresponding to a minimum delay step of 12.5 ns, and a time-shift step of 6.25 ns was applied to the four 10-MHz sinusoidal inputs. The response of the $\mu$BF, as illustrated in Fig. 4.18(b), is consistent with the theoretical directivity curves.
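The relation between the delay step and the steering angles can be illustrated with the standard plane-wave relation sin θ = c·τ/d. The sketch below assumes a speed of sound of 1540 m/s (a typical soft-tissue value that is not stated in this section), takes the 125-µm element pitch from the comparison table, and assumes that the delay between adjacent elements is an integer multiple of the 12.5-ns step.

```python
import math

# Steering angles for integer multiples of the minimum delay step (plane-wave model).
SPEED_OF_SOUND_M_S = 1540.0   # assumed value, not given in the text
PITCH_M = 125e-6              # element pitch, from the comparison table
DELAY_STEP_S = 12.5e-9        # minimum uBF delay step


def steering_angle_deg(n_steps: int) -> float:
    """Angle whose inter-element arrival delay equals n_steps delay steps."""
    sin_theta = n_steps * DELAY_STEP_S * SPEED_OF_SOUND_M_S / PITCH_M
    return math.degrees(math.asin(sin_theta))


angles_deg = [steering_angle_deg(n) for n in range(4)]
print([f"{a:.1f}" for a in angles_deg])
```

Rounded to whole degrees, these angles match the four steering angles used in the electrical measurement, provided the assumed speed of sound is appropriate.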

The acoustic experimental setup for $\mu$BF directivity testing is shown in Fig. 4.18(c). A water tank was placed on top of the prototype chip with an unfocused single-element probe dipped into the water. To ensure accurate measurements, the probe, driven by a 10-MHz sinusoidal wave at its resonance frequency, is positioned at a sufficient distance to guarantee the chip is within its far field, thus allowing for the arrival of an approximate plane wave at the transducer surface. A 3-D printed rotating handler (not shown) was employed to precisely manipulate the incidence angle of the incoming ultrasound wave, by positioning the probe on a circular trajectory centered at the location of the transducer array.

The transient output responses were recorded by an oscilloscope at incidence angles of  $0^{\circ}$ ,  $-9^{\circ}$  and  $-18^{\circ}$ , for different µBF steering angles. As a demonstration, the response corresponding to an incidence angle of  $-9^{\circ}$  is illustrated in Fig. 4.18(e), with an inset confirming that the strongest response occurs at the expected µBF steering angle of  $-9^{\circ}$  and weaker responses at other angles. The normalized output amplitudes were extracted from the transient waves and then plotted as a function of the µBF steering angle as shown in Fig. 4.18(d), alongside the ideal response at the corresponding incidence angles. Again, a good agreement is evident between the measured data and the theoretical prediction, thus demonstrating the accuracy of the passive BI-based µBF.

#### **4.4.4.** ULTRASOUND B-MODE IMAGING

Fig. 4.19(c) depicts the measurement setup used for reconstructing B-mode images, which is similar to the setup used for the acoustic measurement of the $\mu$BF, except that the probe was replaced by a three-needle phantom, positioned from 5 mm to 7 mm above the prototype chip. To produce a B-mode image, a plane wave was first generated by the chip via the internal TX circuitry to illuminate the phantom. The received echo signals were processed by the RX circuitry and then encoded by the PAM-16 D-TX. Subsequently, the 16-level outputs were recorded using the oscilloscope and uploaded to a PC, where the data was decoded in software. Finally, post-beamforming processing was applied to reconstruct the B-mode images shown in Fig. 4.19(a), both in the elevation and azimuth directions. Fig. 4.19(b) illustrates a 3-D image rendered from the same recording. Notably, all the images distinctly display the three-needle phantom at the correct position, with a clear contrast against the background.

The B-mode images reveal artifacts appearing after the primary echoes, which are identified as reverberation artifacts, caused by the presence of the ASIC beneath the transducer array and the absence of a proper backing layer in the transducer manufacturing process [48]. As such, these artifacts are not related to the ASIC design and can be resolved by an improved transducer process.

The images are reconstructed from data acquired in 200 T/R cycles, corresponding to 8-fold multiplexing and RX $\mu$BF steering at 25 different angles, with a pulse-repetition frequency of 20 kHz, allowing for a theoretical volume rate of 100 volumes per second. The bottleneck of the current system is the data-transfer speed between the oscilloscope and the PC, which could be overcome by implementing the envisioned DPU. The DPU would comprise the needed high-speed PAM-16 D-RX, along with the subsequent image reconstruction processing, thus reducing the data throughput and minimizing the data-transfer time.
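The theoretical volume rate follows directly from the acquisition scheme described above. As a small sketch with illustrative variable names:

```python
# Theoretical volume rate from the acquisition parameters given in the text.
MULTIPLEXING_FACTOR = 8      # 8-fold multiplexing
STEERING_ANGLES = 25         # RX uBF steering angles per volume
PRF_HZ = 20_000              # pulse-repetition frequency

tr_cycles_per_volume = MULTIPLEXING_FACTOR * STEERING_ANGLES
volume_rate_hz = PRF_HZ / tr_cycles_per_volume

print(f"{tr_cycles_per_volume} T/R cycles per volume, {volume_rate_hz:.0f} volumes/s")
```

This gives the upper bound set by the pulse-repetition frequency alone; it does not account for the data-transfer time that limits the current measurement setup.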

Table 4.2 compares this work with state-of-the-art ASIC designs for different ultrasound systems. Among all the pitch-matched ASIC designs, this work achieves the smallest array pitch, the highest center frequency, and the largest RX channel bandwidth. Meanwhile, the PAM-16 data link provides the fastest data rate and the highest energy efficiency in comparison with other designs. Notably, the $\mu$BF and the ADC occupy a small per-element area, given that the ADC is shared within a relatively small sub-array of 2×2. Despite the complexity of the AFE design, which includes a more sophisticated TGC function, the RX circuitry still achieves a low per-element power consumption for a high-center-frequency transducer.

Fig. 4.18: (a) An RX µBF steered at an angle of 0°, showing different sensitivity for ultrasound signals at different arrival angles. (b) Measured µBF directivity presented as normalized response as a function of emulated arrival angles. (c) Acoustic measurement setup for evaluating µBF directivity. (d) Extracted µBF directivity as a function of µBF steering angles, at incidence angles of $0^\circ$, $-9^\circ$, and $-18^\circ$, sided with the ideal directivity curves for comparison. (e) Recorded output responses of the µBF relative to µBF steering angles of -28° – 28°, at a probe incidence angle of -9°.

Fig. 4.19: (a) Reconstructed B-mode images of the three-needle phantom in the elevation and azimuth directions. (b) 3-D image rendered from the same recording. (c) B-mode imaging experiment setup.

## **4.5.** CONCLUSION

This chapter has presented a transceiver ASIC that combines element-level pulsers and sub-group-level receivers in a pitch-matched fashion. The receiver incorporates an AFE with a continuous TGC function, a passive BI-based $\mu$BF, and a charge-sharing SAR ADC in a pitch-matched region to process incoming ultrasound signals, followed by a data link located in the periphery that enables low-power, high-speed chip-to-chip communication. The proposed architecture allows for a power-efficient implementation of digitization and off-loading in a mature 180-nm BCD process that supports the required HV TX, while further data processing can be implemented in a companion chip made with an advanced process, thus optimizing the overall power consumption for a wearable ultrasound device. Meanwhile, a charge-mode reference effectively reduces the area consumption without compromising the speed of the ADC, hence allowing for a per-sub-group, low-power referencing scheme. The prototype ASIC offers a 128-fold overall channel-count reduction and consumes 1.83 mW/element RX power for a high-center-frequency, wide-bandwidth transducer array, thus laying the foundation for developing a medical ultrasound device with large aperture, high resolution, and good image quality.

|                   | This work               | JSSC'18 [11]            | JSSC'22 [12]            | JSSC'19 [49]          | JSSC'17 [19]           | JSSC'18 [5] |
|-------------------|-------------------------|-------------------------|-------------------------|-----------------------|------------------------|-------------|
| Process           | 180nm BCD               | 180nm BCD               | 180nm BCD               | 180nm SOI             | 28nm CMOS              | 180nm BCD   |
| Center Frequency  | 9 MHz                   | 5 MHz                   | 6 MHz                   | <5 MHz                | 5 MHz                  | 13 MHz      |
| Sub-array Size    | 2×2                     | 3×3                     | 3×2                     | 4×6                   | 4×4                    | -           |
| Pitch-matched     | Y                       | Y                       | Y                       | Y                     | Y                      | N           |
| Transducer Pitch  | 125 µm                  | 150 µm                  | 160 µm                  | 300 µm                | 250 µm                 | -           |
| Digitization      | Y                       | Y                       | Y                       | N                     | Y                      | Y           |
| Sampling Rate     | 40 MS/s                 | 30 MS/s                 | 24 MS/s                 | 40 MS/s               | 20 MS/s                | 60 MS/s     |
| Datalink Type     | PAM-16                  | LVDS                    | LVDS                    | -                     | -                      | ¶Load Mod.  |
| Data Rate         | 1.92 Gb/s               | 1.5 Gb/s                | 1.2 Gb/s                | -                     | -                      | 0.6 Gb/s    |
| Datalink Power    | 6.4 mW                  | 15.4 mW                 | 7 mW                    | -                     | -                      | 2.7 mW      |
| Energy Efficiency | 3.3 pJ/bit              | 10.3 pJ/bit             | 5.8 pJ/bit              | -                     | -                      | 4.5 pJ/bit  |
| Channel Reduction | 128-fold                | 36-fold                 | 12-fold                 | 24-fold               | 16-fold                | 64-fold     |
| RX Bandwidth      | 14.3 MHz                | 11.9 MHz                | 8.1 MHz                 | 5.9 MHz               | -                      | 16 MHz      |
| AFE Type          | LNA with TGC            | LNA + PGA               | LNA + PGA               | LNA + PGA             | LNA + PGA              | LNA + PGA   |
| µBF Resolution    | 12.5 ns                 | 33 ns                   | 20.8 ns                 | 25 ns                 | 8.33 ns                | -           |
| µBF Area/El.      | †\*0.01 mm<sup>2</sup>  | \*0.011 mm<sup>2</sup>  | \*0.006 mm<sup>2</sup>  | 0.03 mm<sup>2</sup>   | ‡0.041 mm<sup>2</sup>  | -           |
| RX Area/El.       | †\*0.05 mm<sup>2</sup>  | \*0.026 mm<sup>2</sup>  | \*0.017 mm<sup>2</sup>  | 0.09 mm<sup>2</sup>   | ‡0.049 mm<sup>2</sup>  | -           |
| RX Power/El.      | †§1.83 mW               | §0.91 mW                | §1.23 mW                | 0.43 mW               | ‡33 mW                 | -           |
| Input DR          | 83 dB                   | 85 dB                   | 91 dB                   | 85 dB                 | -                      | 53 dB       |
| Peak SNR          | 54 dB                   | 52 dB                   | 52 dB                   | -                     | 60 dB                  | 42 dB       |
| TX Voltage        | 20 V                    | -                       | 65 V                    | 138 V                 | -                      | 28 V        |

Table 4.2: COMPARISON WITH THE STATE-OF-THE-ART ULTRASOUND ASIC DESIGNS

† Divided by 16, 8:1 multiplexing not included. \* Including sub-array ADC. § Including datalink. ‡ Including element-level ADC. ¶ 2-level RZ load modulation.

# REFERENCES

- [1] T. M. O'Shea, "Cerebral palsy in very preterm infants: New epidemiological insights," *Mental Retardation and Developmental Disabilities Research Reviews*, vol. 8, no. 3, pp. 135–145, 2002.
- [2] M. Proisy, S. Mitra, C. Uria-Avellana, M. Sokolska, N. Robertson, F. Le Jeune, and J.-C. Ferré, "Brain Perfusion Imaging in Neonates: An Overview," *AJNR Am J Neuroradiol*, vol. 37, no. 10, pp. 1766–1773, Oct. 2016.
- [3] J. Baranger, C. Demene, A. Frerot, F. Faure, C. Delanoë, H. Serroune, A. Houdouin, J. Mairesse, V. Biran, O. Baud, and M. Tanter, "Bedside functional monitoring of the dynamic brain connectivity in human neonates," *Nat Commun*, vol. 12, no. 1, p. 1080, Feb. 2021.
- [4] C. Chen, E. Noothout, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, M. A. P. Pertijs, S. B. Raghunathan, Z. Yu, M. Shabanimotlagh, Z. Chen, Z.-y. Chang, S. Blaak, C. Prins, and J. Ponte, "A Prototype PZT Matrix Transducer With Low-Power Integrated Receive ASIC for 3-D Transesophageal Echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 63, no. 1, pp. 47–59, Jan. 2016.
- [5] M. Tan, C. Chen, Z. Chen, J. Janjic, V. Daeichin, Z.-Y. Chang, E. Noothout, G. van Soest, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Front-End ASIC With High-Voltage Transmit Switching and Receive Digitization for 3-D Forward-Looking Intravascular Ultrasound Imaging," *IEEE J. Solid-State Circuits*, vol. 53, no. 8, pp. 2284–2297, Aug. 2018.
- [6] G. Jung, M. W. Rashid, T. M. Carpenter, C. Tekes, D. M. J. Cowell, S. Freear, F. L. Degertekin, and M. Ghovanloo, "Single-chip reduced-wire active catheter system with programmable transmit beamforming and receive time-division multiplexing for intracardiac echocardiography," in 2018 IEEE International Solid - State Circuits Conference - (ISSCC), San Francisco, CA: IEEE, Feb. 2018, pp. 188–190.
- [7] M. W. Rashid, C. Tekes, M. Ghovanloo, and F. L. Degertekin, "Design of frequency-division multiplexing front-end receiver electronics for CMUT-on-CMOS based intracardiac echocardiography," in 2014 IEEE International Ultrasonics Symposium, Sep. 2014, pp. 1540–1543.
- [8] J. D. Larson, "2-d phased array ultrasound imaging system with distributed phasing," US5229933A, Jul. 1993.
- [9] Zili Yu, S. Blaak, Zu-yao Chang, Jiajian Yao, J. G. Bosch, C. Prins, C. T. Lancée, N. de Jong, M. A. P. Pertijs, and G. C. M. Meijer, "Front-end receiver electronics for a matrix transducer for 3-D transesophageal echocardiography," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 59, no. 7, pp. 1500–1512, Jul. 2012.

- [10] U.-W. Lok and P.-C. Li, "Microbeamforming With Error Compensation," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 65, no. 7, pp. 1153–1165, Jul. 2018.
- [11] C. Chen, Z. Chen, D. Bera, E. Noothout, Z.-Y. Chang, M. Tan, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Front-End ASIC With Integrated Subarray Beamforming ADC for Miniature 3-D Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3050–3064, Nov. 2018.
- [12] Y. M. Hopf, B. W. Ossenkoppele, M. Soozande, E. Noothout, Z.-Y. Chang, C. Chen, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Transceiver ASIC With Shared Hybrid Beamforming ADC for High-Frame-Rate 3-D Intracardiac Echocardiography," *IEEE Journal of Solid-State Circuits*, vol. 57, no. 11, pp. 3228–3242, Nov. 2022.
- [13] A. Rezvanitabar, G. Jung, C. Tekes, T. M. Carpenter, D. M. J. Cowell, S. Freear, and F. L. Degertekin, "Integrated Hybrid Sub-Aperture Beamforming and Time-Division Multiplexing for Massive Readout in Ultrasound Imaging," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 16, no. 5, pp. 972–980, Oct. 2022.
- [14] Y. Hopf, B. Ossenkoppele, M. Soozande, E. Noothout, Z.-Y. Chang, C. Chen, H. Vos, H. Bosch, M. Verweij, N. De Jong, and M. Pertijs, "A Pitch-Matched ASIC with Integrated 65V TX and Shared Hybrid Beamforming ADC for Catheter-Based High-Frame-Rate 3D Ultrasound Probes," in *2022 IEEE International Solid- State Circuits Conference (ISSCC)*, vol. 65, Feb. 2022, pp. 494–496.
- [15] K. Chen, H.-S. Lee, A. P. Chandrakasan, and C. G. Sodini, "Ultrasonic Imaging Transceiver Design for CMUT: A Three-Level 30-Vpp Pulse-Shaping Pulser With Improved Efficiency and a Noise-Optimized Receiver," *IEEE J. Solid-State Circuits*, vol. 48, no. 11, pp. 2734– 2745, Nov. 2013.
- [16] M. Tan, E. Kang, J. An, Z. Chang, P. Vince, T. Matéo, N. Sénégond, and M. A. P. Pertijs, "A 64-Channel Transmit Beamformer With ±30-V Bipolar High-Voltage Pulsers for Catheter-Based Ultrasound Probes," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 7, pp. 1796–1806, Jul. 2020.
- [17] S. Hallereau and T. Herve, "BCD Technology and Cost Comparison 2021," SystemPlus Consulting, Tech. Rep., Sep. 2021.
- [18] M.-S. Lin, T.-C. Huang, C.-C. Tsai, K.-H. Tam, K. C.-H. Hsieh, C.-F. Chen, W.-H. Huang, C.-W. Hu, Y.-C. Chen, S. K. Goel, C.-M. Fu, S. Rusu, C.-C. Li, S.-Y. Yang, M. Wong, S.-C. Yang, and F. Lee, "A 7-nm 4-GHz Arm<sup>1</sup>-Core-Based CoWoS<sup>1</sup> Chiplet Design for High-Performance Computing," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 4, pp. 956–966, Apr. 2020.
- [19] M.-C. Chen, A. Peña Perez, S.-R. Kothapalli, P. Cathelin, A. Cathelin, S. S. Gambhir, and B. Murmann, "A Pixel Pitch-Matched Ultrasound Receiver for 3-D Photoacoustic Imaging With Integrated Delta-Sigma Beamformer in 28-nm UTBB FD-SOI," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 11, pp. 2843–2856, Nov. 2017.
- [20] R. Farjad-Rad, C.-K. Yang, M. Horowitz, and T. Lee, "A 0.4-µm CMOS 10-Gb/s 4-PAM pre-emphasis serial link transmitter," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 580–585, May 1999.

- [21] P. Guo, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A Pitch-Matched Low-Noise Analog Front-End With Accurate Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," *IEEE J. Solid-State Circuits*, vol. 58, no. 6, pp. 1693–1705, Jun. 2023.
- [22] P. Guo, F. Fool, E. Noothout, Z.-Y. Chang, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A 1.2mW/channel 100µm-Pitch-Matched Transceiver ASIC with Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3D Ultrasound Imaging," in *2022 IEEE International Solid- State Circuits Conference (ISSCC)*, vol. 65, Feb. 2022, pp. 496–498.
- [23] J. Craninckx and G. van der Plas, "A 65fJ/Conversion-Step 0-to-50MS/s 0-to-0.7mW 9b Charge-Sharing SAR ADC in 90nm Digital CMOS," in 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, Feb. 2007, pp. 246–600.
- [24] B. Malki, T. Yamamoto, B. Verbruggen, P. Wambacq, and J. Craninckx, "A 70 dB DR 10 b 0-to-80 MS/s Current-Integrating SAR ADC With Adaptive Dynamic Range," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 5, pp. 1173–1183, May 2014.
- [25] K. Lee, I. Nam, I. Kwon, J. Gil, K. Han, S. Park, and B.-I. Seo, "The impact of semiconductor technology scaling on CMOS RF and digital circuits for wireless application," *IEEE Transactions on Electron Devices*, vol. 52, no. 7, pp. 1415–1422, Jul. 2005.
- [26] K. Farzan and D. Johns, "A CMOS 10-Gb/s Power-Efficient 4-PAM Transmitter," IEEE J. Solid-State Circuits, vol. 39, no. 3, pp. 529–532, Mar. 2004.
- [27] U. Madhow, *Fundamentals of Digital Communication*. Cambridge: Cambridge University Press, 2008.
- [28] "IEEE Standard for Ethernet," IEEE Std 802.3-2022 (Revision of IEEE Std 802.3-2018), pp. 1–7025, Jul. 2022.
- [29] Y. Chun, M. Megahed, A. Ramachandran, and T. Anand, "A PAM-8 Wireline Transceiver With Linearity Improvement Technique and a Time-Domain Receiver Side FFE in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 57, no. 5, pp. 1527–1541, May 2022.
- [30] D. J. Foley and M. P. Flynn, "A Low-Power 8-PAM Serial Transceiver in 0.5-µm Digital CMOS," vol. 37, no. 3, p. 7, 2002.
- [31] W. Tian, H. Cui, and W. Yu, "Analysis and Experimental Test of Electrical Characteristics on Bonding Wire," *Electronics*, vol. 8, no. 3, p. 365, Mar. 2019.
- [32] Z. Chen, M. Soozande, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "Impact of Bit Errors in Digitized RF Data on Ultrasound Image Quality," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 67, no. 1, pp. 13–24, Jan. 2020.
- [33] F. Celik, A. Akkaya, and Y. Leblebici, "A 32 Gb/s PAM-16 TX and ADC-Based RX AFE with 2-tap embedded analog FFE in 28 nm FDSOI," *Microelectronics Journal*, vol. 108, p. 104967, Feb. 2021.

- [34] A. Khairi, Y. Krupnik, A. Laufer, Y. Segal, M. Cusmai, I. Levin, A. Gordon, Y. Sabag, V. Rahinski, I. Lotan, G. Ori, N. Familia, S. Litski, T. W. Grafi, U. Virobnik, D. Lazar, Y. Horwitz, A. Balankutty, S. Kiran, S. Palermo, P. M. Li, F. O'Mahony, and A. Cohen, "A 1.41-pJ/b 224-Gb/s PAM4 6-bit ADC-Based SerDes Receiver With Hybrid AFE Capable of Supporting Long Reach Channels," *IEEE Journal of Solid-State Circuits*, vol. 58, no. 1, pp. 8–18, Jan. 2023.
- [35] A. X. Widmer and P. A. Franaszek, "A DC-Balanced, Partitioned-Block, 8B/10B Transmission Code," *IBM Journal of Research and Development*, vol. 27, no. 5, pp. 440–451, Sep. 1983.
- [36] A. Momtaz, J. Cao, M. Caresosa, A. Hairapetian, D. Chung, K. Vakilian, M. Green, W.-G. Tan, K.-C. Jen, I. Fujimori, and Y. Cai, "A fully integrated SONET OC-48 transceiver in standard CMOS," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 12, pp. 1964–1973, Dec. 2001.
- [37] S. Ranganathan and Y. Tsividis, "Discrete-time parametric amplification based on a three-terminal MOS varactor: Analysis and experimental results," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 12, pp. 2087–2093, Dec. 2003.
- [38] F. van der Goes, C. M. Ward, S. Astgimath, H. Yan, J. Riley, Z. Zeng, J. Mulder, S. Wang, and K. Bult, "A 1.5 mW 68 dB SNDR 80 Ms/s 2× Interleaved Pipelined SAR ADC in 28 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2835–2845, Dec. 2014.
- [39] M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. A. M. Klumperink, and B. Nauta, "A 10-bit Charge-Redistribution ADC Consuming 1.9 μW at 1 MS/s," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 5, pp. 1007–1015, May 2010.
- [40] C. Portmann and T. Meng, "Power-efficient metastability error reduction in CMOS flash A/D converters," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 8, pp. 1132–1140, Aug. 1996.
- [41] W. McFarland, K. Springer, and C.-S. Yen, "1-Gword/s pseudorandom word generator," *IEEE Journal of Solid-State Circuits*, vol. 24, no. 3, pp. 747–751, Jun. 1989.
- [42] T. Miki, Y. Nakamura, M. Nakaya, S. Asai, Y. Akasaka, and Y. Horiba, "An 80-MHz 8-bit CMOS D/A converter," *IEEE Journal of Solid-State Circuits*, vol. 21, no. 6, pp. 983–988, Dec. 1986.
- [43] J. Yuan and C. Svensson, "High-speed CMOS circuit technique," *IEEE Journal of Solid-State Circuits*, vol. 24, no. 1, pp. 62–70, Feb. 1989.
- [44] Chi-Hung Lin and K. Bult, "A 10-b, 500-MSample/s CMOS DAC in 0.6 mm<sup>2</sup>," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 12, pp. 1948–1958, Dec. 1998.
- [45] Y. M. Hopf, B. Ossenkoppele, M. Soozande, E. Noothout, Z.-Y. Chang, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Compact Integrated High-Voltage Pulser Insensitive to Supply Transients for 3-D Miniature Ultrasound Probes," *IEEE Solid-State Circuits Letters*, vol. 5, pp. 166–169, 2022.
- [46] K.-I. Oh, L.-S. Kim, K.-I. Park, Y.-H. Jun, J. S. Choi, and K. Kim, "A 5-Gb/s/pin Transceiver for DDR Memory Interface With a Crosstalk Suppression Scheme," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 8, pp. 2222–2232, Aug. 2009.

- [47] B. Song, K. Kim, J. Lee, and J. Burm, "A 0.18-µm CMOS 10-Gb/s Dual-Mode 10-PAM Serial Link Transceiver," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 2, pp. 457–468, Feb. 2013.
- [48] J. Janjic, M. Tan, V. Daeichin, E. Noothout, C. Chen, Z. Chen, Z.-Y. Chang, R. H. S. H. Beurskens, G. van Soest, A. F. W. van der Steen, M. D. Verweij, M. A. Pertijs, and N. de Jong, "A 2-D Ultrasound Transducer With Front-End ASIC and Low Cable Count for 3-D Forward-Looking Intravascular Imaging: Performance and Characterization," *IEEE Trans. Ultrason., Ferroelect., Freq. Contr.*, vol. 65, no. 10, pp. 1832–1844, Oct. 2018.
- [49] Y. Igarashi, S. Kajiyama, Y. Katsube, T. Nishimoto, T. Nakagawa, Y. Okuma, Y. Nakamura, T. Terada, T. Yamawaki, T. Yazaki, Y. Hayashi, K. Amino, T. Kaneko, and H. Tanaka, "Single-Chip 3072-Element-Channel Transceiver/128-Subarray-Channel 2-D Array IC With Analog RX and All-Digital TX Beamformer for Echocardiography," *IEEE J. Solid-State Circuits*, vol. 54, no. 9, pp. 2555–2567, Sep. 2019.

# **5** Conclusion

The techniques we developed have established the groundwork and removed the initial barriers for an electronics architecture suitable for a wearable 3D TFUS device. Despite the progress, a notable gap remains between the prototype we developed and a practical TFUS device, which requires additional development of a digital processing unit and system-level hardware/software, as well as several other challenges, including integration and packaging into a suitable wearable form factor.

This chapter summarizes the main contributions and original findings of this work, and provides an outlook on future work.

## **5.1.** MAIN CONTRIBUTIONS

#### Implementation of a low-noise analog front-end with continuous time-gain compensation for high-density ultrasound transducer arrays (chapter 2).

An analog front-end (AFE) circuit with accurate continuous time-gain compensation, designed for high-density, high-resolution ultrasound transducer arrays, has been reported. The proposed AFE interfaces with a 100µm-pitch PZT transducer array of 8×8 elements, which is directly integrated onto the ASIC. The AFE comprises two stages, a trans-impedance amplifier and a current amplifier, and utilizes a novel complementary current-steering network to minimize gain error. Both electrical and acoustic measurements show that the AFE achieves a linear-in-dB gain error below ±0.4 dB within a 36-dB gain range, which is >2× better than the prior art. Per channel, the AFE occupies 0.025 mm<sup>2</sup> area, consumes 0.8 mW power and achieves an input-referred noise density of 1.31 pA/$\sqrt{Hz}$.

#### Implementation of an ultrasound transceiver ASIC employing boxcar-integration-based micro-beamforming for 3-D transfontanelle ultrasonography (chapter 3).

A pitch-matched ultrasound transceiver ASIC designed for TFUS has been reported. The proposed ASIC employs the AFE presented in chapter 2 along with a boxcar-integration-based $\mu$BF, which obviates the need for explicit anti-alias filtering, to realize a streamlined RX architecture. Compared to a conventional voltage-mode $\mu$BF, the current-mode $\mu$BF reduces the required total number of memory capacitors by a factor of *N*, where *N* is the total number of RX channels in a sub-array, thereby reducing the hardware overhead associated with the memory capacitors. To meet the strict spatial resolution requirement of TFUS, a 10-MHz 100-$\mu$m-pitch piezoelectric transducer array is employed. The proposed $\mu$BF's response was found to be in good agreement with the theoretical directivity curve, as demonstrated by both electrical and acoustic measurements. Per element, the $\mu$BF occupies 0.005 mm<sup>2</sup>, leading to a die area >2× smaller than prior designs employing $\mu$BF.

#### Implementation of a 2<sup>nd</sup>-generation ultrasound transceiver ASIC for wearable ultrasound devices used in transfontanelle ultrasonography (chapter 4).

A  $2^{nd}$ -generation transceiver ASIC has been reported, which extends the transceiver ASIC presented in chapter 3 with element-level pulsers with TX beamforming and RX circuits that utilize multiple techniques to achieve a 128-fold reduction in channel count. The RX circuits include 8-fold multiplexing,  $2\times2 \mu$ BF based on passive boxcar integration and sub-array-level digitization followed by a multi-level datalink that concatenates outputs of four sub-arrays into one data stream. The ASIC interfaces with a 16×16 transducer array with 125-µm pitch and 9-MHz center frequency. The prototype transceiver ASIC was fabricated in a 180-nm BCD process. The electrical and acoustic measurements show that the ASIC achieves a peak signal-to-noise ratio of 54 dB and a dynamic range of 83 dB. The datalink achieves an aggregate 3.84 Gb/s data rate and a 3.3 pJ/bit energy efficiency. Per channel, the RX circuit consumes 1.83 mW and occupies 0.05 mm<sup>2</sup>. Compared to prior pitch-matched designs with on-chip digitization, this work achieves the largest RX bandwidth, the smallest array pitch and the highest center frequency, and also provides the highest data rate and energy efficiency in data transmission.

## **5.2.** ORIGINAL FINDINGS

#### Complementary current steering network (chapter 2)

An inventive complementary current steering network (CCSN) has been introduced in the AFE design to realize a continuous time-gain compensation with a small linear-in-dB gain error. As proved by both theoretical analysis and electrical/acoustic experiments, the CCSN interpolates discrete gain steps, accurately defined either by capacitive dividers or by current-mirrors, along a pseudo-exponential trajectory, thereby significantly reducing the gain error. The CCSN not only mitigates the attenuation artifacts by providing a small linear-in-dB gain error, but also alleviates the gain-switching artifacts by traversing gain steps in a continuous-time fashion, without causing excessive chip area and additional input-referred noise.
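To see why the interpolation trajectory matters for linear-in-dB accuracy, the following minimal sketch compares a naive linear interpolation between two adjacent gain steps with the ideal exponential trajectory. The 6-dB step size, the steering fraction `alpha` and the function name are illustrative assumptions, not values or circuits taken from the CCSN design.

```python
import math

def linear_interp_error_db(step_db, alpha):
    """Gain error (dB) of a straight-line interpolation between two gain steps.

    step_db: spacing of the two discrete gain steps in dB (hypothetical value).
    alpha:   steering fraction; 0 selects the lower step, 1 the upper step.
    """
    ratio = 10 ** (step_db / 20)            # upper/lower gain as a linear ratio
    interpolated = 1 + alpha * (ratio - 1)  # straight line between the two steps
    ideal = ratio ** alpha                  # exponential, i.e. linear-in-dB
    return 20 * math.log10(interpolated / ideal)

# Sweep the steering fraction across one hypothetical 6-dB step.
alphas = [i / 100 for i in range(101)]
errors = [linear_interp_error_db(6.0, a) for a in alphas]
worst_error_db = max(errors)
print(f"worst-case error over one 6-dB step: {worst_error_db:.2f} dB")
```

Under these assumptions the error vanishes at both ends of the step and peaks at roughly half a dB near its middle, which suggests why a trajectory closer to exponential is needed to keep the linear-in-dB error small.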

#### Biasing scheme for suppressing switching artifacts (chapter 2)

Instead of using an AC-coupling capacitor to bridge the transducer and the virtual ground of the TIA [1], or using a symmetrical dual power supply to bias the virtual ground of the TIA at ground level [2], two DC level-shifting capacitors separately bias the NMOS and PMOS inputs of the TIA at ground level. This allows a single power supply to be used, which eliminates the need for a bulky isolation ring. The scheme also reduces the total required coupling capacitance, ensures a constant input voltage for the transducer between TX and RX, and suppresses T/R switching artifacts.

#### Adaptive biasing circuit for the AFE (chapter 2)

An adaptive biasing circuit is used to dynamically compensate for the closed-loop bandwidth variation of the AFE, providing a nearly constant bandwidth within the gain range of the TGC, and also reducing the average power consumption.

#### Boxcar-integration-based µBF (chapter 3)

A novel boxcar-integration-based $\mu$BF has been introduced to integrate the output current signals of the AFEs in a sub-array. It implements a sum-and-delay operation and thus significantly reduces the required memory elements compared to a conventional voltage-mode delay-and-sum $\mu$BF. It also eliminates the requirement for explicit anti-alias filtering, making the circuitry of an RX sub-array very compact and streamlined.
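The inherent anti-alias property can be illustrated with the textbook frequency response of an ideal boxcar integrator: integrating over a window $T$ gives a sinc-shaped magnitude response with nulls at multiples of $1/T$. The sketch below is a minimal numerical check, assuming the integration window equals one sampling period. The 40-MHz sampling rate and the function name are hypothetical and not taken from the design.

```python
import math

def boxcar_magnitude(freq_hz, window_s):
    """Normalized magnitude |sin(pi f T) / (pi f T)| of an ideal boxcar integrator."""
    x = math.pi * freq_hz * window_s
    if x == 0.0:
        return 1.0  # limit of sin(x)/x at DC
    return abs(math.sin(x) / x)

FS_HZ = 40e6            # hypothetical sampling rate, not from the design
WINDOW_S = 1.0 / FS_HZ  # assumption: integrate over one full sampling period

in_band = boxcar_magnitude(10e6, WINDOW_S)      # a signal component at 10 MHz
at_fs = boxcar_magnitude(FS_HZ, WINDOW_S)       # would alias onto DC after sampling
at_2fs = boxcar_magnitude(2 * FS_HZ, WINDOW_S)  # next alias of DC

print(f"10 MHz: {in_band:.3f}, fs: {at_fs:.1e}, 2*fs: {at_2fs:.1e}")
```

With these assumed numbers the in-band component is attenuated by about 10% in amplitude, while components at multiples of the sampling rate fall into the nulls of the response, so they are suppressed before sampling without a separate filter.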

#### Row/column-level pulsers (chapter 3)

A new TX scheme employs row/column-level push-pull pulsers in the periphery and element-level high-voltage (HV) diodes to isolate transducer elements from each other, so that only one active HV MOS transistor is needed in the pitch-matched region. This row-by-row or column-by-column TX leads to a cross-shaped TX beam pattern, thus limiting the field of view [3]–[5]. However, it significantly reduces the hardware overhead [6]. As a result, the TX circuit becomes very compact, leaving more room for RX circuitry.

#### RX architecture with effective channel-count reduction (chapter 4)

A streamlined RX architecture has been proposed, which incorporates current-mode AFEs, a current-mode µBF based on passive boxcar integration and a charge-sharing SAR ADC to perform sub-array-level digitization in the pitch-matched implementation, followed by a multi-level data link that concatenates digital outputs of four sub-arrays to a differential data output. The proposed architecture features low power and compact area, providing a 16-fold channel-count reduction in the sub-array level, and a 128-fold reduction when taking the 8:1 multiplexing at the input of the AFE into account. This architecture could be adapted for other applications that require a significant reduction in channel count.
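The quoted reduction factors can be checked with a few lines of arithmetic. The sketch below only multiplies the figures stated in this chapter (8:1 multiplexing, 2×2 µBF, four sub-arrays per datalink, 16×16 array). The variable names are illustrative, and the implied number of data outputs is a derived value that assumes the reduction applies uniformly across the array.

```python
# Reduction factors quoted in the text
MUX_RATIO = 8                    # 8:1 multiplexing at the AFE input
UBF_ROWS, UBF_COLS = 2, 2        # 2x2 micro-beamformer sub-array
SUBARRAYS_PER_LINK = 4           # sub-arrays concatenated per datalink
ARRAY_ROWS, ARRAY_COLS = 16, 16  # transducer array of the second prototype

ubf_reduction = UBF_ROWS * UBF_COLS                           # elements per uBF output
subarray_level_reduction = ubf_reduction * SUBARRAYS_PER_LINK
total_reduction = MUX_RATIO * subarray_level_reduction

elements = ARRAY_ROWS * ARRAY_COLS
# Derived value, assuming the factor applies uniformly to the whole array
implied_outputs = elements // total_reduction

print(subarray_level_reduction, total_reduction, implied_outputs)
```

The product reproduces the 16-fold sub-array-level and 128-fold overall reduction stated above.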

#### Passive-boxcar-integration-based µBF ADC (chapter 4)

In contrast to the  $\mu$ BF implemented in chapter 3, a  $\mu$ BF has been proposed that uses passive capacitors to integrate the current output signals of the AFEs and shares the resulting charge signal with the CDAC array of the subsequent SAR ADC in a time-interleaved fashion for the following digitization. This implementation eliminates the requirement for explicit anti-alias filtering and power-hungry ADC drivers, thereby reducing the power and occupying a small area.

#### Charge-mode referencing (chapter 4)

A charge-mode referencing scheme has been implemented, which employs an area-efficient MOS capacitor as a reservoir to quickly build a reference voltage for the ADC in the charge domain. A current source charging the reservoir capacitor and the parallel connected CDAC is calibrated during TX by a servo loop, which is fully deactivated to reduce power consumption during the subsequent RX. During RX, the pre-charged reservoir capacitor and CDAC array share charge during the MSB conversion of the ADC, thus quickly charging the CDAC array to the right level. During the remainder of the conversion, the calibrated current source recharges the reservoir capacitor without affecting the overall AD conversion time.
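The charge-sharing step can be sketched with simple charge conservation. The example below is a rough illustration that assumes ideal linear capacitors and an initially discharged CDAC, neither of which is stated in the text (a real MOS capacitor is voltage-dependent). All capacitance and voltage values and the function name are hypothetical.

```python
def shared_voltage(c_res, v_res, c_dac, v_dac):
    """Voltage after two ideal linear capacitors are connected and share charge."""
    total_charge = c_res * v_res + c_dac * v_dac  # charge is conserved
    return total_charge / (c_res + c_dac)

C_RES = 10e-12  # hypothetical reservoir capacitance
C_DAC = 1e-12   # hypothetical total CDAC capacitance
V_TARGET = 1.0  # hypothetical reference level wanted on the CDAC

# Pre-charge level the reservoir needs so that sharing with a discharged CDAC
# settles at the target reference voltage.
v_precharge = V_TARGET * (C_RES + C_DAC) / C_RES
v_ref = shared_voltage(C_RES, v_precharge, C_DAC, 0.0)

print(f"pre-charge {v_precharge:.3f} V -> reference {v_ref:.3f} V")
```

In this simplified picture the reservoir is charged slightly above the target so that the reference settles at the intended level once the charge is shared. In the scheme described above, that level is set by the current source calibrated through the servo loop.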

#### PAM-16 datalink (chapter 4)

The datalink utilizes 16-level pulse-amplitude modulation to transmit four bits of data in one clock cycle, which reduces the symbol rate by a factor of 4 compared to a conventional LVDS datalink, thereby significantly improving energy efficiency and reducing power consumption. The multi-bit datalink is validated in short-range communication and can be used to effectively shift power consumption to the data receiver, which can be manufactured in a more advanced technology node with higher energy efficiency, thus achieving system-level power optimization.
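The relation between bits and symbols in 16-level signaling can be sketched as follows. This is a minimal illustration that assumes a natural-binary mapping onto 16 equally spaced, normalized levels. The actual level coding and amplitudes of the datalink are not specified here, and the function names are hypothetical.

```python
import math

LEVELS = 16
BITS_PER_SYMBOL = int(math.log2(LEVELS))  # 4 bits carried by each symbol

def pam16_encode(bits):
    """Map a bit list (MSB first) onto normalized PAM-16 levels in [0, 1]."""
    if len(bits) % BITS_PER_SYMBOL:
        raise ValueError("bit count must be a multiple of 4")
    symbols = []
    for i in range(0, len(bits), BITS_PER_SYMBOL):
        value = 0
        for bit in bits[i:i + BITS_PER_SYMBOL]:
            value = (value << 1) | bit         # natural-binary weighting (assumed)
        symbols.append(value / (LEVELS - 1))   # one of 16 equally spaced levels
    return symbols

def pam16_decode(symbols):
    """Recover the bit list by slicing each level to the nearest of 16 codes."""
    bits = []
    for level in symbols:
        value = round(level * (LEVELS - 1))
        bits.extend((value >> k) & 1 for k in range(BITS_PER_SYMBOL - 1, -1, -1))
    return bits

payload = [1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
symbols = pam16_encode(payload)
recovered = pam16_decode(symbols)
print(len(payload), "bits ->", len(symbols), "symbols")
```

Twelve bits are carried by three symbols, which is the factor-of-4 symbol-rate reduction relative to binary signaling mentioned above.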

## **5.3.** FUTURE WORK

Transforming the prototype into a market-ready product would involve years of substantial engineering work. Moreover, it generally takes years to obtain premarket approval (PMA) for clinical application from regulatory authorities [7]. Commercialization of the product is therefore expected to face significant challenges and may take 5–10 years. In terms of ASIC development, the following suggestions are given for future work.

#### Implementing the data receiver and digital processing unit.

Although the presented techniques pave the way for wearable devices used in TFUS, there are still outstanding tasks for the project. The most important tasks include the data receiver and digital processing unit that further compress the bandwidth of the raw ultrasound RF data. These functions are expected to be implemented in a dedicated chip made with a more advanced IC technology node, which would be connected to a commercial wireless transmitter module for the following wireless offload. An ADC-based data receiver AFE with specific equalization functions could be a good candidate [8], [9] to recover the data. The subsequent digital processing unit would comprise functions, such as in-phase and quadrature demodulation, followed by digital RX beamforming [10], [11].

#### Extending multi-level signaling technique to other ultrasound applications.

The multi-level datalink developed in our project was specially designed for short-range communication. However, it can also be utilized for other ultrasound applications, such as intracardiac echocardiography (ICE) [12], [13] and transesophageal echocardiography (TEE) [14], [15], where mid-range cables are used to upload and download the data. Due to physical limitations, a significant reduction in channel count is often required to minimize the size of the catheter, which has a length of up to a few meters [12], [16]. This results in very thin cables and large attenuation of the communication signals. Moreover, ICE/TEE catheters are used for real-time ultrasound imaging inside the body and therefore need to meet stringent power dissipation requirements to avoid excessive tissue temperature elevation [17]. Owing to these limitations, commercial ICE/TEE catheters usually employ analog rather than digital outputs, at the cost of degraded signal quality [12]. A multi-level datalink could potentially solve these problems by providing more robust digital communication with low power consumption [18]. Additionally, a dedicated data receiver, with an equalization scheme specifically designed for cables used in ultrasound applications, is still required.

#### Process migration to SOI technology to reduce the size of high-voltage devices.

The BCD technology that was used to develop two transceiver ASICs isolates different voltage domains by using reverse-biased PN junctions with large lateral spacing, especially for the high-voltage isolation. As a result, the TX circuits in both designs occupy more than 50% of the pitch-matched regions, which puts a cap on the minimum achievable pitch of the transducers and also hampers the application of more sophisticated RX circuits. A possible solution is to use silicon-on-insulator (SOI) technology, which utilizes oxide isolation instead of the reverse-biased PN junction to isolate high voltages, allowing for minimally-sized high-voltage devices.

#### Process migration to MUT technology for high-volume production.

The manufacturing process for bulk PZT transducers has high die-to-die variation and limited scalability, and is not compatible with modern IC technology, making it unsuitable for high-yield, high-volume production. These limitations are generally not critical for bulk-PZT-based ultrasound probes in high-end medical applications. However, they do present challenges in low-cost, high-volume applications, such as handheld portable ultrasound probes [11], and ultrasound devices for antenatal care in resource-limited countries [19], [20]. In contrast, micromachined ultrasonic transducers (MUTs) are gaining increasing interest because they are compatible with IC technology, and can therefore be mass-produced at low cost [21]. To make our envisioned wearable TFUS monitoring device broadly available, including use in low- or middle-income countries, it is reasonable to consider migrating to CMUT/PMUT processes.

#### Broaden the use of the techniques developed in this project to other wearable ultrasound applications.

Point-of-care ultrasound devices have gained increased attention in recent years. Wearable ultrasound devices allow for continuous point-of-care monitoring and rapid diagnosis outside of conventional hospitals, making them very valuable for patients [22]–[24]. However, current research still centers on transducer manufacturing, and only limited effort has been devoted to the associated electronics. For example, a board-level implementation of electronics has been reported in [25]. A hybrid implementation has been reported in [26], where a dedicated ASIC including TX and RX AFE is employed to interface with the transducer, while an external amplifier and subsequent ADC are used to digitize the ultrasound signals. As a result, these wearable devices are still bulky and have high power consumption. The techniques we have developed could be promising low-power solutions for fully-integrated transceivers used in miniaturized wearable ultrasound devices.

# REFERENCES

- [1] M. Tan, E. Kang, J. An, Z. Chang, P. Vince, T. Matéo, N. Sénégond, and M. A. P. Pertijs, "A 64-Channel Transmit Beamformer With ±30-V Bipolar High-Voltage Pulsers for Catheter-Based Ultrasound Probes," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 7, pp. 1796–1806, Jul. 2020.
- [2] E. Kang, M. Tan, J.-S. An, Z.-Y. Chang, P. Vince, N. Sénégond, T. Mateo, C. Meynier, and M. A. P. Pertijs, "A Variable-Gain Low-Noise Transimpedance Amplifier for Miniature Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 55, no. 12, pp. 3157–3168, Dec. 2020.
- [3] T. L. Christiansen, C. Dahl-Petersen, J. A. Jensen, and E. V. Thomsen, "2-D row-column CMUT arrays with an open-grid support structure," in *2013 IEEE International Ultrasonics Symposium (IUS)*, Jul. 2013, pp. 1712–1715.
- [4] M. F. Rasmussen and J. A. Jensen, "3D ultrasound imaging performance of a row-column addressed 2D array transducer: A simulation study," in *SPIE Medical Imaging*, J. G. Bosch and M. M. Doyley, Eds., Lake Buena Vista (Orlando Area), Florida, USA, Mar. 2013, p. 86750C.
- [5] K. Chen, H.-S. Lee, and C. G. Sodini, "A Column-Row-Parallel ASIC Architecture for 3-D Portable Medical Ultrasonic Imaging," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 738– 751, Mar. 2016.
- [6] M. F. Rasmussen and J. A. Jensen, "3-D ultrasound imaging performance of a row-column addressed 2-D array transducer: A measurement study," in 2013 IEEE International Ultrasonics Symposium (IUS), Jul. 2013, pp. 1460–1463.
- [7] The Least Burdensome Provisions of the FDA Modernization Act of 1997: Concept and Principles: Final Guidance for FDA and Industry.
- [8] F. Celik, A. Akkaya, and Y. Leblebici, "A 32 Gb/s PAM-16 TX and ADC-Based RX AFE with 2-tap embedded analog FFE in 28 nm FDSOI," *Microelectronics Journal*, vol. 108, p. 104967, Feb. 2021.
- [9] Y. Chun, A. Ramachandran, and T. Anand, "A PAM-8 Wireline Transceiver with Receiver Side PWM (Time-Domain) Feed Forward Equalization Operating from 12-to-39.6Gb/s in 65nm CMOS," in ESSCIRC 2019 - IEEE 45th European Solid State Circuits Conference (ESSCIRC), Sep. 2019, pp. 269–272.
- [10] B. Steinberg, "Digital beamforming in ultrasound," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 39, no. 6, pp. 716–721, Nov. 1992.
- [11] N. Sanchez, K. Chen, C. Chen, D. McMahill, S. Hwang, J. Lutsky, J. Yang, L. Bao, L. K. Chiu, G. Peyton, H. Soleimani, B. Ryan, J. R. Petrus, Y.-J. Kook, T. S. Ralston, K. G. Fife, and J. M. Rothberg, "34.1 An 8960-Element Ultrasound-on-Chip for Point-of-Care Ultrasound," in 2021 IEEE International Solid- State Circuits Conference (ISSCC), vol. 64, Feb. 2021, pp. 480–482.

- [12] D. Wildes, W. Lee, B. Haider, S. Cogan, K. Sundaresan, D. M. Mills, C. Yetter, P. H. Hart, C. R. Haun, M. Concepcion, J. Kirkhorn, and M. Bitoun, "4-D ICE: A 2-D Array Transducer With Integrated ASIC in a 10-Fr Catheter for Real-Time 3-D Intracardiac Echocardiography," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 63, no. 12, pp. 2159–2173, Dec. 2016.
- [13] Y. M. Hopf, B. W. Ossenkoppele, M. Soozande, E. Noothout, Z.-Y. Chang, C. Chen, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Transceiver ASIC With Shared Hybrid Beamforming ADC for High-Frame-Rate 3-D Intracardiac Echocardiography," *IEEE Journal of Solid-State Circuits*, vol. 57, no. 11, pp. 3228–3242, Nov. 2022.
- [14] C. Chen, Z. Chen, D. Bera, E. Noothout, Z.-Y. Chang, M. Tan, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A Pitch-Matched Front-End ASIC With Integrated Subarray Beamforming ADC for Miniature 3-D Ultrasound Probes," *IEEE J. Solid-State Circuits*, vol. 53, no. 11, pp. 3050–3064, Nov. 2018.
- [15] J. Lee, K.-R. Lee, B. E. Eovino, J. H. Park, L. Y. Liang, L. Lin, H.-J. Yoo, and J. Yoo, "A 36-Channel Auto-Calibrated Front-End ASIC for a pMUT-Based Miniaturized 3-D Ultrasound System," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 6, pp. 1910–1923, Jun. 2021.
- [16] W. Lee, W. Griffin, D. Wildes, D. Buckley, T. Topka, T. Chodakauskas, M. Langer, S. Calisti, S. Bergstol, J.-P. Malacrida, F. Lanteri, J. Maffre, B. Mcdaniel, K. Shivkumar, J. Cummings, D. Callans, F. Silvestry, and D. Packer, "A 10-Fr ultrasound catheter with integrated micromotor for 4-D intracardiac echocardiography," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 58, no. 7, pp. 1478–1491, Jul. 2011.
- [17] Marketing Clearance of Diagnostic Ultrasound Systems and Transducers Guidance for Industry and Food and Drug Administration Staff, Sep. 2008.
- [18] Y. M. Hopf, "Integrated Circuits for 3D High-Frame-Rate Intracardiac Echocardiography Probes," Ph.D. dissertation, 2023.
- [19] T. L. A. van den Heuvel, H. Petros, S. Santini, C. L. de Korte, and B. van Ginneken, "Automated Fetal Head Detection and Circumference Estimation from Free-Hand Ultrasound Sweeps Using Deep Learning in Resource-Limited Countries," *Ultrasound in Medicine and Biology*, vol. 45, no. 3, pp. 773–785, Mar. 2019.
- [20] T. L. A. van den Heuvel, D. de Bruijn, D. M.-v. de Moesdijk, A. Beverdam, B. van Ginneken, and C. L. de Korte, "Comparison Study of Low-Cost Ultrasound Devices for Estimation of Gestational Age in Resource-Limited Countries," *Ultrasound in Medicine and Biology*, vol. 44, no. 11, pp. 2250–2260, Nov. 2018.
- [21] G. Gurun, C. Tekes, J. Zahorian, T. Xu, S. Satir, M. Karaman, J. Hasler, and F. L. Degertekin, "Single-chip CMUT-on-CMOS front-end system for real-time volumetric IVUS and ICE imaging," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, vol. 61, no. 2, pp. 239–250, Feb. 2014.

- [22] H. Hu, X. Zhu, C. Wang, L. Zhang, X. Li, S. Lee, Z. Huang, R. Chen, Z. Chen, C. Wang, Y. Gu, Y. Chen, Y. Lei, T. Zhang, N. Kim, Y. Guo, Y. Teng, W. Zhou, Y. Li, A. Nomoto, S. Sternini, Q. Zhou, M. Pharr, F. L. di Scalea, and S. Xu, "Stretchable ultrasonic transducer arrays for three-dimensional imaging on complex surfaces," *Science Advances*, vol. 4, no. 3, eaar3979, Mar. 2018.
- [23] H. Hu, H. Huang, M. Li, X. Gao, L. Yin, R. Qi, R. S. Wu, X. Chen, Y. Ma, K. Shi, C. Li, T. M. Maus, B. Huang, C. Lu, M. Lin, S. Zhou, Z. Lou, Y. Gu, Y. Chen, Y. Lei, X. Wang, R. Wang, W. Yue, X. Yang, Y. Bian, J. Mu, G. Park, S. Xiang, S. Cai, P. W. Corey, J. Wang, and S. Xu, "A wearable cardiac ultrasound imager," *Nature*, vol. 613, no. 7945, pp. 667–675, Jan. 2023.
- [24] J.-É. S. Kenny, C. E. Munding, J. K. Eibl, A. M. Eibl, B. F. Long, A. Boyes, J. Yin, P. Verrecchia, M. Parrotta, R. Gatzke, P. A. Magnin, P. N. Burns, F. S. Foster, and C. E. M. Demore, "A novel, hands-free ultrasound patch for continuous monitoring of quantitative Doppler in the carotid artery," *Sci Rep*, vol. 11, no. 1, p. 7780, Apr. 2021.
- [25] Z. Yin, H. Chen, X. Yang, Y. Liu, N. Zhang, J. Meng, and H. Liu, "A Wearable Ultrasound Interface for Prosthetic Hand Control," *IEEE Journal of Biomedical and Health Informatics*, vol. 26, no. 11, pp. 5384–5393, Nov. 2022.
- [26] H.-Y. Tang, D. Seo, U. Singhal, X. Li, M. M. Maharbiz, E. Alon, and B. E. Boser, "Miniaturizing Ultrasonic System for Portable Health Care and Fitness," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 9, no. 6, pp. 767–776, Dec. 2015.

# 6

# **SUMMARY**

This thesis presents the design and implementation of integrated ultrasound transceivers for use in transfontanelle ultrasonography. This work aims to investigate the system- and circuit-level challenges, and identify corresponding solutions to overcome these bottlenecks. Consequently, several innovative techniques have been introduced to address issues such as image artifacts, constraints on area and power, and limitations on channel count. The effectiveness and practicality of these techniques have been demonstrated through two generations of prototypes. The first prototype focuses on the design of an analog front-end (AFE) that utilizes continuous time-gain compensation (TGC) and current-mode micro-beamforming ($\mu$BF). The second prototype is an upgraded version of the first prototype, which incorporates a backend digitization scheme, consisting of a SAR ADC and a multi-level datalink, to directly digitize the $\mu$BF signals in the charge domain and further reduce the channel count.

#### CHAPTER 1

Chapter 1 introduces the motivation and background of this work, followed by a detailed analysis of the challenges faced and of possible countermeasures from different perspectives. The analysis compares and summarizes the pros and cons of the prior art, after which system-level strategies and corresponding circuit-level design targets are proposed for this work. The design challenges are discussed extensively. They include the high frequency and wide bandwidth required of ultrasound chips used in TFUS, the high frame rate required in volumetric ultrasound imaging, the difficulty of integrating transducer and ASIC in a pitch-matched fashion, and the required ASIC functionality, such as combining HV TX and RX in a very limited area, sufficiently reducing the channel count, improving power efficiency, and minimizing imaging artifacts.

#### CHAPTER 2

Chapter 2 presents a novel AFE architecture, which features low power consumption, compact area, and continuous TGC with suppression of gain- and T/R-switching artifacts. The AFE consists of two variable-gain stages and the associated complementary current-steering networks (CCSNs). The first stage is a trans-impedance amplifier with a hardware-sharing input stage and a capacitive feedback network, which couples to the second stage via a capacitive feedforward network. The second stage is a current-mirror-based current amplifier, which provides a high-impedance output for the subsequent back-end circuitry. In both stages, the CCSNs interpolate the discrete gain steps, which are formed either by the capacitive feedback/feedforward networks or by the current mirrors, to realize continuously variable gains. The prototype AFE interfaces with a 100-µm-pitch PZT transducer array of 8×8 elements and is implemented in a 180-nm BCD technology. Both electrical and acoustic measurements have been used to verify the prototype. They show a linear-in-dB gain error below ±0.4 dB within a 36-dB gain range, which is 2× smaller than that of the prior art. Per channel, the AFE consumes 0.8 mW, occupies 0.025 mm<sup>2</sup>, and achieves an input-referred noise density of 1.31 pA/$\sqrt{\text{Hz}}$.
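To make the interpolation idea concrete, the following minimal sketch compares a gain that is linearly interpolated between two adjacent discrete steps with an ideal linear-in-dB curve. It is an illustration only: the step ratio of 2 (about 6 dB), the linear weighting `alpha`, and all function names are assumptions made here, not values or circuits taken from the thesis, whose CCSNs may shape the interpolation differently.

```python
import math

def interpolated_gain_db(alpha, step_ratio=2.0):
    """Gain in dB when a fraction alpha of the signal is steered to the
    higher of two adjacent gain steps (hypothetical linear weighting)."""
    gain = (1.0 - alpha) * 1.0 + alpha * step_ratio
    return 20.0 * math.log10(gain)

def ideal_gain_db(alpha, step_ratio=2.0):
    """Ideal linear-in-dB gain between the same two steps."""
    return alpha * 20.0 * math.log10(step_ratio)

def max_interpolation_error_db(step_ratio=2.0, points=1001):
    """Largest deviation from the linear-in-dB line over one gain step."""
    alphas = [i / (points - 1) for i in range(points)]
    return max(
        abs(interpolated_gain_db(a, step_ratio) - ideal_gain_db(a, step_ratio))
        for a in alphas
    )

max_err_db = max_interpolation_error_db()
print(f"max deviation over one step: {max_err_db:.2f} dB")
```

Under these assumptions the deviation peaks at roughly 0.5 dB inside a step and vanishes at the step boundaries. Calling `max_interpolation_error_db(2 ** 0.5)` shows that narrower steps reduce the deviation, which is one reason the choice of step size matters for a linear-in-dB target.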

#### CHAPTER 3

Chapter 3 presents a boxcar-integration-based (BI-based) µBF that delays and sums (DAS) the output signals of four AFEs in the current domain, providing a 4-fold reduction in channel count. Compared to conventional voltage-mode µBF designs, the BI-based µBF reduces the number of required memory capacitors by a factor of N, where N is the number of channels in a sub-array and equals four in our design. This minimizes the associated hardware overhead, such as the routing and switches connected to these capacitors. The BI-based µBF also eliminates the need for explicit anti-alias filtering (AAF), which reduces the complexity of the AFE and allows for a compact design. A compact row- or column-level pulser is also proposed. It moves the bulky high-voltage (HV) MOS transistors to the periphery, so that only one active HV MOS transistor and one isolation diode remain in the pitch-matched region. This results in a compact transmit scheme that leaves more space for more complex functions in RX. The µBF is integrated with the proposed AFE with TGC and demonstrated using the same prototype, fabricated in a 180-nm BCD technology. Per element, the µBF occupies 0.005 mm<sup>2</sup>, which is 2× smaller than prior designs. The total RX power consumption is 1.2 mW per channel, of which 0.8 mW is consumed by the AFE and biasing, 0.33 mW by the µBF, and 0.06 mW by the output buffer and digital circuitry.
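The principle of delay-and-sum by boxcar integration can be sketched behaviorally as follows. This is a simplified model under stated assumptions, not the circuit of the thesis: each channel current is represented by a finely sampled list, each channel integrates over a window that starts at its own delay, and all channels contribute to one shared accumulated charge per output sample. The function name, the test pulse, the delays, and the window length are hypothetical.

```python
def boxcar_das(currents, delays, window):
    """Delay-and-sum by boxcar integration (behavioral sketch).

    currents: per-channel lists of finely sampled input currents
    delays:   per-channel window start offsets, in samples
    window:   integration window length, in samples
    Returns one summed charge per window, as if every channel dumped
    its integrated current onto a single shared capacitor.
    """
    n_out = min((len(c) - d) // window for c, d in zip(currents, delays))
    out = []
    for n in range(n_out):
        charge = 0.0
        for c, d in zip(currents, delays):
            start = d + n * window
            charge += sum(c[start:start + window])  # integrate over the window
        out.append(charge)
    return out

# Hypothetical echo that reaches four channels one sample later each.
pulse = [0, 1, 3, 1, 0, -1, -3, -1]
delays = [0, 1, 2, 3]
max_delay = max(delays)
currents = [[0] * d + pulse + [0] * (max_delay - d) for d in delays]

aligned = boxcar_das(currents, delays, window=2)
single = boxcar_das([pulse], [0], window=2)
print(aligned, single)
```

With the window offsets matched to the arrival delays, the four channels add coherently, so each output sample is four times the single-channel result. Because the summation happens on a shared accumulator, the model needs one storage value per output sample rather than one per channel, which mirrors the factor-of-N saving in memory capacitors described above. Integrating over a window of length T also has a sinc-shaped frequency response with nulls at multiples of 1/T, which gives an intuition for why no explicit AAF is needed.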

#### CHAPTER 4

Chapter 4 presents a fully digitized transceiver ASIC derived from the first-generation ASIC. The new ASIC reuses the proven AFE and incorporates a modified µBF based on passive boxcar integration, which connects to a SAR ADC in the charge domain for subsequent digitization. This implementation obviates the need for explicit AAF and power-hungry ADC driver stages, thereby consuming less power and occupying a smaller area. A low-power, compact reference scheme is employed, which uses an area-efficient MOS capacitor to quickly charge the CDAC array and set an accurate reference voltage for the ADC. A multi-level data link at the periphery concatenates the outputs of four ADCs into a bitstream, providing a further 4-fold reduction in channel count. Compared to a conventional LVDS data link, the use of 16-level pulse-amplitude modulation (PAM-16) significantly improves the energy efficiency, which improves the trade-offs in system-level power optimization. The transceiver ASIC achieves a 128-fold reduction in RX channel count and incorporates element-level pulsers and TX beamforming on the same chip, making the proposed ASIC architecture a low-power, compact solution for high-resolution 3-D ultrasound imaging. The ASIC was fabricated in a 180-nm BCD technology, and the RX achieves a 54-dB signal-to-noise ratio and an 83-dB dynamic range at the sub-array level. The data link achieves an aggregated data rate of 3.84 Gb/s and an energy efficiency of 3.3 pJ/bit. Per channel, the RX consumes 1.83 mW and occupies 0.05 mm<sup>2</sup>.
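As a rough illustration of the data-link idea, the sketch below concatenates four ADC words into one bitstream and maps every group of 4 bits onto one of 16 amplitude levels, then works out two figures implied by the numbers quoted above. Several details are assumptions made for this example and are not taken from the thesis: the 10-bit word width, the MSB-first ordering, the function names, and the reading that the 3.3 pJ/bit figure applies to the full aggregated 3.84 Gb/s.

```python
BITS_PER_SYMBOL = 4  # PAM-16: 16 levels carry log2(16) = 4 bits

def pam16_encode(words, bits_per_word):
    """Concatenate ADC words (MSB first) and split into 4-bit levels 0..15."""
    total_bits = bits_per_word * len(words)
    if total_bits % BITS_PER_SYMBOL:
        raise ValueError("bitstream length must be a multiple of 4")
    stream = 0
    for word in words:
        stream = (stream << bits_per_word) | (word & ((1 << bits_per_word) - 1))
    return [(stream >> shift) & 0xF
            for shift in range(total_bits - BITS_PER_SYMBOL, -1, -BITS_PER_SYMBOL)]

def pam16_decode(symbols, bits_per_word, n_words):
    """Inverse of pam16_encode: rebuild the ADC words from the levels."""
    stream = 0
    for level in symbols:
        stream = (stream << BITS_PER_SYMBOL) | level
    mask = (1 << bits_per_word) - 1
    return [(stream >> (bits_per_word * i)) & mask
            for i in range(n_words - 1, -1, -1)]

adc_words = [0x3FF, 0x000, 0x155, 0x2AA]  # four hypothetical 10-bit ADC codes
symbols = pam16_encode(adc_words, bits_per_word=10)
recovered = pam16_decode(symbols, bits_per_word=10, n_words=4)

# Figures implied by the quoted link numbers (assumed to be aggregate values).
data_rate_bps = 3.84e9
energy_per_bit_j = 3.3e-12
link_power_mw = data_rate_bps * energy_per_bit_j * 1e3
symbol_rate_gbaud = data_rate_bps / BITS_PER_SYMBOL / 1e9
print(len(symbols), recovered == adc_words, link_power_mw, symbol_rate_gbaud)
```

Under these assumptions, four 10-bit words need 10 symbols instead of 40 binary bits, the aggregate symbol rate is about 0.96 GBd, and the link power works out to roughly 12.7 mW. The actual framing and word width of the ASIC may differ.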

#### CHAPTER 5

Chapter 5 summarizes the main contributions and original findings of this thesis. We conclude that the proposed ASIC architecture provides a comprehensive transceiver solution not only for wearable devices used in TFUS but also for other ultrasound applications that require a significant reduction in channel count. In addition, the techniques developed in this project could be valuable alternatives in other ultrasound systems. The chapter ends with suggestions for future work and a discussion on extending our work to other ultrasound applications.

# 7 SAMENVATTING (SUMMARY)

This thesis describes the design and implementation of integrated ultrasound transceivers for use in transfontanelle ultrasonography. The aim of this work is to investigate the challenges at the system and circuit level and to identify corresponding solutions to address them. To this end, several innovative techniques have been introduced to address problems such as image artifacts, constraints on area and power, and limits on the channel count. The effectiveness and practicality of these techniques have been demonstrated with two generations of prototypes. The first prototype focuses on the design of an analog front-end (AFE) that applies continuous time-gain compensation (TGC) and current-mode micro-beamforming (µBF). The second prototype is an improved version of the first, introducing a signal-digitizing back-end, consisting of a SAR ADC and a multi-level datalink, to digitize the µBF signals directly in the charge domain and further reduce the channel count.

#### CHAPTER 1

Chapter 1 introduces the motivation and background of this work, followed by a detailed analysis of the challenges faced, and discusses possible solutions from different perspectives. The analysis compares and summarizes the advantages and disadvantages of the existing literature, after which system-level strategies and corresponding circuit-level design targets are proposed for this work. Design challenges are discussed extensively, such as the high frequency and bandwidth required of ultrasound chips used in TFUS, the high frame rate required in volumetric ultrasound imaging, the difficulties of transducer-ASIC integration in a pitch-matched design, and the required ASIC functionality, such as combining HV TX and RX in a very limited area, sufficiently reducing the channel count, improving power efficiency, and minimizing image artifacts.

#### CHAPTER 2

Chapter 2 discusses a new AFE architecture that features low power consumption, a compact area, and continuous TGC with suppression of gain- and T/R-switching artifacts. The AFE consists of two variable-gain stages and the associated complementary current-steering networks (CCSNs). The first stage consists of a trans-impedance amplifier with a hardware-sharing input stage and a capacitive feedback network. The output of this first stage is coupled to the second stage via a capacitive network. The second stage is a current amplifier based on current mirrors, which provides the high output impedance needed by the subsequent back-end. In both gain stages, the CCSNs interpolate discrete gain steps to realize a continuously variable gain. The prototype AFE is implemented in a 180-nm BCD technology and is coupled to a 100-µm-pitch PZT matrix consisting of 8×8 elements. Both electrical and acoustic measurements were performed to verify the correct operation of the prototype, achieving a linear-in-dB gain error below ±0.4 dB within a 36-dB gain range. This error is two times smaller than that achieved in prior literature. Per channel, the AFE consumes 0.8 mW, occupies 0.025 mm<sup>2</sup> of area, and has an input-referred noise density of 1.31 pA/$\sqrt{\text{Hz}}$.

#### CHAPTER 3

Chapter 3 discusses a boxcar-integration-based (BI-based) µBF that performs delay-and-sum (DAS) operations on four AFE output signals in the current domain, reducing the channel count by a factor of 4. Compared to conventional voltage-mode µBF designs, the BI-based µBF reduces the number of required memory capacitors by a factor of N, where N is the number of channels in a sub-array, which equals four in this design. This minimizes the associated routing and the switches connected to the memory capacitors. At the same time, the BI-based µBF also eliminates the requirement for explicit anti-alias filtering (AAF), which reduces the complexity of the AFE and results in a compact design. A compact row- or column-level pulser is proposed, which moves the large high-voltage (HV) MOS transistors (needed for TX) to the peripheral region and therefore uses only one active HV MOS and one isolation diode in the pitch-matched region. This results in a compact TX architecture, leaving more space available to accommodate complex functions in RX. The µBF is integrated with the proposed AFE with TGC and demonstrated using the same prototype, fabricated in a 180-nm BCD technology. Per element, the µBF occupies 0.005 mm<sup>2</sup>, which is 2× smaller than in prior literature. The total RX power consumption is 1.2 mW per channel, of which 0.8 mW is consumed by the AFE and biasing, 0.33 mW by the µBF, and 0.06 mW by the output buffer and digital circuits.

#### CHAPTER 4

Chapter 4 discusses a fully digitized transceiver ASIC derived from the first-generation ASIC. The new ASIC reuses the previously proven AFE and contains a modified µBF based on passive boxcar integration, which is connected to a SAR ADC in the charge domain for the subsequent digitization. This implementation makes explicit AAF and power-hungry ADC driver stages unnecessary, so that less power is consumed and less area is occupied. A low-power and compact reference is used, which employs an area-efficient MOS capacitor to quickly charge the CDAC network and set an accurate reference voltage for the ADC. A multi-level datalink at the edge of the ASIC combines the outputs of four ADCs, resulting in a 4-fold reduction in channel count. Compared to a conventional LVDS data link, the use of 16-level pulse-amplitude modulation (PAM-16) significantly improves the energy efficiency, allowing a more favorable distribution of power consumption to be chosen at the system level. The transceiver ASIC achieves a 128-fold reduction in the number of RX channels and implements pulser circuits and TX beamforming functionality for each integrated transducer element, making the proposed ASIC architecture a low-power and compact solution for producing high-quality 3-D ultrasound images. The ASIC is fabricated in a 180-nm BCD technology, and the RX circuit achieves a signal-to-noise ratio of 54 dB and a dynamic range of 83 dB at the sub-array level. The datalink achieves an aggregated data rate of 3.84 Gb/s and an energy efficiency of 3.3 pJ/bit. Per channel, the RX circuit consumes 1.83 mW and occupies 0.05 mm<sup>2</sup>.

#### CHAPTER 5

Chapter 5 summarizes the most important contributions and original findings of this dissertation. It is concluded that the proposed ASIC architecture offers solutions not only for wearable devices used in TFUS, but also for other ultrasound applications that require a considerable reduction in the number of output channels. Moreover, the techniques developed in the project besides channel reduction can also be valuable alternatives in other ultrasound systems. The chapter ends with suggestions for future work and a discussion on extending the work to applications other than TFUS within the field of ultrasound.

I would like to express my gratitude to my esteemed colleague Nuriel Rozsa for helping me with the translation of this chapter into Dutch.

# LIST OF PUBLICATIONS

### **JOURNAL ARTICLES**

**P. Guo**, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A Pitch-Matched Low-Noise Analog Front-End With Accurate Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," in *IEEE Journal of Solid-State Circuits*, vol. 58, no. 6, pp. 1693-1705, June 2023.

**P. Guo**, F. Fool, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A 1.2mW/Channel Pitch-Matched Transceiver ASIC Employing a Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3-D Ultrasound Imaging," in *IEEE Journal of Solid-State Circuits*, in press, 2023.

### **CONFERENCE PROCEEDINGS**

**P. Guo**, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A Pitch-Matched Analog Front-End with Continuous Time-Gain Compensation for High-Density Ultrasound Transducer Arrays," in *ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC)*, Sep. 2021, pp. 163–166.

**P. Guo**, F. Fool, E. Noothout, Z.-Y. Chang, H. J. Vos, J. G. Bosch, M. D. Verweij, N. de Jong, and M. A. P. Pertijs, "A 1.2mW/channel 100µm-Pitch-Matched Transceiver ASIC with Boxcar-Integration-Based RX Micro-Beamformer for High-Resolution 3D Ultrasound Imaging," in *2022 IEEE International Solid-State Circuits Conference (ISSCC)*, Feb. 2022, pp. 496–498.

**P. Guo**, Z.-Y. Chang, E. Noothout, H. J. Vos, J. G. Bosch, N. de Jong, M. D. Verweij, and M. A. P. Pertijs, "A Pitch-Matched Transceiver ASIC for 3D Ultrasonography with Micro-Beamforming ADCs based on Passive Boxcar Integration and a Multi-Level Datalink," in *2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits)*, June 2023.