Review of CMOS Integrated Circuit Technologies for High ... - MDPI

4 downloads 0 Views 16MB Size Report
Aug 25, 2017 - bit-error rate (BER) of 10−12. ...... Han, J.; Lu, Y.; Sutardja, N.; Alon, E. A 60 Gb/s 288 mW NRZ transceiver with adaptive equalization and.
sensors Review

Review of CMOS Integrated Circuit Technologies for High-Speed Photo-Detection Gyu-Seob Jeong 1,† , Woorham Bae 2,† 1 2

* †

ID

and Deog-Kyoon Jeong 1, *

ID

Department of Electrical and Computer Engineering and Inter-University Semiconductor Research Center, Seoul National University, Seoul 08826, Korea; [email protected] Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA 94720, USA; [email protected] Correspondence: [email protected]; Tel.: +82-2-871-3987 Gyu-Seob Jeong and Woorham Bae contributed equally to this work.

Received: 27 June 2017; Accepted: 22 August 2017; Published: 25 August 2017

Abstract: The bandwidth requirement of wireline communications has increased exponentially because of the ever-increasing demand for data centers and high-performance computing systems. However, it becomes difficult to satisfy the requirement with legacy electrical links which suffer from frequency-dependent losses due to skin effects, dielectric losses, channel reflections, and crosstalk, resulting in a severe bandwidth limitation. In order to overcome this challenge, it is necessary to introduce optical communication technology, which has been mainly used for long-reach communications, such as long-haul networks and metropolitan area networks, to the mediumand short-reach communication systems. However, there still remain important issues to be resolved to facilitate the adoption of the optical technologies. The most critical challenges are the energy efficiency and the cost competitiveness as compared to the legacy copper-based electrical communications. One possible solution is silicon photonics which has long been investigated by a number of research groups. Despite inherent incompatibility of silicon with the photonic world, silicon photonics is promising and is the only solution that can leverage the mature complementary metal-oxide-semiconductor (CMOS) technologies. Silicon photonics can be utilized in not only wireline communications but also countless sensor applications. This paper introduces a brief review of silicon photonics first and subsequently describes the history, overview, and categorization of the CMOS IC technology for high-speed photo-detection without enumerating the complex circuital expressions and terminologies. Keywords: CMOS; integrated circuit; photodetector; silicon photonics; transimpedance amplifier

1. Introduction Today, people’s everyday lives are connected online, i.e., anyone can access or share data worldwide at any time and place. For example, most people today are familiar with accessing streaming media from YouTube, and using social network services and cloud services for personal purposes. The concept of internet of things (IoT) is not new anymore, because IoT devices have already become a part of people’s lives. Two important factors contributing to this revolution are the development of high-performance computing systems and data communication systems which are closely interdependent. Especially, the requirement of high-performance data communication systems is being emphasized more than ever, because of ever-increasing demand for data throughput in every computing system. Even a microprocessor and memory constituting a tiny computing system is now required to handle several hundreds of gigabits per second, and it is growing at a relentless rate. A similar phenomenon is observed in longer-distance applications such as inter-server and long-haul communications. The forecast of data throughput demands is provided by Cisco, one of Sensors 2017, 17, 1962; doi:10.3390/s17091962

www.mdpi.com/journal/sensors

Sensors 2017, 2017, 17, 17, 1962 1962 Sensors Sensors 2017, 17, 1962

of 40 39 22 of 2 of 39

of the leading companies specializing in networking equipment. Figure 1 forecasts the annual global of leading companies specializingin innetworking networking equipment. Figure forecasts the leading 11 forecasts the global IPthe traffic bycompanies 2020 with aspecializing compound annual growthequipment. rate of 20%,Figure predicting that the IPannual traffic global would IP traffic by 2020 with a compound annual growth rate of 20%, predicting that the IP traffic would IP traffic by 2020 with a compound annual growth rate of 20%, predicting that the IP traffic would become 200 exabytes/month in 2020 [1]. become become 200 200 exabytes/month exabytes/monthinin2020 2020[1]. [1]. Exabytes/Month Exabytes/Month

200 200

Consumer Consumer Business Business

150 150 100 100 50 50 0 0

2015 2016 2017 2018 2019 2020 2015 2016 2017Year 2018 2019 2020 Source: Cisco

Year

Source: Cisco

Figure 1. Forecast of total global IP traffic [1]. Figure 1. 1. Forecast Forecastof oftotal total global global IP IP traffic traffic [1]. [1]. Figure

It is worth investigating the main factors that result in these explosive total data throughput ItIt is factors that in demands. Figureinvestigating 2a illustratesthe themain breakdown theresult total global IPexplosive traffic by total application. As shown is worth worth investigating the main factorsof that result in these these explosive total data data throughput throughput demands. Figure 2a illustrates the breakdown of the total global IP traffic by application. As shown in this figure, the proportion that internet video applications account for will increase in the future. demands. Figure 2a illustrates the breakdown of the total global IP traffic by application. As shown in in this figure, the proportion that internet video applications account for will increase in the future. Thisfigure, is reasonably explained Figurevideo 2b by further subdividing applications Standard this the proportion that in internet applications account forvideo will increase in the into future. This is This is reasonably explained in2b Figure 2b by further subdividing video applications into Definition Standard Definition (SD), High Definition (HD), and Ultra-High Definition (UHD). From investigations, reasonably explained in Figure by further subdividing video applications intothese Standard Definition High Definition (HD), contents and Ultra-High Definition (UHD). From these investigations, it is evident that high-quality would(UHD). be dominant in future IP traffic of (SD), High(SD), Definition (HD), and video Ultra-High Definition From these investigations, itbecause is evident itthat is evident that high-quality video contents would be dominant in future IP traffic because of increasing demand forcontents 3D media content and virtual reality. It can also be predicted demand that the high-quality video would be dominant in future IP traffic because of increasing increasing demand 3Dvirtual media content and virtual reality. It also becenters predicted thathigh the capability of handling high data bandwidth will bebe required forthat notcan only data but also endfor 3D media contentfor and reality. It can also predicted the capability of handling capability of handling high data bandwidth will be required for not only data centers but also enduser bandwidth devices. data will be required for not only data centers but also end-user devices. user devices. Source: Cisco 63%Source: INTERNET VIDEO

63%

INTERNET VIDEO

Source: Cisco

Cisco SD

79%

47%

18%

SD

18%

79%

51%

24% WEB/DATA

HD

24% 17%

WEB/DATA

Source: Cisco

47%

51%

66%

HD

66%

17% 2%

13% FILE SHARING

FILE SHARING

UHD

13% 4%

2015

4%

2015

(a) (a)

2020

2020

UHD

2%

16%

2015

16%

2015

2020

2020

(b) (b)

Figure 2. 2. Forecast of global global IP IP traffic traffic (a) (a) by by application application and and (b) (b) by by video video content content [1]. [1]. Figure Forecast of Figure 2. Forecast of global IP traffic (a) by application and (b) by video content [1].

The above data predicts that the bandwidth requirement will inevitably increase for both data The above abovedata datapredicts predicts that bandwidth requirement will inevitably increaseboth for data both The thethe bandwidth requirement increase centers and end-user devices.that Another important requirement iswill theinevitably reduction of powerfor consumption. data centers and end-user devices. Another important requirement is the reduction of power centers devices. Another important requirement the reduction of power Figure and 3a end-user shows the power breakdown of typical data iscenters, indicating thatconsumption. the current consumption. Figure 3a shows the power breakdown of typical data centers, indicating that the Figure 3a shows the power breakdown of power typicalin data centers, indicating thatrates the so current communication systems are consuming much handling extremely high data that a current communication systems are consuming much power in handling extremely high data rates communication are consuming muchfor power in handling extremely data rates so thatthe a significant part systems of resources is being wasted cooling systems. In the casehigh of end-user devices, so that a significant part of resources is being wasted for cooling systems. In the case of end-user significant of resources being wasted systems. the case of end-user the proportionpart of mobile devicesiswill exceed thatfor ofcooling PCs as shown inIn Figure 3b, meaning thedevices, importance devices, theofproportion of mobile devicesthat willof exceed that of PCs as shown in Figure 3b, meaning the proportion mobile devices exceed PCs as shown in Figure meaning importance of power management will will be undoubtedly more pronounced in the3b,future. In the summary, it is importance of power management will be undoubtedly more pronounced in the future. In summary, of power to management will be undoubtedly system more pronounced in the future. In summary, it is required build a reliable communication satisfying the aforementioned requirements, it is required to build a reliable communication system satisfying the aforementioned requirements, required to build a reliable communication system satisfying necessitating innovations in the existing communication systems.the aforementioned requirements, necessitating innovations innovations in in the the existing existing communication communication systems. systems. necessitating The primary obstacle that hinders high-speed operation is the electrical limitation of the copperThe primary primaryobstacle obstacle that hinders high-speed operation iselectrical the electrical limitation of the The that hinders high-speed operation is theapplications limitation of the copperbased interconnection which is predominantly used in most except for long-reach copper-based interconnection is predominantly in most applications exceptfor forlong-reach long-reach based interconnection which which is predominantly usedused in most applications except

Sensors 2017, 17, 1962

3 of 40

Sensors 2017, 17, 1962

3 of 39

interconnections. The frequency response of a copper wire severely degrades due to conductor loss interconnections. frequency response of a copper wire severely degrades due to conductor loss and dielectric loss The at higher frequencies. and dielectric loss at higher frequencies. Source: Cisco

Smartphone

PC

Others

8%

2015

8%

53%

Distribution 33%

59%

39%

IT Load 30%

Cooling 2020

22%

48%

Source: Cisco

(a)

(b)

Figure 3. (a) Power breakdown of data centers and (b) forecast of IP traffic contribution [1]. Figure 3. (a) Power breakdown of data centers and (b) forecast of IP traffic contribution [1].

Channel reflection and crosstalk are also detrimental to high-speed operation, limiting the Channel reflection and crosstalk are also detrimental to high-speed operation, limiting the maximum operating frequency to only several GHz over a few tens of meters. Even if various circuit maximum operating frequency to only several GHz over a few tens of meters. Even if various techniques have successfully overcome this physical limit, copper-based interconnection faces the circuit techniques have successfully overcome this physical limit, copper-based interconnection faces challenge of bandwidth limit as the data rate exceeds several tens of Gb/s. On the other hand, optical the challenge of bandwidth limit as the data rate exceeds several tens of Gb/s. On the other hand, interconnection can provide a much higher bandwidth and is free from reflection and crosstalk. In optical interconnection can provide a much higher bandwidth and is free from reflection and crosstalk. practice, optical fibers have successfully replaced the legacy copper cables over long distances and In practice, optical fibers have successfully replaced the legacy copper cables over long distances and extension of their usage into shorter distances appears promising. Furthermore, with advanced extension of their usage into shorter distances appears promising. Furthermore, with advanced silicon silicon photonic technologies, it is anticipated that optics can be used even for micro-scale photonic technologies, it is anticipated that optics can be used even for micro-scale interconnections. interconnections. Although several problems remain to be solved, many research groups are working Although several problems remain to be solved, many research groups are working toward the same toward the same goal, i.e., the realization of high-performance computing and communication goal, i.e., the realization of high-performance computing and communication systems on a single chip. systems on a single chip. Silicon photonics with mature complementary metal-oxide-semiconductor Silicon photonics with mature complementary metal-oxide-semiconductor (CMOS) IC technologies (CMOS) IC technologies will provide solutions to fulfill the aforementioned industrial demands. In will provide solutions to fulfill the aforementioned industrial demands. In this paper, we summarize this paper, we summarize the current status and attempt to forecast the future of optical the current status and attempt to forecast the future of optical interconnection in a wide perspective by interconnection in a wide perspective by reviewing papers in as many fields as possible. We also reviewing papers in as many fields as possible. We also provide a review of key circuit techniques for provide a review of key circuit techniques for high-speed photo-detection. This paper is organized as high-speed photo-detection. This paper is organized as follows. In Section 2, various aspects of optical follows. In Section 2, various aspects of optical links are discussed. Subsequently, we focus on the links are discussed. Subsequently, we focus on the subject of an optical receiver by providing the basis subject of an optical receiver by providing the basis of a photodetector (PD) in Section 3 and the basic of a photodetector (PD) in Section 3 and the basic topologies of a transimpedance amplifier (TIA) in topologies of a transimpedance amplifier (TIA) in Section 4. In Sections 5 and 6, the detailed circuit Section 4. In Sections 5 and 6, the detailed circuit techniques for the optical receiver are introduced. techniques for the optical receiver are introduced. Finally, we conclude this review work. Finally, we conclude this review work. 2. Silicon Photonics for High-Speed Data Communications 2. Silicon Photonics for High-Speed Data Communications 2.1. Overview Overview of of Optical Optical Link Link 2.1. Links or or I/O I/O circuits and played played aa major major role role in in the the Links circuits have have experienced experienced remarkable remarkable advancements advancements and development of of data data communications communications systems. systems. However, However, the the bandwidth bandwidth limitation limitation of of copper-based copper-based development links was already predicted a long time ago and engineers have attempted to determine alternative links was already predicted a long time ago and engineers have attempted to determine alternative ways to handling the ever-increasing bandwidth requirement. The optical link based on silicon silicon ways to handling the ever-increasing bandwidth requirement. The optical link based on photonics continues continues to to be be the the best best solution solution to to successfully photonics successfully replace replace the the conventional conventional electrical electrical link link [2]. [2]. Accordingly, many research groups at not only the leading companies such as Intel and IBM, but Accordingly, many research groups at not only the leading companies such as Intel and IBM, also but universities and national institutions have focused on developing silicon photonic devices including also universities and national institutions have focused on developing silicon photonic devices both the optical and electrical parts [3–5]. The results of massive research over several decades appear including both the optical and electrical parts [3–5]. The results of massive research over several to be successful and some remarkable results are already applicable to the industrial world [6]. decades appear to be successful and some remarkable results are already applicable to the industrial Optical arelinks gradually replacing their their electrical counterparts in in long-haul world [6].links Optical are gradually replacing electrical counterparts long-hauland and high-speed high-speed applications. Moreover, the ultimate goal of silicon photonics is defined and envisioned as “macro applications. Moreover, the ultimate goal of silicon photonics is defined and envisioned as aa “macro chip” [7] or an “on-chip server” [8], indicating that the optical links may hopefully substitute most of the electrical links even in the inter-/intra-chip scale. Nevertheless, it appears that the industry is

Sensors 2017, 17, 1962

4 of 40

Sensors 2017, 17, 1962 4 of 39 chip” [7] or an “on-chip server” [8], indicating that the optical links may hopefully substitute most of the electrical links even in the inter-/intra-chip scale. Nevertheless, it appears that the industry is hesitating at adopting silicon photonics as a primary interconnection method for some practical hesitating at adopting silicon photonics as a primary interconnection method for some practical reasons. reasons. InIn the case rangingfrom fromseveral severalcentimeters centimeters meters, electrical the caseofofvery veryshort-reach short-reachapplications applications ranging toto meters, electrical links are still dominant, despite the bandwidth limit of copper-based interconnection. The proliferation links are still dominant, despite the bandwidth limit of copper-based interconnection. The of proliferation copper-basedoflinks in this arealinks can be attributed to the advancements of CMOS technologies copper-based in mainly this area can be mainly attributed to the advancements of and circuit techniques. By utilizing advanced CMOS technologies and equalization techniques, CMOS technologies and circuit techniques. By utilizing advanced CMOS technologies and operations at techniques, higher than 25 Gb/s even in severe loss conditions of conditions more thanof 40 dBthan were equalization operations at higher than 25 Gb/s even in severe loss more achieved [9–11]. Recently, circuit designers attemptattempt to overcome the the limitations ofofcopper 40 dB were achieved [9–11]. Recently, circuit designers to overcome limitations copper by introducing a pulse-amplitude-modulation (PAM) signaling thatthat cancan enhance thethe effective data rate by introducing a pulse-amplitude-modulation (PAM) signaling enhance effective data the loss samecondition loss condition the conventional binary signaling [12–15]. Figure 4 summarizesthe forrate thefor same as theasconventional binary signaling [12–15]. Figure 4 summarizes the binary and PAM-4 transceivers which recorded the fastest operating speed each year over binary and PAM-4 transceivers which recorded the fastest operating speed each year over thethe past 10[14,16–29]. years [14,16–29]. Notably, the electrical the challenge of bandwidth limitationand andthe 10 past years Notably, the electrical linkslinks face face the challenge of bandwidth limitation the maximum available saturate in the immediate Therefore, designers are maximum available speed speed wouldwould saturate in the immediate future.future. Therefore, designers are focusing focusing on developing PAM-4 transceivers in recent years. However, despite the innovations in on developing PAM-4 transceivers in recent years. However, despite the innovations in electrical links, electrical links, onethe cancapacity expect that thecopper-based capacity of thelinks copper-based links would saturate eventually one can expect that of the would saturate eventually because of both because of both the circuit and device limitations. the circuit and device limitations.

70

Data Rate [Gb/s]

60

Saturation

PAM4

50

Binary [23]

[13] [24]

6

PAM4

5

[28]

40

[20]

30

4

[27]

[18]

3

[21] [22]

20

[25]

10 [15] [16]

0 2007

[19]

2 1

[17]

2009

2011

2013

2015

2017

# of Imptlemented PAM4

7 [26]

Binary

0 2019

Year Figure4.4.Trends Trendsofofcopper-based copper-based electrical electrical links Figure links from fromrecent recent10-year 10-yearISSCC ISSCCpapers. papers.

The optical links can be configured with either a multi-mode fiber (MMF) or a single-mode fiber The optical links can be configured with either a multi-mode fiber (MMF) or a single-mode fiber (SMF). An MMF-based link features a distinctly low cost whereas the communication distance is (SMF). An MMF-based link features a distinctly low cost whereas the communication distance is strictly strictly restricted due to modal dispersion. A vertical-cavity surface-emitting laser (VCSEL) is restricted due to modal dispersion. A vertical-cavity surface-emitting laser (VCSEL) is commonly used commonly used as a light source in the MMF-based links with a typical wavelength of 850 nm. Owing as to a light source MMF-based linksshort-reach with a typical wavelength of 850 Owing hundreds to the lowof cost the low costinofthe MMF-based links, applications ranging upnm. to several of meters MMF-based short-reach ranging up to several hundreds meters mostly rely mostlylinks, rely on them. On applications the other hand, an SMF-based link is free fromofmodal dispersion, onthereby them. On the other hand, an SMF-based link is free from modal dispersion, thereby providing providing a much longer communication distance and is only limited by chromatic a much longer communication distance and is only limited by chromatic dispersion and loss dispersion and other loss mechanisms such as Rayleigh scattering or electronic absorption. other Typical mechanisms such standards as Rayleigh scattering electronic absorption. Typical communication communication based on SMFor support a communication distance of up to severalstandards tens of based on SMFwith support a communication of Figure up to several tens of kilometers with wavelengths kilometers wavelengths of 1310 ordistance 1550 nm. 5 summarizes the communication trends of from 1310 orEthernet 1550 nm.standards: Figure 5 summarizes the communication trends from Ethernet standards: 10 GbE, 10 GbE, 40 GbE, and 100 GbE [30,31]. Among various optical applications, the short-reach are the most applications, popular and market size is 40communication GbE, and 100 GbE [30,31]. Among variousapplications optical communication the short-reach growing rapidly to increasing demand centers and active cables demand (AOCs). in applications are theowing most popular and market sizeinisdata growing rapidly owingoptical to increasing Especially, the demand for thecables AOCs(AOCs). would increase significantly, as their is notwould restricted to data centers and active optical Especially, the demand for usage the AOCs increase data centers, but massively extended to everyday life. However, since silicon photonics is based on significantly, as their usage is not restricted to data centers, but massively extended to everyday alife. silicon waveguide, only the telecom wavelengths arewaveguide, available inonly silicon which is notare However, since silicon photonics is based on a silicon thephotonics, telecom wavelengths directly compatible with the short-reach applications, most of which rely on low-cost VCSEL-based available in silicon photonics, which is not directly compatible with the short-reach applications, most Nevertheless, prospect of silicon photonics is promising the future of solutions. which rely on low-costthe VCSEL-based solutions. Nevertheless, theforprospect of inter-/intra-chip silicon photonics interconnections and the next-generation data centers covering a communication distance of up to 2 is promising for the future inter-/intra-chip interconnections and the next-generation data centers km as highlighted in Figure 5 [32,33]. covering a communication distance of up to 2 km as highlighted in Figure 5 [32,33].

Sensors 2017, 17, 1962 Sensors 2017, 17, 1962

5 of 40 5 of 39

Copper

MMF

SMF

SMF

10G New standards for next-generation data centers

40/100G Backplane

850 nm

1310 nm

1310, 1550 nm

Distance

1m 10 m 100 m 1 km 10 km 100 km 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05

Application

Communication Distance intra/inter-server (data center) metro, long-haul

intra/inter-chip

communication standards to which silicon photonics can be applied low-cost VCSEL-based (MMF) solutions

Figure 5. Summary of communication standards, Ethernet.

2.2. Silicon Photonics

In this this section, section,an anoverview overviewofofsilicon siliconphotonics photonics presented and silicon photonic technologies In is is presented and silicon photonic technologies are are discussed examples of silicon photonic transceivers. Thestatus current status photonics of silicon discussed with with somesome examples of silicon photonic transceivers. The current of silicon photonics and the underlying problems toin bethe solved in are the addressed. future are addressed. and the underlying problems to be solved future As explained progress in in silicon photonics waswas first first observed in theinearly As explained in in[34,35], [34,35],outstanding outstanding progress silicon photonics observed the 2000s.2000s. Although siliconsilicon photonics pursues the realization of on-chip optical interconnections, they early Although photonics pursues the realization of on-chip optical interconnections, may find application in fiber-optic communications as discussed in the previous section. they may immediate find immediate application in fiber-optic communications as discussed in the previous Therefore, extensive extensive research on silicon on photonics notinclude only lasers, modulators, section. Therefore, research silicon include photonics not photodiodes, only lasers, photodiodes, and waveguides but also waveguide-to-fiber couplers. Furthermore, all the photonic devices modulators, and waveguides but also waveguide-to-fiber couplers. Furthermore, all the photonic necessary to build a complete photonic transceiver were already viable in 2000s except for the devices necessary to build a complete photonic transceiver were already viable in 2000s exceptlaser for source. more recent review work stateswork that the implementation of silicon lasers is still lasers challenging; the laserAsource. A more recent review states that the implementation of silicon is still however, a number ofainnovations using bonding and epitaxial growth of III-V materials silicon challenging; however, number of innovations using bonding and epitaxial growth of III-V on materials have beenhave accomplished, so called aso hybrid laser [36]. laser In the[36]. caseIn ofthe photodetectors, silicon by on silicon been accomplished, calledsilicon a hybrid silicon case of photodetectors, itself cannot becannot a goodbe absorber telecommunication wavelengths due to itsdue large bandgap energy. silicon by itself a good for absorber for telecommunication wavelengths to its large bandgap Among the numerous silicon-based photodiodes, a germanium-introduced photodiode is the most energy. Among the numerous silicon-based photodiodes, a germanium-introduced photodiode popular photodetector; moreover, moreover, recent germanium photodiodes have achieved very very high is the most popular photodetector; recent germanium photodiodes have achieved bandwidths of several tens of GHz [37,38]. Silicon-based photodiodes will be discussed again in the high bandwidths of several tens of GHz [37,38]. Silicon-based photodiodes will be discussed again following section. Silicon modulators are typically based on plasma dispersion effect that the in the following section. Silicon modulators are typically based on plasma dispersion effect that refractive index of the waveguide is changed by the density of free carriers, thereby enabling the the refractive index of the waveguide is changed by the density of free carriers, thereby enabling amplitude modulation with Mach-Zehnder interferometer (MZI) structures. Modern silicon MZI the amplitude modulation with Mach-Zehnder interferometer (MZI) structures. Modern silicon modulators are capable of handling data rates higherhigher than 50 Gb/s Another type oftype silicon MZI modulators are capable of handling data rates than 50 [39,40]. Gb/s [39,40]. Another of modulators is a ringisresonator which exhibits a small footprint and better energy than the silicon modulators a ring resonator which exhibits a small footprint and betterefficiency energy efficiency MZI the structures [41,42].[41,42]. Furthermore, a wavelength-division-multiplexing (WDM) readily than MZI structures Furthermore, a wavelength-division-multiplexing (WDM)isis readily applicable to to the the ring-resonator-based ring-resonator-based architectures, architectures, which desirable feature feature for for future future silicon silicon applicable which is is aa desirable photonic transceivers [43]. Much progress has also been made in silicon waveguides and couplers. photonic transceivers [43]. Much progress has also been made in silicon waveguides and couplers. Depending on on the the type type of of the the waveguide, waveguide, typical typical propagation propagationloss losscan canbe beseveral severaldB/cm dB/cm or or less less than than Depending dB/cm [36]. great importance importance for for successfully successfully interfacing interfacing with with an an 11 dB/cm [36]. A A coupling coupling technique technique is is also also of of great off-chip optical fiber. Fortunately, a number of good solutions such as holographic lens have been off-chip optical fiber. Fortunately, a number of good solutions such as holographic lens have been proposed so sofar, far,ideally ideallyexhibiting exhibitingananextremely extremely low-loss condition which is comparable to that of proposed low-loss condition which is comparable to that of the the conventional connector conventional fiberfiber connector [6]. [6]. Despite the thebrilliant brilliantsuccess success silicon photonic (PICs), compatibility with electronic ICs Despite of of silicon photonic ICs ICs (PICs), compatibility with electronic ICs (EICs) (EICs) or a monolithic integration of the and the EICs is still challenging. Despite many efforts or a monolithic integration of the PICs andPICs the EICs is still challenging. Despite many efforts to realize to realize electronic and photonic ICs (EPICs) on a bulk CMOS platform, the overall performance electronic and photonic ICs (EPICs) on a bulk CMOS platform, the overall performance has yet tohas be yet to be improved. Alternatively, an SOI CMOS platform is generally accepted as an ideal platform for the implementation of both silicon waveguides and active transistors.

Sensors 2017, 17, 1962

6 of 40

Sensors 2017, 17, 1962

6 of 39

improved. Alternatively, an SOI CMOS platform is generally accepted as an ideal platform for the implementation of both siliconCMOS waveguides and active transistors. In a standard 0.13-μm silicon-on-insulator (SOI) platform, the first fully integrated In a standard 0.13-µm CMOS silicon-on-insulator (SOI) the first fully optoelectronic transceiver was realized by Luxtera in 2006 [44].platform, The implemented chip integrated achieves a optoelectronic transceiver was realized by Luxtera in 2006 [44]. The implemented chip achieves a total total throughput of 20 Gb/s with a dual-channel architecture using an SMF for long-reach throughput of 20 Gb/s with a dual-channel architecture using an SMF for long-reach communication. communication. The overall block diagram of the transceiver is illustrated in Figure 6. The TX The overall diagram the transceiver is illustrated Figureand 6. The TX converts electrical converts an block electrical signalofinto a data-modulated opticalinsignal the RX operates an in the other signal into a data-modulated optical signal and the RX operates in the other way around. The TX way around. The TX equalizer in the front compensates for the frequency loss of the electrical channel equalizer in TX theinput. front compensates the frequency loss of clock the electrical prior to the TX input. prior to the Following thefor equalizer, the TX-side and datachannel recovery (CDR) retimes the Following the equalizer, the TX-side clockdata andoutput. data recovery (CDR) retimes the data reduce jitter data to reduce jitter and produce a clean The driver further increases thetoamplitude of and produce a clean data output. The driver further increases the amplitude of the data signal to the data signal to make an optimum input condition for the MZI. The MZI is based on a reversemake optimum condition for the MZI.atThe on a phase reverse-biased p-nreducing junction the for biasedanp-n junctioninput for high-speed operation theMZI costisofbased reduced shift, thus high-speed operation at the cost of reduced phase shift, thus reducing the extinction ratio (ER). ER can extinction ratio (ER). ER can be raised by increasing the length of the modulator or enlarging the be raised by increasing thedriver lengthoutput of the modulator outputinswing. The driver driver output swing. The swing and or theenlarging length ofthe thedriver modulator this design are 5 output the lengthAofholographic the modulator inused this design are 5 Vppd and for 4 mm, Vppd andswing 4 mm,and respectively. lens is as a fiber-optic coupler bothrespectively. the TX and A is RX used a fiber-optic for both theThe TXtransimpedance and the RX. Theamplifier PD at the RX theholographic RX. The PDlens at the is as flip-chip bondedcoupler to the silicon die. (TIA) is flip-chip to thetosilicon die.which The transimpedance amplifier (TIA) converts the(LA) PD current converts thebonded PD current a voltage is further amplified by a limiting amplifier for the to a voltage which further amplified by limiting amplifier (LA) for the proper operation of the proper operation ofis the CDR. Following theaCDR, an output buffer is employed to drive a differential CDR. Following the CDR, an output buffer is employed to drive a differential 100-Ω transmission 100-Ω transmission line. In 2007, Luxtera further developed their photonic device library by adding line. In WDM 2007, (DWDM) Luxtera further developed photonic DWDM device library by adding a dense WDM a dense building block. Bytheir implementing multiplexers and demultiplexers, (DWDM) building block. By implementing DWDM multiplexers and demultiplexers, they succeeded they succeeded in integrating a 4 × 10-Gb/s DWDM optoelectronic transceiver in a 0.13-μm CMOS in integrating a 4[45]. × 10-Gb/s DWDM optoelectronic transceiver in a 0.13-µm CMOS SOI technology [45]. SOI technology Electrical Interface

Optical Interface

TX Equalizer

TX CDR

DRV

MZI

TX CW

p-i-n PD RX Buffer

RX Buffer

LA

TIA

Channel 1 Channel 2

Figure6.6.22× × 10-Gb/s Figure 10-Gb/sdual dualchannel channeloptoelectronic optoelectronictransceiver transceiver[44]. [44].

The silicon photonics research group at MIT has also published successful works in monolithic The silicon photonics research group at MIT has also published successful works in monolithic optical transceivers. In 2012, the first monolithic optical receiver in a sub-100-nm standard SOI optical transceivers. In 2012, the first monolithic optical receiver in a sub-100-nm standard SOI process process was implemented [46]. In this work, novel approaches are proposed toward implementing a was implemented [46]. In this work, novel approaches are proposed toward implementing a PD PD and a detecting scheme, which can be attributed to the increased design flexibility of the PD in and a detecting scheme, which can be attributed to the increased design flexibility of the PD in the the monolithic integration platform. An important advantage of the monolithic integration is that low monolithic integration platform. An important advantage of the monolithic integration is that low PD and wiring capacitances can be obtained. An integrating receiver illustrated in Figure 7 can be the PD and wiring capacitances can be obtained. An integrating receiver illustrated in Figure 7 can be most appropriate topology by exploiting the low-capacitance condition and the presence of an RX the most appropriate topology by exploiting the low-capacitance condition and the presence of an clock. However, an insufficient timing margin for evaluation can severely degrade the sensitivity. PD RX clock. However, an insufficient timing margin for evaluation can severely degrade the sensitivity. splitting combined with a double-data-rate (DDR) scheme is implemented in this work to alleviate PD splitting combined with a double-data-rate (DDR) scheme is implemented in this work to alleviate the timing margin. By interdigitating the metal contacts of the PD, it can be divided into two separate the timing margin. By interdigitating the metal contacts of the PD, it can be divided into two separate PDs sharing a common waveguide, thus reducing the photocurrent of each path by half. In spite of PDs sharing a common waveguide, thus reducing the photocurrent of each path by half. In spite of the the halved photocurrent, the PD splitting is more advantageous at higher data rates considering that halved photocurrent, the PD splitting is more advantageous at higher data rates considering that the the sensitivity degrades exponentially with the shortened evaluation time. Another improvement is sensitivity degrades exponentially with the shortened evaluation time. Another improvement is at the at the PD structure that incorporates a ring resonator to confine the light, thereby enhancing the PD structure that incorporates a ring resonator to confine the light, thereby enhancing the absorption absorption with a smaller PD length. The implemented receiver chip operates at the data rate of up with a smaller PD length. The implemented receiver chip operates at the data rate of up to 3.5 Gb/s to 3.5 Gb/s with an extremely high energy efficiency of 50 fJ/b. This demonstration is significant in that it paved the way for future memory-processor interfaces based on silicon photonics.

Sensors 2017, 17, 1962

7 of 40

with an extremely high energy efficiency of 50 fJ/b. This demonstration is significant in that it paved the way for17,future Sensors 2017, 1962 memory-processor interfaces based on silicon photonics. 7 of 39

RX1

BER Check

CK

RX2 PD1

BER Check

PD2 CKB

Figure splitting [46]. [46]. Figure 7. 7. Monolithic Monolithic receiver receiver with with PD PD splitting

A monolithic optical transceiver operating at higher speed was also reported jointly by UC San A monolithic optical transceiver operating at higher speed was also reported jointly by UC Diego and Oracle in the same year [47]. The chip is implemented in a 0.13-μm SOI CMOS technology San Diego and Oracle in the same year [47]. The chip is implemented in a 0.13-µm SOI CMOS and achieves an operating speed of 25 Gb/s with an energy efficiency of 10.2 pJ/b for the entire technology and achieves an operating speed of 25 Gb/s with an energy efficiency of 10.2 pJ/b for the transceiver. The optical TX is based on a micro-ring modulator and the optical RX employs a Ge entire transceiver. The optical TX is based on a micro-ring modulator and the optical RX employs photodetector. Notably, an asymmetric pre-emphasis is used even for the reverse-biased ring a Ge photodetector. Notably, an asymmetric pre-emphasis is used even for the reverse-biased ring modulator. When the optical devices are forward-biased, or more specifically when operating a modulator. When the optical devices are forward-biased, or more specifically when operating a VCSEL VCSEL or a ring modulator in a forward-biased condition, the storage time of minority carriers or a ring modulator in a forward-biased condition, the storage time of minority carriers obstructs the obstructs the operating speed and causes asymmetric rising and falling times. On the other hand, in operating speed and causes asymmetric rising and falling times. On the other hand, in the case of the case of a reverse-biased ring modulator, the photon lifetime dominates the maximum operating a reverse-biased ring modulator, the photon lifetime dominates the maximum operating speed, which speed, which also incurs asymmetry. Therefore, the asymmetric pre-emphasis, or independently also incurs asymmetry. Therefore, the asymmetric pre-emphasis, or independently controlling the controlling the rising and falling edges, is essential even for the reverse-biased condition in order to rising and falling edges, is essential even for the reverse-biased condition in order to improve the improve the signal quality at higher speed. signal quality at higher speed. In 2015, a more sophisticated work was performed by collaboration among several research In 2015, a more sophisticated work was performed by collaboration among several research groups in academia and industry [48]. They demonstrate good feasibility of future processor-memory groups in academia and industry [48]. They demonstrate good feasibility of future processor-memory links by implementing a processor, a memory, interface circuits, and photonic devices altogether onto links by implementing a processor, a memory, interface circuits, and photonic devices altogether a single microchip in a commercial 45-nm CMOS SOI process without any changes to the foundry onto a‘single microchip in a commercial 45-nm CMOS SOI process without any changes to the process as shown in Figure 8. The implemented chip reliably integrates 70 million transistors and 850 foundry process as shown in Figure 8. The implemented chip reliably integrates 70 million transistors photonic components. The functionality of the processor-memory link is also verified by and 850 photonic components. The functionality of the processor-memory link is also verified by demonstrating actual read and write operations between the processor and the memory at 2.5 Gb/s. demonstrating actual read and write operations between the processor and the memory at 2.5 Gb/s. Although only a single-wavelength operation is demonstrated, the total aggregate bandwidth can be Although only a single-wavelength operation is demonstrated, the total aggregate bandwidth can increased by more than 10 times without using additional fibers. A single external laser source is be increased by more than 10 times without using additional fibers. A single external laser source employed to supply 1180-nm continuous wave for both processor and memory sides using the 50/50 is employed to supply 1180-nm continuous wave for both processor and memory sides using the power splitter. The receiver incorporating a SiGe PD exhibits a sensitivity of −5 dBm for the bit-error 50/50 power splitter. The receiver incorporating a SiGe PD exhibits a sensitivity of −5 dBm for the rate (BER) of 10−12. At the transmitter, a micro-ring resonator with reverse bias is employed and bit-error rate (BER) of 10−12 . At the transmitter, a micro-ring resonator with reverse bias is employed showed an ER of 6 dB. Since the resonant wavelength of the ring structure is sensitive to not only and showed an ER of 6 dB. Since the resonant wavelength of the ring structure is sensitive to not only physical dimension but also temperature, a continuous stabilization of the wavelength is essential. In physical dimension but also temperature, a continuous stabilization of the wavelength is essential. this work, a small part of the modulator output power is monitored and the resonant wavelength is In this work, a small part of the modulator output power is monitored and the resonant wavelength tuned in a way that maximizes the output power by digitally controlling an embedded resistive is tuned in a way that maximizes the output power by digitally controlling an embedded resistive heater inside the ring. Despite several limitations, the contribution of this work is evident and further heater inside the ring. Despite several limitations, the contribution of this work is evident and further development is expected to realize more practical silicon photonic systems in the near future. development is expected to realize more practical silicon photonic systems in the near future. On the other hand, the bulk CMOS platform has been relatively less popular than the SOI On the other hand, the bulk CMOS platform has been relatively less popular than the SOI counterpart primarily due to the difficulty of implementing low-loss silicon waveguides. However, counterpart primarily due to the difficulty of implementing low-loss silicon waveguides. However, the the realization of silicon photonics in bulk CMOS would be preferred because the bulk CMOS realization of silicon photonics in bulk CMOS would be preferred because the bulk CMOS technology technology has been the mainstream so that most electronic parts are optimized in bulk CMOS has been the mainstream so that most electronic parts are optimized in bulk CMOS processes with processes with the lowest cost. On the other hand, the floating-body effect of SOI CMOS may lead to the lowest cost. On the other hand, the floating-body effect of SOI CMOS may lead to non-linear non-linear characteristics of the active transistors and heat dissipation can be problematic, thus rendering the SOI platform inadequate for the extremely high level of integration. Therefore, photonic systems on the bulk CMOS platform have also gathered significant attention.

Sensors 2017, 17, 1962

8 of 40

characteristics of the active transistors and heat dissipation can be problematic, thus rendering the SOI platform inadequate for the extremely high level of integration. Therefore, photonic systems on the SensorsCMOS 2017, 17, 17,platform 1962 of 39 39 bulk have also gathered significant attention. Sensors 2017, 1962 88 of

Figure 8. 8. Block memory-processor optical optical link link realized realized in in [48]. [48]. Block diagram diagram of of memory-processor Figure

In 2014, 2014,the thefirst firstmonolithic monolithicintegration integration of photonic and electronic devices on bulk the bulk bulk CMOS In 2014, the first monolithic integration photonic and electronic devices on the CMOS In ofof photonic and electronic devices on the CMOS was was reported [49]. The most challenging part is the waveguide integration with a low propagation was reported The most challenging the waveguide integration low propagation reported [49]. [49]. The most challenging part part is theiswaveguide integration withwith a lowa propagation loss. loss. Figure Figure shows the cross-section cross-section ofimplemented the implemented implemented silicon photonic devices. For the formation loss. 99 shows the of the silicon photonic devices. formation Figure 9 shows the cross-section of the silicon photonic devices. ForFor thethe formation of of a polysilicon-based waveguide, a lower cladding and an upper cladding are implemented using of a polysilicon-based waveguide, a lower cladding an upper cladding are implemented a polysilicon-based waveguide, a lower cladding and and an upper cladding are implemented usingusing deep deep trench trench isolation (DTI)silicon and silicon silicon nitride barrier films between between the inter-level inter-level dielectric (ILD) deep isolation and films the dielectric (ILD) trench isolation (DTI) (DTI) and nitridenitride barrierbarrier films between the inter-level dielectric (ILD) layers, layers, respectively. respectively. Several optimization optimization processes such as the the crystallization crystallization of amorphous amorphous layers, such as of respectively. Several Several optimization processesprocesses such as the crystallization of amorphous polysilicon polysilicon and low-pressure chemical vapor deposition (LPCVD) of silicon nitride are also polysilicon and low-pressure vapor deposition (LPCVD) of also silicon nitride toare also and low-pressure chemical vaporchemical deposition (LPCVD) of silicon nitride are performed further performed to further further reduce the the The lossresulting of the the waveguide. waveguide. The resulting waveguide achieves record performed to reduce loss of resulting waveguide achieves record reduce the loss of the waveguide. waveguideThe achieves a record attenuation of 10.5aadB/cm attenuation of 10.5 dB/cm which is sufficiently low to support resonant ring modulators. A attenuation of 10.5 low dB/cm which is sufficiently low to support resonantresonant ring modulators. A which is sufficiently to support resonant ring modulators. A polysilicon detector and polysilicon resonant detector and a SiGe p-i-n detector are both implemented. The fully functional polysilicon detector and a SiGe p-i-n are both implemented. The fully functional a SiGe p-i-n resonant detector are both implemented. The detector fully functional optical transceiver operates at 5 Gb/s optical transceiver operates at 5 Gb/s exhibiting an energy efficiency of 2.8 pJ/b. optical transceiver operates at 5 Gb/s exhibiting an energy efficiency of 2.8 pJ/b. exhibiting an energy efficiency of 2.8 pJ/b. ILD1 ILD1

SiGe PD PD SiGe

Modulator Modulator

Waveguide Waveguide

Transistor Transistor

ILD0 ILD0

DTI Under UnderCladding Cladding DTI

STI STI

STI STI

Si Substrate Substrate Si

Figure 9. Cross-section Cross-section of photonic photonic and electronic electronic devices [49]. [49]. Figure Figure 9. 9. Cross-section of of photonic and and electronic devices devices [49].

Subsequently, aa more more advanced advanced work work was was demonstrated demonstrated by by incorporating incorporating aa DWDM DWDM system system into into Subsequently, Subsequently, a more advanced work was demonstrated by incorporating a DWDM system into the bulk bulk CMOS CMOS platform platform [50]. [50]. A A DWDM DWDM optical optical transceiver transceiver is is monolithically monolithically integrated integrated on on aa 0.18-μm 0.18-μm the the bulk CMOS platform [50]. A DWDM optical transceiver is monolithically integrated onrelying a 0.18-µm bulk CMOS process with all the optical devices implemented using polysilicon without on bulk CMOS process with all the optical devices implemented using polysilicon without relying on bulk CMOS process with all the optical devices implemented using polysilicon without relying on the the epitaxial epitaxial crystallization crystallization of of silicon silicon or or the the introduction introduction of of Ge. Ge. By By applying applying the the same same approaches approaches the epitaxial crystallization of silicon or the introduction of Ge. By applying the same approaches presented presented in [49], low-loss photonic devices including waveguides, couplers, micro-ring modulators, presented in [49], low-loss photonic devices including waveguides, couplers, micro-ring modulators, in [49], low-loss photonic devices including waveguides, couplers, micro-ring modulators, PDs and PDs PDs can can be be successfully successfully integrated integrated with with minimal minimal modifications modifications to to the the original original CMOS CMOSand process. and process. can be successfully integrated with minimal modifications to the original CMOS process. Moreover, Moreover, based based on on the the micro-ring micro-ring structures, structures, 9-wavelength 9-wavelength TX TX and and RX RX DWDM DWDM macros macros are are also also Moreover, based on the micro-ring structures, 9-wavelength TX andanRXinternal DWDMpseudorandom macros are also realized as shown realized as shown in Figure 10. The TX employs binary sequence realized as shown in Figure 10. The TX employs an internal pseudorandom binary sequence generator, aa serializer, serializer, and and aa modulator modulator driver, driver, enabling enabling electro-optic electro-optic conversion conversion by by the the carriercarriergenerator, depletion micro-ring micro-ring modulator. modulator. At At the the RX RX side, side, aa polysilicon-based polysilicon-based PD PD utilizing utilizing absorption absorption from from depletion defect states, combined with the resonant structure, provides an acceptable responsivity of 0.2 A/W. defect states, combined with the resonant structure, provides an acceptable responsivity of 0.2 A/W. Similarly, the the PD PD splitting splitting technique technique in in [46] [46] is is also also employed employed to to mitigate mitigate the the timing timing margin. margin. After After the the Similarly, DDR-based receiver front-end, the data is further deserialized for the estimation of BER. In order to

Sensors 2017, 17, 1962

9 of 40

in Figure 10. The TX employs an internal pseudorandom binary sequence generator, a serializer, and a modulator driver, enabling electro-optic conversion by the carrier-depletion micro-ring modulator. At the RX side, a polysilicon-based PD utilizing absorption from defect states, combined with the resonant structure, provides an acceptable responsivity of 0.2 A/W. Similarly, the PD splitting technique in [46]2017, is also to mitigate the timing margin. After the DDR-based receiver front-end, the9data Sensors 17, employed 1962 of 39 is further deserialized for the estimation of BER. In order to achieve a stable wavelength locking of the DWDM asystem, a heater andlocking a digitaloflogic controlling the heater areand alsoaimplemented. Consequently, achieve stable wavelength the DWDM system, a heater digital logic controlling the for a total data rate Consequently, of 45 Gb/s, a 9-wavelength DWDM data link sharing a single waveguide heater are aggregate also implemented. for a total aggregate rate of 45 Gb/s, abus 9-wavelength is successfully demonstrated. DWDM link sharing a single bus waveguide is successfully demonstrated. Tx DWDM Macro

8

Rx DWDM Macro λ9

λ1

TIA

2 8:2

λ2

λ9

SA

TX VPD

CK

TIA

SA

8

BER Checker

PRBS Gen.

λ2

2:8 Deserializer

λ1

Figure [50]. Figure 10. 10. Monolithically Monolithically integrated integrated DWDM DWDM optical optical link link in in bulk bulk CMOS CMOS [50].

Silicon photonics based on bulk-Si substrate is also an attractive option for processor-memory Silicon photonics based on bulk-Si substrate is also an attractive option for processor-memory interfaces, especially for cost-sensitive dynamic random-access memory (DRAM) applications [51]. interfaces, especially for cost-sensitive dynamic random-access memory (DRAM) applications [51]. Processor-memory interfaces are currently facing two challenging industrial requirements. The first Processor-memory interfaces are currently facing two challenging industrial requirements. The first requirement is the demand for higher bandwidth, according to which the next-generation DRAM requirement is the demand for higher bandwidth, according to which the next-generation DRAM (DDR5) standard requires the maximum per-pin data rate of 6.4 Gb/s. The other requirement is that (DDR5) standard requires the maximum per-pin data rate of 6.4 Gb/s. The other requirement is that the total memory capacity that a memory controller can handle should be maximized. However, the the total memory capacity that a memory controller can handle should be maximized. However, the existing schemes, such as the multi-drop bus topology or the point-to-point bus topology, cannot existing schemes, such as the multi-drop bus topology or the point-to-point bus topology, cannot satisfy satisfy the aforementioned requirements simultaneously. Fortunately, if an optical interconnection is the aforementioned requirements simultaneously. Fortunately, if an optical interconnection is applied to applied to the multi-drop bus interface, the per-pin data rate can be increased significantly without the multi-drop bus interface, the per-pin data rate can be increased significantly without degrading the degrading the total available memory capacity. The CMOS SOI platform can never be adopted in the total available memory capacity. The CMOS SOI platform can never be adopted in the DRAM interfaces DRAM interfaces where cost is a crucial factor. Samsung has presented several works based on the where cost is a crucial factor. Samsung has presented several works based on the bulk-Si platform, bulk-Si platform, which is different from the platform in [50] mainly in terms of waveguidewhich is different from the platform in [50] mainly in terms of waveguide-fabrication techniques. fabrication techniques. The schematic view of integrated EPICs is shown in Figure 11. Unlike in [50], The schematic view of integrated EPICs is shown in Figure 11. Unlike in [50], the waveguides are the waveguides are formed through a local crystallization based on epitaxy. For that purpose, a formed through a local crystallization based on epitaxy. For that purpose, a trench is formed first and trench is formed first and filled with silicon dioxide. Subsequently, amorphous silicon (a-Si) is filled with silicon dioxide. Subsequently, amorphous silicon (a-Si) is deposited using LPCVD. After the deposited using LPCVD. After the deposition, a-Si is crystallized by solid-phase epitaxy (SPE) using deposition, a-Si is crystallized by solid-phase epitaxy (SPE) using the bulk-Si substrate near the edges the bulk-Si substrate near the edges of the trench as a crystal seed, thereby growing the crystal of the trench as a crystal seed, thereby growing the crystal laterally toward the center of the trench. laterally toward the center of the trench. Finally, the waveguide is patterned using dry etching. Finally, the waveguide is patterned using dry etching. Owing to the introduction of epitaxy, a higher Owing to the introduction of epitaxy, a higher quality of crystallization as compared with the quality of crystallization as compared with the previous works is obtained, exhibiting a waveguide previous works is obtained, exhibiting a waveguide loss of only 3 dB/cm. A further improvement can loss of only 3 dB/cm. A further improvement can be achieved using a laser-induced epitaxial growth be achieved using a laser-induced epitaxial growth based on the liquid phase epitaxy method, based on the liquid phase epitaxy method, resulting in an almost perfect crystallization, so that the resulting in an almost perfect crystallization, so that the grain size is comparable to that of the bulk grain size is comparable to that of the bulk Si. Using the waveguides formed by the SPE method, Si. Using the waveguides formed by the SPE method, both the MZI and the micro-ring modulators both the MZI and the micro-ring modulators are realized. Two types of PDs are also implemented by are realized. Two types of PDs are also implemented by introducing Ge on Si: a surface-illumination introducing Ge on Si: a surface-illumination type and a butt-coupled-waveguide type. Although the type and a butt-coupled-waveguide type. Although the photonic devices alone exhibited good photonic devices alone exhibited good performance, overall performance degradation is observed performance, overall performance degradation is observed when integrating the EIC and the PIC when integrating the EIC and the PIC together with the combined process, which needs to be improved together with the combined process, which needs to be improved through process optimizations. through process optimizations. Transistor

Modulator

Ge PD

VGC

WG

resulting in an almost perfect crystallization, so that the grain size is comparable to that of the bulk Si. Using the waveguides formed by the SPE method, both the MZI and the micro-ring modulators are realized. Two types of PDs are also implemented by introducing Ge on Si: a surface-illumination type and a butt-coupled-waveguide type. Although the photonic devices alone exhibited good performance, overall performance degradation is observed when integrating the EIC and the PIC Sensors 2017, 17, 1962 10 of 40 together with the combined process, which needs to be improved through process optimizations. Transistor

Modulator

Ge PD

VGC

WG

Trench

Bulk Si Substrate

Sensors 2017, 17, 1962

Figure Figure 11. 11. Schematic Schematic view view of of EPIC EPIC structure structure [51]. [51].

10 of 39

Samsung has has also also demonstrated demonstrated the the feasibility feasibility of of aa multi-drop multi-drop bus bus topology topology based based on on silicon silicon Samsung photonics. Figure 12 illustrates an optical link with the hybrid integration of the EIC and the PIC, photonics. which shall shall be be developed developed into into aa fully fully monolithic monolithic EPIC EPIC in in the the future. future. The The implemented implemented optical optical link link which consists of of two two sets sets of of TXs TXs and and RXs RXs processing processing 88 DQ DQ signals, signals, and and simply simply emulating emulating aa read read or or aa write write consists operation of the controller-memory communication. The link successfully demonstrates an error-free operation of the controller-memory communication. The link successfully demonstrates an error-free operation at at the the data data rate rate of of up up to to 2.5 2.5 Gb/s Gb/s per fiber. operation Photonic IC

Electronic IC

TX1 4:1 MUX

RX1

4 DQ[3:0]

1:4 DEMUX

Source

4-phase CK

CK[3:0]

RX2 1:4 DEMUX

TX2

4 DQ[7:4]

4:1 MUX

Figure 12. 12. Block Block diagram diagram of of optical optical transceiver transceiver [51]. [51]. Figure

2.3. Hybrid Integration 2.3. Hybrid Integration As briefly discussed so far, silicon photonics has undergone remarkable progress, demonstrating As briefly discussed so far, silicon photonics has undergone remarkable progress, demonstrating its potential to replace the legacy copper-based interconnections. Nevertheless, some issues have yet its potential to replace the legacy copper-based interconnections. Nevertheless, some issues have yet to be addressed to prevent degradation of the performance of the EPIC. Moreover, the compatibility to be addressed to prevent degradation of the performance of the EPIC. Moreover, the compatibility with a FinFET CMOS platform is unknown and should be investigated in the future. Alternatively, a with a FinFET CMOS platform is unknown and should be investigated in the future. Alternatively, hybrid integration leveraging the respective process optimizations of the EIC and the PIC would be a hybrid integration leveraging the respective process optimizations of the EIC and the PIC would be more advantageous at present, and it has become viable owing to advanced 3-D integration more advantageous at present, and it has become viable owing to advanced 3-D integration techniques. techniques. In this section, several recent works based on the hybrid integration will be reviewed and In this section, several recent works based on the hybrid integration will be reviewed and compared compared with the monolithic integration in various aspects. with the monolithic integration in various aspects. The most common integration method is the use of a wire-bonding technique which has been The most common integration method is the use of a wire-bonding technique which has been prevalently used in EIC packages. However, wire bonding severely degrades the signal integrity as prevalently used in EIC packages. However, wire bonding severely degrades the signal integrity as the data rate becomes higher, because the length of a wire is directly translated into an inductance. the data rate becomes higher, because the length of a wire is directly translated into an inductance. For even higher data rates, the situation degrades as the wavelength of the signal becomes For even higher data rates, the situation degrades as the wavelength of the signal becomes comparable comparable to the length of the wire. Due to its relatively large dimension, wire bonding also limits to the length of the wire. Due to its relatively large dimension, wire bonding also limits the maximum the maximum pin density; hence, it is not appropriate for modern system-on-chip packages where pin density; hence, it is not appropriate for modern system-on-chip packages where an extremely an extremely high pin density is required. Nevertheless, thanks to its distinguishably low-cost high pin density is required. Nevertheless, thanks to its distinguishably low-cost characteristic, it is characteristic, it is still widely used for low-speed packages and chip-on-board test environments. An still widely used for low-speed packages and chip-on-board test environments. An example of the example of the integration of EPIC based on wire bonding is shown in Figure 13. A typical PD can be integration of EPIC based on wire bonding is shown in Figure 13. A typical PD can be wire-bonded wire-bonded to a TIA chip and the output of the TIA is connected to a trace on a printed circuit board to a TIA chip and the output of the TIA is connected to a trace on a printed circuit board (PCB) to (PCB) to interface with high-speed end-launch connectors. Based on wire bonding, the works in [52] and [53] achieve operating data rates of 25 Gb/s and 64 Gb/s, respectively. High-speed PCB

PD

For even higher data rates, the situation degrades as the wavelength of the signal becomes comparable to the length of the wire. Due to its relatively large dimension, wire bonding also limits the maximum pin density; hence, it is not appropriate for modern system-on-chip packages where an extremely high pin density is required. Nevertheless, thanks to its distinguishably low-cost characteristic, it is still widely used for low-speed packages and chip-on-board test environments. Sensors 2017, 17, 1962 11 ofAn 40 example of the integration of EPIC based on wire bonding is shown in Figure 13. A typical PD can be wire-bonded to a TIA chip and the output of the TIA is connected to a trace on a printed circuit board interface with high-speed end-launch connectors. Based on wire bonding, works in (PCB) to interface with high-speed end-launch connectors. Based on wirethe bonding, the[52,53] worksachieve in [52] operating data rates of 25 Gb/s and 64 Gb/s, respectively. and [53] achieve operating data rates of 25 Gb/s and 64 Gb/s, respectively. High-speed PCB

PD TIA

Sensors 2017, 17, 1962 Sensors 2017, 17, 1962

Figure and PIC. PIC. Figure 13. 13. Illustration Illustration of of wire-bonded wire-bonded EIC EIC and

11 of 39 11 of 39

Flip-chip bonding is a more advanced technique based on face-to-face bonding, completely Flip-chip is more advanced techniquechip-to-chip based on on face-to-face face-to-face bonding, completely Flip-chip bonding is aa more advanced technique based bonding, completely eliminating thebonding bond wires. Because of a shortened space (several tens of μm with eliminating the bond wires. Because of a shortened chip-to-chip space (several tens of with eliminating the bond wires. Because of a shortened chip-to-chip space (several tens of withμm modern modern techniques), the parasitic elements can be significantly reduced, resulting in µm improved signal modern techniques), the parasitic can be significantly reduced, resulting inchip improved signal techniques), the parasitic elements canillustrates be significantly reduced, resulting in The improved signal integrity at higher speeds. Figure elements 14 a typical flip-chip package. PD andintegrity the TIA integrity at higher speeds. Figure 14 illustrates a typical flip-chip package. The PD chip and the TIA at higher speeds. Figure 14 illustrates a typical flip-chip package. The PD chip and the TIA chip are chip are flip-chip bonded on the same package substrate through which high-speed signal chip are flip-chip bonded on the same package substrate through which high-speed signal flip-chip bondedison the same package which high-speed is interconnection made. The output of substrate the TIA isthrough connected to the PCB tracesignal via theinterconnection package and the interconnection is made. The output of the TIA is connected to the PCB trace via the package and the made. The output of the TIA is connected to the PCB trace via the package and the solder bump. In this solder bump. In this case, the structure of the PICs should be a back-illumination type for proper fiber solder bump. Inthis this case, the structure ofathe PICs be atype for proper case, the structure ofconfiguration, the PICs should back-illumination proper fibertype coupling. Withfiber this coupling. With thebe work in [54]should presents a back-illumination 4 × for 28-Gb/s optical receiver exhibiting coupling. With this configuration, the work in [54] presents a 4 × 28-Gb/s optical receiver exhibiting configuration, the work in [54] presents a 4 × 28-Gb/s optical receiver exhibiting good sensitivity. good sensitivity. good sensitivity. PD PD Package Package

TIA TIA

High-speed PCB High-speed PCB

Figure Figure 14. 14. Illustration Illustration of of flip-chip flip-chip package package [54]. [54]. Figure 14. Illustration of flip-chip package [54].

Similarly, but in a slightly different manner, the EIC and the PIC are stacked vertically using an Similarly, but but in in aaslightly slightly differentmanner, manner, theEIC EIC and the PIC stacked vertically using Similarly, the PIC areare stacked interposer as shown in Figure different 15 [55], realizing the a 12 × 5and two-dimensional EPICvertically array in using order an to an interposer as shown in Figure 15 [55],realizing realizinga a12 12××5 5two-dimensional two-dimensionalEPIC EPICarray arrayin in order order to to interposer as shown in Figure 15 [55], maximize the I/O density. The implemented transceiver chip achieves a total aggregate data rate as maximize the I/O density. The implemented transceiver chip achieves a total aggregate data rate as maximize I/O density. The implemented transceiver chip achieves a total aggregate data rate as high as 600the Gb/s. Gb/s. high as 600 Gb/s. One Dimensional Array One Dimensional Array CMOS Chip CMOS Chip VCSEL VCSEL Interposer Interposer

Two Dimensional Array Two Dimensional Array Bump Bump Via Via

Figure 15. Implementation of 2-D array of EPIC [55]. Figure Figure 15. 15. Implementation Implementation of of 2-D 2-D array array of of EPIC EPIC [55]. [55].

Figure 16 illustrates an improved flip-chip package that directly bonds the EIC and the PIC using Figure 16 illustrates an improved flip-chip package directly the EICregions and the to PICfurther using micro bumps, thus obviating any redistribution layer that between thebonds high-speed micro bumps, thus obviating any redistribution layer between the high-speed regions to further reduce the parasitic components. The rest of the signals are connected to the package substrate reduce components. signals by aresuch connected to the packagedue substrate throughthe C4 parasitic bumps. An additionalThe costrest mayofbethe incurred a package technique to the through C4 bumps. An additional cost may be incurred by such a package technique due to the requirement of a customized substrate. requirement of a customized substrate.

Sensors 2017, 17, 1962

12 of 40

Figure 15. Implementation of 2-D array of EPIC [55].

Figure Figure 16 16 illustrates illustrates an an improved improved flip-chip flip-chip package package that that directly directly bonds bonds the the EIC EIC and and the the PIC PIC using using micro bumps, thus obviating any redistribution layer between the high-speed regions to micro bumps, thus obviating any redistribution layer between the high-speed regions to furtherfurther reduce reduce the parasitic components. rest of the are connected to the package substrate the parasitic components. The rest The of the signals aresignals connected to the package substrate through C4 through C4 bumps. An additional cost may be incurred by such a package technique due to the bumps. An additional cost may be incurred by such a package technique due to the requirement of requirement of a customized substrate. a customized substrate. C4 Bump

Micro Bump

EIC PIC Build-up Substrate Figure Figure 16. 16. Flip-chip Flip-chip bonded bonded EIC EIC directly directly on on PIC PIC using using micro micro bump bump [56]. [56]. Sensors 2017, 17, 1962

12 of 39

Currently, the most advanced technique is shown in Figure 17. Contrary to the previous technique, Currently, the most advanced technique is shown in Figure 17. Contrary to the previous it uses the PIC as a redistribution layer itself, thus simplifying the package process [40,43,57]. The EIC technique, it uses the PIC as a redistribution layer itself, thus simplifying the package process is flip-chip bonded to the macro PIC in a similar way. Not only the high-speed signal pads but also [40,43,57]. The EIC is flip-chip bonded to the macro PIC in a similar way. Not only the high-speed the low-speed pads are connected theare PIC. Subsequently, theSubsequently, low-speed signals are carried signal padssignal but also the low-speed signalto pads connected to the PIC. the low-speed outside the PIC through a typical wire-bonding package. Based on this technique, the work in [43] signals are carried outside the PIC through a typical wire-bonding package. Based on this technique, demonstrates a 4[43] × 20-Gb/s WDM and [40] implements a 56-Gb/s optical transmitter the work in demonstrates a 4transceiver × 20-Gb/s WDM transceiver and [40] implements a 56-Gb/s optical with transmitter with the MZI structure. the MZI structure. PCB or Package Substrate SOI Silicon Photonics Chip

RX on EIC

RX1

RX2

RX3

PIC RX

RX4 λ1

λ1

λ2

λ3

λ2

Grating Coupler

λ4

λ3

λ4

TX

λ1

λ2

λ3

Grating Coupler

λ4

Flip-Chip Bonding λ4

TX1

TX2

TX3

TX4

TX2

λ3

TX3

λ2

TX4

λ1

TX1 TX

TX on EIC

RX4

RX3

RX2

RX1

RX

EIC

Figure 17. EIC flip-chip bonded to macro-PIC [43].

Figure 17. EIC flip-chip bonded to macro-PIC [43].

In Table 1, we summarize and compare the optical transceivers presented so far. Notably, the

In Tableperformance 1, we summarize and compare the optical transceivers far. Notably, overall of the hybrid-integrated transceivers is better thanpresented that of thesomonolithic transceivers, especiallyofinthe terms of energy efficiency. This can be to that the independently the overall performance hybrid-integrated transceivers is attributed better than of the monolithic optimizedespecially processes of EIC and the PIC.efficiency. However, itThis should noted thatto recent monolithic transceivers, inthe terms of energy canalso bebe attributed the independently transceivers have demonstrated significant improvement, are comparable their that hybrid optimized processes of the EIC and the PIC. However, and it should also be tonoted recent counterparts. monolithic transceivers have demonstrated significant improvement, and are comparable to their hybrid counterparts.

Sensors 2017, 17, 1962

13 of 40

Table 1. Summary of monolithic and hybrid-integrated optical transceivers.

Integration method Technology (EIC)

[44]

[45]

[47]

[50]

Monolithic

Monolithic

Monolithic

Monolithic

[52]

[55]

[56]

[43]

Hybrid

Hybrid

Hybrid

Hybrid

90 nm bulk

65 nm bulk

28 nm bulk

40 nm bulk

130 nm SOI

130 nm SOI

130 nm SOI

180 nm Bulk

GaAs

GaAs

28 nm SOI

130 nm SOI

1535–1555 nm

1549–1554 nm

1560 nm

1280–1295 nm

850 nm

850 nm

1556 nm

1550 nm

N/A

4

N/A

9

N/A

N/A

N/A

4

Max. data rate/CH

10 Gb/s

10 Gb/s

25 Gb/s

5 Gb/s

25 Gb/s

10 Gb/s

25 Gb/s

20 Gb/s

Total throughput

20 Gb/s

40 Gb/s

25 Gb/s

45 Gb/s

25 Gb/s

600 Gb/s

25 Gb/s

80 Gb/s

Technology (PIC) Wavelength # of WDM channels

TX Modulation type ER

MZI

MZI

MRR

MRR

VCSEL

VCSEL

MRR

MRR

5–6 dB

>4 dB

6.9 dB

6.9 dB

5.1 dB

5.6 dB

6.5 dB

>7 dB

Power/CH

N/A

575 mW

208 mW

N/A

46 mW

69.5 mW

72.5 mW

32.3 mW

Energy efficiency/CH

N/A

57.5 pJ/b

8.32 pJ/b

N/A

1.84 pJ/b

6.95 pJ/b

2.9 dB

1.6 pJ/b

N/A (external)

N/A (external)

Ge waveguide

Si (defect) waveguide

GaAs (top illumination)

N/A

N/A

Ge waveguide

−19.5 dBm

−15 dBm

−6 dBm

−7.5 dBm

−6 dBm @ 22 Gb/s

−16 dBm (estimated)

−8 dBm

−7.2 dBm

Power/CH

N/A

120 mW

48 mW

N/A

44.4 mW

68.2 mW

50 mW

11.6 mW

Energy efficiency/CH

N/A

12 pJ/b

1.92 pJ/b

N/A

1.78 pJ/b

6.82 pJ/b

2 pJ/b

0.73 pJ/b

RX PD type Sensitivity (BER of 10−12 ) @ max. data rate

Total power Total energy efficiency

2.5 W

3.5 W

256 mW

675 mW

90.4 mW

8.26 W

122.5 mW

175.6 mW

125 pJ/b

87.5 pJ/b

10.2 pJ/b

15 pJ/b

3.62 pJ/b

13.77 pJ/b

4.9 pJ/b

2.2 pJ/b

Sensors 2017, 17, 1962

14 of 40

3. Si-Based Photodetectors In this section, we examine the basic terminologies of a PD. We also briefly discuss several PD structures based on silicon photonics and their operating principles. 3.1. Basic Terminology We first present some terminologies related to the PD characteristics. The quantum efficiency is the ratio of the number of generated electron-hole pairs to the number of incident photons. Since the quantum efficiency directly determines the sensitivity of the entire receiver, designing a PD to have a high quantum efficiency is crucial. This quality is generally expressed as: η=

I ph /q P/hν

(1)

The internal quantum efficiency is sometimes defined by de-embedding the loss occurring at the detector interface to evaluate the detector performance alone. The responsivity is a more frequently used quantity by engineers with a similar meaning to the quantum efficiency, and can be expressed as: R=

I ph ηq λ(nm) = =η P hν 1240

(2)

The bandwidth of the PD is determined by two time constants: the transit time and the RC time constant which arises from the series resistance and the junction capacitance of the PD. Hence, the total time constant should be considered for the estimation of the bandwidth as follows: q 2 2 τ = τtransit (3) −time + τRC 3.2. Si PD Silicon is a good waveguide material for telecommunication wavelengths due to its large bandgap energy equivalent to the energy of light with the wavelength of approximately 1100 nm. Therefore, ironically, silicon cannot be a good absorber for wavelengths longer than 1100 nm. Fortunately, some promising results have been obtained by enabling a sub-bandgap detection without introducing any other materials. It is known that silicon can also detect light of wavelength longer than 1100 nm if crystal defects are present, which cannot be easily explained theoretically, but can be proven experimentally [58–62]. By deliberately introducing defects into a p-n diode in a waveguide structure, the sub-bandgap detection in silicon can be realized. In [58], a silicon waveguide PD is implemented, wherein the quantum efficiency is enhanced by applying ion implantation. The implemented waveguide PD is illustrated in Figure 18. The diode is formed on a SOI substrate using a standard CMOS process. After the formation of the waveguide PD, the ion implantation is conducted to introduce the crystal defects. The lengths of the waveguides are 0.25 and 3 mm. The PD with a waveguide of length 0.25 mm shows a responsivity of sub-0.1 A/W at low reverse voltages; however, the responsivity can approach 1 A/W by increasing the reverse voltage at the cost of increased dark current, thereby degrading the minimum detectable power (MDP). The PD with a waveguide of length 3 mm can absorb virtually all the light, thus enhancing the responsivity at lower reverse voltages. However, the increased length directly results in higher junction capacitances, which severely degrades the frequency response. The estimated bandwidths of the PDs with the waveguides of lengths 0.25 and 3 mm are 10–20 GHz and 2 GHz, respectively. Further research advances the performance of the silicon waveguide PD through several process optimizations, providing a bandwidth of >35 GHz and an internal quantum efficiency of 0.5 to 10 A/W [59].

Sensors 2017,2017, 17, 1962 Sensors 17, 1962 Sensors 2017, 17, 1962

15 of 39 15 of 40 15 of 39

Si ion damage Si ion damage

SiO SiO22 Al Al

p+ p+

p p

i i

n n

n+ n+

Al Al

SiO SiO22 Figure 18. Cross-sectional view of waveguide PD [58]. Figure Cross-sectional view view of [58]. Figure 18.18. Cross-sectional of waveguide waveguidePD PD [58].

As clearly observed in [58,59], the silicon waveguide PD suffers from the tradeoff between the As clearly observed in in [58,59], PDsuffers suffersfrom from tradeoff between As clearly observed [58,59],the thesilicon silicon waveguide waveguide PD thethe tradeoff between the the quantum efficiency and the bandwidth. In order to overcome this severe tradeoff, the work in [60] quantum efficiency bandwidth.InInorder order to to overcome overcome this tradeoff, thethe work in [60] quantum efficiency andand thethe bandwidth. thissevere severe tradeoff, work in [60] suggests thatthat the the adoption of of a ring resonator can enhance theabsorption absorption with significantly reduced suggests adoption a ring resonator can enhance the with significantly reduced suggests that the adoption of a ring resonator can enhance the absorption with significantly reduced device dimension. As shown in Figure 19, thethe resonator-enhanced PD achieves responsivity of 0.14 device dimension. As shown in Figure resonator-enhancedPD PDachieves achievesaaaresponsivity responsivity of of 0.14 device dimension. As shown in Figure 19, 19, the resonator-enhanced A/W with a length 10 times shorter than thatthat of the straight waveguide for thesame same responsivity. A/W with a 10 length 10 times shorter of the straightwaveguide waveguide for responsivity. A/W0.14 with a length times shorter thanthan that of the straight forthe the same responsivity. The The darkdark current level is is also maintained nA, thus thusexhibiting exhibitinganan MDP of only current level also maintainedbelow below 0.2 0.2 nA, MDP of only 1.4 1.4 nW.nW. The dark current level is also maintained below 0.2 nA, thus exhibiting an MDP of only 1.4 nW. Moreover, a resistive heater employedfor for tuning tuning the wavelength. Moreover, a resistive heater isisemployed theresonance resonance wavelength. Moreover, a resistive heater is employed for tuning the resonance wavelength. Waveguide PD Waveguide PD Defect Defect Heater Heater

Resonance couple Resonance couple Figure 19. Schematic view of silicon resonator-enhanced PD [60]. Figure Schematicview viewof ofsilicon silicon resonator-enhanced PDPD [60]. Figure 19.19. Schematic resonator-enhanced [60].

The silicon PDs discussed so farfar are all based on the SOI platform.However, However, the realization of a silicon discussed are basedon onthe theSOI SOI platform. platform. of of The The silicon PDsPDs discussed so so far are allallbased However,the therealization realization a silicon PD onPDthe bulk CMOS platform would be essential foraatruly truly monolithic silicon photonic a silicon on the bulk CMOS platform would be essential for monolithic silicon photonic silicon PD on the bulk CMOS platform would be essential for a truly monolithic silicon photonic receiver. In [61], starting with a bulk silicon substrate, micro-ring resonator PD using polysilicon is receiver. In [61], starting with a bulksilicon siliconsubstrate, substrate, aaamicro-ring PDPD using polysilicon is is receiver. In [61], starting with a bulk micro-ringresonator resonator using polysilicon demonstrated, achieving a responsivity darkcurrent currentlevel level a gigahertz demonstrated, achieving a responsivityofof0.15 0.15 A/W, a dark of of 40 40 nA,nA, andand a gigahertz demonstrated, achieving a responsivity of 0.15 A/W, a dark current level of 40 nA, and a gigahertz frequency response. Recently, thework workin in[62] [62]achieves achieves further of overall performance frequency response. Recently, the furtherimprovements improvements of overall performance frequency response. Recently, the work in [62] achieves further improvements of overall performance by applying mid-level implants typical p-i-n p-i-n structure. advances in silicon PDsPDs are are by applying mid-level implants totoa atypical structure.The Therecent recent advances in silicon by applying mid-level implants to a typical p-i-n structure. The recent advances in silicon PDs are significant because they simplify overallprocess process with with aa minimal minimal addition indicating significant because they simplify thetheoverall additionofofsteps, steps, indicating a significant because they simplify the overall process with a minimal addition of steps, indicating a a better compatibility standard CMOSprocesses processes than than their Especially, better compatibility withwith the the standard CMOS theirGe-based Ge-basedcounterparts. counterparts. Especially, betterthe compatibility with the standard CMOS processes than their Ge-based Especially, silicon PD incorporated in a waveguide has a distinct advantage thatcounterparts. the integration with the silicon PD incorporated in a waveguide has a distinct advantage that the integration with a the silicon PD incorporated in a waveguide has a distinct advantage that theFurthermore, integration the with a a waveguide is readily realized, which is challenging for the Ge-based detectors. waveguide is readily realized, which is challenging for the Ge-based detectors. Furthermore, the resonator-based structure directly enables the WDM function, is a desirable feature for highly the waveguide is readily realized, which is challenging for the which Ge-based detectors. Furthermore, resonator-based structure directly enables the WDM function, which is a desirable feature for highly dense interconnection resonator-based structure[50]. directly enables the WDM function, which is a desirable feature for highly dense interconnection [50]. dense interconnection [50]. 3.3. Ge-Introduced PD

3.3. Ge-Introduced PD Another approach involves the introduction of Ge which has a smaller bandgap energy than Si. 3.3. Ge-Introduced PD By appropriately combining with Ge, the detectable be extended from 1100 nm, Another approach involvesSithe introduction of Ge wavelength which has acan smaller bandgap energy than Si. Another approach involves thenm, introduction of Ge which has a smaller bandgapFurthermore, energy than Si. with Si alone, to longer than 1550 covering all the telecommunication wavelengths. By appropriately combining Si with Ge, the detectable wavelength can be extended from 1100 nm, By appropriately combining Si with Ge, the detectable wavelength can be extended from 1100 nm, with Si alone, to longer than 1550 nm, covering all the telecommunication wavelengths. Furthermore, with Si alone, to longer than 1550 nm, covering all the telecommunication wavelengths. Furthermore, Ge exhibits higher mobility of electrons and holes, resulting in a faster detection. Therefore, Ge has Ge exhibits higher mobility of electrons and holes, resulting in a faster detection. Therefore, Ge has long been considered as an ideal candidate to replace the conventional III-V detectors. However, the long been considered as an ideal candidate to replace the conventional III-V detectors. However, the adoption of Ge is severely restricted due to the mismatch of the lattice constants between Si and Ge. adoption of Ge is severely restricted due to the mismatch of the lattice constants between Si and Ge. The lattice mismatch poses a constraint on the maximum Ge thickness that can be grown on Si

Sensors 2017, 17, 1962

16 of 40

Ge exhibits higher mobility of electrons and holes, resulting in a faster detection. Therefore, Ge has long been considered as an ideal candidate to replace the conventional III-V detectors. However, the adoption of Ge is severely restricted due to the mismatch of the lattice constants between Si and Ge. The lattice mismatch poses a constraint on the maximum Ge thickness that can be grown on Si without introducing crystal defects [63]. Consequently, Ge thickness is limited for maintaining the dark current level as low as possible, resulting in a low quantum efficiency. Hence, the epitaxial growth of a thick Ge layer on Si should be the key to the realization of the SiGe PD and, more importantly, it should be achieved in a CMOS-friendly manner. An intuitive way to obtain a thick Ge layer with minimal dislocations is by using a graded buffer layer [64]. Based on this technique, the work in [64] achieves an extremely low dark-current level which is comparable to the theoretical reverse saturation current. Even though it completely addresses the problem of lattice mismatch, it usually results in a tall PD structure; consequently, this technique is not compatible with CMOS back-end processes and waveguide coupling. Alternatively, a direct growth of Ge on Si with the aid of a thin Ge buffer layer can be more effective [65]. Starting from a thin Ge buffer layer formed at a low temperature of 350 ◦ C, the Ge layer can be grown to be sufficiently thick for the absorption of infrared light at a relatively high temperature of 600 ◦ C, thus avoiding the problem of crystal defects. Thus, the detection of a wavelength of 1300 nm can be successfully achieved with a responsivity of 240 mA/W. Although not reported in this work, the dark current level is estimated to be high because the defects are not clearly eliminated. In [66], it is demonstrated that the dislocation density can be remarkably reduced by applying cyclic thermal annealing at a high temperature of 900 ◦ C. Further extending the work in [65] by employing a cyclic thermal annealing process, the work in [67] achieves a high responsivity with a low dark-current level and the frequency response is suitable for gigabit operation. However, high-temperature annealing may not be compatible with the standard CMOS process, necessitating a different defect-handling technique for the seamless integration of the EPIC. The work in [68] demonstrates that the selective growth of Ge through multiple hydrogen annealing is also a good approach to reduce the dislocation density. 3.4. Integration of PD with Waveguide PDs can be categorized into two types according to coupling schemes: a free-space coupled PD and a waveguide-coupled PD as illustrated in Figure 20. In the case of the free-space coupled PD, a fiber can be directly coupled for a top illumination or a back illumination. In either case, the direction of the incoming light is always parallel to that of the carrier collection. Therefore, in order to enhance the quantum efficiency, the thickness of the absorption layer should be sufficiently large, which results in an increased transit time, and hence degrades the bandwidth. On the other hand, in the waveguide-coupled PD, the tradeoff between the quantum efficiency and the bandwidth can be relaxed. This is because the absorption can be improved by increasing the absorption length along the direction of the light, which does not affect the transit time. However, in this case, the increased junction capacitance may degrade the bandwidth, but not as much as in the case of the free-space coupled PD. Another benefit of the waveguide coupling is that the waveguide PD is more suitable for an on-chip WDM system, thus obviating the use of additional fibers. Evanescent coupling and butt coupling are most commonly used for waveguide coupling. Evanescent coupling is easy to realize whereas it is relatively difficult to design a structure for butt coupling. However, in terms of quantum efficiency, butt coupling with a well-designed waveguide is generally much better than evanescent coupling.

coupled PD. Another benefit of the waveguide coupling is that the waveguide PD is more suitable for an on-chip WDM system, thus obviating the use of additional fibers. Evanescent coupling and butt coupling are most commonly used for waveguide coupling. Evanescent coupling is easy to realize whereas it is relatively difficult to design a structure for butt coupling. However, in terms of quantum efficiency, butt coupling with a well-designed waveguide is generally much better than Sensors 2017, 17, 1962 17 of 40 evanescent coupling. Free-space coupling

Waveguide coupling Contact

n

n

i

i

p

p

Figure 20. Categorization Categorization of PD by coupling schemes.

4. CMOS Transimpedance Amplifier (TIA) The TIA is the first electronic circuit that directly interfaces with a photodetector (PD) for a current-to-voltage conversion. Therefore, the overall performance of the receiver is predominantly determined by that of the TIA. There are four main parameters defining the performance of the TIA: gain, bandwidth, noise, and power. Further, these performance parameters are strongly interdependent and strong tradeoff relations exist among them. Thus, for example, we cannot improve the gain and the bandwidth simultaneously without sacrificing the power. Traditionally, compound semiconductor devices were preferred for implementing optical interface circuits due to their high-speed capability. However, significant improvements in CMOS technologies have reduced the performance gap between these two technologies. Moreover, recently, further advancements in circuit technologies have led to CMOS implementations of extremely high-speed optical receivers whose operating speeds are higher than 40 Gb/s [53,69]. Besides enjoying the advantage of CMOS scalability, the success of CMOS-compatible silicon photonics also mandates the CMOS implementation of electronic circuits. Although most optical receivers still rely on hybrid integration, which interconnects multiple ICs from different technologies through bonding, monolithic integration is gaining increasing intention and some remarkable results have been demonstrated as previously mentioned. If realized, monolithic integration would be the best solution since it provides many benefits, such as enhanced signal integrity, small area, and reduced packaging cost, when compared to hybrid integration. In this section, we focus on CMOS realizations of the TIA by providing a brief overview of various TIA topologies reported so far. 4.1. Resistor-Based TIA The most intuitive way to implement the TIA is employing a single resistor as shown in Figure 21. In this configuration, the calculations of the gain, RT , and the bandwidth, f −3dB , are straightforward and can be expressed as follows: R T = R TI A (4) f −3dB =

1 2πR TI A CPD

(5)

As evident from the above equations, there is a severe tradeoff between the gain and the bandwidth. On the other hand, the total integrated noise at the output is given by: 2 Vn,out =

kT CPD

(6)

where k is the Boltzmann constant and T is the absolute temperature. Further, the signal-to-noise ratio (SNR) can be defined to assess the performance of the TIA as follows: SNR =

CPD 2 2 I R kT in TI A

(7)

Sensors 2017, 17, 1962

18 of 40

which indicates that increasing RTIA enhances the SNR indefinitely. However, a large RTIA introduces an inter-symbolic interference (ISI) and thus reduces the bandwidth, which limits the maximum value of RTIA for a given CPD . In general, the minimum bandwidth sufficiently suppressing the ISI is chosen to be 0.5–0.7 times the target data rate. Thus, the only way to enhance the SNR is increasing the input optical power, thereby providing a large input current. Furthermore, a large CPD renders this type of TIAs unsuitable for high-speed optical receivers. Nevertheless, this type of TIAs exhibits a distinct advantage in that it consumes virtually zero power. Recently, by effectively cancelling the ISI, a resistor-based TIA is successfully demonstrated at moderately high speed with Sensors 2017, 17, 1962 18 of 39 low power consumption [70]. Sensors 2017, 17, 1962 18 of 39

in IIin

Vout V out

CPD C PD

RTIA R TIA

Figure 21. Simple implementation of TIA using single resistor. Figure Figure 21. 21. Simple Simple implementation of TIA using single resistor.

4.2. Common-Gate-Based TIA 4.2. Common-Gate-Based 4.2. Common-Gate-Based TIA TIA As discussed in the previous section, a resistor-based TIA suffers badly from the gain-bandwidth As discussed in section, aa resistor-based TIA suffers badly the gain-bandwidth As discussed in the the previous previous section, resistor-based suffers badly from fromthe the disadvantage, gain-bandwidtha tradeoff, which results in a limited achievable SNR. InTIA order to overcome tradeoff, which results in a limited achievable SNR. In order to overcome the disadvantage, a tradeoff, which results in ahas limited achievable common-gate (CG) topology been widely used.SNR. In order to overcome the disadvantage, common-gate (CG) topology has been widely used. a common-gate (CG) topology been widely used. The basic CG amplifier as has a TIA is illustrated in Figure 22. Assuming a dominant pole is located The basic CG amplifier as a TIA is illustrated in Figure Figure 22. 22. Assuming aa dominant The basic CG amplifier as a TIA is illustrated in Assuming dominant pole pole is is located located at the input, the gain, RT, and the bandwidth, f-3 dB, are expressed as follows: at the input, the gain, R T, and the bandwidth, f-3 dB, are expressed as follows: at the input, the gain, RT , and the bandwidth, f −3dB , are expressed as follows: R R (8) RTT  R11 (8) R T = R1 (8) g m1 f 3dB  ggmm1 (9)  2πC1PD f −f3dB (9) (9) 3 dB = 22πC πCPD PD that thegain-bandwidth gain-bandwidth tradeoffis is now completely eliminated, indicating that the gain Note now completely eliminated, indicating thatthat the the gain can Note that that the the gain-bandwidthtradeoff tradeoff is now completely eliminated, indicating gain can be improved without degrading the bandwidth significantly. be without degrading the bandwidth significantly. canimproved be improved without degrading the bandwidth significantly.

R R11

Iin Iin CPD CPD

Vout Vout V VBB

M M11 I IBB

Figure 22. Common-gate Common-gate TIA. TIA. Figure Figure 22. 22. Common-gate TIA.

Even if the SNR can be enhanced by employing the CG TIA, a direct tradeoff between the Even if the SNR can be enhanced by employing the CG TIA, a direct tradeoff between the bandwidth and the power consumption exists, which limits the use of the CG TIA in the present bandwidth and the power consumption consumption exists, exists, which which limits limits the the use use of the CG TIA in the present configuration. Alternatively, a regulated-cascode (RGC) TIA shown in Figure 23 can further improve configuration. Alternatively, a regulated-cascode (RGC) TIA shown in Figure 23 can further improve the bandwidth by lowering the input resistance [71]. The gain of the RGC TIA is the same as that of the bandwidth by lowering the input resistance [71]. The gain of the RGC TIA is the same as that of the CG TIA and the bandwidth is given by (with the same dominant-pole condition): the CG TIA and the bandwidth is given by (with the same dominant-pole condition): g 1  g R  f dB  g mm11 1  g mm22 R22  (10) f 33dB  (10) 2πCPD 2πC

Sensors 2017, 17, 1962

19 of 40

configuration. Alternatively, a regulated-cascode (RGC) TIA shown in Figure 23 can further improve the bandwidth by lowering the input resistance [71]. The gain of the RGC TIA is the same as that of the CG TIA and the bandwidth is given by (with the same dominant-pole condition): f −3dB =

gm1 (1 + gm2 R2 ) 2πCPD

(10)

With linearly increasing IB , gm and thus f −3dB are increased approximately by the square of gm , thereby alleviating the direct tradeoff between bandwidth and power consumption. Owing to their clear advantage, many TIAs based on the RGC topology have been proposed. The bandwidth of the RGC TIA can be further enhanced by combining a shunt-shunt feedback [72] and a differential RGC TIA can also be configured [73]. Recently, RGC-based TIAs have been successfully demonstrated with Sensors 2017,speed 17, 1962higher than 25 Gb/s [54,56,74]. 19 of 39 operating

R1 R2

Vout V2

M1 Iin

M2

CPD

IB

Figure Figure 23. 23. Regulated-cascode Regulated-cascode TIA. TIA.

Despite the the advantages advantages of of the the RGC RGC TIA, TIA, itit is is not not suitable suitable in in the the recent recent CMOS CMOS trend trend wherein wherein the the Despite supply voltage is aggressively scaled down, because it suffers from a small voltage headroom due to supply voltage is aggressively scaled down, because it suffers from a small voltage headroom due stacking of the two NMOS transistors. Specifically, the output voltage, V out, in Figure 23 has to to stacking of the two NMOS transistors. Specifically, the output voltage, Vout , in Figure 23 has to accommodate the the gate-source gate-source voltage voltage of of M M2,, the the drain-source drain-source voltage voltage of of M M1,, and and the the voltage voltage across across accommodate 2 1 R 1. The output of the common-source (CS) amplifier, V2, also has to accommodate the gate-source R1 . The output of the common-source (CS) amplifier, V2 , also has to accommodate the gate-source voltages of ofM M1 and and M M2,, and and the the voltage voltage across across R R2.. These conditions render the RGC TIA inapplicable voltages 1 2 2 These conditions render the RGC TIA inapplicable to sub-1 sub-1 V V CMOS CMOS technologies. technologies. to As an alternative, As an alternative, aa CG-feedforward CG-feedforward TIA TIA illustrated illustratedin inFigure Figure24 24isisproposed proposedinin[75], [75],achieving achievinga gain of of the the awider widerbandwidth bandwidthand andrelaxing relaxingthe thevoltage voltageheadroom headroom simultaneously. simultaneously. Furthermore, Furthermore, the the gain CG feedforward TIA is approximately the same as that of the CG and the RGC TIA. The bandwidth CG feedforward TIA is approximately the same as that of the CG and the RGC TIA. The bandwidth can be be calculated calculated as: as: can g (1 + gm2 gm3 R2 R3 ) f −3dB = m1 (11) g 1  g m 2 g m3 R2 R3  PD f 3dB  m1 2πC (11)

2πC

PD The bandwidth is significantly enhanced than the RGC TIA and the voltage-headroom problems are also greatly mitigated. By employing this topology, the work in [75] a considerably high The bandwidth is significantly enhanced than the RGC TIA and theachieves voltage-headroom problems bandwidth of 20mitigated. GHz in the CMOS are also greatly Bystandard employing this platform. topology, the work in [75] achieves a considerably high

bandwidth of 20 GHz in the standard CMOS platform.

R1

R2

R3

Vout M1 M3 Iin CPD

VB

M2

IB "Inductor omitted"

Figure 24. CG-feedforward TIA.

f 3dB 

g m1 1  g m 2 g m3 R2 R3 

(11)

2πCPD

The bandwidth is significantly enhanced than the RGC TIA and the voltage-headroom problems are also greatly mitigated. By employing this topology, the work in [75] achieves a considerably high Sensors 2017, 17, 1962 20 of 40 bandwidth of 20 GHz in the standard CMOS platform.

R1

R2

R3

Vout M1 M3 Iin CPD

VB

M2

IB "Inductor omitted"

Figure Figure 24. 24. CG-feedforward CG-feedforward TIA. TIA.

4.3. Feedback-Based TIA 4.3. Along with withthe theCG-based CG-based TIAs, a shunt-shunt feedback TIAbeen has widely been widely employed in Along TIAs, a shunt-shunt feedback TIA has employed in optical optical receiver Figure 25ashows a basic feedback-based TIA topology. With amplifier, an ideal amplifier, the receivers. Figures.25 shows basic feedback-based TIA topology. With an ideal the gain and gainbandwidth and the bandwidth can be approximated the can be approximated as follows:as follows: Sensors 2017, 17, 1962

20 of 39

R R (12) R TT = RFF (12) 1 A f 3dB  1 + A (13) (13) f −3dB = 2πRF CPD 2πR F CPD As (13) As (13) indicates, indicates, the the bandwidth bandwidth can can be be enhanced enhanced by by the the higher higher gain gain of of the the amplifier. amplifier. Therefore, Therefore, designing an amplifier that exhibits a high gain and a low output impedance is crucial for designing an amplifier that exhibits a high gain and a low output impedance is crucial for achieving achieving good performance. performance. In In general, general,this thistype typeofofTIAs TIAsisismore moreadvantageous advantageous than CG-based topology good than thethe CG-based topology in in terms of voltage headroom and features a relatively simple architecture favorable for optimization. terms of voltage headroom and features a relatively simple architecture favorable for optimization.

RF

Iin

A

Vout

CPD

Figure 25. Shunt-shunt feedback TIA.

As shown shown in inFigure Figure26, 26,a avariety variety versions of the feedback-based attempted As of of versions of the feedback-based TIA TIA havehave beenbeen attempted with with different implementations of the amplifier. Figure 26a illustrates a traditional TIA different implementations of the amplifier. Figure 26a illustrates a traditional TIA implementation implementation which was first demonstrated with CMOS technology in [76] and achieved 1-Gb/s which was first demonstrated with CMOS technology in [76] and achieved 1-Gb/s operation. operation. Thishas topology has beenused widely has succeeded in achieving high-speed operations This topology been widely andused has and succeeded in achieving high-speed operations of 10 of 10 and 25 Gb/s [44,77]. Alternatively, as shown in Figure 26b, the combination of a common-source and 25 Gb/s [44,77]. Alternatively, as shown in Figure 26b, the combination of a common-source amplifier, aa source source follower, follower, and and aa feedback feedback resistor resistor has has been been commonly commonly employed employed to amplifier, to provide provide aa low low output impedance impedance[78–80]. [78–80].Especially, Especially,the the work [80] achieves a 10-Gb/s operation by combining output work in in [80] achieves a 10-Gb/s operation by combining this this topology with bandwidth-enhancement techniques. In order to achieve a higher gain, the topology with bandwidth-enhancement techniques. In order to achieve a higher gain, the feedback feedback can be applied to a multi-stage amplifier as illustrated in Figure 26c. In [81], the feedback can be applied to a multi-stage amplifier as illustrated in Figure 26c. In [81], the feedback resistor is resistor is connected between thethe input and of the of theinverter three-stage inverter connected between the input and output theoutput three-stage amplifier. Theamplifier. feedback The can feedback can also be applied to the differential topology as implemented in [82,83] with two-stage also be applied to the differential topology as implemented in [82,83] with two-stage and four-stage and four-stage amplifiers, respectively. In spite of several advantages of the feedback-based TIAs, they suffer from an inherent stability issue which is particularly evident in the multi-stage topologies. RF R1

R1 Iin

RF

M2 V

R

Iin

V

of 10 and 25 Gb/s [44,77]. Alternatively, as shown in Figure 26b, the combination of a common-source amplifier, a source follower, and a feedback resistor has been commonly employed to provide a low output impedance [78–80]. Especially, the work in [80] achieves a 10-Gb/s operation by combining this topology with bandwidth-enhancement techniques. In order to achieve a higher gain, the feedback can be applied to a multi-stage amplifier as illustrated in Figure 26c. In [81], the feedback Sensors 2017, 17, 1962 21 of 40 resistor is connected between the input and the output of the three-stage inverter amplifier. The feedback can also be applied to the differential topology as implemented in [82,83] with two-stage and four-stage amplifiers, In advantages spite of several advantages of the TIAs, feedback-based amplifiers, respectively. In respectively. spite of several of the feedback-based they sufferTIAs, from they suffer from an inherent stability issue which is particularly evident in topologies. the multi-stage topologies. an inherent stability issue which is particularly evident in the multi-stage RF R1

R1 Iin

RF

M2 Vout

CPD

M1

Iin

RF

Iin

Vout

CPD

M1

(a)

(b)

Vout

CPD

(c)

Figure 26. Various Figure 26. Various implementations implementations of of shunt-shunt shunt-shunt feedback feedback TIA. TIA. (a) (a) CS CS amplifier amplifier with with resistor resistor feedback, (b) Cascaded CS amplifier and source follower with resistor feedback, and (c) multi-stage feedback, (b) Cascaded CS amplifier and source follower with resistor feedback, and (c) multi-stage amplifier with resistor resistor feedback. feedback. amplifier with

4.4. 4.4. Inverter-Based Inverter-Based TIA TIA Based onthe theshunt-shunt shunt-shunt feedback, an inverter-based is currently most topology popular Based on feedback, an inverter-based TIA isTIA currently the mostthe popular topology it implemented was first implemented with technology a CMOS technology for aoperation 1-Gb/s operation and it wasand first with a CMOS for a 1-Gb/s 20 years 20 agoyears [84]. ago [84]. As shown in Figure 27, the inverter-based TIA features a very simple architecture consisting As shown in Figure 27, the inverter-based TIA features a very simple architecture consisting only only of2017, a CMOS a feedback resistor. Even thebandwidth bandwidthenhancement enhancementfactor factor is is limited limited Sensors 17,inverter 1962inverter 21 of 39 of a CMOS andand a feedback resistor. Even if if the to the gain of the one-stage inverter, this topology has numerous advantages over its competitors. to the gain of the one-stage inverter, this topology has numerous advantages over its competitors. First, First, owing owing to to its its simple simple architecture, architecture, the the voltage voltage headroom headroom is is significantly significantly mitigated mitigated and and no no biasing biasing circuit is needed. Second, since it can exploit g m of both the NMOS and the PMOS, high gm can be circuit is needed. Second, since it can exploit gm of both the NMOS and the PMOS, high gm can be achieved consumption [52]. achieved with with considerably considerably low low power power consumption [52]. Nevertheless, Nevertheless, the the inverter-based inverter-based TIA TIA has has not been frequently used, because it is not compatible with the traditional technologies. However, not been frequently used, because it is not compatible with the traditional technologies. However, there there is is aa major major contributor contributor to to the the widespread widespread use use of of this this topology. topology. The The CMOS CMOS technology technology continues continues to be developed in such a way that most of the performance are optimized for digital circuits, but but not to be developed in such a way that most of the performance are optimized for digital circuits, for analog counterparts, thereby forcing the inverter-based TIA to be the most suitable topology for not for analog counterparts, thereby forcing the inverter-based TIA to be the most suitable topology modern CMOS technologies. Therefore, after being revisited in [52],inthe inverter-based TIA is now for modern CMOS technologies. Therefore, after being revisited [52], the inverter-based TIA widely employed for high-speed and low-power applications [7,43,50,55,69,85–91]. In the is now widely employed for high-speed and low-power applications [7,43,50,55,69,85–91].state-ofIn the the-art implementation, the optical on based this topology a 64-Gb/s operation, state-of-the-art implementation, thereceiver opticalbased receiver on this demonstrates topology demonstrates a 64-Gb/s which is recorded the fastest everspeed achieved the CMOS platform operation, which isasrecorded as speed the fastest everinachieved in the CMOS[53]. platform [53].

M2 Iin CPD

RF

Vout

M1

Figure 27. Inverter-based TIA.

4.5. Integrating Receiver The aforementioned aforementioned topologies topologies are based on an amplifier which offers a high-bandwidth and low-noise characteristics characteristicsgenerally generally at the an increased power consumption. In overcome order to at the cost cost of anof increased power consumption. In order to overcome this, an integrating receiver using is proposed using a different methodafor achieving a high this, an integrating receiver is proposed a different method for achieving high sensitivity with sensitivity with a low-power consumption [92]. The basic concept is briefly explained in Figure 28. a low-power consumption [92]. The basic concept is briefly explained in Figure 28. The photocurrent The photocurrent is first integrated by a sampler using two non-overlapping phases, thus charging or discharging Vin according to the incoming data. After the integrating phase, the current value, Vn, is compared with the previous value, Vn−1, to make a proper decision. Based on this scheme, the work in [92] achieves a 1.6-Gb/s operation while consuming considerably low power. The works in [93,94] further extend this work, and achieve much higher operation speeds of 16 and 24 Gb/s, respectively.

4.5. Integrating Receiver The aforementioned topologies are based on an amplifier which offers a high-bandwidth and low-noise characteristics generally at the cost of an increased power consumption. In order to Sensors 2017, 17, 1962 22 of 40 overcome this, an integrating receiver is proposed using a different method for achieving a high sensitivity with a low-power consumption [92]. The basic concept is briefly explained in Figure 28. The photocurrent is first integrated a sampler using twophases, non-overlapping phases, thus charging is first integrated by a sampler usingby two non-overlapping thus charging or discharging Vin or discharging in according to the incoming data. After the integrating the current value, Vn, according to theVincoming data. After the integrating phase, the current phase, value, V , is compared with n is compared the value, Vn−1, todecision. make a proper decision. Based the on this scheme, work the previous with value, Vnprevious a proper Based on this scheme, work in [92] the achieves −1 , to make [92] achieves a 1.6-Gb/s while consuming considerably power. The works in extend [93,94] ain1.6-Gb/s operation whileoperation consuming considerably low power. Thelow works in [93,94] further further extend this work, andhigher achieve much higher operation speeds of 16respectively. and 24 Gb/s, respectively. this work, and achieve much operation speeds of 16 and 24 Gb/s, ϕ1 Do Iin

Vin

ϕ2

Vin Vn

ϕs De

Filter

Vn-1

Sampler

1 0 1 1 0 1 0 0 tn-1 tn

ϕs

T

Figure 28. Integrating receiver based on double sampling.

5. Bandwidth Extension Techniques Techniques

theCMOS CMOSfront-end front-endforfor photodetection, there many parasitic capacitances thatthe limit the In the photodetection, there are are many parasitic capacitances that limit circuit circuit bandwidth, such as the input capacitance of PD, parasitic capacitances of CMOS devices, and bandwidth, such as the input capacitance of PD, parasitic capacitances of CMOS devices, and output output of loading of TIAs. Therefore, as the required communication opticalCMOS links loading TIAs. Therefore, as the required communication bandwidth for bandwidth optical links for increases, increases, CMOS circuits have become a bottleneck for bandwidth. achieving such highcircuit bandwidth. Several circuits have become a bottleneck for achieving such high Several techniques for circuit techniques for bandwidth extension of TIAs and optical receivers have been presented in the bandwidth extension of TIAs and optical receivers have been presented in the literature [80,85,95–100]. The techniques are classified into two categories: an inductive peaking technique and an equalization technique. In this section, a brief summary and categorization of various bandwidth extension techniques are presented. 5.1. Inductive Peaking In this subsection, the basic principles of inductive peaking techniques are briefed and examples of CMOS photodetection circuits that adopt the peaking techniques are described. Inductive peaking technique has a long history and its integration into CMOS technology is well evaluated in the literature [101–103]. Nevertheless, an integrated inductor occupies a huge silicon area and thus increases the cost of the IC significantly. Therefore, the use of inductors in a photo-detecting IC should be considered carefully and the inductors should be optimized precisely. Fortunately, there are a lot of novel inductive peaking techniques, which have been developed to overcome the bandwidth limit of the legacy electrical link or ultra-wideband amplifier. Some of them have been adopted in high-speed photodetection circuits during the last two decades. Basically, there are two types of inductive peaking—shunt peaking and series peaking—and their examples with a simple CS stage are shown in Figure 29. Without inductive peaking, the transfer function of the CS stage is: H (s) =

gm R 1 + sRC

(14)

which is a simple one-pole system. On the other hand, for the shunt-inductive peaking, which is shown in Figure 29b, the inductor introduces a zero and an additional pole in the transfer function so that the transfer function becomes: H (s) =

gm (R + sL) 1 + sRC + s2 LC

(15)

which is a simple one-pole system. On the other hand, for the shunt-inductive peaking, which is shown in Figure 29b, the inductor introduces a zero and an additional pole in the transfer function so that the transfer function becomes: 𝐻(𝑠) =

Sensors 2017, 17, 1962

(a)

(b)

R

g m (R + sL) 1 + sRC + s 2 LC (c)

R L

Out

Out

In

In C

23(15) of 40

R

L

Out

C1

C2

In C

Figure Figure29. 29. Circuit Circuit diagrams diagramsof of(a) (a) CS CS stage stagewithout withoutinductive inductivepeaking, peaking,(b) (b)CS CSstage stagewith withshunt shuntpeaking, peaking, (c) (c)and andCS CSstage stagewith withseries seriespeaking. peaking.

Byplacing placingthethe zero properly in accordance with the positions of the poles, zero can By zero properly in accordance with the positions of the poles, the zero canthe compensate compensate the dominant pole, and therefore, it improves the circuit bandwidth [102]. the dominant pole, and therefore, it improves the circuit bandwidth [102]. From a qualitativeFrom pointa qualitative point of view, the inductor blocks the high-frequency current flowing through the resistor of view, the inductor blocks the high-frequency current flowing through the resistor so that most of so that of is thesolely bias current solely used for charging/discharging thewhereas load capacitor, the bias most current used forischarging/discharging the load capacitor, the biaswhereas current the bias current is divided into the resistor and the load capacitor in the normal CS stage. Therefore, is divided into the resistor and the load capacitor in the normal CS stage. Therefore, the rise/fall the rise/fall reduces, which that the circuit is enhanced. Thedetails quantitative time reduces,time which indicates thatindicates the circuit bandwidth is bandwidth enhanced. The quantitative of the details of the bandwidth of the peaking areand presented in [102], and bandwidth enhancement of enhancement the shunt peaking are shunt presented in [102], it is demonstrated thatittheis demonstrated that the enhancement maximum bandwidth enhancement ratiopeaking (BWER)isof1.84 the shunt peaking is 1.84 maximum bandwidth ratio (BWER) of the shunt with 1.5 dB peaking with 1.5 dB peaking in magnitude The is defined as the ratio of the 3-dB in magnitude response. The BWERresponse. is defined as BWER the ratio of the 3-dB bandwidth with anbandwidth inductive with an inductive peaking to that without peaking. On the other hand, it is slightly more difficult to peaking to that without peaking. On the other hand, it is slightly more difficult to understand the understand the principle of the series-inductive peaking in the s-domain model as compared to the principle of the series-inductive peaking in the s-domain model as compared to the shunt peaking shunt peaking because andofthe of the load capacitor introducepoles two but additional because the inductor andthe theinductor separation theseparation load capacitor introduce two additional do not poles but do not introduce any zero. Since there are three poles without a zero, the bandwidth introduce any zero. Since there are three poles without a zero, the bandwidth extension is dominated extension is dominated damping factor which of the is transfer function, which is difficult observe by the damping factor ofby thethe transfer function, difficult to observe intuitively. Ontothe other intuitively. On the approach other hand, a qualitative provides a much explanation.current Since the hand, a qualitative provides a muchapproach simple explanation. Since simple the high-frequency is high-frequency current is blocked by the series inductor, C1 and C2 are charged sequentially. blocked by the series inductor, C1 and C2 are charged sequentially. Extremely, C2 is charged after C1 is fully charged. In other words, the rise/fall time of the signal is reduced as a function of the ratio of C1 to C2, at the expense of increased delay [103]. The BWER of the series peaking can be higher than 2.5 with a specific ratio of C1 over C2 [102]. In Figure 30, some advanced inductive peaking techniques are illustrated. Figure 30a shows a shunt-and-series peaking, which combines the shunt peaking and the series peaking in Figure 29. In 2004, a UCLA group proposed to employ this technique for a post-amplifier preceded by an optical TIA and achieved a BWER of 3.5 in [104]. For reference, they used different terminology in their paper; they named this technique the triple-resonance architecture. The inductive peaking technique shown in Figure 30b is widely referred to as the shunt and double series peaking. By placing an additional inductor (L3) over the shunt and series peaking, better isolation of the capacitors can be obtained; hence, the circuit bandwidth can be further enhanced [103]. However, this technique requires three inductors, thus occupying extensive silicon area. A T-coil network shown in Figure 30c resolves this challenge [97,101,103]. The negative coupling between the two inductors (L1 and L2) affects the initial boost in the current flow to C2, which indicates that C2 is effectively connected in series with the negative mutual inductance element of the T-coil [102]. In other words, the T-coil network is equivalent to the shunt and double series peaking in Figure 30b, where three inductors form a ‘T’ shape. That is why this technique is called a ‘T-coil’ network. With the T-coil network, the number of inductors is reduced to two. Moreover, the two inductors or the transformer can be implemented in silicon IC while occupying lesser area than the two inductors without magnetic coupling, by using novel winding techniques for T-coils [102,103]. The achievable BWER of T-coil network is approximately 4.0 with a reasonable C1 over C2 ratio and gain peaking, and the detailed quantitative values of BWER of the T-coil network are provided in [102]. From now on, the inductive peaking techniques employed in the TIAs are introduced. In 2000, a Stanford group introduced the shunt inductive peaking into their TIA based on a resistive feedback

inductors is reduced to two. Moreover, the two inductors or the transformer can be implemented in silicon IC while occupying lesser area than the two inductors without magnetic coupling, by using novel winding techniques for T-coils [102,103]. The achievable BWER of T-coil network is approximately 4.0 with a reasonable C1 over C2 ratio and gain peaking, and the detailed quantitative values of BWER of the T-coil network are provided in [102]. Sensors 2017, 17, 1962 24 of 40 From now on, the inductive peaking techniques employed in the TIAs are introduced. In 2000, a Stanford group introduced the shunt inductive peaking into their TIA based on a resistive feedback CS CSstage stageas asshown shownin inFigure Figure31 31[95]. [95].AACG CGstage stageisisplaced placedbetween betweenthe theshunt-peaked shunt-peakedCS CSstage stageand and the the PD PD for for decoupling decoupling the the PD PD capacitance capacitance and and the the main main stage stage and and for for introducing introducing an an additional additional flexibility flexibilityto tooptimize optimizethe theshunt shuntinductor. inductor.With Withthe theproposed proposedtechnique, technique,they theyachieved achievedBWER BWERof of1.40 1.40 using 20 nH. nH. Moreover, Moreover, ititwas wasverified verifiedthat thattheir theirTIA TIAachieves achievesa abandwidth bandwidth using an an on-chip on-chip inductor of 20 of of 1.2 GHz when fabricated in 0.5-µm CMOS technology. To the best of the authors’ knowledge, the 1.2 GHz when fabricated in 0.5-µ m CMOS technology. To the the authors’ knowledge, the first firstintroduction introductionof ofthe theseries seriesinductive inductivepeaking peakinginto intohigh-speed high-speedphotodetections photodetectionswas wasperformed performedby by aaCaltech Caltechresearch researchgroup groupin in2004 2004[80]. [80].Their TheirTIA, TIA,which whichwas wasfabricated fabricatedin in0.18-µm 0.18-µ mCMOS, CMOS,consists consistsof of three threegain gainstages stagesin inorder orderto toachieve achievesufficient sufficienttransimpedance transimpedancegain gainand andincludes includesfour fourseries seriesinductors. inductors. (a)

(b)

R L1

L2

Out

L2

In

(c)

R L1 L3

Out

In C1

R

-M L2

L1

C1

C2

Out

In

C2

C1

C2

Figure30. 30. Circuit Circuit diagrams diagrams of of (a) (a) shunt shunt and and series series peaking, peaking, (b) (b) shunt shunt and and double-series double-series peaking, peaking, Figure (c) and T-coil peaking network. and17, T-coil Sensors(c) 2017, 1962peaking network. 24 of 39

RL L Out VB2 Rf

CL

CG stage

IPD

CPD VB1

Shunt-peaked TIA

Figure 31. TIA with shunt peaking presented in [80].

As shown shown in Figure 32, 32, each each series series inductor inductor separates As in Figure separates the the adjacent adjacent capacitors capacitors as as follows: follows: (1) (1) L1: L1: the PD capacitance and the TIA input capacitance, (2) L2 and L3: the junction capacitances of cascode the PD capacitance and the TIA input capacitance, (2) L2 and L3: the junction capacitances of cascode transistors, and and (3) (3) L4: L4: the the output output parasitic parasitic capacitance/bonding capacitance/bonding pad capacitance. The series transistors, pad capacitance. The four four series inductors achieve an overall BWER of 2.4, and each contribution of the inductors is summarized in inductors achieve an overall BWER of 2.4, and each contribution of the inductors is summarized in Figure 32. at at 1010 Gb/s is verified. In Figure 32. The The3-dB 3-dBbandwidth bandwidthofofthe theTIA TIAisis9.2 9.2GHz GHzand andthe theeye-diagram eye-diagram Gb/s is verified. addition to using series inductors between cascode transistors, a UCSD group also added series In addition to using series inductors between cascode transistors, a UCSD group also added series inductors between between the the cascaded cascaded gain gain stages stages [98]. [98]. Two series inductors, L2 and inductors Two series inductors, L2 and L3, L3, are are placed placed between between the cascaded amplifiers for further bandwidth extension while adopting a similar structure with the cascaded amplifiers for further bandwidth extension while adopting a similar structure with [80], [80], as shown it is as shown in in Figure Figure 33. 33. Moreover, Moreover, it is notable notable that that the the authors authors of of [98] [98] did did not not overlook overlook the the degradation degradation of phase phase linearity linearityowing owingto tothe theincreased increasedBWER. BWER.They Theyevaluated evaluatedthe the group delay response TIA of group delay response of of thethe TIA as as well as the BWER; thus, a group delay variation of 16 ps was achieved for their TIA whereas that well as the BWER; thus, a group delay variation of 16 ps was achieved for their TIA whereas that of the of the in [80] exceeds ps. As while a result, while fabricated 0.13-µthe m CMOS, thethe TIA the TIA in TIA [80] exceeds 50 ps. As50a result, fabricated in 0.13-µmin CMOS, TIA opens eyeopens diagram eye40diagram at 40 Gb/s and achieves better eye opening to [80]. In 2012, the proposed same group at Gb/s and achieves better eye opening compared tocompared [80]. In 2012, the same group to proposed to combine the series peaking technique with the inverter-based TIA, which more combine the series peaking technique with the inverter-based TIA, which is more appropriateisfor use appropriate for use in deepSOI sub-micron CMOS and with a lower voltage [69]. in deep sub-micron CMOS technology and SOI withtechnology a lower supply voltage [69]. supply With a 1.0-V supply With a 1.0-V supply voltage, the TIA achieves 3-dB bandwidth of 30 GHz while dissipating 9 mW. voltage, the TIA achieves 3-dB bandwidth of 30 GHz while dissipating 9 mW. They also emphasized the They also emphasized the important of optimizing the phase response, and they attempted to suppress the group delay variation for better phase response. Therefore, the group delay variation was suppressed to less than 8 ps, thus achieving better eye opening compared to their previous work in [98]. T-coil peaking technique was introduced in TIA design by Ehwa Womans University in 2010 [97]. As shown in Figure 34, CG topology was chosen and two T-coil networks were employed to

of phase linearity owing to the increased BWER. They evaluated the group delay response of the TIA as well as the BWER; thus, a group delay variation of 16 ps was achieved for their TIA whereas that of the TIA in [80] exceeds 50 ps. As a result, while fabricated in 0.13-µ m CMOS, the TIA opens the eye diagram at 40 Gb/s and achieves better eye opening compared to [80]. In 2012, the same group proposed to combine the series peaking technique with the inverter-based TIA, which is more Sensors 2017, 17, 1962 25 of 40 appropriate for use in deep sub-micron CMOS SOI technology and with a lower supply voltage [69]. With a 1.0-V supply voltage, the TIA achieves 3-dB bandwidth of 30 GHz while dissipating 9 mW. They also of emphasized important of optimizing the phasetoresponse, andgroup they delay attempted to important optimizing the the phase response, and they attempted suppress the variation suppress group delay variation for the better phase response. Therefore, the grouptodelay variation for betterthe phase response. Therefore, group delay variation was suppressed less than 8 ps, was to lesseye than 8 ps, thus achieving betterprevious eye opening to their previous work thussuppressed achieving better opening compared to their workcompared in [98]. T-coil peaking technique in [98]. T-coil peaking introduced TIA design Ehwa University 2010 was introduced in TIAtechnique design bywas Ehwa WomansinUniversity in by 2010 [97].Womans As shown in Figurein34, CG [97]. As shown in Figure CGT-coil topology was chosen and two T-coil networks were employed to topology was chosen and34, two networks were employed to extend the bandwidth. The first extend the bandwidth. first T-coil network (T-coil1) capacitances separates CPDoffrom the junctionand capacitances T-coil network (T-coil1)The separates CPD from the junction the transistors, the second of the network transistors, and the second T-coil network (T-coil2) with RL and T-coil (T-coil2) provides shunt-peaking with RL and provides separates shunt-peaking the junction capacitance and separates the junction capacitance and the output load capacitance. BWER of 2.3 and 3-dB bandwidth the output load capacitance. BWER of 2.3 and 3-dB bandwidth of 12.6 GHz were achieved, whereas of 12.6 GHz were achieved, the phase response was not fully optimized. the phase response was not whereas fully optimized.

Out

L4 VB

VB CPAD L2

L3

L1 -A

IPD

CPD

L1

1.48

L2

1.42

L3

1.62

L4

1.17

L1, L2

2.0 2.3

All

2.4

Figure Figure32. 32.Three-stage Three-stageTIA TIAwith withseries seriespeaking peakingpresented presentedin in[65]. [65].

(a) (a) VBP VBP

-A -A

L2 L2

L3 L3

25 of of 39 39 25

(b) (b)

VBP VBP L5 L5 VBN VBN

PD IIPD

BWER

L1, L2, L3 Source follower

Sensors 2017, 2017, 17, 17, 1962 1962 Sensors

L1 L1

L used

Out Out L1 L1 PD IIPD

L4 L4

L2 L2

L3 L3

Out Out

CPD CPD

CPD CPD

Figure 33. 33. TIA examples with series peaking between cascaded gain stages presented in (a) (a) [61] and Figure 33. TIA TIAexamples exampleswith withseries series peaking between cascaded gain stages presented in[61] (a) and [61] Figure peaking between cascaded gain stages presented in (b) [71]. [71]. and (b) [71]. (b)

RL R L T-coil1 T-coil1 CG stage stage CG PD IIPD

CPD C PD VB1 V B1

VB2 V B2

T-coil2 T-coil2 Out Out

Figure 34. 34. CG CG TIA TIA with T-coil peaking presented in [82]. Figure CG TIA with T-coil peaking presented in [82].

In addition addition to to the the inductive inductive peaking peaking techniques techniques shown shown in in Figures Figures 29 29 and and 30, 30, an an inductive inductive In In addition to the inductive peaking techniques shown in Figures 29 and 30, an inductive feedback feedback technique technique wherein wherein an an inductor inductor is is placed placed in in series series with with aa feedback feedback resistor resistor of of aa TIA TIA has has been been feedback technique wherein an inductor is placedAinstraightforward series with a feedback resistor of a TIA hasextension been proposed proposed in the literature [44,100,105]. intuition of the bandwidth using proposed in the literature [44,100,105]. A straightforward intuition of the bandwidth extension using in the literature [44,100,105]. A straightforward the bandwidth extension using inductive inductive feedback is summarized summarized in Figure Figure 35, 35,intuition where R Rof F and L denote the feedback resistance and inductive feedback is in where F and L denote the feedback resistance and feedback is summarized Figure 35, where R L denote the gain feedback resistance and inductance, F and inductance, respectively.in Assuming simple CS amplifier whose and 3-dB 3-dB bandwidth are -g -gmmR Rout out inductance, respectively. Assuming aa simple CS amplifier whose gain and bandwidth are and f p respectively, a resistive feedback improves the bandwidth by a factor of Rout/RL, where RL is the and fp respectively, a resistive feedback improves the bandwidth by a factor of Rout/RL, where RL is the resistance of of the the parallel parallel combination combination of of R Rout out and RF, whereas the gain is reduced by the same factor, resistance and RF, whereas the gain is reduced by the same factor, as shown in Figure 35a. On the other hand, insulting series inductor inductor to to R RFF,, as as shown shown in in Figure Figure 35b, 35b, as shown in Figure 35a. On the other hand, insulting aa series results in peaking as well as the bandwidth extension. At a low frequency, since the impedance of results in peaking as well as the bandwidth extension. At a low frequency, since the impedance of the inductor inductor is is almost almost zero zero so so that that it it can can be be neglected, neglected, the the transfer transfer function function with with the the inductor inductor becomes becomes the

Figure 34. CG TIA with T-coil peaking presented in [82].

In addition to the inductive peaking techniques shown in Figures 29 and 30, an inductive feedback technique wherein an inductor is placed in series with a feedback resistor of a TIA has been proposed using Sensors 2017,in 17,the 1962literature [44,100,105]. A straightforward intuition of the bandwidth extension 26 of 40 inductive feedback is summarized in Figure 35, where RF and L denote the feedback resistance and inductance, respectively. Assuming a simple CS amplifier whose gain and 3-dB bandwidth are -gmRout respectively. Assuming a simple CS amplifier gain and by 3-dB bandwidth are -gm Rout fp and fp respectively, a resistive feedback improveswhose the bandwidth a factor of Rout/R L, where RLand is the respectively, a resistive feedback improves the bandwidth by a factor of R /R , where R is the out by L the sameLfactor, resistance of the parallel combination of Rout and RF, whereas the gain is reduced resistance of the parallel combination of R and R , whereas the gain is reduced by the out F as shown in Figure 35a. On the other hand, insulting a series inductor to RF, as shown in same Figurefactor, 35b, as shown in Figure 35a. On the other hand, insulting a series inductor to R , as shown in Figure 35b, F since the impedance results in peaking as well as the bandwidth extension. At a low frequency, of results in peaking as well as the bandwidth extension. At a low frequency, since the impedance of the the inductor is almost zero so that it can be neglected, the transfer function with the inductor becomes inductor zero soinductor. that it can be neglected, the transfer function with the inductor becomes the the same is asalmost that without same as that without inductor. (a)

RF

Gain gmRout

-Av, fp

-20dB/dec

gmRL

fp fp(Rout/RL)

f

Av = gmRout, RF >> 1/gm, RL = Rout||RF

(b)

L

RF

Gain gmRout

-Av, fp

Av = gmRout, RF >> 1/gm

-20dB/dec

gmRL

L starts to block feedback

fp fp(Rout/RL)

f

Figure 35. 35. Effect Effect of of feedback feedback on on the the transfer transfer function function of of an an amplifier amplifier (a) (a) with with only only resistive resistive feedback feedback Figure (b) with with the the combination combination of of resistive resistive and (b) and inductive inductive feedback. feedback.

However, the impedance of the inductor increases as the frequency increases; hence, it becomes larger than that of RF at a certain frequency, which can be considered as a zero frequency. This indicates that the feedback strength decreases, and consequently, the transfer function begins to follow that without the resistive feedback. In other words, when the inductance dominates the overall feedback impedance, the feedback network becomes negligible; i.e., the transfer function exhibits a peaking as shown in Figure 35b, which follows the low-gain curve at a low frequency and the high-gain curve at a high frequency. Luxtera Inc. adopted the inductive feedback technique in the CS-stage-based TIA in 2006 [44]. They also reported that the inductor reduces the input-referred noise caused by the feedback resistor. In [100,105], the inductive feedback was employed in the inverter-based TIA by the authors’ group, in order to compensate the frequency dependent loss owing to the PD bonding parasitic impedance by utilizing the peaking illustrated in Figure 35. With inductive peaking, the 3-dB bandwidth of the TIA extends from 11.4 GHz to 25.2 GHz, which corresponds to a BWER of 2.2 dB. There is an interesting inductive peaking technique proposed by TSMC Inc. in 2014, which is called a shared inductor [85]. When an inductor is shared between adjacent amplifier stages, the number of inductors used in the photodetecting circuit halves; moreover, the size of each inductor can be further reduced. The fabrication in 28-nm CMOS technology is provided in [85], and it was reported that the overall chip area reduced by 56% in addition to a power saving of 27% as compared to the fabricated prototype chip with the conventional inductive peaking. Since the on-chip inductor occupies a large area in ICs, the shared inductor is a notable circuit technique because it can significantly reduce the chip area occupied by inductors. 5.2. Equalization In the electrical link receiver, various types of equalization techniques have been utilized to overcome the frequency dependent loss of the electrical transmission line. On the other hand, there has been little interest in using such equalization techniques for optical links because optical fibers

Sensors 2017, 17, 1962

27 of 40

provide almost infinite transmission bandwidth. However, equalization techniques have been adopted in recent state-of-the-art optical receivers in order to (1) overcome the limited O/E converting speed of PD, (2) break the tradeoff between high-gain and high-bandwidth, and (3) enhance the TIA bandwidth with minimal added noise. This section summarizes the equalization techniques for a photodetecting circuit that have been presented in leading journals and conferences recently. In 2010, a University of Toronto group proposed a TIA with negative Miller capacitances and a continuous-time linear equalizer (CTLE) [106] in order to overcome the limited speed of a spatially modulated PD. Figure 36 shows the principle of the Miller effect. In definition, the impedance is the coefficient of the voltage difference between the two ports of a device under test (DUT) over the current flowing through the DUT. In case there is an amplifier whose input and output are connected to the two ports of a DUT, the voltage applied to the DUT is influenced by the amplifier as well as the input voltage. Therefore, the amplifier gain should be included in the impedance. For example, the input impedance becomes Z/(1-A), where Z and A represent the original impedance of the DUT and gain of the amplifier, respectively. It can be achieved by using the Kirchhoff’s current law (KCL) at the input node, as long as the amplifier sets the ratio of the output voltage over the input voltage to A. Because the Miller effect is a truly well-known circuit effect, further explanations can be found in many textbooks, i.e. [107]. In [106], a pair of capacitors is placed between the input and output of a differential CS stage TIA, which have the same polarity, as shown in Figure 36b. Owing to the Miller effect, the input capacitance of the TIA becomes CCS + C(1-ACS ), where CCS , C, and ACS are the input capacitance of the CS stage, Miller capacitance, and gain of the CS stage, respectively. Notably, the second capacitance term of the Miller capacitance has a negative value when the gain of the CS stage is larger than unity. That is why this technique is referred to as the negative Miller capacitance. Accordingly, the input capacitance becomes lower than CCS as long as the gain of the CS stage is larger than unity, thus Sensors 2017, 17, 1962improving the bandwidth of the preceding amplifier. 27 of 39 (a)

C

(b)

RL

Z CS A Zin = Z/(1-A)

out+

in+

in-

Cin CS stage ACS = gmRL

C Cin = CCS+C(1-gmRL)

RL

out-

IBIAS

Figure Figure36. 36.(a) (a)Miller Millereffect effect(b) (b)Miller Millercapacitance capacitanceapplied appliedin inaadifferential differential CS CS stage stage in in [91]. [91].

A CTLE introduces a similar effect with the inductive feedback technique discussed in the A CTLE introduces a similar effect with the inductive feedback technique discussed in the previous section. A CTLE is generally implemented with a CS stage with source degeneration using previous section. A CTLE is generally implemented with a CS stage with source degeneration using a combination of a resistor and a capacitor. As shown in Figure 37, the source degeneration with a a combination of a resistor and a capacitor. As shown in Figure 37, the source degeneration with resistor (RS in Figure 37b) decreases the gain of the CS stage by a factor of (1 + gmRS) while the circuit a resistor (RS in Figure 37b) decreases the gain of the CS stage by a factor of (1 + gm RS ) while the bandwidth is extended by the same factor [107]. By placing a degeneration capacitor (CS) in parallel circuit bandwidth is extended by the same factor [107]. By placing a degeneration capacitor (CS ) with RS, the degeneration strength becomes weakened at a high frequency so that the transfer in parallel with RS , the degeneration strength becomes weakened at a high frequency so that the function follows that of the normal CS stage, which is similar to the effect of the inductive feedback. transfer function follows that of the normal CS stage, which is similar to the effect of the inductive As a result, the CTLE achieves a high-frequency peaking in its transfer function as shown in feedback. As a result, the CTLE achieves a high-frequency peaking in its transfer function as shown in Figure 37d. The overall photodetecting circuit presented in [106] is shown in Figure 38. Two identical Figure 37d. The overall photodetecting circuit presented in [106] is shown in Figure 38. Two identical CS stages are cascaded for high gain, and the negative Miller capacitance is used for the second stage CS stages are cascaded for high gain, and the negative Miller capacitance is used for the second stage to reduce the load capacitance driven by the first CS stage. After the cascaded CS stage, CTLE is used to reduce the load capacitance driven by the first CS stage. After the cascaded CS stage, CTLE is used to compensate the high-frequency loss owing to the limited bandwidth of a spatially modulated PD, to compensate the high-frequency loss owing to the limited bandwidth of a spatially modulated PD, which is monolithically integrated in a single chip. Therefore, the work in [106] achieves 5-Gb/s data which is monolithically integrated in a single chip. Therefore, the work in [106] achieves 5-Gb/s data rate while the 3-dB bandwidth of the spatially modulated PD is only 700 MHz. rate while the 3-dB bandwidth of the spatially modulated PD is only 700 MHz. (a)

(b) RL

In

RL

Out CL

(c)

In

RL

Out CL

(d)

In

Out CL

Gain gmRL (a) gmRL 1+gmRS

(c)

(b)

Figure 37d. The overall photodetecting circuit presented in [106] is shown in Figure 38. Two identical CS stages are cascaded for high gain, and the negative Miller capacitance is used for the second stage to reduce the load capacitance driven by the first CS stage. After the cascaded CS stage, CTLE is used to compensate the high-frequency loss owing to the limited bandwidth of a spatially modulated PD, which is monolithically integrated in a single chip. Therefore, the work in [106] achieves 5-Gb/s data Sensors 2017, 17, 1962 28 of 40 rate while the 3-dB bandwidth of the spatially modulated PD is only 700 MHz. (a) (a)

(b) (b) RLL R

In In

RLL R

Out Out CLL C

(c) (c)

In In

RLL R

Out Out In In

CLL C

RSS R A == -g -gmmR RLL A

(d) (d)

RLL ggmmR A == -A 1+gmmR RSS 1+g

Out Out CLL C

RSS R

Gain Gain RLL ggmmR (a) (a) RLL ggmmR 1+gmmR RSS (c) 1+g (c)

(b) (b)

CSS C 1/(RSSC CSS)) 1/(R

1/(RLLC CLL)) 1/(R

w w

Figure Figure37. 37.Summary Summaryof ofCTLE CTLEbased basedon onsource sourcedegenerated degeneratedCS CSstage. stage.(a–c) (a–c)Circuit Circuitdiagrams diagramsof ofnormal normal Figure 37. Summary of CTLE based on source degenerated CS stage. (a–c) Circuit diagrams of normal CS stage, CS stage with source degeneration resistor, and CS stage with source degeneration resistor CS stage, stage, CS CS stage stage with with source source degeneration degeneration resistor, resistor, and and CS CS stage stage with with source source degeneration degeneration resistor resistor CS and andcapacitor capacitor(CTLE). (CTLE).(d) (d)Transfer Transferfunction functionof ofCTLE. CTLE. and capacitor (CTLE). (d) Transfer function of CTLE.

IIPD PD

CS CS

CS CS CTLE CTLE

2-stage CS TIA

CTLE for boosting

Figure38. 38.Overall Overallstructure structureof of[91]. [91]. Figure 38. Overall structure of [91]. Figure

In order order to to achieve achieve sufficient sufficient high-frequency high-frequency boosting boosting and and to to overcome overcome noise noise amplification amplification of of In CTLE, a non-linear equalization technique of a decision feedback equalizer (DFE) has been CTLE, a non-linear equalization technique of a decision feedback equalizer (DFE) has been introduced introduced for optical The DFE hasused beeninwidely used in to electrical linksfor to for optical receivers. Thereceivers. DFE technique hastechnique been widely electrical links compensate compensate for inter-symbol interference (ISI) caused by the frequency dependent loss and inter-symbol interference (ISI) caused by the frequency dependent loss and reflections [9,25,108–110]. reflections [9,25,108–110]. Because a theoretical background of DFE can be found in amany Because a theoretical background of DFE can be found in many publications [9,111–114], only brief introduction is provided in this paper for better clarification of the focus. Owing to the loss and reflections, a single transmitted bit cannot complete its transition within a bit period or a unit-interval (UI) so that it influences the following bit sequence. The basic concept of the DFE is that the previously received bit sequence is used when a receiver makes a decision whether the currently received bit is ‘zero’ or ‘one’, by feeding back the previously decided bit sequence to the input as shown in Figure 39. That is, the DFE uses consecutive multiple bits to distinguish a single bit because the bits interact with each other due to the ISI. The degree of compensation is closely related to the number of DFE taps, which determines the length of the previous bit sequence to be included in the decision of the current bit. However, the power consumption of DFE circuitry is proportional to the number of taps. This indicates that the DFE requires large power consumption to achieve a sufficient compensation, which renders it less attractive to adopt the DFE for optical links. In 2013, however, an IBM Research group proposed an optical receiver by employing the DFE with infinite impulse response (IIR) feedback [70]. Its most interesting feature is that the receiver uses a resistor-based TIA for I–V conversion as shown in Figure 40a. As mentioned earlier, the resistor-based TIA should sacrifice the circuit bandwidth for gain and SNR. In this approach, a large resistor is used to achieve high gain and SNR at the expense of reduced bandwidth, and the resulting ISI is compensated by DFE. Moreover, only one tap DFE is used to compensate the ISI, which arises from a nearly infinite number of preceding bits, by using IIR feedback, and therefore, the overhead of a large number of DFE taps is significantly reduced. In contrast to electrical links where the ISI is mainly caused by external components so that is difficult to predict, the RC time constant between the PD capacitance and the resistor, which is located in the IC and is truly predictable, determines the ISI in a resistor-based receiver. As a result, by making the first feedback tap experience the same RC time constant as the input signal, the feedback signal is exponentially decayed in the same manner as the currently incoming signal so that it can compensate

feedback, and therefore, the overhead of a large number of DFE taps is significantly reduced. In contrast to electrical links where the ISI is mainly caused by external components so that is difficult to predict, the RC time constant between the PD capacitance and the resistor, which is located in the IC and is truly predictable, determines the ISI in a resistor-based receiver. As a result, by making the Sensors 2017, 17, 1962 29 of 40 first feedback tap experience the same RC time constant as the input signal, the feedback signal is exponentially decayed in the same manner as the currently incoming signal so that it can compensate infinitetaps tapsofof as shown in Figure 40b. the With the IIR-DFE technique, the resistor-based TIA infinite ISIISI as shown in Figure 40b. With IIR-DFE technique, the resistor-based TIA achieves achieves a data rate of 9 Gb/s. In [115], a University of Toronto group extended the adoption of a data rate of 9 Gb/s. In [115], a University of Toronto group extended the adoption of IIR-DFE to IIRthe DFE to the RGC TIA structure, where the dominant pole is located at the output of the TIA. RGC TIA structure, where the dominant pole is located at the output of the TIA. Bit Decision In

x c1

Z-1

1st tap

x c2

Z-1

2nd tap

Figure39. 39. Simplified Simplified block block diagram diagram of ofDFE. DFE. Figure (b) Clocked Comparator

RTIA2

CPD

Out

VSUM

Iin

Z-1 RTIA1

IDFE

Buffer

Normalized Voltage

(a)

VSUM w/o DFE

ISI tail

DFE tap signal VSUM w/ DFE

0

1

2

3

4

5

Current bit position

6

7

8

Time (UI)

Figure 40. 40. Resistor-based Resistor-based TIA TIA with with IIR-DFE IIR-DFEproposed proposedin in[55]. [55]. Figure

Compared with [70], the dominant pole is moved from the TIA input to the TIA output; therefore, the DFE feedback subtraction is placed at the output of the TIA in [115]. A data rate of 20 Gb/s is achieved without using any inductive peaking technique with a relative small area of 0.027 mm2 . 6. Clock and Data Recovery (CDR) Circuits 6.1. CDR Basic Because the transmission of binary data in optical communications is based on a time-division multiplexing [116], just amplifying the incoming signal is not sufficient to recover the data [93]. The other requirement for the complete recovery of the data is to retime the incoming data with a clock whose frequency is well-aligned with the bit-rate of the data. In general, the retiming operation is based on sample-and-regeneration using a clocked sense amplifier [117,118]. That is, the timing information of the data is recovered by sampling, while the value of binary data is gathered by sampling and recovered to digital rail-to-rail amplitude by regenerating. Notably, the SNR at the sense amplifier input is restored after being regenerated at the sense amplifier output, unlike linear amplifiers where the input SNR propagates to the output, because the regeneration is based on the positive feedback. Rather, the input SNR leads to a probability of wrong decisions while the output has little noise component. Therefore, during the retiming operation, it is very important to sample the data at a proper time at which the SNR is maximized. Unlike the retiming of the digital signal domain, where only the requirements of the setup and hold time are considered, the error rate from the retiming of analog domain varies with the timing, even though the timing satisfies the setup and hold time requirements. Figure 41a illustrates this aspect. In [119], the theoretical dependency of BER on the sampling timing is analyzed. The calculated BER as a function of the sampling timing using

has little noise component. Therefore, during the retiming operation, it is very important to sample the data at a proper time at which the SNR is maximized. Unlike the retiming of the digital signal domain, where only the requirements of the setup and hold time are considered, the error rate from the retiming of analog domain varies with the timing, even though the timing satisfies the setup and Sensors 2017, requirements. 17, 1962 30 BER of 40 hold time Figure 41a illustrates this aspect. In [119], the theoretical dependency of on the sampling timing is analyzed. The calculated BER as a function of the sampling timing using the formula provided in [119] is shown in Figure 41b, where a maximum SNR of 10 is used for the the formula provided in [119] is shown in Figure 41b, where a maximum SNR of 10 is used for the calculation. For the sake of simplicity, it is assumed that there is no random timing error, which is calculation. For the sake of simplicity, it is assumed that there is no random timing error, which is referred to as a jitter. As observed in Figure 41b, the BER is exponentially degraded as the sampling referred to as a jitter. As observed in Figure 41b, the BER is exponentially degraded as the sampling timing alters from the ideal position. A CDR circuit is used as the building block of an optical receiver, timing alters from the ideal position. A CDR circuit is used as the building block of an optical receiver, and it generates the sampling clock with a precise frequency and drives the sampling timing to the and it generates the sampling clock with a precise frequency and drives the sampling timing to the ideal position. In conventional optical receivers, the importance of CDR has been underestimated or ideal position. In conventional optical receivers, the importance of CDR has been underestimated or even ignored for a long time [93]. However, recently, there have been some engineering examples even ignored for a long time [93]. However, recently, there have been some engineering examples that that include the CDR circuits for better performance and robustness. This section provides the CDR include the CDR circuits for better performance and robustness. This section provides the CDR circuit circuit examples used in a high-speed photodetection receiver. examples used in a high-speed photodetection receiver. (a)

(b)

11

Noise

Bit Error Rate

Degraded SNR

Decision threshold

Max SNR

-4 10 0.0001 -8 10 1E-08 -12 101E-12 -16 101E-16 -20 101E-20

Optimum Sampling

-24 101E-24 -0.5 -0.5

Sub-optimum Sampling

00 (eye center)

0.5 0.5

Sampling Position (UI)

Figure 41. 41. Dependency Dependency of BER BER on on sampling sampling timing. timing. (a) Qualitative Qualitative explanation explanation of BER BER degradation degradation and (b) quantitative relation relation of of BER BER with with sampling sampling timing timing using using the the formula formula in in [105]. [105].

6.2. CDR Examples Figure a block diagram of a of receiver, whichwhich includes CDR circuitry. First, a TIA converts Figure 42 42shows shows a block diagram a receiver, includes CDR circuitry. First, a TIA Sensors 2017,and 17, 1962 30 of 39 and amplifies the input current-mode signal from signal PD intofrom a voltage-mode Subsequently, converts amplifies the input current-mode PD into a signal. voltage-mode signal. aSubsequently, clocked sampler such sampler as a StrongArm retimeslatch the retimes voltage-mode signal withsignal the sampling a clocked such as alatch StrongArm the voltage-mode with the sampling and also regenerates thetosignal to rail-to-rail digital rail-to-rail The operation principles clock and clock also regenerates the signal digital swing. swing. The operation principles and and characterization the clocked samplers can beinfound As mentioned the characterization of theofclocked samplers can be found [118]. in As [118]. mentioned above, theabove, frequency frequency of theclock sampling should be well-aligned withof the of data. the incoming data. of the sampling shouldclock be well-aligned with the bit-rate the bit-rate incoming Depending on Depending on the applications and architectures, the sampling clock can be generated within the the applications and architectures, the sampling clock can be generated within the CDR circuit and CDRalso circuit and can from also be provided from a transmitter or a global clock The generation circuit. and The can be provided a transmitter or a global clock generation circuit. categorization categorization analysis on the clocking architecture are provided in [120,121]. analysis on the and clocking architecture are provided in [120,121]. The relative phase ofThe the relative sampledphase data of the sampled data and the clock is compared in the phase detector circuit. According to the relative and the clock is compared in the phase detector circuit. According to the relative phase information phase information obtained froma the phase detector, a phase circuit aligns obtained from the phase detector, phase adjusting circuit alignsadjusting the sampling timing to the the sampling optimum timing to the optimum position, which is the center of the eye in Figure 41. position, which is the center of the eye in Figure 41. Clock and Data Recovery (CDR) Clocked Sampler

Iin

Data Phase Detector

TIA

CPD

Clock

Lead/Lag

Aligned Clock

Phase Adjust Circuit

Figure 42. 42. Simplified including CDR. CDR. Figure Simplified block block diagram diagram of of aa complete complete high-speed high-speed photodetecting photodetecting receiver receiver including

The first CDR example to be introduced was presented in [44] by Luxtera Inc. in 2006. As shown The first CDR example to be introduced was presented in [44] by Luxtera Inc. in 2006. As shown in Figure 43, it consists of two negative feedback loops: one for the precise phase alignment and the in Figure 43, it consists of two negative feedback loops: one for the precise phase alignment and the other for the initial frequency acquisition. In order to avoid interference between the loops, a locking procedure that locks the loops sequentially is used in the CDR. At the initial stage, the frequency acquisition loop is enabled until a frequency lock detector determines that the frequency of the CDR clock is sufficiently close to the data rate of the incoming data using an externally provided reference frequency. Subsequently, the CDR enters the normal CDR operation state, where the lock detector

Aligned Clock

Adjust Circuit

Figure 42. Simplified block diagram of a complete high-speed photodetecting receiver including CDR. SensorsThe 2017, 17, 1962 31 of 40 first CDR example to be introduced was presented in [44] by Luxtera Inc. in 2006. As shown

in Figure 43, it consists of two negative feedback loops: one for the precise phase alignment and the other for for the the initial acquisition. In between the the loops, loops, aa locking other initial frequency frequency acquisition. In order order to to avoid avoid interference interference between locking procedure that locks the loops sequentially is used in the CDR. At the initial stage, the frequency frequency procedure that locks the loops sequentially is used in the CDR. At the initial stage, the acquisition loop lock detector detector determines determines that of the the CDR CDR acquisition loop is is enabled enabled until until aa frequency frequency lock that the the frequency frequency of clock is sufficiently close to the data rate of the incoming data using an externally provided reference clock is sufficiently close to the data rate of the incoming data using an externally provided reference frequency. Subsequently, the CDR CDR enters enters the the normal normal CDR CDR operation operation state, state, where where the the lock lock detector detector frequency. Subsequently, the activates the phase alignment alignment loop loop and and simultaneously simultaneously disables frequency acquisition acquisition loop activates the phase disables the the frequency loop in in order to track the optimum sampling phase. Although the frequency acquisition loop is disabled order to track the optimum sampling phase. Although the frequency acquisition loop is disabled in in the normal operation, the residual frequency error can be tracked only with the phase alignment loop, the normal operation, the residual frequency error can be tracked only with the phase alignment loop, as long long as as the the frequency frequency error error is is within within the the lock-in lock-in range range [122,123]. [122,123]. as Phase Alignment Loop Front-end Circuit

Iin

TIA

Lead Linear Charge Phase Lag Pump Detection Logic

LA

Enable

CPD CDR CLK

VCO

Control voltage

Freq. Lock Detector

Div. Lead PFD

Frequency Acquisition Loop

Ref. CLK

Lag

Charge Pump

Enable

Figure 43. Sequential locking CDR presented in [29].

If a large frequency error out of the lock-in range is introduced, the lock detector re-enables the If a large frequency error out of the lock-in range is introduced, the lock detector re-enables the frequency acquisition loop. Notably, a single voltage-controlled oscillator (VCO) generating the CDR frequency acquisition loop. Notably, a single voltage-controlled oscillator (VCO) generating the CDR clock is shared between the two loops. Since the clock frequency generated by the VCO is clock is shared between the two loops. Since the clock frequency generated by the VCO is proportional to the control voltage of the VCO, the phase of the clock is adjusted by changing the control voltage of the VCO for a short period of time, whereas the control voltage is changed permanently for the frequency acquisition. In the front-end, a limiting amplifier, which consists of five-stage cascaded gain stages, restores the low-swing signal from TIA to the logic level signal because this CDR uses a linear phase detection scheme that requires logical processing of data before recovery. The linear phase detection scheme provides predictable loop dynamics of the CDR as compared to the non-linear phase detection, which is represented by bang-bang phase detection or Alexander phase detection. However, a large amount of power is dissipated in this CDR structure owing to the limiting amplifier stage and the logic gate operating at a high data rate. Further detailed explanations and comparisons of the phase detecting schemes can be found in many publications, such as [104]. With the implementation of the CDR, the receiver achieves an exceptional sensitivity of −19.5 dBm at the data rate of 20 Gb/s. In [44], the fabrication of an optical TX in 0.13-µm technology is also provided, and the overall power consumption of the transmitter and the receiver is 1.25 W per channel. In [77,124], a National Taiwan University group achieved a data rate of 25 Gb/s per channel using a similar structure. In order to overcome the limited speed of the conventional linear phase detector in [44], the studies in [77,124] replaced the conventional linear phase detector with a mixer-based phase detector. While fabricated in 65-nm CMOS technology, the front-end circuit and the CDR circuit dissipate 40 mW and 100 mW, respectively, at 25 Gb/s per channel. The subsequent CDR example to be introduced was [93] presented by a Stanford group in 2008. The work in [93] adopted an integration-based front-end presented in [125], as shown in Figure 44. Unlike the sequential locking CDR presented in [44], the work in [93] adopted a dual-loop CDR

using a similar structure. In order to overcome the limited speed of the conventional linear phase detector in [44], the studies in [77,124] replaced the conventional linear phase detector with a mixerbased phase detector. While fabricated in 65-nm CMOS technology, the front-end circuit and the CDR circuit dissipate 40 mW and 100 mW, respectively, at 25 Gb/s per channel. The subsequent CDR example to be introduced was [93] presented by a Stanford group in 2008. Sensors 2017, 17, 1962 32 of 40 The work in [93] adopted an integration-based front-end presented in [125], as shown in Figure 44. Unlike the sequential locking CDR presented in [44], the work in [93] adopted a dual-loop CDR architecture phase alignment alignmentand andfrequency frequencyacquisition acquisitionloops loops operate simultaneously. architecture where where the phase operate simultaneously. In In this typeofofCDR, CDR,the theloop loopbandwidth bandwidthofofaaloop loopshould shouldbe be set set an an order order of magnitude magnitude lower this type lower than thanthat that of ofthe theother otherloop, loop,in inorder orderto toprevent preventthe theinterference interferencebetween betweenthe theloops loops[119,126]. [119,126]. In In [93], [93],the theloop loop bandwidth bandwidthof ofthe thephase phasealignment alignmentloop loopwas wasdesigned designedto tobe belower lowerthan thanthat thatof ofthe thefrequency frequencyloop loopto to suppress suppressthe thejitter jittertransfer transfercaused causedby bythe theinput inputdata. data.Contrary Contraryto tothe thesequential sequentiallocking lockingCDR, CDR,the theVCO VCO isisnot notshared sharedbetween betweenthe theloops. loops.An Anadditional additionalphase phaseadjusting adjustingcircuit circuitbased basedon onaaphase phaseinterpolator interpolator (PI) the phase alignment loop. It cannot generate a clock but can the phase of the input (PI)isisused usedinin the phase alignment loop. It cannot generate a clock butchange can change the phase of the clock by the by VCO to the relative phase information extracted from the non-linear inputprovided clock provided theaccording VCO according to the relative phase information extracted from the nonphase The phase-adjusted clock is fed back intoback the VCO through phase-frequency detector lineardetector. phase detector. The phase-adjusted clock is fed into the VCO athrough a phase-frequency detector (PFD) sooutput that thephase output of the VCO is eventually aligned data. Fabricated in (PFD) so that the of phase the VCO is eventually aligned with thewith data.the Fabricated in 90-nm 90-nm technology, CMOS technology, the integration-based front-end and the CDRconsume circuit consume 23 mW and CMOS the integration-based front-end and the CDR circuit 23 mW and 35 mW, 35 mW, respectively, andreceiver achievesensitivity receiver sensitivity of −5.4 a data of 16 Gb/s. respectively, and achieve of −5.4 dBm at adBm data at rate of 16rate Gb/s. Phase Alignment Loop Iin

Integrating Sampler

Phase Detector

CPD

Lead/ Lag

Frequency Acquisition Loop

PI-based Phase Adjust

Div.

Lead Charge Pump

Lag

PFD

Ref. CLK

VCO

Figure44. 44.Dual Dualloop loopCDR CDRwith withintegration-based integration-basedfront-end front-endpresented presentedinin[78]. [78]. Figure

As CMOS technology scales down, many non-idealities appear in analog circuit components As CMOS technology scales down, many non-idealities appear in analog circuit components because the scaling mainly focuses on digital circuit elements. In order to overcome the various because the scaling mainly focuses on digital circuit elements. In order to overcome the various performance degradations introduced by the analog loop filters in previous CDR examples in performance degradations introduced by the analog loop filters in previous CDR examples in [44,77,93,124], a recently developed all-digital CDR technology, which replaces the analog loop filter with a digitally synthesized logic, was adopted in an optical receiver by the authors’ group in [89]. In deep-submicron CMOS technology, a leakage current flowing through a MOS capacitor becomes no longer negligible due to the direct tunneling across the gate oxide [127]. Therefore, a considerable amount of current flows through the loop filter capacitor, leading to a static phase error and a deterministic jitter [128–131]. Moreover, the performance of charge-pumps is degraded owing to the reduced voltage headroom and reduced output impedance of the CMOS devices, which also results in static phase error and jitter. In order to eliminate the analog components, the VCO and analog loop filter are replaced by a digitally-controlled oscillator (DCO) and a synthesized digital logic, respectively, in [89]. While the input analog voltage of a VCO adjusts the output clock frequency, the DCO frequency is determined by a multi-bit digital code. The digital code is achieved by processing the sampled data and reference clock using a digital loop filter. On the other hand, the main drawback of this all-digital approach is the long loop latency, which introduces a phase dithering and loop stability issue. Because of the relatively low operating speed of digital logic, a frequency-divided DCO clock is used for operating the digital filter. Therefore, the loop latency introduced by the digital filter becomes tens of multiples of the bit period, since the latency of the digital filter is a few multiples of the digital clock period. In [89], a direct path without the filtering process, which provides only a phase adjustment, is separated from the main loop. As a result, fast phase response is maintained while the

of this all-digital approach is the long loop latency, which introduces a phase dithering and loop stability issue. Because of the relatively low operating speed of digital logic, a frequency-divided DCO clock is used for operating the digital filter. Therefore, the loop latency introduced by the digital filter becomes tens of multiples of the bit period, since the latency of the digital filter is a few multiples Sensors 2017, 17, clock 1962 period. In [89], a direct path without the filtering process, which provides only 33 of 40 of the digital a phase adjustment, is separated from the main loop. As a result, fast phase response is maintained while the drawbacks of analog loop filter are overcome. The block diagram of the receiver presented drawbacks of analog loop filter are overcome. The block diagram of the receiver presented in [89] is in [89] is shown in Figure 45. The receiver fabricated in 65-nm CMOS achieves a data rate of 26.5 Gb/s shown in Figure 45. The receiver fabricated in 65-nm CMOS achieves a data rate of 26.5 Gb/s while while consuming 254 mW. Notably, the all-digital approach becomes more effective as CMOS consuming 254 mW. Notably, the all-digital approach becomes more effective as CMOS technology technology continues to scale down. continues to scale down.

Iin

Sampled data

Ref. CLK

TIA

CPD Synthesized Digital Filter

Phase Detector Direct phase control

Frequency control

Div. DCO

Figure Figure 45. 45. All-digital All-digital CDR CDR presented presented in in [74]. [74].

7. 7. Summary Summary and and Outlook Outlook This history andand review of the IC technology for high-speed photoThis paper paperpresents presentsthethe history review of CMOS the CMOS IC technology for high-speed detection withoutwithout relyingrelying on complex expressions or formulas, in orderintoorder provide an insight on the photo-detection on complex expressions or formulas, to provide an insight technology even for the readers who are not professional circuit designers. Recent high-speed photoon the technology even for the readers who are not professional circuit designers. Recent high-speed detecting sensorsensor technology provides one ofone theofmost powerful forces which drive the the evolution of photo-detecting technology provides the most powerful forces which drive evolution optical links into the short-reach applications. On the other hand, another remarkable breakthrough of optical links into the short-reach applications. On the other hand, another remarkable breakthrough has from the the CMOS CMOSIC ICtechnologies technologieswhich whichconvert convertthe the current-mode signal from has come come from current-mode signal from thethe PDPD intointo the the VLSI-compatible voltage-mode the CMOSprovides platform providesconsumption, low-power VLSI-compatible voltage-mode signal, signal, because because the CMOS platform low-power consumption, cost, and compatibility with high-density CMOS logic whichrequired are highly low cost, and low compatibility with high-density CMOS logic blocks, whichblocks, are highly for required for the short-reach optical links.the Because IC for photo-detection bridges two the short-reach optical links. Because CMOSthe ICCMOS for photo-detection bridges the two the historic historic technologies of the fiber optics and the silicon VLSI there systems, theredemand is a highfor demand for a technologies of the fiber optics and the silicon VLSI systems, is a high a guidebook guidebook which introduces the technology to non-experts on IC, including optical device engineers. which introduces the technology to non-experts on IC, including optical device engineers. However, to However, the best of knowledge, the authors’there knowledge, has been no publication which offers good the best ofto the authors’ has beenthere no publication which offers good overview and overview and summary of the CMOS IC technologies for high-speed photo-detection. The summary of the CMOS IC technologies for high-speed photo-detection. The technology becomes more technology becomesas more and more complex as therefore the history becomes more and more complex the history deepens, and it deepens, becomes and moretherefore difficultitfor beginners to catch up with, as there is no publication that provides an overview of the technology evolution. For that purpose, in this paper, tens of technical papers which present high-speed photo-detecting circuits with strict verifications of theory based on IC fabrication results are thoroughly selected, reviewed, and categorized. In addition, several papers which provide a technical background of such IC design are introduced as well. Looking ahead, the present electrical link will be ‘eventually’ replaced by the optical link, because of inherent bandwidth limit of electrical channel. The matters are when and which speed, but the ending will not change, although there are many impressive efforts to extend the limit of electrical link. The authors also believe that the platform of the optical link will be CMOS technology and thus CMOS interface circuit will play an important role. Based on the trend of last 20 years, which is described in this paper, the future optical receiver is expected to include the following features: 1.

2.

Circuit and system topologies which utilize the advantages of highly-scaled CMOS technology, such as the inverter-based TIA and all-digital CDR. It is because advanced CMOS technology focuses on optimizing digital circuits. Circuit techniques to overcome the noise-bandwidth tradeoff in front-end circuit, such as the inductive peaking techniques, the equalization scheme, and the integration-based circuit. Since it

Sensors 2017, 17, 1962

34 of 40

is obvious that the front-end circuit has been and will be the bottleneck, both in terms of speed and SNR, it is highly required to break the tradeoff. Author Contributions: G.-S.J. and W.B. collected data, organized, wrote, revised the manuscript; D.-K.J. edited and approved the final manuscript. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3.

4. 5.

6. 7.

8. 9.

10.

11.

12.

13.

14.

15.

Cisco Systems, Inc. Available online: http://www.cisco.com (accessed on 19 June 2017). Soref, R.A. Silicon-based optoelectronics. Proc. IEEE 1993, 81, 1687–1706. [CrossRef] Izhaky, N.; Morse, M.T.; Koehl, S.; Cohen, O.; Rubin, D.; Barkai, A.; Sarid, G.; Cohen, R.; Paniccia, M.J. Development of CMOS-compatible integrated silicon photonics devices. IEEE J. Sel. Top. Quantum Electron. 2006, 12, 1688–1698. [CrossRef] Vlasov, Y.A. Silicon CMOS-integrated nano-photonics for computer and data communications beyond 100G. IEEE Commun. Mag. 2012, 50, s67–s72. [CrossRef] Fedeli, J.-M.; Fulbert, L.; Van Thourhout, D.; Viktorovitch, P.; O’Connor, I.; Duan, G.-H.; Reed, G.; Corte, F.D.; Vivien, L.; Royo, F.L.; et al. HELIOS: Photonics electronics functional integration on CMOS. SPIE Proc. 2010, 7719, 1–10. Gunn, C. CMOS photonics for high-speed interconnects. IEEE Micro 2006, 26, 58–66. [CrossRef] Liu, F.Y.; Patil, D.; Lexau, J.; Amberg, P.; Dayringer, M.; Gainsley, J.; Moghadam, H.F.; Zheng, X.; Cunningham, J.E.; Krishnamoorthy, A.V. 10-Gbps, 5.3-mW optical transmitter and receiver circuits in 40-nm CMOS. IEEE J. Solid-State Circuits 2012, 47, 2049–2067. [CrossRef] Arakawa, Y.; Nakamura, T.; Urino, Y.; Fujita, T. Silicon photonics for next generation system integration platform. IEEE Commun. Mag. 2013, 51, 72–77. [CrossRef] Zhang, B.; Khanoyan, K.; Hatamkhani, H.; Tong, H.; Hu, K.; Fallahi, S.; Abdul-Latif, M.; Vakilian, K.; Fujimori, I.; Brewster, A. A 28 Gb/s multistandard serial link transceiver for backplane applications in 28 nm CMOS. IEEE J. Solid-State Circuits 2015, 50, 3089–3100. [CrossRef] Kawamoto, T.; Norimatsu, T.; Kogo, K.; Yuki, F.; Nakajima, N.; Tsuge, M.; Usugi, T.; Hokari, T.; Koba, H.; Komori, T. Multi-standard 185 fs rms 0.3-to-28 Gb/s 40 dB backplane signal conditioner with adaptive pattern-match 36-Tap DFE and data-rate-adjustment PLL in 28 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Norimatsu, T.; Kawamoto, T.; Kogo, K.; Kohmu, N.; Yuki, F.; Nakajima, N.; Muto, T.; Nasu, J.; Komori, T.; Koba, H. A 25 Gb/s multistandard serial link transceiver for 50 dB-loss copper cable in 28 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 31 January–4 February 2016; pp. 60–61. Peng, P.-J.; Li, J.-F.; Chen, L.-Y.; Lee, J. A 56 Gb/s PAM-4/NRZ transceiver in 40 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 110–111. Im, J.; Freitas, D.; Roldan, A.; Casey, R.; Chen, S.; Chou, A.; Cronin, T.; Geary, K.; McLeod, S.; Zhou, L. A 40-to-56 Gb/s PAM-4 receiver with 10-tap direct decision-feedback equalization in 16 nm FinFET. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 114–115. Steffan, G.; Depaoli, E.; Monaco, E.; Sabatino, N.; Audoglio, W.; Rossi, A.A.; Erba, S.; Bassi, M.; Mazzanti, A. A 64 Gb/s PAM-4 transmitter with 4-Tap FFE and 2.26 pJ/b energy efficiency in 28 nm CMOS FDSOI. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 116–117. Dickson, T.O.; Ainspan, H.A.; Meghelli, M. A 1.8 pJ/b 56 Gb/s PAM-4 transmitter with fractionally spaced. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 118–119.

Sensors 2017, 17, 1962

16.

17.

18.

19.

20.

21.

22.

23.

24.

25.

26.

27.

28.

29.

30.

35 of 40

Gupta, S.; Tellado, J.; Begur, S.; Yang, F.; Balan, V.; Inerfield, M.; Dabiri, D.; Dring, J.; Goel, S.; Muthukumaraswamy, K.; et al. A 10 Gb/s IEEE 802.3 an-compliant Ethernet transceiver for 100 m UTP cable in 0.13 um CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 3–7 February 2008; pp. 106–107. Hidaka, Y.; Gai, W.; Horie, T.; Jiang, J.H.; Koyanagi, Y.; Osone, H. A 4-channel 10.3 Gb/s backplane transceiver macro with 35 dB equalizer and sign-based zero-forcing adaptive control. IEEE J. Solid-State Circuits 2009, 44, 3547–3559. [CrossRef] Fukuda, K.; Yamashita, H.; Ono, G.; Nemoto, R.; Suzuki, E.; Takemoto, T.; Yuki, F. A 12.3 mW 12.5 Gb/s complete transceiver in 65 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 7–11 February 2010; pp. 368–369. Chen, M.-S.; Shih, Y.-N.; Lin, C.-L.; Hung, H.-W.; Lee, J. A 40 Gb/s TX and RX chip set in 65 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 20–24 February 2011; pp. 146–147. Bulzacchelli, J.; Beukema, T.; Storaska, D.; Hsieh, P.-H.; Rylov, S.; Furrer, D.; Gardellini, D.; Prati, A.; Menolfi, C.; Hanson, D.; et al. A 28 Gb/s 4-tap FFE/15-tap DFE serial link transceiver in 32 nm SOI CMOS technology. IEEE J. Solid-State Circuits 2012, 47, 3232–3248. [CrossRef] Raghavan, B.; Cui, D.; Singh, U.; Maarefi, H.; Pi, D.; Vasani, A.; Huang, Z.; Momtaz, A.; Cao, J. A sub-2W 39.8-to-44.6 Gb/s transmitter and receiver chipset with SFI-5.2 interface in 40 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 17–21 February 2013; pp. 32–33. Jaussi, J.; Balamurugan, G.; Hyvonen, S.; Hsueh, T.-C.; Musah, T.; Keskin, G.; Shekhar, S.; Kennedy, J.; Sen, S.; Inti, R.; et al. A 205 mW 32 Gb/s 3-tap FFE/6-tap DFE bidirectional serial link in 22 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 9–13 February 2014; pp. 440–441. Upadhyaya, P.; Savoj, J.; An, F.-T.; Bekele, A.; Jose, A.; Xu, B.; Wu, D.; Turker, D.; Aslanzadeh, H.; Hedayati, H.; et al. A 0.5-to-32.75 Gb/s flexible-reach wireline transceiver in 20 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 56–57. Shibasake, T.; Danjo, T.; Ogata, Y.; Sakai, Y.; Miyaoka, H.; Terasawa, F.; Kudo, M.; Kano, H.; Matsuda, A.; Kawai, S.; et al. A 56 Gb/s NRZ-electrical 247 mW/lane serial-link transceiver in 28 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 31 January–4 February 2016; pp. 64–65. Han, J.; Lu, Y.; Sutardja, N.; Alon, E. A 60 Gb/s 288 mW NRZ transceiver with adaptive equalization and baud-rate clock and data recovery in 65 nm CMOS technology. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 112–113. Balamurugan, G.; O’Mahony, F.; Mansuri, M.; Jaussi, J.E.; Kennedy, J.T.; Casper, B. A 5-to-25 Gb/s 1.6-to-3.8 mW/(Gb/s) reconfigurable transceiver in 45 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 7–11 February 2010; pp. 372–373. Chiang, P.-C.; Hung, H.-W.; Chu, H.-Y.; Chen, G.-S.; Lee, J. 60 Gb/s NRZ and PAM4 transmitters for 400 GbE in 65 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 9–13 February 2014; pp. 42–43. Kim, J.; Balankutty, A.; Elshazly, A.; Huang, Y.-Y.; Song, H.; Yu, K.; O’Mahony, F. A 16-to-40 Gb/s quarter-rate NRZ/PAM4 dual-mode transmitter in 14 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 60–61. Gopalakrishnan, K.; Ren, A.; Tan, A.; Farhood, A.; Tiruvur, A.; Helal, B.; Loi, C.-F.; Jiang, C.; Cirit, H.; Quek, I.; et al. A 40/50/100 Gb/s PAM-4 ethernet transceiver in 28 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 31 January–4 February 2016; pp. 62–63. IEEE P802.3ae 10 Gb/s Ethernet Task Force. Available online: http://www.ieee802.org/3/ae/ (accessed on 19 June 2017).

Sensors 2017, 17, 1962

31. 32. 33. 34. 35. 36. 37.

38. 39.

40.

41. 42.

43.

44.

45.

46. 47.

48.

49.

50.

51.

36 of 40

IEEE P802.3ba and 100 Gb/s Ethernet Task Force. Available online: http://www.ieee802.org/3/ba/ (accessed on 19 June 2017). 100G CLR4 Alliance. Available online: https://www.clr4-alliance.org/ (accessed on 19 June 2017). CWDM4 MSA. Available online: http://www.cwdm4-msa.org/ (accessed on 19 June 2017). Jalali, B.; Paniccia, M.; Reed, G. Silicon photonics. IEEE Microw. Mag. 2006, 7, 58–68. [CrossRef] Soref, R. The past, present, and future of silicon photonics. IEEE J. Sel. Top. Quantum Electron. 2006, 12, 1678–1687. [CrossRef] Hochberg, M.; Harris, N.C.; Ding, R.; Zhang, Y.; Novack, A.; Xuan, Z.; Baehr-Jones, T. Silicon photonics: The next fabless semiconductor industry. IEEE Solid-State Circuits Mag. 2013, 5, 48–58. [CrossRef] Vivien, L.; Polzer, A.; Marris-Morini, D.; Osmond, J.; Hartmann, J.M.; Crozat, P.; Cassan, E.; Kopp, C.; Zimmermann, H.; Fédéli, J.M. Zero-bias 40 Gbit/s germanium waveguide photodetector on silicon. Opt. Express 2012, 20, 1096–1101. [CrossRef] [PubMed] Assefa, S.; Xia, F.; Vlasov, Y.A. Reinventing germanium avalanche photodetector for nanophotonic on-chip optical interconnects. Nature 2010, 464, 80–84. [CrossRef] [PubMed] Thomson, D.J.; Gardes, F.Y.; Fedeli, J.-M.; Zlatanovic, S.; Hu, Y.; Kuo, B.P.P.; Myslivets, E.; Alic, N.; Radic, S.; Mashanovich, G.Z. 50-Gb/s silicon optical modulator. IEEE Photonics Technol. Lett. 2012, 24, 234–236. [CrossRef] Temporiti, E.; Minoia, G.; Repossi, M.; Baldi, D.; Ghilioni, A.; Svelto, F. A 56 Gb/s 300 mW silicon-photonics transmitter in 3D-integrated PIC25G and 55 nm BiCMOS technologies. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 31 January–4 February 2016; pp. 404–405. Xu, Q.; Schmidt, B.; Pradhan, S.; Lipson, M. Micrometre-scale silicon electro-optic modulator. Nature 2005, 435, 325–327. [CrossRef] [PubMed] Li, G.; Zheng, X.; Yao, J.; Thacker, H.; Shubin, I.; Luo, Y.; Raj, K.; Cunningham, J.E.; Krishnamoorthy, A.V. 25 Gb/s 1V-driving CMOS ring modulator with integrated thermal tuning. Opt. Express 2011, 19, 20435–20443. [CrossRef] [PubMed] Rakowski, M.; Pantouvaki, M.; De Heyn, P.; Verheyen, P.; Ingels, M.; Chen, H.; De Coster, J.; Lepage, G.; Snyder, B.; De Meyer, K. A 4 × 20 Gb/s WDM ring-based hybrid CMOS silicon photonics transceiver. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Analui, B.; Guckenberger, D.; Kucharski, D.; Narasimha, A. A Fully Integrated 20-Gb/s Optoelectronic Transceiver Implemented in a Standard 0.13-µm CMOS SOI Technology. IEEE J. Solid-State Circuits 2006, 41, 2945–2955. [CrossRef] Narasimha, A.; Analui, B.; Liang, Y.; Sleboda, T.J.; Abdalla, S.; Balmater, E.; Gloeckner, S.; Guckenberger, D.; Harrison, M.; Koumans, R.G.M.P.; et al. A Fully Integrated 4 × 10-Gb/s DWDM Optoelectronic Transceiver Implemented in a Standard 0.13 µm CMOS SOI Technology. IEEE J. Solid-State Circuits 2007, 42, 2736–2744. [CrossRef] Georgas, M.; Orcutt, J.; Ram, R.J.; Stojanovic, V. A monolithically-integrated optical receiver in standard 45-nm SOI. IEEE J. Solid-State Circuits 2012, 47, 1693–1702. [CrossRef] Buckwalter, J.F.; Zheng, X.; Li, G.; Raj, K.; Krishnamoorthy, A.V. A monolithic 25-Gb/s transceiver with photonic ring modulators and Ge detectors in a 130-nm CMOS SOI process. IEEE J. Solid-State Circuits 2012, 47, 1309–1322. [CrossRef] Sun, C.; Wade, M.T.; Lee, Y.; Orcutt, J.S.; Alloatti, L.; Georgas, M.S.; Waterman, A.S.; Shainline, J.M.; Avizienis, R.R.; Lin, S. Single-chip microprocessor that communicates directly using light. Nature 2015, 528, 534–538. [CrossRef] [PubMed] Meade, R.; Orcutt, J.S.; Mehta, K.; Tehar-Zahav, O.; Miller, D.; Georgas, M.; Moss, B.; Sun, C.; Chen, Y.-H.; Shainline, J. Integration of silicon photonics in bulk CMOS. In Proceedings of the IEEE Symposium on VLSI Technology Digest of Technical Papers, Honolulu, HI, USA, 9–12 June 2014; pp. 1–2. Sun, C.; Georgas, M.; Orcutt, J.; Moss, B.; Chen, Y.-H.; Shainline, J.; Wade, M.; Mehta, K.; Nammari, K.; Timurdogan, E. A monolithically-integrated chip-to-chip optical link in bulk CMOS. IEEE J. Solid-State Circuits 2015, 50, 828–844. [CrossRef] Byun, H.; Bok, J.; Cho, K.; Cho, K.; Choi, H.; Choi, J.; Choi, S.; Han, S.; Hong, S.; Hyun, S. Bulk-Si photonics technology for DRAM interface [Invited]. Photonics Res. 2014, 2, A25–A33. [CrossRef]

Sensors 2017, 17, 1962

52.

53.

54.

55.

56.

57.

58.

59.

60. 61. 62.

63. 64. 65.

66. 67. 68. 69. 70.

37 of 40

Proesel, J.; Schow, C.; Rylyakov, A. 25 Gb/s 3.6 pJ/b and 15 Gb/s 1.37 pJ/b VCSEL-based optical links in 90 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 19–23 February 2012; pp. 418–420. Cevrero, A.; Ozkaya, I.; Francese, P.A.; Menolfi, C.; Morf, T.; Brandli, M.; Kuchta, D.; Kull, L.; Proesel, J.; Kossel, M. A 64 Gb/s 1.4 pJ/b NRZ optical-receiver data-path in 14 nm CMOS FinFET. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 5–9 February 2017; pp. 482–483. Takemoto, T.; Yamashita, H.; Yazaki, T.; Chujo, N.; Lee, Y.; Matsuoka, Y. A 25-to-28 Gb/s High-Sensitivity (−9.7 dBm) 65 nm CMOS Optical Receiver for Board-to-Board Interconnects. IEEE J. Solid-State Circuits 2014, 49, 2259–2276. [CrossRef] Morita, H.; Uchino, K.; Otani, E.; Ohtorii, H.; Ogura, T.; Oniki, K.; Oka, S.; Yanagawa, S.; Suzuki, H. A 12× 5 two-dimensional optical I/O array for 600 Gb/s chip-to-chip interconnect in 65 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 9–13 February 2014; pp. 140–141. Chen, Y.; Kibune, M.; Toda, A.; Hayakawa, A.; Akiyama, T.; Sekiguchi, S.; Ebe, H.; Imaizumi, N.; Akahoshi, T.; Akiyama, S. A 25 Gb/s hybrid integrated silicon photonic transceiver in 28 nm CMOS and SOI. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Cignoli, M.; Minoia, G.; Repossi, M.; Baldi, D.; Ghilioni, A.; Temporiti, E.; Svelto, F. A 1310 nm 3D-integrated silicon photonics Mach-Zehnder-based transmitter with 275 mW multistage CMOS driver achieving 6 dB extinction ratio at 25 Gb/s. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Geis, M.; Spector, S.; Grein, M.; Schulein, R.; Yoon, J.; Lennon, D.; Deneault, S.; Gan, F.; Kaertner, F.; Lyszczarz, T. CMOS-compatible all-Si high-speed waveguide photodiodes with high responsivity in near-infrared communication band. IEEE Photonics Technol. Lett. 2007, 19, 152–154. [CrossRef] Geis, M.; Spector, S.; Grein, M.; Yoon, J.; Lennon, D.; Lyszczarz, T. Silicon waveguide infrared photodiodes with> 35 GHz bandwidth and phototransistors with 50 AW-1 response. Opt. Express 2009, 17, 5193–5204. [CrossRef] Doylend, J.; Jessop, P.; Knights, A. Silicon photonic resonator-enhanced defect-mediated photodiode for sub-bandgap detection. Opt. Express 2010, 18, 14671–14678. [CrossRef] [PubMed] Preston, K.; Lee, Y.H.D.; Zhang, M.; Lipson, M. Waveguide-integrated telecom-wavelength photodiode in deposited silicon. Opt. Lett. 2011, 36, 52–54. [CrossRef] [PubMed] Mehta, K.K.; Orcutt, J.S.; Shainline, J.M.; Tehar-Zahav, O.; Sternberg, Z.; Meade, R.; Popovi´c, M.A.; Ram, R.J. Polycrystalline silicon ring resonator photodiodes in a bulk complementary metal-oxide-semiconductor process. Opt. Lett. 2014, 39, 1061–1064. [CrossRef] [PubMed] Reed, G.T. Silicon Photonics: The State of the Art; John Wiley & Sons: Hoboken, NJ, USA, 2008. Samavedam, S.; Currie, M.; Langdo, T.; Fitzgerald, E. High-quality germanium photodiodes integrated on silicon substrates using optimized relaxed graded buffers. Appl. Phys. Lett. 1998, 73, 2125–2127. [CrossRef] Colace, L.; Masini, G.; Galluzzi, F.; Assanto, G.; Capellini, G.; Di Gaspare, L.; Palange, E.; Evangelisti, F. Metal–semiconductor–metal near-infrared light detector based on epitaxial Ge/Si. Appl. Phys. Lett. 1998, 72, 3175–3177. [CrossRef] Luan, H.-C.; Lim, D.R.; Lee, K.K.; Chen, K.M.; Sandland, J.G.; Wada, K.; Kimerling, L.C. High-quality Ge epilayers on Si with low threading-dislocation densities. Appl. Phys. Lett. 1999, 75, 2909–2911. [CrossRef] Fama, S.; Colace, L.; Masini, G.; Assanto, G.; Luan, H.-C. High performance germanium-on-silicon detectors for optical communications. Appl. Phys. Lett. 2002, 81, 586–588. [CrossRef] Yu, H.-Y.; Ren, S.; Jung, W.S.; Okyay, A.K.; Miller, D.A.; Saraswat, K.C. High-efficiency pin photodetectors on selective-area-grown Ge for monolithic integration. IEEE Electron Device Lett. 2009, 30, 1161–1163. Kim, J.; Buckwalter, J.F. A 40-Gb/s optical transceiver front-end in 45 nm SOI CMOS. IEEE J. Solid-State Circuits 2012, 47, 615–626. [CrossRef] Proesel, J.; Rylyakov, A.; Schow, C. Optical receivers using DFE-IIR equalization. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 17–21 February 2013; pp. 130–131.

Sensors 2017, 17, 1962

71.

72. 73. 74.

75. 76. 77.

78.

79.

80. 81.

82.

83. 84. 85.

86.

87. 88.

89.

38 of 40

Park, S.M.; Toumazou, C. A packaged low-noise high-speed regulated cascode transimpedance amplifier using a 0.6 µm N-well CMOS technology. In Proceedings of the IEEE European Solid-State Circuits Conference (ESSCIRC), Stockholm, Sweden, 19–21 September 2000; pp. 431–434. Park, S.M.; Yoo, H.-J. 1.25-Gb/s regulated cascode CMOS transimpedance amplifier for gigabit ethernet applications. IEEE J. Solid-State Circuits 2004, 39, 112–121. [CrossRef] Park, S.M.; Lee, J.; Yoo, H.-J. 1-Gb/s 80-dBΩ fully differential CMOS transimpedance amplifier in multichip on oxide technology for optical interconnects. IEEE J. Solid-State Circuits 2004, 39, 971–974. [CrossRef] Takemoto, T.; Yamashita, H.; Yuki, F.; Masuda, N.; Toyoda, H.; Chujo, N.; Lee, Y.; Tsuji, S.; Nishimura, S. A 25-Gb/s 2.2-W 65-nm CMOS optical transceiver using a power-supply-variation-tolerant analog front end and data-format conversion. IEEE J. Solid-State Circuits 2014, 49, 471–485. [CrossRef] Kromer, C.; Sialm, G.; Morf, T.; Schmatz, M.L.; Ellinger, F.; Erni, D.; Jackel, H. A low-power 20-GHz 52-dBΩ transimpedance amplifier in 80-nm CMOS. IEEE J. Solid-State Circuits 2004, 39, 885–894. [CrossRef] Abidi, A.A. Gigahertz transresistance amplifiers in fine line NMOS. IEEE J. Solid-State Circuits 1984, 19, 986–994. [CrossRef] Chiang, P.-C.; Jiang, J.-Y.; Hung, H.-W.; Wu, C.-Y.; Chen, G.-S.; Lee, J. 4× 25 Gb/s transceiver with optical front-end for 100 GbE system in 65 nm CMOS technology. IEEE J. Solid-State Circuits 2015, 50, 573–585. [CrossRef] Lim, P.-W.; Tzeng, A.Y.; Chuang, H.L.; Onge, S.S. A 3.3-V monolithic photodetector/CMOS-preamplifier for 531 Mb/s optical data link applications. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 24–26 February 1993; pp. 96–97. Woodward, T.; Krishnamoorthy, A.; Rozier, R.; Lentine, A. Low-power, small-footprint gigabit Ethernet-compatible optical receiver circuit in 0.25/spl mu/m CMOS. Electron. Lett. 2000, 36, 1489–1491. [CrossRef] Analui, B.; Hajimiri, A. Bandwidth enhancement for transimpedance amplifiers. IEEE J. Solid-State Circuits 2004, 39, 1263–1270. [CrossRef] Ingels, M.; Van der Plas, G.; Crols, J.; Steyaert, M. A CMOS 18 THz/spl Omega/248 Mb/s transimpedance amplifier and 155 Mb/s LED-driver for low cost optical fiber links. IEEE J. Solid-State Circuits 1994, 29, 1552–1559. [CrossRef] Yoon, T.; Jalali, B. 1.25 Gb/s CMOS differential transimpedance amplifier for gigabit networks. In Proceedings of the IEEE European Solid-State Circuits Conference (ESSCIRC), Southampton, UK, 16–18 September 1997; pp. 140–143. Tavernier, F.; Steyaert, M.S. High-speed optical receivers with integrated photodiode in 130 nm CMOS. IEEE J. Solid-State Circuits 2009, 44, 2856–2867. [CrossRef] Woodward, T.; Krishnamoorthy, A.V. 1 Gbit/s CMOS photoreceiver with integrated detector operating at 850 nm. Electron. Lett. 1998, 34, 1252–1253. [CrossRef] Huang, T.-C.; Chung, T.-W.; Chern, C.-H.; Huang, M.-C.; Lin, C.-C.; Hsueh, F.-L. A 28 Gb/s 1 pJ/b shared-inductor optical receiver with 56% chip-area reduction in 28 nm CMOS. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 9–13 February 2014; pp. 144–145. Li, C.; Bai, R.; Shafik, A.; Tabasy, E.Z.; Wang, B.; Tang, G.; Ma, C.; Chen, C.-H.; Peng, Z.; Fiorentino, M. Silicon photonic transceiver circuits with microring resonator bias-based wavelength stabilization in 65 nm CMOS. IEEE J. Solid-State Circuits 2014, 49, 1419–1436. [CrossRef] Li, D.; Minoia, G.; Repossi, M.; Baldi, D.; Temporiti, E.; Mazzanti, A.; Svelto, F. A low-noise design technique for high-speed CMOS optical receivers. IEEE J. Solid-State Circuits 2014, 49, 1437–1447. [CrossRef] Rylyakov, A.; Proesel, J.; Rylov, S.; Lee, B.; Bulzacchelli, J.; Ardey, A.; Parker, B.; Beakes, M.; Baks, C.; Schow, C. A 25 Gb/s burst-mode receiver for rapidly reconfigurable optical networks. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3. Chu, S.-H.; Bae, W.; Jeong, G.-S.; Jang, S.; Kim, S.; Joo, J.; Kim, G.; Jeong, D.-K. A 22 to 26.5 Gb/s optical receiver with all-digital clock and data recovery in a 65 nm CMOS process. IEEE J. Solid-State Circuits 2015, 50, 2603–2612. [CrossRef]

Sensors 2017, 17, 1962

90.

91.

92.

93. 94. 95. 96. 97. 98. 99.

100. 101. 102. 103. 104. 105.

106.

107. 108. 109.

110.

111. 112.

39 of 40

Sun, C.; Wade, M.; Georgas, M.; Lin, S.; Alloatti, L.; Moss, B.; Kumar, R.; Atabaki, A.H.; Pavanello, F.; Shainline, J.M. A 45 nm CMOS-SOI monolithic photonics platform with bit-statistics-based resonant microring thermal tuning. IEEE J. Solid-State Circuits 2016, 51, 893–907. [CrossRef] Yu, K.; Li, C.; Li, H.; Titriku, A.; Shafik, A.; Wang, B.; Wang, Z.; Bai, R.; Chen, C.-H.; Fiorentino, M. A 25 Gb/s Hybrid-Integrated Silicon Photonic Source-Synchronous Receiver With Microring Wavelength Stabilization. IEEE J. Solid-State Circuits 2016, 51, 2129–2141. [CrossRef] Emami-Neyestanak, A.; Liu, D.; Keeler, G.; Helman, N.; Horowitz, M. A 1.6 Gb/s, 3 mW CMOS receiver for optical communication. In Proceedings of the IEEE Symposium on VLSI Circuits Digest of Technical Papers, Honolulu, HI, USA, 13–15 June 2002; pp. 84–87. Palermo, S.; Emami-Neyestanak, A.; Horowitz, M. A 90 nm CMOS 16 Gb/s transceiver for optical interconnects. IEEE J. Solid-State Circuits 2008, 43, 1235–1246. [CrossRef] Nazari, M.H.; Emami-Neyestanak, A. A 24-Gb/s double-sampling receiver for ultra-low-power optical communication. IEEE J. Solid-State Circuits 2013, 48, 344–357. [CrossRef] Mohan, S.S.; Hershenson, M.D.M.; Boyd, S.P.; Lee, T.H. Bandwidth extension in CMOS with optimized on-chip inductors. IEEE J. Solid-State Circuits 2000, 35, 346–355. [CrossRef] Lee, J.; Kundert, K.S.; Razavi, B. Analysis and modeling of bang-bang clock and data recovery circuits. IEEE J. Solid-State Circuits 2004, 39, 1571–1580. Han, J.; Choi, B.; Seo, M.; Yun, J.; Lee, D.; Kim, T.; Eo, Y.; Park, S.M. A 20-Gb/s Transformer-Based Current-Mode Optical Receiver in 0.13-µm CMOS. IEEE Trans. Circuits Syst. II Express Br. 2010, 57, 348–352. Kim, J.; Buckwalter, J.F. Bandwidth enhancement with low group-delay variation for a 40-Gb/s transimpedance amplifier. IEEE Trans. Circuits Syst. I Regul. Pap. 2010, 57, 1964–1972. Park, K.-S.; Yoo, B.-J.; Hwang, M.-S.; Chi, H.; Kim, H.-C.; Park, J.-W.; Kim, K.; Jeong, D.-K. A 10-Gb/s optical receiver front-end with 5-mW transimpedance amplifier. In Proceedings of the IEEE Asian Solid-State Circuits Conference (A-SSCC), Beijing, China, 8–10 November 2010; pp. 1–4. Bae, W.; Jeong, G.-S.; Kim, Y.; Chi, H.-K.; Jeong, D.-K. Design of silicon photonic interconnect ICS in 65-nm CMOS technology. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2016, 24, 2234–2243. [CrossRef] Galal, S.; Razavi, B. 40-Gb/s amplifier and ESD protection circuit in 0.18-µm CMOS technology. IEEE J. Solid-State Circuits 2004, 39, 2389–2396. [CrossRef] Shekhar, S.; Walling, J.S.; Allstot, D.J. Bandwidth extension techniques for CMOS amplifiers. IEEE J. Solid-State Circuits 2006, 41, 2424–2439. [CrossRef] Kim, J.; Kim, J.-K.; Lee, B.-J.; Jeong, D.-K. Design Optimization of On-Chip Inductive Peaking Structures for 0.13-µm CMOS 40-Gb/s Transmitter Circuits. IEEE Trans. Circuits Syst. I Regul. Pap. 2009, 56, 2544–2555. Razavi, B. Challenges in the design high-speed clock and data recovery circuits. IEEE Commun. Mag. 2002, 40, 94–101. [CrossRef] Jeong, G.-S.; Chi, H.; Kim, K.; Jeong, D.-K. A 20-Gb/s 1.27 pJ/b low-power optical receiver front-end in 65 nm CMOS, Circuits and Systems (ISCAS). In Proceedings of the IEEE Symposium on Circuits and Systems (ISCAS), Melbourne, Australia, 1–5 June 2014; pp. 1492–1495. Kao, T.S.-C.; Musa, F.A.; Carusone, A.C. A 5-Gbit/s CMOS optical receiver with integrated spatially modulated light detector and equalization. IEEE Trans. Circuits Syst. I Regul. Pap. 2010, 57, 2844–2857. [CrossRef] Razavi, B. Fundamentals of Microelectronics; Wiley: New York, NY, USA, 2009. Razavi, B. Design of Integrated Circuits for Optical Communications; McGraw-Hill: New York, NY, USA, 2002. Bulzacchelli, J.F.; Meghelli, M.; Rylov, S.V.; Rhee, W.; Rylyakov, A.V.; Ainspan, H.A.; Parker, B.D.; Beakes, M.P.; Chung, A.; Beukema, T.J. A 10-Gb/s 5-tap DFE/4-tap FFE transceiver in 90-nm CMOS technology. IEEE J. Solid-State Circuits 2006, 41, 2885–2900. [CrossRef] Kimura, H.; Aziz, P.M.; Jing, T.; Sinha, A.; Kotagiri, S.P.; Narayan, R.; Gao, H.; Jing, P.; Hom, G.; Liang, A. A 28 Gb/s 560 mW multi-standard SerDes with single-stage analog front-end and 14-tap decision feedback equalizer in 28 nm CMOS. IEEE J. Solid-State Circuits 2014, 49, 3091–3103. [CrossRef] Kajley, R.S.; Hurst, P.J.; Brown, J.E. A mixed-signal decision-feedback equalizer that uses a look-ahead architecture. IEEE J. Solid-State Circuits 1997, 32, 450–459. [CrossRef] Stojanovic, V.; Ho, A.; Garlepp, B.W.; Chen, F.; Wei, J.; Tsang, G.; Alon, E.; Kollipara, R.T.; Werner, C.W.; Zerbe, J.L. Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery. IEEE J. Solid-State Circuits 2005, 40, 1012–1026. [CrossRef]

Sensors 2017, 17, 1962

40 of 40

113. Wong, K.-L.J.; Chen, E.-H.; Yang, C.-K.K. Edge and data adaptive equalization of serial-link transceivers. IEEE J. Solid-State Circuits 2008, 43, 2157–2169. [CrossRef] 114. Wang, H.; Lee, J. A 21-Gb/s 87-mW transceiver with FFE/DFE/analog equalizer in 65-nm CMOS technology. IEEE J. Solid-State Circuits 2010, 45, 909–920. [CrossRef] 115. Sharif-Bakhtiar, A.; Carusone, A.C. A 20 Gb/s CMOS Optical Receiver With Limited-Bandwidth Front End and Local Feedback IIR-DFE. IEEE J. Solid-State Circuits 2016, 51, 2679–2689. [CrossRef] 116. Kim, J.; Jeong, D.-K. Multi-gigabit-rate clock and data recovery based on blind oversampling. IEEE Commun. Mag. 2003, 41, 68–74. 117. Nikolic, B.; Oklobdzija, V.G.; Stojanovic, V.; Jia, W.; Chiu, J.K.-S.; Leung, M.M.-T. Improved sense-amplifier-based flip-flop: Design and measurements. IEEE J. Solid-State Circuits 2000, 35, 876–884. [CrossRef] 118. Kim, J.; Leibowitz, B.S.; Ren, J.; Madden, C.J. Simulation and analysis of random decision errors in clocked comparators. IEEE Trans. Circuits Syst. I Regul. Pap. 2009, 56, 1844–1857. [CrossRef] 119. Bae, W.; Jeong, G.-S.; Park, K.; Cho, S.-Y.; Kim, Y.; Jeong, D.-K. A 0.36 pJ/bit, 0.025 mm2 , 12.5 Gb/s Forwarded-Clock Receiver With a Stuck-Free Delay-Locked Loop and a Half-Bit Delay Line in 65-nm CMOS Technology. IEEE Trans. Circuits Syst. I Regul. Pap. 2016, 63, 1393–1403. [CrossRef] 120. Hsieh, M.-T.; Sobelman, G.E. Architectures for multi-gigabit wire-linked clock and data recovery. IEEE Circuits Syst. Mag. 2008, 8, 45–57. [CrossRef] 121. Casper, B.; O’Mahony, F. Clocking analysis, implementation and measurement techniques for high-speed data links—A tutorial. IEEE Trans. Circuits Syst. I Regul. Pap. 2009, 56, 17–39. [CrossRef] 122. Kim, J. Design of CMOS Adaptive-Supply Serial Links. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2002. 123. Hwang, M.-S.; Lee, S.-Y.; Kim, J.-K.; Kim, S.; Jeong, D.-K. A 180-Mb/s to 3.2-Gb/s, continuous-rate, fast-locking CDR without using external reference clock. In Proceedings of the IEEE Asian Solid-State Circuits Conference (A-SSCC), Jeju City, Korea, 12–14 November 2007; pp. 144–147. 124. Wu, K.-C.; Lee, J. A 2 × 25-Gb/s Receiver with 2: 5 DMUX for 100-Gb/s Ethernet. IEEE J. Solid-State Circuits 2010, 45, 2421–2432. 125. Emami-Neyestanak, A.; Palermo, S.; Lee, H.-C.; Horowitz, M. CMOS transceiver with baud rate clock recovery for optical interconnects. In Proceedings of the IEEE Symposium on VLSI Circuits Digest of Technical Papers, Honolulu, HI, USA, 17–19 June 2004; pp. 410–413. 126. Sidiropoulos, S.; Horowitz, M.A. A semidigital dual delay-locked loop. IEEE J. Solid-State Circuits 1997, 32, 1683–1692. [CrossRef] 127. Lo, S.-H.; Buchanan, D.; Taur, Y.; Wang, W. Quantum-mechanical modeling of electron tunneling current from the inversion layer of ultra-thin-oxide nMOSFET’s. IEEE Electron Device Lett. 1997, 18, 209–211. [CrossRef] 128. Holzer, R. A 1 V CMOS PLL designed in high-leakage CMOS process operating at 10–700 MHz. In Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 7 February 2002; pp. 272–273. 129. Chen, J.-S.; Ker, M.-D. Impact of gate leakage on performances of phase-locked loop circuit in nanoscale CMOS technology. IEEE Trans. Electron Devices 2009, 56, 1774–1779. [CrossRef] 130. Hung, C.-C.; Liu, S.-I. A leakage-compensated PLL in 65-nm CMOS technology. IEEE Trans. Circuits Syst. II Express Br. 2009, 56, 525–529. [CrossRef] 131. Bae, W.; Ju, H.; Jeong, D.-K. Design of a Transceiver Transmitting Power, Clock, and Data over a Single Optical Fiber for Future Automotive Network System. J. Semicond. Technol. Sci. 2017, 17, 48–55. [CrossRef] © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Suggest Documents