Michael Kalcher, BSc

# Power Efficient LO Distribution and I/Q-Phase Generation for Wireless Transceiver Systems 

MASTER'S THESIS<br>to achieve the university degree of<br>Diplom-Ingenieur<br>Master's degree programme: Electrical Engineering<br>submitted to<br>Graz University of Technology<br>Supervisor<br>Dipl.-Ing. Dr.techn. Mario Auer<br>Institute of Electronics

## AFFIDAVIT

I declare that I have authored this thesis independently, that I have not used other than the declared sources/resources, and that I have explicitly indicated all material which has been quoted either literally or by content from the sources used. The text document uploaded to TUGRAZonline is identical to the present master's thesis dissertation.

## Acknowledgments

The completion of this thesis would not have been possible without the help of many.

Engineers and members of Intel's entire Villach site made me feel welcome and were always available for help. I specifically thank my supervisor and mentor Daniel Gruber for his patient guidance throughout the course of this thesis. Not only is he an exceptional engineer but also a humorous and straightforward co-worker. I have relied numerous times on the help of Davide Ponton and Alan Paussa, who also greatly helped with writing this thesis.

Having had virtually no experience with the layout of integrated circuits, I want to thank Gerald Rauter and Christoph Duller, who helped me execute the layout in spite of the numerous difficulties.

Also, I want to thank Gerhard Knoblinger as well as Intel Mobile Communications Austria for offering me this great opportunity in the first place.

At Graz University of Technology I want to thank my supervisor Mario Auer. He provided the necessary academic feedback and helped me with all the organizational issues related to this thesis.

Ultimately, I thank my parents and my sister, who, against all odds, somehow managed to keep me sane during the time of writing.


#### Abstract

Mobile communication systems pose extreme challenges to designers of wireless transceivers, in which the lowest possible power consumption is essential to compete in today's fast-paced markets. Focusing on I/Q modulation, one of the main sources of power dissipation is the distribution of the two quadrature local oscillator (LO) signals from the frequency synthesizer to the modulators. This thesis focuses on reducing this power dissipation by generating the quadrature LO signal from a single LO signal. First, the challenges and problems are outlined and existing quadrature generators are presented and evaluated. Then, an alternative circuit principle is developed and described in depth. An implementation of this concept is designed in a standard high-performance 28 nm CMOS technology to the point of a functional layout block. Specifications were derived from an existing system used in a next-generation wireless transceiver product.

The design is extensively evaluated with post-layout simulations to verify the feasibility of the proposed circuit principle. While fulfilling the specifications, a comparison to results published in literature reveals superior performance of the developed circuit.


## Contents

Abstract
1 Introduction
2 Theory on Phase Noise and I/Q Modulation
2.1 Phase Noise
2.2 I/Q Modulation
2.2.1 I/Q Imbalance
2.2.2 Impact of Phase Noise
3 State-of-the-Art Solutions
3.1 Frequency Division
3.2 Delay Locked Loops
3.3 Injection Locked Ring Oscillators
3.4 Phase Correction
4 Proposed Solution: Self-Aligned Open Loop Multiphase Generation
4.1 Building Blocks
4.2 Quadrature Generation
4.2.1 Analysis of Possible Imperfections
4.3 Multiphase Generation
5 Circuit Implementation
5.1 Target Specification
5.2 Circuit Design
5.2.1 Phase Shifter
5.2.2 Phase Interpolator
5.2.3 Control Logic
5.3 Layout
5.3.1 Delay Element $\Delta T$
5.3.2 Phase Interpolator
5.3.3 Top Level
5.4 Simulation Results
5.4.1 Test Bench
5.4.2 Static Performance
5.4.3 Dynamic Performance
5.4.4 Statistical Analysis
5.4.5 Figure of Merit
6 Conclusion
Bibliography

## List of Tables

5.1 Summary of the main performance characteristics
5.2 Comparison of performance parameters to literature

## List of Figures

1.1 Principal I/Q modulator structure
1.2 Quadrature signals
1.3 Basic LO distribution
1.4 LO routing strategies
2.1 Phase noise scenario with a noisy buffer
2.2 Impact of amplitude noise on sinusoidal signals
2.3 Impact of amplitude noise on trapezoidal signals
2.4 Typical (oscillator's) phase noise spectrum
2.5 Direct influence of the LO phase noise
2.6 Mixer based receiver front-end
2.7 Reciprocal mixing phenomenon
2.8 Constellation diagrams
2.9 Basic I/Q modulator structure
2.10 Basic I/Q demodulator structure
2.11 Constellation diagrams I/Q amplitude imbalance
2.12 Constellation diagrams with $10^{\circ}$ I/Q phase imbalance
2.13 Constellation diagrams with correlated phase noise
2.14 Constellation diagrams with uncorrelated phase noise
3.1 Timing diagram of division based quadrature generation
3.2 Latch-based fully differential quadrature frequency divider
3.3 Fully differential CMOS D-Latch
3.4 Block diagram of a conventional DLL
3.5 Block diagram of a tapped delay for quadrature generation
3.6 Single-ended CMOS ring oscillator with $n$ stages
3.7 Differential injection locked CMOS ring oscillator
3.8 Phase corrector stage
4.1 Example phases in the time domain
4.2 Example phases in a phasor diagram
4.3 Phase shifter element basic relations
4.4 Phase interpolator basic relations
4.5 Quadrature phase generation principle
4.6 Quadrature phase generation principle with additional phases
4.7 Block diagram of the differential quadrature generator
4.8 Block diagram of the quadrature generator with mismatch
4.9 Auxiliary phases for multiphase generation
4.10 Phasor diagram highlighting interpolated phases
4.11 Exemplary phasor diagram for $60^{\circ}$ spaced phases
5.1 Basic input-output diagram of the block
5.2 Block diagram of the implemented circuit
5.3 CMOS inverter
5.4 Pseudo differential CMOS inverter
5.5 Tunable RC-based delay
5.6 One half of the tunable delay element
5.7 Properties of the used delay element in the nominal case
5.8 Voltage mode phase interpolator
5.9 Voltage mode phase interpolator's inputs and outputs
5.10 Voltage mode phase interpolator used in this work
5.11 Phase transfer function of a voltage mode phase interpolator
5.12 Waveforms of the phase interpolator
5.13 Block diagram of the control circuitry
5.14 Phase detector
5.15 Floorplan of one half of a single stage of the delay element
5.16 Floorplan of the delay element
5.17 Floorplan of the phase interpolator
5.18 Final layout of the quadrature generator
5.19 Signal flow
5.20 The test bench used to perform the following analyses
5.21 Signal waveforms of the RC coupled simulation, positive
5.22 Signal waveforms of the RC coupled simulation, negative
5.23 Simulated I/Q phase shifts in the nominal corner
5.24 Simulated I/Q phase shifts in the slow corner
5.25 Simulated I/Q phase shifts in the fast corner
5.26 Simulated I phase noise in the nominal corner
5.27 Simulated I phase noise in the slow corner
5.28 Simulated I phase noise in the fast corner
5.29 Simulated Q phase noise in the nominal corner
5.30 Simulated Q phase noise in the slow corner
5.31 Simulated Q phase noise in the fast corner
5.32 Simulated duty cycle of the I outputs in the nominal corner
5.33 Simulated duty cycle of the I outputs in the slow corner
5.34 Simulated duty cycle of the I outputs in the fast corner
5.35 Simulated duty cycle of the Q outputs in the nominal corner
5.36 Simulated duty cycle of the Q outputs in the slow corner
5.37 Simulated duty cycle of the Q outputs in the fast corner
5.38 Current consumption in the nominal corner
5.39 Current consumption in the slow corner
5.40 Current consumption in the fast corner
5.41 Delay control code in the nominal corner
5.42 Delay control code in the slow corner
5.43 Delay control code in the fast corner
5.44 I/Q phase shift over supply voltage variations
5.45 Simulated phase noise over supply voltage variations
5.46 Simulated duty cycle over supply voltage variations
5.47 I/Q phase shift over input LO frequency
5.48 Simulated phase noise over input LO frequency
5.49 Simulated duty cycle over input LO frequency
5.50 I/Q phase shift over temperature
5.51 Phase noise over temperature
5.52 Duty cycle over temperature
5.53 Current consumption over temperature
5.54 Distribution of the I/Q phase shifts
5.55 Distribution of the in-phase duty cycles
5.56 Distribution of the quadrature duty cycles
5.57 Distribution of the in-phase phase noise
5.58 Distribution of the quadrature phase noise
5.59 Distribution of the current consumption

## 1 Introduction

Modern wireless communication standards like Universal Mobile Telecommunications System (UMTS) [1], High Speed Downlink Packet Access (HSDPA) [2] and Long Term Evolution (LTE) [3], [4] pose extreme challenges to manufacturers and designers of wireless transceivers [5], [6]. Higher-order modulation, narrow channel spacing and Orthogonal Frequency-Division Multiplexing (OFDM) set very stringent requirements on wireless radios. Moreover, handheld devices require a very low power consumption to be able to compete in today's fast-paced markets. All these requirements, combined with the challenges of deep-submicron CMOS technologies [7]-[9], make the development of wireless transceivers a complex and challenging task.

All of the above standards use higher order Quadrature Amplitude Modulation (QAM) to translate the low-frequency data stream to the respective radio frequency (RF) signal that is transmitted (see Section 2.2 for further details). The basic structure of a quadrature modulator is shown in Figure 1.1.


Figure 1.1: Principal I/Q modulator structure [10], [11]

The digital data is split into two streams, each of which is converted to an analog signal by a digital-to-analog (D/A) converter. These signals are then multiplied with the output of the Local Oscillator (LO), which usually is a sinusoidal or a trapezoidal signal already at the desired radio frequency. Note that the LO signals for the two multiplications are shifted by $90^{\circ}$ with respect to each other; together they form the quadrature LO.

Finally the two resulting signals are added, amplified and transmitted via an antenna.

In general, quadrature signals describe two periodic signals (which usually have the same form) that exhibit a phase shift of $90^{\circ}$. Let $T=1 / f$ denote the period of the signals, then the delay between the two signals is $T / 4$. The lagging signal is called in-phase (I) and its phase is usually set to zero. The leading signal is called quadrature $(\mathrm{Q})$ and ideally its phase is $90^{\circ}=\frac{\pi}{2}$ [12].
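
The $T/4$ relation can be illustrated numerically. The following minimal sketch models an idealised trapezoidal LO and derives the quadrature signal by advancing it by $T/4$. The function names, the 10 % edge time and the 2 GHz example frequency are assumptions made for illustration only; they are not taken from the implemented circuit.

```python
def quadrature_delay(f_lo: float) -> float:
    """Delay that corresponds to a 90 degree phase shift: T/4 = 1/(4*f)."""
    return 1.0 / (4.0 * f_lo)


def trapezoid(t: float, period: float, edge_fraction: float = 0.1) -> float:
    """Idealised trapezoidal wave between 0 and 1 (edge time is an assumption)."""
    t_edge = edge_fraction * period
    x = t % period
    if x < t_edge:                      # rising edge
        return x / t_edge
    if x < period / 2:                  # high state
        return 1.0
    if x < period / 2 + t_edge:         # falling edge
        return 1.0 - (x - period / 2) / t_edge
    return 0.0                          # low state


def lo_i(t: float, f_lo: float) -> float:
    """In-phase LO: the lagging signal, phase reference zero."""
    return trapezoid(t, 1.0 / f_lo)


def lo_q(t: float, f_lo: float) -> float:
    """Quadrature LO: the leading signal, advanced by T/4."""
    return trapezoid(t + quadrature_delay(f_lo), 1.0 / f_lo)
```

For an assumed 2 GHz LO, `quadrature_delay(2e9)` evaluates to 125 ps, a quarter of the 500 ps period.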

Figure 1.2 shows sinusoidal and trapezoidal quadrature signals. In this work the focus lies only on trapezoidal (digital) signals, as these are beneficial to use with CMOS devices.


Figure 1.2: Quadrature signals, in-phase in red and quadrature in blue

In multi-standard capable transceivers [13]-[15] several of these modulators are used to serve the various bands of the different standards. These blocks usually occupy a large area, resulting in long distances between the LO synthesizer and the mixers of the modulators, which can reach the millimeter range, as illustrated in Figure 1.3.


Figure 1.3: Basic LO distribution

Such long distribution networks, which are essentially large capacitive loads, are especially detrimental to power consumption, as several powerful buffers have to be inserted to maintain signal fidelity [16], [17]. Furthermore, distributing quadrature signals doubles power consumption and area, as two LO signals are routed across the chip. It is therefore beneficial to generate the quadrature signals locally, in the immediate vicinity of the modulator, while only one LO signal is distributed across the chip.


Figure 1.4: LO routing strategies

In this thesis a circuit generating quadrature LO signals (see Figure 1.4b) is developed to counter the power and area penalty of routing two critical oscillator signals (see Figure 1.4a) over long distances. Existing topologies are compared and the most promising solution is designed and implemented in layout. The exact specifications are given in Section 5.1.

Although modulators and demodulators used in wireless communications are by no means the only area of application for quadrature LO signals, they generally pose the most demanding requirements in terms of phase noise and phase accuracy. Other applications for quadrature LOs include Digital-to-Time Converters [18], beamforming applications [19] and many more [12].

This thesis is structured as follows: Chapter 2 provides a brief overview of the theoretical topics of phase noise and quadrature modulation. Chapter 3 covers related and previous work, Chapter 4 describes the proposed concept in detail and Chapter 5 presents the implemented circuit. Finally, Chapter 6 concludes this thesis and provides an outlook.

## 2 Theory on Phase Noise and I/Q Modulation

This chapter provides an overview of the theoretical background required for understanding the specifications for a quadrature generator. First, the important topic of phase noise and jitter is discussed. Second, applications of I/Q signals and especially I/Q modulation are discussed, including an overview of the imperfections and error sources of real circuits and their impact on the performance of the modulation.

### 2.1 Phase Noise

All resistive and active electronic devices, such as resistors and transistors, exhibit noise. While mechanisms and causes for noise may differ, the outcome is the same: a degradation of signal quality [20].

These devices exhibit stationary current and/or voltage noise, which usually depends upon their sizing and bias point. When this bias point is periodically changing with time, as in oscillators and circuits fed by oscillators, the noise becomes cyclostationary: in this case not only the amplitude noise is present but also phase noise [21].

Phase noise can be defined as random fluctuations $\varphi_{n}(t)$ of the phase of a periodic signal.

$$
\begin{equation*}
v(t)=A \cos \left(\omega t+\varphi_{n}(t)\right) \tag{2.1}
\end{equation*}
$$

To better understand how amplitude fluctuations of devices convert to phase noise, the scenario with a noisy voltage buffer shown in Figure 2.1 is examined. First, assume that the noise introduced by the buffer acts as an additive voltage pulse ${ }^{1}$ at the output that is short in time compared to the signal's period. The resulting signals are shown in Figure 2.2 for two different instants at which the voltage pulse occurs.

Figure 2.1: Phase noise scenario with a noisy buffer


Figure 2.2: Impact of the voltage pulse at different timing instances for sinusoidal LO signals

The voltage pulse near the maximum, shown in Figure 2.2b, has negligible influence on the phase. In contrast, the voltage pulse near the zero crossing, shown in Figure 2.2a, causes a shift of the zero crossing and therefore a large deviation of the phase.

It follows that the phase variation caused by arbitrary noise depends on the waveform of the signal itself. Consider the same scenario with a trapezoidal signal and an inverter as the buffer. Clearly, any pulses that occur during a high or low state cannot alter the phase at all, while a pulse occurring during a transition strongly influences the time of the zero crossing, as shown in Figure 2.3. This phenomenon of uncertain transition timing is generally referred to as (timing) jitter.
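
The conversion from a voltage disturbance to a timing error can be estimated to first order by dividing the disturbance by the slope of the signal at the crossing. The sketch below applies this estimate to a sinusoid at its zero crossing and to a trapezoidal edge. It is a simplified illustration under the assumption of a small, short pulse; all names and example values (amplitudes, edge time, pulse height) are hypothetical.

```python
import math


def edge_time_shift(delta_v: float, slew_rate: float) -> float:
    """First-order shift of a threshold crossing: dt = dv / (dv/dt)."""
    return delta_v / slew_rate


def sine_slew_at_zero_crossing(amplitude: float, f_lo: float) -> float:
    """Slope of A*sin(2*pi*f*t) at its zero crossing: 2*pi*f*A."""
    return 2.0 * math.pi * f_lo * amplitude


def trapezoid_slew(swing: float, t_edge: float) -> float:
    """Slope of a linear edge with the given voltage swing and edge time."""
    return swing / t_edge


def phase_error_deg(dt: float, f_lo: float) -> float:
    """Timing error expressed as a phase error of the LO in degrees."""
    return 360.0 * f_lo * dt


if __name__ == "__main__":
    f_lo = 2e9       # assumed LO frequency
    pulse = 10e-3    # assumed 10 mV disturbance
    dt_sine = edge_time_shift(pulse, sine_slew_at_zero_crossing(0.5, f_lo))
    dt_trap = edge_time_shift(pulse, trapezoid_slew(1.0, 50e-12))
    print(phase_error_deg(dt_sine, f_lo), phase_error_deg(dt_trap, f_lo))
```

A pulse that arrives while the trapezoidal signal rests in its high or low state never reaches the threshold in this model and therefore contributes no timing error, which matches the qualitative argument above.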

[^0]

Figure 2.3: Impact of the voltage pulse at different timing instances for trapezoidal LO signals

It can be shown [21], [22] that this phase noise translates to a frequency spectrum similar to the one shown in Figure 2.4. The phase noise is a random phase modulation of the periodic signal which becomes visible as sidebands close to the carrier.


Figure 2.4: Typical (oscillator's) phase noise spectrum

The dependency of the spectrum on the offset frequency $\Delta f$ to the carrier $f_{\text {LO }}$ can be modeled for oscillators [21]. The flat region observed for large $\Delta f$ is caused by the white noise sources of the circuit and is not necessarily related to the phase modulation mechanism of phase noise. In contrast, the $1 / \Delta f^{2}$ regime stems from the conversion of the white noise sources to phase noise. Finally, the $1 / \Delta f^{3}$ region additionally contains low-frequency noise such as flicker noise (which has a $1 / f$ frequency dependence) converted to phase noise. More detailed analyses can be found in the literature [21], [22].

Phase noise is especially critical in telecommunication systems [23]. Consider a frequency-division full-duplex scenario in which the receive and transmit channels are very close in frequency. Suppose the LO exhibits phase noise: as shown in Figure 2.5, this unwanted noise power can degrade the quality of the received signal.


Figure 2.5: Direct influence of the LO phase noise on a neighboring channel

But this is not the only problem concerning the LO's phase noise. In typical mixer based receiver front-ends (see Figure 2.6) [24], [25], the received radio frequency (RF) signal is mixed down with the LO to an intermediate frequency (IF). The resulting intermediate frequency is the difference between the LO and the RF signal frequency, $f_{\mathrm{IF}}=\left|f_{\mathrm{LO}}-f_{\mathrm{RF}}\right|$.


Figure 2.6: Mixer based receiver front-end

Consider the case (see Figure 2.7) in which the desired RF signal is converted to the intermediate frequency $f_{\text {IF }}$ while a strong adjacent interferer is present at $f_{i}$.

In the noiseless case shown in Figure 2.7a, both the desired RF signal and the interferer are converted down to intermediate frequencies $f_{\mathrm{IF}}$ and $f_{i}^{\prime}$. They stay well separated and the interferer does not degrade signal quality.

In the noisy case shown in Figure 2.7b, an effect called reciprocal mixing can be observed [22], [26]. The LO exhibits phase noise, so the RF signal and the interferer are mixed not only with the carrier but also with the noise sidebands. In particular, the interferer is mixed with the non-negligible phase noise at the offset for which the frequency difference equals the IF, so that part of the interferer's power falls onto the desired signal. Depending upon the power of the phase noise at that offset, the signal quality is degraded.


Figure 2.7: Reciprocal mixing phenomenon

Phase noise usually is expressed as the ratio between the single-sideband power density and the carrier power (in dBc, decibels relative to the carrier) normalized to a 1 Hz bandwidth [22]. The unit of phase noise is $\mathrm{dBc} / \mathrm{Hz}$ (decibels relative to the carrier per hertz). The symbol for phase noise is $\mathcal{L}(\Delta f)$, where $\Delta f$ is the frequency offset from the carrier.

In specifications, limits on phase noise are usually defined at individual frequency offsets. For example, the LO at the mixer input must have a phase noise lower than $-145 \mathrm{dBc} / \mathrm{Hz}$ at a frequency offset of 1 MHz and lower than $-160 \mathrm{dBc} / \mathrm{Hz}$ at 100 MHz offset.

An empirical model of phase noise spectrum was introduced by Leeson [27]:

$$
\begin{equation*}
\mathcal{L}(\Delta f)=10 \log _{10}\left[\frac{K}{P_{s}}\left(\left(\frac{f_{\mathrm{LO}}}{2 Q \Delta f}\right)^{2}+1\right)\left(\frac{f_{c}}{\Delta f}+1\right)\right] \tag{2.2}
\end{equation*}
$$

in $\mathrm{dBc} / \mathrm{Hz}$, where $K$ and $Q$ are fitting parameters, $P_{s}$ is the signal power and $f_{c}$ is the flicker noise corner frequency. The power of phase noise depends on [22]:

- Carrier frequency: proportional to $f_{\text {LO }}^{2}$
- Offset frequency: depending on the regime proportional to $1 / \Delta f^{\alpha}$, where $\alpha \geq 3$ for very small offsets, $\alpha=2$ for small to medium offset frequencies and $\alpha=0$ for large offset frequencies
- Signal power: inversely proportional to the signal power, as the phase noise is measured relatively to the signal power
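
Equation 2.2 and the three offset regimes listed above can be explored with a direct implementation. The following sketch evaluates Leeson's model; the parameter values in the usage example (quality factor, flicker corner, noise floor) are arbitrary assumptions chosen only to make the regimes visible and do not describe any circuit in this thesis.

```python
import math


def leeson_phase_noise(delta_f: float, f_lo: float, q: float,
                       f_c: float, k: float, p_s: float) -> float:
    """Phase noise in dBc/Hz according to Equation 2.2."""
    resonator_term = (f_lo / (2.0 * q * delta_f)) ** 2 + 1.0
    flicker_term = f_c / delta_f + 1.0
    return 10.0 * math.log10(k / p_s * resonator_term * flicker_term)


def slope_db_per_decade(delta_f: float, **params: float) -> float:
    """Drop of the phase noise over one decade starting at delta_f."""
    return (leeson_phase_noise(delta_f, **params)
            - leeson_phase_noise(10.0 * delta_f, **params))


if __name__ == "__main__":
    # Hypothetical example values, not taken from the thesis.
    params = dict(f_lo=2e9, q=10.0, f_c=100e3, k=1e-16, p_s=1.0)
    for offset in (100.0, 1e6, 1e10):
        print(offset, leeson_phase_noise(offset, **params),
              slope_db_per_decade(offset, **params))
```

With these assumed values the model drops by roughly 30 dB per decade below the flicker corner, by roughly 20 dB per decade at medium offsets, and flattens towards $10 \log_{10}(K / P_{s})$ at very large offsets.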

Although in this section phase noise was primarily explained with sinusoidal signals, the same applies for trapezoidal signals. As previously shown in Figures 2.2 and 2.3, trapezoidal signals are only vulnerable to noise during the transition time, which is usually kept short compared to the period of the signal. Furthermore, digital (trapezoidal) signals are preferred when using CMOS devices.

There are two possibilities to reduce phase noise: decreasing the devices' amplitude noise and reducing the amount of amplitude noise converted to phase noise. The trade-offs between current consumption and noise are similar to those encountered when lowering a transistor's thermal noise.

### 2.2 I/Q Modulation

Digital quadrature amplitude modulation (QAM) was introduced by Cahn [28] in 1960. He suggested the combination of digital amplitude and phase modulation. Constellation diagrams for all three modulation schemes (amplitude modulation, phase modulation, and combined amplitude and phase modulation) are shown in Figure 2.8 for eight and sixteen symbols respectively. Although square QAM is quite common, many different phasor constellations are possible. The real axis usually is called in-phase (I) and the imaginary axis quadrature (Q) [10].


Figure 2.8: Constellation diagrams

In a constellation diagram, the possible phase and amplitude combinations are represented as phasors in the complex plane. Each individual phasor represents a data symbol and corresponds to one value of the complex baseband signal $b(t)$. Since both amplitude and phase are modulated, QAM allows complex signals to be transmitted. The physical output signal $s(t)$ therefore is

$$
\begin{align*}
s(t) & =\Re\left\{b(t) \mathrm{e}^{j \omega_{\mathrm{LO}} t}\right\}  \tag{2.3}\\
& =\Re\left\{(I(t)+j Q(t)) \mathrm{e}^{j \omega_{\mathrm{LO}} t}\right\} \\
& =I(t) \cos \left(\omega_{\mathrm{LO}} t\right)-Q(t) \sin \left(\omega_{\mathrm{LO}} t\right)  \tag{2.4}\\
& =I(t) \cos \left(\omega_{\mathrm{LO}} t\right)+Q(t) \cos \left(\omega_{\mathrm{LO}} t+\frac{\pi}{2}\right)
\end{align*}
$$

where $\omega_{\text {LO }}$ is the angular frequency of the LO. The complex baseband signal $b(t)$ can be represented either in real and imaginary part (Cartesian coordinates) or as amplitude and phase (polar coordinates).

$$
\begin{equation*}
b(t)=I(t)+j Q(t)=\sqrt{I^{2}(t)+Q^{2}(t)} \, \mathrm{e}^{j \arg \{I(t)+j Q(t)\}} \tag{2.5}
\end{equation*}
$$
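
The equivalence of Equations 2.3 and 2.4 and of the polar form in Equation 2.5 can be checked numerically. The sketch below evaluates the modulated signal in three ways for a single constant symbol; the function names and the example values are illustrative assumptions.

```python
import cmath
import math


def modulate_complex(i: float, q: float, omega_lo: float, t: float) -> float:
    """Equation 2.3: real part of b(t) * exp(j*omega_LO*t)."""
    return ((i + 1j * q) * cmath.exp(1j * omega_lo * t)).real


def modulate_quadrature(i: float, q: float, omega_lo: float, t: float) -> float:
    """Equation 2.4: I*cos(omega_LO*t) - Q*sin(omega_LO*t)."""
    return i * math.cos(omega_lo * t) - q * math.sin(omega_lo * t)


def to_polar(i: float, q: float) -> tuple[float, float]:
    """Equation 2.5: amplitude and phase of the complex baseband symbol."""
    return math.hypot(i, q), math.atan2(q, i)


def modulate_polar(i: float, q: float, omega_lo: float, t: float) -> float:
    """Polar modulation: A*cos(omega_LO*t + phi)."""
    amplitude, phase = to_polar(i, q)
    return amplitude * math.cos(omega_lo * t + phase)


if __name__ == "__main__":
    omega = 2.0 * math.pi * 2e9   # assumed 2 GHz LO
    for k in range(4):
        t = k * 0.137e-9
        print(modulate_complex(0.7, -0.3, omega, t),
              modulate_quadrature(0.7, -0.3, omega, t),
              modulate_polar(0.7, -0.3, omega, t))
```

All three functions return the same value up to floating-point rounding, which is why either the Cartesian or the polar representation can be used to generate the same RF signal.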

As Equation 2.5 suggests, both representations can be used to generate QAM signals. Using amplitude and phase as suggested by Cahn [28] is known as polar modulation [29], [30]. This work focuses on quadrature modulation, using the $I(t)$ and $Q(t)$ signals.

To demodulate the signals in a quadrature receiver, the orthogonality of the sine and cosine terms is exploited. To reconstruct the in-phase signal, the received signal $r(t)$ is multiplied with the in-phase LO again (assuming an ideal channel, i.e. $r(t)=s(t)$):

$$
\begin{align*}
I_{r}(t) & =r(t) \cos \left(\omega_{\mathrm{LO}} t\right) \\
& =I(t) \cos \left(\omega_{\mathrm{LO}} t\right) \cos \left(\omega_{\mathrm{LO}} t\right)-Q(t) \sin \left(\omega_{\mathrm{LO}} t\right) \cos \left(\omega_{\mathrm{LO}} t\right) \\
& =\frac{1}{2} I(t)+\frac{1}{2}\left(I(t) \cos \left(2 \omega_{\mathrm{LO}} t\right)-Q(t) \sin \left(2 \omega_{\mathrm{LO}} t\right)\right) \tag{2.6}
\end{align*}
$$

The component at twice the LO frequency can be removed with a low-pass filter, so that only the original signal remains ${ }^{2}$. To reconstruct the quadrature signal one proceeds analogously:

[^1]

$$
\begin{align*}
Q_{r}(t) & =-r(t) \sin \left(\omega_{\mathrm{LO}} t\right) \\
& =-I(t) \cos \left(\omega_{\mathrm{LO}} t\right) \sin \left(\omega_{\mathrm{LO}} t\right)+Q(t) \sin \left(\omega_{\mathrm{LO}} t\right) \sin \left(\omega_{\mathrm{LO}} t\right) \\
& =\frac{1}{2} Q(t)-\frac{1}{2}\left(I(t) \sin \left(2 \omega_{\mathrm{LO}} t\right)+Q(t) \cos \left(2 \omega_{\mathrm{LO}} t\right)\right) \tag{2.7}
\end{align*}
$$
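
The demodulation described by Equations 2.6 and 2.7 can be reproduced with a few lines of code. In the sketch below the low-pass filter is replaced by an average over an integer number of LO periods, which removes the components at twice the LO frequency for a constant symbol. This averaging filter, the function name and the sampling parameters are simplifying assumptions for illustration.

```python
import math


def demodulate(i: float, q: float,
               samples_per_period: int = 64, periods: int = 4) -> tuple[float, float]:
    """Recover a constant symbol (I, Q) from the modulated signal."""
    n = samples_per_period * periods
    acc_i = 0.0
    acc_q = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / samples_per_period     # omega_LO * t
        r = i * math.cos(theta) - q * math.sin(theta)      # Equation 2.4, r(t) = s(t)
        acc_i += r * math.cos(theta)                       # Equation 2.6
        acc_q += -r * math.sin(theta)                      # Equation 2.7
    # Averaging acts as the low-pass filter; the factor 2 undoes the 1/2 scaling.
    return 2.0 * acc_i / n, 2.0 * acc_q / n


if __name__ == "__main__":
    print(demodulate(3.0, -1.0))
```

Because the sine and cosine carriers are orthogonal over a full period, the in-phase branch contains no contribution from $Q$ and vice versa.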

A basic quadrature modulator structure is shown in Figure 2.9. The actual composition and the use of an intermediate frequency depend on the architecture used.


Figure 2.9: Basic I/Q modulator structure [10], [11]

The digital data is split into in-phase and quadrature components. The resulting signals are converted to analog voltages or currents (and usually also filtered). Then, as Equation 2.4 suggests, both signals are mixed with the respective I and Q RF carrier and then summed. The result usually is amplified by a power amplifier before transmission. Practical implementations may also use alternative setups, but the requirements on the quadrature generator remain similar [31], [32].

In Figure 2.10 the analogous receiver and demodulator structure is depicted. The demodulator operates as Equations 2.6 and 2.7 suggest.

The main impairments of the quadrature modulator are imbalances between the two paths, which translate to I/Q imbalances, and (oscillator) phase noise [33]-[36]. Therefore, these are also important sources of errors for the quadrature generator, degrading the overall performance of the system.


Figure 2.10: Basic I/Q demodulator structure [11]

### 2.2.1 I/Q Imbalance

I/Q imbalance describes the deviation of the real in-phase and quadrature signals from the ideal signals, which exhibit a perfect $90^{\circ}$ phase shift and equal amplitudes. There are two types of I/Q imbalance [36]:

- amplitude imbalance: differences of the amplitudes of the I and Q signals
- phase imbalance: deviations from the $90^{\circ}$ phase difference between I and Q

When using sinusoidal LO signals, both amplitude and phase imbalances are critical. For trapezoidal signals in CMOS environments, amplitude imbalances can be neglected since the amplitudes are inherently limited by the supply voltage. Instead, imbalances and deviations in the duty cycle are more critical [37].

Generally, the quadrature generator produces the following signals

$$
\mathrm{LO}_{I}(t)=A_{I} \cos \left(\omega_{\mathrm{LO}} t+\phi_{I}\right) \quad \mathrm{LO}_{Q}(t)=A_{Q} \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}+\phi_{Q}\right)
$$

The amplitude imbalance is defined as $m=\frac{A_{Q}}{A_{I}}$ and the phase imbalance as $\Delta \phi=\phi_{Q}-\phi_{I}$. Without loss of generality the amplitude $A_{I}$ is set to unity and the phase is set to zero ( $\phi_{I}=0$ ). Thus, the output of the quadrature generator can be rewritten as

$$
\begin{equation*}
\mathrm{LO}_{I}(t)=\cos \left(\omega_{\mathrm{LO}} t\right) \quad \mathrm{LO}_{Q}(t)=m \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right) \tag{2.8}
\end{equation*}
$$

If the modulation is otherwise perfect, the resulting modulated signal is

$$
\begin{equation*}
s(t)=I(t) \cos \left(\omega_{\mathrm{LO}} t\right)+m Q(t) \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right) \tag{2.9}
\end{equation*}
$$

If $s(t)$ is demodulated with an ideal quadrature LO signal, the in-phase reconstruction is (assuming an ideal transmission channel, i.e. $r(t)=s(t)$ )

$$
\begin{align*}
I_{r}(t)= & r(t) \cos \left(\omega_{\mathrm{LO}} t\right) \\
= & \left(I(t) \cos \left(\omega_{\mathrm{LO}} t\right)+m Q(t) \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right)\right) \cos \left(\omega_{\mathrm{LO}} t\right) \\
= & \frac{1}{2} I(t)\left(1+\cos \left(2 \omega_{\mathrm{LO}} t\right)\right) \\
& +\frac{1}{2} m Q(t)\left(\cos \left(90^{\circ}+\Delta \phi\right)+\cos \left(2 \omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right)\right) \\
= & \frac{1}{2}(I(t)-m Q(t) \sin (\Delta \phi))  \tag{2.10}\\
& +\frac{1}{2}\left(I(t) \cos \left(2 \omega_{\mathrm{LO}} t\right)+m Q(t) \cos \left(2 \omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right)\right)
\end{align*}
$$

and similarly for the quadrature reconstruction

$$
\begin{align*}
Q_{r}(t)= & r(t) \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}\right) \\
= & \left(I(t) \cos \left(\omega_{\mathrm{LO}} t\right)+m Q(t) \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}+\Delta \phi\right)\right) \cos \left(\omega_{\mathrm{LO}} t+90^{\circ}\right) \\
= & \frac{1}{2} I(t) \cos \left(2 \omega_{\mathrm{LO}} t+90^{\circ}\right)+\frac{m Q(t)}{2}\left(\cos (\Delta \phi)-\cos \left(2 \omega_{\mathrm{LO}} t+\Delta \phi\right)\right) \\
= & \frac{1}{2} m Q(t) \cos (\Delta \phi)  \tag{2.11}\\
& +\frac{1}{2}\left(I(t) \cos \left(2 \omega_{\mathrm{LO}} t+90^{\circ}\right)-m Q(t) \cos \left(2 \omega_{\mathrm{LO}} t+\Delta \phi\right)\right)
\end{align*}
$$

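Due to the quadrature imbalance, the in-phase and quadrature signals can no longer be separated completely: the reconstructed in-phase signal in Equation 2.10 contains a contribution from $Q(t)$, and the reconstructed quadrature signal in Equation 2.11 is scaled by $m \cos (\Delta \phi)$. Both the transmitter's and the receiver's LOs influence the separability, and in a real system both LOs will show imperfections. Furthermore, the receiver's LO will not be perfectly aligned with the signal, which is another source of errors [36].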
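
The low-frequency terms of Equations 2.10 and 2.11 directly give the distorted constellation points. The following sketch applies them to a square 16-QAM constellation; the common factor $\frac{1}{2}$ is dropped, and the constellation levels and function names are assumptions for illustration.

```python
import math

# Square 16-QAM with levels -3, -1, 1, 3 on both axes (illustrative choice).
SQUARE_16QAM = [(float(i), float(q)) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]


def received_symbol(i: float, q: float, m: float = 1.0,
                    delta_phi_deg: float = 0.0) -> tuple[float, float]:
    """Baseband terms of Equations 2.10 and 2.11 (factor 1/2 dropped)."""
    delta_phi = math.radians(delta_phi_deg)
    i_r = i - m * q * math.sin(delta_phi)   # Q leaks into the I branch
    q_r = m * q * math.cos(delta_phi)       # Q branch is scaled
    return i_r, q_r


def worst_case_error(m: float, delta_phi_deg: float) -> float:
    """Largest displacement of any constellation point."""
    return max(
        math.hypot(i_r - i, q_r - q)
        for (i, q) in SQUARE_16QAM
        for (i_r, q_r) in [received_symbol(i, q, m, delta_phi_deg)]
    )


if __name__ == "__main__":
    print(worst_case_error(1.1, 0.0))    # amplitude imbalance as in Figure 2.11
    print(worst_case_error(1.0, 10.0))   # phase imbalance as in Figure 2.12
```

With a pure amplitude imbalance only the quadrature coordinate is stretched, whereas a phase imbalance shears the constellation because the quadrature data leaks into the in-phase branch.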

In Figure 2.11 the impact of $10 \%$ amplitude imbalance ( $m=1.1$ ) on the constellation diagram is shown for circular and square 16-QAM. Similarly, Figure 2.12 shows the constellation diagrams with a phase imbalance of $\Delta \phi=10^{\circ}$.

Figure 2.11: Constellation diagrams with $10 \%$ I/Q amplitude imbalance ( $m=1.1$ )

Figure 2.12: Constellation diagrams with $10^{\circ}$ I/Q phase imbalance $\left(\Delta \phi=10^{\circ}\right)$

I/Q imbalances can usually be compensated with digital signal processing [38]-[42]. To this end, the initial (static) imbalance is measured, for instance at the initialization of the system or dynamically during operation, and then corrected digitally. Even though powerful compensation algorithms are available, the imbalance must stay below an algorithm-specific limit to guarantee high-performance operation. Furthermore, depending on the algorithms employed, the dynamic I/Q imbalance (the change during operation) has to be even smaller so that its impact on the operation remains negligible.

### 2.2.2 Impact of Phase Noise

The very basics of phase noise have already been discussed in Section 2.1. In this section the impact of phase noise on quadrature modulation is briefly covered.

With I/Q modulation, phase noise has a specific impact on the generated output signal [43]-[47]. A lot of effort has been put into finding optimal codes and constellations [48], [49] to reduce the effects of phase noise, yielding different results for various scenarios.

For quadrature modulation one has to distinguish between correlated and uncorrelated noise on the I and Q LO paths. Correlated noise is identical on the in-phase and quadrature LO signals and is generated by the input LO. The uncorrelated parts in the I and Q paths originate from the different devices in the two paths. In other words, noise introduced before the split into the I and Q LO paths (e.g. coming from the single-phase oscillator) is the same on both signals, while noise introduced after the split is independent between the two paths.

Therefore, it is possible to write the phase noise (see Equation 2.1) of either I or Q as

$$
\begin{equation*}
\varphi_{n}(t)=\varphi_{n, \mathrm{LO}}(t)+\varphi_{n, \text { Path }}(t) \tag{2.12}
\end{equation*}
$$

where $\varphi_{n, \mathrm{LO}}(t)$ accounts for the correlated phase noise and $\varphi_{n, \text { Path }}(t)$ for the phase noise of an uncorrelated single path.
These two noise contributions affect the output of the modulation differently. On the one hand, the correlated phase noise portion impacts both I and Q equally. Therefore, all phasors in the constellation diagram are randomly rotated over time by the correlated phase noise; hence no I/Q imbalance is introduced by it (see Figure 2.13).

On the other hand, the uncorrelated part causes a random I/Q imbalance as I and Q are affected differently, as shown in Figure 2.14.



Figure 2.13: Constellation diagrams with correlated phase noise

Thus in total, phase noise causes a random rotation of the phasors, which, in theory, does not corrupt the transmitted information, and a random I/Q imbalance, which impairs the signal.


Figure 2.14: Constellation diagrams with uncorrelated phase noise
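The distinction between the two noise contributions can be reproduced with a small behavioural model. The sketch below assumes the modulated signal $I \cos (\omega t+\varphi_{I})-Q \sin (\omega t+\varphi_{Q})$, where each path phase follows Equation 2.12, and draws the noise from a Gaussian distribution. The Gaussian assumption, the function names and the parameters `sigma_lo` and `sigma_path` are illustrative choices, not part of the thesis.

```python
import cmath
import math
import random


def rotate_correlated(symbol, phi):
    """Correlated phase noise: both carriers share the same phase error phi (rad)."""
    return symbol * cmath.exp(1j * phi)


def distort_uncorrelated(symbol, phi_i, phi_q):
    """Independent phase errors phi_i and phi_q (rad) on the I and Q carriers.

    Assumed model: s(t) = I*cos(wt + phi_i) - Q*sin(wt + phi_q).
    """
    i, q = symbol.real, symbol.imag
    i_out = i * math.cos(phi_i) - q * math.sin(phi_q)
    q_out = i * math.sin(phi_i) + q * math.cos(phi_q)
    return complex(i_out, q_out)


def noisy_copies(symbol, sigma_lo, sigma_path, count=1000, seed=0):
    """Draw `count` received points for one symbol, composing the noise as in Equation 2.12."""
    rng = random.Random(seed)
    points = []
    for _ in range(count):
        common = rng.gauss(0.0, sigma_lo)            # correlated part from the LO
        phi_i = common + rng.gauss(0.0, sigma_path)  # uncorrelated part of the I path
        phi_q = common + rng.gauss(0.0, sigma_path)  # uncorrelated part of the Q path
        points.append(distort_uncorrelated(symbol, phi_i, phi_q))
    return points
```

With `sigma_path=0` all points stay on a circle around the origin (pure rotation), whereas a nonzero `sigma_path` also changes the magnitude of the received points.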

## 3 State-of-the-Art Solutions

In this chapter, state-of-the-art solutions to generate quadrature LO signals are briefly described. This thesis focuses on the generation of CMOS quadrature signals; therefore, the generation of sinusoidal I/Q signals is not addressed.

### 3.1 Frequency Division

Quadrature generation by frequency division uses an input clock at $2 \cdot f_{\mathrm{LO}}$, twice the I/Q output frequency $f_{\mathrm{LO}}$. The rising edges of the input clock define the Q-phase transitions, while the falling edges define the I-phase transitions. The corresponding timing diagram is shown in Figure 3.1. This concept can be expanded to generate $n$ equally spaced phases: an $n \cdot f_{\mathrm{LO}}$ clock is divided by $n$, where every $n$-th edge defines an output transition.


Figure 3.1: Timing diagram of the frequency division based quadrature generation

Possible implementations, based on D-Latches in the master-slave configuration, are presented in [50]-[52] and sketched in Figure 3.2. The CMOS implementation of the fully differential latch is shown in Figure 3.3.

A benefit of this structure, besides its simplicity, is its wide range of operating frequencies. The upper limit is determined by the time required to recharge the parasitic capacitances at the internal nodes and is mainly defined by the technology used.

Figure 3.2: Latch-based fully differential quadrature frequency divider [51], [52]

Figure 3.3: Fully differential CMOS D-Latch [52]

On the downside, this solution requires the generation and distribution of an LO signal at twice (or $n$ times) the desired output frequency, which usually is very power intensive. This is the main challenge when considering low-noise, low-power designs, because phase noise increases with frequency (given a certain power consumption) as mentioned in Section 2.1. Furthermore, input duty cycles deviating from 0.5 result in I/Q imbalance [50].
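The dependence of the quadrature accuracy on the input duty cycle can be illustrated with an event-based model of a divide-by-two stage. The following sketch is a simplification under stated assumptions: time is normalized to the input clock period, the latches are ideal, and the assignment of the master latch to the I output and the slave latch to the Q output is an assumption of this model rather than a detail taken from [50]-[52].

```python
def divider_rising_edges(duty_cycle, n_input_periods=8):
    """Simulate a master-slave divide-by-two driven by a clock at 2*f_LO.

    The input clock rises at t = k and falls at t = k + duty_cycle
    (time normalized to the input period). Returns the rising-edge times
    of the I and Q outputs.
    """
    i_state, q_state = 0, 0
    i_edges, q_edges = [], []
    for k in range(n_input_periods):
        # Rising input edge: the slave latch copies the master (Q transition).
        new_q = i_state
        if new_q == 1 and q_state == 0:
            q_edges.append(float(k))
        q_state = new_q
        # Falling input edge: the master latch takes the inverted slave (I transition).
        new_i = 1 - q_state
        if new_i == 1 and i_state == 0:
            i_edges.append(k + duty_cycle)
        i_state = new_i
    return i_edges, q_edges


def quadrature_phase_deg(duty_cycle):
    """Phase from a rising I edge to the next rising Q edge, in degrees of f_LO."""
    i_edges, q_edges = divider_rising_edges(duty_cycle)
    output_period = i_edges[1] - i_edges[0]
    first_q_after_i = min(t for t in q_edges if t > i_edges[0])
    return 360.0 * (first_q_after_i - i_edges[0]) / output_period
```

In this model a duty cycle of 0.5 yields a $90^{\circ}$ spacing, while a duty cycle of 0.55 shifts the spacing by $9^{\circ}$, because the I transitions follow the falling input edges.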

### 3.2 Delay Locked Loops

Delay-locked loops (DLLs) [53]-[55] can be used for various purposes, such as clock data recovery (CDR) [56], [57], phase interpolation [58] and multiphase clock generation [59]-[63]. The basic principle of a DLL is shown in Figure 3.4.


Figure 3.4: Block diagram of a conventional DLL [59]-[61], [63]

A DLL consists of a variable delay $\Delta T$, a phase detector (PD) and a loop filter (LF). The delay line is driven by a reference signal $\varphi_{\text {ref }}$ and produces the output signal $\varphi_{\text {out }}$. In many analog CMOS implementations, a charge pump [59] is additionally inserted between the PD and the LF to convert the phase information into a voltage and to drive the LF. The variable delay usually has one or more control voltages influencing the effective delay. This topology employs a feedback loop, adjusting the delay $\Delta T$ to the input frequency as well as to process, voltage and temperature variations. Therefore, a DLL is a dynamic system. In literature many extensions to the basic DLL topology can be found to counter various problems and to improve performance [56]-[63].

The DLL is locked when the phase difference between the reference and the output signal has a defined value. Usually a phase difference of $0^{\circ}$ or $180^{\circ}$ is detected, resulting in an effective phase shift of $360^{\circ}$ or $180^{\circ}$ from the input to the output.

To generate multiple phases with a DLL, the delay element (drawn as a monolithic block in Figure 3.4) is tapped at the desired phase shifts [63]. This is shown exemplarily for quadrature phases in Figure 3.5. To generate more phases, the delay has to be split further.


Figure 3.5: Block diagram of a tapped delay for quadrature generation

The DLL is a simple structure to generate a multiphase LO. Unfortunately, it suffers from several issues: a long delay (in the range of the input signal's period) is required, which introduces noise and increases power consumption. Matching between the single delay elements is critical for multiphase generation, as mismatch introduces phase errors. The topology is a feedback structure, which in turn has its own dynamic behavior and may pose stability issues.
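The locking behaviour and the sensitivity to delay mismatch can be sketched with a first-order behavioural model. The snippet below is a strongly simplified illustration: the phase detector, charge pump and loop filter are lumped into one ideal integrator, the loop locks to a $360^{\circ}$ phase shift, and `stage_gains` (delay per unit of control value for each stage) as well as `loop_gain` are hypothetical parameters.

```python
def lock_dll(stage_gains, period, loop_gain=0.1, steps=500):
    """Settle a first-order DLL model and return the tap phases in degrees.

    Each delay stage contributes stage_gains[k] * control, where `control`
    stands in for the control voltage. The loop integrates the timing error
    until the total delay equals one reference period.
    """
    total_gain = sum(stage_gains)
    control = 0.0
    for _ in range(steps):
        total_delay = total_gain * control
        error = period - total_delay  # idealized phase detector output
        # Integrator update, normalized so that convergence does not depend on units.
        control += loop_gain * error / total_gain
    taps, delay = [], 0.0
    for gain in stage_gains:
        delay += gain * control
        taps.append(360.0 * delay / period)
    return taps
```

For four identical stages the taps settle at $90^{\circ}$, $180^{\circ}$, $270^{\circ}$ and $360^{\circ}$. If one stage is slower than the others, the last tap is still forced to $360^{\circ}$ by the loop, but the intermediate taps deviate from their ideal values.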

### 3.3 Injection Locked Ring Oscillators

CMOS ring oscillators are composed of a series of more than two CMOS inverters, where the output of the last inverter is connected to the input of the first inverter, forming a ring [64]. In a single-ended implementation (see Figure 3.6), the number of inverters usually is odd (at least three). The circuit has no stable operating point and starts to oscillate. In differential circuits, an even number of inverting stages can also be used.


Figure 3.6: Single-ended CMOS ring oscillator with $n$ stages

Ring oscillators can be used to generate a multiphase output signal similar to delay locked loops. As each stage provides a delay, the desired phases can be tapped from the different stages of the oscillator. To generate quadrature phases, an $n$-stage ring oscillator can be used, which is tapped at stages $\frac{n}{2}$ and $n$. Therefore, $n$ has to be even (which implies the use of a differential structure) and all inverting stages should provide equal delay (or the delay from the first stage to stage $\frac{n}{2}$ should match the one from stage $\frac{n}{2}$ to the last).

The main issue of ring oscillators is their poor noise performance compared to other types of oscillators [22], [65]-[67]. To circumvent this problem, a reference signal can be injected into the oscillator.

This phenomenon of injecting a signal into an oscillator with a frequency similar to the free running frequency of the oscillator (or a harmonic thereof) is called injection locking [68]. Injection locking can also be observed in mechanical or other dynamic systems [69].

If the locking succeeds, the oscillator oscillates at the injected frequency or a multiple thereof. Furthermore, the phase noise of the injection locked oscillator tracks its reference [68], [70]. The injection of the reference signal can be done in many ways [71]-[74].


Figure 3.7: Differential injection locked CMOS ring oscillator with frequency tuning

It shall be noted that a perfect quadrature shift with the method described above is only obtained at the natural frequency of the oscillator. A novel way to circumvent this problem is presented in [26], where a quadrature phase detector is used to adapt the ring oscillator's frequency to the reference frequency. The basic circuit principle for a four-stage differential ring oscillator is shown in Figure 3.7. The quadrature phase detector is used to adjust the ring oscillator's natural frequency to the injected frequency. If these two frequencies match, each of the four stages provides a $45^{\circ}$ phase shift.

### 3.4 Phase Correction

A very different approach to generate quadrature phases was presented by Kim et al. [75], [76]. Rather precise quadrature phases are generated by correcting the phase error with a driven interpolating oscillator. The phase corrector is shown in Figure 3.8.


Figure 3.8: Phase corrector stage [75], [76]

The corrector is composed of multiple three stage ring oscillators, which influence each other. To generate quadrature phases from a single phase, several of these correctors are cascaded and the first stage is driven only by one phase.

This topology generates quadrature phases without frequency division and without the need for complex feedback structures as found in DLLs or phase-locked loops. More phases can also be generated with this principle [77]. On the downside, due to the large number of devices, the power consumption of this circuit is rather high (as many correctors have to be cascaded) and the noise performance is rather poor [77].

## 4 Proposed Solution: Self-Aligned Open Loop Multiphase Generation

This chapter describes the approach developed in this thesis. The theoretical operation principle of the implemented circuit (see Chapter 5) is disclosed.

In this chapter the term phase will be used regularly. Therefore, a precise definition is required: The term phase describes a periodic signal (e.g. trapezoidal or sinusoidal) that has a specific phase shift to a reference signal (which ideally has the very same shape) whose phase shift is defined as zero. In other words, phase means a shifted replica signal. The usage of the term phase shall emphasize the importance of the signals' timings.

Example phases (and therefore signals) $\varphi_{0}$ and $\varphi_{1}$ are shown in Figure 4.1. The blue phase $\varphi_{0}$ is considered as the reference phase (hence it exhibits a $0^{\circ}$ phase shift and also $\varphi_{0}=0^{\circ}$ ) while the red phase $\varphi_{1}$ is a replica of the blue one and is shifted by $\Delta T$ in time. This shift in time can be expressed as a shift in phase as $\Delta \varphi=360^{\circ} \cdot \Delta T \cdot f$, where $f$ is the signals' frequency $\left(\varphi_{1}=\varphi_{0}+\Delta \varphi=\Delta \varphi\right)$. In the following, the value of a phase (like $\varphi_{0}$ and $\varphi_{1}$ above) is its phase difference to the reference phase.


Figure 4.1: Example phases in the time domain

A phase is defined only by the phase shift relative to the reference phase (which can be chosen arbitrarily). Therefore, it is possible to depict the relations of the signals in terms of phase shifts in a phasor diagram. In such a diagram, the periodic signals are represented as phase vectors or phasors. Only the amplitudes and phase shifts relative to the reference of the signals are shown. The phasors themselves have a length and an angle which correspond to the signals' amplitudes and phase shifts respectively. The phasor diagram of the signals of Figure 4.1 is shown in Figure 4.2.


Figure 4.2: Example phases in a phasor diagram

### 4.1 Building Blocks

The solution presented in this chapter is able to generate $180^{\circ} / n$ spaced output phases $\psi_{1}, \ldots, \psi_{n}$ from one input phase $\varphi_{0}$ and its $180^{\circ}$ shifted counterpart $\bar{\varphi}_{0}$. This $180^{\circ}$ phase shift is exploited to accurately generate the desired output signals by linear phase interpolation.

Two basic components are needed for this solution to work:

- phase shifter: An element that generates a phase $\varphi_{1}=\varphi_{0}+\Delta \varphi$ with a certain phase shift $\Delta \varphi$ from its input $\varphi_{0}$ (see Figure 4.3). This element provides a time delay $\Delta T$, which translates to a phase shift over the frequency as $\Delta \varphi=360^{\circ} \cdot \Delta T \cdot f$. Of course only positive phase shifts are possible, otherwise the system would be acausal.

Figure 4.3: Phase shifter element basic relations: (a) symbol, (b) time domain, (c) phasor diagram

- phase interpolator or phase average: An element that combines several input phases $\varphi_{0}, \ldots, \varphi_{n-1}$ so that the output $\psi=\frac{1}{n} \sum_{k=0}^{n-1} \varphi_{k}$ is the arithmetic mean of the inputs in terms of the phase shifts (see Figure 4.4). In a real system, a linear phase interpolation is only possible for rather small phase differences of the input phases.


Figure 4.4: Phase interpolator basic relations

The implementation details of the aforementioned components are described in Section 5.2.
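The two ideal building blocks can be captured in a few lines of Python. This is a purely behavioural sketch of the definitions above, not of the circuits in Section 5.2; the function names are hypothetical. Phases are handled as unwrapped values in degrees, since an arithmetic mean of angles taken modulo $360^{\circ}$ would be ambiguous.

```python
def phase_shifter(phase_deg, delay_s, freq_hz):
    """Add the phase shift 360 * delay * f caused by a pure time delay."""
    if delay_s < 0:
        # Only positive shifts are possible for a causal element.
        raise ValueError("a causal phase shifter cannot provide a negative delay")
    return phase_deg + 360.0 * delay_s * freq_hz


def phase_interpolator(phases_deg):
    """Ideal linear interpolator: arithmetic mean of the (unwrapped) input phases."""
    return sum(phases_deg) / len(phases_deg)
```

As an example, `phase_shifter(0.0, 100e-12, 2.5e9)` corresponds to a delay of 100 ps at 2.5 GHz, that is, a quarter of the signal period.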

It is assumed that an input phase $\varphi_{0}$ and its $180^{\circ}$ shifted phase $\bar{\varphi}_{0}$ are available. This is reasonable, as with digital (and sinusoidal) signals this is just the inverse signal when the duty cycle is exactly $50 \%$. Furthermore, to improve robustness against external interferers [20], the LO is usually distributed differentially, so both phases $\varphi_{0}$ and $\bar{\varphi}_{0}$ are available.

### 4.2 Quadrature Generation

First, the principle is explained by generating quadrature phases, so $n=2$. The input phase $\varphi_{0}$ is phase shifted by $\Delta \varphi$, which generates the auxiliary phase $\varphi_{1}=\varphi_{0}+\Delta \varphi$. Then the original input phase $\varphi_{0}$ and the auxiliary phase $\varphi_{1}$ are interpolated, generating the first output phase $\psi_{1}$ :

$$
\begin{equation*}
\psi_{1}=\frac{1}{2}\left(\varphi_{0}+\varphi_{1}\right)=\frac{1}{2}\left(\left(\varphi_{0}\right)+\left(\varphi_{0}+\Delta \varphi\right)\right)=\varphi_{0}+\frac{1}{2} \Delta \varphi \tag{4.1}
\end{equation*}
$$

Next, the auxiliary phase $\varphi_{1}$ and the inverse input phase $\bar{\varphi}_{0}$ are interpolated as well to yield $\psi_{2}$ :

$$
\begin{equation*}
\psi_{2}=\frac{1}{2}\left(\varphi_{1}+\bar{\varphi}_{0}\right)=\frac{1}{2}\left(\left(\varphi_{0}+\Delta \varphi\right)+\left(180^{\circ}+\varphi_{0}\right)\right)=90^{\circ}+\varphi_{0}+\frac{1}{2} \Delta \varphi \tag{4.2}
\end{equation*}
$$

The phase difference between both output phases is

$$
\begin{equation*}
\psi_{2}-\psi_{1}=90^{\circ}+\varphi_{0}+\frac{1}{2} \Delta \varphi-\left(\varphi_{0}+\frac{1}{2} \Delta \varphi\right)=90^{\circ} \tag{4.3}
\end{equation*}
$$

It shall be noted that the phase difference $\psi_{2}-\psi_{1}$ is always $90^{\circ}$ independently of the phase shift $\Delta \varphi$. The phasors representing these relations are sketched in Figure 4.5 for different phase shifts $\Delta \varphi$.

Figure 4.5: Quadrature phase generation principle with different phase shifts of the auxiliary phase $\varphi_{1}$: (a) small $\Delta \varphi$, (b) large $\Delta \varphi$
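The independence of the output spacing from $\Delta \varphi$ can be checked numerically. The following sketch evaluates Equations 4.1 and 4.2 with an ideal interpolator; the function name is hypothetical and all phases are unwrapped values in degrees.

```python
def quadrature_outputs(phi0_deg, dphi_deg):
    """Return (psi1, psi2) according to Equations 4.1 and 4.2."""
    phi1 = phi0_deg + dphi_deg        # auxiliary phase
    phi0_bar = phi0_deg + 180.0       # inverse input phase
    psi1 = 0.5 * (phi0_deg + phi1)    # interpolate input and auxiliary phase
    psi2 = 0.5 * (phi1 + phi0_bar)    # interpolate auxiliary and inverse input phase
    return psi1, psi2
```

Sweeping `dphi_deg` over any range moves both outputs by half the applied shift, while the difference `psi2 - psi1` remains at $90^{\circ}$.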

This principle works as long as the phase interpolation between the phases is linear. In a real system this will not be the case for larger phase differences of the interpolated phases $\varphi_{0}, \varphi_{1}$ and $\bar{\varphi}_{0}$. Phase averaging will only be linear for rather small phase differences of the interpolator's input signals. Further analysis of a real phase interpolator is provided in Section 5.2.2. For the principle of operation it is sufficient to assume that the interpolation will work properly for phase differences smaller than $\Delta \zeta$ with $\Delta \zeta \ll 180^{\circ}$.

To counter this problem, additional auxiliary phases are introduced, as shown in Figure 4.6.


Figure 4.6: Quadrature phase generation principle with additional auxiliary phases to enable operation with real circuit components

First, the initial phase difference $\Delta \varphi$ is selected so that the interpolation of $\varphi_{1}$ and $\bar{\varphi}_{0}$ works linearly, i.e. $\left|\varphi_{1}-\bar{\varphi}_{0}\right|<\Delta \zeta$. Due to this selection, $\psi_{2}$ can readily be generated. To also generate $\psi_{1}$, two additional auxiliary phases $\lambda_{1}^{+}$ and $\lambda_{1}^{-}$ are introduced. They are chosen such that they lie symmetrically around $\psi_{1}$. In this way the phase shift $\Delta \varphi$ can be split into three components:

$$
\begin{equation*}
\Delta \varphi=\Delta \chi+\Delta \vartheta+\Delta \chi \tag{4.4}
\end{equation*}
$$

It is easy to see that if $\Delta \varphi$ is partitioned into $\Delta \chi, \Delta \vartheta$ and $\Delta \chi$, the two additional phases are symmetrical around $\psi_{1}$. These two auxiliary phases are

$$
\lambda_{1}^{-}=\varphi_{0}+\Delta \chi \quad \lambda_{1}^{+}=\varphi_{0}+\Delta \chi+\Delta \vartheta
$$

If $\Delta \chi$ and $\Delta \vartheta$ are chosen so that $\Delta \vartheta<\Delta \zeta$, $\psi_{1}$ can be generated as well. The interpolation of $\lambda_{1}^{+}$ and $\lambda_{1}^{-}$ yields

$$
\begin{align*}
\psi_{1} & =\frac{1}{2}\left(\lambda_{1}^{+}+\lambda_{1}^{-}\right) \\
& =\frac{1}{2}\left(\left(\varphi_{0}+\Delta \chi\right)+\left(\varphi_{0}+\Delta \chi+\Delta \vartheta\right)\right) \\
& =\varphi_{0}+\Delta \chi+\frac{1}{2} \Delta \vartheta  \tag{4.5}\\
& =\varphi_{0}+\frac{1}{2} \Delta \varphi
\end{align*}
$$

The interpolation of $\varphi_{1}$ and $\bar{\varphi}_{0}$ gives

$$
\begin{align*}
\psi_{2} & =\frac{1}{2}\left(\varphi_{1}+\bar{\varphi}_{0}\right) \\
& =\frac{1}{2}\left(\left(\varphi_{0}+\Delta \varphi\right)+\left(180^{\circ}+\varphi_{0}\right)\right) \\
& =\frac{1}{2}\left(\left(\varphi_{0}+\Delta \chi+\Delta \vartheta+\Delta \chi\right)+\left(180^{\circ}+\varphi_{0}\right)\right) \\
& =90^{\circ}+\varphi_{0}+\Delta \chi+\frac{1}{2} \Delta \vartheta  \tag{4.6}\\
& =90^{\circ}+\varphi_{0}+\frac{1}{2} \Delta \varphi
\end{align*}
$$

This solution will generate a $90^{\circ}$ phase shift between its outputs, if:

- the two phase shifts denoted as $\Delta \chi$ are exactly the same
- the absolute values of the phase shifts enable linear interpolation

The second condition should be rather easy to fulfill, as the phase shifts only have to be within an acceptable range.
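Both conditions can be illustrated with a short numerical sketch based on Equations 4.4 to 4.6. The function below is hypothetical; `dzeta_deg` stands for the assumed linear range $\Delta \zeta$ of the interpolator, and the interpolation itself is still treated as ideal.

```python
def quadrature_with_aux(phi0_deg, dchi_deg, dtheta_deg, dzeta_deg):
    """Return (psi1, psi2, linear) for the scheme with auxiliary phases.

    `linear` is True if both interpolator input spacings stay below dzeta_deg.
    """
    lam_minus = phi0_deg + dchi_deg      # lambda_1^-
    lam_plus = lam_minus + dtheta_deg    # lambda_1^+
    phi1 = lam_plus + dchi_deg           # phi_1 = phi_0 + dchi + dtheta + dchi
    phi0_bar = phi0_deg + 180.0
    psi1 = 0.5 * (lam_minus + lam_plus)  # Equation 4.5
    psi2 = 0.5 * (phi1 + phi0_bar)       # Equation 4.6
    spacing_psi1 = lam_plus - lam_minus
    spacing_psi2 = abs(phi0_bar - phi1)
    linear = spacing_psi1 < dzeta_deg and spacing_psi2 < dzeta_deg
    return psi1, psi2, linear
```

With, for instance, $\Delta \chi=50^{\circ}$, $\Delta \vartheta=40^{\circ}$ and an assumed $\Delta \zeta=60^{\circ}$, both interpolators see an input spacing of $40^{\circ}$. Reducing $\Delta \chi$ to $10^{\circ}$ leaves the ideal result unchanged, but the spacing between $\varphi_{1}$ and $\bar{\varphi}_{0}$ grows to $120^{\circ}$ and violates the linearity condition.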

This principle can also be easily extended to output differential quadrature signals. It does not matter whether $\varphi_{0}$ or $\bar{\varphi}_{0}$ is regarded as the $180^{\circ}$ shifted phase, as the two can be interchanged by definition. By repeating the above procedure for both inputs, a fully differential structure is obtained, where all auxiliary phases and their $180^{\circ}$ shifted versions are generated and used. The block diagram of such a quadrature generator is shown in Figure 4.7.

### 4.2.1 Analysis of Possible Imperfections

In this section possible imperfections of an implementation of the approach presented in Section 4.2 are evaluated. With the aid of Figure 4.7, possible error sources are listed and analyzed in the following.


Figure 4.7: Block diagram of the differential quadrature generator

#### Errors Due to Mismatch in the Phase Shifters

First, the mismatches in the various phase shifts are analyzed. All phase shifters are assigned different values deviating from their ideal values as shown in Figure 4.8.

To gain some insight, these phase shifts are composed from the respective nominal phase shift and error terms, which account for deviations between the differential paths (terms $\Delta \varepsilon_{1,2}$ and $\Delta \gamma$ ) and the sequentially matched elements (term $\Delta \varepsilon$ ) as follows:

$$
\begin{align*}
& \Delta \chi_{1}^{ \pm}=\Delta \chi+\Delta \varepsilon \pm \Delta \varepsilon_{1}  \tag{4.7}\\
& \Delta \chi_{2}^{ \pm}=\Delta \chi-\Delta \varepsilon \pm \Delta \varepsilon_{2}  \tag{4.8}\\
& \Delta \vartheta^{ \pm}=\Delta \vartheta \pm \Delta \gamma \tag{4.9}
\end{align*}
$$



Figure 4.8: Block diagram of the differential quadrature generator with mismatch

With this notation one can easily find the expressions for the four output phases as

$$
\begin{align*}
& \psi_{1}=\varphi_{0}+\Delta \chi+\Delta \varepsilon_{1}+\Delta \varepsilon+\frac{1}{2} \Delta \vartheta+\frac{1}{2} \Delta \gamma  \tag{4.10}\\
& \psi_{2}=90^{\circ}+\varphi_{0}+\Delta \chi+\frac{1}{2}\left(\Delta \varepsilon_{1}+\Delta \varepsilon_{2}\right)+\frac{1}{2} \Delta \vartheta+\frac{1}{2} \Delta \gamma  \tag{4.11}\\
& \bar{\psi}_{1}=180^{\circ}+\varphi_{0}+\Delta \chi-\Delta \varepsilon_{1}+\Delta \varepsilon+\frac{1}{2} \Delta \vartheta-\frac{1}{2} \Delta \gamma  \tag{4.12}\\
& \bar{\psi}_{2}=270^{\circ}+\varphi_{0}+\Delta \chi-\frac{1}{2}\left(\Delta \varepsilon_{1}+\Delta \varepsilon_{2}\right)+\frac{1}{2} \Delta \vartheta-\frac{1}{2} \Delta \gamma \tag{4.13}
\end{align*}
$$

From these equations it is possible to gain some insight into the mismatch mechanisms present. To simplify matters, consider the phase differences:

$$
\begin{align*}
& \psi_{2}-\psi_{1}=90^{\circ}+\frac{1}{2}\left(\Delta \varepsilon_{2}-\Delta \varepsilon_{1}\right)-\Delta \varepsilon  \tag{4.14}\\
& \bar{\psi}_{2}-\bar{\psi}_{1}=90^{\circ}-\frac{1}{2}\left(\Delta \varepsilon_{2}-\Delta \varepsilon_{1}\right)-\Delta \varepsilon  \tag{4.15}\\
& \bar{\psi}_{1}-\psi_{1}=180^{\circ}-2 \Delta \varepsilon_{1}-\Delta \gamma  \tag{4.16}\\
& \bar{\psi}_{2}-\psi_{2}=180^{\circ}-\left(\Delta \varepsilon_{1}+\Delta \varepsilon_{2}\right)-\Delta \gamma \tag{4.17}
\end{align*}
$$

As expected, the $90^{\circ}$ phase shift between the outputs $\psi_{1}$ and $\psi_{2}$ (and between $\bar{\psi}_{1}$ and $\bar{\psi}_{2}$) is influenced only by the mismatches of the phase shifters $\Delta \chi_{1,2}^{ \pm}$; the resulting error is $\pm 1 / 2 \cdot\left(\Delta \varepsilon_{2}-\Delta \varepsilon_{1}\right)-\Delta \varepsilon$. The mismatches $\Delta \varepsilon_{1,2}$ and $\Delta \gamma$ between the differential phase shifters impact the alignment of the differential output phases $\psi_{1,2}$ and $\bar{\psi}_{1,2}$, yielding phase differences that deviate from the ideal $180^{\circ}$.
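The expressions for the phase differences can be cross-checked by building the output phases directly from mismatched phase shifters. The sketch below assumes that the shifters with superscript $+$ lie in the path driven by $\varphi_{0}$ and those with superscript $-$ in the path driven by $\bar{\varphi}_{0}$; this assignment is inferred from Equations 4.10 to 4.13 and not read from Figure 4.8. The function name is hypothetical and the interpolators are ideal.

```python
def outputs_with_mismatch(phi0, dchi, dtheta, eps, eps1, eps2, gamma):
    """Return (psi1, psi2, psi1_bar, psi2_bar) in degrees for mismatched shifters."""
    # Phase shifts according to Equations 4.7 to 4.9.
    chi1_p, chi1_m = dchi + eps + eps1, dchi + eps - eps1
    chi2_p, chi2_m = dchi - eps + eps2, dchi - eps - eps2
    theta_p, theta_m = dtheta + gamma, dtheta - gamma

    # Path driven by phi0.
    lam_minus = phi0 + chi1_p
    lam_plus = lam_minus + theta_p
    phi1 = lam_plus + chi2_p

    # Path driven by the inverse input phase.
    phi0_bar = phi0 + 180.0
    lam_minus_bar = phi0_bar + chi1_m
    lam_plus_bar = lam_minus_bar + theta_m
    phi1_bar = lam_plus_bar + chi2_m

    psi1 = 0.5 * (lam_minus + lam_plus)
    psi2 = 0.5 * (phi1 + phi0_bar)
    psi1_bar = 0.5 * (lam_minus_bar + lam_plus_bar)
    # phi0 is taken one period later (+360) so that the mean is not wrapped.
    psi2_bar = 0.5 * (phi1_bar + phi0 + 360.0)
    return psi1, psi2, psi1_bar, psi2_bar
```

For arbitrary small error values, the differences of the returned phases reproduce Equations 4.14 to 4.17; in particular, $\Delta \gamma$ drops out of both $90^{\circ}$ differences.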

#### Errors Due to Mismatch in the Phase Interpolators

Similar to mismatches in the phase shifts, mismatches in the phase interpolators will degrade both the desired phase shifts and the alignment between the differential phases.

#### Errors Due to Nonlinearities in the Phase Interpolators

As already mentioned, the phase interpolator will only have a limited range where it exhibits the desired linear phase transfer function

$$
\begin{equation*}
\Phi\left(\varphi_{1}, \varphi_{2}\right)=\frac{\varphi_{1}+\varphi_{2}}{2}+\Delta \xi=\varphi_{1}+\frac{\varphi_{2}-\varphi_{1}}{2}+\Delta \xi \tag{4.18}
\end{equation*}
$$

The term $\Delta \xi$ in Equation 4.18 accounts for any additional phase shifts introduced by the phase interpolator, which has no influence on the outputs if it is perfectly the same in all phase interpolators.

To analyze the possible nonlinearities in the phase transfer function, a series expansion has been performed, yielding

$$
\begin{equation*}
\hat{\Phi}\left(\varphi_{1}, \varphi_{2}\right)=\varphi_{1}+\Delta \xi+\frac{\varphi_{2}-\varphi_{1}}{2}+\sum_{k=2}^{\infty} c_{k}\left(\varphi_{2}-\varphi_{1}\right)^{k} \tag{4.19}
\end{equation*}
$$

The coefficient values $c_{k}$ of the series expansion depend on the physical implementation of the phase interpolator.

It shall be noted that the linear component of Equation 4.19 is still assumed to be the average between the two input phases. One can easily see that the error introduced by the nonlinear terms is

$$
\begin{equation*}
\mathcal{E}\left(\varphi_{1}, \varphi_{2}\right)=\hat{\Phi}-\Phi=\sum_{k=2}^{\infty} c_{k}\left(\varphi_{2}-\varphi_{1}\right)^{k} \tag{4.20}
\end{equation*}
$$

The error depends only on the spacing of the interpolated phases $\Delta \varphi=\varphi_{2}-\varphi_{1}$; therefore it can be reformulated as

$$
\begin{equation*}
\mathcal{E}(\Delta \varphi)=\hat{\Phi}-\Phi=\sum_{k=2}^{\infty} c_{k} \Delta \varphi^{k} \tag{4.21}
\end{equation*}
$$

In the following it is assumed that all phase interpolators behave identically and exhibit the same nonlinear characteristics.

Recalling the notation from Figures 4.6 and 4.7, phases $\lambda_{1}^{-}$ and $\lambda_{1}^{+}$ are used for the generation of $\psi_{1}$, while $\varphi_{1}$ and $\bar{\varphi}_{0}$ are used for $\psi_{2}$. The error stemming from the nonlinearity on $\psi_{1}$ and $\psi_{2}$ can be expressed as $\mathcal{E}\left(\lambda_{1}^{-}, \lambda_{1}^{+}\right)=\mathcal{E}(\Delta \vartheta)$ and $\mathcal{E}\left(\varphi_{1}, \bar{\varphi}_{0}\right)=\mathcal{E}\left(180^{\circ}-2 \Delta \chi-\Delta \vartheta\right)$ respectively.

The single error terms only specify the phase deviations from the ideal linear interpolation. The output phases including these errors are then

$$
\begin{align*}
& \psi_{1}=\tilde{\psi}_{1}+\mathcal{E}(\Delta \vartheta)  \tag{4.22}\\
& \psi_{2}=\tilde{\psi}_{2}+\mathcal{E}\left(180^{\circ}-2 \Delta \chi-\Delta \vartheta\right) \tag{4.23}
\end{align*}
$$

where $\tilde{\psi}_{1,2}$ are the ideal output phases without the nonlinearities, and $\tilde{\psi}_{2}-\tilde{\psi}_{1}=90^{\circ}$.

The phase difference including the nonlinearities is

$$
\begin{align*}
\psi_{2}-\psi_{1} & =\tilde{\psi}_{2}+\mathcal{E}\left(180^{\circ}-2 \Delta \chi-\Delta \vartheta\right)-\left(\tilde{\psi}_{1}+\mathcal{E}(\Delta \vartheta)\right) \\
& =90^{\circ}+\mathcal{E}\left(180^{\circ}-2 \Delta \chi-\Delta \vartheta\right)-\mathcal{E}(\Delta \vartheta) \tag{4.24}
\end{align*}
$$

To minimize the error in the phase shift of the output signals, the difference of both error terms should be as small as possible. This means that the phase difference of the auxiliary phases $\lambda_{1}^{+}$ and $\lambda_{1}^{-}$ has to equal the phase difference of the phases $\varphi_{1}$ and $\bar{\varphi}_{0}$. The error $\mathcal{E}(\Delta \vartheta)$ then shifts both output phases in the same manner and hence does not influence their relative position.
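The cancellation of the nonlinear error can be illustrated with a truncated version of Equation 4.21. In the sketch below the coefficient values are arbitrary placeholders, as the real $c_{k}$ depend on the physical implementation; the function names are hypothetical.

```python
def interpolation_error(spacing_deg, coeffs):
    """Truncated Equation 4.21: sum of c_k * spacing**k for k = 2, 3, ..."""
    return sum(c * spacing_deg ** k for k, c in enumerate(coeffs, start=2))


def quadrature_error(dchi_deg, dtheta_deg, coeffs):
    """Deviation of psi2 - psi1 from 90 degrees according to Equation 4.24."""
    spacing_psi1 = dtheta_deg                            # inputs lambda_1^-, lambda_1^+
    spacing_psi2 = 180.0 - 2.0 * dchi_deg - dtheta_deg   # inputs phi_1, inverse phi_0
    return (interpolation_error(spacing_psi2, coeffs)
            - interpolation_error(spacing_psi1, coeffs))
```

With placeholder coefficients such as `coeffs = (1e-3, -2e-5)`, the call `quadrature_error(50.0, 40.0, coeffs)` returns zero because both interpolators see a spacing of $40^{\circ}$, whereas `quadrature_error(30.0, 40.0, coeffs)` does not, since the spacings are then $80^{\circ}$ and $40^{\circ}$.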

As a final remark, note that if the phase shifts are implemented with a time delay $\left(\Delta \varphi=360^{\circ} \cdot \Delta T \cdot f\right)$, it is important that this delay is matched to the operating frequency.
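As a worked illustration of this remark, assume (for this example only) that the shifts $\Delta \chi$ and $\Delta \vartheta$ are realized as pure time delays $\Delta T_{\chi}$ and $\Delta T_{\vartheta}$; these two symbols are introduced here and are not used elsewhere. Equal interpolator spacings in Equation 4.24 require $\Delta \vartheta=180^{\circ}-2 \Delta \chi-\Delta \vartheta$, which leads to

$$
\begin{aligned}
\Delta \chi+\Delta \vartheta & =90^{\circ} \\
360^{\circ} \cdot\left(\Delta T_{\chi}+\Delta T_{\vartheta}\right) \cdot f & =90^{\circ} \quad \Rightarrow \quad \Delta T_{\chi}+\Delta T_{\vartheta}=\frac{1}{4 f}
\end{aligned}
$$

For $f=2 \mathrm{GHz}$ this amounts to a total delay of 125 ps. If the same 125 ps were kept at 2.7 GHz, the sum $\Delta \chi+\Delta \vartheta$ would grow to $360^{\circ} \cdot 125 \mathrm{ps} \cdot 2.7 \mathrm{GHz}=121.5^{\circ}$, so the two interpolator spacings would no longer be equal.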

### 4.3 Multiphase Generation

The principle explained in Section 4.2 can be extended to the generation of $n$ phases with phase differences of $180^{\circ} / n$ (with $n \geq 2$ being an integer). In this case, the phase interpolator must combine and average $n$ phases.


Figure 4.9: Auxiliary phases for multiphase generation

First, as shown by way of example in Figure 4.9, the input phase $\varphi_{0}$ is phase shifted $n-1$ times to produce the auxiliary phases $\varphi_{1}, \ldots, \varphi_{n-1}$. Note that the values of the single phase shifts $\Delta \hat{\varphi}_{k}=\varphi_{k}-\varphi_{0}$ can be chosen arbitrarily. Without loss of generality, it is assumed that

$$
\begin{equation*}
\varphi_{0} \leq \varphi_{1} \leq \ldots \leq \varphi_{n-1} \tag{4.25}
\end{equation*}
$$

This can simply be achieved by reordering and renaming the generated phases. Equation 4.25 allows the following relation to be written:

$$
\begin{equation*}
\Delta \hat{\varphi}_{1} \leq \Delta \hat{\varphi}_{2} \leq \ldots \leq \Delta \hat{\varphi}_{n-1} \tag{4.26}
\end{equation*}
$$

In a practical implementation, one will not use several distinct phase shifters in parallel, but cascade them. The phase shift between the phases $\varphi_{k-1}$ and $\varphi_{k}$ is defined as $\Delta \varphi_{k}$. It can be seen in Figure 4.9 that $\Delta \varphi_{k}=\Delta \hat{\varphi}_{k}-\Delta \hat{\varphi}_{k-1}$, or alternatively $\Delta \hat{\varphi}_{k}=\sum_{l=1}^{k} \Delta \varphi_{l}$.

The $180^{\circ}$ phase shifted counterparts $\bar{\varphi}_{1}, \ldots, \bar{\varphi}_{n-1}$ of the auxiliary phases $\varphi_{1}, \ldots, \varphi_{n-1}$ have to be generated in the same manner from the input $\bar{\varphi}_{0}$. The resulting intermediate phases are shown by way of example in Figure 4.9.

To generate the $k$-th output phase $\psi_{k}$ (for $1 \leq k \leq n$ ), $n$ phases need to be combined with a linear phase interpolator, according to the following expression:

$$
\begin{align*}
\psi_{k} & =\frac{1}{n}\left(\sum_{l=k-1}^{n-1} \varphi_{l}+\sum_{m=0}^{k-2} \bar{\varphi}_{m}\right)  \tag{4.27}\\
& =\frac{1}{n}\left(\sum_{l=k-1}^{n-1}\left(\varphi_{0}+\Delta \hat{\varphi}_{l}\right)+\sum_{m=0}^{k-2}\left(\bar{\varphi}_{0}+\Delta \hat{\varphi}_{m}\right)\right) \\
& =\frac{1}{n}\left[\sum_{l=k-1}^{n-1}\left(\varphi_{0}+\sum_{i=1}^{l} \Delta \varphi_{i}\right)+\sum_{m=0}^{k-2}\left(180^{\circ}+\varphi_{0}+\sum_{j=1}^{m} \Delta \varphi_{j}\right)\right]
\end{align*}
$$

The terms of the sums can be reordered as

$$
\begin{align*}
\psi_{k} & =\frac{1}{n}\left(\sum_{m=0}^{k-2} \varphi_{0}+\sum_{l=k-1}^{n-1} \varphi_{0}+\sum_{m=0}^{k-2} 180^{\circ}+\sum_{m=0}^{k-2} \sum_{j=1}^{m} \Delta \varphi_{j}+\sum_{l=k-1}^{n-1} \sum_{i=1}^{l} \Delta \varphi_{i}\right) \\
& =\frac{1}{n}\left(n \varphi_{0}+\sum_{l=0}^{n-1} \sum_{i=1}^{l} \Delta \varphi_{i}+(k-1) \cdot 180^{\circ}\right) \\
& =\varphi_{0}+\frac{1}{n} \sum_{l=1}^{n-1}\left[(n-l) \Delta \varphi_{l}\right]+\frac{k-1}{n} 180^{\circ} \tag{4.28}
\end{align*}
$$

The phase difference between the output phases $\psi_{k}$ and $\psi_{k+1}$ is

$$
\begin{align*}
\psi_{k+1}-\psi_{k}= & \left(\varphi_{0}+\frac{1}{n} \sum_{l=1}^{n-1}\left[(n-l) \Delta \varphi_{l}\right]+\frac{k}{n} 180^{\circ}\right) \\
& -\left(\varphi_{0}+\frac{1}{n} \sum_{l=1}^{n-1}\left[(n-l) \Delta \varphi_{l}\right]+\frac{k-1}{n} 180^{\circ}\right) \\
= & \frac{1}{n} 180^{\circ} \tag{4.29}
\end{align*}
$$

and therefore the generated output phases exhibit a $180^{\circ} / n$ phase difference. It shall be noted that the result is independent of the absolute values of the single phase differences $\Delta \varphi_{1}, \ldots, \Delta \varphi_{n-1}$. Of course it is necessary that they are equal for all $\varphi_{k}$ and $\bar{\varphi}_{k}$ pairs. Furthermore, the phase interpolator must work linearly.
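The general rule of Equation 4.27 can be evaluated numerically for any $n$. The following sketch assumes an ideal $n$-input interpolator and unwrapped phases in degrees; the function name and the list-based interface are hypothetical.

```python
def multiphase_outputs(phi0_deg, shifts_deg):
    """Compute psi_1 .. psi_n from Equation 4.27.

    shifts_deg holds the n-1 cascaded phase shifts dphi_1 .. dphi_(n-1).
    """
    n = len(shifts_deg) + 1
    phases = [phi0_deg]
    for shift in shifts_deg:
        phases.append(phases[-1] + shift)   # phi_0 .. phi_(n-1)
    inverted = [p + 180.0 for p in phases]  # 180 degree shifted counterparts
    outputs = []
    for k in range(1, n + 1):
        # psi_k averages phi_(k-1) .. phi_(n-1) and the first k-1 inverted phases.
        selected = phases[k - 1:] + inverted[:k - 1]
        outputs.append(sum(selected) / n)
    return outputs
```

For example, `multiphase_outputs(0.0, [30.0, 12.0])` describes the case $n=3$ with deliberately unequal shifts; the resulting outputs are nevertheless spaced by $60^{\circ}$.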

To visualize the concept more clearly, Figure 4.10 shows which phases need to be interpolated to generate $\psi_{1}, \psi_{2}, \psi_{3}$ and $\psi_{n}$.

Similar to quadrature generation $(n=2)$ shown in Section 4.2, differential outputs can be generated by swapping $\varphi_{0}$ and $\bar{\varphi}_{0}$ and repeating the above process.
In a real system, additional auxiliary phases $\lambda_{k}^{ \pm}$ will be needed to achieve linear interpolation. As an alternative to an $n$-phase interpolation, cascaded interpolations combining fewer phases are also possible.

This section is concluded with an example for $n=3$, in which phases with $180^{\circ} / 3=60^{\circ}$ spacing are generated.

The auxiliary phases $\varphi_{1}=\varphi_{0}+\Delta \varphi_{1}, \varphi_{2}=\varphi_{0}+\Delta \varphi_{1}+\Delta \varphi_{2}$ and their $180^{\circ}$ shifted versions $\bar{\varphi}_{1}$ and $\bar{\varphi}_{2}$ are generated.



Figure 4.10: Phasor diagram highlighting the phases (in red) which need to be interpolated for several different $\psi_{k}$

The generation of the output $\psi_{1}$ requires the interpolation of the phases $\varphi_{0}$, $\varphi_{1}$ and $\varphi_{2}$ according to Equation 4.27:

$$
\begin{aligned}
\psi_{1} & =\frac{1}{3}\left(\varphi_{0}+\varphi_{1}+\varphi_{2}\right) \\
& =\frac{1}{3}\left(\left(\varphi_{0}\right)+\left(\varphi_{0}+\Delta \varphi_{1}\right)+\left(\varphi_{0}+\Delta \varphi_{1}+\Delta \varphi_{2}\right)\right) \\
& =\varphi_{0}+\frac{2}{3} \Delta \varphi_{1}+\frac{1}{3} \Delta \varphi_{2}
\end{aligned}
$$

Figure 4.11: Exemplary phasor diagram for the generation of $60^{\circ}$ spaced phases

The phase $\psi_{2}$ requires the interpolation of $\varphi_{1}, \varphi_{2}$ and $\bar{\varphi}_{0}$ as

$$
\begin{aligned}
\psi_{2} & =\frac{1}{3}\left(\varphi_{1}+\varphi_{2}+\bar{\varphi}_{0}\right) \\
& =\frac{1}{3}\left(\left(\varphi_{0}+\Delta \varphi_{1}\right)+\left(\varphi_{0}+\Delta \varphi_{1}+\Delta \varphi_{2}\right)+\left(180^{\circ}+\varphi_{0}\right)\right) \\
& =60^{\circ}+\varphi_{0}+\frac{2}{3} \Delta \varphi_{1}+\frac{1}{3} \Delta \varphi_{2}
\end{aligned}
$$

Finally, the output phase $\psi_{3}$ is generated from the phases $\varphi_{2}, \bar{\varphi}_{0}$ and $\bar{\varphi}_{1}$ :

$$
\begin{aligned}
\psi_{3} & =\frac{1}{3}\left(\varphi_{2}+\bar{\varphi}_{0}+\bar{\varphi}_{1}\right) \\
& =\frac{1}{3}\left(\left(\varphi_{0}+\Delta \varphi_{1}+\Delta \varphi_{2}\right)+\left(180^{\circ}+\varphi_{0}\right)+\left(180^{\circ}+\varphi_{0}+\Delta \varphi_{1}\right)\right) \\
& =120^{\circ}+\varphi_{0}+\frac{2}{3} \Delta \varphi_{1}+\frac{1}{3} \Delta \varphi_{2}
\end{aligned}
$$

The resulting phases $\psi_{1}, \psi_{2}$ and $\psi_{3}$ are clearly spaced by $60^{\circ}$ each. The corresponding phasor diagram is shown in Figure 4.11.

Within the general concept of multiphase generation, this thesis focuses on the generation of quadrature phases.
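The $n=3$ example can also be checked numerically. The following is a minimal sketch that assumes ideal (linear) interpolation, modeled as the arithmetic mean of the input phases. The function names and the input values for `phi0`, `dphi1` and `dphi2` are arbitrary illustration choices and are not taken from this work.

```python
def interpolate(phases_deg):
    """Ideal linear interpolation: arithmetic mean of the input phases."""
    return sum(phases_deg) / len(phases_deg)


def three_phase_outputs(phi0, dphi1, dphi2):
    """Outputs psi_1..psi_3 of the n = 3 example (all angles in degrees)."""
    phi1 = phi0 + dphi1
    phi2 = phi1 + dphi2
    phi0_bar = phi0 + 180.0  # 180 degree shifted versions
    phi1_bar = phi1 + 180.0
    psi1 = interpolate([phi0, phi1, phi2])
    psi2 = interpolate([phi1, phi2, phi0_bar])
    psi3 = interpolate([phi2, phi0_bar, phi1_bar])
    return psi1, psi2, psi3


# Arbitrary, deliberately unequal phase steps (illustration values only).
psi1, psi2, psi3 = three_phase_outputs(phi0=10.0, dphi1=50.0, dphi2=75.0)
spacings = (psi2 - psi1, psi3 - psi2)
print(spacings)  # both spacings are 60 degrees up to rounding
```

The spacing stays at $60^{\circ}$ regardless of the chosen $\Delta \varphi_{1}$ and $\Delta \varphi_{2}$, because both terms appear identically in all three outputs.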

## 5 Circuit Implementation

This chapter describes the implementation of the quadrature generator employing the circuit concept shown in Chapter 4. After defining target specifications, needed circuit components are evaluated and a possible implementation is presented. Finally, simulation results are provided.

### 5.1 Target Specification

A circuit is designed using the principle described in Section 4.1. The circuit receives a differential LO that is used to generate quadrature phases. To cover the required frequency range, a reset signal is also issued upon a frequency change. The circuit may take several cycles to adjust to the new input frequency. The basic symbol is shown in Figure 5.1.


Figure 5.1: Basic input-output diagram of the block

Realistic specifications were derived from an existing system used in a next-generation product that needs a quadrature LO. If the developed solution fulfills the specifications, it can readily be used in a cutting-edge new technology product.

Non-functional requirements include:

- design with a standard high performance 28 nm CMOS technology
- avoid the usage of an integrated inductor

Functional specifications include:

- operating frequency range: 1.7 GHz to 2.7 GHz
- supply voltage: 1.05 V to 1.15 V
- phase noise: lower than $-155 \mathrm{dBc} / \mathrm{Hz}$ at 100 MHz offset
- I/Q imbalance: initial imbalance lower than $\pm 5^{\circ}$, dynamic imbalance lower than $\pm 1^{\circ}$ at 2 GHz
- duty cycle: maximum deviation $\pm 2 \%$ from $50 \%$ at 2 GHz
- operating temperature: $-30^{\circ} \mathrm{C}$ to $120^{\circ} \mathrm{C}$
- load: NOR-based multiplexer with an estimated load capacitance of 40 fF per output signal
- power consumption: as low as possible
- area: as small as possible


### 5.2 Circuit Design

The fundamental block diagram of the complete circuit is shown in Figure 5.2: it includes phase shifters (delay elements), phase interpolators and control circuitry to cover the required operating frequency range.

The circuit looks similar to the principle block diagram shown in Figure 4.7. The phase shifters are implemented by (variable) time delays. The delay $\Delta T$ is a variable delay that provides the substantial part of the total phase shift. The delay $\Delta T_{f}$ is a short delay that ensures that the in-phase branch can be linearly interpolated. The final time delay $\Delta T_{f}$ is used as a dummy load for the final stage as well as a buffer for the control circuitry. The actual implementation of the blocks in a standard high performance 28 nm CMOS technology is described in detail in the following sections.


Figure 5.2: Block diagram of the implemented circuit, $\Delta T_{f}$ inverts the signals

### 5.2.1 Phase Shifter

A phase shifter is a delay element with time delay $\Delta T$. The resulting phase shift then is $\Delta \varphi=360^{\circ} \cdot \Delta T \cdot f$, where $f$ is the frequency of the input signal.
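As a quick numerical illustration of this relation, the following sketch converts between delay and phase shift. The function names are chosen for illustration only. The frequencies are the limits of the specified operating range from Section 5.1.

```python
def phase_shift_deg(delay_s, freq_hz):
    """Phase shift produced by a pure time delay at a given frequency."""
    return 360.0 * delay_s * freq_hz


def delay_for_phase_s(phase_deg, freq_hz):
    """Time delay needed for a given phase shift at a given frequency."""
    return phase_deg / (360.0 * freq_hz)


# A quarter period at 2 GHz is 125 ps, which corresponds to 90 degrees.
print(phase_shift_deg(125e-12, 2.0e9))

# Delay needed for 90 degrees at the edges of the specified range:
# about 147 ps at 1.7 GHz and about 93 ps at 2.7 GHz.
print(delay_for_phase_s(90.0, 1.7e9), delay_for_phase_s(90.0, 2.7e9))
```

The spread between the two band edges shows why a fixed delay cannot serve the whole frequency range and a tunable element is needed.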

The simplest delay element in CMOS technology for trapezoidal signals is the CMOS inverter shown in Figure 5.3: its delay is rather small, but is sufficient for the implementation of the block $\Delta T_{f}$ in Figure 5.2. It has been implemented as a pseudo differential inverter as shown in Figure 5.4. The two additional cross coupled inverters force the input signals to stay differential.

Unfortunately, the (plain) inverter is not suitable for the implementation of the longer delay $\Delta T$, as many such devices would be necessary, consuming a lot of power. For this reason a different delay mechanism has been chosen.

Various delay elements are presented in the literature [79]-[86], but they suffer from poor noise performance and/or high complexity. This is due to the use of either current sources as delay defining elements or threshold voltages, which in turn results in a low overdrive of the transistors and is again detrimental to noise.

Figure 5.3: CMOS inverter

Figure 5.4: Pseudo differential CMOS inverter [78]

Therefore, a simple digitally configurable RC-delay element (see Figure 5.5) has been chosen as the basic building block. The capacitor is split into an array of unit cells which can be enabled and disabled, thus increasing or decreasing the effective delay provided by the stage. The properties of such a structure have already been thoroughly investigated [87].

The delay element $\Delta T$ used in the circuit comprises three switchable stages of the digitally controllable RC-delay introduced above. Furthermore, an additional tristate inverter is added after each stage, providing multiplexing functionality. One half of the differential delay element is shown in Figure 5.6.

The control signals Cs0 to Cs2 define which output of the actual delay element is used. If Csn is low, the corresponding inverter switches its output to a high-impedance state. The circuit is built in such a way that only the inverters needed to drive the load capacitances are active. This topology enables an even wider range of possible delays. The capacitors $C$ are arrays of unit capacitors controlled by the code Ct as previously shown. They are arranged in a binary weighted scheme using five control bits. Figure 5.7 shows the simulated delays versus the control codes.

Figure 5.5: Tunable RC-based delay

Figure 5.6: One half of the tunable delay element


Figure 5.7: Properties of the used delay element in the nominal case at 2 GHz (schematic simulations)

Obviously, the same delay can be achieved with different combinations of the number of stages and active capacitances. As Figure 5.7d suggests, in terms of power consumption it is favorable to have as many stages active as possible for a given delay. Of course, this yields the worst possible noise performance, but this can be tolerated as long as the specification is not violated.
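The overlap between the delay ranges of different stage counts can be illustrated with a first-order behavioral model. The sketch below is not derived from the implemented circuit: the resistance, unit capacitance and fixed stage delay are hypothetical values, and each stage is approximated by the textbook 50 % step response delay of a single RC low-pass, $\ln (2) \cdot R C$. Only the three stages and the 5-bit control code are taken from the text.

```python
import math

# Hypothetical element values, chosen only for illustration.
R_OHM = 500.0        # series resistance of one stage
C_UNIT_F = 2e-15     # unit capacitor of the binary weighted array
T_FIXED_S = 8e-12    # intrinsic stage delay with all capacitors disabled
N_BITS = 5           # control code width (from the text)
N_STAGES = 3         # number of switchable stages (from the text)


def stage_delay_s(code):
    """First-order delay of one stage: fixed part plus ln(2) * R * C."""
    return T_FIXED_S + math.log(2) * R_OHM * C_UNIT_F * code


def element_delay_s(stages, code):
    """Delay when the output is tapped after the given number of stages."""
    return stages * stage_delay_s(code)


def settings_for(target_s, tol_s):
    """All (stages, code) pairs whose delay is within tol_s of target_s."""
    return [
        (s, c)
        for s in range(1, N_STAGES + 1)
        for c in range(2 ** N_BITS)
        if abs(element_delay_s(s, c) - target_s) <= tol_s
    ]


candidates = settings_for(target_s=28e-12, tol_s=0.5e-12)
# Following the power argument in the text, prefer the most active stages.
preferred = max(candidates, key=lambda sc: sc[0])
print(candidates, preferred)
```

In this toy model the same 28 ps delay is reachable with one, two or three active stages. The selection rule (most stages, least capacitance) only mirrors the qualitative statement above; the model contains no power or noise estimate.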

### 5.2.2 Phase Interpolator

As already briefly described in Section 4.1, a phase interpolator (alternatively phase average, phase mixer, phase blender or edge combiner) takes two or more input phases and combines them into a single output phase, whose transitions lie exactly in the middle of the inputs' edges.

In literature several possibilities to interpolate phases can be found. Basic implementations either add voltages [88]-[90] or currents [91]-[94]. More advanced solutions employ oscillators [95] or DLLs as already briefly discussed in Section 3.2.

The principal phase interpolator topologies suffer from their limited interpolation range: in these systems, the delay between the interpolated edges should be smaller than the rise and fall times of the input signals to guarantee linear interpolation (the transitions of the inputs have to overlap) [89].

In this work, a voltage mode phase interpolator was chosen because, unlike a current mode one, it has no current sources and does not need a level conversion after the interpolation. A sketch of the phase interpolator [88], [90] is shown in Figure 5.8. The phase interpolator is shown for $m$ inputs, but this application needs only two inputs. Essentially, the phase interpolator is composed of two inverters with their outputs connected together. A third inverter converts the interpolated signal to a regular CMOS one.


Figure 5.8: Voltage mode phase interpolator [88], [90]

Sketches for overlapping and non-overlapping inputs and the respective outputs are shown in Figure 5.9. As can be seen in Figure 5.9b, if the inputs do not exhibit overlapping transitions, the two interpolation inverters work against each other, producing an intermediate voltage and drawing a large current. This behavior is highly undesirable, as the phase information is lost and the output's transitions then depend only on threshold voltages.


Figure 5.9: Voltage mode phase interpolator's inputs and outputs [94]


Figure 5.10: Voltage mode phase interpolator used in this work

In the actual implementation, additional switches were included to turn off the interpolating inverters (see Figure 5.10). If there is no input LO signal, the two different interpolated signals on the Q-path (LO and $\mathrm{D}_{3}$ or $\overline{\mathrm{LO}}$ and $\overline{\mathrm{D}}_{3}$, see Figure 5.2) exhibit different potentials. This results in a constant current consumption, although the circuit is switched off. One way to prevent this unwanted current flow is to switch off one interpolating inverter. To keep the symmetry intact, the additional transistors are included in both branches.

Figure 5.11 shows the phase transfer function of the phase interpolator at 2 GHz for different rise and fall times in the nominal case. The output inverter is sized twice as large as the interpolating devices. As can be seen, the phase transfer function (transition times measured at $0.5 V_{\mathrm{DD}}$ crossings) is already nonlinear for rather small phase differences of the input. The phase interpolator is therefore designed so that the error introduced is small enough to stay within the specification bounds.


Figure 5.11: Phase transfer function of a voltage mode phase interpolator as shown in Figure 5.10, considering 2 GHz input signals with different rise and fall times, the inherent delay of the interpolator (propagation delay of the inverters when the input delay is zero) is not shown

To emphasize the range of linear operation of the phase interpolator, schematic time-domain simulations are shown in Figure 5.12. In Figure 5.12a, the delay between the two input signals is 10 ps, while the rise and fall times are 20 ps each. Note that in the figure the inherent delay of the interpolator (propagation delay of the inverters when the input delay is zero) is not corrected. In contrast, Figure 5.12b shows the rather extreme case of a 30 ps input delay with rise and fall times of 10 ps. As can be seen, there are time intervals (highlighted areas in Figure 5.12b) in which the two input inverters work against each other. The output voltage then depends only on the exact ratios of the devices and on the trip point of the output inverter, and all timing information is lost.
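The overlap condition can be illustrated with an idealized behavioral model. The sketch below is an assumption-laden simplification: the inputs are ideal linear ramps normalized to the supply, the interpolation is modeled as the plain average of the two ramps, and the inversion and propagation delay of the real inverters are ignored. It therefore does not reproduce the nonlinearity visible in Figure 5.11. All function names are hypothetical.

```python
def ramp(t_ps, start_ps, t_rf_ps):
    """Normalized rising edge: 0 before start, 1 after start + t_rf."""
    x = (t_ps - start_ps) / t_rf_ps
    return min(1.0, max(0.0, x))


def averaged_output(t_ps, t_d_ps, t_rf_ps):
    """Idealized interpolation node: average of the two input ramps."""
    return 0.5 * (ramp(t_ps, 0.0, t_rf_ps) + ramp(t_ps, t_d_ps, t_rf_ps))


def contention_interval_ps(t_d_ps, t_rf_ps):
    """Time span in which one input is fully high and the other fully low."""
    return max(0.0, t_d_ps - t_rf_ps)


def ideal_crossing_ps(t_d_ps, t_rf_ps):
    """50 % crossing of the averaged node, or None if it sits on a plateau."""
    if t_d_ps > t_rf_ps:
        return None  # edges do not overlap: crossing time is undefined
    return 0.5 * (t_rf_ps + t_d_ps)


# Case of Figure 5.12a: 10 ps input delay, 20 ps rise time.
# The inputs cross 50 % at 10 ps and 20 ps, the average at 15 ps.
print(ideal_crossing_ps(10.0, 20.0))

# Case of Figure 5.12b: 30 ps input delay, 10 ps rise time.
# The averaged node rests at 0.5 for 20 ps and carries no timing.
print(ideal_crossing_ps(30.0, 10.0), contention_interval_ps(30.0, 10.0))
```

In this model the output crossing lies exactly midway between the input crossings as long as the input delay does not exceed the rise time. Beyond that point the model shows a flat region at half the supply, which corresponds to the contention intervals highlighted in Figure 5.12b.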

### 5.2.3 Control Logic

Since the phase interpolation block has a rather limited linear interpolation range, the delays used in the system have to be adjusted with sufficient precision according to the operating frequency and the process corner. To do so, a closed control loop has been implemented. When there is a frequency change at the input, a reset signal is sent to the logic block. Upon reset, the control circuit adjusts the delays to the input frequency, using several LO cycles. When this operation is completed, the control unit switches itself off and the main circuit is operating in open loop configuration again.

The control logic performs a modified binary search: starting with the longest possible delay (all capacitors active, all delay stages active), the algorithm decreases the number of active capacitors according to the binary search algorithm [96]. If the delayed signal $D_{4}$ in Figure 5.2 is still lagging the input LO signal at the end of a search cycle, the capacitor control word is reset, the next lower output multiplexer is activated and the above process is repeated until either a fitting control word is found or the minimum possible delay is reached.

In a conventional binary search algorithm, one would start with the lowest possible delay. In this work the reverse search was implemented, because, as it can be seen in Figure 5.7, in order to get the lowest power consumption for a given delay it is better to have as many stages active as possible while their load capacitance should be as small as possible.
This behavior is similar to a (very coarse) digital DLL, which is described in Section 3.2. In contrast to a conventional DLL, the feedback loop is deactivated after finding a lock (or running out of possibilities to vary the delay any further).

(b) Rise and fall time $t_{r f}=10 \mathrm{ps}$, input delay $t_{d}=30 \mathrm{ps}$, shaded areas indicate the time intervals in which the two input inverters are working against each other, in this case the output transitions are not correct

Figure 5.12: Waveforms of the phase interpolator for different rise/fall times and input delays (schematic simulations)

The block diagram of the control logic is shown in Figure 5.13, with start and reset signals omitted.


Figure 5.13: Block diagram of the control circuitry (reset and start signals are not shown)

The phase detector (PD) is implemented with two edge triggered D-type registers (see Figure 5.14). The inverters are inserted to compensate for the setup times of the registers. The up signal is high if the LO is leading the delayed version (total phase shift is higher than $180^{\circ}$), while, if the LO is lagging, down is high (total phase shift smaller than $180^{\circ}$). If both up and down are high simultaneously, the two edges are closer than the resolution of the PD (which can also be tuned by delaying the respective clock inputs of the registers). In this case, the currently evaluated control bit is not changed and the algorithm continues.


Figure 5.14: Phase detector

The capacitance control register (CAP CTRL REG in Figure 5.13) stores the current 5-bit control word for the tunable delay element. The multiplexer control shift register (MUX CTRL) is a 3-bit shift register storing which multiplexer in the delay elements is used. The position control shift register (POS CTRL SREG) is used for the modified binary search algorithm and stores the position of the currently evaluated bit. Finally, the END block in Figure 5.13 determines whether the locked state is achieved after each sweep of the capacitance control word, or if the next multiplexer needs to be activated.

The control block operates at half the LO frequency. This allows the switches of the delay elements to settle for more than one LO period before evaluation. According to the algorithm described above, in the worst case it takes $2 \cdot 3 \cdot 5=30$ LO cycles (three times the evaluation of five bits at half the LO rate) for the control block to adjust the delay. At the lowest specified frequency (1.7 GHz), this translates into a maximum time of 17.6 ns.
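The search described in this section can be sketched as a behavioral model. The code below is an interpretation under stated assumptions, not the implemented logic: the delay model `loop_delay_ps` and its constants are hypothetical, the phase detector is reduced to a comparison against a target delay, and `tol_ps` stands in for the PD resolution. Only the three stages, the five control bits and the half-rate operation are taken from the text.

```python
from dataclasses import dataclass

N_BITS = 5             # capacitor control word width (from the text)
N_STAGES = 3           # switchable delay stages (from the text)
T_STAGE_MIN_PS = 20.0  # hypothetical stage delay with all capacitors off
T_LSB_PS = 1.0         # hypothetical delay per code step and stage


def loop_delay_ps(stages: int, code: int) -> float:
    """Hypothetical monotonic delay model, not extracted from the circuit."""
    return stages * (T_STAGE_MIN_PS + code * T_LSB_PS)


@dataclass
class SearchResult:
    stages: int
    code: int
    locked: bool
    evaluations: int


def modified_binary_search(target_ps: float, tol_ps: float = 3.0) -> SearchResult:
    evaluations = 0
    code = 0
    for stages in range(N_STAGES, 0, -1):    # start with all stages active
        code = (1 << N_BITS) - 1             # all capacitors active
        for bit in reversed(range(N_BITS)):  # evaluate the MSB first
            trial = code & ~(1 << bit)
            evaluations += 1
            if loop_delay_ps(stages, trial) >= target_ps:
                code = trial                 # still long enough: keep bit cleared
        if loop_delay_ps(stages, code) <= target_ps + tol_ps:
            return SearchResult(stages, code, True, evaluations)
        # Delay still too long: reset the word and use the next lower tap.
    return SearchResult(1, code, False, evaluations)


def settling_time_ns(evaluations: int, f_lo_ghz: float) -> float:
    """Each evaluation takes two LO cycles (control runs at half the LO rate)."""
    return 2 * evaluations / f_lo_ghz


result = modified_binary_search(target_ps=125.0)
worst = modified_binary_search(target_ps=10.0)  # unreachable in this model
print(result, worst, settling_time_ns(worst.evaluations, 1.7))
```

For an unreachable target the model runs through all three stage settings, that is 15 evaluations or 30 LO cycles, which at 1.7 GHz gives roughly the 17.6 ns stated above. Because the search starts from the longest delay and only drops a stage when the capacitor sweep is exhausted, it naturally settles on the largest usable stage count.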

Additional multiplexers are added to allow external programming of the delay for debugging reasons.

### 5.3 Layout

The layout was planned in order to keep the structures as simple as possible rather than to minimize the mismatch. This is a reasonable approach, as the distances between critical elements are rather short. Additionally, it was deemed that, in contrast to conventional analog circuits, the impact of certain mismatch is not that critical. A symmetrical layout approach was aimed at, in which both differential signals are kept close together.

Although all elements of the long delays denoted $\Delta T$ in Figure 5.2 should be matched, it was decided that matching through vicinity was sufficient. Both delays $\Delta T$ were laid out separately, as complex matching techniques for four elements would otherwise have been needed. Additionally, the layout approach with separated delays easily enables symmetrical wiring and therefore symmetrical parasitic effects.

Also, all four phase interpolators should be matched well to each other. Again, for the sake of simplicity, advanced matching techniques such as common centroid layout [97], [98] were applied only to the two paths within a single phase interpolator.

### 5.3.1 Delay Element $\Delta T$

The basic floorplan of one half of a single stage of the delay element is shown in Figure 5.15. The shown slice is mirrored upwards to obtain differential operation. Three of these stages are concatenated to yield the final delay element, shown in Figure 5.16.


Figure 5.15: Floorplan of one half of a single stage of the delay element

All three slices are slightly different with respect to the resistance value. The various resistances are implemented as parallel and series connections of unit resistors (labeled $R$ in Figure 5.15). The remaining resistors (labeled DM, as dummy) are not connected but increase the regularity of the structure. Additionally, in the first stage the enable transistors are not connected to yield the circuit as shown in Figure 5.6.


Figure 5.16: Floorplan of the delay element

As stated before, the matching of the single capacitors in the array was deemed uncritical for the circuit, as the digital control only needs to have coarsely binary weighted capacitances to work properly. Therefore, the arrangement of the unit capacitors according to their weight has been planned in order to simplify the wiring of the respective control signals, which are routed horizontally between the capacitors. The switches are placed below the ground connection of each capacitor.

Dummy capacitors, which are used as decoupling capacitors, have been inserted on the top and bottom of the capacitor array.

At the right side of the layout, the output buffer, as seen in Figure 5.2, and two columns of dummy capacitors are added (see Figure 5.16).

### 5.3.2 Phase Interpolator

The floorplan of a single-ended phase interpolator (see Section 5.2.2) is shown in Figure 5.17. The transistors of the input inverters as well as the switches were arranged in a common centroid configuration. The wiring was planned in order to guarantee symmetrical parasitics for both the input signal paths.


Figure 5.17: Floorplan of the phase interpolator

### 5.3.3 Top Level

The layout of the entire circuit is shown in Figure 5.18. The arrangement is dominated by the four capacitor arrays forming the long delays $\Delta T$ in Figure 5.2 (see Section 5.3.1). The pseudo differential inverters used for the small delays $\Delta T_{f}$, as well as the phase interpolators, are symmetrically placed on the sides of the delay elements $\Delta T$. An additional input buffer was added, which is also visible.

The control block is placed in the lower left corner next to the second delay element $\Delta T$. The remaining area, rather far away from the critical signals, is filled with decoupling capacitors.


Figure 5.18: Final layout of the quadrature generator, dimensions are $40 \mu \mathrm{~m} \times 80 \mu \mathrm{~m}$

The dimensions of the block shown in Figure 5.18 are $40.425 \mu \mathrm{~m} \times 80.850 \mu \mathrm{~m}$, resulting in an area of $3269 \mu \mathrm{~m}^{2}=0.003269 \mathrm{~mm}^{2}$. This area is approximately half the size of a dot printed at 300 dpi resolution [99]. The four capacitor arrays occupy approximately $50 \%$ of the block's area, consuming $1609 \mu \mathrm{~m}^{2}=0.001609 \mathrm{~mm}^{2}$. Finally, the manually placed and routed digital control block was placed in one of the block's corners. Its dimensions are $6.64 \mu \mathrm{~m} \times 14.49 \mu \mathrm{~m}$, occupying an area of $96.2 \mu \mathrm{~m}^{2}$, which is approximately $3 \%$ of the total area.
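The area figures can be reproduced with a few lines of arithmetic. In the sketch below the dimensions are the ones stated above. The comparison with a printed dot assumes, purely for illustration, a square dot whose side equals the 300 dpi pitch; this assumption is not taken from [99].

```python
# Dimensions as stated in the text, in micrometers.
block_w_um, block_h_um = 40.425, 80.850
ctrl_w_um, ctrl_h_um = 6.64, 14.49
cap_area_um2 = 1609.0

block_area_um2 = block_w_um * block_h_um   # about 3268 um^2
ctrl_area_um2 = ctrl_w_um * ctrl_h_um      # about 96.2 um^2

cap_share = cap_area_um2 / block_area_um2  # about 0.49
ctrl_share = ctrl_area_um2 / block_area_um2  # about 0.03

# Assumption: a 300 dpi dot modeled as a square with the dot pitch as side.
dot_pitch_um = 25400.0 / 300.0             # one inch is 25.4 mm
dot_ratio = block_area_um2 / dot_pitch_um ** 2

print(block_area_um2, cap_share, ctrl_share, dot_ratio)
```

Under the square-dot assumption the block covers a little less than half of one dot, in line with the comparison made above.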

The signal flow is as follows (see Figure 5.19): the input LO is fed into the input buffer from the outer left side, then travels upwards into the first delay $\Delta T$. The signal lines are also tapped here to be fed into the phase interpolators of the quadrature path.

The delayed signals are then routed downwards on the right side to enter the delay block $\Delta T_{f}$, and then even further down to the inputs of the second delay block $\Delta T$. Both taps for the phase interpolators of the in-phase path are placed along these lines in a symmetrical fashion, also with respect to the quadrature components. Eventually, the delayed signal is routed up again on the left side to enter the final delay $\Delta T_{f}$ and to be fed into the phase interpolators of the quadrature path.


Figure 5.19: Signal flow

### 5.4 Simulation Results

All simulations were performed using the Spectre Circuit Simulator [100] and the SpectreRF extensions [101]. The radio frequency extensions provide an additional set of analyses that allow the study of the circuits in their periodic steady state [102]. Similar analyses to conventional DC, AC, transient or noise simulations are available. The RF analyses use a time varying operating period, on which all the small signal simulations are performed. This operating period, in contrast to the conventional operating point obtained by a DC simulation, is usually one or more periods of the fundamental frequency that drives the circuit. It is assumed that the circuit under investigation shows a periodic behavior due to the driving signal [102], [103].

To compute the periodic operating period, the so called Periodic Steady State (PSS) analysis is used. It is the RF-equivalent to the DC analysis. With this result, periodic AC (PAC), periodic noise (pnoise) analyses and various other periodic small signal analyses can be performed. Additionally, the pnoise analysis is able to determine the phase noise properties of the circuit [103], [104].

In the following, schematic level simulations are compared to post-layout simulations. The layout, described in Section 5.3 and shown in Figure 5.18, was extracted with Cadence QRC Extraction Solution [105]. In such a parasitic extraction, the real electrical properties of the circuit as a function of the layout are estimated by algorithms [106]. The extracted parasitic elements include capacitances between lines, resistances of lines and (partial) inductances of lines [107], [108]. It is possible to extract only certain elements and/or areas of interest. There are several extraction modes available. The most important ones are C coupled, which extracts only the parasitic capacitances between all the lines, and RC coupled, which additionally extracts the resistances of the lines. These two modes are used in the following.

### 5.4.1 Test Bench

The test bench is displayed in Figure 5.20. It shows the sources driving the quadrature generator, the sources that control the delay if necessary, and the load at the outputs. The load transistors represent the inputs of the specified NOR-based multiplexer.


Figure 5.20: The test bench used to perform the following analyses

### 5.4.2 Static Performance

For this analysis, the circuit was operated at several different input frequencies and supply voltages at nominal temperature $T=27^{\circ} \mathrm{C}$. Initially, the reset signal was activated in order to let the digital control block adjust the delay to the LO frequency. Only after the completion of this procedure was the circuit performance evaluated.

First, Figures 5.21 and 5.22 show the behavior of the output signals simulated for the RC coupled extracted circuit, at 2 GHz and in the nominal corner. A deviation from the ideal 125 ps delay between the I and Q outputs can be seen. In the following, an extensive study of the main performance parameters has been carried out considering schematic and extracted simulations.


Figure 5.21: Signal waveforms of the RC coupled simulation at 2 GHz nominal, positive signals


Figure 5.22: Signal waveforms of the RC coupled simulation at 2 GHz nominal, negative signals

## I/Q Phase Shift

First, the phase shift between the I and Q outputs is analyzed. The phase shift is measured at the $0.5 V_{\mathrm{DD}}$ crossings. Figures 5.23, 5.24 and 5.25 show the I/Q phase shift for different process corners versus supply voltage and operating frequency.
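The measurement at the $0.5 V_{\mathrm{DD}}$ crossings can be sketched as follows. This is an illustrative post-processing routine, not the evaluation script used for the simulations: the waveforms are synthetic ideal trapezoids, all function names are hypothetical, and the crossing time is obtained by linear interpolation between samples.

```python
def trapezoid(t_ps, t_start_ps, t_rf_ps, period_ps, vdd):
    """Ideal trapezoidal clock with 50 % duty cycle, rising at t_start_ps."""
    x = (t_ps - t_start_ps) % period_ps
    half = period_ps / 2
    if x < t_rf_ps:
        return vdd * x / t_rf_ps
    if x < half:
        return vdd
    if x < half + t_rf_ps:
        return vdd * (1.0 - (x - half) / t_rf_ps)
    return 0.0


def first_rising_crossing(times, volts, threshold):
    """Time of the first rising threshold crossing (linear interpolation)."""
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 < threshold <= v1:
            return times[i] + (threshold - v0) * (times[i + 1] - times[i]) / (v1 - v0)
    raise ValueError("no rising crossing found")


def iq_phase_shift_deg(times_ps, v_i, v_q, vdd, f_ghz):
    """I/Q phase shift; assumes the Q edge follows the I edge in the record."""
    t_i = first_rising_crossing(times_ps, v_i, 0.5 * vdd)
    t_q = first_rising_crossing(times_ps, v_q, 0.5 * vdd)
    return 360.0 * f_ghz * (t_q - t_i) * 1e-3  # GHz times ps gives 1e-3 cycles


# Synthetic 2 GHz example: Q delayed by 125 ps with respect to I.
VDD, F_GHZ, PERIOD_PS = 1.1, 2.0, 500.0
times = [float(k) for k in range(500)]  # 1 ps sampling over one period
v_i = [trapezoid(t, 50.0, 20.0, PERIOD_PS, VDD) for t in times]
v_q = [trapezoid(t, 175.0, 20.0, PERIOD_PS, VDD) for t in times]
phase = iq_phase_shift_deg(times, v_i, v_q, VDD, F_GHZ)
print(phase)  # 90 degrees up to rounding
```

On simulated waveforms the same routine would be applied to the exported I and Q outputs after the control block has settled.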


Figure 5.23: Simulated I/Q phase shifts in the nominal corner, $T=27^{\circ} \mathrm{C}$

These plots show that the I/Q accuracy of $90^{\circ} \pm 5^{\circ}$ is achieved over the entire frequency range and for all corners and supply voltages. Only in the slow corner at very low supply voltages is the margin low compared to the other results. While the results evaluated at the schematic level lie close to the desired $90^{\circ}$ phase shift, the extracted results tend to drift slightly away from this ideal. The direction of this deviation is similar in all corners, so it is possible to conclude that there is a slight asymmetry between the in-phase and quadrature paths introduced by the layout. It shall be noted that no mismatch was considered in these simulations.


Figure 5.24: Simulated I/Q phase shifts in the slow corner, $T=27^{\circ} \mathrm{C}$


Figure 5.25: Simulated I/Q phase shifts in the fast corner, $T=27^{\circ} \mathrm{C}$

## Phase Noise

The phase noise of both the I and Q outputs is now analyzed. Figures 5.26, 5.27 and 5.28 show the phase noise at 100 MHz frequency offset for the in-phase outputs, which, according to the specifications (see Section 5.1), has to be lower than $-155 \mathrm{dBc} / \mathrm{Hz}$. Figures 5.29, 5.30 and 5.31 show the same for the quadrature outputs.


Figure 5.26: Simulated I phase noise in the nominal corner, $T=27^{\circ} \mathrm{C}$

As can be seen, only the slow corner is critical in terms of phase noise. While the in-phase outputs are within the specification boundaries for all simulations, the RC coupled simulation results for the quadrature outputs violate the specification for frequencies higher than 2.6 GHz at low supply voltages. This was deemed uncritical, as the violation is below 1 dB and only affects the slow corner at low supply voltages and high frequencies. Furthermore, the gate resistances are heavily overestimated in the RC coupled mode. From direct comparison with RF transistor models, it is known that thermal noise is overestimated in standard device models under the given circuit conditions. For implementation reasons, dedicated RF transistors were not used.

As can be seen in the figures, the noise increases as the carrier frequency is increased and as the supply voltage is reduced, i.e., as the signal power and the overdrive of the transistors are reduced. This is in line with the general phase noise trend described in Section 2.1.

Figure 5.27: Simulated I phase noise in the slow corner, the dashed line indicates the specification limit, $T=27^{\circ} \mathrm{C}$

Figure 5.28: Simulated I phase noise in the fast corner, $T=27^{\circ} \mathrm{C}$


Figure 5.29: Simulated Q phase noise in the nominal corner, $T=27^{\circ} \mathrm{C}$


Figure 5.30: Simulated Q phase noise in the slow corner, the dashed line indicates the specification limit, $T=27^{\circ} \mathrm{C}$

It is also worth noting that there is a difference in phase noise between the I and Q outputs: apart from the extreme case (slow corner, low supply voltage and high frequencies), the phase noise on the quadrature path is considerably lower than that on the in-phase path. This is due to the nature of the interpolated signals. While the quadrature outputs are generated from both the "clean" signals near the input and the noisy output of the entire delay, the in-phase outputs are generated from two noisy signals taken from the delay. Therefore, the quadrature output gains in noise performance, as long as the interpolation is not significantly contributing to the noise.

Figure 5.31: Simulated Q phase noise in the fast corner, $T=27^{\circ} \mathrm{C}$

## Duty Cycle

The duty cycle of the signals is measured at the $0.5 V_{\mathrm{DD}}$ crossings, similarly to the I/Q phase shift. Figures 5.32, 5.33 and 5.34 show the duty cycle for the in-phase outputs, while Figures 5.35, 5.36 and 5.37 show the same for the quadrature outputs.
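A duty cycle measurement of this kind can be sketched as follows. The routine is illustrative only: the waveform is a synthetic trapezoid with a deliberately lengthened high phase, the function names are hypothetical, and the $\pm 2 \%$ limit is interpreted here as an absolute deviation of 0.02 from 0.5.

```python
def clock_sample(t_ps, t_rf_ps, t_fall_start_ps, period_ps, vdd):
    """Trapezoidal clock: rises at 0, starts falling at t_fall_start_ps."""
    x = t_ps % period_ps
    if x < t_rf_ps:
        return vdd * x / t_rf_ps
    if x < t_fall_start_ps:
        return vdd
    if x < t_fall_start_ps + t_rf_ps:
        return vdd * (1.0 - (x - t_fall_start_ps) / t_rf_ps)
    return 0.0


def crossings(times, volts, threshold):
    """List of (time, direction) pairs; +1 is rising, -1 is falling."""
    events = []
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if (v0 < threshold <= v1) or (v0 > threshold >= v1):
            t = times[i] + (threshold - v0) * (times[i + 1] - times[i]) / (v1 - v0)
            events.append((t, 1 if v1 > v0 else -1))
    return events


def duty_cycle(times, volts, vdd, period):
    """High time between a rising and the next falling crossing over the period."""
    events = crossings(times, volts, 0.5 * vdd)
    for (t_rise, d0), (t_fall, d1) in zip(events, events[1:]):
        if d0 == 1 and d1 == -1:
            return (t_fall - t_rise) / period
    raise ValueError("no complete high phase found")


# Synthetic 2 GHz example with a 255 ps high phase (instead of 250 ps).
VDD, PERIOD_PS = 1.1, 500.0
times = [float(k) for k in range(500)]
volts = [clock_sample(t, 20.0, 255.0, PERIOD_PS, VDD) for t in times]
duty = duty_cycle(times, volts, VDD, PERIOD_PS)
within_spec = abs(duty - 0.5) <= 0.02  # assumed reading of the +/- 2 % limit
print(duty, within_spec)
```

Because the rising and falling edges have the same slope in this example, the measured duty cycle equals the ratio of the high phase to the period, here 0.51.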

The specifications on the duty cycles are met for all operating frequencies, process corners and supply voltages. Similar to the phase noise, in the slow corner at low supply voltages and high frequencies there is little margin on the quadrature outputs; under these conditions the circuit is reaching its upper frequency limit due to the parasitic capacitances introduced through the layout and the reduced driving strength of the devices in the slow corner.

Figure 5.32: Simulated duty cycle of the in-phase outputs in the nominal corner, $T=27^{\circ} \mathrm{C}$

Figure 5.33: Simulated duty cycle of the in-phase outputs in the slow corner, $T=27^{\circ} \mathrm{C}$


Figure 5.34: Simulated duty cycle of the in-phase outputs in the fast corner, $T=27^{\circ} \mathrm{C}$


Figure 5.35: Simulated duty cycle of the quadrature outputs in the nominal corner, $T=27^{\circ} \mathrm{C}$



Figure 5.36: Simulated duty cycle of the quadrature outputs in the slow corner, the dashed lines indicate the specification limits, $T=27^{\circ} \mathrm{C}$


Figure 5.37: Simulated duty cycle of the quadrature outputs in the fast corner, $T=27^{\circ} \mathrm{C}$

## Current Consumption

The current consumption of the entire circuit is shown in Figures 5.38, 5.39 and 5.40 for the various corners. The current drawn by the load transistors considered in the test bench (see Figure 5.20) is not included.


Figure 5.38: Current consumption in the nominal corner, $T=27^{\circ} \mathrm{C}$

Similar to the parameters analyzed previously, the extracted simulations show worse performance than the schematic level simulation, as expected. The current consumption depends on the input frequency and the supply voltage, as is common to any CMOS circuit.
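This dependence follows the well-known first-order expression for the dynamic supply current of switching CMOS logic. It is a textbook approximation and not a result of this work; the activity factor $\alpha$ and the total switched capacitance $C_{\mathrm{sw}}$ are symbols introduced here for illustration, and short-circuit and leakage currents are neglected:

$$
I_{\mathrm{DD}} \approx \alpha \, C_{\mathrm{sw}} \, V_{\mathrm{DD}} \, f_{\mathrm{LO}}
$$

In this picture, the parasitic capacitances added by the layout increase $C_{\mathrm{sw}}$, which is consistent with the higher current observed in the extracted simulations.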

## Delay Control Word

The control word adjusting the variable delays is displayed in Figures 5.41, 5.42 and 5.43. The code was determined by the digital control block as explained in Section 5.2.3.

Figure 5.39: Current consumption in the slow corner, $T=27^{\circ} \mathrm{C}$

Figure 5.40: Current consumption in the fast corner, $T=27^{\circ} \mathrm{C}$

In the slow corner the minimum possible control word is already reached at 2.1 GHz with a supply voltage of 1.05 V, while for the nominal corner this happens at 2.6 GHz. This hints that the parasitic capacitive loading is the frequency limiting factor of the circuit. These analyses also clearly show the differences between the schematic and post-layout simulations: in the schematic simulations the capacitances to delay the signals are mostly provided by the designed array, while in the post-layout simulations the parasitic elements contribute a substantial part to the delay.

Figure 5.41: Delay control code in the nominal corner, $T=27^{\circ} \mathrm{C}$

Figure 5.42: Delay control code in the slow corner, $T=27^{\circ} \mathrm{C}$


Figure 5.43: Delay control code in the fast corner, $T=27^{\circ} \mathrm{C}$

It is possible to observe that a wide range of delays is necessary in order to achieve operation over the required frequency range in all corners.

## Summary

Table 5.1 gives an overview of the previously discussed results. The performance measurements were conducted over the entire operating frequency (1.7 GHz to 2.7 GHz), supply voltage (1.05 V to 1.15 V) and temperature range ($-30^{\circ} \mathrm{C}$ to $120^{\circ} \mathrm{C}$).

As mentioned above in the discussion of the phase noise performance, the RC coupled extracted simulation in the slow corner violates the specification. At high temperatures, the noise increases by only 1 dB. It is worth highlighting that the gate resistances are heavily overestimated in the RC coupled extraction. Additionally, the accuracy of the transistor model is slightly worse at extreme temperatures compared to nominal conditions.
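The comparison against the target specification can be made explicit with a small check. The sketch below uses the values of the "total" columns of Table 5.1 and the limits from Section 5.1. Two points are assumptions made here: the I/Q limit is taken as $90^{\circ} \pm 5^{\circ}$ and the duty cycle limit as $0.50 \pm 0.02$, both applied over the whole operating range although the specification quotes them at 2 GHz. The dictionary keys are hypothetical names.

```python
# Limits from Section 5.1 (see the assumptions stated above).
IQ_LIMITS_DEG = (85.0, 95.0)
PHASE_NOISE_MAX_DBC_HZ = -155.0
DUTY_LIMITS = (0.48, 0.52)

# "total" columns of Table 5.1: (min, max) ranges, worst case for noise.
TOTALS = {
    "schematic": {"iq": (87.6, 91.2), "pn_i": -155.7, "pn_q": -157.7,
                  "duty_i": (0.49, 0.51), "duty_q": (0.49, 0.51)},
    "C coupled": {"iq": (87.4, 92.5), "pn_i": -155.1, "pn_q": -155.2,
                  "duty_i": (0.49, 0.51), "duty_q": (0.49, 0.52)},
    "RC coupled": {"iq": (87.1, 91.9), "pn_i": -154.4, "pn_q": -153.5,
                   "duty_i": (0.49, 0.51), "duty_q": (0.49, 0.52)},
}


def in_range(value_range, limits):
    """True if the (min, max) range lies completely inside the limits."""
    return limits[0] <= value_range[0] and value_range[1] <= limits[1]


def find_violations(totals):
    """Return (simulation, parameter) pairs that miss the specification."""
    found = []
    for sim, r in totals.items():
        if not in_range(r["iq"], IQ_LIMITS_DEG):
            found.append((sim, "iq"))
        for key in ("pn_i", "pn_q"):
            if r[key] > PHASE_NOISE_MAX_DBC_HZ:
                found.append((sim, key))
        for key in ("duty_i", "duty_q"):
            if not in_range(r[key], DUTY_LIMITS):
                found.append((sim, key))
    return found


violations = find_violations(TOTALS)
print(violations)
```

With these numbers only the phase noise of the RC coupled extraction exceeds the limit, by 0.6 dB on the in-phase and 1.5 dB on the quadrature outputs; all other entries stay within the assumed bounds.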

|  |  | nominal |  |  | slow |  |  | fast |  |  | total |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | min | avg | max | min | avg | max | min | avg | max | min | avg | max |
| I/Q phase shift in degree | schematic | 89.2 | 89.8 | 91.2 | 87.6 | 89.4 | 91.1 | 89.5 | 89.8 | 90.2 | 87.6 | 89.6 | 91.2 |
|  | C coupled | 88.1 | 89.4 | 91.5 | 87.4 | 88.8 | 91.1 | 88.9 | 89.7 | 92.5 | 87.4 | 89.3 | 92.5 |
|  | RC coupled | 87.5 | 88.8 | 91.3 | 87.1 | 88.4 | 91.6 | 88.1 | 89.4 | 91.9 | 87.1 | 88.8 | 91.9 |
| phase noise I at 100 MHz offset in $\mathrm{dBc} / \mathrm{Hz}$ | schematic | -160.8 | -159.0 | -157.3 | -159.5 | -157.6 | -155.7 | -160.6 | -159.1 | -157.6 | -160.8 | -158.6 | -155.7 |
|  | C coupled | -161.2 | -159.1 | -157.2 | -159.7 | -157.5 | -155.1 | -161.2 | -159.4 | -157.4 | -161.2 | -158.7 | -155.1 |
|  | RC coupled | -160.6 | -158.7 | -156.7 | -159.5 | -157.1 | -154.4 | -161.0 | -159.0 | -157.0 | -161.0 | -158.2 | -154.4 |
| phase noise Q at 100 MHz offset in $\mathrm{dBc} / \mathrm{Hz}$ | schematic | -164.0 | -161.7 | -158.1 | -162.7 | -160.4 | -157.7 | -163.8 | -162.0 | -159.7 | -164.0 | -161.4 | -157.7 |
|  | C coupled | -164.0 | -161.3 | -156.8 | -162.8 | -159.8 | -155.2 | -164.3 | -161.7 | -157.6 | -164.3 | -160.9 | -155.2 |
|  | RC coupled | -163.7 | -161.1 | -156.9 | -162.2 | -159.2 | -153.5 | -164.0 | -161.2 | -157.2 | -164.0 | -160.5 | -153.5 |
| duty cycle I | schematic | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 |
|  | C coupled | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 |
|  | RC coupled | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 |
| duty cycle Q | schematic | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.51 |
|  | C coupled | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.52 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.52 |
|  | RC coupled | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.52 | 0.49 | 0.50 | 0.51 | 0.49 | 0.50 | 0.52 |
| current draw in mA | schematic | 2.596 | 3.368 | 4.205 | 2.240 | 2.986 | 3.796 | 3.049 | 4.026 | 5.050 | 2.240 | 3.462 | 5.050 |
|  | C coupled | 2.984 | 3.983 | 5.158 | 2.800 | 3.570 | 4.422 | 3.503 | 4.606 | 5.696 | 2.800 | 4.053 | 5.696 |
|  | RC coupled | 2.901 | 3.809 | 4.716 | 2.644 | 3.482 | 4.439 | 3.362 | 4.407 | 5.652 | 2.644 | 3.900 | 5.652 |

Table 5.1: Summary of the main performance characteristics
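The "total" minimum and maximum columns of Table 5.1 follow directly from the per-corner extremes. The following Python sketch illustrates this for the RC coupled I/Q phase shift row; the function name `total_range` and the dictionary layout are illustrative assumptions, not part of the original evaluation flow. The total average is deliberately not reproduced, since it presumably derives from the underlying sweep points rather than from the rounded corner averages.

```python
# Per-corner (min, avg, max) of the RC coupled I/Q phase shift in degrees,
# copied from Table 5.1.
corners = {
    "nominal": (87.5, 88.8, 91.3),
    "slow": (87.1, 88.4, 91.6),
    "fast": (88.1, 89.4, 91.9),
}


def total_range(per_corner):
    """Return the overall (min, max) across all process corners."""
    lows = [values[0] for values in per_corner.values()]
    highs = [values[2] for values in per_corner.values()]
    return min(lows), max(highs)


total_min, total_max = total_range(corners)
print(f"total min = {total_min} deg, total max = {total_max} deg")
```

For this row the sketch yields 87.1° and 91.9°, which agrees with the "total" columns of Table 5.1.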

### 5.4.3 Dynamic Performance

In this section the behavior of the circuit is analyzed after the digital control unit has completed adjusting the delay and the operating conditions are varied. This analysis is conducted to see how the circuit performs under varying operating conditions without readjusting the delay. Therefore, in the following analyses, the delay control code is kept constant. The codes are determined by the digital control algorithm at the nominal supply voltage of 1.1 V , operating frequency of 2 GHz and temperature of $27^{\circ} \mathrm{C}$. The supply voltage, input frequency and temperature are varied.

## Supply Voltage Variation

Figure 5.44 shows the variation of the I/Q phase shift at 2 GHz in the nominal corner when the supply voltage is changed.


Figure 5.44: I/Q phase shift over supply voltage variations at 2 GHz in the nominal corner, using delay control codes determined at 1.1 V , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

Over the entire supply voltage region, ranging from 1.05 V to 1.15 V , the phase shift stays in a $\pm 1^{\circ}$ band for schematic and extracted simulations.

Figure 5.45 shows the phase noise of the in-phase and quadrature outputs when changing the supply voltage. Again, the specification is met for the operating supply voltage region. The shape of these curves mostly reflects the inherent dependency of the phase noise on the supply voltage. A severe degradation is only expected once the interpolation no longer works.


Figure 5.45: Simulated phase noise over supply voltage variations at 2 GHz in the nominal corner, using delay control codes determined at 1.1 V , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

Figure 5.46 shows the duty cycle of the I and Q outputs for a supply voltage variation. The deviation of the duty cycle stays below $\pm 1 \%$.

## Operating Frequency Variation

This analysis is very similar to the previous one, but now the supply voltage is kept at its nominal level of 1.1 V and the input LO frequency is varied. As already mentioned, the delay control code was evaluated by the digital control unit at 2 GHz and 1.1 V supply voltage.

Figure 5.46: Simulated duty cycle over supply voltage variations at 2 GHz in the nominal corner, using delay control codes determined at 1.1 V , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

Figure 5.47 shows the I/Q phase shift when the input frequency is changed. Interestingly, the frequency region with an acceptable deviation of the I/Q phase shift is noticeably bigger for the extracted simulations (squares and crosses in Figure 5.47) than for the schematic simulation (disks in Figure 5.47). While the schematic simulation yields an acceptable I/Q phase shift in the frequency region between 1.85 GHz and 2.15 GHz, the RC coupled simulations yield a range of 1.7 GHz to 2.15 GHz, which is about 150 MHz more towards lower frequencies.

In Figure 5.48 the phase noise performance dependency on the variation of the input frequency is displayed.

On the one hand, the phase noise of the in-phase outputs (blue curves in Figure 5.48) depends only on the carrier frequency, as linear interpolation on these signals is always guaranteed. On the other hand, a non-negligible degradation of the noise can be observed on the quadrature outputs when the interpolation ceases to work as desired. Note that in these frequency ranges the I/Q imbalance also reaches unacceptable values (see Figure 5.47).

The dependency of the simulated duty cycles on the input frequency is shown in Figure 5.49 for both the in-phase and quadrature outputs. Similar to the phase noise, the duty cycle is always within the specification bounds for the in-phase outputs, as the interpolation always works as desired. For the quadrature outputs, the resulting duty cycles strongly depend on the trip points of the output inverters, as these are the edge-defining elements when the interpolation is not working correctly.

Figure 5.47: I/Q phase shift over input LO frequency at 1.1 V in the nominal corner, using delay control codes determined at 2 GHz , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

Figure 5.48: Simulated phase noise over input LO frequency at 1.1 V in the nominal corner, using delay control codes determined at 2 GHz , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

Figure 5.49: Simulated duty cycle over input LO frequency at 1.1 V in the nominal corner, using delay control codes determined at 2 GHz , the dashed lines indicate specification bounds, $T=27^{\circ} \mathrm{C}$

## Temperature Variation

In this analysis the operating frequency and supply voltage are kept constant at 2 GHz and 1.1 V respectively. The delay control code is determined at $27^{\circ} \mathrm{C}$ and kept the same for all simulations. The temperature is varied from $-30^{\circ} \mathrm{C}$ to $120^{\circ} \mathrm{C}$. Model accuracy is assumed to be best at room temperature and lower at the temperature extremes, especially at low temperatures.

Figure 5.50 shows the variation of the I/Q phase shift, which changes by less than $1^{\circ}$ over the considered temperature range, even in the RC coupled simulations.


Figure 5.50: I/Q phase shift over temperature at 2 GHz and 1.1 V in the nominal corner, using delay control codes determined at 2 GHz and $27^{\circ} \mathrm{C}$

Figure 5.51 shows the phase noise at 100 MHz offset of the in-phase and quadrature outputs over temperature. At high temperatures, the phase noise performance is worse by approximately 1 dB in comparison to nominal temperature. As expected, all simulation types (schematic, C coupled and RC coupled) show very similar behavior.

The duty cycle dependency over temperature is shown in Figure 5.52. The variation is well below $1 \%$ for all the simulations.

Finally, the circuit's current consumption versus temperature is shown in Figure 5.53. At high temperatures the current consumption increases by approximately 0.2 mA compared to nominal temperature, mostly due to increased leakage currents.



Figure 5.51: Phase noise over temperature at 2 GHz and 1.1 V in the nominal corner, using delay control codes determined at 2 GHz and $27^{\circ} \mathrm{C}$


Figure 5.52: Duty cycle over temperature at 2 GHz and 1.1 V in the nominal corner, using delay control codes determined at 2 GHz and $27^{\circ} \mathrm{C}$


Figure 5.53: Current consumption over temperature at 2 GHz and 1.1 V in the nominal corner, using delay control codes determined at 2 GHz and $27^{\circ} \mathrm{C}$

### 5.4.4 Statistical Analysis

A schematic level Monte Carlo simulation [109] with one thousand runs has been performed considering the design at nominal conditions, operating frequency of 2 GHz , supply voltage of 1.1 V and nominal temperature of $27^{\circ} \mathrm{C}$.

The distribution of the I/Q phase shift is shown in Figure 5.54. As expected, the mean value of the I/Q phase shift is close to the ideal $90^{\circ}$; moreover, the standard deviation is lower than $0.4^{\circ}$. Under the reasonable assumption that the parasitics contribute only negligibly to the mismatch, a similar standard deviation can also be expected for the extracted layout.

The simulated distributions of the duty cycles are shown in Figures 5.55 and 5.56. The mean values of the duty cycles also lie very close to the ideal 0.5 in the schematic level analyses.




Figure 5.54: Distribution of the $\mathrm{I} / \mathrm{Q}$ phase shifts at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=89.8^{\circ}$ and the standard deviation is $\sigma=0.38^{\circ}, N=1000$


Figure 5.55: Distribution of the in-phase duty cycles at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=0.50$ and the standard deviation is $\sigma=0.002, N=1000$


Figure 5.56: Distribution of the quadrature duty cycles at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=0.50$ and the standard deviation is $\sigma=0.002$, $N=1000$
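To put the statistics of Figure 5.54 into perspective, the share of samples that falls inside a given phase window can be estimated from the mean and standard deviation. The following Python sketch assumes that the I/Q phase shift is normally distributed and uses a hypothetical acceptance window of $89^{\circ}$ to $91^{\circ}$; both the normality assumption and the window are illustrative and are not taken from the specification.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fraction_within(low, high, mu, sigma):
    """Fraction of a normal population that falls inside [low, high]."""
    return normal_cdf(high, mu, sigma) - normal_cdf(low, mu, sigma)


MU_DEG = 89.8     # mean value from Figure 5.54
SIGMA_DEG = 0.38  # standard deviation from Figure 5.54
# Hypothetical acceptance window, NOT taken from the specification.
LOW_DEG, HIGH_DEG = 89.0, 91.0

share = fraction_within(LOW_DEG, HIGH_DEG, MU_DEG, SIGMA_DEG)
print(f"fraction within [{LOW_DEG}, {HIGH_DEG}] deg: {share:.3f}")
```

Because the mean lies slightly below $90^{\circ}$, almost all samples outside such a window would be located at its lower edge.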

The phase noise distributions for the in-phase and quadrature outputs are shown in Figures 5.57 and 5.58. The phase noise is clearly not distributed normally. The noise on the outputs essentially is the sum of the squared contributions of the individual noise sources. Assuming that these individual contributions are distributed normally, the resulting distribution is similar to a $\chi$-squared distribution [110], as shown in Figures 5.57 and 5.58.
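The link between normally distributed contributions and a $\chi$-squared shaped result can be illustrated numerically. The following Python sketch is not the Monte Carlo simulation of the circuit; it merely sums the squares of `K` independent standard normal variables, where `K = 4` is a hypothetical number of noise contributions chosen for illustration, and checks that the result is asymmetric.

```python
import random
import statistics


def sum_of_squares_samples(k, n, seed=0):
    """Draw n samples of the sum of k squared standard normal variables."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n)]


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


K = 4      # hypothetical number of independent noise contributions
N = 20000  # number of random trials

samples = sum_of_squares_samples(K, N)
print(f"mean = {statistics.fmean(samples):.2f} (theory: {K})")
print(f"skewness = {skewness(samples):.2f} (theory: {(8 / K) ** 0.5:.2f})")
```

A $\chi$-squared distribution with $k$ degrees of freedom has mean $k$ and skewness $\sqrt{8/k}$, so the positive skewness reported by the sketch reflects the kind of asymmetry that a normal distribution cannot show.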

Finally, the distribution of the current consumption is shown in Figure 5.59.


Figure 5.57: Distribution of the in-phase phase noise at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=-159.4 \mathrm{dBc} / \mathrm{Hz}$ and the standard deviation is $\sigma=0.4 \mathrm{dBc} / \mathrm{Hz}, N=1000$


Figure 5.58: Distribution of the quadrature phase noise at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=-161.9 \mathrm{dBc} / \mathrm{Hz}$ and the standard deviation is $\sigma=0.9 \mathrm{dBc} / \mathrm{Hz}, N=1000$


Figure 5.59: Distribution of the current consumption at 2 GHz and 1.1 V in the nominal corner, the mean value is $\mu=3.2 \mathrm{~mA}$ and the standard deviation is $\sigma=0.1 \mathrm{~mA}, N=1000$

### 5.4.5 Figure of Merit

The circuit presented in this work is compared to similar circuits presented in literature. To make a fair comparison, a figure of merit (FoM) is introduced that combines the main parameters of interest. This FoM was originally used to compare the phase noise performance of oscillators [111]:

$$
\begin{equation*}
\mathrm{FoM}=\mathcal{L}(\Delta f)-20 \log _{10}\left(\frac{f_{0}}{\Delta f}\right)+10 \log _{10}\left(\frac{P}{1 \mathrm{~mW}}\right) \tag{5.1}
\end{equation*}
$$

where $f_{0}$ is the carrier frequency, $\Delta f$ is the frequency offset, $\mathcal{L}(\Delta f)$ is the phase noise at this offset and $P$ is the power consumption of the circuit. This FoM was chosen because it combines the circuit's main specifications, namely phase noise, operating frequency and power consumption, in a single number.
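As a worked example, Equation (5.1) can be evaluated with a few lines of Python. The function name `figure_of_merit` and its argument names are illustrative assumptions; the numeric inputs are the values listed for this work in Table 5.2 (carrier frequency 2 GHz, power 4.0 mW).

```python
import math


def figure_of_merit(phase_noise_dbc_hz, carrier_hz, offset_hz, power_mw):
    """Evaluate Equation (5.1); more negative values are better."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(carrier_hz / offset_hz)
            + 10.0 * math.log10(power_mw / 1.0))


# Phase noise of this work at three offsets, as listed in Table 5.2.
this_work = [(1e6, -148.3), (10e6, -156.0), (100e6, -159.0)]

for offset_hz, noise in this_work:
    fom = figure_of_merit(noise, 2.0e9, offset_hz, 4.0)
    print(f"offset {offset_hz / 1e6:.0f} MHz: FoM = {fom:.1f}")
```

The sketch reproduces the FoM values of -208.3, -196.0 and -179.0 given for this work in Table 5.2. Note that the normalization to the offset frequency favors results quoted at small offsets, so FoM values are best compared at equal $\Delta f$.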

The number of publications presenting similar solutions and also providing all the information required to calculate the FoM was very limited at the time this thesis was written. In addition, a company-internal evaluation of the phase corrector (see Section 3.4), which was implemented in the very same technology as the presented circuit, is listed for comparison. The results are listed in Table 5.2.

|  | this work ${ }^{\text {a }}$ |  |  | [112] ${ }^{\text {a }}$ | [113] | [114] | [77] ${ }^{\text {a }}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| circuit principle | - |  |  | $\mathrm{ILRO}^{\text {b }}$ | PPF ${ }^{\text {c }}$ | PPC ${ }^{\text {d }}$ | PC ${ }^{\text {e }}$ |
| frequency range (GHz) | $1.7 \sim 2.7$ |  |  | $4.23 \sim 4.77$ | $2.4 \sim 2.5$ | $9.9 \sim 14.6$ | - |
| frequency (GHz) | 2.0 |  |  | 4.5 | 2.45 | 11.75 | 2.0 |
| technology | 28 nm |  |  | 180 nm | $250 \mathrm{~nm}^{\text {f }}$ | 28 nm | 28 nm |
| area | $3269 \mu \mathrm{~m}^{2}$ |  |  | - | $0.4 \mathrm{~mm}^{2}$ | $0.04 \mathrm{~mm}^{2}$ | - |
| phase noise (dBc/Hz) | -148.3 | -156.0 | -159.0 | -130.9 | -128.0 | -118.0 | - |
| at (MHz) | 1 | 10 | 100 | 1 | 1 | 10 | - |
| \# output phases | 4 |  |  | 8 | 4 | 4 | 8 |
| power (mW) | 4.0 |  |  | 4.25 | 12.5 | $3.1{ }^{\text {a }}$ | - |
| FoM (dBF) | -208.3 | -196.0 | -179.0 | -197.7 | -184.8 | -166.5 | -172.3 |

Table 5.2: Comparison of the main performance parameters to solutions presented in literature
${ }^{\mathrm{a}}$ simulation results only; ${ }^{\mathrm{b}}$ Injection Locked Ring Oscillator, see Section 3.3; ${ }^{\mathrm{c}}$ Polyphase Filter; ${ }^{\mathrm{d}}$ Parametric Pumped Capacitor, with LC resonator; ${ }^{\mathrm{e}}$ Phase Corrector, see Section 3.4; ${ }^{\mathrm{f}}$ 250 nm SiGe BiCMOS

As can be seen, the presented circuit exhibits superior FoMs compared to the other solutions. Furthermore, the proposed approach requires considerably less area, as no integrated inductors, large resistors or large capacitor arrays are needed.

## 6 Conclusion

The scope of this thesis was to investigate the generation of quadrature LO phases from a single differential LO input, with the aim of reducing the power dissipation and area compared to more standard LO routing techniques. The target applications are wireless communication systems. To demonstrate the approach, a circuit was designed to the point of a complete functional layout.

Area and power consumption are extremely important in wireless communication systems, particularly in battery-operated mobile devices. These two aspects, combined with the characteristics of the very deep sub-micron technology employed, the low supply voltage used and the tight specification limits dictated by the wireless communication standards, pose extreme challenges in the design of analog RF front-ends. Existing solutions for quadrature LO generation in the literature hardly meet the stringent requirements of wireless transceivers regarding phase noise, area and power consumption.

In this work, a rather simple circuit concept intended to overcome the shortcomings of current designs is presented. The proposed solution exploits the inherent $180^{\circ}$ phase shift of the differential input signal and is only limited by device matching and by the ability to accurately and linearly interpolate phase shifted LO signals. Therefore, it is highly suitable for operation under the stringent wireless communication specifications.

A circuit implementation of the presented concept was designed based on realistic specifications derived from a next-generation wireless transceiver product. The circuit comprises a delay chain for phase shifting and phase interpolators. Circuit components were chosen to be as simple as possible, because the power, noise and area penalties of more complex realizations were unacceptable.

During the design stage, the phase interpolator proved to be the critical element. As already mentioned, the functionality and accuracy of the circuit concept rely on the linear and accurate phase interpolation. The performance of the phase interpolation depends on the accuracy of the phase shift matching to the operating frequency over all process, supply voltage and temperature variations.

To overcome this limitation, a control loop that adjusts the phase shifts to the frequency of the input signal was introduced. Process variations as well as supply voltage and temperature influences are thereby inherently compensated. After this adjustment, the circuit is operated in an open-loop configuration again.

The circuit was designed and implemented in a standard high-performance 28 nm CMOS technology. Since the circuit is robust against device mismatch, the layout was planned targeting simplicity and symmetry, which resulted in a very compact block.

The full functionality of the circuit principle was verified with extensive simulations on schematic level and with post-layout extractions, across corners, supply voltages and temperature variations. The target specifications were all met in simulation. Experimental characterization will be carried out once the design is implemented in silicon.

In simulation, the implemented circuit outperforms the solutions presented in literature.

For future explorations of the presented circuit principle, it might be worth focusing on improving the two main building blocks, namely the delay element and the phase interpolator. Since the two long delays are the main contributors to the power consumption, a more power-efficient delay element would be a natural starting point. The phase interpolation also provides room for improvement: an implementation which does not require overlapping signal transitions (if possible at all) would simplify the circuit and possibly also decrease the power consumption.

In summary, the outcome of this thesis is a quadrature LO generator that can be readily used in a next-generation mobile wireless transceiver product.

## Bibliography

[1] ETSI TS 125306 V12.3.0 (2014-09), Sep. 2014. [Online]. Available: http://www.etsi.org/deliver/etsi_ts/125300_125399/125306/12.03.00_60/ts_125306v120300p.pdf (cit. on p. 1).
[2] J. Wannstrom. HSPA, [Online]. Available: http://www.3gpp.org/technologies/keywords-acronyms/99-hspa (visited on 11/05/2014) (cit. on p. 1).
[3] M. Nohborg. LTE, [Online]. Available: http://www.3gpp.org/technologies/keywords-acronyms/98-lte (visited on 10/30/2014) (cit. on p. 1).
[4] J. Wannstrom. (Jun. 2013). LTE-Advanced, [Online]. Available: http://www.3gpp.org/technologies/keywords-acronyms/97-lte-advanced (visited on 11/05/2014) (cit. on p. 1).
[5] P. Desgreys, F. Ghanem, G. Pham, H. Fakhoury, and P. Loumeau, "Beyond 3G wideband and high linearity ADCs," in Faible Tension Faible Consommation (FTFC), 2011, May 2011, pp. 59-62 (cit. on p. 1).
[6] S. Rodriguez, A. Rusu, and M. Ismail, "4G CMOS nanometer receivers for mobile systems: challenges and solutions," in International Symposium on Signals, Circuits and Systems, 2009. ISSCS 2009, Jul. 2009, pp. 1-4 (cit. on p. 1).
[7] L. L. Lewyn, T. Ytterdal, C. Wulff, and K. Martin, "Analog circuit design in nanoscale CMOS technologies," Proceedings of the IEEE, vol. 97, no. 10, pp. 1687-1714, Oct. 2009 (cit. on p. 1).
[8] W. Sansen, "Analog design challenges in nanometer CMOS technologies," in Solid-State Circuits Conference, 2007. ASSCC '07. IEEE Asian, Nov. 2007, pp. 5-9 (cit. on p. 1).
[9] M. Vertregt, "The analog challenge of nanometer CMOS," in Electron Devices Meeting, 2006. IEDM '06. International, Dec. 2006, pp. 1-8 (cit. on p. 1).
[10] S.-X. Ng, T. Keller, and W. Webb, Quadrature amplitude modulation: From basics to adaptive trellis-coded, turbo-equalised and space-time coded OFDM, CDMA and MC-CDMA systems, 2nd ed. Chichester: Hoboken, N.J: John Wiley \& Sons, 2004, 1136 pp. (cit. on pp. 1, 11, 13).
[11] L. B. Oliveira, J. R. Fernandes, I. M. Filanovsky, C. J. M. Verhoeven, and M. M. Silva, Analysis and design of quadrature oscillators. Dordrecht: Springer Netherlands, 2008. [Online]. Available: http://rd.springer.com/book/10.1007/978-1-4020-8516-1 (visited on 01/28/2015) (cit. on pp. 1, 13, 14).
[12] R. Lyons, Quadrature signals: Complex, but not complicated, Jan. 2008. [Online]. Available: http://www.ieee.li/pdf/essay/quadrature_signals.pdf (cit. on pp. 2, 3).
[13] M. Ingels, V. Giannini, J. Borremans, G. Mandal, B. Debaillie, P. Van Wesemael, T. Sano, T. Yamamoto, D. Hauspie, J. Van Driessche, and J. Craninckx, "A 5mm$^{2}$ 40nm LP CMOS 0.1-to-3GHz multistandard transceiver," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International, Feb. 2010, pp. 458-459 (cit. on p. 2).
[14] S. Tadjpour, P. Rossi, L. Romano, R. Chokkalingam, H. Firouzkouhi, F. Shi, M. Leroux, D. Gerna, A. Venca, J. Vasa, B. Ramachandran, B. Brunn, A. Pirola, D. Ottini, A. Milani, E. Sacchi, M. Behera, X. Chen, U. Decanis, M. Tedeschi, S. DalToso, W. Eyssa, C. Cakir, C. Prakash, Y. He, N. Damavandi, R. Srinivasan, D. Shum, X. Fan, C. Yu, E. Pehlivanoglu, H. Zarei, A. Loke, G. Uehara, R. Castello, and Y. Song, "A multi-band Rel9 WCDMA/HSDPA/TDD LTE and FDD LTE transceiver with envelope tracking," in European Solid State Circuits Conference (ESSCIRC), ESSCIRC 2014-40th, Sep. 2014, pp. 383-386 (cit. on p. 2).
[15] X. Jiang, X. Yu, F. Lin, F. Cheung, M. Inerfield, K. Li, A. Kamath, H. Mehta, J. Duan, J. Yang, G. Krishnamurthy, S. Ranganathan, D. Cheung, N. R. K. Damaraju, J. Chen, D. Lu, V. Jayakumar, L. Wang, D. Soltesz, H. Kong, M. Zhang, and D. Chang, "A 28 nm analog and audio mixed-signal front end for 4G/LTE cellular system-on-chip," in European Solid State Circuits Conference (ESSCIRC), ESSCIRC 2014-40th, Sep. 2014, pp. 471-474 (cit. on p. 2).
[16] R. Roufoogaran, T. Li, A. Ojo, S. Cheng, C. Lee, S. Mahadeva, P. Shetter, and A. Behzad, "A compact and power efficient local oscillator generation and distribution system for complex multi radio systems," in IEEE Radio Frequency Integrated Circuits Symposium, 2008. RFIC 2008, Jun. 2008, pp. 277-280 (cit. on p. 2).
[17] A. Behzad, K. Carter, E. Chien, S. Wu, M. Pan, C. Lee, T. Li, J. Leete, S. Au, M. Kappes, Z. Zhou, D. Ojo, L. Zhang, A. Zolfaghari, J. Castanada, H. Darabi, B. Yeung, R. Rofougaran, M. Rofougaran, J. Trachewsky, T. Moorti, R. Gaikwad, A. Bagchi, J. Rael, and B. Marholev, "A fully integrated MIMO multi-band direct-conversion CMOS transceiver for WLAN applications (802.11n)," in Solid-State Circuits Conference, 2007. ISSCC 2007. Digest of Technical Papers. IEEE International, Feb. 2007, pp. 560-622 (cit. on p. 2).
[18] Y.-C. Choi, S.-S. Yoo, and H.-J. Yoo, "A fully digital polar transmitter using a digital-to-time converter for high data rate system," in IEEE International Symposium on Radio-Frequency Integration Technology, 2009. RFIT 2009, Jan. 2009, pp. 56-59 (cit. on p. 3).
[19] B. D. Van Veen and K. M. Buckley, "Beamforming: A versatile approach to spatial filtering," IEEE ASSP Magazine, vol. 5, no. 2, pp. 4-24, Apr. 1988 (cit. on p. 3).
[20] B. Razavi, Design of analog CMOS integrated circuits. Boston, MA: Mcgraw Hill Book Co, Sep. 2000, 704 pp. (cit. on pp. 5, 27).
[21] A. Hajimiri and T. H. Lee, "A general theory of phase noise in electrical oscillators," IEEE Journal of Solid-State Circuits, vol. 33, no. 2, pp. 179-194, Feb. 1998 (cit. on pp. 5, 7).
[22] T. H. Lee, The design of CMOS radio-frequency integrated circuits, 2nd ed. Cambridge, UK ; New York: Cambridge University Press, Dec. 2003, 816 pp. (cit. on pp. 7, 9, 10, 23).
[23] G. Hueber and R. B. Staszewski, Multi-mode/multi-band RF transceivers for wireless communications: advanced techniques, architectures, and trends, 1st ed. New York: John Wiley \& Sons, Jan. 2011, 608 pp. (cit. on p. 8).
[24] M. Kaltiokallio, "Integrated radio frequency circuits for wideband receivers," PhD thesis, Aalto University, Helsinki, 2014. [Online]. Available: http://lib.tkk.fi/Diss/2014/isbn9789526056166/isbn9789526056166.pdf (cit. on p. 8).
[25] H. Rabén, Receiver front-end design for WiMAX/LTE in 90 nm CMOS. 2009. [Online]. Available: http://www.diva-portal.org/smash/record.jsf?searchId=4\%5C\&pid=diva2:277389 (visited on 10/17/2014) (cit. on p. 8).
[26] R. J. Betancourt-Zamora, "Injection-locked ring oscillator frequency dividers," PhD thesis, Stanford University. Dept. of Electrical Engineering, Stanford, 2005, 128 pp. [Online]. Available: http://betasoft.org/wordpress/wp-content/uploads/2011/11/thesis.pdf (cit. on pp. 9, 23).
[27] D. B. Leeson, "A simple model of feedback oscillator noise spectrum," Proceedings of the IEEE, vol. 54, no. 2, pp. 329-330, Feb. 1966 (cit. on p. 10).
[28] C. R. Cahn, "Combined digital phase and amplitude modulation communication systems," IRE Transactions on Communications Systems, vol. 8, no. 3, pp. 150-155, Sep. 1960 (cit. on pp. 10, 12).
[29] E. McCune and W. Sander, "EDGE transmitter alternative using nonlinear polar modulation," in Proceedings of the 2003 International Symposium on Circuits and Systems, 2003. ISCAS '03, vol. 3, May 2003 (cit. on p. 12).
[30] A. Hadjichristos, "Transmit architectures and power control schemes for low cost highly integrated transceivers for GSM/EDGE applications," in Proceedings of the 2003 International Symposium on Circuits and Systems, 2003. ISCAS '03, vol. 3, May 2003 (cit. on p. 12).
[31] N. Zimmermann, R. Negra, and S. Heinen, "Design of an RF-DAC in 65 nm CMOS for multistandard, multimode transmitters," in IEEE International Symposium on Radio-Frequency Integration Technology, 2009. RFIT 2009, Jan. 2009, pp. 343-346 (cit. on p. 13).
[32] A. Kuckreja. (May 2012). Implementing a direct RF transmitter for wireless communications, [Online]. Available: http://www.maximintegrated.com/en/app-notes/index.mvp/id/5317 (visited on 10/15/2014) (cit. on p. 13).
[33] L. Anttila, P. Handel, and M. Valkama, "Joint mitigation of power amplifier and I/Q modulator impairments in broadband direct-conversion transmitters," IEEE Transactions on Microwave Theory and Techniques, vol. 58, no. 4, pp. 730-739, Apr. 2010 (cit. on p. 13).
[34] A. Lohtia, P. A. Goud, and C. G. Englefield, "An adaptive digital technique for compensating for analog quadrature modulator/demodulator impairments," in IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 1993, vol. 2, May 1993, pp. 447-450 (cit. on p. 13).
[35] G. Fettweis, M. Löhning, D. Petrovic, M. Windisch, P. Zillmann, and W. Rave, "Dirty RF: a new paradigm," International Journal of Wireless Information Networks, vol. 14, no. 2, pp. 133-148, Jun. 2007. (visited on 10/24/2014) (cit. on p. 13).
[36] M. Valkama, Advanced I/Q signal processing for wideband receivers: models and algorithms. Nov. 2001. [Online]. Available: http://dspace.cc.tut.fi/dpub/handle/123456789/102 (visited on 10/24/2014) (cit. on pp. 13-15).
[37] J. H. R. Schrader, Wireline equalization using pulse-width modulation. JanRutger Schrader, 166 pp. (cit. on p. 14).
[38] D. Fu, "A simultaneous TX and RX I/Q imbalance calibration method," in IEEE International Symposium on Circuits and Systems, 2008. ISCAS 2008, May 2008, pp. 1264-1267 (cit. on p. 16).
[39] S. Simoens, M. de Courville, F. Bourzeix, and P. de Champs, "New I/Q imbalance modeling and compensation in OFDM systems with frequency offset," in The 13th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, 2002, vol. 2, Sep. 2002, pp. 561-566 (cit. on p. 16).
[40] J. P. F. Glas, "Digital I/Q imbalance compensation in a low-IF receiver," in IEEE Global Telecommunications Conference, 1998. GLOBECOM 1998. The Bridge to Global Integration, vol. 3, 1998, pp. 1461-1466 (cit. on p. 16).
[41] M. Valkama, M. Renfors, and V. Koivunen, "Advanced methods for I/Q imbalance compensation in communication receivers," IEEE Transactions on Signal Processing, vol. 49, no. 10, pp. 2335-2344, Oct. 2001 (cit. on p. 16).
[42] L. Qianqian, Z. Erhu, Y. Fang, L. Min, L. Lianbi, and F. Song, "I/Q mismatch calibration based on digital baseband," Journal of Semiconductors, vol. 34, no. 7, Jul. 2013. (visited on 10/24/2014) (cit. on p. 16).
[43] D. Taggart and R. Kumar, "Impact of phase noise on the performance of the QPSK modulated signal," in 2011 IEEE Aerospace Conference, Mar. 2011, pp. 1-10 (cit. on p. 17).
[44] (Nov. 2001). Modeling phase noise leads to lower BER in fixed wireless design, EE Times, [Online]. Available: http://www.eetimes.com/document.asp?doc_id=1225313 (visited on 10/24/2014) (cit. on p. 17).
[45] D. Barker, "The effects of phase noise on high-order QAM systems," Communication Systems Design, Oct. 1999. [Online]. Available: http://ichannel-design.com/documents/EffectsOfPhaseNoiseOnHighOrderQAM.pdf (cit. on p. 17).
[46] Z. Chen and F. F. Dai, "Effects of LO phase and amplitude imbalances and phase noise on QAM transceiver performance," IEEE Transactions on Industrial Electronics, vol. 57, no. 5, pp. 1505-1517, May 2010 (cit. on p. 17).
[47] R. Corvaja and S. Pupolin, "Phase noise effects in QAM systems," in The 8th IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, 1997. Waves of the Year 2000. PIMRC '97, vol. 2, Sep. 1997, pp. 452-456 (cit. on p. 17).
[48] M. A. Tariq, H. Mehrpouyan, and T. Svensson, "Performance of circular QAM constellations with time varying phase noise," in 2012 IEEE 23rd International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC), Sep. 2012, pp. 2365-2370 (cit. on p. 17).
[49] G. J. Foschini, R. Gitlin, and S. Weinstein, "Optimization of two-dimensional signal constellations in the presence of Gaussian noise," IEEE Transactions on Communications, vol. 22, no. 1, pp. 28-38, Jan. 1974 (cit. on p. 17).
[50] F. Behbahani, A. Fotowat-Ahmady, S. Navid, R. Gaethke, and M. Delurio, "An adjustable bipolar quadrature LO generator with an improved divide-by-2 stage," in Bipolar/BiCMOS Circuits and Technology Meeting, 1996., Proceedings of the 1996, Sep. 1996, pp. 157-160 (cit. on pp. 19, 20).
[51] T. Hornak, K. L. Knudsen, A. Z. Grzegorek, K. A. Nishimura, and W. J. McFarland, "An image-rejecting mixer and vector filter with 55-dB image rejection over process, temperature, and transistor mismatch," IEEE Journal of Solid-State Circuits, vol. 36, no. 1, pp. 23-33, Jan. 2001 (cit. on pp. 19, 20).
[52] S. J. Fang, A. Bellaouar, S. T. Lee, and D. J. Allstot, "An image-rejection down-converter for low-IF receivers," IEEE Transactions on Microwave Theory and Techniques, vol. 53, no. 2, pp. 478-487, Feb. 2005 (cit. on pp. 19, 20).
[53] J. J. Spilker and D. T. Magill, "The delay-lock discriminator-An optimum tracking device," Proceedings of the IRE, Proceedings of the IRE, vol. 49, no. 9, pp. 1403-1416, Sep. 1961 (cit. on p. 21).
[54] A. G. Lindgren, R. F. Pinkos, and M. E. Schumacher, "Theory and noise dynamics of the delay-locked loop," IEEE Transactions on Geoscience Electronics, IEEE Transactions on Geoscience Electronics, vol. 8, no. 1, pp. 30-40, Jan. 1970 (cit. on p. 21).
[55] L. Estes and E. O'Neill, "Dynamics and stability of a delay-locked loop," IEEE Transactions on Automatic Control, vol. 21, no. 4, pp. 564-567, Aug. 1976 (cit. on p. 21).
[56] W. Rhee, H. Ainspan, S. Rylov, A. Rylyakov, M. Beakes, D. Friedman, S. Gowda, and M. Soyuer, "A 10-Gb/s CMOS clock and data recovery circuit using a secondary delay-locked loop," in Custom Integrated Circuits Conference, 2003. Proceedings of the IEEE 2003, Sep. 2003, pp. 81-84 (cit. on p. 21).
[57] V. Kumar and M. Khosla, "Design of a low power delay locked loop based clock and data recovery circuit," in 2011 Annual IEEE India Conference (INDICON), Dec. 2011, pp. 1-4 (cit. on p. 21).
[58] T. Kim and B. Kim, "Phase interpolator using delay locked loop [multiphase clock generation]," in Southwest Symposium on Mixed-Signal Design, 2003, Feb. 2003, pp. 76-80 (cit. on p. 21).
[59] Y. Moon, J. Choi, K. Lee, D.-K. Jeong, and M.-K. Kim, "An all-analog multiphase delay-locked loop using a replica delay line for wide-range operation and low-jitter performance," IEEE Journal of Solid-State Circuits, vol. 35, no. 3, pp. 377-384, Mar. 2000 (cit. on p. 21).
[60] M. Zlatanski, W. Uhring, J.-P. Le Normand, and D. Mathiot, "A fully characterizable asynchronous multiphase delay generator," IEEE Transactions on Nuclear Science, vol. 58, no. 2, pp. 418-425, Apr. 2011 (cit. on p. 21).
[61] M.-H. Chang, L.-P. Chuang, I.-M. Chang, and W. Hwang, "A 300-mV 36-uW multiphase dual digital clock output generator with self-calibration," in SOC Conference, 2008 IEEE International, Sep. 2008, pp. 97-100 (cit. on p. 21).
[62] J. Craninckx, V. Gravot, and S. Donnay, "A harmonic quadrature LO generator using a 90° delay-locked loop [zero-IF transceiver applications]," in Solid-State Circuits Conference, 2004. ESSCIRC 2004. Proceeding of the 30th European, Sep. 2004, pp. 127-130 (cit. on p. 21).
[63] A. Elshazly, A. Balankutty, Y.-Y. Huang, Y. Kai, and F. O'Mahony, "A 2 GHz-to-7.5 GHz quadrature clock-generator using digital delay locked loops for multi-standard I/Os in 14 nm CMOS," in 2014 Symposium on VLSI Circuits Digest of Technical Papers, Jun. 2014, pp. 1-2 (cit. on pp. 21, 22).
[64] V. Michal, "On the low-power design, stability improvement and frequency estimation of the CMOS ring oscillator," in Radioelektronika (RADIOELEKTRONIKA), 2012 22nd International Conference, Apr. 2012, pp. 1-4 (cit. on p. 22).
[65] R. Navid, T. H. Lee, and R. W. Dutton, "Minimum achievable phase noise of RC oscillators," IEEE Journal of Solid-State Circuits, vol. 40, no. 3, pp. 630-637, Mar. 2005 (cit. on p. 23).
[66] P. Andreani, X. Wang, L. Vandi, and A. Fard, "A study of phase noise in Colpitts and LC-tank CMOS oscillators," IEEE Journal of Solid-State Circuits, vol. 40, no. 5, pp. 1107-1118, May 2005 (cit. on p. 23).
[67] E. Hegazi, H. Sjoland, and A. A. Abidi, "A filtering technique to lower LC oscillator phase noise," IEEE Journal of Solid-State Circuits, vol. 36, no. 12, pp. 1921-1930, Dec. 2001 (cit. on p. 23).
[68] B. Razavi, "A study of injection locking and pulling in oscillators," IEEE Journal of Solid-State Circuits, vol. 39, no. 9, pp. 1415-1424, Sep. 2004 (cit. on p. 23).
[69] (Sep. 2012). Synchronization of thirty two metronomes - YouTube, [Online]. Available: http://www.youtube.com/watch?v=JWToUATLGzs (visited on 10/15/2014) (cit. on p. 23).
[70] T. Djurhuus and V. Krozer, "Theory of injection-locked oscillator phase noise," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 2, pp. 312-325, Feb. 2011 (cit. on p. 23).
[71] N. Kanemaru, S. Ikeda, T. Kamimura, S.-y. Lee, S. Tanoi, H. Ito, N. Ishihara, and K. Masu, "A ring-VCO-based injection-locked frequency multiplier using a new pulse generation technique in 65 nm CMOS," in SoC Design Conference (ISOCC), 2011 International, Nov. 2011, pp. 32-35 (cit. on p. 23).
[72] K. Takano, M. Motoyoshi, and M. Fujishima, "4.8 GHz CMOS frequency multiplier with subharmonic pulse-injection locking," in Solid-State Circuits Conference, 2007. ASSCC '07. IEEE Asian, Nov. 2007, pp. 336-339 (cit. on p. 23).
[73] H.-T. Ng, R. Farjad-Rad, M.-J. E. Lee, W. Dally, T. Greer, J. Poulton, J. H. Edmondson, R. Rathi, and R. Senthinathan, "A second-order semidigital clock recovery circuit based on injection locking," IEEE Journal of Solid-State Circuits, vol. 38, no. 12, pp. 2101-2110, Dec. 2003 (cit. on p. 23).
[74] R. J. Betancourt-Zamora, S. Verma, and T. H. Lee, "1-GHz and 2.8-GHz CMOS injection-locked ring oscillator prescalers," in 2001 Symposium on VLSI Circuits, 2001. Digest of Technical Papers, Jun. 2001, pp. 47-50 (cit. on p. 23).
[75] K.-h. Kim, Y.-S. Sohn, C.-K. Kim, M. Park, D.-J. Lee, W.-S. Kim, and C. Kim, "A 20-Gb/s 256-Mb DRAM with an inductorless quadrature PLL and a cascaded pre-emphasis transmitter," IEEE Journal of Solid-State Circuits, vol. 41, no. 1, pp. 127-134, Jan. 2006 (cit. on p. 24).
[76] K.-h. Kim, P. W. Coteus, D. Dreps, S. Kim, S. V. Rylov, and D. J. Friedman, "A 2.6 mW 370 MHz-to-2.5 GHz open-loop quadrature clock generator," in Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International, Feb. 2008, pp. 458-627 (cit. on p. 24).
[77] R. T. Yazicigil, "Multiphase LO generator design review," Company Internal Presentation, Aug. 2014 (cit. on pp. 24, 88).
[78] H. Song, The arts of VLSI circuit design. Xlibris Corporation, Mar. 2011, 437 pp. (cit. on p. 44).
[79] B.-S. Chang, G. Kim, and W. Kim, "A low voltage low power CMOS delay element," in Solid-State Circuits Conference, 1995. ESSCIRC '95. Twenty-first European, Sep. 1995, pp. 222-225 (cit. on p. 43).
[80] P. Chu, Y. Zhang, Z. Wen, and L. Yu, "A monotonic low power thyristor-based CMOS delay element," in International Conference on Microwave and Millimeter Wave Technology, 2007. ICMMT '07, Apr. 2007, pp. 1-4 (cit. on p. 43).
[81] J. Al-Eryani, A. Stanitzki, K. Konrad, N. Tavangaran, D. Bruckmann, and R. Kokozinski, "Low-power area-efficient delay element with a wide delay range," in 2012 19th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Dec. 2012, pp. 717-720 (cit. on p. 43).
[82] J.-L. Yang, C.-W. Chao, and S.-M. Lin, "Tunable delay element for low power VLSI circuit design," in TENCON 2006. 2006 IEEE Region 10 Conference, Nov. 2006, pp. 1-4 (cit. on p. 43).
[83] M. Kurchuk and Y. Tsividis, "Energy-efficient asynchronous delay element with wide controllability," in Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS), May 2010, pp. 3837-3840 (cit. on p. 43).
[84] N. R. Mahapatra, S. V. Garimella, and A. Tareen, "An empirical and analytical comparison of delay elements and a new delay element design," in IEEE Computer Society Workshop on VLSI, 2000. Proceedings, 2000, pp. 81-86 (cit. on p. 43).
[85] M. Moazedi, A. Abrishamifar, and A. M. Sodagar, "A highly-linear modified pseudo-differential current starved delay element with wide tuning range," in 2011 19th Iranian Conference on Electrical Engineering (ICEE), May 2011, pp. 1-4 (cit. on p. 43).
[86] H.-Y. Huang and J.-H. Shen, "A DLL-based programmable clock generator using threshold-trigger delay element and circular edge combiner," in Proceedings of 2004 IEEE Asia-Pacific Conference on Advanced System Integrated Circuits 2004, Aug. 2004, pp. 76-79 (cit. on p. 43).
[87] V. Adler and E. G. Friedman, "Delay and power expressions for a CMOS inverter driving a resistive-capacitive load," in 1996 IEEE International Symposium on Circuits and Systems, 1996. ISCAS '96., Connecting the World, vol. 4, May 1996, pp. 101-104 (cit. on p. 44).
[88] S. Kumaki, A. H. Johari, T. Matsubara, I. Hayashi, and H. Ishikuro, "A 0.5 V 6-bit scalable phase interpolator," in 2010 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), Dec. 2010, pp. 1019-1022 (cit. on p. 47).
[89] A. Nicholson, J. Jenkins, A. van Schaik, T. J. Hamilton, and T. Lehmann, "A 1.2 V 2-bit phase interpolator for 65 nm CMOS," in 2012 IEEE International Symposium on Circuits and Systems (ISCAS), May 2012, pp. 2039-2042 (cit. on p. 47).
[90] B. W. Garlepp, K. S. Donnelly, J. Kim, P. S. Chau, J. L. Zerbe, C. Huang, C. V. Tran, L. Portmann, D. Stark, Y.-F. Chan, T. H. Lee, and M. A. Horowitz, "A portable digital DLL for high-speed CMOS interface circuits," IEEE Journal of Solid-State Circuits, vol. 34, no. 5, pp. 632-644, May 1999 (cit. on p. 47).
[91] K. Pagiamtzis, ECE1352 Analog Integrated Circuits Reading Assignment: Phase Interpolating Circuits. 2001 (cit. on p. 47).
[92] R. Kreienkamp, U. Langmann, C. Zimmermann, T. Aoyama, and H. Siedhoff, "A 10-Gb/s CMOS clock and data recovery circuit with an analog phase interpolator," IEEE Journal of Solid-State Circuits, vol. 40, no. 3, pp. 736-743, Mar. 2005 (cit. on p. 47).
[93] T. H. Lee, K. S. Donnelly, J. T. C. Ho, J. Zerbe, M. G. Johnson, and T. Ishikawa, "A 2.5 V CMOS delay-locked loop for 18 Mbit, 500 megabyte/s DRAM," IEEE Journal of Solid-State Circuits, vol. 29, no. 12, pp. 1491-1496, Dec. 1994 (cit. on p. 47).
[94] M. Benyahia, J. B. Moulard, F. Badets, A. Mestassi, T. Finateu, L. Vogt, and F. Boissieres, "A digitally controlled 5 GHz analog phase interpolator with 10 GHz LC PLL," in International Conference on Design Technology of Integrated Systems in Nanoscale Era, 2007. DTIS, Sep. 2007, pp. 130-135 (cit. on pp. 47, 48).
[95] K. Bhardwaj and T. H. Lee, "A 0.96 mW, 5.3-6.75 GHz, phase-interpolation and quadrature-generation method using parametric energy transfer in 65 nm CMOS," in 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Jun. 2014, pp. 2145-2148 (cit. on p. 47).
[96] T. H. Cormen, Introduction to algorithms. MIT, 2001, 1216 pp. (cit. on p. 50).
[97] M.-F. Lan, A. Tammineedi, and R. Geiger, "A new current mirror layout technique for improved matching characteristics," in 42nd Midwest Symposium on Circuits and Systems, 1999, vol. 2, 1999, pp. 1126-1129 (cit. on p. 53).
[98] R. A. Hastings, The art of analog layout. Pearson Prentice Hall, 2006, 680 pp. (cit. on p. 53).
[99] (Jan. 2015). 3269 square micrometers, Wolfram-Alpha, [Online]. Available: http://www.wolframalpha.com/input/?i=3269+square+micrometers (visited on 01/13/2015) (cit. on p. 58).
[100] Cadence Spectre Circuit Simulator, [Online]. Available: http://www.cadence.com/products/cic/spectre_circuit/pages/default.aspx (visited on 01/16/2015) (cit. on p. 59).
[101] Cadence Spectre RF Simulation, [Online]. Available: http://www.cadence.com/products/rf/spectre_rf_simulation/pages/default.aspx (visited on 01/16/2015) (cit. on p. 59).
[102] K. S. Kundert, "Introduction to RF simulation and its application," IEEE Journal of Solid-State Circuits, vol. 34, no. 9, pp. 1298-1319, Sep. 1999 (cit. on p. 59).
[103] Virtuoso® Spectre® Circuit Simulator RF Analysis Theory, Sep. 2011 (cit. on p. 59).
[104] Virtuoso® Spectre® Circuit Simulator and Accelerated Parallel Simulator RF Analysis User Guide, Sep. 2011 (cit. on p. 59).
[105] Cadence Quantus QRC Extraction Solution, [Online]. Available: http://www.cadence.com/products/di/quantus_qrc_extraction/pages/default.aspx (visited on 01/16/2015) (cit. on p. 59).
[106] M. Reinhardt, Automatic layout modification: Including design reuse of the Alpha CPU in 0.13 Micron SOI technology. Springer Science & Business Media, Jun. 2002, 250 pp. (cit. on p. 59).
[107] W. Kao, C.-Y. Lo, R. Singh, and M. Basel, "Parasitic extraction: current state of the art and future trends," in The 2001 IEEE International Symposium on Circuits and Systems, 2001. ISCAS 2001, vol. 5, 2001, pp. 487-490 (cit. on p. 59).
[108] D. Sitaram, Y. Zheng, and K. L. Shepard, "Full-chip, three-dimensional shapes-based RLC extraction," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, no. 5, pp. 711-727, May 2004 (cit. on p. 59).
[109] Cadence® Advanced Analysis Tools User Guide, Sep. 2011 (cit. on p. 83).
[110] S. M. Kay, Intuitive probability and random processes using MATLAB®. Boston, MA: Springer US, 2006. [Online]. Available: http://rd.springer.com/book/10.1007/b104645 (visited on 01/23/2015) (cit. on p. 85).
[111] R. Pokharel, P. Nugroho, A. Anand, K. Kanaya, and K. Yoshida, "Digitally controlled CMOS quadrature ring oscillator with improved FoM for GHz range all-digital phase-locked loop applications," in Microwave Symposium Digest (MTT), 2012 IEEE MTT-S International, Jun. 2012, pp. 1-3 (cit. on p. 87).
[112] K. Yousef, H. Jia, A. Allam, A. Anand, R. Pokharel, and T. Kaho, "An eight-phase CMOS injection locked ring oscillator with low phase noise," in 2014 IEEE International Conference on Ultra-WideBand (ICUWB), Sep. 2014, pp. 337-340 (cit. on p. 88).
[113] A. Valero-Lopez, S. T. Moon, and E. Sanchez-Sinencio, "Self-calibrated quadrature generator for WLAN multistandard frequency synthesizer," IEEE Journal of Solid-State Circuits, vol. 41, no. 5, pp. 1031-1041, May 2006 (cit. on p. 88).
[114] K. Bhardwaj, S. Narayan, S. Shumarayev, and T. Lee, "A 3.1mW phase-tunable quadrature-generation method for CEI 28G short-reach CDR in 28 nm CMOS," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International, Feb. 2013, pp. 412-413 (cit. on p. 88).


[^0]:    This is done purely to illustrate the cause of phase noise; a real noise signal is, of course, different. The effect is usually explained by injecting a current pulse into the tank of an LC oscillator.

[^1]:    In a real system, the communication channel introduces noise and distortion, which can be modeled, for example, as an additive error signal: $r(t)=s(t)+e(t)$.

