Analog Integrated Circuits

EleCannonic

Copyright Notice:

This article is licensed under CC BY-NC-SA 4.0.

Licensing Info:

Commercial use of this content is strictly prohibited. For more details on licensing policy, please visit the About page.

1. MOS Devices

1.1 General Considerations

A MOSFET comprises a gate (polysilicon), a substrate (P/N semiconductor), a source (N/P semiconductor) and a drain (N/P semiconductor). The source and drain are interchangeable due to their symmetry during fabrication. There are two types of MOSFET: If the substrate is made of a P-type semiconductor (and the source and drain are made of an N-type semiconductor), it is an NMOS device; conversely, if the substrate is made of an N-type semiconductor (and the source and drain are made of a P-type semiconductor), it is a PMOS device. A typical NMOS structure is shown below:

NMOS Structure

The lateral dimension of the gate along the source-drain path is called the length, $L$, and that perpendicular to the length is called the width, $W$. Because the source and drain diffuse sideways during fabrication (ion implantation), the actual length is slightly smaller than the drawn length $L_{drawn}$. The actual length is usually called the effective length, $L_{eff} = L_{drawn} - 2L_D$, where $L_D$ is the side-diffusion length.

In long-channel processes the diffusion length can be ignored, so we approximate $L_{eff} \approx L_{drawn}$. In the following chapters we denote the effective length as $L$ unless otherwise declared.

Since the source and drain are symmetric, we call the terminal that provides the carriers the source. For example, in an NMOS, the terminal with the lower voltage is the source because it provides the electrons that establish the current.

In reality, a MOSFET is a 4-terminal device; the fourth terminal is the substrate. In typical MOS operation the source/drain junctions must be reverse biased, so we assume the global p-substrate is connected to the most negative supply.

Substrate Connection for NMOS

For PMOS, however, the substrate connection is independent, because the device needs an n-well on the p-substrate.

n-well of a PMOS

In some modern processes, a deep n-well is made first and a p-well is made inside it to hold an NMOS, which decouples its substrate from other devices.

n-well of a PMOS

Generally, unless otherwise specified, the p-substrate of an NMOS is connected to the lowest supply (negative supply or GND) and the n-well of a PMOS is connected to the highest positive supply. The substrate terminal is then omitted from the symbol by default.

1.2 MOS I/V Characteristics

A MOSFET can behave as a switch. We now analyze how.

(a) A MOSFET driven by a gate voltage; (b) formation of depletion region; (c) onset of inversion; (d) formation of inversion layer.

Consider an NMOS connected to external voltages, with the source connected to GND. When the gate voltage $V_G$ increases from 0 while the substrate is connected to the most negative supply or GND, the vertical electric field at the surface of the substrate increases. The field pushes holes away from the surface and attracts electrons, leaving a region with no mobile holes, called the depletion region. When $V_G$ continues to increase and the surface is fully depleted, the additional electrons form a conducting channel because these electrons are free. At this point the MOS is on: if you apply a voltage between source and drain, current flows in the channel. This channel is called the inversion layer, and the gate voltage at which the channel forms is called the threshold voltage, denoted as $V_{TH}$. In other words, $V_{TH}$ is defined as the gate voltage for which the interface is “as much n-type as the substrate is p-type”.

In semiconductor physics, it can be proved that

$$
V_{TH} = \Phi_{MS} + 2\Phi_F + \dfrac{Q_{dep}}{C_{ox}}
$$

where $\Phi_{MS}$ is the difference between the work functions of the polysilicon gate and the silicon substrate, $\Phi_F = (kT/q)\ln(N_{sub}/n_i)$, $k$ is Boltzmann’s constant, $q$ is the electron charge, $N_{sub}$ is the doping density of the substrate, $n_i$ is the density of electrons in undoped silicon, $Q_{dep}$ is the charge in the depletion region, and $C_{ox}$ is the gate-oxide capacitance per unit area. From pn junction theory, $Q_{dep} = \sqrt{4q\epsilon_{si}|\Phi_F|N_{sub}}$, where $\epsilon_{si}$ denotes the dielectric constant of silicon. Since $C_{ox}$ appears very frequently in device and circuit calculations, it is helpful to remember that for $t_{ox} \approx 20$ Å, $C_{ox} \approx 17.25\ \text{fF}/\mu\text{m}^2$. The value of $C_{ox}$ can then be scaled proportionally for other oxide thicknesses.
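The scaling rule for the oxide capacitance can be checked with a few lines of Python. This is only a sketch: it assumes the parallel-plate relation $C_{ox} = \epsilon_{ox}/t_{ox}$ with a relative permittivity of 3.9 for SiO2, and the function name is chosen here for illustration.

```python
# Assumed constants: vacuum permittivity and the relative permittivity of SiO2.
EPS_0 = 8.854e-12      # F/m
EPS_R_SIO2 = 3.9       # dimensionless

def cox_ff_per_um2(t_ox_angstrom: float) -> float:
    """Gate-oxide capacitance per unit area in fF/um^2 for a thickness in angstroms."""
    t_ox_m = t_ox_angstrom * 1e-10            # angstrom -> meter
    cox_f_per_m2 = EPS_R_SIO2 * EPS_0 / t_ox_m
    # 1 F/m^2 = 1e15 fF per 1e12 um^2 = 1e3 fF/um^2
    return cox_f_per_m2 * 1e3

if __name__ == "__main__":
    for t in (20.0, 40.0, 100.0):
        print(f"t_ox = {t:5.1f} A  ->  C_ox = {cox_ff_per_um2(t):.2f} fF/um^2")
```

Doubling the oxide thickness halves the capacitance, which is the proportional scaling mentioned above.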

In fabrication, the threshold voltage is adjusted by modifying the dopant concentration to meet different requirements.

The source is not necessarily connected to GND, so the gate voltage above is in fact $V_{GS}$. We say the MOS is “off” when $V_{GS} < V_{TH}$, meaning the current between source and drain is $I_D = 0$. When $V_{GS} > V_{TH}$, we say the MOS is “on”, meaning $I_D \neq 0$ once a drain-source voltage is applied. In practice, the drain current is therefore what indicates whether the device is on or off.

The switching characteristics of a PMOS are similar to those of an NMOS, but with opposite polarity. It turns on when $V_{GS} < V_{TH}$, where $V_{TH}$ is negative for a PMOS.


In order to obtain the relationship between the drain current of a MOSFET and its terminal voltages, we make two observations.

First, consider a semiconductor bar carrying a current $I$. If the mobile charge density along the direction of current (charge per unit length) is $Q_d$ coulombs per meter and the velocity of the charge is $v$ meters per second, then

$$
I = Q_d \cdot v
$$

Second, consider an NMOS whose source and drain are both connected to GND, so $V_{DS} = 0$. The charge density is then the same at all positions in the inversion layer. Since the gate forms a capacitor, the negative charge in the substrate is mirrored by an equal positive charge on the gate. The capacitance per unit length is $WC_{ox}$ (note that $C_{ox}$ is the capacitance per unit area). When $V_{GS} \ge V_{TH}$, following the formation of the inversion layer, the extra voltage $V_{GS} - V_{TH}$ falls across the capacitance $WC_{ox}$. The charge density is

$$
Q_d = WC_{ox}(V_{GS} - V_{TH})
$$

The extra voltage $V_{GS} - V_{TH}$ is called the overdrive voltage.

When the drain voltage is greater than 0, the local voltage difference between the gate and the channel varies from $V_G$ (at the source) to $V_G - V_D$ (at the drain), because the channel potential is no longer 0 everywhere. Then

$$
Q_d(x) = WC_{ox}\left[V_{GS} - V(x) - V_{TH}\right]
$$

where $V(x)$ is the channel potential (inside the inversion layer) at position $x$. Then

$$
I_D = -WC_{ox}\left[V_{GS} - V(x) - V_{TH}\right]v
$$

The negative sign arises because the carriers are electrons, which carry negative charge.


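In semiconductors, $v = \mu E$, where $\mu$ is the mobility of the charge carriers and $E$ is the electric field. In an NMOS, the inversion layer consists of electrons and $E(x) = -\mathrm{d}V(x)/\mathrm{d}x$. Then

$$
v = \mu_n E(x) = -\mu_n \dfrac{\mathrm{d}V(x)}{\mathrm{d}x}
$$

Thus

$$
I_D = WC_{ox}\left[V_{GS} - V(x) - V_{TH}\right]\mu_n \dfrac{\mathrm{d}V(x)}{\mathrm{d}x}
$$

The boundary conditions are $V(0) = 0$ and $V(L) = V_{DS}$. Integrating over $x$ from $0$ to $L$ gives

$$
\int_0^L I_D\,\mathrm{d}x = \int_0^{V_{DS}} WC_{ox}\mu_n\left[V_{GS} - V - V_{TH}\right]\mathrm{d}V
$$

Since $I_D$ is constant along the channel,

$$
I_D = \mu_n C_{ox}\dfrac{W}{L}\left[(V_{GS} - V_{TH})V_{DS} - \dfrac{1}{2}V_{DS}^2\right]
$$

This is the I-V characteristic when $V_{DS} \le V_{GS} - V_{TH}$. It is a parabola in $V_{DS}$, and the maximum current occurs at

$$
V_{DS} = V_{GS} - V_{TH}
$$

The peak current is

$$
I_{D,max} = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2
$$

We call $W/L$ the “aspect ratio”. If $V_{DS} \le V_{GS} - V_{TH}$, we say the device operates in the “triode region”.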

Triode Region

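In the deep triode region, where $V_{DS} \ll 2(V_{GS} - V_{TH})$, the $I_D$-$V_{DS}$ curve is approximately a straight line, so we can estimate the equivalent resistance as

$$
R_{on} = \dfrac{1}{\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})}
$$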


But what happens if $V_{DS} > V_{GS} - V_{TH}$? In reality $I_D$ becomes relatively constant, and we say the device is in the “saturation region”. To understand why, recall the term $V_{GS} - V(x) - V_{TH}$ in $Q_d(x)$. Where $V(x) = V_{GS} - V_{TH}$, the charge density drops to 0, so between that point and the drain there is no inversion charge. We call this pinch-off. In the pinch-off region, electrons are swept to the drain by the very strong field, which keeps the current flowing. As $V_{DS}$ increases further, the pinch-off point gradually moves from the drain toward the source. The effective channel length therefore shortens, which is called channel-length modulation. This effect causes a slight upward slope in the $I_D$-$V_{DS}$ curve.

Saturation Region
Pinch-off Behavior

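Note that in the figure the channel thickness represents the charge density rather than a geometric width. In the saturation region, the current is almost the same as that at $V_{DS} = V_{GS} - V_{TH}$, slightly disturbed by the change in effective length.

If $I_D$ is known, then $V_{GS}$ is obtained as

$$
V_{GS} = \sqrt{\dfrac{2I_D}{\mu_n C_{ox}\dfrac{W}{L}}} + V_{TH}
$$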

To recap:

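  • Cut-off Region: $V_{GS} < V_{TH}$, $I_D = 0$
  • Triode Region: $V_{GS} \ge V_{TH}$, $V_{DS} < V_{GS} - V_{TH}$, $I_D = \mu_n C_{ox}\dfrac{W}{L}\left[(V_{GS} - V_{TH})V_{DS} - \dfrac{1}{2}V_{DS}^2\right]$
  • Saturation Region: $V_{GS} \ge V_{TH}$, $V_{DS} \ge V_{GS} - V_{TH}$, $I_D = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2$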
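The three regions can be collected into one piecewise function. The sketch below uses the long-channel square-law model without channel-length modulation; the default values for `vth`, `k` (standing for $\mu_n C_{ox}$) and `w_over_l` are illustrative assumptions, not parameters of any real process.

```python
def nmos_drain_current(vgs: float, vds: float,
                       vth: float = 0.5,       # assumed threshold voltage, V
                       k: float = 200e-6,      # assumed mu_n * C_ox, A/V^2
                       w_over_l: float = 10.0  # assumed aspect ratio
                       ) -> float:
    """Square-law drain current of an NMOS (no channel-length modulation)."""
    vov = vgs - vth                      # overdrive voltage
    if vov <= 0:
        return 0.0                       # cut-off
    beta = k * w_over_l
    if vds < vov:
        # triode region
        return beta * (vov * vds - 0.5 * vds ** 2)
    # saturation region
    return 0.5 * beta * vov ** 2

if __name__ == "__main__":
    for vds in (0.1, 0.3, 0.5, 1.0):
        print(f"V_DS = {vds:.1f} V -> I_D = {nmos_drain_current(1.0, vds) * 1e6:.1f} uA")
```

The triode and saturation expressions agree at $V_{DS} = V_{GS} - V_{TH}$, so the function is continuous across the boundary.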

Similarly, for PMOS in saturation, the current formula is

$$
I_D = -\dfrac{1}{2}\mu_p C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2
$$

The negative sign appears because the current direction is opposite to that of NMOS. In an NMOS, $I_D$ enters the device at the drain and flows out at the source. In a PMOS the direction is opposite.

1.3 MOS Transconductance

From the sections above, we can see that a MOSFET controls $I_D$ with $V_{GS}$. We need a figure of merit that indicates how well a device converts a voltage to a current. This figure of merit is the transconductance, denoted as $g_m$. In saturation,

$$
g_m = \dfrac{\partial I_D}{\partial V_{GS}}\bigg|_{V_{DS}\ \text{const}} = \mu_n C_{ox} \dfrac{W}{L}(V_{GS} - V_{TH})
$$

$g_m$ represents the sensitivity of the device: for a high $g_m$, a small change in $V_{GS}$ results in a large change in $I_D$. You can also prove that $g_m$ can be expressed as

$$
g_m = \sqrt{2\mu_n C_{ox}\dfrac{W}{L}I_D} = \dfrac{2I_D}{V_{GS} - V_{TH}}
$$

Each expression is useful.

Transconductance Behavior

For example, in practice $I_D$ is usually stabilized with a current source. Then $g_m = 2I_D/(V_{GS} - V_{TH})$ shows that the transconductance decreases as the overdrive grows.
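As a quick consistency check, the three common saturation-region expressions for $g_m$ can be evaluated at the same bias point. This is a sketch under the square-law model; `beta` stands for $\mu_n C_{ox} W/L$ and its value is an arbitrary assumption.

```python
import math

def gm_from_overdrive(beta: float, vov: float) -> float:
    """g_m = beta * (V_GS - V_TH)."""
    return beta * vov

def gm_from_current(beta: float, i_d: float) -> float:
    """g_m = sqrt(2 * beta * I_D)."""
    return math.sqrt(2.0 * beta * i_d)

def gm_from_ratio(i_d: float, vov: float) -> float:
    """g_m = 2 * I_D / (V_GS - V_TH)."""
    return 2.0 * i_d / vov

if __name__ == "__main__":
    beta = 2e-3                      # assumed mu_n * C_ox * W/L, A/V^2
    vov = 0.2                        # overdrive voltage, V
    i_d = 0.5 * beta * vov ** 2      # square-law saturation current
    print(gm_from_overdrive(beta, vov),
          gm_from_current(beta, i_d),
          gm_from_ratio(i_d, vov))
```

All three agree as long as $I_D$ and the overdrive are related by the square law; which one is convenient depends on whether the bias current or the overdrive is the fixed quantity.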

1.4 Second-Order Effects

  • Body Effect

In the analysis so far, we tacitly assumed that the bulk and the source of the transistor were tied to ground. What happens if the bulk voltage of an NMOS drops below the source voltage? The MOS still works properly, but some characteristics change. Recall the threshold voltage analysis. As the substrate voltage $V_B$ becomes more negative, more holes are attracted to the substrate connection, leaving a larger negative charge behind. Now recall that the threshold voltage is a function of the total charge in the depletion region, because the gate charge must mirror the depletion charge $Q_d$ before an inversion layer is formed. Thus, as $V_B$ drops and $Q_d$ increases, $V_{TH}$ also increases. This phenomenon is called the “body effect” or the “back-gate effect.”

It can be proven that

$$
V_{TH} = V_{TH0} + \gamma\left(\sqrt{2\Phi_F + V_{SB}} - \sqrt{2\Phi_F}\right)
$$

where $V_{TH0}$ is the threshold without body effect, $\gamma = \sqrt{2q\epsilon_{si}N_{sub}}/C_{ox}$ is the body-effect coefficient, and $V_{SB}$ is the source-bulk potential difference. The value of $\gamma$ typically lies in the range of 0.3 to 0.4 $\text{V}^{1/2}$.
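To get a feeling for the size of the body effect, the standard expression $V_{TH} = V_{TH0} + \gamma(\sqrt{2\Phi_F + V_{SB}} - \sqrt{2\Phi_F})$ can be evaluated numerically. The default values of `vth0`, `gamma` and `phi_f` below are assumed, illustrative numbers only.

```python
import math

def vth_with_body_effect(v_sb: float,
                         vth0: float = 0.5,   # assumed zero-bias threshold, V
                         gamma: float = 0.4,  # assumed body-effect coefficient, V^0.5
                         phi_f: float = 0.35  # assumed Fermi potential, V
                         ) -> float:
    """Threshold voltage of an NMOS for a given source-bulk voltage V_SB >= 0."""
    return vth0 + gamma * (math.sqrt(2 * phi_f + v_sb) - math.sqrt(2 * phi_f))

if __name__ == "__main__":
    for v_sb in (0.0, 0.5, 1.0):
        print(f"V_SB = {v_sb:.1f} V -> V_TH = {vth_with_body_effect(v_sb):.3f} V")
```

With these numbers, a source-bulk voltage of 1 V raises the threshold by roughly 0.19 V, which is not negligible compared with typical overdrive voltages.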

Since $V_{TH}$ depends on $V_{SB}$, body effect should be taken into account in precise calculations whenever the source or bulk is not tied to AC ground.


  • Channel-Length Modulation

Recall the current in saturation. To represent the pinch-off effect we replace $L$ with $L - \Delta L$; this is channel-length modulation. In practice, however, it is impossible to measure $\Delta L$ exactly. We therefore express the effect of $\Delta L$ through $V_{DS}$ and introduce an empirical factor $\lambda$:

$$
I_D = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2(1 + \lambda V_{DS})
$$

$\lambda$ is usually smaller for longer channels. Thus, in long-channel processes channel-length modulation can usually be neglected, whereas in short-channel processes (for example, a 3nm node) it has a much stronger influence.


  • Subthreshold Conduction

In reality the device does not turn on or off abruptly at $V_{GS} = V_{TH}$. For $V_{GS} \approx V_{TH}$ a weak inversion layer still exists and some current flows from D to S, even for $V_{GS} < V_{TH}$. $I_D$ exhibits an exponential dependence on $V_{GS}$ in this region:

$$
I_D = I_0 \exp\dfrac{V_{GS}}{\xi V_T}
$$

where $I_0$ is proportional to $W/L$, $\xi > 1$ is a nonideality factor, and $V_T = kT/q$. This exponential relation holds only for $V_{GS}$ below about $V_{TH}$, serving as a transition from off to on.
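A practical consequence of the exponential law $I_D = I_0 \exp(V_{GS}/(\xi V_T))$ is the subthreshold swing: the change in $V_{GS}$ needed to change $I_D$ by a factor of ten. The sketch below computes it; the nonideality factor 1.5 and the temperature are assumed example values.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
Q_E = 1.602176634e-19     # electron charge, C

def thermal_voltage(t_kelvin: float = 300.0) -> float:
    """V_T = kT/q in volts."""
    return K_B * t_kelvin / Q_E

def subthreshold_swing_mv_per_decade(xi: float = 1.5, t_kelvin: float = 300.0) -> float:
    """Gate-voltage change (mV) that changes I_D by 10x: xi * V_T * ln(10)."""
    return xi * thermal_voltage(t_kelvin) * math.log(10.0) * 1e3

if __name__ == "__main__":
    print(f"V_T   = {thermal_voltage() * 1e3:.2f} mV")
    print(f"swing = {subthreshold_swing_mv_per_decade():.1f} mV/decade")
```

With the assumed factor of 1.5 the swing is close to 90 mV per decade at room temperature, so lowering the off current by several decades costs a few hundred millivolts of gate voltage.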

Subthreshold Behavior

1.5 MOS Device Capacitances

Note: We will abbreviate “capacitance” as “cap.” to simplify the description.

We know there are non-ideal cap. in a PN junction. In a MOS device, the cap. distribution is shown below:

MOSFET Cap.

$C_1$ is the cap. between gate and channel (inversion layer), i.e., $C_1 = WLC_{ox}$.

$C_2$ is the depletion cap. between the channel and substrate, $C_2 = WL\sqrt{q\epsilon_{si}N_{sub}/(4\Phi_F)}$.

$C_3$ and $C_4$ are caused by the overlap of the gate with the source and drain, which results from side diffusion in fabrication. These two cap. cannot be simply written as $WL_DC_{ox}$ because of fringing electric field lines; they are usually obtained by more elaborate calculations. The overlap cap. per unit width is denoted as $C_{ov}$, then $C_3 = C_4 = WC_{ov}$.

$C_5$ and $C_6$ are the PN junction cap. between the source/drain and the substrate. For the source and drain, the cap. can be decomposed into two components, bottom and sidewall:

PN Junction Cap. Decomposition

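We specify $C_j$ as the bottom-plate cap. per unit area and $C_{jsw}$ as the sidewall cap. For PN junctions

$$
C_j = \dfrac{C_{j0}}{\left(1 + V_R/\Phi_B\right)^m}
$$

where $V_R$ is the reverse voltage across the junction and $\Phi_B$ is the built-in potential. $m$ is an empirical factor, typically in the range of 0.3 to 0.4. In this article we take the unit of $C_{jsw}$ as F/m (cap. per unit length of perimeter). Then the total sidewall cap. is

$$
C_{sw} = 2(W + E)\,C_{jsw}
$$

where $E$ is the length of the source or drain.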


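The cap. of a MOSFET change with the operating region. If the device is off, there is no channel, so the gate couples to the substrate through the oxide cap. and the depletion cap. Then

$$
C_{GD} = C_{GS} = WC_{ov},\qquad C_{GB} = (WLC_{ox}) \parallel C_d = \dfrac{WLC_{ox}\,C_d}{WLC_{ox} + C_d}
$$

where $C_d$ is the depletion cap. The symbol $\parallel$ here means “connected in cascade/series”, not “plus”.

When the device is in the deep triode region, i.e., $V_{GS} > V_{TH}$ and $V_{DS} \ll 2(V_{GS} - V_{TH})$, the source and drain have approximately equal voltages. Then the gate-channel cap. is divided equally between the source and drain:

$$
C_{GD} = C_{GS} = \dfrac{WLC_{ox}}{2} + WC_{ov}
$$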

Cap. in Triode Region

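In the saturation region, the connection between the channel and the drain is cut off, so $C_{GD} \approx WC_{ov}$. $C_{GS}$ is a little more complicated because the charge distribution is not uniform in the remaining channel. Recall the charge density in the channel

$$
Q_d(x) = WC_{ox}\left[V_{GS} - V(x) - V_{TH}\right]
$$

and apply the current equation in the saturation region

$$
I_D = WC_{ox}\mu_n\left[V_{GS} - V(x) - V_{TH}\right]\dfrac{\mathrm{d}V}{\mathrm{d}x}
$$

Ignoring channel-length modulation, $I_D$ is a constant. Then

$$
\mathrm{d}x = \dfrac{WC_{ox}\mu_n}{I_D}\left[V_{GS} - V - V_{TH}\right]\mathrm{d}V
$$

The total charge in the channel is

$$
Q = \int_0^L Q_d(x)\,\mathrm{d}x = \dfrac{W^2C_{ox}^2\mu_n}{I_D}\int_0^{V_{GS}-V_{TH}}\left[V_{GS} - V - V_{TH}\right]^2\mathrm{d}V = \dfrac{W^2C_{ox}^2\mu_n}{3I_D}(V_{GS} - V_{TH})^3
$$

Plugging in $I_D = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2$, we obtain the total charge

$$
Q = \dfrac{2}{3}WLC_{ox}(V_{GS} - V_{TH})
$$

$C_{GS}$ is defined as the rate of change of the total charge with respect to $V_{GS}$, in parallel with the overlap cap.:

$$
C_{GS} = \dfrac{\partial Q}{\partial V_{GS}} + WC_{ov} = \dfrac{2}{3}WLC_{ox} + WC_{ov}
$$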
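The factor $2/3$ in the saturation gate-source cap. can be checked numerically. Under the square-law model, integrating the current equation at the edge of saturation gives the channel potential $V(x) = V_{ov}\left(1 - \sqrt{1 - x/L}\right)$ with $V_{ov} = V_{GS} - V_{TH}$, so the charge density is proportional to $\sqrt{1 - x/L}$. The sketch below integrates this normalized profile with a midpoint rule; the profile is an assumption that follows from the idealized model, not a measured result.

```python
def normalized_channel_charge(n_steps: int = 100_000) -> float:
    """Midpoint-rule integral of sqrt(1 - u) for u = x/L in [0, 1].

    The result is Q / (W * L * C_ox * V_ov), the total channel charge in
    saturation normalized to the uniform-channel (V_DS = 0) value.
    """
    du = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * du            # midpoint of the i-th slice
        total += (1.0 - u) ** 0.5 * du
    return total

if __name__ == "__main__":
    print(f"Q / (W L C_ox V_ov) = {normalized_channel_charge():.5f}")
```

The numerical value converges to $2/3$, matching the closed-form result for the gate-channel part of $C_{GS}$.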

Cap. in Saturation Region

1.6 Small Signal Model

Based on a stable DC operating point, we can analyze AC signals. The fundamental principle of AC analysis is that a very small signal is superimposed on a DC level. Although the DC curve may not be linear, a small range near the DC working point is approximately linear, hence we can apply the results of linear systems to AC signals. Most of the time we focus on AC signals.

Principle of Small Signals

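When the device is on, terminals D and S are taken as the output terminals. $I_D$ is then the output current and $V_{DS}$ is the output voltage. Hence the output impedance is

$$
r_o = \dfrac{\partial V_{DS}}{\partial I_D} = \dfrac{1}{\partial I_D/\partial V_{DS}} = \dfrac{1}{\dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})^2\lambda} = \dfrac{1 + \lambda V_{DS}}{\lambda I_D}
$$

Usually $\lambda V_{DS}$ is very small (second-order effect), so $r_o \approx 1/(\lambda I_D)$. In the saturation region, $r_o$ would be infinite if the device were ideal. However, it is finite due to the channel-length modulation effect. In the triode region, this resistance is just the conduction resistance $R_{on}$.


Recall that the bulk potential influences the threshold voltage, and the current in turn depends on the threshold. This is equivalent to adding a current source controlled by $V_{BS}$. We write its value as $g_{mb}V_{BS}$, where

$$
g_{mb} = \dfrac{\partial I_D}{\partial V_{BS}} = g_m\dfrac{\gamma}{2\sqrt{2\Phi_F + V_{SB}}} = \eta g_m
$$

where $\eta$ is typically 0.25, so $g_{mb}$ is proportional to $g_m$.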

AC Model of MOSFET

If all cap.’s are taken into consideration, the complete AC model should be

Complete AC Model of MOSFET

For PMOS devices, since the supply terminals are equivalent to ground in AC analysis (AC ground), the AC model remains unchanged. Typically, we flip the model vertically to match schematics, where PMOS devices are drawn on top.


To recap, we summarize the three important parameters:

  • $g_m$: designed by engineers, first order
  • $r_o$: introduced by channel-length modulation, second order
  • $g_{mb}$: introduced by body effect, second order

In some cases, we use a simpler conclusion to calculate impedance. For a MOS device, we can summarize the impedance in the view of different ports. This summary can be derived from the small signal model:

MOSFET Impedance Model

2. Single Stage Amplifiers

Note: In the following chapters, we write “sat.” for “saturation region”, “tri.” for “triode region” and “CLM” for “channel-length modulation”.

2.1 General Considerations

Denoting the input signal as $x(t)$ and the output signal as $y(t)$, an ideal amplifier should follow the linear relationship

$$
y(t) = \alpha_1 x(t)
$$

The linear coefficient $\alpha_1$ is called the gain. In reality, however, nothing ideal can be manufactured, which means all amplifiers are nonlinear. By Taylor series expansion, we approximate the characteristic by a polynomial:

$$
y(t) = \alpha_0 + \alpha_1 x(t) + \alpha_2 x^2(t) + \cdots + \alpha_n x^n(t)
$$

In this general relationship, $\alpha_0$ is called the bias, $\alpha_1$ is called the gain, and the other coefficients represent different orders of nonlinearity, which should be minimized in design.

We evaluate the performance of an amplifier with the following metrics: gain, speed, I/O range, power dissipation, supply voltage, linearity, noise and so on. Most of them trade off against each other, so design is usually a multi-dimensional optimization problem.

Our target is to amplify small signals. However, MOS devices have a threshold voltage and may work in different regions. Thus setting a proper DC working point is necessary to ensure the devices work in the desired region.


Before diving into specific amplifier circuits, we introduce a generally used formula for calculating gain. If a system has a total transconductance $G_m$ and output impedance $R_{out}$, then the total voltage gain is

$$
A_v = G_m R_{out}
$$

You may find an extra negative sign in some textbooks; that is because we apply a different definition. They force the transconductance to be positive for convenience, so they have to add an extra “-” to indicate that the system produces an inverted phase. In our definition, the sign is absorbed by $G_m$ and the direction of $I_{out}$.

The formula is easy to prove. By definition $A_v = \partial V_{out}/\partial V_{in}$, so by the chain rule of derivatives

$$
A_v = \dfrac{\partial V_{out}}{\partial I_{out}}\cdot\dfrac{\partial I_{out}}{\partial V_{in}} = R_{out}G_m
$$


Since a MOSFET can itself be considered an amplifying device, we define

$$
g_m r_o
$$

as its intrinsic gain to represent its amplification ability.

2.2 Common-Source Stage (CS)

2.2.1 CS with Resistive Load

A MOSFET converts the input voltage signal at its gate to a current signal. With a load resistance, the current is converted back to a voltage signal. This fundamental idea leads to the common-source (CS) amplifier.

CS Amplifier

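We expect the device to work in sat. and neglect CLM. The DC working point is set by the DC component of $V_{in}$. The DC analysis gives

$$
I_D = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH})^2,\qquad V_{out} = V_{DD} - I_D R_D
$$

The two equations hold in sat. Note that $V_{in}$ is limited by the sat. condition; the two boundaries are

  • Cut-off: $V_{in} = V_{TH}$
  • Tri.: $V_{out} = V_{in} - V_{TH}$

In the cut-off region where $V_{in} < V_{TH}$, the device is disconnected and $V_{out} = V_{DD}$. In sat., $V_{out} > V_{in} - V_{TH}$, and the input-output curve is approximately linear. In tri., the device can be regarded as a resistor. Finally the complete curve can be obtained: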

I-V Curve of CS Amplifier

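Since the transconductance drops in the triode region, we usually ensure that $V_{out} > V_{in} - V_{TH}$. Hence, we obtain the I/O range:

$$
V_{TH} < V_{in} < V_{in,\max},\qquad V_{in,\max} - V_{TH} < V_{out} < V_{DD}
$$

where $V_{in,\max}$ is the input at the triode boundary $V_{out} = V_{in} - V_{TH}$.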

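We have three methods to calculate the voltage gain:

  • Partial Derivative on DC Formula

$$
A_v = \dfrac{\partial V_{out}}{\partial V_{in}} = -R_D\,\mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH}) = -g_m R_D
$$

Remember $g_m = \mu_n C_{ox}\dfrac{W}{L}(V_{GS} - V_{TH})$.

  • Analyze Small Signal Model
Small Signal Model of CS Amplifier

$$
v_{out} = -g_m v_{in} R_D \;\Rightarrow\; A_v = -g_m R_D
$$

  • Use the General Formula Directly

$$
G_m = -g_m,\quad R_{out} = R_D \;\Rightarrow\; A_v = G_m R_{out} = -g_m R_D
$$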
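The agreement between the derivative method and the small-signal result $A_v = -g_m R_D$ can be verified numerically. The sketch below differentiates the large-signal sat. equation with a central difference; all component values (`vdd`, `rd`, `vth`, `beta`) are illustrative assumptions, and the model is valid only while the transistor stays in sat.

```python
def cs_vout(vin: float, vdd: float = 3.0, rd: float = 5e3,
            vth: float = 0.5, beta: float = 1e-3) -> float:
    """Large-signal output of a resistively loaded CS stage (sat. only, no CLM).

    beta stands for mu_n * C_ox * W/L (assumed value).
    """
    vov = vin - vth
    return vdd - 0.5 * beta * rd * vov ** 2

def cs_gain_numeric(vin: float, dv: float = 1e-6) -> float:
    """Central-difference estimate of dV_out/dV_in."""
    return (cs_vout(vin + dv) - cs_vout(vin - dv)) / (2.0 * dv)

def cs_gain_formula(vin: float, rd: float = 5e3,
                    vth: float = 0.5, beta: float = 1e-3) -> float:
    """Small-signal gain -g_m * R_D with g_m = beta * (V_in - V_TH)."""
    gm = beta * (vin - vth)
    return -gm * rd

if __name__ == "__main__":
    vin = 0.9
    print(cs_vout(vin), cs_gain_numeric(vin), cs_gain_formula(vin))
```

At the assumed bias of 0.9 V, the output sits at 2.6 V (well above the overdrive of 0.4 V, so the device is in sat.) and both methods give a gain of about -2.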


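Now take CLM into consideration, meaning $r_o$ is not infinite. The DC equation becomes

$$
V_{out} = V_{DD} - \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}R_D(V_{in} - V_{TH})^2(1 + \lambda V_{out})
$$

Since $V_{out}$ appears on both sides, the derivative method becomes more complicated (though it is still solvable). We leave it as an exercise. Here the advantage of the small signal method becomes obvious:

$$
A_v = -g_m(r_o \parallel R_D)
$$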

Small Signal Model of CS Amplifier with CLM

2.2.2 CS Stage with Diode-Connected Load

The basic CS topology has some problems: $R_D$ is usually much smaller than $r_o$, so the gain falls well short of the intrinsic gain; $g_m$ varies with the DC working point set by the DC component of $V_{in}$, meaning the gain changes with the AC signal in $V_{in}$ (strong nonlinearity); and the I/O range is limited. To solve these problems, engineers have proposed new topologies to improve the performance.

The most severe problem is nonlinearity, which is mostly caused by the variation of $g_m$ when $I_D$ varies. Thus we think about stabilizing the DC current. The simplest idea is to replace the load resistor with a current source.

CS Amplifier with a Current Source Load

Since $g_m = \sqrt{2\mu_n C_{ox}\dfrac{W}{L}I_D}$, every parameter is now constant and the nonlinearity disappears. Moreover, $R_D$ disappears, so it no longer restricts the gain: $A_v = -g_m r_o$, which is much larger.

But a current source cannot be ideal. In fact, if it were ideal, the DC point would not be well-defined; try calculating $V_{out}$ to see why. So how do we implement it? A diode-connected MOS is a good choice.

Diode-Connected MOS Devices

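If the device is connected this way, $V_{GS} = V_{DS}$, so $V_{DS} > V_{GS} - V_{TH}$. The device is in sat. whenever it is on.

You can work out the small signal model; the device behaves as a small-signal resistance $\dfrac{1}{g_m} \parallel r_o \approx \dfrac{1}{g_m}$.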

Small Signal of Diode-Connected MOS Devices

If NMOS serves as a load, we must take body effect into consideration.

NMOS Serves as Load

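At the source, write the current equation for a test voltage $V_X$ and test current $I_X$

$$
I_X = (g_m + g_{mb})V_X + \dfrac{V_X}{r_o}
$$

Solving the equation gives the impedance $\dfrac{V_X}{I_X} = \dfrac{1}{g_m + g_{mb}} \parallel r_o \approx \dfrac{1}{g_m + g_{mb}}$.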

We now study the CS amplifier with an NMOS load. Neglect CLM first

CS with NMOS Diode Load

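From the output terminal, M2 presents an impedance of $\dfrac{1}{g_{m2} + g_{mb2}}$. From the input terminal, the transconductance is $-g_{m1}$. Then the total gain is

$$
A_v = -\dfrac{g_{m1}}{g_{m2} + g_{mb2}} = -\sqrt{\dfrac{(W/L)_1}{(W/L)_2}}\cdot\dfrac{1}{1 + \eta}
$$

where $g_m = \sqrt{2\mu_n C_{ox}(W/L)I_D}$ with $I_{D1} = I_{D2}$, and $\eta = g_{mb2}/g_{m2}$. Notice the gain is independent of the bias currents and voltages (so long as M1 stays in saturation). In other words, as the input and output signal levels vary, the gain remains relatively constant, indicating that the input-output characteristic is relatively linear.

The linear behavior of the circuit can also be confirmed by large-signal analysis.

$$
\dfrac{1}{2}\mu_n C_{ox}\left(\dfrac{W}{L}\right)_1(V_{in} - V_{TH1})^2 = \dfrac{1}{2}\mu_n C_{ox}\left(\dfrac{W}{L}\right)_2(V_{DD} - V_{out} - V_{TH2})^2
$$

and hence

$$
\sqrt{\left(\dfrac{W}{L}\right)_1}(V_{in} - V_{TH1}) = \sqrt{\left(\dfrac{W}{L}\right)_2}(V_{DD} - V_{out} - V_{TH2})
$$

Take the derivative with respect to $V_{in}$ on both sides. Do not forget the body effect

$$
\sqrt{\left(\dfrac{W}{L}\right)_1} = \sqrt{\left(\dfrac{W}{L}\right)_2}\left(-\dfrac{\partial V_{out}}{\partial V_{in}} - \dfrac{\partial V_{TH2}}{\partial V_{in}}\right)
$$

Apply the chain rule

$$
\dfrac{\partial V_{TH2}}{\partial V_{in}} = \dfrac{\partial V_{TH2}}{\partial V_{out}}\cdot\dfrac{\partial V_{out}}{\partial V_{in}} = \eta\dfrac{\partial V_{out}}{\partial V_{in}}
$$

Then

$$
\dfrac{\partial V_{out}}{\partial V_{in}} = -\sqrt{\dfrac{(W/L)_1}{(W/L)_2}}\cdot\dfrac{1}{1 + \eta}
$$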

It is instructive to study the overall large-signal characteristic of the circuit as well. But let us first consider the circuit with a cap. load. What is the final value of $V_{out}$ if $I_1$ drops to zero? As $I_1$ decreases, so does the overdrive of M2. Thus, for small $I_1$, $V_{GS2} \approx V_{TH2}$ and $V_{out} \approx V_{DD} - V_{TH2}$. In reality, the subthreshold conduction in M2 eventually brings $V_{out}$ to $V_{DD}$ if $I_D$ approaches zero (the subthreshold current charges the cap.), but at very low current levels, the finite capacitance at the output node slows down the change from $V_{DD} - V_{TH2}$ to $V_{DD}$. This is illustrated in the time-domain waveforms. For this reason, in circuits that have frequent switching activity, we assume that $V_{out}$ remains around $V_{DD} - V_{TH2}$ when $I_D$ falls to small values.

Waveform of Diode Load

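The following figure plots the $V_{out}$-$V_{in}$ relationship. The output voltage equals $V_{DD} - V_{TH2}$ if $V_{in} < V_{TH1}$. For $V_{in} > V_{TH1}$, $V_{out}$ follows an approximately straight line. As $V_{in}$ exceeds $V_{out} + V_{TH1}$ (beyond point A), $M_1$ enters the triode region, and the characteristic becomes nonlinear.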

VTC of CS with Diode Load

If the load is implemented with a PMOS, then body effect disappears and the gain is more linear.

CS with PMOS Diode Load

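With $\eta$ gone, the gain becomes completely independent of the varying signals

$$
A_v = -\dfrac{g_{m1}}{g_{m2}} = -\sqrt{\dfrac{\mu_n(W/L)_1}{\mu_p(W/L)_2}}
$$

if CLM is neglected. If CLM is taken into consideration, the gain becomes $A_v = -g_{m1}\left(\dfrac{1}{g_{m2}} \parallel r_{o1} \parallel r_{o2}\right)$, which depends on the bias through $r_{o1}$ and $r_{o2}$.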

2.2.3 CS Stage with Current Source Load

Another method to stabilize the DC working point is to bias the load transistor with a separate bias voltage instead of diode-connecting it.

CS Stage with Current-Source Load

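Obviously the total impedance is $r_{o1} \parallel r_{o2}$ and the total transconductance is $-g_{m1}$, then

$$
A_v = -g_{m1}(r_{o1} \parallel r_{o2})
$$

if M2 is biased in sat. However, $V_{out}$ is not well-defined. You can write the two DC current equations (CLM neglected) and equate them

$$
\dfrac{1}{2}\mu_n C_{ox}\left(\dfrac{W}{L}\right)_1(V_{in} - V_{TH1})^2 = \dfrac{1}{2}\mu_p C_{ox}\left(\dfrac{W}{L}\right)_2(V_{DD} - V_b - |V_{TH2}|)^2
$$

This KCL equation has only one solution for $V_{in}$. Thus, even if $V_{in}$ deviates only slightly from that solution, $V_{out}$ rushes toward $V_{DD}$ or GND, and one of the MOS devices enters tri. Hence, in practice $V_{out}$ and $V_b$ are usually connected through a feedback loop that adjusts $V_b$ automatically so that both devices remain in sat.


If the upper MOSFET is biased in tri., the circuit is almost the same as the initial resistor-loaded circuit. One advantage of biasing it in tri. is that you can adjust the resistance value by adjusting $V_b$.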

2.2.4 CS Stage with Active Load

If one of the MOS devices just provides bias, its gain seems wasted. Can we make full use of both transistors? Yes. The topology is called the complementary CS stage, also known as the CMOS inverter.

CMOS Inverter

From the small signal model in figure (b), the input transconductance is $-(g_{m1} + g_{m2})$ and the output impedance is $r_{o1} \parallel r_{o2}$, then

$$
A_v = -(g_{m1} + g_{m2})(r_{o1} \parallel r_{o2})
$$

The CMOS inverter must solve two critical issues when it serves as an amplifier. First, the bias current of the two transistors is a strong function of PVT (process, supply voltage, and temperature; these three parameters affect the performance and cannot be controlled by the designer). In particular, since $V_{GS1} + |V_{GS2}| = V_{DD}$, variations in $V_{DD}$ or the threshold voltages directly translate to changes in the drain currents. Second, the circuit amplifies supply voltage variations (“supply noise”)! To understand this point, consider the arrangement depicted in the following figure, where $V_B$ is a bias voltage that places M1 and M2 in saturation. Using the small signal model, we can prove that the small-signal gain from $V_{DD}$ to $V_{out}$ is given by

$$
\dfrac{v_{out}}{v_{DD}} = \dfrac{g_{m2}r_{o2} + 1}{r_{o2} + r_{o1}}\,r_{o1} = \left(g_{m2} + \dfrac{1}{r_{o2}}\right)(r_{o1} \parallel r_{o2})
$$

Power Noise of CMOS Inverter

Moreover, the input range is very small. The CMOS inverter trades supply-noise rejection and input range for a larger gain. This topology is therefore widely used in digital circuits and seldom used in analog circuits.

2.2.5 Source Degeneration

In some applications, the nonlinear dependence of the drain current upon the overdrive voltage introduces excessive nonlinearity. By placing a “degeneration” resistor in series with the source terminal, we can make the input device more linear.

CS Stage with Source Degeneration

Neglect CLM and body effect. Here, as $V_{in}$ increases, so do $I_D$ and the voltage drop across $R_S$. This is negative feedback: $V_{in}$ increases $\to$ $I_D$ increases $\to$ $V_S$ increases $\to$ $V_{GS}$ decreases $\to$ $I_D$ decreases. With this feedback, $V_{GS}$ is restricted to a very narrow range, so the voltage transfer curve (VTC) is approximately linear.

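Another view of this is the transconductance. $R_S$ is intended to make the gain a weaker function of $g_m$, which is strongly influenced by the bias. We calculate the total transconductance. With $I_D = f(V_{GS})$ and $V_{GS} = V_{in} - I_D R_S$,

$$
G_m = \dfrac{\partial I_D}{\partial V_{in}} = \dfrac{\partial f}{\partial V_{GS}}\left(1 - R_S\dfrac{\partial I_D}{\partial V_{in}}\right) = g_m(1 - R_S G_m)
$$

Then

$$
G_m = \dfrac{g_m}{1 + g_m R_S}
$$

The source degeneration resistor adds an extra term to the total transconductance, which partly cancels the nonlinearity of $g_m$. If $R_S$ is large enough ($g_m R_S \gg 1$), $G_m \approx 1/R_S$, which is completely determined by the external resistor $R_S$.

The AC gain is

$$
A_v = -G_m R_D = -\dfrac{g_m R_D}{1 + g_m R_S}
$$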
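The linearizing effect of degeneration can be illustrated numerically with the CLM-free, body-effect-free expression $G_m = g_m/(1 + g_m R_S)$. The function names and the numbers below are illustrative assumptions.

```python
def degenerated_gm(gm: float, rs: float) -> float:
    """Equivalent transconductance of a CS stage with source resistor R_S."""
    return gm / (1.0 + gm * rs)

def cs_degenerated_gain(gm: float, rd: float, rs: float) -> float:
    """Voltage gain -G_m * R_D, neglecting CLM and body effect."""
    return -degenerated_gm(gm, rs) * rd

if __name__ == "__main__":
    rs = 10e3
    for gm in (1e-3, 2e-3):   # g_m doubles with bias
        print(f"g_m = {gm * 1e3:.1f} mS -> G_m = {degenerated_gm(gm, rs) * 1e6:.1f} uS")
    print("1/R_S =", 1.0 / rs * 1e6, "uS")
```

With $R_S = 10\ \text{k}\Omega$, doubling $g_m$ from 1 mS to 2 mS changes $G_m$ by less than 5%, and both values are close to $1/R_S$. The price is a lower gain than the undegenerated $-g_m R_D$.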

If CLM and body effect are not neglected, the small signal model is shown below

CS Stage with Source Degeneration with CLM

It can be proven that

$$
G_m = \dfrac{g_m r_o}{R_S + \left[1 + (g_m + g_{mb})R_S\right]r_o}
$$

2.3 Common-Drain Stage (Source Follower, SF)

Our analysis of the common-source stage indicates that, to achieve a high voltage gain with limited supply voltage, the load impedance must be as large as possible. If such a stage is to drive a low-impedance load, then a “buffer” must be placed after the amplifier so as to drive the load with negligible reduction in gain. The source follower (also called the “common-drain” stage) can operate as a voltage buffer.

Source Follower and Serving as a Buffer

We know the CS stage has a high output impedance, mainly set by the load resistor. If the input impedance of the next stage is small, the output voltage drops and only part of the signal enters the next stage. With a source follower in between, the total output impedance decreases, and the stage therefore has a better driving ability.

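We note that for $V_{in} < V_{TH}$, M1 is off and $V_{out} = 0$. As $V_{in}$ exceeds $V_{TH}$, M1 turns on in saturation (because at this point $V_{DS} > V_{GS} - V_{TH}$) and $I_D$ flows through $R_S$. As $V_{in}$ increases further, $V_{out}$ follows the input with a difference (level shift) equal to $V_{GS}$. We can express the input-output characteristic as

$$
\dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH} - V_{out})^2 R_S = V_{out}
$$

when CLM is neglected. By taking the derivative with respect to $V_{in}$ and applying the chain rule

$$
\dfrac{\partial V_{TH}}{\partial V_{in}} = \dfrac{\partial V_{TH}}{\partial V_{SB}}\cdot\dfrac{\partial V_{SB}}{\partial V_{in}} = \eta\dfrac{\partial V_{out}}{\partial V_{in}}
$$

we get

$$
\dfrac{\partial V_{out}}{\partial V_{in}} = \dfrac{\mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH} - V_{out})R_S}{1 + \mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH} - V_{out})R_S(1 + \eta)}
$$

Plugging in the transconductance $g_m = \mu_n C_{ox}\dfrac{W}{L}(V_{in} - V_{TH} - V_{out})$,

$$
A_v = \dfrac{g_m R_S}{1 + (g_m + g_{mb})R_S}
$$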

Using the small signal model, the same conclusion $A_v = \dfrac{g_m R_S}{1 + (g_m + g_{mb})R_S}$ is reached more easily

Small Signal Model of SF

As $V_{in}$ increases, the AC gain increases from 0 toward $\dfrac{g_m}{g_m + g_{mb}}$. Since $g_{mb}$ always exists, the actual gain of an SF is slightly smaller than 1.
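The limit of the follower gain can be seen numerically from $A_v = g_m R_S / (1 + (g_m + g_{mb}) R_S)$. In the sketch below, `eta` is the ratio $g_{mb}/g_m$; its default of 0.25 and the other numbers are assumptions for illustration.

```python
def sf_gain(gm: float, rs: float, eta: float = 0.25) -> float:
    """Small-signal gain of a source follower with load R_S (no CLM).

    eta = g_mb / g_m models the body effect (assumed value).
    """
    gmb = eta * gm
    return gm * rs / (1.0 + (gm + gmb) * rs)

if __name__ == "__main__":
    gm = 1e-3
    for rs in (1e3, 10e3, 1e6):
        print(f"R_S = {rs:9.0f} ohm -> A_v = {sf_gain(gm, rs):.3f}")
    print("limit 1/(1+eta) =", 1.0 / 1.25)
```

Even with a very large load the gain saturates at $1/(1+\eta)$, here 0.8, rather than 1; tying the bulk to the source ($\eta = 0$) removes this loss.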

Gain Variation of SF

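For the same reason as in the CS stage, we can replace $R_S$ with a current source. But the improved circuit still has a problem: nonlinearity. When $V_{in}$ rises, $V_{GS}$ rises, $I_D$ rises, $V_{out}$ rises, and $V_{GS}$ falls back, so the output appears to track the input perfectly. However, a higher $V_{out}$ raises $V_{SB}$ and therefore $V_{TH}$, so a larger $V_{GS}$ is needed to carry the same current. The rise of $V_{out}$ must therefore be slower than that of $V_{in}$, by a signal-dependent amount, which causes nonlinearity.

Usually the current source is implemented with a biased MOS. Equate the two current equations

$$
\dfrac{1}{2}\mu_n C_{ox}\left(\dfrac{W}{L}\right)_1(V_{in} - V_{out} - V_{TH1})^2 = \dfrac{1}{2}\mu_n C_{ox}\left(\dfrac{W}{L}\right)_2(V_b - V_{TH2})^2
$$

We can see the relation between input and output is broadly linear

$$
V_{out} = V_{in} - V_{TH1} - \sqrt{\dfrac{(W/L)_2}{(W/L)_1}}(V_b - V_{TH2})
$$

We apply a feedback loop to adjust $V_b$ in order to stabilize the DC working point.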


Obviously the SF has a high input impedance. We now check the output impedance of an SF with a current source load.

SF with Current Source Load

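At the source node, with a test voltage $V_X$ and test current $I_X$,

$$
I_X - g_m V_X - g_{mb}V_X = 0
$$

giving

$$
R_{out} = \dfrac{V_X}{I_X} = \dfrac{1}{g_m + g_{mb}}
$$

Since $g_m$ is usually large enough, the output impedance is small.

We know the body effect causes part of the nonlinearity. This can be solved if the bulk is tied to the source, which means replacing all MOS devices with PMOS.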

SF Implemented with PMOS

We must replace all devices because all NMOS’s share the same substrate potential GND. This topology has less nonlinearity but lower mobility of PMOS also yields higher output impedance.

Source followers also shift the DC level of the signal by $V_{GS}$, thereby consuming voltage headroom and limiting the voltage swings. To understand this point, consider the example illustrated in the following figure, a cascade of a common-source stage and a source follower. Without the source follower, the minimum allowable value of $V_X$ would be equal to $V_{GS1} - V_{TH1}$ (for M1 to remain in saturation). With the source follower, on the other hand, $V_X$ must be greater than $V_{GS2} + (V_{GS3} - V_{TH3})$ so that M3 is saturated. For comparable overdrive voltages in M1 and M3, this means the allowable swing at $X$ is reduced by $V_{GS2}$, a substantial amount.

SF and CS in Cascade

2.4 Common-Gate Stage (CG)

It is also possible to apply the signal to the source terminal.

CG with Direct and Capacitive Coupling

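Note that the circuit needs a DC bias (the gate bias voltage $V_b$, and a DC current path at the source in the capacitively coupled version) so that no node is left floating. The stage senses the input at the source terminal and produces the output signal at the drain.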

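We again study the DC characteristic first. Take the direct-coupled topology. When $V_{in} \ge V_b - V_{TH}$, the device is off and $V_{out} = V_{DD}$. As $V_{in}$ decreases below $V_b - V_{TH}$, the device enters sat. and

$$
I_D = \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_b - V_{in} - V_{TH})^2
$$

and

$$
V_{out} = V_{DD} - \dfrac{1}{2}\mu_n C_{ox}\dfrac{W}{L}(V_b - V_{in} - V_{TH})^2 R_D
$$

Clearly as $V_{in}$ decreases so does $V_{out}$, hence CG is a non-inverting topology, unlike CS. The small signal gain follows by taking the derivative with respect to $V_{in}$, as we have done many times:

$$
\dfrac{\partial V_{out}}{\partial V_{in}} = -\mu_n C_{ox}\dfrac{W}{L}(V_b - V_{in} - V_{TH})\left(-1 - \dfrac{\partial V_{TH}}{\partial V_{in}}\right)R_D = g_m(1 + \eta)R_D
$$

where $\partial V_{TH}/\partial V_{in} = \partial V_{TH}/\partial V_{SB} = \eta$ and $g_m = \mu_n C_{ox}\dfrac{W}{L}(V_b - V_{in} - V_{TH})$. As $V_{in}$ continues dropping, the device eventually enters tri.

Interestingly, body effect increases the equivalent transconductance of the stage. From the equation, we can increase $g_m$ by increasing $W/L$ to raise the gain, but the device should not approach subthreshold operation, $V_{GS} \approx V_{TH}$. If you widen the device too much while $I_D$ remains constant, the charge density per unit area becomes too small to form a strong inversion layer, and the device enters the subthreshold region. Hence, the subthreshold transconductance is an upper limit on the sat. transconductance.

In the capacitively coupled topology, the minimum allowable level of $V_{out}$ equals $V_{GS} - V_{TH} + V_{I1}$, where $V_{I1}$ is the minimum voltage the current source requires to work properly.

For the input impedance, we note that with CLM neglected, the impedance seen at the source of M1 equals $\dfrac{1}{g_m + g_{mb}}$. Thus, the body effect decreases the input impedance of the common-gate stage, which is a drawback in voltage-input cases.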

If we draw the small signal model the result will be the same. Now suppose the current source has a finite resistance $R_S$ (otherwise the DC point is not well-defined) and take the $r_o$ of the transistor into account. The small signal model should be

CG with Non-ideal Current Source

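Applying KCL, you can obtain

$$
A_v = \dfrac{(g_m + g_{mb})r_o + 1}{r_o + (g_m + g_{mb})r_o R_S + R_S + R_D}\,R_D
$$

Compared with a degenerated CS stage, the gain of the common-gate stage is slightly higher due to body effect.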

We now calculate the input and output impedance separately.

CG Input Impedance

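At node X

$$
I_X = (g_m + g_{mb})V_X + \dfrac{V_X - I_X R_D}{r_o}
$$

which gives

$$
R_{in} = \dfrac{V_X}{I_X} = \dfrac{R_D + r_o}{1 + (g_m + g_{mb})r_o} \approx \dfrac{R_D}{(g_m + g_{mb})r_o} + \dfrac{1}{g_m + g_{mb}}
$$

Usually $(g_m + g_{mb})r_o \gg 1$ and $R_D \ll r_o$, so $R_{in} \approx \dfrac{1}{g_m + g_{mb}}$ and the CG stage has a low input impedance.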

Then set the input voltage to 0 but keep the source impedance $R_S$ to calculate the output impedance.

CG Output Impedance

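Draw the small signal model and write KCL at the source terminal, where a test voltage $V_X$ at the drain drives a current $I_X$ and the source sits at $I_X R_S$:

$$
I_X = \dfrac{V_X - I_X R_S}{r_o} - (g_m + g_{mb})I_X R_S
$$

which gives

$$
R_{out} = \dfrac{V_X}{I_X} = \left[1 + (g_m + g_{mb})r_o\right]R_S + r_o
$$

This is a very high value. Hence, CG has a low input impedance and a high output impedance. This impedance characteristic makes it suitable as a current buffer or impedance transformer. We loosely say that a transistor transforms its source resistance up and its drain resistance down (when seen at the appropriate terminal).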
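The impedance-transforming behavior can be made concrete with numbers. The sketch below evaluates the standard small-signal results $R_{in} = (R_D + r_o)/(1 + (g_m + g_{mb})r_o)$ and $R_{out} = [1 + (g_m + g_{mb})r_o]R_S + r_o$ (the latter without the drain load in parallel); the device values are illustrative assumptions.

```python
def cg_input_resistance(gm: float, gmb: float, ro: float, rd: float) -> float:
    """Resistance seen looking into the source of a CG stage with drain load R_D."""
    return (rd + ro) / (1.0 + (gm + gmb) * ro)

def cg_output_resistance(gm: float, gmb: float, ro: float, rs: float) -> float:
    """Resistance seen looking into the drain with source resistance R_S
    (the drain load itself is not included)."""
    return (1.0 + (gm + gmb) * ro) * rs + ro

if __name__ == "__main__":
    gm, gmb, ro = 1e-3, 0.25e-3, 100e3   # assumed device parameters
    print(f"R_in  = {cg_input_resistance(gm, gmb, ro, rd=5e3):.1f} ohm")
    print(f"R_out = {cg_output_resistance(gm, gmb, ro, rs=1e3) / 1e3:.1f} kohm")
```

With these numbers a 5 kΩ drain load and a 100 kΩ $r_o$ appear as roughly 830 Ω at the source, while a 1 kΩ source resistance appears as about 226 kΩ at the drain.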

Impedance Transfer of CG

2.5 Cascode

2.5.1 Classical Cascode

As mentioned in the last section, CG is suitable for receiving a current input. We also know that the CS topology converts a voltage input to a current output. The cascade of a CS stage and a CG stage is called a cascode topology.

Cascode Stage

Instead of using the small signal model (of course you can), we reason about the behavior of each device. With CLM and body effect neglected, we inspect small variations on $V_{in}$ and $V_b$ separately (from which the AC parameters follow).

Analysis of Cascode Stage

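When $V_{in}$ changes by a small value $\Delta V_{in}$, the current changes by $\Delta I_D = g_{m1}\Delta V_{in}$. For small signals, $V_b$ is regarded as grounded, so $V_X$ must change to adjust $V_{GS2}$ to match $\Delta I_D$. Since $\Delta I_D = -g_{m2}\Delta V_X$, then

$$
\Delta V_X = -\dfrac{g_{m1}}{g_{m2}}\Delta V_{in}
$$

$V_{out}$ is the supply minus the voltage drop on $R_D$, therefore

$$
\Delta V_{out} = -R_D\,\Delta I_D = -g_{m1}R_D\,\Delta V_{in}
$$

By now, we have figured out the AC gain without applying the small signal model. Such analysis is much more practical in complex systems: as you can imagine, it is tedious and impractical to draw small signal models for a large-scale analog chip with hundreds of MOSFETs.


Now inspect a perturbation on the gate of M2, $V_b$, while the input of M1 is tied to a constant DC level. In this case M1 works as a constant DC current source. No matter how $V_b$ changes, $V_X$ must change with it to match the constant current below. Since $I_D$ remains constant, the drop on $R_D$ also remains constant, thus $V_{out}$ never changes. We say the input of M2 is isolated from $V_{out}$.

Note that $V_X$ always follows $V_b$ to keep $V_{GS2}$ unchanged, so we get

$$
\dfrac{\partial V_X}{\partial V_b} = 1,\qquad \dfrac{\partial V_{out}}{\partial V_b} = 0
$$

To bias both devices in sat., we must guarantee $V_X \ge V_{in} - V_{TH1}$ for M1. Plugging in $V_X = V_b - V_{GS2}$ (where $V_{GS2}$ is the value given by the sat. current equation of M2), we get

$$
V_b \ge V_{in} - V_{TH1} + V_{GS2}
$$

For M2 to be in sat., $V_{out} \ge V_b - V_{TH2}$, that is, with $V_b$ chosen at its minimum,

$$
V_{out} \ge V_{in} - V_{TH1} + V_{GS2} - V_{TH2}
$$

One of the drawbacks of the cascode is that the output swing is limited.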


We now analyze the large-signal behavior of the cascode stage as goes from zero to . Suppose CLM and body effect are neglected and is biased properly. When , M1 is off, but M2 is not off. This is because there’re parasitic cap. in M2, which can be equivalent to cap. from source of M2 to GND. The supply should charge to the cap. first, at this time M2 is on. The charging process will continue until , which means that the device has no more ability to support current to charge the cap. Thus, the source of M2 is not floated but tied to a charged cap., with voltage . To speak precisely, M2 is in subthreshold conduction.

As $V_{in}$ exceeds $V_{TH1}$, M1 enters sat. and $I_D$ increases as $V_{in}$ grows, driving $V_{out}$ to decrease. In this region both devices are in sat. and the amplifier works properly. If $V_{in}$ continues to increase until $V_{out}$ drops below $V_b$ by $V_{TH2}$, M2 will enter tri.

Cascode Voltage Transfer Curve

We still draw the small signal model and try to find the total transconductance and output impedance.

Small Signal Model of Cascode

All current flows through , so (Mind the direction). To figure out , we must take CLM into consideration.

Output Impedance of Cascode

Suppose the AC voltage on the M2 source is . Then by KCL

Then

This is a very large value. Thus, cascode is suitable for receiving a voltage input and giving a current output.

Note that the output impedance of M1 seen at its drain is $r_{o1}$; hence we can say that a CG device amplifies impedance by a factor of its intrinsic gain.

Finally we get the AC gain

The cascode makes full use of the intrinsic gain of both devices. If the two devices are identical,

If body effect is not negligible, and , yielding . Thus, body effect slightly increases the voltage gain.
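As a rough numeric check of the cascode results, the following minimal sketch evaluates the output resistance and the gain. It assumes the standard expression $R_{out} = (1 + g_{m2}r_{o2})r_{o1} + r_{o2}$ (body effect neglected), an ideal current-source load, and $G_m \approx g_{m1}$. The function names and device values are hypothetical.

```python
def cascode_rout(gm2, ro1, ro2):
    """Output resistance seen at the drain of M2 (body effect neglected)."""
    return (1.0 + gm2 * ro2) * ro1 + ro2


def cascode_gain(gm1, gm2, ro1, ro2):
    """Voltage gain with an ideal current-source load, assuming Gm ~ gm1."""
    return -gm1 * cascode_rout(gm2, ro1, ro2)


# Hypothetical, identical devices: gm = 1 mS, ro = 20 kOhm
gm, ro = 1e-3, 20e3
rout = cascode_rout(gm, ro, ro)
av = cascode_gain(gm, gm, ro, ro)
intrinsic_gain = gm * ro

print(f"Rout = {rout / 1e3:.1f} kOhm")
print(f"|Av| = {abs(av):.1f}, (gm*ro)^2 = {intrinsic_gain ** 2:.1f}")
```

With these numbers the exact gain is slightly above $(g_m r_o)^2$ because of the extra $r_{o1} + r_{o2}$ terms; the product of the two intrinsic gains is the usual approximation.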


In fact we can stack many CG stages to boost the output impedance by a factor of (suppose all CG devices are identical). However, in practice no one does this, because more CG stages limit the output swing. If we use 2 CG devices, the output swing should be

which raises the lowest allowable $V_{out}$ significantly.


There is a trade-off. Recalling that and , we summarize

The signal path is that . When we increase , increases but decreases so we need to trade between the transconductance and output impedance.

A cascode structure need not operate as an amplifier. Another popular application of this topology is in building constant current sources. The high output impedance yields a current source closer to the ideal, but at the cost of voltage headroom.

Cascode with PMOS Cascode Current Source

2.5.2 Folded Cascode

The classical cascode has a problem: there are too many devices on one path from VDD to GND, which limits the swing because we have to guarantee sat. for all devices, making the design more difficult. To solve this problem, we use the folded cascode.

Folded Cascode

In the folded cascode, the CS device and the CG device are placed on two separate paths and can be designed independently, but at the cost of roughly double the current and hence higher power consumption (because both paths need a while the classical cascode only needs one ).

Usually the folded cascode is biased with a current source . We assume the source has a finite impedance . We try to calculate $G_m$ and $R_{out}$ without the small signal model. In this circuit, M1 converts $V_{in}$ to a current delivered to M2. This conversion is done by M1 alone, so the total current produced by M1 is $g_{m1}V_{in}$. This current has 3 paths to GND: resistance of M1 , resistance of M2 and transconductance of M2 . If you are confused, try using the small signal model to help understand, but do not write out the KCL equations. Only the current into serves as the output current. So the total transconductance is

And for , we apply the conclusion of impedance transformation. View from the output of CS stage, there’re two impedances: to real GND and to AC GND (). Hence the output impedance of CS stage is . Then directly multiply an intrinsic gain of M2

Parallel connection decreases the output impedance and further decreases the gain. This is also one of the costs of the folded cascode.

The DC characteristic is

DC VTC of Folded Cascode

3. Differential Amplifiers

3.1 Single-Ended and Differential Operation

All circuits in the last chapter deal with single-ended signals, which use GND as the reference. A differential signal is defined as one that is measured between two nodes that have equal and opposite signal excursions around a fixed potential. This potential is called the common-mode (CM) level.

Single-Ended and Differential Signals

Usually, we can decompose the two inputs:

That is how we decompose common-mode component and differential mode component.
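The decomposition can be sketched numerically as follows. This is a minimal illustration assuming the usual convention $V_{CM} = (V_1 + V_2)/2$ and $V_{DM} = V_1 - V_2$, so that $V_{1,2} = V_{CM} \pm V_{DM}/2$; the function names and voltages are hypothetical.

```python
def decompose(v1, v2):
    """Split two node voltages into (common-mode, differential-mode) parts."""
    v_cm = (v1 + v2) / 2.0
    v_dm = v1 - v2
    return v_cm, v_dm


def recompose(v_cm, v_dm):
    """Rebuild the two node voltages from the CM and DM components."""
    return v_cm + v_dm / 2.0, v_cm - v_dm / 2.0


v_cm, v_dm = decompose(1.25, 0.75)
print("CM =", v_cm, "DM =", v_dm)

# The same disturbance added to both lines only moves the CM part.
disturbance = 0.5
n_cm, n_dm = decompose(1.25 + disturbance, 0.75 + disturbance)
print("with disturbance: CM =", n_cm, "DM =", n_dm)
```

A disturbance that is identical on both nodes shows up only in the CM component, while the DM component is unchanged.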

So why do we use differential signals, which seem more complicated than single-ended signals? Consider the case below:

Distortion of Differential Signals

A clock line couples noise onto neighboring lines through the parasitic cap. between them. If a single-ended signal is carried on a wire close to the clock, the signal will be sensitive to the clock perturbation. But if the signal is carried on two symmetrically placed lines, as a differential signal, the perturbation from the clock acts the same on both lines (amplitude and phase). Then the identical noise cancels in the subtraction that yields the differential-mode signal. The noise ends up entirely in the common-mode component, which we do not care about at all.

3.2 Differential Pairs

3.2.1 Pseudo-Differential Pair

How do we amplify a differential signal? The simplest idea is to amplify the two branches separately, with the same gain.

Pseudo-Differential Pair

To implement differential characteristics, all components on both paths must be identical. Here the two differential inputs, $V_{in1}$ and $V_{in2}$, have a certain CM level $V_{in,CM}$. It’s not difficult to see that the CM level in this circuit sets the DC operating point of the devices. Such a circuit also offers good supply noise rejection (supply noise affects the two paths equally because they are symmetric).

Since this is just two CS stages with inverted differential input, neglecting CLM and body effect, the AC gain will be

Because the two paths are almost completely independent, we call this topology a pseudo-differential pair.

All problems in fundamental CS stage also appear on this topology. Different CM bias will influence so it suffers from severe nonlinearity. Moreover, if the input CM level is excessively low, the minimum values of and may in fact turn off M1 and M2, leading to severe clipping at the output.

Input and Output of Pseudo-Differential Pair

On the other hand, the pseudo-differential pair also amplifies the useless CM signal, which reduces the available output swing.

3.2.2 Basic Differential Pair

A simple modification can resolve the above issue. In this topology the two paths are coupled with a constant current source, usually implemented by a MOS device. This topology is called source-coupled pair.

Basic Differential Pair

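The current source introduces a constraint between the two paths: $I_{D1} + I_{D2} = I_{SS}$. This makes the bias currents independent of the input CM level. Thus, if $V_{in1} = V_{in2}$, then $I_{D1} = I_{D2} = I_{SS}/2$ and the output CM level is $V_{DD} - R_D I_{SS}/2$.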


We scan from to to research the DM behavior. If is much more negative than , M1 is off and M2 is on. Then and , . With increasing and decreasing, M1 carries larger current, drops and rises. The output reaches the crossing point when . According to symmetry, the behavior of the positive part should be the same. With the analysis above, we can draw the DC behavior curve.

DM Behavior of Basic Differential Pair

Note that the circuit contains three differential quantities: , , and .


Then turn the view to CM behavior. We set and scan the CM level from 0 to . The symmetry requires that .

When $V_{in,CM} = 0$, both M1 and M2 are off, so no current flows in either path. Note that in this case $I_{SS}$ cannot work properly and no longer provides a constant current. If the current source is implemented with a MOS device (M3), it must be in deep tri. In this case we should model M3 as a resistor.

CM Analysis of Basic Differential Pair

When is sufficiently positive to turn on M1 and M2 in sat., the structure composed of M1 and behaves as a source follower with respect to node P. Consequently, follows shifted by a gate-source voltage . While M3 remains in tri., is governed by . As rises, both and increase until M3 reaches the edge of sat. Once M3 enters sat., the total current stabilizes at a constant and the circuit operates properly. We conclude that for proper operation, all devices must be in sat.

Summarize them

As $V_{in,CM}$ rises further, M1 and M2 are expected to enter tri. if

In this region will approach a constant. This sets an upper limit on the input swing

CM Behavior of Basic Differential Pair

Beyond the upper bound, the CM characteristics of do not change, but the differential gain drops.

AC Gain of Basic Differential Pair

Next comes the output swing. If M1 and M2 are to remain in sat.,

Giving

The upper limit is, of course, $V_{DD}$. Then

Notice that smaller leads to smaller and therefore a larger swing.

Implementable Basic Differential Pair

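Now we analyze DM behavior quantitatively. We simply calculate $I_{D1}$ and $I_{D2}$ in terms of $V_{in1}$ and $V_{in2}$, assuming the circuit is symmetric, M1 and M2 are saturated, and no CLM. Since the voltage at node P is equal to $V_{in1} - V_{GS1}$ and $V_{in2} - V_{GS2}$,

$$V_{in1} - V_{in2} = V_{GS1} - V_{GS2}$$

For a square-law device, we have

$$(V_{GS} - V_{TH})^2 = \frac{I_D}{\frac{1}{2}\mu_n C_{ox}\frac{W}{L}}$$

and, therefore,

$$V_{GS} = \sqrt{\frac{2I_D}{\mu_n C_{ox}\frac{W}{L}}} + V_{TH}$$

It follows from the preceding definitions that

$$V_{in1} - V_{in2} = \sqrt{\frac{2I_{D1}}{\mu_n C_{ox}\frac{W}{L}}} - \sqrt{\frac{2I_{D2}}{\mu_n C_{ox}\frac{W}{L}}}$$

We wish to calculate the differential output current, $I_{D1} - I_{D2}$. Squaring the two sides of the equation for the input difference and recognizing that $I_{D1} + I_{D2} = I_{SS}$, we obtain

$$(V_{in1} - V_{in2})^2 = \frac{2}{\mu_n C_{ox}\frac{W}{L}}\left(I_{SS} - 2\sqrt{I_{D1}I_{D2}}\right)$$

That is,

$$\frac{1}{2}\mu_n C_{ox}\frac{W}{L}(V_{in1} - V_{in2})^2 - I_{SS} = -2\sqrt{I_{D1}I_{D2}}$$

Squaring the two sides again and noting that $4I_{D1}I_{D2} = (I_{D1} + I_{D2})^2 - (I_{D1} - I_{D2})^2 = I_{SS}^2 - (I_{D1} - I_{D2})^2$, we arrive at

$$(I_{D1} - I_{D2})^2 = -\frac{1}{4}\left(\mu_n C_{ox}\frac{W}{L}\right)^2(V_{in1} - V_{in2})^4 + I_{SS}\,\mu_n C_{ox}\frac{W}{L}(V_{in1} - V_{in2})^2$$

Thus,

$$I_{D1} - I_{D2} = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}(V_{in1} - V_{in2})\sqrt{\frac{4I_{SS}}{\mu_n C_{ox}\frac{W}{L}} - (V_{in1} - V_{in2})^2}$$

Now we obtain the relationship between the differential current and differential input voltage. We can say that M1, M2, and the tail operate as a voltage-dependent current source producing $I_{D1} - I_{D2}$ according to the above large-signal characteristics. As expected, $I_{D1} - I_{D2}$ is an odd function of $V_{in1} - V_{in2}$, falling to zero for $V_{in1} = V_{in2}$. As $|V_{in1} - V_{in2}|$ increases from zero, $|I_{D1} - I_{D2}|$ increases because the factor preceding the square root rises more rapidly than the argument in the square root drops.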

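Before examining further, it is instructive to calculate the slope of the characteristic, i.e., the equivalent $G_m$ of M1 and M2. Denoting the differential quantities $I_{D1} - I_{D2}$ and $V_{in1} - V_{in2}$ by $\Delta I_D$ and $\Delta V_{in}$, respectively, the reader can show that

$$\frac{\partial \Delta I_D}{\partial \Delta V_{in}} = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}\,\frac{\frac{4I_{SS}}{\mu_n C_{ox} W/L} - 2\Delta V_{in}^2}{\sqrt{\frac{4I_{SS}}{\mu_n C_{ox} W/L} - \Delta V_{in}^2}}$$

For $\Delta V_{in} = 0$, $G_m$ is maximum and equal to $\sqrt{\mu_n C_{ox}(W/L)I_{SS}}$. Moreover, since $V_{out1} - V_{out2} = -R_D\,\Delta I_D = -R_D G_m \Delta V_{in}$, we can write the small-signal differential voltage gain of the circuit in the equilibrium condition as

$$|A_v| = \sqrt{\mu_n C_{ox}\frac{W}{L}I_{SS}}\;R_D$$

Since each transistor carries a bias current of $I_{SS}/2$ in this condition, the factor $\sqrt{\mu_n C_{ox}(W/L)I_{SS}}$ is in fact the same as the transconductance of each device, that is, $|A_v| = g_m R_D$. The derivation also suggests that $G_m$ falls to zero for $\Delta V_{in} = \sqrt{2I_{SS}/(\mu_n C_{ox}W/L)}$. As we will see below, this value of $\Delta V_{in}$ plays an important role in the operation of the circuit.

Let us now examine the expression more closely. If $|\Delta V_{in}| \ll \sqrt{4I_{SS}/(\mu_n C_{ox}W/L)}$, then

$$\Delta I_D \approx \sqrt{\mu_n C_{ox}\frac{W}{L}I_{SS}}\;\Delta V_{in}$$

which yields the same equilibrium $G_m$ as that obtained above.

But what happens for larger values of $\Delta V_{in}$? It appears that the argument in the square root drops to zero for $\Delta V_{in} = \sqrt{4I_{SS}/(\mu_n C_{ox}W/L)}$ and $\Delta I_D$ crosses zero at two different values of $\Delta V_{in}$, an effect not predicted by our qualitative analysis. This conclusion, however, is incorrect. To understand why, recall that the expression was derived with the assumption that both M1 and M2 are on. In reality, as $\Delta V_{in}$ exceeds a limit, one transistor carries the entire $I_{SS}$, turning off the other. Denoting this value by $\Delta V_{in1}$, we have $I_{D1} = I_{SS}$ and $\Delta V_{in1} = V_{GS1} - V_{TH}$ because M2 is nearly off. It follows that

$$\Delta V_{in1} = \sqrt{\frac{2I_{SS}}{\mu_n C_{ox}\frac{W}{L}}}$$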

This value means that if you want to make a constant value, you must provide an input range of at least . This requires you to bias the at a proper level so that the interval between the CM level and the upper/lower limit of the input range is large enough.

The value of $\Delta V_{in1}$ in essence represents the maximum differential input that the circuit can “handle”. It is possible to relate $\Delta V_{in1}$ to the overdrive voltage of M1 and M2 in equilibrium. For a zero differential input, $I_{D1} = I_{D2} = I_{SS}/2$, yielding the overdrive voltage

$$(V_{GS} - V_{TH})_{1,2} = \sqrt{\frac{I_{SS}}{\mu_n C_{ox}\frac{W}{L}}}$$

Thus, $\Delta V_{in1}$ is equal to $\sqrt{2}$ times the equilibrium overdrive. The conclusion matches the intuition. Recall the current formula in sat.

In equilibrium $I_D = I_{SS}/2$. When one of the devices is off (meaning the current in the other device reaches $I_{SS}$, i.e., doubles from equilibrium), the overdrive of the conducting device must become $\sqrt{2}$ times the equilibrium value. The change is provided by the differential-mode input. Note that floats, so when rises half of , the overdrive becomes .
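The large-signal characteristic can be checked numerically. The following is a minimal sketch assuming the square-law result $\Delta I_D = \frac{1}{2}\beta\,\Delta V_{in}\sqrt{4I_{SS}/\beta - \Delta V_{in}^2}$ with $\beta = \mu_n C_{ox} W/L$, clamped to $\pm I_{SS}$ once one device turns off. The name `beta`, the function names and the numeric values are hypothetical.

```python
import math


def delta_id(dvin, iss, beta):
    """Differential current I_D1 - I_D2 of a square-law pair (no CLM)."""
    dvin_max = math.sqrt(2.0 * iss / beta)   # one device carries all of I_SS
    if dvin >= dvin_max:
        return iss
    if dvin <= -dvin_max:
        return -iss
    return 0.5 * beta * dvin * math.sqrt(4.0 * iss / beta - dvin * dvin)


def equilibrium_overdrive(iss, beta):
    """Overdrive of M1 and M2 when each carries I_SS / 2."""
    return math.sqrt(iss / beta)


iss, beta = 1e-3, 2e-3                        # 1 mA tail, 2 mA/V^2
dvin_max = math.sqrt(2.0 * iss / beta)

for dv in (-1.5, -0.5, 0.0, 0.5, 1.5):
    print(f"dVin = {dv:+.1f} V -> dId = {delta_id(dv, iss, beta) * 1e3:+.3f} mA")

# Ratio of the maximum differential input to the equilibrium overdrive
print("dVin_max / overdrive =", dvin_max / equilibrium_overdrive(iss, beta))

# Small-signal Gm around dVin = 0, compared with sqrt(beta * I_SS)
step = 1e-6
gm_numeric = delta_id(step, iss, beta) / step
print("Gm(0) =", gm_numeric, "expected =", math.sqrt(beta * iss))
```

The clamp is what removes the spurious zero crossings of the unclamped formula: beyond $\Delta V_{in1}$ the expression is no longer valid, and the current simply stays at $\pm I_{SS}$.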


Then comes the small signal analysis. We suppose M3 is a constant current source and cut it off in AC diagram. Denote the source terminal shared by the two devices as S and the influence on of the two paths and , separately. Hence the variation of is

Note that the two paths are completely symmetric and inputs are inverted. So and . Thus

This is a remarkable property of the differential pair. The voltage of node S follows the input in CM operation but remains constant in AC analysis. We call S a “virtual ground” because it in fact serves as an AC ground. That is because the inverted inputs produce opposite contributions in the two paths, which cancel each other.

With the two paths completely symmetric,

then


But PVT may perturb the value of . If the two devices are not precisely symmetric, there will be mismatch. Mismatch of will break the virtual ground property. Instead, node S will be equivalent to a controlled voltage source. This mismatch introduces an input offset voltage, which you may have heard of if you have studied op-amps.

At node S, the AC current is

leads to (the summed term vanishes because of the inverted phase)

giving

3.2.3 Degenerated Differential Pair

Like the single-stage CS, differential pairs can also incorporate resistive degeneration to improve their linearity.

Degenerated Differential Pair

The principle is the same as that in CS stage. To analyze, we introduce the half-circuit technique. This technique is efficient because the two paths in a differential pair are completely symmetric. We analyze common-mode and differential-mode separately.

First comes the common mode. Suppose M1 and M2 are both in sat. When the DM voltage equals 0, each path carries a current of $I_{SS}/2$. Since the two paths are symmetric, the current source can be split equally into two halves. The half circuit is shown below.

Common Mode Half of Degenerated Differential Pair

Note that the degeneration reduces the headroom by $R_S I_{SS}/2$ because each $R_S$ carries a voltage drop of that value in equilibrium, limiting the CM voltage level. The following circuit removes this limitation by separating the resistor from the current source.

Improvement of Degenerated Differential Pair

In DM analysis you can see that the resistor and the current source are parallel.


Then the differential mode. Similarly to the analysis in the last section, the node between source resistors is the virtual GND. Then in DM analysis current source vanishes and the DM half circuit is a typical source degenerated CS stage.

Differential Mode Half of Degenerated Differential Pair

Then the small signal gain should be

Thus, the circuit trades gain for linearity. Linearity improvement is shown below.

VTC of Degenerated Differential Pair

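The degeneration widens the input voltage swing. Suppose the CM input is biased at a proper level. Then increase the DM input until one device turns off. In this case the other path carries all of $I_{SS}$ and the off device has $V_{GS} = V_{TH}$. Suppose M2 is off; we have

$$V_{in1} - V_{GS1} - R_S I_{SS} = V_{in2} - V_{TH}, \qquad V_{GS1} - V_{TH} = \sqrt{\frac{2I_{SS}}{\mu_n C_{ox}\frac{W}{L}}}$$

which yields

$$V_{in1} - V_{in2} = \sqrt{\frac{2I_{SS}}{\mu_n C_{ox}\frac{W}{L}}} + R_S I_{SS}$$

Note that the first term on the right-hand side is the input swing before degeneration. So the conclusion is that the degeneration increases the input swing by $R_S I_{SS}$.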
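The gain-for-linearity trade can be illustrated with a short sketch. It assumes the standard degenerated half-circuit gain $A_v = -g_m R_D/(1 + g_m R_S)$ and the input range $\sqrt{2I_{SS}/\beta} + R_S I_{SS}$ with $\beta = \mu_n C_{ox} W/L$; the symbol names and component values are hypothetical.

```python
import math


def degenerated_gain(gm, r_d, r_s):
    """Small-signal gain of the DM half circuit with source degeneration."""
    return -gm * r_d / (1.0 + gm * r_s)


def max_diff_input(iss, beta, r_s=0.0):
    """Differential input at which one device just turns off."""
    return math.sqrt(2.0 * iss / beta) + r_s * iss


iss, beta = 1e-3, 2e-3          # 1 mA tail, 2 mA/V^2
gm = math.sqrt(beta * iss)      # per-device gm at I_D = I_SS / 2
r_d = 10e3

for r_s in (0.0, 500.0, 1000.0):
    print(f"R_S = {r_s:6.0f} Ohm: "
          f"Av = {degenerated_gain(gm, r_d, r_s):7.2f}, "
          f"dVin_max = {max_diff_input(iss, beta, r_s):.2f} V")
```

Each increase in the degeneration resistance lowers the gain magnitude and extends the usable differential input range.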


We need to clarify the half circuit technique. The essence of this technique is symmetry. Thus, if there are connections between the two symmetric paths, the connection should be divided into two parts. Take the circuit below for example.

For common mode, the voltages at the two ends of and are the same, so no current flows through them and they are equivalent to open circuits. The half-circuit is a CS stage with a current source load. For differential mode, because the voltage variations at the two sides of and are opposite, the voltage at the midpoint remains constant. Hence, the midpoint is a virtual ground. We should split and into two components in series.

Obviously the half circuit should be

Note that the virtual ground (DM ground) is not the small signal ground (AC ground). The former is based on the symmetry in differential pairs, whose equivalence is still DC ground because DM signal is still large signal. But the latter is based on linearity. Hence, the DM ground should be regarded as DC ground in analysis.

3.3 Common-Mode Response

In reality the circuit will not be ideal. Generally in a differential pair, either the asymmetry or the finite impedance of tail current source will introduce common-mode components in the output differential-mode signal.

First consider the impedance of tail current source.

Differential Pair with Finite tail Impedance

We first assume that the circuit is symmetric. In each path

By applying half circuit technique,

And by KCL

Combine the two equations,

Then the variation of $V_{in,CM}$ leads to a variation of $V_{out}$; the gain is

In a symmetric circuit, input CM variations disturb the bias points, altering the small-signal gain and possibly limiting the output voltage swings.
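To get a feel for the size of the CM gain, here is a minimal sketch. It assumes the standard result for a symmetric pair with tail impedance $R_{SS}$, $A_{v,CM} = -\dfrac{R_D/2}{1/(2g_m) + R_{SS}}$, where $g_m$ is the transconductance of each device; the function name and values are hypothetical.

```python
def cm_gain(gm, r_d, r_ss):
    """Common-mode gain of a symmetric pair with finite tail impedance."""
    return -(r_d / 2.0) / (1.0 / (2.0 * gm) + r_ss)


gm, r_d = 1e-3, 10e3            # 1 mS per device, 10 kOhm loads
for r_ss in (10e3, 100e3, 1e6):
    print(f"R_SS = {r_ss / 1e3:7.0f} kOhm -> Av,CM = {cm_gain(gm, r_d, r_ss):.4f}")
```

The CM gain shrinks as the tail impedance grows, which is why a high-impedance tail current source is preferred.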


Next comes the asymmetry. Suppose the load resistors $R_D$ in the two paths are not completely identical.

Common-mode response in the presence of resistor mismatch.

With the conclusion above, the voltage on the two output nodes are

Thus, a common-mode change at the input introduces a differential component at the output. On the other hand, the MOS devices are usually not symmetric. Owing to dimension and threshold voltage mismatches, the two transistors carry slightly different currents and exhibit unequal transconductances. Writing and . Since .

and

We then obtain the output voltages as

The differential is

In other words, the circuit converts input CM variations to a differential error by a factor equal to

Generally, to measure the rejection performance of a differential circuit, we define a parameter called common-mode rejection ratio (CMRR).

If only mismatch is considered,

3.4 Differential Pair with MOS Loads

The load of a differential pair need not be implemented by linear resistors. As with the common-source stages, differential pairs can employ diode-connected or current-source loads.

Differential pair with (a) diode-connected and (b) current-source loads.

The half circuit indicates that it is a CS stage with MOS load. Recall the last chapter, with small signal model

Giving

Replace with device dimensions, we have

The diode-connected loads consume voltage headroom, creating a trade-off between the gain and the output swing, because raising will increase the gain but also limit the max voltage swing. To ease this problem, one technique is to divert part of the current away from M3 and M4 with additional biased devices.

Diode-connected pair with current divided.

In this structure M5 carries 80% of the drain current of M1, so the current in M3 decreases while M1 is not affected. Since $g_m = 2I_D/(V_{GS} - V_{TH})$, the transconductance of M3 drops to 20% of its original value, so the gain is now five times that of the case without current diversion, provided the overdrive of M3 remains constant.
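The factor of five can be reproduced with a small sketch. It assumes the half-circuit gain $A_v \approx -g_{m1}/g_{m3}$ (CLM neglected) and $g_m = 2I_D/V_{ov}$ with fixed overdrives; the function name, the parameter `diverted_fraction` and the bias values are hypothetical.

```python
def diode_load_gain(i_d1, vov1, vov3, diverted_fraction=0.0):
    """Gain of a CS half circuit with a diode-connected load (no CLM).

    diverted_fraction is the share of I_D1 supplied by the extra
    current-source device instead of the diode-connected load.
    """
    gm1 = 2.0 * i_d1 / vov1
    i_d3 = (1.0 - diverted_fraction) * i_d1
    gm3 = 2.0 * i_d3 / vov3
    return -gm1 / gm3


base = diode_load_gain(0.5e-3, 0.2, 0.4)
boosted = diode_load_gain(0.5e-3, 0.2, 0.4, diverted_fraction=0.8)
print(f"gain without diversion: {base:.2f}")
print(f"gain with 80% diverted: {boosted:.2f}")
print(f"improvement factor:     {boosted / base:.2f}")
```

The improvement comes from lowering the load transconductance at constant overdrive, not from changing the input device.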

3.5 Gilbert Cell

The small signal gain of a differential pair is a function of , which is also a function of tail current. The gain of basic differential pair is

Note that if we swap the definition of the output terminals, the gain becomes positive

Then, to obtain the gain varying continuously from negative to positive, two pairs should be used.

Two stages providing variable gain.

$V_{cont1}$ and $V_{cont2}$ are control voltages. They control the tail currents, which change $g_m$ and thereby adjust the gain. Quantitatively,

If we add the two output signals together

If the two pairs are completely identical,

Varying and in opposite directions, the gain can vary from negative to positive.

But how do we add the outputs of the two differential pairs together? Note that

then we connect the related terminals together (because the is independent of the load)

Summation in the current domain

Now, the bias contains two independent variables, and , which reduces robustness. However, if another constraint is added, e.g. , and a bias technique is used to generate from (or vice versa), then there will only be one control signal and the risk of mismatch will decrease. Such a structure is called Gilbert cell.

Gilbert Cell

4. Biasing Techniques

4.1 Current Mirrors

In the chapters above we use biasing voltages or currents many times, for example the tail current in a differential pair. To generate a current, we have two ideas:

  • Generate a biasing voltage on the gate of a MOS device, which works as a current source
  • Copy a reference current

The first idea is not usable. To know why, take the following circuit for example:

Definition of current by resistive divider

In this circuit

It seems simple, but it is not practical. First, the current varies with $V_{DD}$, so it cannot reject supply noise at all. Moreover, $\mu_n$ and $V_{TH}$ also change with temperature. It also draws a large quiescent current through the divider, increasing power consumption.

Another point is that generating voltage level by resistor divider is also a terrible option because the voltage changes with your load.

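Then we should apply the second idea: copying a reference current. This function is implemented by a current mirror. The basic idea is that for a MOS device $I_D$ is a function of $V_{GS}$, $I_D = f(V_{GS})$, so $V_{GS} = f^{-1}(I_D)$. By applying the function $f$ again, the current is recovered: $I_{out} = f(f^{-1}(I_{REF})) = I_{REF}$.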

Then we have the following topology: current mirror.

Current mirror

M1 is diode-connected so it is guaranteed to be in sat., and then follows the constraint of . With this constraint, a current corresponds to a unique , thus can be driven by . M2 serves as a typical voltage-to-current converter. Neglecting CLM, the two devices follow the same constraint:

obtaining

If the two devices are identical, .
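A minimal sketch of the copying ratio follows. It assumes square-law devices with equal $V_{GS}$, so that $I_{out} = I_{REF}\,(W/L)_2/(W/L)_1$; the optional $\lambda$ and $V_{DS}$ arguments add the usual $(1 + \lambda V_{DS})$ channel-length-modulation factor. All names and values are hypothetical.

```python
def mirror_output(i_ref, wl_ref, wl_out, lam=0.0, vds_ref=0.0, vds_out=0.0):
    """Output current of a simple MOS current mirror.

    wl_ref, wl_out: W/L of the diode-connected device and the output device.
    lam, vds_ref, vds_out: optional CLM coefficient and drain-source voltages.
    """
    ratio = wl_out / wl_ref
    clm = (1.0 + lam * vds_out) / (1.0 + lam * vds_ref)
    return i_ref * ratio * clm


i_ref = 100e-6
print("1:1 copy      :", mirror_output(i_ref, 10.0, 10.0))
print("1:4 copy      :", mirror_output(i_ref, 10.0, 40.0))
print("1:1 with CLM  :", mirror_output(i_ref, 10.0, 10.0,
                                       lam=0.1, vds_ref=0.5, vds_out=1.0))
```

With CLM included, the copy is exact only when both devices see the same $V_{DS}$, which is the motivation for equalizing the drain voltages.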


Generally, current mirrors employ the same length for all devices to minimize errors due to side diffusion. Widening the channel of a single device also introduces extra processing error, whereas identical devices match better. Hence, in practice people place identical unit devices in parallel to obtain the effect of a wider channel.

MOS stacking

When it comes to fractions (like ), we stack reference devices or apply series devices on the output terminal to scale the length equivalently (the latter is usually better).

MOS stacking in current dividing

But CLM also influences the output current. With CLM neglected, a $(1 + \lambda V_{DS})$ term was left out. Including CLM,

To make the output current insensitive to changes in the output voltage, we should increase the output impedance of the mirror. Recall that the common-gate configuration amplifies the impedance by a factor of its intrinsic gain . Since M2 alone has an output impedance of only , we can add another device M3 to amplify the output impedance to . The output then becomes a cascode.

Cascode current mirror

The bias can be provided by .

Biased cascode current mirror

For M1, , thus . Since is a constant, then also remains constant, providing a stable bias voltage. Remember that this topology is introduced to equalize the $V_{DS}$ of M1 and M2, so M0 and M3 should be identical, because the path from N to X and Y is a source-following process.

But cascode consumes output headroom. To make all devices sat., the minimum allowable voltage at node P is

If can be chosen more freely,

Obviously the cascode wastes one threshold voltage headroom. To solve this problem we need to decrease the voltage at node Y (not necessarily X because M3-M2 is the output path and the headroom is independent of M0-M1 path.). We move the output path away and the original path is reserved to decrease the node voltage.

Cascode current mirror with headroom increased

where .

With this size relationship . You can calculate the voltages of the nodes. You will find that the voltage at the node between M2 and M3 is and the output node is . Compared to the original voltage headroom , one is saved.

4.2 Current Generation

But where does the reference current come from? For that we need a current generation circuit.

The following is called constant transconductance current source.

Source generation circuit based on current mirror

where .

Based on this condition, the current through M4 (serving as ) is copied to path M3 as . Later in M2, the current is copied back to M1. Thus, if there are some deviations in the REF path (M4-M1), driving the OUT path (M3-M2) to deviate, the variation in output current will be copied back to REF path, forming a negative feedback process.

In REF path

while in OUT path

meanwhile

PMOS mirror forces that . Combining all equations above, the results are

or

where .
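The non-trivial solution can be checked numerically. The sketch below assumes the standard result for this kind of self-biased reference, $I_{out} = \dfrac{2}{\beta_N R_S^2}\left(1 - \dfrac{1}{\sqrt{K}}\right)^2$ with $\beta_N = \mu_n C_{ox}(W/L)_N$, where the NMOS device with the source resistor $R_S$ is $K$ times wider than the other one and body effect is neglected. The symbol assignment and all values are assumptions for illustration.

```python
import math


def self_biased_current(beta_n, r_s, k):
    """Non-trivial operating current of the self-biased reference."""
    return 2.0 / (beta_n * r_s ** 2) * (1.0 - 1.0 / math.sqrt(k)) ** 2


def loop_residual(i, beta_n, r_s, k):
    """V_GS1 - (V_GS2 + I*R_S); zero at a valid operating point (V_TH cancels)."""
    vov1 = math.sqrt(2.0 * i / beta_n)
    vov2 = math.sqrt(2.0 * i / (k * beta_n))
    return vov1 - (vov2 + i * r_s)


beta_n, r_s, k = 1e-3, 1e3, 4.0
i_out = self_biased_current(beta_n, r_s, k)
gm1 = math.sqrt(2.0 * beta_n * i_out)

print(f"I_out = {i_out * 1e6:.1f} uA")
print(f"residual at I_out = {loop_residual(i_out, beta_n, r_s, k):.2e}")
print(f"residual at I = 0 = {loop_residual(0.0, beta_n, r_s, k):.2e}")
print(f"gm1 * R_S = {gm1 * r_s:.3f}")
```

Both $I = 0$ and the non-trivial current satisfy the loop equation, and $g_{m1}R_S$ depends only on $K$, which is where the name "constant transconductance" comes from.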

A solution of 0 means the circuit can remain completely off. A circuit with two distinct solutions is typically said to have a degenerate bias point. In fact, the trivial solution is unstable: upon even a slight disturbance, the current increases, eventually reaching the non-trivial solution and settling into equilibrium.

Working point crossing of two current mirrors

In the following figure, suppose the is perturbation in M1. Since , the copied current in M2 must be over . Through the PMOS mirror, a larger current is copied back to M4-M1 path, increasing the REF path current.

Feedback process

To avoid body effect, the load can be placed between the PMOS and VDD.

Source generation circuit without body effect

To enable the current output, copy it again

Source generation circuit output

4.3 Operational Transconductance Amplifier (OTA)

Generally we need a single-ended signal. In this case the classical differential amplifier is not applicable. We then need an operational transconductance amplifier (OTA) to convert the differential signal to a single-ended one. The OTA is implemented by replacing the classical load with a current mirror.

OTA Circuit

The idea is to mirror the current in one path to the other and subtract. Suppose the currents in M1 and M2 are and respectively. Then and . Obviously the output current is . Note that the output is a current and the input is a voltage, thus it is called a “transconductance” amplifier.

When the current source is implemented by an NMOS, the circuit becomes a typical 5-transistor OTA.

5-Transistor OTA Circuit

DC analysis of OTA

If $V_{in1}$ is much more negative than $V_{in2}$, M1, M3 and M4 are off, and M2 and M5 enter deep tri., so $V_{out} = 0$. As $V_{in1}$ approaches $V_{in2}$, M1 turns on, drawing a fraction of $I_{D5}$ from M3 and turning M4 on. The output voltage then depends on the difference between $I_{D4}$ and $I_{D2}$. For a small difference between $V_{in1}$ and $V_{in2}$, both M2 and M4 are saturated, providing a high gain. As $V_{in1}$ becomes more positive than $V_{in2}$, $I_{D1}$, $|I_{D3}|$, and $|I_{D4}|$ increase and $I_{D2}$ decreases, allowing $V_{out}$ to rise and eventually driving M4 into the triode region. If $V_{in1} - V_{in2}$ is sufficiently large, M2 turns off, M4 operates in the deep triode region with zero current, and $V_{out} = V_{DD}$.

OTA DC

AC analysis of OTA

Since the circuit is not completely symmetric, node P is no longer a precise virtual ground. We can check the asymmetry through impedance calculation.

Impedance of OTA

Let’s review the small-signal model of a MOSFET. The impedance from the drain to AC ground is $r_o$ and from the source to AC ground is about $1/g_m$. For a diode-connected MOS, the impedance seen at the drain becomes $(1/g_m) \parallel r_o \approx 1/g_m$ (draw the small signal model to verify).

Then in small signal model of OTA, when VDD becomes AC ground, at node F the impedance relative to ground is

Since is usually very large, F is a node with low impedance. Meanwhile, the impedance of output node is

When two large resistors are connected in parallel, their combined resistance remains very high; therefore, the output impedance of this circuit is very high (which is why it is suitable for driving current-driven loads). Because the impedances differ, the voltage swings at the two nodes are also different.

We assume M1 and M2 are identical, thus and . At output terminal, , thus

By the way we have known , then the approximate gain is (mind the direction)

If the effect of the current mirror is taken into consideration, the entire small signal model is

Small signal model of OTA

The full derivation is tedious, so we only quote the result:


Headroom Issue

To keep the current mirror (mainly M4) in sat. (or deeper, for precise mirroring), , which wastes headroom of . Meanwhile, since is approximately a constant (source following process), must be larger than . Hence, the range of input common-mode voltage is severely limited. To solve this, we notice that the gate voltage of M3 need not equal its drain voltage.

Improved OTA

Then to make M4 sat. , we can notice that a voltage headroom is released.


Common-mode Properties

Connect the two input terminals together; then the currents through the two paths are equal. Since M1 and M2 are identical, and so are M3 and M4, the voltages at node F and the output must be the same. Thus, the two nodes can be treated as shorted.

Common-mode analysis of OTA

The analysis is similar

thus

Then


Mismatch Issue

In case there is mismatch, for example M1 and M2 are not completely identical, the output will be distorted. Suppose the common-mode voltage changes a little , and M1 and M2 are not identical. First check the current mirror, small variation is transformed to through its impedance , then the voltage is further transformed to current in M4, through its transconductance . Thus, the variation of output current becomes

Meanwhile

Then

Compared to the non-mismatch result, this result contains the additional term in the numerator, revealing the effect of transconductance mismatch on the common-mode gain.

Common-mode analysis of OTA with mismatch

Power Supply Rejection

The OTA has poor PSRR (Power Supply Rejection Ratio), i.e., it can hardly reject noise on the supply. That is because, for the diode-connected M3, supply noise is passed on as in a source follower. Thus, any variation on $V_{DD}$ is transferred almost completely to node F, and further to $V_{out}$ without any attenuation. That’s a drawback of the OTA (single-ended output) compared to fully differential amplifiers.

5. Frequency Response

5.1 Poles and Zeros

For an electrical system, its transfer function can be expressed as

where is called a zero, and is called a pole, . In real-world circuits, poles should always lie in the left half plane; otherwise the system becomes unstable.

Note that is a complex number, and real-world frequencies are mapped to the purely imaginary number in the complex plane. Therefore, a “zero” does not set the output to exactly 0, but it does affect the output signal.

Poles and Zeros in Complex Plane

A zero point term, , contributes to the amplitude with a factor , suppose the zero point is on the real axis. In decibel,

Then a zero point on the right half plane (RHP) contributes a 20dB/dec increase in amplitude slope. The zero point also contributes a phase shift of . When , the shift is and if the input frequency becomes rather high, the total shift approaches . Thus, an RHP zero introduces a phase delay.

A left half plane (LHP) pole affects the transfer function by , so its amplitude slope is -20dB/dec. Its phase is also a delay, because you can verify that the phase factor is also by rationalizing the denominator.

An LHP zero contributes a 20dB/dec amplitude slope increase, but a phase lead instead of a delay, because the angle direction on the complex frequency plane is reversed.

Whether it is a zero or a pole, the phase shift is exactly 45° at the frequency corresponding to that zero or pole; only when the frequency is much higher than that frequency does the phase shift gradually approach 90°. Different zero-pole effects can be superimposed.
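These magnitude and phase claims are easy to verify numerically. The sketch below assumes normalized first-order factors evaluated at $s = j\omega$: $(1 + j\omega/\omega_z)$ for an LHP zero, $(1 - j\omega/\omega_z)$ for an RHP zero, and $1/(1 + j\omega/\omega_p)$ for an LHP pole. The function names and the corner frequency are hypothetical.

```python
import cmath
import math


def lhp_zero(w, wz):
    return 1.0 + 1j * w / wz


def rhp_zero(w, wz):
    return 1.0 - 1j * w / wz


def lhp_pole(w, wp):
    return 1.0 / (1.0 + 1j * w / wp)


def mag_db(h):
    return 20.0 * math.log10(abs(h))


def phase_deg(h):
    return math.degrees(cmath.phase(h))


w0 = 1e6  # rad/s, corner frequency shared by all three factors
for name, factor in (("LHP zero", lhp_zero),
                     ("RHP zero", rhp_zero),
                     ("LHP pole", lhp_pole)):
    at_corner = factor(w0, w0)
    slope = mag_db(factor(100 * w0, w0)) - mag_db(factor(10 * w0, w0))
    far = phase_deg(factor(1000 * w0, w0))
    print(f"{name}: phase at corner = {phase_deg(at_corner):+.1f} deg, "
          f"slope = {slope:+.1f} dB/dec, phase far above = {far:+.1f} deg")
```

Both kinds of zero raise the magnitude, but only the LHP zero gives phase lead; the RHP zero lags like a pole.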

5.2 Miller Effect

An important phenomenon that occurs in many analog (and digital) circuits is related to the “Miller effect,” as described by Miller in a theorem. This theorem is usually used to simplify loops.

Miller’s Theorem: If an impedance is connected between node X and Y, and the voltage on the two nodes has a fixed ratio , then the impedance can be decomposed into two impedances between X, GND and Y, GND, separately. If , then

Miller Effect

The proof is trivial:

Generally, the fixed voltage ratio (gain) is provided by the main signal path, for example an amplifier. Thus, Miller equivalence is only suitable for signal path parallel to the main path.

Main and parallel signal path

In reality, gain is usually frequency‑dependent. Fortunately, for many approximate analyses, precise circuit characteristics are not required. Therefore, as a simplification, engineers often use the low‑frequency gain to compute the Miller capacitance and then apply this result to high‑frequency cases, even though the Miller equivalence strictly assumes a frequency‑independent gain.
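As an illustration of this approximation, the following sketch splits a floating impedance with Miller’s theorem. It assumes the usual form $Z_1 = Z/(1 - A_v)$ and $Z_2 = Z/(1 - A_v^{-1})$ with $A_v = V_Y/V_X$ taken as the low-frequency gain; for a capacitor this means $C_1 = C(1 - A_v)$ and $C_2 = C(1 - A_v^{-1})$. The function names and values are hypothetical.

```python
def miller_split(z, a_v):
    """Split a floating impedance z between X and Y into (Z1, Z2) to ground."""
    z1 = z / (1.0 - a_v)
    z2 = z / (1.0 - 1.0 / a_v)
    return z1, z2


def miller_caps(c, a_v):
    """Same split expressed for a floating capacitor c."""
    return c * (1.0 - a_v), c * (1.0 - 1.0 / a_v)


# Hypothetical inverting stage with a low-frequency gain of -9
c_f = 10e-15
c1, c2 = miller_caps(c_f, -9.0)
print(f"C1 = {c1 * 1e15:.1f} fF (input side), C2 = {c2 * 1e15:.2f} fF (output side)")

# Consistency check at one frequency using the impedance form
w = 1e9
z1, z2 = miller_split(1.0 / (1j * w * c_f), -9.0)
print("C1 from Z1:", (1.0 / (1j * w * z1)).real)
```

The input-side capacitance is multiplied by roughly the gain, while the output-side capacitance stays close to the original value.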

If applied to obtain the input-output transfer function, Miller’s theorem cannot be used simultaneously to calculate the output impedance. To derive the transfer function, we apply a voltage source to the input of the circuit, obtaining a value for . When calculating the output impedance, the input should be connected to ground. The drive condition changes, and the gain between the nodes is not guaranteed to be the same as that in the transfer function calculation.

Generally, a capacitor in a feedback path introduces a zero, but applying the Miller equivalence may eliminate that zero. Take the example below:

We suppose . On the left figure, we directly calculate the transfer function,

There is one pole and one zero (imposed to be positive). Thus, the amplitude characteristic can be plotted

Frequency response of the amplitude

But when we apply Miller equivalence, like the right figure, the transfer function becomes

The pole is

We can see that the Miller equivalence may drop a zero. Moreover, the pole it predicts does not exactly match the precise result. Thus, we usually apply the Miller effect to estimate pole locations, not to calculate the entire transfer function.

Generally, poles can be estimated node by node. Take the right figure in the example above: the output node is decoupled from the rest of the circuit and connects only to a resistor and a capacitor, forming a low-pass network. This low-pass network introduces a first-order pole.

Associating one pole with each node is a commonly used method.


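To approximate the dropped zero, ground the output terminal and require the output current to be 0. At the zero frequency the output vanishes for any load, so the frequency that satisfies this condition is the zero.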

5.3 CS Stage

Now we need to take the parasitic capacitances into consideration.

CS stage with parasitic cap

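The main path is obviously the MOSFET, so $C_{GD}$ can be decomposed with the Miller effect. After the decomposition, the circuit has two decoupled nodes, X and OUT, because only a unidirectional signal path remains. Neglecting CLM and denoting the source resistance by $R_S$ and the load by $R_D$, the pole at node X can be approximated as

$$\omega_{X} = \frac{1}{R_S \left[ C_{GS} + (1 + g_m R_D) C_{GD} \right]}$$

and that at the output node as

$$\omega_{out} = \frac{1}{R_D (C_{GD} + C_{DB})}$$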

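Do not forget to put back the zero that the Miller equivalence dropped; in this stage it comes from the feedforward path through the gate-drain capacitance.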

Since is far larger than 1, then . The frequency response should be

Frequency response of CS stage

The first pole is called the dominant pole, which affects the frequency response most severely.

If you calculate from the small-signal model directly, you will get a transfer function whose denominator has the form

$$D(s) = a s^2 + b s + 1$$

In most circuits the poles are very far apart, so you do not need to solve the denominator equation precisely. Instead, approximate the roots $p_1$ and $p_2$ with Vieta’s theorem:

$$p_1 + p_2 = -\frac{b}{a}, \qquad p_1 p_2 = \frac{1}{a}$$

If $|p_2| \gg |p_1|$, approximate $p_2 \approx -b/a$ and then $p_1 \approx -1/b$. Note that all poles are in the LHP, so a negative result is correct.
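The widely-spaced-pole shortcut can be compared against the exact roots. The sketch below assumes a denominator of the form $a s^2 + b s + 1$; the coefficient values are hypothetical and chosen only so that the two poles are about three decades apart.

```python
import math


def exact_poles(a, b):
    """Roots of a*s**2 + b*s + 1 = 0, assuming real and distinct roots."""
    disc = math.sqrt(b * b - 4 * a)
    p_low = (-b + disc) / (2 * a)   # smaller magnitude: dominant pole
    p_high = (-b - disc) / (2 * a)  # larger magnitude: non-dominant pole
    return p_low, p_high


def approx_poles(a, b):
    """Vieta-based estimate for widely spaced poles: p1 ~ -1/b, p2 ~ -b/a."""
    return -1 / b, -b / a


a, b = 1e-15, 1e-6  # hypothetical coefficients (s^2 and s terms)
for name, (p1, p2) in (("exact", exact_poles(a, b)), ("approx", approx_poles(a, b))):
    print(f"{name:>6}: p1 = {p1:.4e} rad/s, p2 = {p2:.4e} rad/s")
```

For these coefficients the estimate is within about 0.2 % of the exact roots; the error grows as the poles move closer together.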

5.4 CG Stage and Source Followers

CG frequency response is simple because there’s no loop if CLM is negligible.

CG stage with cap

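At the input node, the impedance seen looking into the source of the transistor (which should include the body effect), $1/(g_m + g_{mb})$, is in parallel with the source resistance $R_S$, hence

$$\omega_{in} = \frac{1}{(C_{GS} + C_{SB}) \left( R_S \parallel \dfrac{1}{g_m + g_{mb}} \right)}$$

The output node is simple:

$$\omega_{out} = \frac{1}{(C_{GD} + C_{DB}) R_D}$$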

If $r_O$ is not negligible, the CG stage becomes much more complex because the two nodes are no longer decoupled. This resistor breaks the unidirectional condition, and the input “sees” the output through it.


The gain of a source follower is approximately 1 rather than very large, so it exhibits some interesting properties.

Source follower with cap

The complete transfer function is

If the two poles are assumed far apart, then the lower one has a magnitude of

Note that if you want to apply Miller effect, you will find that the Miller-equivalent cap. vanishes because the gain . Thus the input pole is approximated as . But the output pole is located in a very high frequency where the gain has dropped. Hence the Miller effect no longer holds. With the original small signal model, and can also be seen by the output node. So the output pole should be

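Let us now calculate the input impedance. Neglect $C_{GD}$, which simply shunts the input, and apply a test current $I_X$ that flows through $C_{GS}$:

$$V_X = \frac{I_X}{C_{GS} s} + \left( I_X + \frac{g_m I_X}{C_{GS} s} \right) \left( \frac{1}{g_{mb}} \parallel \frac{1}{C_L s} \right)$$

Then

$$Z_{in} = \frac{1}{C_{GS} s} + \left( 1 + \frac{g_m}{C_{GS} s} \right) \frac{1}{g_{mb} + C_L s}$$

At low frequencies, $g_{mb} \gg |C_L s|$ and

$$Z_{in} \approx \frac{1}{C_{GS} s} \left( 1 + \frac{g_m}{g_{mb}} \right) + \frac{1}{g_{mb}}$$

At high frequencies, $g_{mb} \ll |C_L s|$ and

$$Z_{in} \approx \frac{1}{C_{GS} s} + \frac{1}{C_L s} + \frac{g_m}{C_{GS} C_L s^2}$$

Note that there is a $1/s^2$ term in the impedance. Plugging in $s = j\omega$, we get the negative resistance $-g_m / (C_{GS} C_L \omega^2)$. We must bear in mind that a negative resistance may cause instability.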


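Neglect the load capacitance $C_L$, the body effect and $C_{GD}$, ground the input terminal (leaving the source resistance $R_S$ in series with the gate) and apply a voltage at the output terminal to calculate the output impedance,

yielding

$$Z_{out} = \frac{R_S C_{GS} s + 1}{g_m + C_{GS} s}$$

At low frequencies $Z_{out} \approx 1/g_m$, and at high frequencies $Z_{out} \approx R_S$. Operating as buffers, source followers must lower the output impedance; for this reason, $R_S$ should be larger than $1/g_m$.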
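To see how the output impedance moves between its two limits, the sketch below evaluates the standard first-order expression $Z_{out} = (R_S C_{GS} s + 1)/(g_m + C_{GS} s)$ for a source follower driven from a resistance $R_S$. Both the expression (body effect and $C_{GD}$ neglected) and the element values are assumptions made for illustration.

```python
import math


def follower_zout(freq_hz, gm, r_s, c_gs):
    """Output impedance of a source follower at freq_hz (first-order model)."""
    s = 2j * math.pi * freq_hz
    return (r_s * c_gs * s + 1) / (gm + c_gs * s)


gm, r_s, c_gs = 1e-3, 10e3, 50e-15  # hypothetical: 1 mS, 10 kOhm, 50 fF
for f in (1e3, 1e8, 1e9, 1e10, 1e13):
    z = follower_zout(f, gm, r_s, c_gs)
    print(f"f = {f:8.1e} Hz: |Z_out| = {abs(z):9.1f} Ohm, Im(Z_out) = {z.imag:9.1f} Ohm")
```

Because $R_S > 1/g_m$ here, the magnitude rises with frequency and the imaginary part is positive in between, i.e. the output looks inductive.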

5.5 Cascode

Cascoding proves beneficial in increasing the voltage gain of amplifiers and the output impedance of current sources while providing shielding as well.

Cascode with cap

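At node A, $C_{GD1}$ is Miller-ed. The impedance seen at X looking into the source of M2 is $1/(g_{m2} + g_{mb2})$, so the gain from A to X is about $-g_{m1}/(g_{m2} + g_{mb2})$. With the Miller equivalence and a source resistance $R_S$, the pole of A is

$$\omega_{A} = \frac{1}{R_S \left[ C_{GS1} + \left( 1 + \dfrac{g_{m1}}{g_{m2} + g_{mb2}} \right) C_{GD1} \right]}$$

At node X, the impedance is the M2 source impedance $1/(g_{m2} + g_{mb2})$, and the capacitance includes $C_{GS2}$ to AC ground (the bias on the M2 gate), $C_{DB1}$, $C_{SB2}$, and the other Miller-ed part of $C_{GD1}$. Then the pole of X is

$$\omega_{X} = \frac{g_{m2} + g_{mb2}}{C_{GS2} + C_{DB1} + C_{SB2} + \left( 1 + \dfrac{g_{m2} + g_{mb2}}{g_{m1}} \right) C_{GD1}}$$

The node Y is easier. With CLM neglected, the impedance is $R_D$ and the capacitance consists of $C_{DB2}$, $C_{GD2}$ and $C_L$:

$$\omega_{Y} = \frac{1}{R_D (C_{DB2} + C_{GD2} + C_L)}$$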

5.6 Differential Pair

The pole estimation follows the same pattern, so we do not repeat it. What is special about the differential pair is its CMRR. Consider a complete differential pair with passive load.

Differential pair with passive load

With on output node and on the source of M1 and M2, the load impedance becomes , and differential gain is

Then it comes to the common-mode response. The circuit becomes

The cap can be absorbed into one impedance

Common-mode of differential pair with passive load and cap

KCL gives

obtaining

The output differential voltage

i.e.

Notice that the CMRR has one pole of and one zero point of approximately . Then we can plot the CMRR variation with frequency.

CMRR variation with frequency

When it comes to active-loaded differential pair (OTA), the result is much more complex.

Differential pair with active load

The figure on the right is obtained by replacing , M1 and M2 by Thevenin equivalent, where due to the intrinsic amplification of MOS and due to the drain impedance of MOSFET. The total gain can be expressed by a very complex fraction.

But in practice you can still estimate poles with our classical method. At output node

At input node, neglect because , and

6. Noise

6.1 Noise Theory

Noise is a random signal that cannot be predicted even if all of its past values are known. Thus, noise must be studied with statistical models. In most cases the average power is predictable. If we observe a voltage signal $x(t)$ over a period of time $T$ across a load resistance $R_L$, the average power is

$$P_{av} = \frac{1}{T} \int_{-T/2}^{T/2} \frac{x^2(t)}{R_L} \, dt$$

But noise is a random signal, so this average power also varies randomly with the choice of $T$. Thus, to characterize noise, the observation period must tend to infinity.

To accommodate all types of signals (such as currents), we remove the resistance from the denominator.

$$P_{av} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t) \, dt$$

This concept becomes more versatile, and also more practical, when the frequency spectrum is introduced. Viewed in the frequency domain, the spectrum should give the same power as the time-domain expression:

$$P_{av} = \int_{-\infty}^{\infty} S_x(f) \, df$$

$S_x(f)$ indicates how much power the noise signal $x(t)$ carries in a very narrow bandwidth around $f$, and is called the “power spectral density” (PSD). Strictly, the PSD is defined as the average power carried by $x(t)$ in a one-hertz bandwidth around $f$. More formally, the PSD is the Fourier transform of the autocorrelation of $x(t)$ (the Wiener-Khinchin theorem).

For a voltage signal, $x^2(t)$ is expressed in $\mathrm{V^2}$, so the PSD has dimension $\mathrm{V^2/Hz}$. Note that the PSD is a power spectrum, not an amplitude spectrum.

Theorem: If a signal with spectrum $S_X(f)$ is applied to a linear time-invariant system with transfer function $H(s)$, then the output spectrum is given by

$$S_Y(f) = S_X(f) \, |H(f)|^2, \qquad H(f) = H(s = j 2\pi f)$$

The absolute value on the transfer function is introduced by the nature that PSD carries energy and energy must be positive, while the square comes from the squared dimension of .
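As an illustration of shaping a PSD with $|H(f)|^2$, the sketch below passes the one-sided thermal noise of a resistor, $4kTR$, through the first-order RC low-pass it forms with a capacitor and integrates the output PSD numerically. The RC circuit, the component values and the integration span are assumptions chosen for the example; the well-known closed-form result for this case is $kT/C$.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def output_psd(f, r, c, temp=300.0):
    """One-sided output PSD (V^2/Hz) of resistor noise filtered by an RC low-pass."""
    s_in = 4 * K_B * temp * r                            # input PSD, flat (white)
    h_mag_sq = 1 / (1 + (2 * math.pi * f * r * c) ** 2)  # |H(f)|^2 of the RC filter
    return s_in * h_mag_sq


def integrated_noise(r, c, temp=300.0, span=1e4, steps=200_000):
    """Midpoint-rule integral of the output PSD from 0 to span * corner frequency."""
    f_c = 1 / (2 * math.pi * r * c)
    df = span * f_c / steps
    return sum(output_psd((i + 0.5) * df, r, c, temp) * df for i in range(steps))


r, c = 1e3, 1e-12  # hypothetical 1 kOhm and 1 pF
total = integrated_noise(r, c)
print(f"integrated: {total:.4e} V^2, kT/C: {K_B * 300.0 / c:.4e} V^2")
```

The integrated noise power does not depend on the resistor value: a larger resistor raises the PSD but lowers the bandwidth by the same factor.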

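For a real signal $x(t)$, the PSD is an even function of $f$. Since we usually focus on positive frequencies, it is more practical to fold the negative-frequency part onto the positive part, giving a one-sided spectrum with twice the value of the two-sided one for $f \ge 0$.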


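Two noise sources do not always add independently. Let us add two noise signals $x_1(t)$ and $x_2(t)$ and check the average power:

$$P_{av} = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} [x_1(t) + x_2(t)]^2 \, dt = P_{av1} + P_{av2} + \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} 2 x_1(t) x_2(t) \, dt$$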

If the cross term vanishes, the two noise sources are called uncorrelated; otherwise they are correlated.

In most cases, noise sources are uncorrelated, and we treat them as independent. An example is the noise generated by two separate lumped resistors. Correlated noise always originates from the same source or follows a fixed transfer relation. For example, power-supply noise coupled into different stages eventually combines at the same output. These contributions come from the same source, so they are considered correlated.


Sometimes the amplitude is also important. Then we take the square root of the PSD, obtaining the amplitude spectrum. But note that the amplitude is not . In fact, it is . Hence, the value you obtained by taking the square root of the PSD function is the RMS value.


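To measure the noise performance of a system, we introduce an index called the “signal-to-noise ratio” (SNR), defined as

$$\mathrm{SNR} = \frac{P_{sig}}{P_{noise}}$$

SNR is usually measured in decibels (dB):

$$\mathrm{SNR}|_{dB} = 10 \log_{10} \frac{P_{sig}}{P_{noise}}$$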

Keep in mind that the factor is 10, not 20, because SNR is a power ratio. The factor 20 is used for amplitude ratios. The factor 2 comes from applying the logarithm to a squared quantity (since power is proportional to the square of amplitude).
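The factor-10 versus factor-20 distinction can be checked with two small helpers. The function names and the example values are illustrative only.

```python
import math


def snr_db_from_power(p_signal, p_noise):
    """SNR in dB from a power ratio (factor 10)."""
    return 10 * math.log10(p_signal / p_noise)


def snr_db_from_rms(v_signal_rms, v_noise_rms):
    """SNR in dB from an RMS amplitude ratio (factor 20)."""
    return 20 * math.log10(v_signal_rms / v_noise_rms)


# A 10:1 amplitude ratio is a 100:1 power ratio; both forms give the same dB value.
print(snr_db_from_rms(10.0, 1.0), snr_db_from_power(100.0, 1.0))
```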

6.2 Thermal Noise

Resistor thermal noise

A lumped resistor introduces noise due to the random thermal motion of its charge carriers. This type of noise is called thermal noise. Since the fluctuation is independent of frequency (up to extremely high frequencies), the PSD is flat. Noise with this feature is called white noise. Generally, the resistor thermal noise can be expressed as

$$S_v(f) = 4kTR, \qquad f \ge 0$$

Resistor thermal noise model and the PSD

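where $k = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant. Note that $S_v(f)$ is expressed in $\mathrm{V^2/Hz}$. Thus, we also write $\overline{V_n^2} = 4kTR$. In datasheets, you can also see the spectrum expressed as an amplitude, with dimension $\mathrm{V/\sqrt{Hz}}$. For example, if a resistor has a voltage noise of 0.91 nV/$\sqrt{\mathrm{Hz}}$, then in a system with 1 MHz bandwidth the total RMS noise is $0.91\ \mathrm{nV/\sqrt{Hz}} \times \sqrt{10^6\ \mathrm{Hz}} = 0.91\ \mu\mathrm{V}$.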
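The datasheet-style numbers above can be reproduced with a short calculation. The sketch assumes a 50 Ω resistor at 300 K, which is an inference (the article does not state the resistance) chosen because it yields roughly 0.91 nV/√Hz; the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_noise_density(r, temp=300.0):
    """Thermal noise voltage density of a resistor in V/sqrt(Hz)."""
    return math.sqrt(4 * K_B * temp * r)


def rms_noise(density, bandwidth_hz):
    """Total RMS noise of a white source over a brick-wall bandwidth."""
    return density * math.sqrt(bandwidth_hz)


density = thermal_noise_density(50.0)   # assumed 50 Ohm at 300 K
total = rms_noise(density, 1e6)         # 1 MHz bandwidth
print(f"{density * 1e9:.2f} nV/sqrt(Hz), {total * 1e6:.2f} uV RMS in 1 MHz")
```

The RMS value scales with the square root of the bandwidth because it is the noise power, not the amplitude, that adds across frequency.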

Since we can express the noise as a series voltage source (through Thevenin’s theorem), we can also express it as a parallel current source, through Norton’s theorem.

Resistor thermal noise model with current

To be equivalent to the Thevenin model, the current noise expression must be

$$\overline{I_n^2} = \frac{\overline{V_n^2}}{R^2} = \frac{4kT}{R}$$


MOSFET thermal noise

MOSFETs also exhibit thermal noise, generated mainly by the channel resistance between drain and source. The channel thermal noise can be modelled by a current source connected in parallel between drain and source.

MOSFET channel noise model

with noise current PSD

$$\overline{I_n^2} = 4kT\gamma g_m$$

Here $\gamma$ is not the body-effect coefficient. Instead, it is a noise factor. In long-channel devices we take $\gamma = 2/3$; in short-channel devices it becomes larger and approaches 1.

Another noise contribution originates from the distributed resistance of the gate. For a relatively wide device, the source and drain resistances are typically negligible, whereas the distributed gate resistance may become noticeable. Now take the example of the simplest MOSFET layout.

MOSFET layout

Suppose the total end-to-end resistance of the gate polysilicon (from left to right) is $R_G$. However, this resistance is not lumped at one point. It is distributed along the gate stripe. At any specific point along the gate, only the resistance between that point and the contact contributes to the noise seen by the channel segment underneath. In other words, the gate resistance appears as a distributed RC network: the resistance of a small section affects only the portion of the channel “downstream” from that section, not the entire channel at once. This distributed nature makes the effective noise contribution different from that of a simple lumped gate resistor.

The transconductance contributed by a small section

The resistance from input to

Then the total current transfered from gate voltage noise is

Hence, the distributed resistance of the gate is equivalent to a lumped resistance of value $R_G/3$. With a proper layout, the gate resistance, and with it the gate noise, can be reduced further. Have a look at the following two layouts.

Suppose the total end-to-end gate resistance in the first layout is $R_G$. In the second layout the device is folded into 4 fingers, each one quarter as wide, so the end-to-end gate resistance of each finger becomes $R_G/4$. Including the distribution factor 1/3, the equivalent lumped resistance of each finger becomes $R_G/12$. Finally, the 4 gates are in parallel, so the total equivalent gate resistance becomes $R_G/48$.
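Both the 1/3 distribution factor and the benefit of folding can be checked numerically. The sketch below models the gate as `n` equal resistor segments feeding `n` equal transistor slices, with a single contact at one end, so that the noise of each segment drives only the slices beyond it. This discretization and the function names are assumptions made for illustration.

```python
def distributed_factor(n):
    """Equivalent lumped resistance divided by the end-to-end gate resistance.

    Segment j (resistance R/n) drives the remaining m = n - j + 1 slices, each with
    transconductance gm/n, so its noise power scales as (R/n) * (m/n)**2.
    """
    return sum(m * m for m in range(1, n + 1)) / n**3


def folded_gate_resistance(r_total, fingers):
    """Equivalent lumped gate resistance after folding into equal fingers."""
    per_finger = r_total / fingers   # each finger is 1/fingers as wide
    lumped = per_finger / 3          # distribution factor of one finger
    return lumped / fingers          # fingers are connected in parallel


print(distributed_factor(10), distributed_factor(10_000))
print(folded_gate_resistance(48.0, 4))  # hypothetical 48 Ohm gate folded into 4 fingers
```

The factor converges to 1/3 as the number of segments grows, and the folded resistance falls with the square of the number of fingers.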

6.3 Flicker Noise

This type of noise appears in the channel of MOSFETs. Since the interface between the gate oxide and the silicon crystal cannot be perfect, it must contain some defects. When the channel turns on, these defects capture and release carriers randomly, so from the outside the number of carriers in the channel fluctuates randomly. Noise caused by these defects is called flicker noise.

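Intuitively, slow capture-and-release events occur more often than fast ones, so the PSD of flicker noise is not flat. Instead, it is inversely proportional to $f$:

$$\overline{V_n^2} = \frac{K}{C_{ox} W L} \cdot \frac{1}{f}$$

This source is placed in series with the gate. $K$ is a process-dependent constant on the order of $10^{-25}\ \mathrm{V^2 F}$. Referred to the channel current, it can also be expressed as

$$\overline{I_n^2} = \frac{K}{C_{ox} W L} \cdot \frac{1}{f} \cdot g_m^2$$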

It is believed that other phenomena also contribute to flicker noise, so the expression may be more complex in reality. The exact mechanisms are still not fully understood.

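Obviously, a MOSFET is influenced by thermal noise and flicker noise at the same time. At low frequencies flicker noise is dominant, while at high frequencies thermal noise plays the most important role. The transition occurs at the frequency where the two current PSDs are equal, $4kT\gamma g_m = \frac{K}{C_{ox} W L} \cdot \frac{1}{f_C} \cdot g_m^2$, i.e.

$$f_C = \frac{K}{C_{ox} W L} \cdot \frac{g_m}{4kT\gamma}$$

This frequency is called the corner frequency.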
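A rough feel for the corner frequency comes from equating the thermal and flicker current PSDs, $4kT\gamma g_m$ and $\frac{K}{C_{ox}WL}\frac{g_m^2}{f}$. The sketch below uses these standard forms; all device values (oxide capacitance, dimensions, transconductance, flicker coefficient) are hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_current_psd(gm, gamma=2 / 3, temp=300.0):
    """Channel thermal noise current PSD in A^2/Hz."""
    return 4 * K_B * temp * gamma * gm


def flicker_current_psd(f, k_flicker, c_ox, w, l, gm):
    """Flicker noise current PSD in A^2/Hz (gate-referred 1/f noise times gm^2)."""
    return k_flicker / (c_ox * w * l * f) * gm**2


def corner_frequency(k_flicker, c_ox, w, l, gm, gamma=2 / 3, temp=300.0):
    """Frequency at which the flicker and thermal current PSDs are equal."""
    return k_flicker * gm / (c_ox * w * l * 4 * K_B * temp * gamma)


# Hypothetical device: K = 1e-25 V^2 F, C_ox = 8.6 mF/m^2, W/L = 100 um / 0.5 um, gm = 1 mS
params = dict(k_flicker=1e-25, c_ox=8.6e-3, w=100e-6, l=0.5e-6, gm=1e-3)
f_c = corner_frequency(**params)
print(f"corner frequency: {f_c / 1e3:.1f} kHz")
```

Note that the corner frequency rises with $g_m$ and falls with gate area, which is why low-noise input devices at low frequencies tend to be large.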

MOSFET corner frequency

6.4 General Noise Model

Consider a general circuit with one input port and one output port. How do we quantify the effect of noise here? The natural approach would be to set the input to zero and calculate the total noise at the output due to various sources of noise in the circuit. This is indeed how the noise is measured in the laboratory or in simulations.

But the output-referred noise does not allow a fair comparison of the performance of different circuits because it depends on the gain. Considering only the output noise, we may conclude that as the gain increases, the circuit becomes noisier, an incorrect result because a larger gain also provides a proportionally higher signal level at the output. That is, the output SNR does not depend on the gain.

To remove the ambiguity caused by the gain, we refer all the noise to the input terminal instead of the output terminal, obtaining the input-referred noise $\overline{V_{n,in}^2}$. It is related to the output-referred noise by the fixed relation $\overline{V_{n,out}^2} = A_v^2 \, \overline{V_{n,in}^2}$.

The input-referred noise indicates how much the input signal is corrupted by the circuit’s noise. But it cannot be measured by experiments because the input-referred noise is just a mathematical equivalence. Physically the sources are still distributed in the system, not on the input terminal.

But there is still a problem: if we use a simple Thevenin model (only one voltage source), it implies that the output noise vanishes when the output impedance of the preceding stage (the source impedance) is much larger than the system input impedance, which conflicts with experimental observations. A simple Norton model runs into the same problem when the source impedance is much smaller than the input impedance. To solve the problem, we have to use the two models at the same time: the input-referred noise is composed of a voltage source and a current source.

Representation of noise by voltage and current sources

How do we calculate $\overline{V_{n,in}^2}$ and $\overline{I_{n,in}^2}$? Since the model is valid for any source impedance, we consider two extreme cases: zero and infinite source impedances. As shown in the following figure (a), if the source impedance is zero, $\overline{I_{n,in}^2}$ flows through $\overline{V_{n,in}^2}$ and has no effect on the output. Thus, the output noise measured in this case arises solely from $\overline{V_{n,in}^2}$. Similarly, if the input is open, then $\overline{V_{n,in}^2}$ has no effect and the output noise is due to only $\overline{I_{n,in}^2}$.

Calculation of input-referred noise

To illustrate in detail, take the example of the CS stage, and ignore the flicker noise.

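The first figure displays the effect of the voltage component. The output noise is a superposition of the thermal noise of $R_D$ and that of the MOS channel. Since the noise is a small signal, we apply the small-signal (AC) model:

$$\overline{V_{n,out}^2} = \left( 4kT\gamma g_m + \frac{4kT}{R_D} \right) R_D^2$$

The gain is $g_m R_D$, giving the input-referred noise voltage

$$\overline{V_{n,in}^2} = \frac{\overline{V_{n,out}^2}}{g_m^2 R_D^2} = 4kT \left( \frac{\gamma}{g_m} + \frac{1}{g_m^2 R_D} \right)$$

To obtain the input-referred current, we must include the input capacitance $C_{in}$. With the input open, the output is generated by the voltage that the current develops across this capacitance:

$$\overline{V_{n,out}^2} = \overline{I_{n,in}^2} \left( \frac{1}{C_{in} \omega} \right)^2 g_m^2 R_D^2$$

Hence

$$\overline{I_{n,in}^2} \left( \frac{1}{C_{in} \omega} \right)^2 g_m^2 R_D^2 = \left( 4kT\gamma g_m + \frac{4kT}{R_D} \right) R_D^2$$

The result is

$$\overline{I_{n,in}^2} = (C_{in} \omega)^2 \, \frac{4kT}{g_m^2} \left( \gamma g_m + \frac{1}{R_D} \right)$$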
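The CS-stage example can be reproduced numerically. The sketch below assumes the standard thermal-noise expressions for the channel ($4kT\gamma g_m$) and the load resistor ($4kT/R_D$) with flicker noise ignored; the values of $g_m$ and $R_D$ are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def cs_output_noise(gm, r_d, gamma=2 / 3, temp=300.0):
    """Output-referred thermal noise voltage PSD of a CS stage in V^2/Hz."""
    i_n_sq = 4 * K_B * temp * gamma * gm + 4 * K_B * temp / r_d  # channel + load
    return i_n_sq * r_d**2


def cs_input_noise(gm, r_d, gamma=2 / 3, temp=300.0):
    """Input-referred thermal noise voltage PSD of a CS stage in V^2/Hz."""
    return 4 * K_B * temp * (gamma / gm + 1 / (gm**2 * r_d))


gm, r_d = 1e-3, 10e3  # hypothetical 1 mS and 10 kOhm (gain of 10)
v_in = math.sqrt(cs_input_noise(gm, r_d))
v_out = math.sqrt(cs_output_noise(gm, r_d))
print(f"input-referred: {v_in * 1e9:.2f} nV/sqrt(Hz), output: {v_out * 1e9:.2f} nV/sqrt(Hz)")
```

Dividing the output-referred PSD by the squared gain gives the same value as the input-referred expression, and raising $g_m$ lowers both terms.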


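Suppose the source impedance is $Z_S$ and the input impedance is $Z_{in}$. With the complete noise model, the noise voltage appearing at the input is

$$V_{n,X} = \frac{Z_{in}}{Z_{in} + Z_S} V_{n,in} + \frac{Z_{in} Z_S}{Z_{in} + Z_S} I_{n,in}$$

From the equation above, we can see that if $|Z_S I_{n,in}| \ll |V_{n,in}|$, the current term can be neglected. In the language of noise, we should write $|Z_S|^2 \, \overline{I_{n,in}^2} \ll \overline{V_{n,in}^2}$.

We conclude that the input-referred noise current can be neglected if the source impedance satisfies $|Z_S|^2 \ll \overline{V_{n,in}^2} / \overline{I_{n,in}^2}$.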

Some may worry that the two sources are derived from the same physical noise sources in the system, and therefore count the noise twice or become correlated. Each of them is fixed by its own boundary condition (shorted input or open input), and together the two conditions determine the pair uniquely. So it is not necessary to worry about the overlap.

6.5 Noise in Single-Stage Amplifiers

The gate of a MOSFET usually serves as the input terminal, and it presents a very high impedance, which makes the input-referred noise current awkward to use directly. To solve this “high-impedance problem”, the noise source can be moved to another place in the circuit. This method is only applicable to single-stage amplifiers.

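Lemma: A noise voltage source $\overline{V_n^2}$ in series with the gate is equivalent to a noise current source $\overline{I_n^2}$ between drain and source if $\overline{V_n^2} = \overline{I_n^2} / g_m^2$.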

Equivalence of noise source on

Since the circuits have equal output impedances, we simply examine the output short-circuit currents. If the output current is provided only by the current source (fig c),

Solved

If provided by voltage source only,

Solved

Now plug in , where the total transconductance is obviously . Then the two output currents match, which proves the lemma.


CS-Stage

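We have already calculated the thermal noise in the example in Section 6.4, so we do not repeat it. Now we just add the flicker noise.

Since the flicker noise can be placed in series with the gate directly, the total input-referred noise voltage is

$$\overline{V_{n,in}^2} = 4kT \left( \frac{\gamma}{g_m} + \frac{1}{g_m^2 R_D} \right) + \frac{K}{C_{ox} W L} \cdot \frac{1}{f}$$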

To reduce the noise, the transconductance should be maximized.


CG-Stage

A CG stage serves as a current buffer. At the output terminal, the noise is composed of the thermal noise of $R_D$ and that of the MOS channel (neglecting the flicker noise for now).

CG-Stage noise calculation

Recall that the gain of the CG stage is $(g_m + g_{mb}) R_D$. Short the input to obtain the input-referred noise voltage:

$$\overline{V_{n,in}^2} = \frac{4kT \left( \gamma g_m + 1/R_D \right)}{(g_m + g_{mb})^2}$$

Then open the input terminal. Since the source terminal is open, the MOS noise current flows into the source node and meets an infinite impedance, which raises the source voltage and reduces $V_{GS}$. The decrease in $V_{GS}$ generates an opposing current that cancels the MOS thermal noise current completely. Thus, the thermal noise of the MOS is cancelled by itself through feedback (equivalent to an infinite source degeneration resistor). Since there is only one current path from input to output, the input-referred noise current equals the thermal noise current of the load resistor:

$$\overline{I_{n,in}^2} = \frac{4kT}{R_D}$$


For flicker noise, it is originally applied on the gate of MOS, i.e., . With the current transfer by transconductance,

Divide the gain to obtain input-referred voltage

The input-referred current is more obvious


Source Followers

The noise comes from both devices. Neglect the flicker noise. The output noise

The gain of a source follower is

Then transfer back to the input


Cascode Stage

Cascode circuit

The noise contribution of M2 vanishes (at low frequencies, with CLM neglected) because the current through M2 is forced to equal that of M1 at node X.


6.6 Differential Stages

Differential Mirror

Differential Pair

Since the noise currents in the two paths are uncorrelated, nobody can claim that they are in antiphase, so the common source node cannot be regarded as a virtual ground. The half-circuit equivalence is therefore not directly applicable here.

Suppose first that the tail source $I_{SS}$ is a perfect current source. The noise then cannot change the total current, and the source node can temporarily be treated as a virtual ground. Assume the two paths are identical; then the noise current in each path is composed of the thermal noise of $R_D$ and that of the MOS:

$$\overline{I_n^2} = 4kT\gamma g_m + \frac{4kT}{R_D}$$

Referred to the input with the gain $g_m R_D$,

$$\overline{V_{n,in}^2}\Big|_{\text{one side}} = 4kT \left( \frac{\gamma}{g_m} + \frac{1}{g_m^2 R_D} \right)$$

The single-ended noise must then be included in the differential signal. Since the sign has no meaning for the statistical properties of uncorrelated noise, the single-ended noise powers are added rather than subtracted:

$$\overline{V_{n,in}^2} = 8kT \left( \frac{\gamma}{g_m} + \frac{1}{g_m^2 R_D} \right)$$

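The flicker noise is handled in the same way: each input transistor adds its own gate-referred flicker term, and the two powers add.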

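Then take the noise of $I_{SS}$ into consideration, denoted as $I_n$. With a small differential input $\Delta V_{in}$,

$$\Delta I_{D1} - \Delta I_{D2} = g_m \Delta V_{in} = \sqrt{2 \mu_n C_{ox} \frac{W}{L} \cdot \frac{I_{SS} + I_n}{2}} \, \Delta V_{in}$$

Suppose $I_n \ll I_{SS}$, then apply a Taylor expansion:

$$\Delta I_{D1} - \Delta I_{D2} \approx \sqrt{\mu_n C_{ox} \frac{W}{L} I_{SS}} \left( 1 + \frac{I_n}{2 I_{SS}} \right) \Delta V_{in} = g_{m0} \left( 1 + \frac{I_n}{2 I_{SS}} \right) \Delta V_{in}$$

Thus, the noise of $I_{SS}$ modulates the transconductance and can impact the differential output.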


Current Mirror

Current mirror with parasitic cap.

The REF MOS introduces a thermal current noise . With the transconductance, it is transferred to the gate voltage

Note that the REF MOS is diode-connected. Looking into its gate (in fact into its drain), the impedance is approximately the reciprocal of its transconductance. By Thevenin’s theorem, the REF MOS is equivalent to a noise voltage source with this internal resistance. The internal resistance, combined with the external capacitance, forms a low-pass filter with the transfer function

The input of M1 is first processed by this transfer function and then becomes the output current noise through . So

Hence the noise caused by thermal noise is (include the thermal noise of M1 itself)

As for the flicker noise, just replace the with because the flicker noise is directly applied on the gate.


OTA

OTA

M1 generates a thermal noise , superposed with that of M3. The reason why they did not cancel each other is that M3 shows a low impedance of about . On the gate, the transferred is

Superposed with the two flicker noise

then these voltage noise will be transferred to current noise in M4-M1 path through , superposed with M2 and M4 thermal noise, obtaining the total output current

Suppose M1 and M2, M3 and M4 are respectively identical, then the current can be simplified

With the flicker noise of M1 and M2

The output impedance of OTA is , obtaining the output voltage noise

Dividing the result by ,

Don’t forget the flicker noise of M1 and M2
