Thermodynamics and Statistical Mechanics

EleCannonic

Copyright Notice:

This article is licensed under CC BY-NC-SA 4.0.

Licensing Info:

Commercial use of this content is strictly prohibited. For more details on licensing policy, please visit the About page.

Part I. Macroscopic Thermodynamics

1.1 Equilibrium State and State Parameters

1.1.1 Equilibrium State

Thermodynamic equilibrium represents a fundamental state where a system's macroscopic properties remain constant over time without external influences. This static appearance masks intense microscopic activity - molecules continue moving and colliding, but their collective behavior produces stable macroscopic averages. A critical distinction exists between true equilibrium and steady states: while both show constant properties, steady states require continuous energy or matter exchange with the surroundings, like a metal rod maintaining a temperature gradient between endpoints held at 0°C and 100°C. True equilibrium demands three simultaneous conditions: thermal uniformity (no temperature gradients), mechanical uniformity (no pressure gradients), and chemical uniformity (no composition gradients). State parameters quantify these equilibrium properties. Volume $V$ represents the space available for molecular motion, defined as the container's internal volume minus the excluded volume of the molecules themselves. Pressure $p$ is the perpendicular force per unit area exerted by molecules on the container walls, arising from momentum transfer during collisions. For systems without electromagnetic effects or chemical reactions, $p$ and $V$ alone suffice to define the state - such systems are called simple systems. The equilibrium concept is idealized but essential, as real systems constantly evolve toward this state through particle interactions.

1.1.2 Temperature

The zeroth law of thermodynamics provides the rigorous foundation for temperature's existence. Consider three systems A, B, and C. When A and C achieve thermal equilibrium, their state parameters satisfy $f_{AC}(p_A, V_A; p_C, V_C) = 0$. Similarly, B-C equilibrium gives $f_{BC}(p_B, V_B; p_C, V_C) = 0$. The zeroth law asserts that A-B equilibrium must then hold. Mathematical analysis reveals these relations factorize into a universal function of each system's own parameters: $g_A(p_A, V_A) = g_C(p_C, V_C)$ and $g_B(p_B, V_B) = g_C(p_C, V_C)$. Equating the two expressions proves $g_A(p_A, V_A) = g_B(p_B, V_B)$. This universal function defines temperature - an intensive property identical for systems in thermal equilibrium. Temperature scales implement this concept. Empirical scales like Fahrenheit (°F) and Celsius (°C) originated from practical thermometry but suffer from material-dependent nonlinearity. The ideal gas temperature scale resolves this by exploiting gases at low density, where intermolecular forces become negligible. For a constant-volume gas thermometer: $T = 273.16\,\mathrm{K} \times \lim_{p_{tr} \to 0}(p/p_{tr})$, where $p_{tr}$ is the pressure at water's triple point (273.16 K). This limiting process eliminates gas-specific behaviors, ensuring a universal temperature definition. The thermodynamic (Kelvin) scale is defined fundamentally through Carnot efficiency: $Q_1/Q_2 = T_1/T_2$ for reversible heat engines between reservoirs at $T_1$ and $T_2$, matching the ideal gas scale where applicable. Modern practical scales like ITS-90 implement thermodynamic temperatures using specified fixed points and instruments: gas thermometry below 24 K, platinum resistance thermometers (13.8 K to 961°C), and radiation thermometry above 961°C. The Kelvin (K) and Celsius (°C) scales relate through $t/{}^{\circ}\mathrm{C} = T/\mathrm{K} - 273.15$, maintaining identical unit sizes for temperature intervals.

1.1.3 State Equation

The state equation establishes fundamental relationships between thermodynamic parameters at equilibrium. Three experimental gas laws provide the foundation:
  • Boyle’s law ($pV = \text{const}$ at constant $T$) establishes isothermal compressibility
  • Gay-Lussac’s law ($V = V_0(1 + \alpha t)$ at constant $p$) defines the volume expansion coefficient
  • Charles’s law ($p = p_0(1 + \beta t)$ at constant $V$) gives the pressure coefficient

The ideal gas equation synthesizes these observations with Avogadro's principle. Starting from Boyle's law $pV = C(T)$, consider a constant-pressure process referenced to the triple point: $V/V_{tr} = T/T_{tr}$. Combining with Boyle's law at the triple point yields $pV = (p_{tr}V_{tr}/T_{tr})\,T$. Avogadro's law confirms that $p_{tr}V_{tr}/T_{tr}$ is identical for all gases at a given mole number $\nu$, defining the universal constant $R = p_{tr}V_{m,tr}/T_{tr}$, so that $pV = \nu RT$, where $V_{m,tr}$ is the molar volume at the triple point.
Microscopically, Boltzmann's constant $k = R/N_A$ connects macroscopic and molecular descriptions: $p = nkT$, where $n$ is the number density. For gas mixtures, Dalton's law ($p = \sum_i p_i$) with partial pressures $p_i = \nu_i RT/V$ combines to $pV = (\sum_i \nu_i)RT$, proving the mixture state equation. Real gases deviate from ideality at high densities and low temperatures. The van der Waals equation incorporates molecular volume ($b$) and attraction ($a$): $\left(p + \frac{a\nu^2}{V^2}\right)(V - \nu b) = \nu RT$. Critical parameters derive from the mathematical conditions at the critical point: $(\partial p/\partial V)_T = 0$ and $(\partial^2 p/\partial V^2)_T = 0$. Solving these simultaneously yields (per mole): $V_{m,c} = 3b$, $T_c = \frac{8a}{27Rb}$, $p_c = \frac{a}{27b^2}$. The reduced compressibility factor $Z_c = p_c V_{m,c}/(RT_c) = 3/8$ provides a universal benchmark. For broader accuracy, the Onnes virial expansion $pV_m = RT\left(1 + \frac{B(T)}{V_m} + \frac{C(T)}{V_m^2} + \cdots\right)$ systematically corrects deviations through temperature-dependent virial coefficients, measurable via gas compression isotherms.
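As a quick numerical check of these critical-point relations, here is a minimal Python sketch; the van der Waals constants $a$ and $b$ for CO₂ are approximate textbook values used only as illustrative inputs:

```python
# Critical point of a van der Waals gas from its a, b constants.
R = 8.314          # J mol^-1 K^-1
a = 0.3640         # Pa m^6 mol^-2 (CO2, approximate)
b = 4.267e-5       # m^3 mol^-1   (CO2, approximate)

V_c = 3 * b                    # critical molar volume
T_c = 8 * a / (27 * R * b)     # critical temperature
p_c = a / (27 * b ** 2)        # critical pressure
Z_c = p_c * V_c / (R * T_c)    # universal ratio, exactly 3/8

print(f"V_c = {V_c:.3e} m^3/mol")   # ~1.28e-4
print(f"T_c = {T_c:.1f} K")         # ~304 K, close to CO2's measured 304.1 K
print(f"p_c = {p_c:.3e} Pa")        # ~7.4e6 Pa
print(f"Z_c = {Z_c:.4f}")           # 0.3750 for every van der Waals gas
```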

1.2 Microscopic Model of Matter

1.2.1 Introduction

The foundation of molecular kinetic theory lies in understanding matter at its most fundamental level. All macroscopic substances - whether solids, liquids, or gases - consist of vast numbers of microscopic particles called atoms or molecules, separated by empty space. This atomic structure explains why materials can be compressed: the apparent solidity of matter is an illusion created by electromagnetic forces between particles, not actual contact between them. Modern scientific instruments like scanning tunneling microscopes have made this invisible world visible, allowing us to image individual atoms and even manipulate them into structures like letters or patterns. The ceaseless, chaotic motion of these particles forms the heart of thermal phenomena. This motion intensifies with temperature, as dramatically demonstrated by Brownian motion - the random dance of pollen grains or smoke particles suspended in fluid. When observed under a microscope, these particles jitter unpredictably due to unbalanced collisions with surrounding fluid molecules. The smaller the particle, the more violent its motion becomes, providing direct evidence that what we perceive as "still" liquid or gas is actually a frenzy of molecular activity. This perpetual motion isn't confined to fluids; even in solids, atoms vibrate around fixed positions like springs connecting a molecular scaffold. Interactions between molecules govern material states through competing forces. At extremely close range (<0.1 nm), strong repulsive forces dominate as electron clouds overlap, preventing matter from collapsing into infinite density. At intermediate distances (0.1-1 nm), attractive forces take over through electromagnetic interactions between temporary dipoles in otherwise neutral molecules - the van der Waals forces that give liquids cohesion. These opposing forces create an equilibrium distance where molecules naturally settle, like dancers maintaining personal space in a crowded room. The delicate balance between molecular motion and these interaction forces explains phase transitions: heating provides kinetic energy to overcome attraction, turning solids to liquids to gases.

1.2.2 Pressure of Ideal Gases

To understand gas pressure at the molecular level, we construct a simplified model that captures essential behaviors while ignoring complex details. Imagine gas molecules as infinitesimal points rather than physical objects - a reasonable approximation since molecular diameters (~10⁻¹⁰ m) are dwarfed by typical intermolecular separations (~10⁻⁹ m at STP). Between collisions, these particles move freely without mutual attraction or repulsion, like commuters ignoring each other in a vast train station. Collisions between molecules or with container walls occur instantaneously and elastically, conserving both momentum and kinetic energy like perfect billiard ball impacts. This idealization emerges naturally from experimental observations: gases expand to fill containers because molecules move independently; low densities make compression easy by reducing intermolecular distances; constant pressure at equilibrium implies steady collision rates. The model's power lies in transforming the chaotic complexity of 10²³ molecules into tractable statistics, where individual paths become irrelevant and collective averages dominate.

1.2.3 Statistical Assumptions at Equilibrium

When a gas reaches thermodynamic equilibrium, two powerful statistical principles emerge despite ongoing molecular chaos. First, molecules distribute uniformly in space - any macroscopic volume element contains approximately equal particle numbers regardless of location. This spatial homogeneity allows defining the number density $n = N/V$ as a constant throughout the container. Second, molecular velocities show no directional preference; all three Cartesian components share identical statistical properties: $\langle v_x \rangle = \langle v_y \rangle = \langle v_z \rangle = 0$ and $\langle v_x^2 \rangle = \langle v_y^2 \rangle = \langle v_z^2 \rangle = \frac{1}{3}\langle v^2 \rangle$. The first set indicates no net flow, while the second reveals equal partitioning of kinetic energy across dimensions. These symmetries transform the intractable problem of tracking individual molecules into manageable statistical mechanics.

1.2.4 Ideal Gas Pressure Formula

Pressure emerges from the relentless barrage of molecular impacts on container walls. Consider molecules approaching a wall area $dA$ perpendicular to the x-axis. Each collision reverses the x-component of momentum, delivering impulse $2mv_x$ to the wall. To find the total force, we calculate how many molecules strike $dA$ in time $dt$. Molecules with x-velocity $v_{ix}$ within distance $v_{ix}\,dt$ can reach the wall, forming an imaginary collision cylinder of volume $v_{ix}\,dA\,dt$. With number density $n_i$ for velocity group $i$, the collision count is $n_i v_{ix}\,dA\,dt$. Summing impulses from all velocity groups: $p = \sum_i n_i m v_{ix}^2 = nm\langle v_x^2 \rangle$. Using velocity isotropy and defining the average translational kinetic energy $\bar{\varepsilon}_t = \frac{1}{2}m\langle v^2 \rangle$: $p = \frac{1}{3}nm\langle v^2 \rangle = \frac{2}{3}n\bar{\varepsilon}_t$. This elegant derivation reveals pressure as a statistical manifestation of molecular kinetic energy.
Diagram of pressure on the vessel wall

1.2.5 Microscopic Interpretation of Temperature

Connecting microscopic motion to temperature starts with combining the pressure equation $p = \frac{2}{3}n\bar{\varepsilon}_t$ with the macroscopic ideal gas law $p = nkT$. Equating these yields the profound relationship: $\bar{\varepsilon}_t = \frac{3}{2}kT$. This simple formula carries deep implications. Temperature directly measures the average kinetic energy of molecular translation - a universal currency where all gas molecules hold equal value regardless of mass. At room temperature (300 K), every molecule - from lightweight hydrogen to heavy xenon - carries approximately $6.2 \times 10^{-21}$ J of translational energy. This energy equipartition explains why lighter molecules move faster to achieve the same energy: $v_{rms} = \sqrt{3kT/m}$. For example, hydrogen molecules ($m \approx 3.3 \times 10^{-27}$ kg) zip along at about 1920 m/s at 300 K, while oxygen molecules ($m \approx 5.3 \times 10^{-26}$ kg) move at a more sedate 480 m/s. The collective nature of temperature becomes apparent when considering gas mixtures. Dalton's law emerges naturally since all components share the same $\bar{\varepsilon}_t = \frac{3}{2}kT$ at equilibrium. This energy equality persists even between dissimilar molecules because collisions redistribute kinetic energy until balance is achieved - similar to billiard balls of different masses reaching the same average energy after numerous impacts.
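A short sketch of these numbers in Python; the molecular masses are approximate inputs:

```python
import math

k = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0          # room temperature, K

# Average translational kinetic energy is the same for every gas.
eps = 1.5 * k * T
print(f"<eps_t> = {eps:.2e} J")   # ~6.2e-21 J

# rms speed depends on mass: v_rms = sqrt(3kT/m).
masses = {"H2": 2 * 1.66e-27, "O2": 32 * 1.66e-27}  # kg, approximate
for gas, m in masses.items():
    v_rms = math.sqrt(3 * k * T / m)
    print(f"{gas}: v_rms = {v_rms:.0f} m/s")  # H2 ~1930, O2 ~480
```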

1.2.6 Molecular Forces

Beneath the apparent simplicity of gases lies a complex interplay of electromagnetic forces. Each molecule contains positively charged nuclei surrounded by negatively charged electrons. When molecules approach, their electron clouds distort, creating temporary dipoles that generate attractive forces - the van der Waals interaction. At closer range, Pauli repulsion dominates as overlapping electron orbitals resist compression. These competing effects produce the characteristic molecular force curve: strongly repulsive below the equilibrium distance $r_0$, attractive above $r_0$, and negligible beyond a few nanometers. Physicists model this behavior using potential energy functions:
  • Lennard-Jones potential: $u(r) = 4\varepsilon\left[(\sigma/r)^{12} - (\sigma/r)^{6}\right]$ balances short-range repulsion (the $r^{-12}$ term) with longer-range attraction (the $r^{-6}$ term). The minimum at $r_0 = 2^{1/6}\sigma$ represents the optimal bonding distance.
  • Sutherland potential: models impenetrable hard spheres ($u = \infty$ for $r < d$) with a weak attractive tail ($u \propto -r^{-6}$ for $r > d$).
  • Hard-sphere potential: ignores attraction entirely, focusing on excluded volume effects. These models serve different purposes: Lennard-Jones accurately describes noble gases, Sutherland simplifies van der Waals theory, while hard-sphere models help in understanding dense fluids.

1.2.7 Pressure in van der Waals Gases

Real gases deviate from ideal behavior through two molecular effects: finite size and mutual attraction. Johannes van der Waals ingeniously modified the ideal gas law to account for both. First, molecules occupy physical space, reducing the volume available for motion. For $N$ molecules, the excluded volume isn't simply $N$ times the molecular volume because exclusion involves pairwise interactions. Statistical analysis shows $b = \frac{2}{3}\pi N_A d^3$ per mole - four times the total molecular volume - where $d$ is the effective molecular diameter. This correction transforms the volume in the ideal gas law to $V - \nu b$. Second, attractive forces reduce pressure. Molecules near the container walls experience a net inward pull from bulk molecules, weakening their collisions. This "internal pressure" scales with the product of the densities of attracting and attracted molecules: $\Delta p = a\left(\frac{\nu}{V}\right)^2$. Using Sutherland's potential, the constant $a$ can be derived from the potential depth $\varepsilon$ and range $d$. Combining both corrections yields the van der Waals equation: $\left(p + \frac{a\nu^2}{V^2}\right)(V - \nu b) = \nu RT$. This elegantly explains real gas behavior like liquefaction and critical phenomena.

1.3 Distribution of Molecular Speeds and Energy

1.3.1 Maxwell’s Speed Distribution Law

The molecular speed distribution in an ideal gas at thermal equilibrium is derived from fundamental statistical principles. Consider the velocity distribution function $F(v_x, v_y, v_z)$, defined such that the probability of a molecule having velocity components between $v_x$ and $v_x + dv_x$, $v_y$ and $v_y + dv_y$, and $v_z$ and $v_z + dv_z$ is $F(v_x, v_y, v_z)\,dv_x\,dv_y\,dv_z$. For an isotropic system, this function depends only on the speed $v = \sqrt{v_x^2 + v_y^2 + v_z^2}$, and the velocity components are statistically independent. These assumptions lead to the functional form $F(v_x, v_y, v_z) = f(v_x)f(v_y)f(v_z)$. Taking the logarithm and applying separation of variables gives $f(v_x) \propto e^{-\alpha v_x^2}$, resulting in a Gaussian distribution. The normalization condition $\int_{-\infty}^{\infty} f(v_x)\,dv_x = 1$ determines the prefactor via Gaussian integrals. The equipartition theorem fixes $\alpha = m/2kT$, yielding the Maxwell velocity distribution: $F(v_x, v_y, v_z) = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/2kT}$. To obtain the speed distribution $f(v)$, integrate over all velocity directions (spherical coordinates): $f(v) = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} v^2 e^{-mv^2/2kT}$. Thus $f(v)$ describes the probability density for molecular speeds.

1.3.2 Characteristic Speeds and Distribution Properties

The most probable speed occurs at the maximum of $f(v)$. Solving $df/dv = 0$ gives: $v_p = \sqrt{2kT/m}$. The average speed is computed by integrating $v$ weighted by $f(v)$: $\bar{v} = \int_0^\infty v f(v)\,dv$. Using the substitution $x = mv^2/2kT$ and gamma functions, this evaluates to: $\bar{v} = \sqrt{8kT/\pi m}$. The root-mean-square speed derives from the mean-square speed $\langle v^2 \rangle = \int_0^\infty v^2 f(v)\,dv = 3kT/m$. Thus $v_{rms} = \sqrt{3kT/m}$, and $v_p < \bar{v} < v_{rms}$. Temperature dependence arises because all three speeds scale as $\sqrt{T}$, causing distribution broadening with rising temperature. The mass dependence $\propto 1/\sqrt{m}$ implies heavier molecules exhibit narrower distributions.
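The closed forms can be cross-checked numerically; the sketch below uses an approximate N₂ mass as input and integrates the Maxwell speed distribution directly:

```python
import numpy as np

k = 1.380649e-23   # J/K
T = 300.0
m = 28 * 1.66e-27  # N2 molecular mass, kg (approximate)

# Closed-form characteristic speeds.
v_p   = np.sqrt(2 * k * T / m)
v_avg = np.sqrt(8 * k * T / (np.pi * m))
v_rms = np.sqrt(3 * k * T / m)

# Cross-check by integrating the Maxwell speed distribution numerically.
v = np.linspace(0.0, 5000.0, 500_001)
dv = v[1] - v[0]
f = 4 * np.pi * (m / (2 * np.pi * k * T)) ** 1.5 * v**2 * np.exp(-m * v**2 / (2 * k * T))
norm = f.sum() * dv                 # should be very close to 1
mean = (v * f).sum() * dv
rms = np.sqrt((v**2 * f).sum() * dv)

print(f"v_p = {v_p:.1f}  v_avg = {v_avg:.1f} ({mean:.1f})  v_rms = {v_rms:.1f} ({rms:.1f})")
print(f"normalization = {norm:.6f}")
# Expect roughly v_p ~ 422, v_avg ~ 476, v_rms ~ 517 m/s for N2 at 300 K.
```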

1.3.3 Boltzmann Distribution in Force Fields

In conservative force fields, the Maxwell distribution generalizes to the Boltzmann distribution. For a potential energy $\varepsilon_p(\mathbf{r})$, the phase-space distribution is: $n(\mathbf{r}, \mathbf{v}) = n_0\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-(\varepsilon_k + \varepsilon_p)/kT}$, where $\varepsilon_k = \frac{1}{2}mv^2$ and $n_0$ is the density where $\varepsilon_p = 0$. Integrating over velocities yields the spatial density: $n(\mathbf{r}) = n_0 e^{-\varepsilon_p(\mathbf{r})/kT}$. For gravity, $\varepsilon_p = mgz$, this becomes: $n(z) = n_0 e^{-mgz/kT}$. The scale height $H = kT/mg$ characterizes the exponential decay (~8 km for Earth's atmosphere). The Maxwell-Boltzmann distribution combines both aspects.
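A minimal sketch of the barometric formula, taking an approximate mean molecular mass for air as input:

```python
import math

k = 1.380649e-23
g = 9.81
m_air = 29 * 1.66e-27     # mean molecular mass of air, kg (approximate)
T = 288.0                 # K, taken uniform for this isothermal estimate

H = k * T / (m_air * g)   # scale height kT/mg
print(f"H = {H / 1000:.1f} km")   # ~8.4 km

# Density ratio at altitude z from n(z) = n0 * exp(-m g z / k T):
for z in (1e3, 5e3, 10e3):
    print(f"n({z / 1000:.0f} km)/n0 = {math.exp(-z / H):.3f}")
```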

1.3.4 Equipartition Theorem and Energy Distribution

The equipartition theorem states: each quadratic term in a system's Hamiltonian contributes $\frac{1}{2}kT$ to the average energy. For a coordinate with energy $\varepsilon = \alpha q^2$, the Boltzmann distribution gives: $\langle \varepsilon \rangle = \frac{\int \alpha q^2 e^{-\alpha q^2/kT}\,dq}{\int e^{-\alpha q^2/kT}\,dq} = \frac{1}{2}kT$, using Gaussian integrals. Molecular degrees of freedom include:
  • Translational: 3 terms → $\frac{3}{2}kT$

  • Rotational: 2 (diatomic) or 3 (polyatomic) terms → $kT$ or $\frac{3}{2}kT$

  • Vibrational: kinetic and potential terms each contribute $\frac{1}{2}kT$ per mode

For a diatomic molecule, the total energy is $\frac{5}{2}kT$ (rigid) or $\frac{7}{2}kT$ (non-rigid). The molar heat capacity at constant volume is: $C_{V,m} = \frac{f}{2}R$, where $f$ is the number of quadratic degrees of freedom.
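The resulting heat capacities (compare Section 1.5.6) can be tabulated in a few lines:

```python
R = 8.314  # J mol^-1 K^-1

# f = number of quadratic terms in the molecular energy.
cases = {
    "monatomic (f=3)": 3,
    "rigid diatomic (f=5)": 5,
    "vibrating diatomic (f=7)": 7,
}
for name, f in cases.items():
    Cv = f / 2 * R
    Cp = Cv + R           # ideal gas: Cp - Cv = R
    gamma = Cp / Cv
    print(f"{name}: Cv = {Cv:.2f}, Cp = {Cp:.2f}, gamma = {gamma:.3f}")
# gamma: 1.667, 1.400, 1.286 -- the f=5 value matches air near 300 K
```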

1.4 Mean Free Path of Gas Molecules

1.4.1 Mean Collision Frequency and Mean Free Path

Gas molecules move at high thermal speeds (e.g., nitrogen at 27°C averages 476 m/s), yet macroscopic diffusion occurs slowly due to frequent collisions that randomize molecular paths. The mean free path $\bar{\lambda}$ quantifies the average distance a molecule travels between collisions, while the collision frequency $\bar{Z}$ counts collisions per unit time. To model collisions, molecules are treated as rigid spheres with effective diameter $d$, ignoring long-range attraction but accounting for the short-range repulsion that prevents overlap. For identical molecules, the collision cross-section is $\sigma = \pi d^2$. When a "test" molecule moves at average relative speed $\bar{u} = \sqrt{2}\,\bar{v}$ through a gas of number density $n$, it sweeps a collision cylinder of volume $\sigma\bar{u}\,dt$ in time $dt$, shown in the figure below:
Collision Cylinder of a molecule
Then the collision frequency is: $\bar{Z} = \sqrt{2}\,\pi d^2 \bar{v} n$, where $\bar{v} = \sqrt{8kT/\pi m}$ is the mean thermal speed. The mean free path follows as: $\bar{\lambda} = \frac{\bar{v}}{\bar{Z}} = \frac{1}{\sqrt{2}\,\pi d^2 n}$. Using the ideal gas law $p = nkT$, this becomes: $\bar{\lambda} = \frac{kT}{\sqrt{2}\,\pi d^2 p}$. Thus $\bar{\lambda}$ scales inversely with pressure (e.g., air at STP: $\bar{\lambda} \sim 7 \times 10^{-8}$ m; at $p \sim 10^{-2}$ Pa: $\bar{\lambda} \sim 1$ m).
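A quick sketch of these estimates, taking an approximate effective diameter for air molecules as input:

```python
import math

k = 1.380649e-23
d = 3.7e-10              # effective molecular diameter of air, m (approximate)
T, p = 273.0, 101325.0   # STP

n = p / (k * T)                                 # number density from p = nkT
lam = 1 / (math.sqrt(2) * math.pi * d**2 * n)   # mean free path
print(f"n = {n:.2e} m^-3, lambda = {lam:.2e} m")   # ~2.7e25, ~6e-8 m

# lambda scales inversely with pressure:
print(f"at p = 1e-2 Pa: lambda = {lam * p / 1e-2:.2f} m")
```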

1.4.2 Distribution of Free Paths

The probability that a molecule travels a distance $x$ without collision follows an exponential decay due to random collisions. Suppose $N_0$ molecules start together at $x = 0$, and let $N(x)$ be the number that have not yet undergone any collision after traveling distance $x$. Within the interval $x$ to $x + dx$, the number of molecules undergoing their first collision, $-dN$, is proportional to the number of surviving molecules $N(x)$ and to the size of $dx$. The proportionality constant is $1/\bar{\lambda}$, thus: $-dN = \frac{N}{\bar{\lambda}}\,dx$. This gives $N(x) = N_0 e^{-x/\bar{\lambda}}$, where $N(x)$ is the number surviving to distance $x$. The probability density for a free path between $x$ and $x + dx$ is: $f(x) = \frac{1}{\bar{\lambda}}e^{-x/\bar{\lambda}}$. The mean $\int_0^\infty x f(x)\,dx = \bar{\lambda}$ is consistent with the definition of the mean free path. This distribution underpins phenomena like electron beam attenuation in vacuum tubes.

1.4.3 Viscosity

When adjacent fluid layers move at different velocities (e.g., in a flow $u_x(z)$ that varies with $z$), viscosity arises from momentum transfer perpendicular to the flow direction.
Velocity gradient in layered flow generating viscous forces
Newton's law of viscosity states: $dF = \eta\left|\frac{du_x}{dz}\right| dA$, where $\eta$ is the dynamic viscosity and $du_x/dz$ is the velocity gradient. The force slows faster layers and accelerates slower ones, equalizing momentum. Viscosity units are Pa·s (poise: 1 P = 0.1 Pa·s). Here is a kinetic derivation. Molecules crossing a plane at $z_0$ from below (region A) last collided, on average, a mean free path below the plane, so each carries flow momentum $m\,u_x(z_0 - \bar{\lambda})$; molecules crossing from above (region B) carry $m\,u_x(z_0 + \bar{\lambda})$. In time $dt$, roughly $\frac{1}{6}n\bar{v}\,dA\,dt$ molecules cross $dA$ from each side. The net momentum transported through $dA$ along the flow direction is therefore $dK = \frac{1}{6}n\bar{v}\,dA\,dt\,m\left[u_x(z_0 - \bar{\lambda}) - u_x(z_0 + \bar{\lambda})\right] \approx -\frac{1}{3}nm\bar{v}\bar{\lambda}\frac{du_x}{dz}\,dA\,dt$. Comparing with Newton's law identifies the viscosity as $\eta = \frac{1}{3}nm\bar{v}\bar{\lambda} = \frac{1}{3}\rho\bar{v}\bar{\lambda}$.
Momentum in Fluid

1.4.4 Heat Conduction

Heat conduction occurs when temperature gradients exist. Fourier's law describes the heat $dQ$ flowing through area $dA$ in time $dt$: $dQ = -\kappa\frac{dT}{dz}\,dA\,dt$, where $\kappa$ is the thermal conductivity (W·m⁻¹·K⁻¹). Heat flows down the temperature gradient, transferring energy from hot to cold regions. In gases, conduction dominates when bulk motion (convection) is negligible. For heat conduction to occur without a pressure difference ($p = nkT$ uniform), the number density adjusts so that $n_A T_A = n_B T_B$. Microscopically, energy transfer occurs via molecular collisions: molecules from hotter regions carry higher kinetic energy, exchanging it upon collision with cooler-region molecules. For molecules with $f$ degrees of freedom, the average thermal energy per molecule is $\bar{\varepsilon} = \frac{f}{2}kT$:
  • In region A (hotter, a mean free path below the plane at $z_0$): $\bar{\varepsilon}_A = \frac{f}{2}kT(z_0 - \bar{\lambda})$
  • In region B (cooler, a mean free path above): $\bar{\varepsilon}_B = \frac{f}{2}kT(z_0 + \bar{\lambda})$

The number of molecules exchanged across an area $dA$ in time $dt$ from each side is estimated as $\frac{1}{6}n\bar{v}\,dA\,dt$ (the factor $\frac{1}{6}$ arises from the fraction of molecules moving perpendicular to the surface in a specific direction within an isotropic gas). The net energy transported along the positive z-axis (i.e., the heat $dQ$) is: $dQ = \frac{1}{6}n\bar{v}\,dA\,dt\left(\bar{\varepsilon}_A - \bar{\varepsilon}_B\right)$. The temperature difference is related to the gradient at $z_0$: $T(z_0 - \bar{\lambda}) - T(z_0 + \bar{\lambda}) \approx -2\bar{\lambda}\frac{dT}{dz}$. Substituting this in and comparing to Fourier's law, the thermal conductivity is identified as: $\kappa = \frac{1}{3}n\bar{v}\bar{\lambda}\cdot\frac{f}{2}k$. Using the molar heat capacity $C_{V,m} = \frac{f}{2}N_A k$ (where $N_A$ is Avogadro's number), the specific heat capacity per unit mass $c_V$, and the density $\rho = nm$, this can be rewritten as: $\kappa = \frac{1}{3}\rho\bar{v}\bar{\lambda}c_V$.
    Demonstration of Heat Conduction

1.4.5 Diffusion

Diffusion transports mass due to density gradients. Fick's law for the mass $dM$ crossing area $dA$ in time $dt$ is: $dM = -D\frac{d\rho}{dz}\,dA\,dt$, where $D$ is the diffusion coefficient (m²·s⁻¹) and $\rho$ is the mass density. In gases, self-diffusion (identical molecules) and mutual diffusion (different species) arise from a net molecular flux from high- to low-density regions. At constant temperature, pressure relates to density via $p = nkT = \frac{\rho}{m}kT$, so a density gradient alone drives the transport. The net number of molecules transported along the positive $z$-axis through $dA$ in time $dt$ is: $dN = \frac{1}{6}\bar{v}\left[n(z_0 - \bar{\lambda}) - n(z_0 + \bar{\lambda})\right]dA\,dt$ (the factor $\frac{1}{6}$ accounts for the fraction of molecules moving perpendicular to the surface). The corresponding net mass transported is $dM = m\,dN$. Relating the density difference to the gradient at $z_0$: $\rho(z_0 - \bar{\lambda}) - \rho(z_0 + \bar{\lambda}) \approx -2\bar{\lambda}\frac{d\rho}{dz}$. Substituting and comparing to Fick's law, the diffusion coefficient is: $D = \frac{1}{3}\bar{v}\bar{\lambda}$. For mixtures of different gases (mutual diffusion), the diffusion coefficient depends on the interacting species. An example is shown below:
  Gas                        D (cm²·s⁻¹)
  Hydrogen (H₂)              0.594
  Carbon dioxide (CO₂)       0.605
Demonstration of Diffusion
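All three transport coefficients share the combination $\bar{v}\bar{\lambda}$. The sketch below evaluates them for air at STP; the molecular mass and diameter are approximate inputs, and elementary kinetic theory is only reliable to within factors of order unity:

```python
import math

k = 1.380649e-23
T, p = 273.0, 101325.0
m = 29 * 1.66e-27        # air molecule mass, kg (approximate)
d = 3.7e-10              # effective diameter, m (approximate)

n = p / (k * T)
v_avg = math.sqrt(8 * k * T / (math.pi * m))
lam = 1 / (math.sqrt(2) * math.pi * d**2 * n)
rho = n * m

eta = rho * v_avg * lam / 3                # viscosity
cv_mass = (5 / 2) * k / m                  # specific heat per mass, f = 5
kappa = rho * v_avg * lam * cv_mass / 3    # thermal conductivity
D = v_avg * lam / 3                        # self-diffusion coefficient

print(f"eta   = {eta:.2e} Pa*s")    # ~1e-5, same order as air's measured 1.7e-5
print(f"kappa = {kappa:.3f} W/m/K") # same order as the measured 0.024
print(f"D     = {D:.2e} m^2/s")
print(f"D*rho/eta = {D * rho / eta:.1f}")  # exactly 1 in this simple model
```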

1.5 The First Law of Thermodynamics

1.5.1 Thermodynamic Processes

A thermodynamic process, or simply process, occurs when the state of a system changes over time. For example, when an advancing piston compresses gas in a cylinder, the gas's volume, density, temperature, and pressure change, and at any moment during the process the density, pressure, and temperature are not identical throughout the gas. Thermodynamic processes are classified into non-quasistatic processes and quasistatic processes. In non-quasistatic processes, the system transitions from an equilibrium state to a disrupted state before reaching a new equilibrium. The time from equilibrium disruption to new equilibrium establishment is called the relaxation time ($\tau$). Actual processes often proceed rapidly, with the system undergoing further changes before reaching a new equilibrium, meaning the system passes through non-equilibrium states that cannot be described by state parameters. The free expansion of an ideal gas is a typical non-quasistatic process, where "free" means the gas expands without resistance. In quasistatic processes, every intermediate state is infinitesimally close to an equilibrium state. This is only possible when the process proceeds "infinitely slowly." For actual processes, the quasistatic approximation requires the characteristic time of state change to be much greater than the relaxation time $\tau$. Equilibrium states in quasistatic processes have definite state parameter values, represented by points on a P-V diagram for simple systems. The quasistatic change process is represented by a curve on the P-V diagram, called the process curve. Though an ideal limit, quasistatic processes are fundamental to thermodynamics and practically significant. Unless otherwise specified, thermodynamic processes refer to quasistatic processes. For instance, gradually heating a system from $T_1$ to $T_2$ through small temperature increments is quasistatic, while directly contacting a system at $T_1$ with a heat source at $T_2$ is not.

1.5.2 Work

Work is a method of energy exchange. In thermodynamics, it represents conversion between the ordered mechanical energy of the surroundings and the disordered thermal energy of the system, denoted $W$. Work can change the system's state, affecting not only mechanical motion but also thermal motion and electromagnetic states, as in friction heating (mechanical work) or electrical heating (electrical work). For gases undergoing quasistatic processes, the work done by the system is given by $\delta W = p\,dV$ and $W = \int_{V_1}^{V_2} p\,dV$, where $W < 0$ indicates net work done on the system. Graphically, the work magnitude equals the area under the process curve on a $p$-$V$ diagram. Work is not a state characteristic but a process characteristic - it depends on the path taken between states: for different processes from equilibrium state 1 to state 2, the work exchanged with the surroundings varies.

1.5.3 Heat

Heat is another method to change the system state, distinct from work. While work involves energy transfer through generalized forces causing generalized displacements, heat transfer occurs due to temperature differences. Joule's experiments demonstrated that heat production or disappearance always accompanies the equivalent disappearance or production of other energy forms (mechanical, electrical), proving that no separately conserved "caloric" exists. Instead, heat, mechanical energy, and electrical energy together conserve energy. Heat ($Q$) is transferred energy: positive when absorbed by the system, negative when released. Both heat and work quantify energy changes and are process-dependent, but they differ in origin: work stems from mechanical interactions, heat from thermal interactions.

1.5.4 The First Law of Thermodynamics

Joule's experiments revealed a definite equivalence between heat and work (1 cal = 4.186 J), showing interconversion between mechanical/electromagnetic and thermal motion. This led to the law of energy conservation and transformation: all matter possesses energy in various forms, convertible into each other and transferable between objects, with the total quantity conserved. An alternative statement: perpetual motion machines of the first kind are impossible. For a system changing from initial state 1 to final state 2 via an adiabatic process (no heat exchange), the work done by the surroundings is the adiabatic work. Joule's experiments showed that for fixed initial (state 1, temperature $T_1$) and final states (state 2, temperature $T_2$), the adiabatic work is path-independent. Thus, the internal energy $U$ is defined as a state function satisfying $U_2 - U_1 = W_{ad}$ for any adiabatic process between states 1 and 2. For non-adiabatic processes with work $W$ done on the system and heat $Q$ absorbed, energy conservation gives: $\Delta U = U_2 - U_1 = Q + W$. This is the mathematical expression of the first law, where $W > 0$ for work done on the system and $Q > 0$ for heat absorbed. For infinitesimal processes: $dU = \delta Q + \delta W$. Here, $dU$ is exact (state function), while $\delta Q$ and $\delta W$ are inexact differentials (process quantities). The first law applies to both quasistatic and non-quasistatic processes, though the initial and final states must be equilibrium states for calculations. If only volume work exists: $dU = \delta Q - p\,dV$.

1.5.5 Heat Capacity and Enthalpy

Heat capacity is defined as $C = \delta Q/dT$ when a system's temperature rises by $dT$ with absorbed heat $\delta Q$. It depends on process, substance, and mass. At constant volume ($dV = 0$), $\delta W = 0$, so $\delta Q_V = dU$, giving the constant-volume heat capacity: $C_V = \left(\frac{\partial U}{\partial T}\right)_V$, where $U$ and $C_V$ are functions of $T$ and $V$. At constant pressure, the work is $p\,dV$, so the absorbed heat is: $\delta Q_p = dU + p\,dV = d(U + pV)$. Defining the enthalpy $H = U + pV$ (a state function), $\delta Q_p = dH$. For infinitesimal processes: $C_p = \left(\frac{\partial H}{\partial T}\right)_p$, where $H$ and $C_p$ are functions of $T$ and $p$. For ideal gases, $C_p - C_V = \nu R$.

1.5.6 Internal Energy of Gases and Joule-Thomson Experiment

In Joule's 1845 free expansion experiment, gas expanded into vacuum with no temperature change observed. Applying the first law ($Q = 0$, $W = 0$) gives $\Delta U = 0$, indicating constant internal energy during adiabatic free expansion. Thus, for ideal gases: $U = U(T)$, independent of volume. However, Joule's experiment was imprecise because the water bath's large heat capacity masked any gas temperature change. The Joule-Thomson experiment (1852) improved accuracy by studying adiabatic throttling.
Joule Thomson Experiment
In adiabatic throttling, high-pressure gas passes through a porous plug to low pressure with no heat exchange ($Q = 0$). The process equation $U_1 + p_1V_1 = U_2 + p_2V_2$ gives constant enthalpy: $H_1 = H_2$. The Joule-Thomson coefficient $\mu_{JT} = \left(\frac{\partial T}{\partial p}\right)_H$ determines the temperature change: $\mu_{JT} > 0$ for cooling, $\mu_{JT} < 0$ for heating. Real gases have $\mu_{JT} \neq 0$, proving that their internal energy depends on both temperature and volume, with molecular forces present. This effect enables practical gas liquefaction (e.g., the Linde process). For ideal gases obeying $pV = \nu RT$ and $U = U(T)$: the internal energy depends solely on temperature due to negligible intermolecular forces. By the equipartition theorem, each degree of freedom contributes $\frac{1}{2}RT$ per mole. For $f$ degrees of freedom: $U_m = \frac{f}{2}RT$. The constant-volume heat capacity follows by differentiation: $C_{V,m} = \frac{f}{2}R$. Enthalpy is defined as $H = U + pV$. Substituting the ideal gas law $pV_m = RT$: $H_m = U_m + RT$. Differentiating with respect to $T$ at constant pressure: $C_{p,m} = C_{V,m} + R$. Thus the molar heat capacity at constant pressure is: $C_{p,m} = \frac{f+2}{2}R$. The heat capacity ratio is: $\gamma = \frac{C_{p,m}}{C_{V,m}} = \frac{f+2}{f}$. This confirms $C_p > C_V$ and $\gamma > 1$ for all $f$.

Process Applications:

  • Isochoric ($V$ constant): $W = 0$, $Q_V = \Delta U = \nu C_{V,m}\Delta T$, $p/T = \text{const}$.

  • Isobaric ($p$ constant): $W = p\Delta V = \nu R\Delta T$, $Q_p = \nu C_{p,m}\Delta T$, $V/T = \text{const}$.

  • Isothermal ($T$ constant): $\Delta U = 0$, $Q = W = \nu RT\ln(V_2/V_1)$.

  • Adiabatic ($Q = 0$): $dU = -p\,dV$, leading to $pV^\gamma = \text{const}$, $TV^{\gamma-1} = \text{const}$, $p^{1-\gamma}T^\gamma = \text{const}$. The work done by the gas is $W = \frac{p_1V_1 - p_2V_2}{\gamma - 1} = \nu C_{V,m}(T_1 - T_2)$. For a polytropic process $pV^n = \text{const}$, the molar heat capacity is derived from the first law and the process equation. For 1 mole of ideal gas, the first law gives $\delta Q = C_{V,m}\,dT + p\,dV$, so the molar heat capacity is defined as $C_n = \delta Q/dT = C_{V,m} + p\frac{dV}{dT}$. From the polytropic equation and the ideal gas law $pV = RT$, we get $TV^{n-1} = \text{const}$; differentiating with respect to $T$ gives $V^{n-1} + (n-1)TV^{n-2}\frac{dV}{dT} = 0$, so $\frac{dV}{dT} = -\frac{V}{(n-1)T}$. Substituting, with $pV = RT$: $C_n = C_{V,m} - \frac{R}{n-1}$. Using $R = C_{p,m} - C_{V,m}$ and $\gamma = C_{p,m}/C_{V,m}$, this can be expressed in terms of the heat capacities: $C_n = C_{V,m}\,\frac{n - \gamma}{n - 1}$.

Then special cases can be verified:

  • Isobaric ($n = 0$): $C = \gamma C_{V,m} = C_{p,m}$

  • Isochoric ($n \to \pm\infty$): $C \to C_{V,m}$ (limit of $\frac{n-\gamma}{n-1} \to 1$)

  • Isothermal ($n = 1$): $C \to \infty$ (undefined, consistent with $dT = 0$)

  • Adiabatic ($n = \gamma$): $C = 0$ (since $\delta Q = 0$)
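These limits are easy to verify numerically from $C_n = C_{V,m}(n-\gamma)/(n-1)$:

```python
# Molar heat capacity along a polytropic process pV^n = const.
R = 8.314
f = 5                    # quadratic degrees of freedom (diatomic example)
Cv = f / 2 * R
Cp = Cv + R
gamma = Cp / Cv

def C_poly(n):
    return Cv * (n - gamma) / (n - 1)

print(f"isobaric   n=0:     C = {C_poly(0):.3f}  (Cp = {Cp:.3f})")
print(f"adiabatic  n=gamma: C = {C_poly(gamma):.3f}  (expect 0)")
print(f"isochoric  n->inf:  C = {C_poly(1e9):.3f}  (Cv = {Cv:.3f})")
# isothermal n=1 makes the denominator vanish: C diverges, since dT = 0
```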

1.5.7 Cyclic Processes and Carnot Cycle

Cyclic processes involve a working substance returning to its initial state after completing a series of thermodynamic changes. Quasistatic cycles are represented as closed curves on P-V diagrams, with clockwise cycles functioning as heat engines and counter-clockwise cycles as refrigeration systems. In heat engines, the working substance absorbs heat $Q_1$ from a high-temperature reservoir at $T_1$, performs net work $W$ on the surroundings, and releases heat $Q_2$ to a low-temperature reservoir at $T_2$. The first law gives: $W = Q_1 - Q_2$, with the thermal efficiency defined as: $\eta = \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1}$.
P-V Diagram of the Carnot Cycle

The Carnot cycle—a fundamental model for heat engines—combines two isothermal and two adiabatic processes. For an ideal gas working substance:

  1. Isothermal expansion (A→B): $Q_1 = \nu RT_1\ln(V_B/V_A)$

  2. Adiabatic expansion (B→C): $T_1V_B^{\gamma-1} = T_2V_C^{\gamma-1}$

  3. Isothermal compression (C→D): $Q_2 = \nu RT_2\ln(V_C/V_D)$

  4. Adiabatic compression (D→A): $T_2V_D^{\gamma-1} = T_1V_A^{\gamma-1}$

The adiabatic equations yield the volume ratio relationship $V_B/V_A = V_C/V_D$. Substituting into the efficiency formula: $\eta = 1 - \frac{Q_2}{Q_1} = 1 - \frac{T_2}{T_1}$. This result depends solely on the reservoir temperatures and is independent of the working substance.

For refrigeration cycles (counter-clockwise on the P-V diagram), the working substance absorbs heat $Q_2$ from a cold reservoir ($T_2$) using work input $W$, releasing heat $Q_1 = Q_2 + W$ to a hot reservoir ($T_1$). The coefficient of performance is: $\varepsilon = \frac{Q_2}{W} = \frac{Q_2}{Q_1 - Q_2}$. For Carnot refrigerators, the maximum possible performance is: $\varepsilon_C = \frac{T_2}{T_1 - T_2}$.

Practical engine implementations also include:

  • Otto cycle (constant-volume heating) with efficiency $\eta = 1 - r^{1-\gamma}$, where $r = V_1/V_2$ is the compression ratio.
  • Diesel cycle (constant-pressure heating) with higher practical efficiency due to greater compression ratios.
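A short sketch evaluating these efficiency formulas; the reservoir temperatures and compression ratio are illustrative inputs:

```python
T1, T2 = 600.0, 300.0   # hot and cold reservoir temperatures, K
r, gamma = 9.0, 1.4     # Otto compression ratio; diatomic working gas

eta_carnot = 1 - T2 / T1            # Carnot: depends only on T1, T2
eta_otto = 1 - r ** (1 - gamma)     # Otto: depends only on r and gamma
cop_carnot = T2 / (T1 - T2)         # Carnot refrigerator COP

print(f"Carnot efficiency: {eta_carnot:.3f}")   # 0.500
print(f"Otto efficiency:   {eta_otto:.3f}")     # ~0.585
print(f"Carnot COP:        {cop_carnot:.3f}")   # 1.000
# The Otto value is not bounded by this Carnot value: the Otto cycle's
# peak temperature exceeds T1, so the comparison reservoirs differ.
```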

1.6 The Second Law of Thermodynamics

1.6.1 The Second Law of Thermodynamics

The Second Law of Thermodynamics addresses the directionality of natural processes, complementing the First Law's energy conservation principle. While the First Law prohibits perpetual motion machines of the first kind (violating energy conservation), it does not restrict process directionality. For example, heat spontaneously flows from high to low temperatures but not the reverse. The Second Law resolves this through two equivalent formulations:
  • Kelvin Statement (1851): It is impossible to convert heat entirely from a single heat source into work without other effects. This implies heat engines cannot achieve 100% efficiency ($\eta < 1$) and prohibits perpetual motion machines of the second kind.
  • Clausius Statement (1850): Heat cannot spontaneously flow from a cold to a hot body without external work input. This establishes the directional nature of heat transfer and limits refrigeration efficiency (the coefficient of performance is finite).
Thermodynamic processes exhibit irreversibility due to inherent disequilibrium. A process is reversible only if both the system and surroundings return to their initial states without net changes. Kelvin's statement reveals the irreversibility of work-to-heat conversion, while Clausius's statement shows irreversibility in heat conduction. These statements are logically equivalent: violation of one implies violation of the other. Crucially, all irreversible processes are interconnected—demonstrating that irreversibility in one process (e.g., free expansion) implies irreversibility in others (e.g., heat transfer). The core principle is that all macroscopic thermal processes are irreversible; reversible processes are idealizations requiring quasistatic, dissipation-free conditions.

We now prove the equivalence between Kelvin and Clausius Statements:

(1) Clausius false ⇒ Kelvin false

Assume a device violates the Clausius statement: it transfers heat $Q_2$ from the cold reservoir $T_2$ to the hot reservoir $T_1$ without work input. Combine it with a reversible heat engine operating between $T_1$ and $T_2$ that absorbs $Q_1$ from $T_1$, releases $Q_2$ to $T_2$, and outputs work $W = Q_1 - Q_2$. Net effect: complete conversion of heat $Q_1 - Q_2$ from $T_1$ into work $W$ with no other change, violating the Kelvin statement.

(2) Kelvin false ⇒ Clausius false

Assume a device violates the Kelvin statement: it converts heat $Q$ from $T_1$ entirely into work $W = Q$. Use this work to drive a refrigerator that extracts $Q_2$ from $T_2$ and delivers $Q_2 + W$ to $T_1$. Net effect: heat $Q_2$ spontaneously flows from $T_2$ to $T_1$ with no external work, violating the Clausius statement. Irreversibility arises statistically from molecular disorder. Consider $N$ gas molecules in a partitioned container: after partition removal, the molecules spread uniformly. The uniform distribution has the highest thermodynamic probability ($\Omega$), while ordered states (e.g., all molecules on one side) have negligible $\Omega$ for large $N$. The Boltzmann postulate states that isolated systems in equilibrium have equal microscopic state probabilities. Natural processes evolve toward macroscopic states with higher $\Omega$, increasing disorder. The Second Law's statistical interpretation: isolated systems evolve from low-$\Omega$ to high-$\Omega$ states, reflecting increased randomness. This applies only to macroscopic systems, not microscopic phenomena like Brownian motion, and is confined to finite isolated systems, not the universe.

1.6.2 Carnot Theorem

Carnot's theorem establishes the theoretical limits for the efficiency of heat engines operating between thermal reservoirs at temperatures $T_1$ (high) and $T_2$ (low). The theorem states: all reversible engines operating between these reservoirs achieve identical efficiency $\eta_{rev}$, while irreversible engines satisfy $\eta_{irr} < \eta_{rev}$. The proof of reversible-engine equality proceeds by contradiction. Consider two reversible engines $E_A$ and $E_B$ operating between the reservoirs at $T_1$ and $T_2$. Assuming $\eta_A > \eta_B$, run $E_B$ in reverse as a refrigerator, with the cycles adjusted so both exchange the same heat $Q_1$ with the hot reservoir. The assumed inequality implies $W_A > W_B$, so the composite system extracts net work $W_A - W_B$ while exchanging zero net heat with the hot reservoir and drawing heat only from the cold reservoir. This violates the Kelvin statement by converting heat entirely into work without other effects, forcing $\eta_A \le \eta_B$; exchanging the roles of the two engines gives $\eta_B \le \eta_A$, hence $\eta_A = \eta_B$. For irreversible engines, let $E_{irr}$ be irreversible and $E_{rev}$ reversible between $T_1$ and $T_2$. Assuming $\eta_{irr} > \eta_{rev}$ and running $E_{rev}$ in reverse with matched $Q_1$, the composite system would again produce positive work while extracting heat from a single reservoir, violating the Kelvin statement. This proves $\eta_{irr} \le \eta_{rev}$, with equality only for reversible engines.

1.6.3 Thermodynamic Temperature Scale

The thermodynamic temperature scale, established by Lord Kelvin using Carnot's theorem, provides a universal temperature definition independent of material properties. For a reversible heat engine operating between reservoirs at empirical temperatures $\theta_1$ and $\theta_2$, the heat ratio satisfies $\frac{Q_1}{Q_2} = F(\theta_1, \theta_2)$, where $F$ is a universal function. Introducing a third reservoir at $\theta_3$ with auxiliary reversible engines yields $F(\theta_1, \theta_2) = \frac{F(\theta_1, \theta_3)}{F(\theta_2, \theta_3)}$. Combining these ratios gives $\frac{Q_1}{Q_2} = \frac{\psi(\theta_1)}{\psi(\theta_2)}$, where the universal function $\psi$ depends only on temperature. Defining thermodynamic temperature $T \propto \psi(\theta)$ establishes $\frac{Q_1}{Q_2} = \frac{T_1}{T_2}$. Fixing the triple point of water as $T_{tr} = 273.16\,\mathrm{K}$ aligns this scale with the ideal gas thermometer where applicable. This derivation provides the theoretical foundation for the absolute temperature scale used in the Carnot efficiency formula $\eta = 1 - T_2/T_1$.

1.6.4 Entropy

The preceding results lead directly to entropy. For any reversible cycle, Carnot's theorem generalizes to the Clausius equality $\oint\frac{\delta Q}{T} = 0$, so the quantity $dS = \frac{\delta Q_{rev}}{T}$ is an exact differential defining the state function entropy. Combining with the First Law yields the fundamental relation: $dU = T\,dS - p\,dV$. For ideal gases: $S = \nu C_{V,m}\ln T + \nu R\ln V + \text{const}$. The Clausius inequality $\oint\frac{\delta Q}{T} \le 0$ implies $dS \ge \frac{\delta Q}{T}$, leading to the entropy increase principle for isolated systems: $\Delta S \ge 0$, where free expansion gives $\Delta S = \nu R\ln(V_2/V_1) > 0$ and heat conduction of $Q$ from $T_1$ to $T_2 < T_1$ yields $\Delta S = Q\left(\frac{1}{T_2} - \frac{1}{T_1}\right) > 0$. Boltzmann's microscopic interpretation: $S = k\ln\Omega$ quantifies disorder through the microstate count $\Omega$. For open systems: $dS = d_eS + d_iS$ with $d_iS \ge 0$, and Maxwell's demon paradox is resolved by Landauer's principle: erasing 1 bit of information dissipates at least $kT\ln 2$ of heat.
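A short sketch evaluating these entropy changes for illustrative inputs:

```python
import math

R = 8.314
nu = 1.0          # moles
Cv = 1.5 * R      # monatomic ideal gas

# Free expansion doubling the volume: dS = nu R ln(V2/V1) > 0,
# even though Q = 0 (the process is irreversible).
dS_free = nu * R * math.log(2)
print(f"free expansion:    dS = {dS_free:.3f} J/K")

# General ideal-gas entropy change between equilibrium states:
T1, T2, V1, V2 = 300.0, 400.0, 1.0, 2.0
dS = nu * Cv * math.log(T2 / T1) + nu * R * math.log(V2 / V1)
print(f"heating+expansion: dS = {dS:.3f} J/K")

# Heat conduction: Q flows from T_hot to T_cold; total dS > 0.
Q, Th, Tc = 100.0, 400.0, 300.0
dS_cond = -Q / Th + Q / Tc
print(f"conduction:        dS = {dS_cond:.4f} J/K > 0")
```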

Part II. Classical Statistical Mechanics

Note:

  • Before you read this part, please make sure you have learned Classical Mechanics, Quantum Mechanics and Probability.

  • In Part II and Part III, the Boltzmann constant is normalized to $k_B = 1$, so temperature and energy have the same unit.

2.1 Description of States

Statistical physics connects the behavior of individual particles to the measurable properties of materials
we observe in everyday life.
To understand how this connection works,
we first need a way to describe the detailed state of a many-particle system.

Consider an isolated system composed of $N$ particles,
like molecules in a sealed container.
Each particle requires six numbers to fully specify its mechanical state:
three position coordinates and three momentum components

$(x, y, z;\ p_x, p_y, p_z)$.

For the entire system,
we must account for all particles simultaneously.

The complete microscopic state - called a microstate - is therefore described by listing all positions and momenta together:

$$X = (\mathbf{q}_1, \dots, \mathbf{q}_N;\ \mathbf{p}_1, \dots, \mathbf{p}_N)$$

This collection of numbers forms a mathematical point in a $6N$-dimensional abstract space known as phase space.

Classical mechanics demonstrates that if we know this phase space point at any instant,
and the system’s Hamiltonian $H(q, p)$ (generally regarded as the energy),
we can in principle determine its future evolution through Hamilton’s equations:

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}.$$

Thus, a single point in phase space completely defines the instantaneous mechanical state of the entire system.

However, for macroscopic systems where $N \sim 10^{23}$,
tracking individual phase space points becomes fundamentally impossible.
Moreover, laboratory measurements of quantities like pressure or temperature inherently average
over enormous numbers of microstates - an astronomically large number of microstates
contribute to even a single second of macroscopic observation.

This practical limitation necessitates a shift from deterministic mechanics to probabilistic description.
Rather than following exact trajectories,
we must consider how microstates are distributed throughout phase space,
leading us to the core methodology of statistical physics.

Based on this description, we must specify the characterization of the macro-system.
For an isolated system (no exchange of particles and energy with the environment),
its macro-state can be completely determined by 3 measurable conserved quantities:

  • Number of particles $N$

  • Volume of the system $V$

  • Energy of the system $E$

Why these three? Macroscopic measurements cannot distinguish microstates sharing the same $(N, V, E)$.
Hence, the parameters $(N, V, E)$ determine a macro-state of the system.
Such a macro-state contains a massive number of micro-states. It corresponds to a set in phase space:

$$\{(q, p) : H(q, p) = E\}$$

The condition $H(q, p) = E$
defines a $(6N-1)$-dimensional hypersurface (zero volume),
which causes difficulty for the calculation of probability.
To solve this problem, we need to introduce the energy shell.

Based on the hypersurface, we extend another dimension to give it a small “thickness” $\Delta$.
Then, the hypersurface turns into a thin slab of phase space, represented by:

$$\Gamma(E) = \{(q, p) : E \le H(q, p) \le E + \Delta\}$$

This extended region is called an energy shell.
Note that $\Delta$ must be much smaller than $E$,
but it must be large enough to include sufficient microstates.

With a region of full dimension $6N$,
we can calculate probabilities in phase space.

We can figure out the “volume” of the energy shell:

$$\Omega(E) = \int_{E \le H(q,p) \le E+\Delta} d^{3N}q\, d^{3N}p$$

The volume $\Omega(E)$ measures the number of micro-states in the energy shell.

2.2 Boltzmann Postulate

Initially, researchers tried to establish connections between macro and micro states via the ergodicity assumption.
They assumed isolated systems traverse all reachable states on an energy hypersurface over a sufficiently long time.
However, the time scale required for strict traversal is far larger than the age of the universe,
so it is inaccessible to any measurement.

To solve this problem, Boltzmann put forward the most fundamental postulate in statistical mechanics:

Boltzmann Postulate: For an isolated system under equilibrium, all micro-states have equal probability.

Equilibrium means all observable properties (e.g., temperature, pressure, density) are time-invariant and uniform in space.
Based on this postulate, we can figure out the probability of a microstate in an energy shell:

$$P(X) = \frac{1}{\Omega(E)}$$

where $X$ is the microstate.

Almost everything in statistical mechanics is built on Boltzmann’s postulate.

2.3 Entropy

2.3.1 Gibbs-Shannon Entropy

Since macroscopic measurement cannot distinguish microstates,
while one macrostate includes a massive number of microstates,
we need a quantity to quantify the scale of this indistinguishability.

Imagine you are predicting the weather tomorrow:
If it is known that tomorrow will definitely be sunny
($p = 1$),
there is no doubt about the outcome of the prediction and the uncertainty is zero.
When you learn that it will be sunny,
the amount of information you get is also almost zero (you already knew).
If the probability of it being sunny or rainy tomorrow is 50/50
($p_{\text{sunny}} = p_{\text{rainy}} = 0.5$),
the prediction is most uncertain. You get the most information
(eliminating the most uncertainty)
when you learn that it will be sunny (or rainy).
If the probability of tomorrow being sunny is 0.9 and rainy is 0.1,
the uncertainty is somewhere in between.
Learning that it will be sunny brings less information (expected)
and learning that it will be rainy brings more information (unexpected).

From the example above,
it’s clear that information carried by the events is
strongly related to the probability distribution.
Generally, we can define a function $I(p)$ to describe the amount of information.
This function should satisfy the conditions below:

  • $I(p)$ is continuous on $(0, 1]$
  • If an event occurs with probability $p = 1$, $I(1) = 0$
  • For independent events $A$, $B$ with probabilities $p_A$, $p_B$: $I(p_A p_B) = I(p_A) + I(p_B)$

We now derive the explicit form of $I$:

In terms of probabilities,
letting $x = p_A$ and $y = p_B$,
we have the functional equation:

$$I(xy) = I(x) + I(y)$$

Define $f(x) = I(x)$ for $x \in (0, 1]$.
From the conditions, $f$ is continuous,
$f(1) = 0$, $f \ge 0$, and it satisfies:

$$f(xy) = f(x) + f(y)$$

We assume that $f$ is differentiable on $(0, 1)$.
This assumption is justified because the continuity of $f$
and the functional equation imply that $f$ is differentiable
(as is standard in solving Cauchy functional equations).

Fix an arbitrary $y$.
Consider the functional equation as a function of $x$ and
differentiate both sides with respect to $x$
(treating $y$ as constant):

  • Left side: $y\,f'(xy)$ by the chain rule.

  • Right side: $f'(x)$,
    since $f(y)$ is constant with respect to $x$.

Thus:

$$y\,f'(xy) = f'(x)$$

Setting $x = 1$ gives $y\,f'(y) = f'(1)$ for all $y$.
Since the left side depends only on $y$,
$x f'(x)$ must equal a constant,
say $c$.
Therefore:

$$f'(x) = \frac{c}{x}$$

where $c$ is a constant.

Integrate both sides with respect to $x$:

$$f(x) = c\ln x + C_0$$

where $C_0$ is the constant of integration.

Apply the condition $f(1) = 0$:

$$f(1) = c\ln 1 + C_0 = C_0$$

Since $\ln 1 = 0$,
we have $C_0 = 0$. Thus $f(x) = c\ln x$.

Now apply the condition $I(p) \ge 0$ for all $p \in (0, 1]$.
For $p < 1$, $\ln p < 0$,
so to ensure $I(p) \ge 0$,
we must have $c < 0$.
Let $c = -K$ where $K > 0$.
Then:

$$I(p) = -K\ln p$$

This is equivalent to:

$$I(p) = -\log_b p$$

since any logarithm base $b$ can be absorbed into the constant
(as $\log_b p = \ln p/\ln b$,
so $K$ scales accordingly).

Now we have proved $I(p) = -K\ln p$.
Returning to the context of microstates, we can define the Gibbs-Shannon entropy:

$$S = -\sum_i p_i\ln p_i \qquad \text{or, in the continuum,} \qquad S = -\int \rho(X)\ln\rho(X)\,dX$$

where $\rho(X)$ is the phase space probability density.

The Gibbs-Shannon entropy reflects the uncertainty and disorder of a thermodynamic system.
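As a tiny numerical illustration, the sketch below computes the Shannon entropy (in nats) of the weather distributions from the example above:

```python
import math

def shannon_entropy(p):
    """H = -sum p_i ln p_i (natural log, in nats)."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Certainty, maximal uncertainty, and in between.
print(shannon_entropy([1.0]))        # 0.000 -- no uncertainty
print(shannon_entropy([0.5, 0.5]))   # 0.693 = ln 2, maximum for 2 outcomes
print(shannon_entropy([0.9, 0.1]))   # 0.325 -- somewhere in between
```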

2.3.2 Boltzmann Entropy

After the Gibbs-Shannon entropy, we can define another type of entropy called the Boltzmann entropy:

$$S_B = \ln\Omega$$

where $\Omega$ is the volume of the accessible region of phase space.
We can find that when $p_i = 1/\Omega$ for every microstate,

$$S = -\sum_i \frac{1}{\Omega}\ln\frac{1}{\Omega} = \ln\Omega = S_B.$$

So Boltzmann entropy is a special case of Gibbs-Shannon entropy at equilibrium.
Note that Boltzmann entropy is only defined at equilibrium,
while Gibbs-Shannon entropy can also be defined away from equilibrium.

2.3.3 Maximal Entropy Principle

Boltzmann’s postulate of equal probability leads to the principle of maximum entropy:
The Gibbs-Shannon entropy of an isolated thermodynamic system is maximized at equilibrium.
The process of converging to equilibrium is a process of increasing entropy.

There is a proof. Using $\ln x \le x - 1$:

$$S - \ln\Omega = \sum_i p_i\ln\frac{1}{p_i\Omega} \le \sum_i p_i\left(\frac{1}{p_i\Omega} - 1\right) = 0$$

The equals sign is taken when $p_i = 1/\Omega$ for all $i$.
This principle means the equilibrium state gives maximum entropy.
By the second law of thermodynamics, any isolated system tends to evolve toward the equilibrium state.

2.4 Ensembles

An ensemble in statistical mechanics is a probability
distribution over the system’s phase space, representing our knowledge (or
ignorance) of the system’s exact microstate.

2.4.1 Micro-canonical Ensemble

The microcanonical ensemble is the starting point of equilibrium statistical
mechanics. It describes an isolated system — one that can exchange
neither energy, volume, nor particles with its surroundings — and assigns
probabilities to its microscopic states.

In classical mechanics, the state of a system of $N$ particles, confined
in a box with volume $V$, is specified by a point $(q, p)$ in
the $6N$-dimensional phase space of positions and momenta.
But thermodynamics makes no reference to microstates —
it deals only with macroscopic variables such
as energy $E$, volume $V$, and particle number $N$.
The goal of statistical mechanics is to explain thermodynamics as a consequence of the statistical
properties of microscopic degrees of freedom.

Recall the energy shell $E \le H(q, p) \le E + \Delta$;
instead of the characteristic function,
we can also represent its volume with a delta function:

$$\Omega(E) = \Sigma(E)\,\Delta, \qquad \Sigma(E) = \int \delta\big(H(q, p) - E\big)\, d^{3N}q\, d^{3N}p$$

where $\Sigma(E)$
is referred to as the “surface area” of the constant-energy surface.

In the microcanonical ensemble, the fundamental thermodynamic quantity
is the Boltzmann entropy (at equilibrium), which is defined as

$$S(N, V, E) = \ln\frac{\Omega(E)}{N!\,h^{3N}}$$

The factor $N!\,h^{3N}$ in the denominator corrects for the indistinguishability of identical particles and fixes the size of an elementary quantum cell of phase space.
We can rewrite the entropy:

$$S = \ln\Sigma(E) + \ln\Delta - \ln\big(N!\,h^{3N}\big)$$

It is obvious that
$\ln\Sigma(E)$ and $\ln(N!\,h^{3N})$ are both extensive,
whereas $\ln\Delta$ is subextensive.
Hence in the thermodynamic limit,
we may ignore $\ln\Delta$ altogether,
and have

$$S = \ln\frac{\Sigma(E)}{N!\,h^{3N}}$$

In thermodynamics, all state variables can be derived from the fundamental
relation $S = S(E, V, N)$.
It is useful to think of this function as a surface in a four-dimensional space spanned by $(S, E, V, N)$.
This surface was called the fundamental surface by Callen.
The microcanonical ensemble realizes this idea concretely by connecting entropy to the volume of phase space.
Once we have the entropy as a function of energy, volume, and particle number,
we can define all other thermodynamic quantities as its partial derivatives.

Temperature is defined as:

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{N,V}$$
Temperature measures how rapidly the number of accessible microstates increases with energy.
For the temperature to be well-defined and positive,
the function $\Omega(E)$ must grow monotonically with $E$.
Physically, this is almost always true for large systems —
higher energy means more ways to distribute that energy among microscopic degrees of freedom.

Pressure is defined as:

$$\frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_{N,E}$$
Pressure arises from the fact that increasing the volume allows more microstates
to be accessible —especially if particles are free to move. Thus,
the entropy increases with volume, and this increase (scaled by temperature)
defines the pressure exerted by the system. However, negative pressure
is not a signature of thermodynamic instability.

If the number of particles is also treated as a thermodynamic variable,
the chemical potential is defined as:

$$\frac{\mu}{T} = -\left(\frac{\partial S}{\partial N}\right)_{E,V}$$
This expresses how the entropy changes when a particle is added to the system,
at fixed energy and volume.
It is especially important when studying systems that can exchange particles (grand canonical ensemble),
but the definition remains valid even in the microcanonical ensemble as a formal identity.

To recap, we can collect these definitions:

$$dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV - \frac{\mu}{T}\,dN$$

and rearrange:

$$dE = T\,dS - p\,dV + \mu\,dN$$

This is in fact the first law of thermodynamics.

And it is easy to extend everything in this section to multi-component systems:
just extend $\mu\,dN$ to $\sum_i \mu_i\,dN_i$.

Let’s take the ideal gas in the micro-canonical ensemble as an example:

Consider a classical ideal gas of $N$ indistinguishable particles of mass $m$,
confined to volume $V$.
The Hamiltonian is purely kinetic:

$$H = \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m}$$

The surface area is defined as:

$$\Sigma(E) = \int \delta(H - E)\, d^{3N}q\, d^{3N}p$$

Integrate over coordinates (yields $V^N$):

$$\Sigma(E) = V^N \int \delta\!\left(\sum_i \frac{\mathbf{p}_i^2}{2m} - E\right) d^{3N}p$$

Set $R = \sqrt{2mE}$.
Using the surface area of a $d$-dimensional sphere:

$$S_d(R) = \frac{2\pi^{d/2}}{\Gamma(d/2)}R^{d-1}$$

For $d = 3N$:

$$\Sigma(E) = V^N\,\frac{m}{R}\,\frac{2\pi^{3N/2}}{\Gamma(3N/2)}R^{3N-1}$$

Apply Stirling’s approximation for large $N$:

$$\ln N! \approx N\ln N - N, \qquad \ln\Gamma\!\left(\tfrac{3N}{2}\right) \approx \tfrac{3N}{2}\ln\tfrac{3N}{2} - \tfrac{3N}{2}$$

Retain extensive terms:

$$S = N\left[\ln\frac{V}{N} + \frac{3}{2}\ln\frac{4\pi mE}{3Nh^2} + \frac{5}{2}\right]$$

This is the Sackur-Tetrode entropy.

Scaling the variables by $N$ and differentiating:

  • Temperature: $\frac{1}{T} = \frac{\partial S}{\partial E} = \frac{3N}{2E} \;\Rightarrow\; E = \frac{3}{2}NT$
  • Pressure: $\frac{p}{T} = \frac{\partial S}{\partial V} = \frac{N}{V} \;\Rightarrow\; pV = NT$
  • Chemical potential: $\mu = -T\frac{\partial S}{\partial N} = T\ln\!\left[\frac{N}{V}\left(\frac{3Nh^2}{4\pi mE}\right)^{3/2}\right]$
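As a quick numerical check of the Sackur-Tetrode formula, the sketch below restores SI units ($k_B$, $h$) and evaluates the entropy per particle for argon near room conditions; the argon mass is an approximate input:

```python
import math

# Sackur-Tetrode entropy per particle in SI units:
#   S/(N k) = ln[(V/N) / lambda^3] + 5/2,  lambda = h / sqrt(2 pi m k T)
k = 1.380649e-23
h = 6.62607015e-34

def s_per_particle(T, p, m):
    n = p / (k * T)                                    # N/V from pV = NkT
    lam3 = (h**2 / (2 * math.pi * m * k * T)) ** 1.5   # thermal wavelength cubed
    return math.log(1 / (n * lam3)) + 2.5

m_Ar = 39.95 * 1.66e-27   # argon mass, kg (approximate)
print(f"S/(N k) = {s_per_particle(298.15, 101325.0, m_Ar):.2f}")
# ~18.6, consistent with argon's tabulated molar entropy (~155 J/mol/K)/R
```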

2.4.2 Canonical Ensemble

Unlike the micro-canonical ensemble,
the canonical ensemble describes systems in thermal equilibrium with a heat bath at fixed temperature.
It allows variation of the energy $E$,
which in the micro-canonical ensemble is fixed.

Consider a large, isolated system with total energy $E_{tot}$ (microcanonical ensemble).
If we divide the entire system into 2 parts, subsystem and reservoir,
the energy will be divided into 3 parts:

  1. System: small subsystem with Hamiltonian $H_s$

  2. Heat bath: large reservoir with Hamiltonian $H_r$

  3. Interaction: coupling $H_{int}$ between the system and the reservoir

However, in large systems the interaction is extremely weak compared to the internal contributions,
so the interaction Hamiltonian is neglected: $H_{tot} \approx H_s + H_r$.

The system and reservoir exchange energy through a diathermal wall,
with $N$ and $V$ fixed for each.

System of Canonical Ensemble

By Boltzmann’s postulate, every microstate of the combined system has equal probability.
The total number of microstates is:

$$\Omega_{tot}(E_{tot}) = \int dX_s\, dX_r\; \delta\big(H_s(X_s) + H_r(X_r) - E_{tot}\big)\,\Delta$$

We split this integral by inserting the subsystem energy $E_s$:

$$\Omega_{tot}(E_{tot}) = \int dE_s \int dX_s\;\delta\big(H_s(X_s) - E_s\big)\;\Omega_r(E_{tot} - E_s)$$

Now consider a microstate $X_s$ of the subsystem.
Given $X_s$,
$H_s(X_s)$ is fixed,
but the bath microstate $X_r$ is still free to vary.
There are many $X_r$’s compatible with bath energy $E_{tot} - H_s(X_s)$.
Hence, the probability density for the system to be in microstate $X_s$
is proportional to the number of bath states available:

$$\rho(X_s) \propto \Omega_r\big(E_{tot} - H_s(X_s)\big) = e^{S_r(E_{tot} - H_s(X_s))}$$

where $S_r$ is the Boltzmann entropy of the bath.

Since the bath is much larger than the system ($H_s \ll E_{tot}$), we expand:

$$S_r\big(E_{tot} - H_s\big) \approx S_r(E_{tot}) - \frac{\partial S_r}{\partial E}H_s$$

Define the inverse temperature:

$$\beta = \frac{\partial S_r}{\partial E} = \frac{1}{T}$$

Thus:

$$\rho(X_s) \propto e^{-\beta H_s(X_s)}$$

The proportionality constant defines the partition function:

$$Z = \frac{1}{N!\,h^{3N}}\int dX_s\; e^{-\beta H_s(X_s)}$$

giving the canonical distribution:

$$\rho(X_s) = \frac{e^{-\beta H_s(X_s)}}{N!\,h^{3N}\,Z}$$

The probability density for the system to have energy $E$ is:

$$P(E) = \frac{g(E)\,e^{-\beta E}}{Z}$$

where $g(E)$ is the system’s density of states.

The partition function $Z(N, V, T)$
is an extremely important function in statistical mechanics.
(The analogous quantity in the micro-canonical ensemble is $\Omega$.)
It serves as the generating function for all thermodynamic properties of the system.

Let’s define the free energy first:

$$F = -T\ln Z$$

$F$ is called the Helmholtz Free Energy.
Through the Helmholtz free energy and the partition function,
we can calculate almost all thermodynamic quantities.

  • Average Energy:

$$\langle E\rangle = -\frac{\partial\ln Z}{\partial\beta}$$

  • Gibbs-Shannon Entropy:

$$S = -\langle\ln\rho\rangle = \beta\langle E\rangle + \ln Z$$

This result yields the relationship between the Helmholtz free energy and the energy:

$$F = \langle E\rangle - TS$$

Taking the derivative and plugging in the result for $\langle E\rangle$, we have

$$\langle E\rangle = \frac{\partial(\beta F)}{\partial\beta} = F - T\frac{\partial F}{\partial T}$$

  • Energy Variance:

$$\langle E^2\rangle - \langle E\rangle^2 = \frac{\partial^2\ln Z}{\partial\beta^2} = -\frac{\partial\langle E\rangle}{\partial\beta} = T^2 C_V$$

We also have $\langle E\rangle \sim N$ and $C_V \sim N$.

Hence,

$$\frac{\sqrt{\langle E^2\rangle - \langle E\rangle^2}}{\langle E\rangle} \sim \frac{1}{\sqrt{N}} \to 0$$

This shows equivalence with the microcanonical ensemble in the thermodynamic limit.
This shows equivalence with the microcanonical ensemble in the thermodynamic limit.

  • The bath’s constant temperature emerges from its large size
  • Rare high-energy states are exponentially suppressed by the factor $e^{-\beta E}$
  • The free energy $F = \langle E\rangle - TS$ encodes the competition between energy and entropy
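A minimal concrete example, assuming a single two-level system (a common toy model, not derived above): every canonical quantity follows from $Z$. With $k_B = 1$:

```python
import numpy as np

# Two-level system with energies 0 and eps, in contact with a bath.
eps = 1.0
beta = np.linspace(0.1, 10, 5)

Z = 1 + np.exp(-beta * eps)                  # partition function
E = eps * np.exp(-beta * eps) / Z            # <E> = -d ln Z / d beta
F = -np.log(Z) / beta                        # Helmholtz free energy F = -T ln Z
S = beta * (E - F)                           # S = (E - F)/T
var = eps**2 * np.exp(-beta * eps) / Z**2    # <E^2> - <E>^2

for b, e, f, s, v in zip(beta, E, F, S, var):
    print(f"beta={b:5.2f}  <E>={e:.4f}  F={f:.4f}  S={s:.4f}  varE={v:.4f}")
# High T (small beta): S -> ln 2; low T (large beta): S -> 0 and <E> -> 0.
```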

2.4.3 Grand Canonical Ensemble

The grand canonical ensemble is almost the same as the canonical ensemble
except that the particle number $N$ is also exchangeable between the subsystem and the reservoir.

System of Grand Canonical Ensemble

Imagine a large, isolated system characterized by its total energy $E$,
total particle number $N$,
and total volume $V$.
This overarching system can be described by its microcanonical partition function

$$\Omega(E, N, V) = \int dX\;\delta\big(H_N(X) - E\big)\,\Delta,$$

where $H_N$ represents the total Hamiltonian and
$\int dX$ signifies integration over all possible microstates.

Now, consider dividing this isolated system into a small subsystem
(denoted by the subscript ‘s’)
and a vast reservoir (subscript ‘r’).
The total particle number is conserved,
meaning $N = N_s + N_r$.
The microcanonical partition function can then be expressed
as a sum over possible particle numbers in the subsystem
($N_s$)
and an integral over possible energies in the subsystem ($E_s$):

$$\Omega(E, N, V) = \sum_{N_s}\int dE_s\;\Omega_s(E_s, N_s)\,\Omega_r(E - E_s, N - N_s)$$

The probability density of finding the subsystem with energy $E_s$
and particle number $N_s$ is given by:

$$\rho(E_s, N_s) \propto \Omega_r(E - E_s, N - N_s) = e^{S_r(E - E_s,\, N - N_s)}$$

Here, $S_r$ is the entropy of the reservoir.
For a sufficiently large reservoir,
we can perform a Taylor expansion of its entropy around the total energy and particle number of the isolated system:

$$S_r(E - E_s, N - N_s) \approx S_r(E, N) - \beta E_s + \beta\mu N_s$$

In this expansion,
$\beta = \partial S_r/\partial E = 1/T$
is the inverse temperature (where $T$ is temperature),
and $\mu = -T\,\partial S_r/\partial N$ is the chemical potential.
Substituting this back into the probability density equation,
we arrive at a crucial expression for the probability density in the grand canonical ensemble:

$$\rho(E_s, N_s) = \frac{e^{-\beta(E_s - \mu N_s)}\,\Omega_s(E_s, N_s)}{\Xi}$$

The denominator, $\Xi$, is known as the grand canonical partition function:

$$\Xi = \sum_{N_s}\int dE_s\; e^{-\beta(E_s - \mu N_s)}\,\Omega_s(E_s, N_s)$$

This partition function effectively normalizes the probability.
The probability of a specific microstate ($X_s$)
of the subsystem with $N_s$ particles is then:

$$\rho(X_s, N_s) = \frac{e^{-\beta(H_{N_s}(X_s) - \mu N_s)}}{N_s!\,h^{3N_s}\,\Xi}$$

The normalization condition,

$$\sum_{N_s}\int dX_s\;\rho(X_s, N_s) = 1,$$

allows us to express the grand canonical partition function in terms of the Hamiltonian $H_N$
for a system with $N$ particles:

$$\Xi = \sum_{N=0}^{\infty} e^{\beta\mu N}\,\frac{1}{N!\,h^{3N}}\int dX\; e^{-\beta H_N(X)}$$

A remarkable aspect of the grand canonical ensemble is its direct relationship to the canonical ensemble.
The canonical partition function
$Z(N, V, T)$ describes a system with a fixed number of particles $N$,
fixed volume $V$,
and fixed temperature $T$.
The grand canonical partition function can be seen as a sum over all possible canonical ensembles,
weighted by the factor $e^{\beta\mu N}$:

$$\Xi(\mu, V, T) = \sum_{N=0}^{\infty} e^{\beta\mu N}\, Z(N, V, T)$$
This equation elegantly demonstrates that the grand canonical ensemble is,
in essence, a statistical amalgamation of many canonical ensembles,
each corresponding to a different number of particles.
This formulation is particularly powerful for systems where particle number is not a conserved quantity.

Just as the Helmholtz free energy is derived from the canonical partition function,
the grand potential,

$$\Phi(\mu, V, T) = -T\ln\Xi,$$

is defined from the grand canonical partition function.

The grand potential provides a direct link to the system’s thermodynamic properties.
From the Gibbs-Shannon entropy,

$$S = -\langle\ln\rho\rangle = \beta\langle E\rangle - \beta\mu\langle N\rangle + \ln\Xi,$$

we can derive a fundamental relationship:

$$TS = \langle E\rangle - \mu\langle N\rangle - \Phi$$

Rearranging this equation,
we find an important expression for the grand potential in terms of other thermodynamic quantities:

$$\Phi = \langle E\rangle - TS - \mu\langle N\rangle = F - \mu\langle N\rangle$$
Here, $F$ is the Helmholtz free energy ($F = \langle E\rangle - TS$).
Differentiating the grand potential yields several crucial thermodynamic relations:

$$d\Phi = -S\,dT - p\,dV - \langle N\rangle\,d\mu$$

From this differential,
we can derive the following thermodynamic relations:

  • Average Particle Number: $\langle N\rangle = -\left(\frac{\partial\Phi}{\partial\mu}\right)_{T,V}$
  • Pressure: $p = -\left(\frac{\partial\Phi}{\partial V}\right)_{T,\mu}$
  • Average Energy: $\langle E\rangle = \Phi + TS + \mu\langle N\rangle$, with $S = -\left(\frac{\partial\Phi}{\partial T}\right)_{V,\mu}$

From the first law of thermodynamics,

$$dE = T\,dS - p\,dV + \mu\,dN,$$

and the extensivity of energy ($E(\lambda S, \lambda V, \lambda N) = \lambda E(S, V, N)$),
we know that $E = TS - pV + \mu N$.
Combining this with the definition of the grand potential,

$$\Phi = E - TS - \mu N,$$

leads to a remarkable identity:

$$\Phi = -pV$$
This equation establishes a direct connection between the grand potential and the pressure-volume product of the system.
Furthermore, the Gibbs free energy,

$$G = F + pV,$$

can be shown to be equal to $\mu N$:

$$G = \Phi + \mu N + pV = \mu N$$

This implies that the chemical potential represents the Gibbs free energy per particle,
$\mu = G/N$. Differentiating the Gibbs free energy leads to the Gibbs-Duhem relation:

$$N\,d\mu = -S\,dT + V\,dp$$

This relation highlights the interdependence of intensive variables
(temperature, pressure, and chemical potential).
On a per-particle basis, it can be written as:

$d\mu = -s\,dT + v\,dp,$

where $s = S/N$ is the entropy per particle and $v = V/N$ is the volume per particle.

A key aspect of the grand canonical ensemble is
its ability to describe fluctuations in energy and particle number,
which are inherently allowed due to the system’s coupling with the reservoir.
The mean square fluctuations are defined as:

$\langle (\Delta N)^2 \rangle = \langle N^2 \rangle - \langle N \rangle^2, \qquad \langle \Delta E\, \Delta N \rangle = \langle E N \rangle - \langle E \rangle\langle N \rangle.$

These fluctuations are directly related to derivatives of the grand canonical partition function:

  • Particle Number Fluctuations: $\langle (\Delta N)^2 \rangle = \frac{1}{\beta^2}\frac{\partial^2 \ln \Xi}{\partial \mu^2} = \frac{1}{\beta}\left(\frac{\partial \bar{N}}{\partial \mu}\right)_{T,V}$
  • Energy and Particle Number Cross-Fluctuations: $\langle \Delta E\, \Delta N \rangle$ follows similarly from the mixed derivative of $\ln \Xi$ with respect to $\beta$ and $\mu$

For macroscopic systems,
the root-mean-square fluctuations scale as

$O(\sqrt{N})$

and are therefore negligible compared to the average values,
which scale as $O(N)$.
This means that for large systems,
the grand canonical ensemble’s predictions for average quantities
will converge with those from other ensembles.
However, the grand canonical ensemble remains invaluable
for understanding and calculating properties of open systems where particle exchange is crucial.
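Continuing the toy lattice-gas example from above (again an illustrative assumption, with each site independently empty or occupied), the fluctuation-response relation $\langle(\Delta N)^2\rangle = \frac{1}{\beta}\frac{\partial \bar N}{\partial\mu}$ can be checked numerically:

```python
import math

# Check the fluctuation-response relation <(dN)^2> = (1/beta) d<N>/dmu
# for the hypothetical lattice gas (M independent sites, k_B = 1).
M, eps, beta = 10, 1.0, 0.7

def avg_N(mu):
    f = 1.0 / (math.exp(beta * (eps - mu)) + 1.0)   # mean site occupation
    return M * f

def var_N(mu):
    f = avg_N(mu) / M
    return M * f * (1.0 - f)       # exact variance for independent sites

mu, h = 0.3, 1e-6
dN_dmu = (avg_N(mu + h) - avg_N(mu - h)) / (2 * h)  # numerical derivative
print(var_N(mu), dN_dmu / beta)    # the two values agree
```

Since $\bar N = O(M)$ while $\sqrt{\langle(\Delta N)^2\rangle} = O(\sqrt{M})$, the relative fluctuation indeed shrinks as the system grows.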

2.4.4 Recap

In the thermodynamic limit, where the particle number and volume approach infinity while the density remains constant, the microcanonical (NVE), canonical (NVT), and grand canonical (µVT) ensembles yield identical thermodynamic predictions. This equivalence stems from the suppression of relative fluctuations in macroscopic variables. Within the canonical ensemble, energy fluctuations diminish as the system size increases, with relative energy fluctuations scaling as $\sqrt{\langle(\Delta E)^2\rangle}/\bar{E} \sim 1/\sqrt{N}$. As $N \to \infty$, these fluctuations vanish, rendering the sharply peaked energy distribution indistinguishable from a fixed-energy microcanonical description.

Similarly, in the grand canonical ensemble, both particle number and energy fluctuations become negligible. Particle number fluctuations obey $\langle(\Delta N)^2\rangle/\bar{N}^2 \sim 1/N$, which vanishes in the thermodynamic limit, effectively fixing the particle count. Concurrently, energy fluctuations relative to the mean energy disappear, aligning the ensemble with the canonical and microcanonical frameworks. The thermodynamic potentials of each ensemble—entropy $S(E, V, N)$, Helmholtz free energy $F(T, V, N)$, and grand potential $\Phi(T, V, \mu)$—are rigorously linked through Legendre transforms. These transforms become exact in the thermodynamic limit, ensuring identical predictions for intensive properties like pressure or energy density across all ensembles.

Thus, the vanishing relative fluctuations and mathematical consistency of thermodynamic potentials establish the equivalence of the three ensembles for macroscopic systems.

2.5 Extremum Principles

2.5.1 Thermodynamic Potentials

We have already met the energy $E(S, V, N)$,
the Helmholtz free energy $F(T, V, N) = E - TS$,
and the grand potential $\Phi(T, V, \mu) = F - \mu N$.

Performing a Legendre transformation with respect to volume,

$G = F + pV = E - TS + pV.$

This is defined as the Gibbs free energy.

Taking a differential, we can get

$dG = -S\,dT + V\,dp + \mu\,dN.$

We can also define the enthalpy:

$H = E + pV, \qquad dH = T\,dS + V\,dp + \mu\,dN.$

What we must clarify is that
the differential variables in each differential relation are the natural variables of that potential.

Take energy for example:
we have

$dE = T\,dS - p\,dV + \mu\,dN,$

so the natural variables of $E$ are $(S, V, N)$, and we can extend this example to the other potentials.
For example we can say

$T = \left(\frac{\partial E}{\partial S}\right)_{V,N}, \qquad p = -\left(\frac{\partial E}{\partial V}\right)_{S,N}.$

2.5.2 Maxwell Relation

The thermodynamic potentials are well-defined single-valued functions of their natural variables.
In mixed second-order derivatives such as $\frac{\partial^2 F}{\partial T\,\partial V}$,
the order of partial differentiation can be exchanged:

$\frac{\partial^2 F}{\partial T\,\partial V} = \frac{\partial^2 F}{\partial V\,\partial T}.$

Since

$\left(\frac{\partial F}{\partial T}\right)_V = -S, \qquad \left(\frac{\partial F}{\partial V}\right)_T = -p,$

we obtain:

$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V.$

This leads to a very large number of identities between partial derivatives
of various thermodynamic variables, all called Maxwell relations:
since these differentials are exact, mixed partials yield Maxwell relations.
The four classic relations obtained from $E$, $H$, $F$, and $G$ (at fixed $N$) are:

$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V, \quad
\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p, \quad
\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V, \quad
\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p.$

Analogous identities follow from the mixed partials involving $\mu$ and $N$ for each of the five potentials, giving 15 relations in all.
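Maxwell relations are easy to verify symbolically. The sketch below checks $(\partial S/\partial V)_T = (\partial p/\partial T)_V$ for an assumed monatomic ideal-gas Helmholtz free energy; the specific form of $F$ (valid up to additive constants) is an assumption used only for this demonstration.

```python
import sympy as sp

# Symbolic check of the Maxwell relation (dS/dV)_T = (dp/dT)_V, using a
# monatomic ideal-gas Helmholtz free energy as an assumed test case.
T, V, N, k = sp.symbols('T V N k', positive=True)
F = -N*k*T*(sp.log(V/N) + sp.Rational(3, 2)*sp.log(T))  # up to constants

S = -sp.diff(F, T)      # entropy from F
p = -sp.diff(F, V)      # pressure from F

print(sp.simplify(sp.diff(S, V) - sp.diff(p, T)))  # -> 0
```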

2.5.3 Extremum Principles

As we have already shown in Lect. 2, the equilibrium state of an isolated
system maximizes the Gibbs-Shannon entropy, subject to the constraints
of probability normalization and fixed energy (which means that the prob-
ability density function vanishes outside the energy surface). The entropy
maximizing probability distribution is an equal probability distribution on
the energy surface. Hence the maximal entropy principle is ultimately
equivalent to Boltzmann’s postulate of equal probability. Either can be
used as a starting point of equilibrium statistical mechanics.

Consider a system with continuous microstates $\Gamma$
in thermal contact with a heat bath at temperature $T$,
with fixed volume $V$ and particle number $N$.
Let $H(\Gamma)$ be the energy of microstate $\Gamma$
and $p(\Gamma)$ the probability density of an arbitrary non-equilibrium state.
The average energy is:

$\bar{E} = \int d\Gamma\, p(\Gamma)\, H(\Gamma),$

and the Gibbs-Shannon entropy is:

$S[p] = -k_B \int d\Gamma\, p(\Gamma) \ln p(\Gamma).$

Define the non-equilibrium Helmholtz free energy as:

$F[p] = \bar{E} - T\,S[p],$

where $\beta = 1/(k_B T)$. For fixed $T$, we seek the extremizing $p(\Gamma)$, subject to the constraint of probability normalization:

$\int d\Gamma\, p(\Gamma) = 1.$

Introducing a Lagrange multiplier $\lambda$, the relevant functional that needs to be extremized is:

$I[p] = F[p] + \lambda\left(\int d\Gamma\, p(\Gamma) - 1\right).$

The first variation, $\delta I/\delta p = 0$, gives:

$H(\Gamma) + k_B T\left(\ln p(\Gamma) + 1\right) + \lambda = 0,$

which yields a canonical distribution:

$p(\Gamma) = \frac{e^{-\beta H(\Gamma)}}{Z}.$

The second variation, $\delta^2 I = k_B T \int d\Gamma\, \frac{(\delta p)^2}{p} > 0$, confirms that this is a minimum. This matches the canonical distribution derived via Boltzmann’s postulate. Hence we obtain the following extremum principle for the equilibrium state of a thermally open system that can exchange energy with an equilibrium heat bath:

At equilibrium, the non-equilibrium Helmholtz free energy of a thermally open system is minimized.

Note that this principle is valid for both small systems and large systems. It is straightforward to verify that this principle can also be reformulated in the following equivalent form:

At equilibrium, the Gibbs-Shannon entropy of a thermally open system is maximized subject to normalization and fixed average energy.
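The minimization can also be demonstrated numerically. The sketch below minimizes $F[p] = \bar E - TS$ over distributions on a handful of discrete energy levels (a hypothetical discrete stand-in for the continuous microstates; the levels and temperature are arbitrary choices) and recovers the canonical distribution:

```python
import numpy as np
from scipy.optimize import minimize

# Minimize the non-equilibrium free energy F[p] = <E> - T S over probability
# distributions on a few discrete levels, then compare with exp(-E/T)/Z.
E = np.array([0.0, 1.0, 2.0, 5.0])   # assumed energy levels (k_B = 1)
T = 1.5

def free_energy(theta):
    p = np.exp(theta - theta.max()); p /= p.sum()   # softmax keeps p normalized
    return p @ E + T * np.sum(p * np.log(p + 1e-300))

res = minimize(free_energy, np.zeros_like(E))
p_opt = np.exp(res.x - res.x.max()); p_opt /= p_opt.sum()

p_boltz = np.exp(-E / T); p_boltz /= p_boltz.sum()
print(p_opt)     # matches p_boltz to optimizer precision
print(p_boltz)
```

The softmax parametrization enforces normalization automatically, so no explicit Lagrange multiplier is needed in the numerical version.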

Now consider a system exchanging energy and particles with a bath at temperature $T$ and chemical potential $\mu$, with fixed volume $V$. Let $N$ be the discrete particle number, and $\Gamma_N$ a continuous microstate for given $N$, with energy $H_N(\Gamma_N)$. For a non-equilibrium state, the probability density $p_N(\Gamma_N)$ yields averages:

$\bar{E} = \sum_N \int d\Gamma_N\, p_N H_N, \qquad \bar{N} = \sum_N \int d\Gamma_N\, p_N\, N,$

and Gibbs-Shannon entropy:

$S[p] = -k_B \sum_N \int d\Gamma_N\, p_N \ln p_N.$

The non-equilibrium grand potential is defined as:

$\Phi[p] = \bar{E} - T\,S[p] - \mu\bar{N}.$

We seek the extremizing $p_N(\Gamma_N)$, subject to normalization:

$\sum_N \int d\Gamma_N\, p_N = 1.$

Using a Lagrange multiplier $\lambda$, the functional is:

$I[p] = \Phi[p] + \lambda\left(\sum_N \int d\Gamma_N\, p_N - 1\right).$

Setting the functional derivative to zero:

$\frac{\delta I}{\delta p_N} = 0,$

we find the grand canonical distribution:

$p_N(\Gamma_N) = \frac{e^{-\beta(H_N - \mu N)}}{\Xi},$

where $\mu$ is the chemical potential, and $\Xi$ is the grand-canonical partition function. We can also show that the second variation is positive, $\delta^2 I > 0$, confirming that this is a minimum of the grand potential. This establishes the following extremum principle for an open equilibrium system that exchanges energy and particles with its environment:

At equilibrium, the non-equilibrium grand potential of an open system is minimized.

An equivalent representation of this extremum principle is the following:

At thermodynamic equilibrium, the Gibbs-Shannon entropy of an open system is maximized subject to the constraints of probability normalization, fixed average energy, and fixed average particle number.

2.6 Equilibrium Conditions

2.6.1 Thermal Equilibrium

Consider an isolated system composed of two subsystems that can exchange energy
(each subsystem, viewed on its own, is then described by the canonical ensemble).
The number of accessible microstates for a given division of energy is:

$\Omega(E_1) = \Omega_1(E_1)\,\Omega_2(E - E_1),$

and the corresponding probability is

$p(E_1) = \frac{\Omega_1(E_1)\,\Omega_2(E - E_1)}{\sum_{E_1'} \Omega_1(E_1')\,\Omega_2(E - E_1')}.$

According to the maximal entropy principle,
we need to maximize $S_1(E_1) + S_2(E - E_1)$.
To find the value of $E_1$
maximizing the probability
(denoted as $E_1^*$),
we should take the derivative:

$\frac{\partial}{\partial E_1}\left[S_1(E_1) + S_2(E - E_1)\right] = 0.$

This gives

$\frac{\partial S_1}{\partial E_1}\bigg|_{E_1^*} = \frac{\partial S_2}{\partial E_2}\bigg|_{E - E_1^*}, \quad \text{i.e.,} \quad T_1 = T_2.$
Expanding $\ln p(E_1)$ about $E_1^*$,
with

$\frac{\partial^2}{\partial E_1^2}\left[S_1 + S_2\right]\bigg|_{E_1^*} < 0,$

yields a Gaussian distribution for $E_1$,
and the root-mean-square fluctuation scales as

$\sqrt{\langle (\Delta E_1)^2 \rangle} \sim \sqrt{N}.$

If $T_1 > T_2$,
then the entropy change associated with a small energy
transfer $\delta E$ from system 1 to system 2 is:

$\delta S = \left(\frac{1}{T_2} - \frac{1}{T_1}\right)\delta E > 0.$

This indicates that entropy increases when energy flows from the hotter to the colder subsystem.
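A small numerical illustration, assuming two monatomic ideal-gas subsystems with $S_i = \frac{3N_i}{2}\ln E_i$ (additive constants dropped, $k_B = 1$): the most probable energy division equalizes the temperatures $T_i = 2E_i/(3N_i)$.

```python
import numpy as np

# Most probable energy division between two ideal-gas subsystems exchanging
# energy. Maximizing S = (3N1/2) ln E1 + (3N2/2) ln(E - E1) should equalize
# the temperatures T_i = 2 E_i / (3 N_i).
N1, N2, E_total = 100, 300, 1000.0

E1 = np.linspace(1e-3, E_total - 1e-3, 200001)
S = 1.5*N1*np.log(E1) + 1.5*N2*np.log(E_total - E1)
E1_star = E1[np.argmax(S)]

T1 = 2*E1_star/(3*N1)
T2 = 2*(E_total - E1_star)/(3*N2)
print(E1_star, T1, T2)   # E1* = 250, and T1 = T2
```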

2.6.2 Mechanical Equilibrium

Now consider two subsystems that can exchange volume through a movable, frictionless piston,
while the total energy and total volume remain fixed.
Let subsystem 1 occupy volume $V_1$,
and subsystem 2 occupy $V_2 = V - V_1$.
The total entropy is:

$S = S_1(E_1, V_1) + S_2(E_2, V - V_1).$

The most probable configuration maximizes the total entropy with respect to variations in $V_1$;

since $\left(\frac{\partial S}{\partial V}\right)_E = \frac{p}{T}$, we have

$\frac{p_1}{T_1} = \frac{p_2}{T_2}.$

Assuming thermal equilibrium has already been established ($T_1 = T_2$),
this simplifies to

$p_1 = p_2.$

This is the condition of mechanical equilibrium.

As with temperature,
we can derive the analogous relationship: if $p_1 > p_2$ at equal temperatures, a small volume transfer $\delta V$ from subsystem 2 to subsystem 1 gives

$\delta S = \frac{1}{T}\left(p_1 - p_2\right)\delta V > 0,$

so the higher-pressure subsystem expands.
This aligns with the everyday notion of pressure as a force that drives expansion.
In equilibrium, the driving force vanishes; outside equilibrium, it
gives rise to directional motion that increases entropy. Hence, the common
physical interpretation of pressure as an expansive force is fully consistent
with its thermodynamic and statistical definitions.

2.6.3 Chemical Equilibrium

In the same way, we can also derive chemical equilibrium.
Consider a system with multiple components; maximizing the total entropy with respect to particle exchange of each species gives

$\frac{\partial S_1}{\partial N_1} = \frac{\partial S_2}{\partial N_2}.$

Using the identity:

$\left(\frac{\partial S}{\partial N}\right)_{E,V} = -\frac{\mu}{T},$

we have

$\frac{\mu_1}{T_1} = \frac{\mu_2}{T_2}, \quad \text{hence } \mu_1 = \mu_2 \text{ at equal temperatures, for each component separately}.$
Hence, particles flow spontaneously from higher to lower chemical potential,
just as heat flows from hot to cold and volume expands from high to low
pressure. The chemical potential can thus be interpreted as a generalized
”force” driving particle exchange toward equilibrium.

2.7 Stability Condition

2.7.1 Entropy Stability

Consider two identical subsystems with the same macrostate
(i.e., the same $E, V, N$).
According to the scaling (extensivity) property,
we have

$S_{\text{tot}} = 2S(E, V, N).$

Now apply small, opposite perturbations to the two subsystems:

$E \to E \pm \delta E, \qquad V \to V \pm \delta V.$

Our target is to find the stability condition.
Recall that stability corresponds to the second-order derivative.
So we expand

$\Delta S_{\text{tot}} = S(E + \delta E, V + \delta V) + S(E - \delta E, V - \delta V) - 2S(E, V) = \delta^2 S + O(\delta^3).$

The differential of the entropy for a single subsystem is:

$dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV.$

We can take differentials of these relations and finally get

$\delta^2 S = \delta\!\left(\frac{1}{T}\right)\delta E + \delta\!\left(\frac{p}{T}\right)\delta V.$

Since entropy is maximized at equilibrium,
if we want the system to be stable, we need $\delta^2 S \le 0$. Hence

$\delta\!\left(\frac{1}{T}\right)\delta E + \delta\!\left(\frac{p}{T}\right)\delta V \le 0.$

2.7.2 Diagonalization of $\delta^2 S$

We can express the second variation of entropy as a quadratic form:

$\delta^2 S = S_{EE}\,(\delta E)^2 + 2S_{EV}\,\delta E\,\delta V + S_{VV}\,(\delta V)^2,$

where $S_{EE} = \frac{\partial^2 S}{\partial E^2}$,

$S_{EV} = \frac{\partial^2 S}{\partial E\,\partial V}$,

etc.

From the thermodynamic identity:

$\delta\!\left(\frac{1}{T}\right) = S_{EE}\,\delta E + S_{EV}\,\delta V,$

we can solve for $\delta E$ in terms of $\delta T$ and $\delta V$. Substituting into $\delta^2 S$ eliminates the cross-terms with $\delta V$; carrying the change of variables $(\delta E, \delta V) \to (\delta T, \delta V)$ through (the corresponding Jacobian determinant is straightforward to evaluate), the quadratic form becomes diagonal:

$\delta^2 S = -\frac{C_V}{T^2}\,(\delta T)^2 + \frac{1}{T}\left(\frac{\partial p}{\partial V}\right)_T (\delta V)^2.$

Requiring $\delta^2 S \le 0$ for arbitrary independent perturbations $\delta T$ and $\delta V$, this yields two independent stability conditions:

$C_V \ge 0, \qquad \left(\frac{\partial p}{\partial V}\right)_T \le 0 \quad (\text{equivalently } \kappa_T \ge 0).$

2.7.3 Chemical Stability

We now consider perturbations of the particle number, $\delta N$, at fixed $T$ and $V$.
The second derivative reduces to

$\delta^2 F = \left(\frac{\partial \mu}{\partial N}\right)_{T,V} (\delta N)^2.$

This leads to another stability condition:

$\left(\frac{\partial \mu}{\partial N}\right)_{T,V} \ge 0.$

For an extensive system, the Helmholtz free energy is $F(T, V, N) = N f(T, v)$ with $v = V/N$. The chemical potential is $\mu = \frac{\partial F}{\partial N} = f - v\,\frac{\partial f}{\partial v}$. Differentiating with respect to $N$ at fixed $T, V$ gives $\left(\frac{\partial \mu}{\partial N}\right)_{T,V} = \frac{v^2}{N}\,\frac{\partial^2 f}{\partial v^2}$. The pressure relation is $p = -\frac{\partial f}{\partial v}$.

Substituting:

$\left(\frac{\partial \mu}{\partial N}\right)_{T,V} = -\frac{v^2}{N}\left(\frac{\partial p}{\partial v}\right)_T.$

So chemical stability again leads to $\left(\frac{\partial p}{\partial v}\right)_T \le 0$: for a one-component system it coincides with mechanical stability.

2.8 First Order Phase Transition

2.8.1 Conflict at Transition Point

First-order phase transitions, such as boiling or melting, are marked by discontinuities
in thermodynamic observables—like energy or volume—as external
parameters like temperature or pressure are varied. These discontinuities signal
the coexistence of macroscopically distinct phases and arise from a subtle breakdown
of stability in thermodynamic potentials. Understanding them requires a
precise analysis of entropy curvature (the second order derivative), ensemble
equivalence, and the geometry of thermodynamic functions.

First consider the microcanonical ensemble.
We assume that the volume is fixed,
so we do not have to worry about $p$ and $V$.
Temperature enters through:

$\frac{1}{T} = \frac{\partial S}{\partial E}.$

We can find the heat capacity:

$C^{\mathrm{MC}} = \left(\frac{\partial E}{\partial T}\right)_V = -\frac{\left(\partial S/\partial E\right)^2}{\partial^2 S/\partial E^2}.$

We use the superscript MC to indicate that it is calculated using the microcanonical
ensemble.

Normally $S(E)$ is concave,
so $\frac{\partial^2 S}{\partial E^2} < 0$,
leading to a positive capacity.
But for some systems, $S(E)$ can be convex in a region,
making $C^{\mathrm{MC}} < 0$.
You can see it clearly in the following figure:

Entropy Curve at Phase Transition Point

Now we study the same system using the canonical ensemble.
Because of the equivalence of ensembles in the thermodynamic limit,
the canonical ensemble should give the same result as the microcanonical one.

We have the partition function:

$Z = \int dE\, e^{S(E)/k_B}\, e^{-\beta E},$

and Helmholtz free energy:

$F = -k_B T \ln Z.$

The average energy is

$\bar{E} = -\frac{\partial \ln Z}{\partial \beta}.$

We denote the capacity in the canonical ensemble as $C^{\mathrm{C}}$.
It can be calculated by

$C^{\mathrm{C}} = \frac{\partial \bar{E}}{\partial T} = \frac{\langle (\Delta E)^2 \rangle}{k_B T^2}.$

A fluctuation (a variance) is always non-negative,
so $C^{\mathrm{C}}$ must also be non-negative at all times.
However, this leads to a conflict when the entropy is convex.
Such a strange conflict must conceal some unusual physics:
this is the first-order phase transition.

2.8.2 Phase Transition under Micro-canonical Ensemble

To resolve the conflict between ensembles,
we employ a geometric approach that reveals the physical mechanism of phase separation.

Let’s go back to the curve of entropy.

Entropy Curve at Phase Transition Point

For any energy $E$ between $E_1$ and $E_2$,
we can achieve higher entropy by forming a mixture of two phases:

  • Fraction $x$ in phase 1 (energy $E_1$, entropy $S_1$)
  • Fraction $1 - x$ in phase 2 (energy $E_2$, entropy $S_2$)

The mixture entropy exceeds the homogeneous entropy:

$S_{\text{mix}}(E) = x S_1 + (1 - x) S_2 > S(E), \qquad E = x E_1 + (1 - x) E_2.$

The optimal phases ($E_1$, $E_2$) are determined by the double tangent condition:

  1. The straight line connecting $(E_1, S_1)$ and $(E_2, S_2)$ must be tangent to $S(E)$ at both points

  2. This requires equal slopes at $E_1$ and $E_2$:

$\frac{\partial S}{\partial E}\bigg|_{E_1} = \frac{\partial S}{\partial E}\bigg|_{E_2} = \frac{S_2 - S_1}{E_2 - E_1}.$

The slope equality implies:

$\frac{1}{T_1} = \frac{1}{T_2}.$

This guarantees thermal equilibrium between the two phases.
The temperature at this point is denoted $T_c$,
called the transition (coexistence) temperature.
Furthermore, we can show chemical equilibrium:

$\mu_1 = \mu_2.$

The equilibrium entropy follows the concave hull:

$S_{\text{eq}}(E) = \max\big(S(E),\, S_{\text{mix}}(E)\big),$

where the mixture entropy is the linear interpolation:

$S_{\text{mix}}(E) = S_1 + \frac{S_2 - S_1}{E_2 - E_1}\,(E - E_1).$

Between $E_1$ and $E_2$,

the slope of $S_{\text{eq}}(E)$ is constant, so $T(E) = T_c$ throughout the coexistence interval.

So we find the capacity divergent:

$C^{\mathrm{MC}} = \left(\frac{\partial T}{\partial E}\right)^{-1} \to \infty \quad \text{for } E_1 < E < E_2.$

The apparent contradiction between the microcanonical heat capacity $C^{\mathrm{MC}}$
and the canonical heat capacity $C^{\mathrm{C}}$ is resolved through a fundamental singularity.

This divergence eliminates the distinction between positive and negative values -
the pole singularity renders the sign ambiguity physically meaningless.
The “negative” $C^{\mathrm{MC}}$
in the microcanonical ensemble and the strictly positive $C^{\mathrm{C}}$

requirement in the canonical ensemble

converge to identical divergent behavior at the phase transition.

This agrees with everyday experience.
We know that melting and boiling absorb extra heat while the temperature remains unchanged.
According to the definition of capacity,
it is bound to be infinite there.
The heat exchanged during the process is called latent heat.
We will clarify more details in the following sections.

We must pay attention to the fact that
the convex curve is a prediction of the equation of state for homogeneous states,
while the double tangent line is how the entropy actually evolves in reality.
The entropy curve’s convex region arises directly from the system’s fundamental equation of state—this curved path represents all possible homogeneous configurations governed by microscopic interactions. However, this convex segment remains thermodynamically forbidden in physical systems due to its instability. The double tangent line reveals what actually occurs: systems spontaneously undergo phase separation to follow this straight-line path in the entropy-energy plane.

This straight trajectory connects coexisting phases at energies $E_1$ and $E_2$, bypassing the unstable curved segment entirely. The system “short-circuits” the equation of state’s prediction through phase separation, with the slope $1/T_c$ defining the constant transition temperature.

This geometric duality—curved state equation versus straight phase-transition path—explains why boiling water at 100°C absorbs heat without temperature change: the system evolves along the double tangent line, not the unstable curved entropy path predicted for homogeneous states.

2.8.3 Phase Transition under Canonical Ensemble

Having resolved the conflict,
we now work out the same physics in the canonical ensemble.

Consider a simple fluid ($pVT$ system) with particle number $N$ and volume $V$ in contact with a heat bath at temperature $T$. According to the minimum free energy principle, the Helmholtz free energy is minimized at equilibrium. Suppose we compute the Helmholtz free energy per particle $f(T, v)$, with $v = V/N$, assuming homogeneity. The differential form is:

$df = -s\,dT - p\,dv.$

Pressure is obtained via the derivative:

$p = -\left(\frac{\partial f}{\partial v}\right)_T.$

The mechanical stability condition requires:

$\left(\frac{\partial^2 f}{\partial v^2}\right)_T \ge 0.$

When this condition is violated in an interval $(v_1, v_2)$,
phase separation occurs.
Helmholtz Free Energy at Phase Transition Point

Using the same method,
we find the two transition points via the double tangent to $f(v)$.
The transition points satisfy:

$-\left(\frac{\partial f}{\partial v}\right)_{v_1} = -\left(\frac{\partial f}{\partial v}\right)_{v_2} = p_c = \frac{f(v_1) - f(v_2)}{v_2 - v_1},$

where $p_c$ is the coexistence pressure. The double tangent is:

$f_{\text{mix}}(v) = f(v_1) - p_c\,(v - v_1).$

For $v_1 < v < v_2$, the system minimizes free energy by separating into the two phases:

$x\,v_1 + (1 - x)\,v_2 = v.$

The equilibrium free energy is the convex hull:

$f_{\text{eq}}(v) = \min\big(f(v),\, f_{\text{mix}}(v)\big).$

We can still derive the equilibrium conditions:

  • Thermal equilibrium: same $T$ (both phases in contact with the bath)

  • Mechanical equilibrium: $p_1 = p_2 = p_c$

  • Chemical equilibrium: $\mu_1 = \mu_2$

2.8.4 Maxwell Construction

The Helmholtz free energy curve exhibits a concave region within the phase transition interval.
This concavity violates the mechanical stability condition requiring $\left(\frac{\partial^2 f}{\partial v^2}\right)_T \ge 0$,
indicating that homogeneous states in this region are thermodynamically unstable.

The chemical equilibrium condition for coexisting phases is given by:

$\mu_1 = \mu_2, \quad \text{i.e.,} \quad f(v_1) + p_c\,v_1 = f(v_2) + p_c\,v_2.$

Using the pressure definition:

$p = -\left(\frac{\partial f}{\partial v}\right)_T,$

we evaluate the integral:

$\int_{v_1}^{v_2} p\,dv = f(v_1) - f(v_2) = p_c\,(v_2 - v_1).$

This result is the Maxwell construction,
whose essence is the enforcement of chemical potential equilibrium.
Geometrically,
it requires the area of Region I to equal that of Region II in the figure below:

Maxwell Construction in the $p$–$v$ Curve
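The equal-area condition is straightforward to implement numerically. The sketch below performs the Maxwell construction for a van der Waals isotherm in reduced units (the van der Waals equation is introduced in Section 2.8.6 below; the choice $T = 0.9\,T_c$ is an arbitrary illustration):

```python
import numpy as np
from scipy.optimize import brentq

# Maxwell equal-area construction for the van der Waals isotherm in reduced
# units, p = 8T/(3v-1) - 3/v**2 (so T_c = p_c = v_c = 1). For T < 1 we find
# the coexistence pressure P at which the two shaded areas are equal,
# i.e. integral_{v_l}^{v_g} p dv = P (v_g - v_l).
T = 0.9

def roots(P):
    # p(v) = P  <=>  3P v^3 - (P + 8T) v^2 + 3v - 1 = 0
    r = np.roots([3*P, -(P + 8*T), 3.0, -1.0])
    r = np.sort(r[np.abs(r.imag) < 1e-9].real)
    return r                     # v_l, v_mid, v_g on the unstable isotherm

def area_mismatch(P):
    vl, _, vg = roots(P)
    anti = lambda v: (8*T/3)*np.log(3*v - 1) + 3/v   # antiderivative of p(v)
    return (anti(vg) - anti(vl)) - P*(vg - vl)

P_coex = brentq(area_mismatch, 0.50, 0.70)
vl, _, vg = roots(P_coex)
print(P_coex, vl, vg)
```

The printed coexistence pressure should come out near $0.647\,p_c$ at this temperature, with the liquid and gas volumes roughly $0.60\,v_c$ and $2.35\,v_c$.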

2.8.5 Latent Heat

During first-order phase transitions like boiling or melting,
systems absorb or release significant energy while maintaining constant temperature—
a phenomenon most familiar when water boils at $100\,^\circ\mathrm{C}$
and remains at that temperature until fully vaporized.
This hidden energy exchange,
termed latent heat (denoted $L$),
breaks or forms molecular bonds rather than increasing thermal motion.

For a system of $N$ particles,
the total latent heat is quantified as

$L = N\,T_c\,\Delta s,$

where $\Delta s = s_2 - s_1$
represents the entropy difference per particle between the phases—for example,
vaporizing water requires approximately $2.26 \times 10^6\ \mathrm{J/kg}$ (about $40.7\ \mathrm{kJ/mol}$) to overcome intermolecular forces.
Crucially, the latent heat equivalently equals the enthalpy difference

$\Delta H$

between the coexisting phases,

expressed as $L = \Delta H = N(h_2 - h_1)$,
a relationship arising directly from the equality of chemical potentials
($\mu_1 = \mu_2$, with $\mu = h - Ts$) at phase equilibrium.

The coexistence of phases in the temperature-pressure
($T$–$p$)
plane is defined by the chemical potential equality

$\mu_1(T, p) = \mu_2(T, p),$

describing the intersection curve of two surfaces in

$(T, p, \mu)$ space where the phases are equally stable;

elsewhere, the phase with lower $\mu$ dominates as it minimizes the Gibbs free energy.
When tracing small displacements along this coexistence curve ($dT$, $dp$),
the preservation of chemical equilibrium ($d\mu_1 = d\mu_2$) leads to the Clapeyron equation:

$\frac{dp}{dT} = \frac{\Delta s}{\Delta v} = \frac{\ell}{T\,\Delta v},$

where $\Delta v$ is the specific volume change per particle and $\ell = T\,\Delta s$ is the latent heat per particle.
This equation reveals how coexistence pressures shift with temperature: for most substances,
volume expands during melting or boiling ($\Delta v > 0$),
resulting in a positive slope ($dp/dT > 0$);
water, however, exhibits anomalous behavior where ice melting decreases volume
($\Delta v < 0$, since liquid water is denser),
producing a negative slope ($dp/dT < 0$)
that explains why increased pressure melts ice at constant temperature—a principle often invoked for ice skating.
For water boiling at $100\,^\circ\mathrm{C}$,
the values $\Delta v \approx 1.67\ \mathrm{m^3/kg}$ and $\ell \approx 2.26 \times 10^6\ \mathrm{J/kg}$

yield $dp/dT \approx 3.6 \times 10^3\ \mathrm{Pa/K}$,

indicating substantial pressure sensitivity near phase boundaries.

2.8.6 Stable and Metastable

The critical point is the thermodynamic state where distinct liquid and gas phases become indistinguishable,
characterized by:

  • Critical temperature ($T_c$)

  • Critical pressure ($p_c$)

  • Critical volume ($v_c$)

At this point, liquid-gas coexistence terminates,
surface tension vanishes,
and thermodynamic response functions
(e.g., the compressibility $\kappa_T$) diverge.
You can see it clearly in the following figure:

Critical Point

Take the van der Waals gas as an example.
It has the equation of state

$\left(p + \frac{a}{v^2}\right)(v - b) = k_B T,$

where $v = V/N$ is the volume per particle.
For temperatures below $T_c$,

the $p(v)$ isotherm is not monotonic,

which means there are regions with $\left(\frac{\partial p}{\partial v}\right)_T > 0$ (instability).
The boundary of the phase transition is constructed by the Maxwell construction.
The critical point is determined by

$\left(\frac{\partial p}{\partial v}\right)_T = 0, \qquad \left(\frac{\partial^2 p}{\partial v^2}\right)_T = 0.$

Solving this for the van der Waals gas, the result is

$v_c = 3b, \qquad k_B T_c = \frac{8a}{27b}, \qquad p_c = \frac{a}{27b^2}.$
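These critical values can be reproduced symbolically; a minimal sympy sketch (with $k_B = 1$):

```python
import sympy as sp

# Critical point of the van der Waals gas from the two conditions
# (dp/dv)_T = 0 and (d^2p/dv^2)_T = 0 (k_B = 1, v = volume per particle).
v, T, a, b = sp.symbols('v T a b', positive=True)
p = T/(v - b) - a/v**2

sol = sp.solve([sp.diff(p, v), sp.diff(p, v, 2)], [v, T], dict=True)[0]
v_c, T_c = sol[v], sol[T]
p_c = p.subs({v: v_c, T: T_c})
print(v_c, T_c, sp.simplify(p_c))   # 3b, 8a/(27b), a/(27b^2)
```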

Let us identify the elements in the figure.

Diagram after Maxwell Construction

We must relate this figure to the double tangent line of the Helmholtz free energy.
Points O and D are the phase transition boundary,
corresponding to the tangent points of $f(v)$.
Line OD is the projection of the constructed double tangent line.
Curve OMFD is the projection of $f(v)$ before the construction.
Evidently, as the temperature grows,
the turning points M and F gradually converge toward their midpoint K
and finally merge into one point at the critical temperature $T_c$.

What still puzzles us is the meaning of points M and F.
Notice that between M and F lies the unstable region we mentioned above,
so M and F are the boundaries of the instability region (the spinodal points).
The segments OM and FD correspond to metastable states,
where the system resides in a local minimum of the Helmholtz free energy.
These states satisfy the local stability condition
($\left(\frac{\partial p}{\partial v}\right)_T < 0$)
but are thermodynamically inferior to the phase-separated state on the Maxwell line.
Classic examples include:

  • Supercooled water: Liquid persists below $0\,^\circ\mathrm{C}$ (down to about $-40\,^\circ\mathrm{C}$) due to kinetic barriers inhibiting ice nucleation.

  • Superheated water: Liquid exists above $100\,^\circ\mathrm{C}$ (e.g., in very clean, smooth containers) without boiling.

These metastable states are sensitive to perturbations:

  • Nucleation triggers collapse: Introduction of impurities, vibrations, or density fluctuations drives the system toward the global minimum—spontaneously phase-separating along the Maxwell line OD.

  • Physical mechanism: Local free energy minima (OM/FD) are separated from the coexistence line (OD) by energy barriers; once overcome, the system releases latent heat and achieves equilibrium via phase separation.

Now we will calculate the exact boundary lines in the $(v, T)$ plane,
using a method called asymptotic expansion.
Still taking the van der Waals gas,

$\left(p + \frac{a}{v^2}\right)(v - b) = k_B T.$

To analyze behavior near the critical point, we define reduced variables:

$t = \frac{T - T_c}{T_c}, \qquad \phi = \frac{v - v_c}{v_c}, \qquad \pi = \frac{p - p_c}{p_c}.$

Expanding the van der Waals equation around the critical point gives, to leading orders:

$\pi \approx 4t - 6t\phi - \tfrac{3}{2}\phi^3.$

This governs liquid-gas coexistence for $t < 0$.

The coexistence volumes $\phi_l$ (liquid) and $\phi_g$ (gas) satisfy:

  1. Equal pressure: $\pi(\phi_l) = \pi(\phi_g)$

  2. Equal chemical potential: $\mu(\phi_l) = \mu(\phi_g)$

  3. Maxwell area rule: $\int_{\phi_l}^{\phi_g} \pi\,d\phi = \pi_{\text{coex}}\,(\phi_g - \phi_l)$

Assume asymptotic expansions (a square root in $-t$, to avoid a pole at the critical point):

$\phi_{g,l} = \pm\phi_0\,(-t)^{1/2} + O(t).$

Combining the equations above and matching orders of $t$ yields a symmetric pair of roots.

Applying the area condition determines $\phi_0 = 2$.

Thus, the coexistence volumes are:

$\phi_{g,l} = \pm 2\,(-t)^{1/2} + O(t),$

and the coexistence pressure is:

$\pi_{\text{coex}} = 4t + O(t^2).$

The derivative defines stability:

$\frac{\partial \pi}{\partial \phi} = -6t - \tfrac{9}{2}\phi^2.$

The spinodal points
(where $\partial \pi / \partial \phi = 0$) mark the stability limits:

$\phi_{\text{sp}} = \pm 2\sqrt{-t/3}.$

Here is a summary:

  • Stable liquid: $\phi < -2\sqrt{-t}$

  • Stable gas: $\phi > 2\sqrt{-t}$

  • Metastable supercooled liquid: $-2\sqrt{-t} < \phi < -2\sqrt{-t/3}$

  • Metastable superheated vapor: $2\sqrt{-t/3} < \phi < 2\sqrt{-t}$

  • Unstable (spinodal) region: $|\phi| < 2\sqrt{-t/3}$

Stability diagram for $t < 0$ (horizontal line: Maxwell construction)

2.8.7 Gibbs Phase Rule

In this section we study multi-component systems.
For multi-component systems,
the differential equation of energy is expanded to

$dE = T\,dS - p\,dV + \sum_i \mu_i\,dN_i.$

The Gibbs free energy $G = E - TS + pV$
is especially useful here.
Its differential form is

$dG = -S\,dT + V\,dp + \sum_i \mu_i\,dN_i.$

This equation gives

$\mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T, p, N_{j \ne i}}.$

Commutativity of partial derivatives then leads to

$\left(\frac{\partial \mu_i}{\partial N_j}\right) = \left(\frac{\partial \mu_j}{\partial N_i}\right).$

Inspecting the formulas above,
we may find a constraint on the chemical potentials: being intensive, each $\mu_i$ can depend only on $T$, $p$, and the relative compositions.
So

$G = \sum_i \mu_i N_i.$

Now suppose we have a system with $c$ chemical components
and $\varphi$ coexisting phases.
Coexisting phases mean that the phases have macroscopic, observable boundaries.
For example, solid ice, liquid water, and solid alcohol are three coexisting phases.

We assume the temperature and pressure are uniform and fixed throughout the system, as required by equilibrium. Therefore, the only remaining intensive variables needed to characterize each phase are the relative compositions. We denote $x_i^{(\alpha)}$ as the mole fraction of component $i$ in phase $\alpha$. Hence we have

$\sum_{i=1}^{c} x_i^{(\alpha)} = 1.$

Because of this constraint, each phase has only $c - 1$ independent composition variables. So across all phases, there are $\varphi(c - 1)$ independent variables. Meanwhile, besides the fractions, the chemical potentials also depend on $T$ and $p$. So the system has a degree-of-freedom (DOF) count of $\varphi(c - 1) + 2$.

Next, equilibrium requires that the chemical potential of each component be equal across all phases:

$\mu_i^{(1)} = \mu_i^{(2)} = \cdots = \mu_i^{(\varphi)}, \qquad i = 1, \ldots, c.$

This gives $c(\varphi - 1)$ equations, reducing the DOF accordingly. The total DOF turns out to be

$f = \varphi(c - 1) + 2 - c(\varphi - 1) = c - \varphi + 2.$

This is called the **Gibbs Phase Rule**, telling us the relationship between the number of components and the number of coexisting phases. The DOF indicates how many parameters can be adjusted while keeping the current state (both $c$ and $\varphi$) unchanged. To determine the maximum number of coexisting phases, we impose $f = 0$ and get

$\varphi_{\max} = c + 2.$

For example, consider a mixture composed of water and alcohol. By the Gibbs phase rule, with $c = 2$ there are at most 4 coexisting phases. They might be liquid water, water vapor, liquid alcohol, and alcohol vapor. If we reduce the temperature to freeze the water, the alcohol vapor will disappear.

2.8.8 Chemical Reactions

Consider a chemical reaction involving $c$ components:

$\sum_{i=1}^{c} \nu_i A_i = 0,$

where $A_i$ denotes the $i$-th chemical species and $\nu_i$ is its stoichiometric coefficient; $\nu_i > 0$ indicates products and $\nu_i < 0$ reactants. Let $n_i$ denote the number of moles of species $i$. We define the reaction coordinate $\xi$ (the extent of reaction). It is given by

$d\xi = \frac{dn_i}{\nu_i} \quad \text{(the same value for every } i\text{)}.$

The mole number of each species can then be written as $n_i = n_i^{(0)} + \nu_i\,\xi$, where $n_i^{(0)}$ is the initial amount of species $i$ before the reaction proceeds.

Under fixed $T, p$, the Gibbs energy is written as $G = \sum_i \mu_i n_i$. Differentiating both sides, we get $dG = \sum_i \mu_i\,dn_i + \sum_i n_i\,d\mu_i$; the second term vanishes because of the Gibbs-Duhem relation at fixed $T$ and $p$. So we can represent the Gibbs free energy change through the reaction coordinate:

$\left(\frac{\partial G}{\partial \xi}\right)_{T,p} = \sum_i \nu_i \mu_i \equiv \Delta_r G,$

where $\Delta_r G$ is called the Gibbs free energy of reaction. The coordinate $\xi$ indicates the extent of the reaction (it increases as the reaction proceeds), and you can consider $\Delta_r G$ as the derivative of $G$ with respect to this extent. Intuitively, when the reaction reaches equilibrium, $\Delta_r G = 0$. We will prove it next, using the second law of thermodynamics.

Consider a typical canonical-ensemble setup, composed of a subsystem and a reservoir. The total entropy change must not be negative:

$\delta S_{\text{tot}} = \delta S + \delta S_r \ge 0.$

Suppose the subsystem is perturbed and its macroparameters change by a small amount, its energy and volume changing by $\delta E$ and $\delta V$. The reservoir, correspondingly, changes by $-\delta E$ and $-\delta V$. Its change of entropy can then be written as

$\delta S_r = -\frac{\delta E + p\,\delta V}{T}.$

Since $T, p$ are fixed, the total entropy change can then be written as

$\delta S_{\text{tot}} = -\frac{\delta E + p\,\delta V - T\,\delta S}{T} = -\frac{\delta G}{T} \ge 0.$

This gives $\delta G \le 0$. Therefore,

$\Delta_r G\,\delta\xi \le 0.$

This implies that the reaction proceeds in the direction that reduces $G$. At equilibrium,

$\Delta_r G = \sum_i \nu_i \mu_i = 0.$

Thus, the system minimizes its Gibbs free energy at fixed $T$, $p$, and total atom numbers. If $\Delta_r G < 0$, then $\delta\xi > 0$, which means the reaction goes forward, whereas if $\Delta_r G > 0$, then $\delta\xi < 0$, which means the reaction goes backward.

In practice, the stoichiometric equation suggests that the reaction could proceed until one of the components is fully consumed. However, this is not thermodynamically possible. As the mole number of any species $n_i \to 0$, the chemical potential typically diverges:

$\mu_i \to -\infty$

for ideal gases and ideal mixtures. This means that as the number density of a reactant decreases, its chemical potential becomes increasingly negative, creating a divergent chemical force that opposes further depletion. Therefore, the system can never actually reach a state where $n_i = 0$ for any component. Instead, the reaction slows and stalls asymptotically as depletion is approached.

Similarly, we can also define $\Delta_r H$ and $\Delta_r S$. The partial molar enthalpy and partial molar entropy are defined as

$h_i = \left(\frac{\partial H}{\partial n_i}\right)_{T,p,n_{j \ne i}}, \qquad s_i = \left(\frac{\partial S}{\partial n_i}\right)_{T,p,n_{j \ne i}}.$

These two variables play the same role for $H$ and $S$ as the chemical potential does in the construction of the Gibbs free energy. $H$ and $S$ are both homogeneous functions of the $n_i$; taking derivatives of the Euler relations, we get

$H = \sum_i h_i n_i, \qquad S = \sum_i s_i n_i.$

Recalling $G = H - TS$ and $G = \sum_i \mu_i n_i$, we further find $\mu_i = h_i - T s_i$. Then, as we did for the chemical potential and Gibbs free energy, we can define similar quantities for $H$ and $S$:

$\Delta_r H = \sum_i \nu_i h_i, \qquad \Delta_r S = \sum_i \nu_i s_i,$

and naturally get $\Delta_r G = \Delta_r H - T\,\Delta_r S$.

Let us assume that the reaction is thermodynamically reversible. Consider an infinitesimal step of the chemical reaction with $d\xi > 0$. The heat of reaction, i.e., the heat absorbed by the system during the reaction step, is given by

$\delta Q = T\,dS = T\,\Delta_r S\,d\xi.$

If we further assume that the system is very close to equilibrium, $\Delta_r G$ vanishes, so $T\,\Delta_r S = \Delta_r H$. Then the heat reduces to

$\delta Q = \Delta_r H\,d\xi.$

The law of mass action provides a quantitative relationship between the chemical potentials of the reacting species and their concentrations (or activities) at chemical equilibrium. It is an expression of the condition that the Gibbs free energy is minimized, or equivalently, that the chemical potentials are balanced across the reaction. The chemical potential of an ideal gas is:

$\mu_i = k_B T \ln\!\big(n_i \lambda_i^3\big),$

where $\lambda_i$ is the de Broglie thermal wavelength.
Let us choose $p^{\ominus}$ as an arbitrary reference pressure, and define $\mu_i^{\ominus}(T)$ as the chemical potential evaluated at that reference state. Then

$\mu_i = \mu_i^{\ominus} + k_B T \ln\frac{n_i}{n^{\ominus}},$

where $n_i$ is the concentration (number density). We define the activity $a_i$ as this dimensionless ratio; then the chemical potential takes the unified form

$\mu_i = \mu_i^{\ominus} + k_B T \ln a_i.$

Near equilibrium, $\sum_i \nu_i \mu_i = 0$. We rearrange this equation:

$\sum_i \nu_i \ln a_i = -\frac{1}{k_B T}\sum_i \nu_i \mu_i^{\ominus} = -\frac{\Delta_r G^{\ominus}}{k_B T}.$

Further define the equilibrium constant:

$K = \exp\!\left(-\frac{\Delta_r G^{\ominus}}{k_B T}\right).$

Note that the equilibrium constant is independent of the duration of the reaction; this constant is defined on the reference state. There is another similar quantity, defined everywhere during a reaction, called the reaction quotient:

$Q = \prod_i a_i^{\nu_i}.$

They look the same, but are in fact quite different. To clarify, we use $\Delta_r G$:

$\Delta_r G = \Delta_r G^{\ominus} + k_B T \ln Q = k_B T \ln\frac{Q}{K}.$

Since $\Delta_r G = 0$ at equilibrium, we can see that $Q = K$ matches this condition.
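The connection between minimizing $G$ and the condition $Q = K$ can be illustrated numerically. The sketch below treats a hypothetical ideal-gas reaction $A \rightleftharpoons 2B$ with made-up standard chemical potentials; it minimizes $G(\xi)$ directly and confirms that the reaction quotient at the minimum equals the equilibrium constant.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy equilibrium of the ideal-gas reaction A <-> 2B at fixed T and p:
# minimize G(xi) and check that the reaction quotient Q equals the
# equilibrium constant K = exp(-Delta_r G0 / RT). All numbers are
# illustrative; g_i are standard chemical potentials in units of RT.
gA, gB = 0.0, 1.2          # assumed mu_i^0 / RT
P = 1.0                    # total pressure in units of the reference pressure

def G(xi):                 # total Gibbs energy / RT, starting from 1 mol of A
    nA, nB = 1.0 - xi, 2.0*xi
    n = nA + nB
    return nA*(gA + np.log(nA/n*P)) + nB*(gB + np.log(nB/n*P))

res = minimize_scalar(G, bounds=(1e-6, 1.0 - 1e-6), method='bounded')
xi = res.x
nA, nB, n = 1.0 - xi, 2.0*xi, 1.0 + xi

K = np.exp(-(2*gB - gA))                 # equilibrium constant
Q = (nB/n*P)**2 / (nA/n*P)               # reaction quotient at the minimum
print(xi, Q, K)                          # Q matches K at equilibrium
```

Note that the minimizer stays strictly inside $(0, 1)$: the logarithmic (entropy-of-mixing) terms diverge as any $n_i \to 0$, the numerical counterpart of the statement above that no species is ever fully consumed.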

2.9 Second Order Phase Transition

2.9.1 Ising Model

A phase transition occurs when a material’s macroscopic properties change abruptly as an external parameter,
like temperature, is varied.
In a first-order phase transition,
exemplified by phenomena like boiling where density changes suddenly,
specific thermodynamic quantities exhibit a discontinuous jump at the transition point.
Second-order phase transitions,
also termed continuous phase transitions,
present a different picture.
Here, a fundamental physical quantity known as the order parameter is central.
The order parameter quantitatively measures the degree of order or broken symmetry within the material’s structure.

Specifically, the order parameter takes on a non-zero value in the phase exhibiting long-range order,
typically at lower temperatures,
signifying the established order,
such as spontaneous magnetization in a ferromagnet.
As the temperature approaches the critical temperature in a second-order transition,
this order parameter diminishes continuously and smoothly,
finally reaching zero in the disordered, high-temperature phase.
Crucially, while the order parameter itself evolves continuously to zero in a second-order transition,
its derivatives with respect to external parameters like temperature –
quantities such as susceptibility or specific heat –
typically exhibit discontinuities or even diverge at the critical point.
Therefore, the defining characteristic distinguishing first-order from second-order phase transitions
lies in the behavior of the order parameter:
a discontinuous jump signals a first-order transition,
whereas a continuous decrease to zero signifies a second-order transition.

In this chapter, we use iron as an example of second-order phase transitions. Iron exhibits ferromagnetism below its critical Curie temperature $T_c$, where spontaneous magnetization emerges, while becoming paramagnetic above this temperature. The fundamental order parameter characterizing this transition is the magnetization density, denoted as $m$. Consider a macroscopic iron sample in thermal equilibrium with a heat bath at temperature $T$. To construct a tractable statistical model, we partition the system into cubic lattice cells. Each cell contains a thermodynamically significant ensemble of atoms - sufficiently large that statistical averaging becomes valid, yet small enough to preserve spatial resolution. Within each cell $i$, we define the coarse-grained magnetization $m_i$ as the volume-average of the atomic moments. This **coarse-graining** procedure effectively reduces the complex many-atom system to a lattice representation. Crucially, in the ferromagnetic phase ($T < T_c$), two key physical effects emerge:
  1. Local alignment: Exchange interactions drive near-perfect parallel alignment of atomic moments within each cell

  2. Spontaneous symmetry breaking: The entire cell spontaneously selects a preferred magnetization direction

These collective phenomena justify representing each coarse-grained unit by a binary spin variable $\sigma_i = \pm 1$, where $\sigma_i = +1$ corresponds to macroscopic "spin up" orientation and $\sigma_i = -1$ represents macroscopic "spin down" orientation. The discrete description captures the essential physics of broken symmetry while neglecting subleading fluctuations. This simplification preserves universal critical behavior near $T_c$ and enables powerful analytical and computational approaches to phase transition modeling. After coarse-graining, we introduce the **Ising model**. Its diagram is shown below:
Diagram of Ising Model
The Ising model assigns a binary spin to each site of the lattice. Spins interact with their nearest-neighbor spins and with the external field. This gives the Hamiltonian:

$H = -J\sum_{\langle ij \rangle} \sigma_i \sigma_j - h\sum_i \sigma_i.$

Here $J > 0$ indicates ferromagnetic coupling and $J < 0$ indicates anti-ferromagnetic coupling; $h$ is the external magnetic field. Because $\sigma_i$ can only take $\pm 1$ (so $\sigma_i^2 = 1$), any term involving higher powers of a spin collapses back to the two-body form, which is equivalent to a two-body interaction. Meanwhile the model has translational symmetry, which means the Hamiltonian and physical properties remain invariant under any discrete shift of all spins by one lattice site, due to the uniform lattice structure and identical interactions.

Using the canonical ensemble, we can write the partition function:

$Z = \sum_{\{\sigma\}} e^{-\beta H[\sigma]},$

where $\beta = 1/(k_B T)$. Define the Helmholtz free energy: $F = -k_B T \ln Z$. Thermodynamic observables are obtained as derivatives of $F$. For example, the magnetization per site is:

$m = \frac{1}{N}\sum_i \langle \sigma_i \rangle.$

Meanwhile, inspect the Helmholtz free energy and take its derivative with respect to the external field $h$: since $e^{-\beta H}/Z$ is in fact the probability distribution (we are in the canonical ensemble), the derivative turns out to be the thermal average of the total spin. Then the magnetization can be represented as a derivative of $F$:

$m = -\frac{1}{N}\frac{\partial F}{\partial h}.$

The magnetic susceptibility, measuring the response to the external field, is:

$\chi = \frac{\partial m}{\partial h} = \frac{\beta}{N}\left(\langle M^2 \rangle - \langle M \rangle^2\right), \qquad M = \sum_i \sigma_i.$

This equation indicates that the magnetic susceptibility depends on the fluctuation of the total magnetization. The two-point correlation function measures the extent to which spins at different locations are correlated:

$C(i, j) = \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle\langle \sigma_j \rangle.$

In a translationally invariant system, $C(i, j) = C(i - j)$, called the correlation function. Then the magnetic susceptibility is given by

$\chi = \beta \sum_{r} C(r).$

Let $p[\sigma]$ be a trial probability distribution over spin configurations $\{\sigma\}$. Recall that the non-equilibrium Helmholtz free energy functional is defined as

$F[p] = \langle H \rangle_p - T\,S[p].$

To minimize $F[p]$, take the variation; at equilibrium, we have

$F[p] \ge F_{\text{eq}} = -k_B T \ln Z.$

We can use this inequality to find a variational approximation to the equilibrium state and the equilibrium free energy. Next is the key step:

We replace the actual fluctuating interactions on a spin with an effective field generated by the average magnetization of its neighbors.

Based on this assumption, the probabilities of individual spins decouple and become independent. Then the joint distribution becomes a simple product:

$p[\sigma] = \prod_i p_i(\sigma_i).$

We suppose $p_i = p$ is the same for every site, for the system has translational symmetry when $J > 0$. (For $J < 0$, spins on neighboring sites are anti-parallel to each other, and hence we would have to introduce two different probability distributions for the two sublattices.) For a single spin, we have two constraints:

$p(+1) + p(-1) = 1, \qquad p(+1) - p(-1) = m.$

Solving these, we get

$p(\pm 1) = \frac{1 \pm m}{2}.$

Then, with some derivation, we can find the condition that minimizes $F[p]$:

$m = \tanh\!\big(\beta(zJm + h)\big),$

where $z$ is the coordination number, meaning how many neighbors one specific spin has.

This is the condition of minimizing $F[p]$. Unfortunately, we cannot find an analytic solution of this transcendental equation. Let us check the special situation with $h = 0$. The equation becomes

$m = \tanh(\beta z J m).$

We can study this equation graphically. When $\beta z J > 1$, it has 3 solutions; when $\beta z J \le 1$, it has only 1 solution ($m = 0$). The critical situation is $\beta z J = 1$, so we can easily find the critical temperature:

$k_B T_c = zJ.$

When the equation yields three solutions, two symmetric non-zero solutions emerge. As the temperature decreases, initial fluctuations determine which branch the system selects, leading to spontaneous symmetry breaking.
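The self-consistency equation is easy to solve by fixed-point iteration; a minimal sketch for a square lattice ($z = 4$, $J = 1$, $h = 0$, $k_B = 1$, all values illustrative):

```python
import numpy as np

# Fixed-point iteration for the mean-field self-consistency equation
# m = tanh(z * J * m / T) at h = 0. Below T_c = z*J a nonzero
# magnetization appears; above T_c only m = 0 survives.
z, J = 4, 1.0            # coordination number of the square lattice, coupling
for T in [5.0, 4.0, 3.0, 2.0]:
    m = 0.9              # biased initial guess selects one symmetry-broken branch
    for _ in range(10000):
        m = np.tanh(z*J*m/T)
    print(T, round(m, 4))   # m -> 0 for T >= T_c = 4 (slowly at T_c), m > 0 below
```

Starting from $m = -0.9$ instead lands on the opposite branch, the numerical analogue of spontaneous symmetry breaking.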

2.9.2 Scaling Hypothesis

In second-order (continuous) phase transitions, the system's symmetry changes, and critical phenomena emerge near the critical point. These phenomena—including power-law divergences, scaling laws, universality, and fractal behavior—arise from the divergence of the correlation length and correlation time. Critical phenomena are characterized by a set of critical exponents, which describe how physical quantities behave as power laws in the reduced temperature $t = (T - T_c)/T_c$ (where $T_c$ is the critical temperature) or the external field $h$. To analyze these behaviors, physical quantities are decomposed into:
  • A singular (divergent) part: This part diverges or vanishes as a power law at the critical point.

  • A regular (smooth) part: This part remains finite and non-singular.

The critical exponents are defined by the power-law scaling of the singular parts.
Below are the key power-law relations,
illustrated using the Ising model as an example.

  • Specific Heat ($\alpha$): The singular part of the specific heat diverges as $C \sim |t|^{-\alpha}$ when $h = 0$ (i.e., as $T \to T_c$). For $\alpha > 0$, $C$ diverges at $T_c$; for $\alpha = 0$, it may exhibit a logarithmic divergence.

  • Order Parameter ($\beta$): Below $T_c$ ($t < 0$), the spontaneous magnetization (order parameter) vanishes as $m \sim (-t)^{\beta}$. This describes how the ordered phase disappears as $T$ approaches $T_c$ from below.

  • Susceptibility ($\gamma$): The magnetic susceptibility (response to an external field) diverges as $\chi \sim |t|^{-\gamma}$ near $T_c$. This indicates enhanced response to perturbations at criticality.

  • Critical Isotherm ($\delta$, at $T = T_c$): At $t = 0$, the magnetization depends on the external field as a power law, $m \sim h^{1/\delta}$. A large $\delta$ implies weak response to $h$ near criticality.

  • Correlation Function ($\eta$): The correlation function $C(r)$ describes spatial correlations. The correlation length $\xi$ indicates how far one spin can affect another spin. At criticality ($t = 0$), the correlation function decays algebraically as $C(r) \sim r^{-(d - 2 + \eta)}$, where $d$ is the spatial dimension. The exponent $\eta$ quantifies deviations from mean-field decay.

  • Correlation Length ($\nu$): The correlation length diverges as $\xi \sim |t|^{-\nu}$ near $T_c$, defining the spatial scale over which fluctuations are correlated. This divergence underpins all critical phenomena.

The exponents are universal — they depend only on spatial dimension and symmetry (e.g., the Ising class), not on microscopic details. Physicists have measured them in experiment; for now you can take their values as an experimental conclusion. These power laws provide a foundational framework for understanding critical phenomena in diverse systems, from magnets to fluids. To explain why the power laws hold, the scaling hypothesis was put forward. The scaling hypothesis is the central idea underlying modern theories of critical phenomena. It consists of two main postulates:
  • Singular part of free energy. Near the critical point, the free energy can be decomposed into a smooth part and a singular part. The singular part is assumed to be a generalized homogeneous function of $t$ and $h$:

    $f_s(t, h) = |t|^{2 - \alpha}\, g\!\left(\frac{h}{|t|^{\Delta}}\right),$

    where $g$ is called the scaling function, which is smooth except at the critical point.
  • Single length scale: Close to the critical point, the correlation
    length $\xi$ becomes the only relevant length scale.
    This means that the second-order transition occurs because,
    near the critical point,
    the correlation length becomes so large (divergent)
    that all microscopic scales are negligible compared to it.

So any rescaling of the entire system does not change its properties. Hence, the singular part of the free energy (per unit volume) must depend on temperature only through the correlation length $\xi$. By dimensional analysis, in $d$-dimensional space, $f_s$ must have units of $1/(\text{length})^d$. Then it is natural to find

$f_s \sim \xi^{-d} \sim |t|^{\nu d}.$

It turns out this assumption holds only for $d \le 4$. Let us first inspect the implications of the scaling form of the singular free energy. Setting $h = 0$, the prefactor must carry the singular behavior, leading to:

$f_s(t, 0) \sim |t|^{2 - \alpha}.$

The heat capacity is derived from the second derivative of the free energy:

$C \sim \frac{\partial^2 f_s}{\partial t^2} \sim |t|^{-\alpha},$

confirming $\alpha$ as the heat capacity exponent. Differentiating with respect to $h$ gives the magnetization:

$m \sim \frac{\partial f_s}{\partial h} \sim |t|^{2 - \alpha - \Delta}\, g'\!\left(\frac{h}{|t|^{\Delta}}\right).$

For $t < 0$ and $h \to 0$, spontaneous magnetization requires $g'(0) \ne 0$, yielding:

$\beta = 2 - \alpha - \Delta.$

Another derivative gives the susceptibility:

$\chi \sim \frac{\partial^2 f_s}{\partial h^2} \sim |t|^{2 - \alpha - 2\Delta}, \qquad \gamma = -(2 - \alpha - 2\Delta).$

At $t = 0$, requiring $f_s$ to stay finite fixes the large-argument behavior of the scaling function, leading to:

$m \sim h^{1/\delta}, \qquad \delta = \frac{\Delta}{\beta}.$

From the single-length-scale hypothesis and $f_s \sim \xi^{-d}$:

$2 - \alpha = \nu d.$

The susceptibility-correlation relation $\chi \propto \sum_r C(r)$, combined with $C(r) \sim r^{-(d - 2 + \eta)}$ for $r \lesssim \xi$, gives:

$\gamma = \nu(2 - \eta).$

These yield five scaling relations for the exponents $(\alpha, \beta, \gamma, \delta, \nu, \eta, \Delta)$. Eliminating $\Delta$ gives four experimentally testable relations:
  • Fisher law: $\gamma = \nu(2 - \eta)$

  • Rushbrooke law: $\alpha + 2\beta + \gamma = 2$

  • Widom law: $\gamma = \beta(\delta - 1)$

  • Josephson law: $\nu d = 2 - \alpha$

Therefore only two of these exponents are independent.
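As a consistency check, the mean-field exponents derived in the next sections satisfy all four laws at $d = 4$ (the values below are the standard Landau results):

```python
# Mean-field (Landau) critical exponents, checked against the four
# scaling laws in d = 4 dimensions, where they become exact.
alpha, beta, gamma, delta, nu, eta, d = 0.0, 0.5, 1.0, 3.0, 0.5, 0.0, 4

assert gamma == nu*(2 - eta)          # Fisher
assert alpha + 2*beta + gamma == 2    # Rushbrooke
assert gamma == beta*(delta - 1)      # Widom
assert nu*d == 2 - alpha              # Josephson
print("all four scaling relations hold")
```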

2.9.3 Landau Free Energy

Let us go back to the Ising Hamiltonian and the canonical partition function, as well as the free energy:

$Z = \sum_{\{\sigma\}} e^{-\beta H[\sigma]}, \qquad F = -k_B T \ln Z.$

Since the total magnetization $M = \sum_i \sigma_i$ takes a definite value in every configuration, we now carry out the sum over spin configurations in Eq. (36) in two steps:
  • First, sum over all spin configurations with fixed $M$.

  • Then sum over all allowed values of $M$.

The partition function can then be written as:

$Z = \sum_{M} Z(M), \qquad Z(M) = \sum_{\{\sigma\}:\ \sum_i \sigma_i = M} e^{-\beta H[\sigma]},$

where $Z(M)$ is the partition function of a sub-ensemble in which $M$ is fixed. This ensemble keeps the total magnetization fixed and sums over microstates compatible with that constraint. The Landau free energy can then be defined:

$F_L(M) = -k_B T \ln Z(M).$

And the free energy in the total ensemble is given by

$F = -k_B T \ln \sum_M e^{-\beta F_L(M)}.$

In the thermodynamic limit, both $F$ and $F_L$ are extensive quantities. Hence the sum over $M$ is dominated by only one term: the value of $M$ that maximizes the exponent. This term exceeds all the other terms by far, even their sum. This operation is called the

saddle point approximation.

Then

$F \approx \min_M F_L(M).$

The minimizing value, denoted $M^*$, satisfies $\partial F_L/\partial M = 0$; this equation in fact gives the equation of state.

2.9.4 Landau Theory of Phase Transitions

We define the free energy per spin and the Landau free energy per spin:

$f = F/N, \qquad f_L(m) = F_L(M)/N, \qquad m = M/N.$

According to section 2.9.3, $f = \min_m f_L(m)$ in the thermodynamic limit. It was Landau's idea to expand $f_L$ as a Taylor series of $m$:

$f_L(m) = f_0 + a\,m^2 + b\,m^4 - h\,m + \cdots$

Odd terms (at $h = 0$) vanish because we require the energy to remain unchanged when $m$ changes its direction (symmetry). In the equation above, $f_0$ is caused by the background, and $a$ changes its sign near $T_c$. Near the critical point, we can introduce the reduced temperature $t$ and write $a = a_0 t$ (with $a_0 > 0$) to simplify.
  • When $t > 0$ and $h$ is small, $m$ is small, and we can ignore the quartic term in the Landau free energy because $b\,m^4 \ll a_0 t\,m^2$. The minimum occurs at $m = h/(2a_0 t)$, and substituting back gives: $f \approx f_0 - \frac{h^2}{4 a_0 t}$. Evidently, $f_0$ and $-h^2/(4 a_0 t)$ are respectively the regular part and the singular part of the free energy. We need to cast the singular free energy into the scaling form $f_s = |t|^{2-\alpha}\,g(h/|t|^{\Delta})$. The equation above gives $f_s \sim h^2\,|t|^{-1}$. Balancing the exponents requires $2 - \alpha - 2\Delta = -1$. However, $\Delta$ also defines the behavior of $m(h)$. From the exact form of $f_s$, when $h \to 0$ the singular part vanishes quadratically, and we can only have one consistent solution: $\alpha = 0$, $\Delta = 3/2$. Note that the scaling function $g$ converges under this parameter configuration. This is in fact one of the limitations of Landau's theory.

  • When $t < 0$, from the analysis above, $f_0$ is always the regular part of the free energy. We may therefore ignore it and focus directly on the singular free energy. For $t < 0$, the singular free energy is derived by minimizing: $f_s(m) = a_0 t\,m^2 + b\,m^4 - h\,m$. To extract the scaling form, rescale the variables. Define the dimensionless magnetization and field: $\tilde{m} = m\,|t|^{-1/2}\sqrt{b/a_0}$ and $\tilde{h} = h\,|t|^{-3/2}\,\sqrt{b}/a_0^{3/2}$. Substituting into the free energy expression gives: $f_s = \frac{a_0^2}{b}\,|t|^2\,\Phi(\tilde{h})$, where $\Phi$ is a dimensionless function determined by the dimensionless minimization problem. This recovers the scaling form $f_s = |t|^{2-\alpha}\,g(h/|t|^{\Delta})$, confirming the critical exponents $\alpha = 0$, $\Delta = 3/2$. Other exponents can also be calculated from Landau's function.

  • Spontaneous magnetization exponent $\beta$: At $t < 0$ and $h = 0$, minimization of the Landau free energy gives: $m = \sqrt{\frac{a_0(-t)}{2b}} \sim (-t)^{1/2}$. Thus, $\beta = 1/2$.

  • Critical isotherm exponent $\delta$: At $t = 0$, minimization yields: $4b\,m^3 = h$, so $m \sim h^{1/3}$. Hence, $\delta = 3$.

  • Susceptibility exponent $\gamma$: For $t > 0$ ($h \to 0$), the linear response gives: $\chi = \frac{\partial m}{\partial h} = \frac{1}{2 a_0 t} \sim t^{-1}$. Thus, $\gamma = 1$.

  • Correlation length exponent $\nu$ and correlation function exponent $\eta$: Using the Gaussian approximation of the Landau-Ginzburg functional, the correlation function satisfies a Helmholtz-type equation, with solution:

    $C(r) \sim \frac{e^{-r/\xi}}{r^{d-2}}, \qquad \xi \sim |t|^{-1/2}.$

    This implies $\nu = 1/2$. The short-distance behavior corresponds to $\eta = 0$ (since $C(r) \sim r^{-(d-2+\eta)}$ implies $\eta = 0$).

  • Verification via Fisher relation:
    The consistency is confirmed by $\gamma = \nu(2 - \eta) = \frac{1}{2} \times 2 = 1$.

  • Summary of mean-field exponents: All critical exponents in Landau theory are:

    $\alpha = 0, \quad \beta = \tfrac{1}{2}, \quad \gamma = 1, \quad \delta = 3, \quad \nu = \tfrac{1}{2}, \quad \eta = 0.$

    These satisfy the scaling relations and are exact for spatial dimensions $d \ge 4$.

Part III. Quantum Statistical Mechanics

3.1 Fermions and Bosons

3.1.1 Fundamental Properties

Fermions—encompassing electrons, protons, neutrons, and quarks—are quantum particles with
half-integer spin
whose wavefunctions exhibit antisymmetry under particle exchange.
This defining property arises from the spin-statistics theorem and fundamentally shapes their behavior:

  • Fermionic antisymmetry: The wavefunction reverses sign when any two particles are exchanged: $\Psi(\ldots, x_i, \ldots, x_j, \ldots) = -\Psi(\ldots, x_j, \ldots, x_i, \ldots)$
  • Pauli exclusion principle: Fermions resist occupying identical quantum states, leading to spatial exclusion and degeneracy pressure in neutron stars and white dwarfs.

Bosons—including photons, gluons, and Higgs bosons—possess integer spin
and exhibit symmetric wavefunctions under particle exchange:

  • Bosonic symmetry: The wavefunction remains invariant under particle exchange: $\Psi(\ldots, x_i, \ldots, x_j, \ldots) = +\Psi(\ldots, x_j, \ldots, x_i, \ldots)$
  • Bose enhancement: Multiple bosons congregate in low-energy states, enabling phenomena like superconductivity and Bose-Einstein condensation.

Quantum particles also obey the principle of indistinguishability,
which fundamentally distinguishes quantum statistics from classical mechanics.
While classical particles maintain identity through trajectories,
quantum particles lose individuality—their wavefunctions coalesce into collective states.
The permutation operator $\hat{P}_{ij}$ exchanging particles $i$
and $j$ embodies this principle:

$\hat{P}_{ij}\,\Psi(\ldots, x_i, \ldots, x_j, \ldots) = \Psi(\ldots, x_j, \ldots, x_i, \ldots).$

Crucially, in 3+1 dimensions, $\hat{P}_{ij}$ satisfies:

$\hat{P}_{ij}^2 = \mathbb{1},$

requiring physical states to transform as:

$\hat{P}_{ij}\,|\Psi\rangle = \pm\,|\Psi\rangle.$

This phase constraint crystallizes the spin-statistics theorem:

  • Fermions (eigenvalue $-1$) inhabit antisymmetric wavefunctions

  • Bosons (eigenvalue $+1$) populate symmetric wavefunctions

If there are two particles in one system,
fermions are constrained by the Pauli exclusion principle while bosons are not:

Fermions:

$|\Psi\rangle = \frac{1}{\sqrt{2}}\big(|\alpha\rangle|\beta\rangle - |\beta\rangle|\alpha\rangle\big).$

This state vanishes when $\alpha = \beta$, manifesting the Pauli exclusion principle that prevents fermions from sharing quantum states.

Bosons:

$|\Psi\rangle = \frac{1}{\sqrt{2}}\big(|\alpha\rangle|\beta\rangle + |\beta\rangle|\alpha\rangle\big).$

When $\alpha = \beta$, this state reinforces itself, doubling the amplitude for bosons to coalesce in identical configurations.

And now we expand the system to $N$ particles.
The whole state space becomes a (symmetrized or antisymmetrized) product of Hilbert spaces,
called Fock space.

In this Fock space, a joint state of all particles
can be represented as a determinant or a permanent.

For fermions:

$\Psi(x_1, \ldots, x_N) = \frac{1}{\sqrt{N!}}\,\det\big[\phi_j(x_i)\big]_{i,j=1}^{N}$ (the Slater determinant).

The determinant’s antisymmetry ensures wavefunction sign-flips under particle exchange while automatically enforcing exclusion when states coincide.

For bosons:

$\Psi(x_1, \ldots, x_N) \propto \mathrm{perm}\big[\phi_j(x_i)\big]_{i,j=1}^{N},$

where the permanent is the determinant with all negative signs replaced by positive ones.

Summing over all permutations constructs a wavefunction invariant under particle exchange, allowing unlimited occupation of single-particle states.

This symmetry dichotomy orchestrates quantum many-body phenomena—from the stability of matter enforced by fermionic exclusion to the coherent quantum phases enabled by bosonic condensation.

3.1.2 Second Quantization

Suppose we have six particles distributed among single-particle states labeled by momentum or energy eigenstates
$|1\rangle, |2\rangle, |3\rangle, \ldots$

Imagine:

  • Particle 1 in state $|1\rangle$

  • Particles 2, 3 in state $|2\rangle$

  • Particles 4, 5, 6 in state $|3\rangle$

The corresponding product state is:

$|\psi\rangle = |1\rangle \otimes |2\rangle \otimes |2\rangle \otimes |3\rangle \otimes |3\rangle \otimes |3\rangle,$

and we must then symmetrize (for bosons) or antisymmetrize (for fermions)
the product state, which is tedious and cumbersome.
However, at the end of this process, we only care about the number
of particles in each single-particle state.
In this case, we combine the same states and denote only occupation numbers:

$|n_1, n_2, n_3, \ldots\rangle,$

where $n_i$ represents the number of particles in state $|i\rangle$.
For example, the configuration above will be denoted as

$|1, 2, 3, 0, 0, \ldots\rangle.$

For bosons, $n_i$ can take any non-negative integer value.
However, fermions must not violate the Pauli exclusion principle,
so $n_i$ for fermions can only take 0 or 1.

3.1.3 Creation and Annihilation Operators

We are now ready to introduce the natural operators that act on these states $|n_1, n_2, \ldots\rangle$: creation and annihilation operators. These operators do exactly what their names suggest:

$\hat{a}_i^{\dagger}$: creates a particle in the single-particle state $|i\rangle$; $\hat{a}_i$: annihilates a particle from the state $|i\rangle$.

Their action on the occupation number basis is defined (for bosons) as follows:

$\hat{a}_i^{\dagger}\,|\ldots, n_i, \ldots\rangle = \sqrt{n_i + 1}\,|\ldots, n_i + 1, \ldots\rangle, \qquad \hat{a}_i\,|\ldots, n_i, \ldots\rangle = \sqrt{n_i}\,|\ldots, n_i - 1, \ldots\rangle.$

These definitions ensure the proper normalization of states and preserve the orthonormality of the Fock basis.

The algebra of these operators depends on the statistics of the particles. Traditionally we use $\hat{a}, \hat{a}^{\dagger}$ for bosons, and $\hat{c}, \hat{c}^{\dagger}$ for fermions. They have the following properties:

  • For bosons: $[\hat{a}_i, \hat{a}_j^{\dagger}] = \delta_{ij}, \qquad [\hat{a}_i, \hat{a}_j] = [\hat{a}_i^{\dagger}, \hat{a}_j^{\dagger}] = 0$

  • For fermions: $\{\hat{c}_i, \hat{c}_j^{\dagger}\} = \delta_{ij}, \qquad \{\hat{c}_i, \hat{c}_j\} = \{\hat{c}_i^{\dagger}, \hat{c}_j^{\dagger}\} = 0$

Here, $[\cdot,\cdot]$ denotes the commutator,
and $\{\cdot,\cdot\}$ the anticommutator.
We further define number operators:

$\hat{n}_i = \hat{a}_i^{\dagger}\hat{a}_i$ for bosons, $\hat{n}_i = \hat{c}_i^{\dagger}\hat{c}_i$ for fermions.

To verify, just apply this operator to a state $|n_1, n_2, \ldots\rangle$;
you will get the eigenvalue $n_i$.

We can easily show the following commutation relations:

  • For bosons: $[\hat{n}_i, \hat{a}_j^{\dagger}] = \delta_{ij}\,\hat{a}_j^{\dagger}, \qquad [\hat{n}_i, \hat{a}_j] = -\delta_{ij}\,\hat{a}_j$
  • For fermions: $[\hat{n}_i, \hat{c}_j^{\dagger}] = \delta_{ij}\,\hat{c}_j^{\dagger}, \qquad [\hat{n}_i, \hat{c}_j] = -\delta_{ij}\,\hat{c}_j$

These operators provide a powerful and elegant language for quantum many-body systems. They:

  • Simplify the construction of many-body states and operators

  • Encode particle statistics algebraically

  • Constitute the foundation for quantum field theory and quantum many-body theory

For example, consider the operator:

$\hat{O} = \sum_i c_i\,\hat{a}_i^{\dagger},$

which is entirely natural in the second-quantized language.
It is a linear operator on Fock space,
and when applied to the vacuum,
it yields a valid physical state:

$\hat{O}\,|0\rangle = \sum_i c_i\,|\ldots, n_i = 1, \ldots\rangle.$

We can use creation operators to construct the occupation number basis
states from the vacuum state:

$|n_1, n_2, \ldots\rangle = \prod_i \frac{\big(\hat{a}_i^{\dagger}\big)^{n_i}}{\sqrt{n_i!}}\,|0\rangle.$

3.2 Free Fermi Gas

3.2.1 Thermodynamic Properties

For a macroscopic free Fermi gas, the vast number of particles necessitates the use of second quantization to represent the system's state; this formalism inherently incorporates de Broglie's wave-particle duality, where particles exhibit both wave-like character (plane waves) and granularity. To determine the total energy of this non-interacting gas, we use the second-quantized Hamiltonian

$\hat{H} = \sum_{\mathbf{k}, \sigma} \varepsilon_{\mathbf{k}}\,\hat{n}_{\mathbf{k}\sigma}, \qquad \varepsilon_{\mathbf{k}} = \frac{\hbar^2 k^2}{2m}.$

Note that each $\varepsilon_{\mathbf{k}}$ corresponds to more than one state, for we must take spin into consideration. In general, with spin $s$, each energy contains $g = 2s + 1$ states; $g$ is called the degeneracy.

Before proceeding, it is important to understand when quantum effects become important for a gas. At high temperatures or low densities, the gas behaves classically, and quantum statistics can be ignored. However, when the thermal de Broglie wavelength

$\lambda_T = \sqrt{\frac{2\pi\hbar^2}{m k_B T}}$

becomes comparable to the mean interparticle spacing, the thermal wave packets of individual particles overlap significantly, and hence quantum effects can no longer be neglected. The mean inter-particle spacing is roughly $n^{-1/3}$, where $n = N/V$ is the particle number density. Therefore quantum degeneracy sets in when

$n\,\lambda_T^3 \sim 1.$

This finally gives the critical temperature below which quantum effects cannot simply be neglected. For fermions, it is called the Fermi temperature $T_F$. (In this part, $k_B$ is normalized to 1, so temperature and energy have the same unit.) This value can also be expressed as the Fermi energy:

$\varepsilon_F = k_B T_F \sim \frac{\hbar^2 n^{2/3}}{m}.$

When $T \ll T_F$, quantum effects play an important role, and vice versa.

We continue our study of the Fermi gas with the grand canonical ensemble, because the particle number does not remain constant. In a quantum system, the grand partition function is calculated by a trace:

$\Xi = \mathrm{Tr}\, e^{-\beta(\hat{H} - \mu\hat{N})},$

where $\hat{N} = \sum_{\mathbf{k}\sigma} \hat{n}_{\mathbf{k}\sigma}$. The trace runs over all possible eigenvalues of the occupation numbers, and for fermions $n_{\mathbf{k}\sigma} \in \{0, 1\}$. Hence,

$\Xi = \prod_{\mathbf{k}\sigma} \sum_{n=0}^{1} e^{-\beta(\varepsilon_{\mathbf{k}} - \mu)n} = \prod_{\mathbf{k}\sigma} \left(1 + e^{-\beta(\varepsilon_{\mathbf{k}} - \mu)}\right).$

Plugging in $n = 0, 1$ gives the probabilities of a state being empty or occupied. We can now compute the average occupation number of a given single-particle state $\mathbf{k}\sigma$:

$\langle n_{\mathbf{k}\sigma} \rangle = \frac{1}{e^{\beta(\varepsilon_{\mathbf{k}} - \mu)} + 1}.$

This is called the Fermi-Dirac distribution.

Furthermore, we can obtain the grand canonical potential

$\Phi = -k_B T \ln \Xi = -k_B T \sum_{\mathbf{k}\sigma} \ln\left(1 + e^{-\beta(\varepsilon_{\mathbf{k}} - \mu)}\right).$

Summing over the spins to simplify gives an overall degeneracy factor $g$. Using the grand potential, we can derive all thermodynamic quantities:
  • The average particle number: $\bar{N} = -\frac{\partial \Phi}{\partial \mu} = \sum_{\mathbf{k}\sigma} \langle n_{\mathbf{k}\sigma} \rangle$

  • The internal energy: $E = \sum_{\mathbf{k}\sigma} \varepsilon_{\mathbf{k}}\,\langle n_{\mathbf{k}\sigma} \rangle$

  • The pressure: $p = -\frac{\Phi}{V}$

3.2.2 Density of States (DOS)

In the thermodynamic limit, where the volume becomes very large, the allowed momenta form a dense set. It is then convenient to replace the sum over discrete $\mathbf{k}$ with an integral over continuous momentum space:

$\sum_{\mathbf{k}} \to \frac{V}{(2\pi)^3} \int d^3k,$

where $(2\pi)^3/V$ is the volume per discrete $\mathbf{k}$-point in $\mathbf{k}$-space (because the spacing between adjacent $\mathbf{k}$-points in each dimension is $2\pi/L$ under periodic boundary conditions, making the volume per state in $\mathbf{k}$-space $(2\pi)^3/V$; the reciprocal of this gives the prefactor).

When considering energy-dependent physical quantities (e.g., calculating the partition function, total energy, or heat capacity in statistical mechanics), it is further useful to introduce the concept of the density of states (DOS). The density of states $g(\varepsilon)$ is defined as the number of states per unit energy interval, meaning $g(\varepsilon)\,d\varepsilon$ represents the number of states with energy between $\varepsilon$ and $\varepsilon + d\varepsilon$. This allows us to convert the momentum integral into an energy integral, simplifying calculations.

To derive $g(\varepsilon)$, we utilize the properties of isotropic systems (like the free electron gas), where the energy depends only on the magnitude of the wavevector $k$, via the relation $\varepsilon = \hbar^2 k^2/2m$. Starting from the momentum-space integral, the differential number of states is written as:

$dN = \frac{V}{(2\pi)^3}\, d^3k.$

In spherical coordinates, $d^3k = k^2\,dk\,\sin\theta\,d\theta\,d\varphi$. Integrating over the angular part gives $4\pi$, thus:

$dN = \frac{V}{(2\pi)^3}\, 4\pi k^2\, dk.$

Transforming variables using the energy-momentum relation: $k = \sqrt{2m\varepsilon}/\hbar$, so $dk = \frac{1}{\hbar}\sqrt{\frac{m}{2\varepsilon}}\, d\varepsilon$. Substituting into $dN$, the density of states is defined as $g(\varepsilon) = dN/d\varepsilon$, hence (per spin component):

$g(\varepsilon) = \frac{V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\sqrt{\varepsilon}.$

Including the spin degeneracy $g_s = 2s + 1$, this is often written in the simplified form:

$g(\varepsilon) = \frac{g_s V}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\varepsilon^{1/2}.$

Using the density of states, any sum over $\mathbf{k}$ (if the summed term depends only on the energy $\varepsilon_{\mathbf{k}}$) can be transformed into an integral over energy:

$\sum_{\mathbf{k}\sigma} A(\varepsilon_{\mathbf{k}}) = \int_0^{\infty} d\varepsilon\, g(\varepsilon)\, A(\varepsilon).$

This is highly efficient when dealing with thermodynamic quantities (like the Fermi-Dirac distribution for an electron gas), as it reduces the dimensionality of the integral and highlights the energy dependence.

With the density of states we can rewrite the thermodynamic quantities:

  • Grand potential: $\Phi = -k_B T \int_0^{\infty} d\varepsilon\, g(\varepsilon) \ln\left(1 + e^{-\beta(\varepsilon - \mu)}\right)$

  • Particle number: $\bar{N} = \int_0^{\infty} d\varepsilon\, g(\varepsilon)\, f(\varepsilon), \qquad f(\varepsilon) = \frac{1}{e^{\beta(\varepsilon - \mu)} + 1}$

  • Internal energy: $E = \int_0^{\infty} d\varepsilon\, g(\varepsilon)\, \varepsilon\, f(\varepsilon)$

We can integrate the grand potential by parts (using $g(\varepsilon) \propto \varepsilon^{1/2}$) and finally show

$pV = \frac{2}{3}E,$

which supplies an exact relation between the pressure and the internal energy density.

3.2.3 Zero Temperature Limit

When the temperature approaches 0, the Fermi-Dirac distribution function reduces to a unit-step function of energy, $f(\varepsilon) \to \theta(\mu - \varepsilon)$. The chemical potential at $T = 0$ is defined as the Fermi energy $\varepsilon_F$. Therefore, at $T = 0$, all single-particle states with energy less than $\varepsilon_F$ are occupied, and all states with energy greater than $\varepsilon_F$ are empty. This configuration forms a Fermi sphere in $\mathbf{k}$-space, whose radius is the Fermi wavevector $k_F = \sqrt{2m\varepsilon_F}/\hbar$; the states on its surface have energy $\varepsilon_F$.
Fermi Sphere
Then the particle number is given by

$N = \int_0^{\varepsilon_F} g(\varepsilon)\, d\varepsilon.$

Substituting in the density of states from the last section (with $g_s = 2$ for electrons),

$N = \frac{V}{3\pi^2}\,k_F^3.$

Solving for $\varepsilon_F$, we find

$\varepsilon_F = \frac{\hbar^2}{2m}\left(3\pi^2 n\right)^{2/3}.$

For electrons, $s = 1/2$, and the $\varepsilon_F$ calculated from the zero-temperature limit agrees with the estimate from the de Broglie wavelength. We may rewrite the density of states using the Fermi energy:

$g(\varepsilon) = \frac{3N}{2\varepsilon_F}\left(\frac{\varepsilon}{\varepsilon_F}\right)^{1/2}.$

The internal energy becomes

$E = \int_0^{\varepsilon_F} \varepsilon\, g(\varepsilon)\, d\varepsilon = \frac{3}{5}\,N\varepsilon_F.$

Combining the equations above, we find that at zero temperature the average energy per particle is proportional to the Fermi energy, $E/N = \frac{3}{5}\varepsilon_F$. Finally, the pressure can be obtained:

$p = \frac{2}{3}\frac{E}{V} = \frac{2}{5}\,n\,\varepsilon_F.$

Thus, even at zero temperature, the Fermi gas exerts a nonzero pressure, known as the degeneracy pressure. This pressure is due to Pauli's exclusion principle. The degeneracy pressure plays a crucial role in the stability of dense astrophysical objects such as white dwarfs and neutron stars: such stars collapse if their degeneracy pressure is not strong enough to withstand the compression due to gravity.
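As a numeric illustration, the sketch below evaluates the closed form of $\varepsilon_F$ at the conduction-electron density of copper ($n \approx 8.5 \times 10^{28}\ \mathrm{m^{-3}}$, a standard reference value) and checks that integrating $g(\varepsilon)$ up to $\varepsilon_F$ recovers $n$:

```python
import numpy as np
from scipy.constants import hbar, m_e, eV

# Fermi energy of a free electron gas, E_F = (hbar^2/2m)(3 pi^2 n)^(2/3),
# at the conduction-electron density of copper.
n = 8.5e28                                   # electrons per m^3

E_F = hbar**2 / (2*m_e) * (3*np.pi**2 * n)**(2/3)
print(E_F / eV)                              # ~7.0 eV, the textbook value

# Consistency check: integrating the DOS (spin included) up to E_F gives n.
eps = np.linspace(0, E_F, 100001)
g = (1/(2*np.pi**2)) * (2*m_e/hbar**2)**1.5 * np.sqrt(eps)   # g(eps)/V
print(np.trapz(g, eps) / n)                  # ~1.0
```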

3.2.4 Low Temperature Sommerfeld Expansion

At low but finite temperatures $0 < T \ll T_F$, the Fermi-Dirac distribution is no longer a sharp step function. Instead, it gets slightly smeared around the Fermi energy. We study this small correction with the Sommerfeld expansion. Suppose we wish to compute integrals of the form

$I = \int_0^{\infty} d\varepsilon\, H(\varepsilon)\, f(\varepsilon).$

The key insight enabling the Sommerfeld expansion is to recognize that the derivative of the Fermi-Dirac distribution, $-\partial f/\partial\varepsilon$, acts as a thermally broadened Dirac delta function centered at $\mu$ with width $\sim k_B T$. This function, defined as

$-\frac{\partial f}{\partial \varepsilon} = \frac{\beta}{4\cosh^2\!\big(\beta(\varepsilon - \mu)/2\big)},$

is normalized to unity and becomes sharply peaked at $\varepsilon = \mu$ as $T \to 0$. To exploit this behavior, we integrate by parts after introducing the auxiliary function $K(\varepsilon) = \int_0^{\varepsilon} H(\varepsilon')\,d\varepsilon'$. This yields

$I = \int_0^{\infty} d\varepsilon\, K(\varepsilon)\left(-\frac{\partial f}{\partial \varepsilon}\right).$

The boundary terms vanish: at $\varepsilon = 0$, $K = 0$; as $\varepsilon \to \infty$, $f \to 0$ exponentially. In the low-temperature limit ($k_B T \ll \mu$), the lower integration limit can be extended to $-\infty$, since $-\partial f/\partial\varepsilon$ decays rapidly away from $\mu$. Expanding $K$ in a Taylor series around $\varepsilon = \mu$ and substituting into the integral gives

$I = K(\mu) + \sum_{n \ge 1} \frac{K^{(n)}(\mu)}{n!} \int_{-\infty}^{\infty} d\varepsilon\,(\varepsilon - \mu)^n \left(-\frac{\partial f}{\partial \varepsilon}\right).$

The integrals vanish for odd $n$ due to antisymmetry. For even $n$, they evaluate to dimensionless numbers related to the Riemann zeta function:

$\int_{-\infty}^{\infty} dx\, \frac{x^n}{4\cosh^2(x/2)} = 2\left(1 - 2^{1-n}\right)\zeta(n)\,n! \quad (n \text{ even}).$

Using $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$, the first few terms give

$I = \int_0^{\mu} H(\varepsilon)\,d\varepsilon + \frac{\pi^2}{6}(k_B T)^2\, H'(\mu) + \frac{7\pi^4}{360}(k_B T)^4\, H'''(\mu) + \cdots.$

This expansion is asymptotic, with each term suppressed by a further factor of $(T/T_F)^2$. The dominance of the Fermi surface ($\varepsilon \approx \mu$) at low temperatures ensures rapid convergence when $T \ll T_F$.

Building upon the Sommerfeld expansion technique, we now systematically examine how key thermodynamic quantities of the degenerate Fermi gas evolve at low temperatures. The expansion provides a powerful framework for calculating precise temperature-dependent corrections to zero-temperature behavior.

  • Chemical Potential

    The fundamental connection between particle number and chemical potential emerges from the density of states and Fermi-Dirac distribution: $N = \int_0^{\infty} g(\varepsilon)\, f(\varepsilon)\,d\varepsilon$. Applying the Sommerfeld expansion with $H = g$ yields the constraint equation: $N \approx \int_0^{\mu} g(\varepsilon)\,d\varepsilon + \frac{\pi^2}{6}(k_B T)^2\, g'(\mu)$. Solving perturbatively by writing $\mu = \varepsilon_F + \delta\mu$ and expanding in small $\delta\mu$ and $T$ gives the chemical potential's temperature dependence:

    $\mu(T) \approx \varepsilon_F\left[1 - \frac{\pi^2}{12}\left(\frac{T}{T_F}\right)^2\right].$

    This reveals that $\mu$ decreases quadratically from $\varepsilon_F$ with increasing temperature, reflecting thermal excitation of fermions just below the Fermi surface. (A numerical check of this formula appears after this list.)
  • Internal Energy

    The internal energy expression $E = \int_0^{\infty} \varepsilon\, g(\varepsilon)\, f(\varepsilon)\,d\varepsilon$ is evaluated using $H(\varepsilon) = \varepsilon\, g(\varepsilon)$ in the Sommerfeld expansion. Substituting the chemical potential from above and re-expanding in $T$ produces:

    $E(T) \approx \frac{3}{5}N\varepsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{T}{T_F}\right)^2\right],$

    which gives the linear-in-$T$ heat capacity $C_V = \frac{\pi^2}{2}N k_B \frac{T}{T_F}$. The quadratic increase in $E$ with temperature contrasts sharply with classical gases, arising from Pauli blocking that restricts excitations to a thin shell near $\varepsilon_F$.
  • Pressure and Compressibility

    Using the relation $pV = \frac{2}{3}E$ from the grand potential, the pressure inherits the temperature dependence of the internal energy:

    $p \approx \frac{2}{5}\,n\,\varepsilon_F\left[1 + \frac{5\pi^2}{12}\left(\frac{T}{T_F}\right)^2\right].$

    The isothermal compressibility then follows from its thermodynamic definition: $\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_{T,N}$. Differentiating while noting $\varepsilon_F \propto n^{2/3}$ yields:

    $\kappa_T \approx \frac{3}{2n\varepsilon_F}\left[1 - \frac{\pi^2}{12}\left(\frac{T}{T_F}\right)^2\right].$

    This shows the compressibility decreases with temperature, indicating the gas becomes stiffer as thermal excitations populate states above the Fermi surface.
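A quick numerical check of the chemical-potential correction, in hypothetical normalized units ($\varepsilon_F = k_B = 1$, with $g(\varepsilon) = \frac{3}{2}\sqrt{\varepsilon}$ so that $n = 1$ at $T = 0$):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.integrate import quad

# Check mu(T) ~ eps_F [1 - (pi^2/12)(T/T_F)^2] by solving n = 1 numerically.
def density(mu, T):
    integrand = lambda e: 1.5*np.sqrt(e) / (np.exp((e - mu)/T) + 1.0)
    return quad(integrand, 0.0, mu + 40*T)[0]   # tail beyond mu+40T is negligible

for T in [0.02, 0.05, 0.10]:
    mu = brentq(lambda m: density(m, T) - 1.0, 0.5, 1.2)
    mu_sommerfeld = 1.0 - (np.pi**2/12.0)*T**2
    print(T, mu, mu_sommerfeld)   # agreement improves as T -> 0
```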

3.3 Free Bose Gas

3.3.1 Thermodynamic Properties

In this lecture, we study a system of non-interacting massive bosonic particles confined in a volume at temperature and chemical potential . Bosons obey Bose-Einstein statistics, meaning that multiple particles can occupy the same quantum state. This is in contrast to fermions, which obey the Pauli exclusion principle that prevents more than one particle from occupying the same quantum state.

Our approach parallels the one used for fermions.

Still working in the grand canonical ensemble, the grand partition function is
$$\Xi = \prod_{k}\frac{1}{1 - e^{-\beta(\varepsilon_k - \mu)}}.$$
Then all thermodynamic quantities are given:
  • Grand potential

    It is instructive to write $\Xi$ as a product of infinitely many grand canonical partition functions, one for each single-particle state: $\Xi = \prod_k \Xi_k$ with $\Xi_k = \left(1 - e^{-\beta(\varepsilon_k - \mu)}\right)^{-1}$. Similarly, the grand potential can be written as the sum:
    $$\Omega = k_B T \sum_k \ln\left(1 - e^{-\beta(\varepsilon_k - \mu)}\right).$$
    These results tell us that the Bose gas can be understood as an independent collection of infinitely many small open systems, each in contact with the same bath at temperature $T$ and chemical potential $\mu$.
  • Particle number

    $$\langle n_k \rangle = \frac{1}{e^{\beta(\varepsilon_k - \mu)} - 1}.$$
    This is known as the Bose-Einstein distribution.

We also have the internal energy $U = \sum_k \varepsilon_k \langle n_k \rangle$.

Meanwhile, the density of states derived for fermions remains correct for bosons, since it depends only on the single-particle spectrum and is derived in the same way.
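The factorization into independent single-state open systems can be checked mode by mode. A minimal sketch (parameter values are ours, chosen for illustration) computes the grand-canonical average occupation of one bosonic mode by direct summation and compares it with the Bose-Einstein formula:

```python
import numpy as np

# Single bosonic mode in contact with a bath (k_B = 1): compare the exact
# grand-canonical average  <n> = sum_n n x^n / sum_n x^n,  x = exp(-(eps-mu)/T),
# against the closed form x/(1-x) = 1/(exp((eps-mu)/T) - 1).
eps, mu, T = 1.0, -0.2, 0.7
x = np.exp(-(eps - mu) / T)          # convergence requires mu < eps, i.e. x < 1

n = np.arange(0, 2000)               # occupation cutoff; the geometric tail is negligible
weights = x**n
n_avg = (n * weights).sum() / weights.sum()

print(f"direct sum:    {n_avg:.10f}")
print(f"Bose-Einstein: {x / (1 - x):.10f}")
```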

3.3.2 High Temperature Limit

At high temperatures or low densities, quantum gases behave classically, with quantum statistical effects appearing as small corrections. The key parameter governing these corrections is the degeneracy parameter $n\lambda^3$, where $n$ is the particle density and $\lambda = h/\sqrt{2\pi m k_B T}$ is the thermal de Broglie wavelength. When $n\lambda^3 \ll 1$, quantum statistics can be treated perturbatively. For a unified description of Bose and Fermi gases, we introduce the statistical parameter $\eta$: $\eta = +1$ for bosons and $\eta = -1$ for fermions. This allows us to express the occupation number distribution, $\langle n_k\rangle = \left(e^{\beta(\varepsilon_k - \mu)} - \eta\right)^{-1}$, and the grand potential in a unified form. The Bose-Einstein functions $g_\nu(z)$ and Fermi-Dirac functions $f_\nu(z)$ are defined through integral representations that admit power series expansions in the fugacity $z = e^{\beta\mu}$:
$$g_\nu(z) = \sum_{k=1}^{\infty}\frac{z^k}{k^\nu}, \qquad f_\nu(z) = \sum_{k=1}^{\infty}(-1)^{k+1}\frac{z^k}{k^\nu},$$
which converge for $z \le 1$. With the density of states, the sums over states can be estimated with integrals over $\varepsilon$. Particle number density and energy density then take the compact forms:
$$n\lambda^3 = g_s\, h_{3/2}(z), \qquad u = \frac{3}{2}\,\frac{k_B T}{\lambda^3}\, g_s\, h_{5/2}(z),$$
where $h_\nu$ stands for $g_\nu$ (bosons) or $f_\nu$ (fermions) and $g_s$ is the spin degeneracy. The equation of state follows as $PV = \frac{2}{3}U$ for both statistics. To derive high-temperature expansions, we solve the particle number equation perturbatively for small $z$. Inverting the series yields the fugacity expansion:
$$z = \frac{n\lambda^3}{g_s}\left[1 - \eta\,\frac{n\lambda^3}{2^{3/2} g_s} + \cdots\right].$$
Substituting this into the pressure and energy expressions and re-expanding in powers of $n\lambda^3$ gives the quantum corrections to classical behavior:
$$\frac{P}{n k_B T} = 1 - \eta\,\frac{n\lambda^3}{2^{5/2} g_s} + \cdots, \qquad \frac{U}{N} = \frac{3}{2} k_B T\left(1 - \eta\,\frac{n\lambda^3}{2^{5/2} g_s} + \cdots\right).$$
The chemical potential expansion follows similarly:
$$\mu = k_B T\left[\ln\frac{n\lambda^3}{g_s} - \eta\,\frac{n\lambda^3}{2^{3/2} g_s} + \cdots\right].$$
The sign difference in the $\eta$ terms reveals fundamental statistical behavior: bosonic exchange symmetry reduces pressure and energy relative to classical predictions, while fermionic exclusion increases them. This reflects how quantum statistics modify effective interparticle interactions: bosons exhibit effective attraction while fermions exhibit effective repulsion compared to distinguishable particles. The accuracy of these expansions is controlled by the smallness of $n\lambda^3$, which measures the overlap of thermal wave packets and vanishes in the classical limit.
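These virial-type corrections can be verified against the exact series, since $g_\nu(z)$ and $f_\nu(z)$ are polylogarithms. A sketch, assuming mpmath's `polylog` and the unified convention $h_\nu(z) = \eta\,\mathrm{Li}_\nu(\eta z)$ with $g_s = 1$ (the function name `h` is ours):

```python
from scipy.optimize import brentq
from mpmath import polylog

def h(nu, z, eta):
    # Unified statistics function: g_nu(z) for eta = +1, f_nu(z) for eta = -1
    return float(eta * polylog(nu, eta * z))

nl3 = 0.05  # degeneracy parameter n*lambda^3 (illustrative small value)

for eta, name in [(+1, "bosons"), (-1, "fermions")]:
    # Invert n*lambda^3 = h_{3/2}(z) for the fugacity z
    z = brentq(lambda z: h(1.5, z, eta) - nl3, 1e-12, 0.5)
    ratio  = h(2.5, z, eta) / nl3           # exact P/(n k_B T) = h_{5/2}/h_{3/2}
    virial = 1 - eta * nl3 / 2**2.5         # leading quantum correction
    print(f"{name}: z = {z:.5f}, P/(nkT) exact = {ratio:.5f}, expansion = {virial:.5f}")
```

The two columns agree to order $(n\lambda^3)^2$, with the bosonic pressure below and the fermionic pressure above the classical value, as the sign argument predicts.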

3.3.3 Bose-Einstein Condensation

Let's go back to the Bose-Einstein distribution. In order for the average occupation number to be positive, the chemical potential must be less than the energy level $\varepsilon_k$. This must hold for every single-particle state. Hence we have $\mu < \varepsilon_0$, where $\varepsilon_0$ is the single-particle ground-state energy. For a free Bose gas confined in a box, we have $\varepsilon_0 = 0$. Hence we have $\mu < 0$. If $\mu \to \varepsilon_0$, the particle number in the ground state apparently diverges. For a system with finite particle number and finite volume, of course, such a divergence cannot happen. We will have to invoke other physical conditions to determine the actual number of particles in the ground state. As we will show next, in three dimensions it may happen that a macroscopic fraction of particles occupies the ground state, a phenomenon called Bose-Einstein condensation (BEC). For simplicity, we consider spinless bosons, so $g_s = 1$. Using the expansion, we have
$$n\lambda^3 = g_{3/2}(z).$$
Since $\mu < 0$, the fugacity satisfies $z < 1$. When $z$ increases from 0 to 1, the Bose-Einstein function approaches a finite value, given by the Riemann zeta function:
$$g_{3/2}(1) = \zeta(3/2) \approx 2.612.$$
There is a conflict here. Consider a fixed temperature $T$, and gradually increase the number of particles in the system. The number density will gradually reach the saturation point:
$$n_c\lambda^3 = \zeta(3/2).$$
However, the LHS has no upper bound while the RHS has. We can equally fix the particle number; in the same way, we find a critical temperature
$$k_B T_c = \frac{2\pi\hbar^2}{m}\left(\frac{n}{\zeta(3/2)}\right)^{2/3}:$$
for $T < T_c$, the equation $n\lambda^3 = g_{3/2}(z)$ has no solution. At this point we have no idea why the particle number should have an upper bound; the equation leads to a critical number $N_c = \zeta(3/2)\,V/\lambda^3$. Maybe we need a leap of faith. The leap was completed by Einstein. At the critical point, we have $\mu \to 0$, so $z \to 1$. But the Bose distribution tells us the occupation number of the ground state ($\varepsilon_0 = 0$) is
$$N_0 = \frac{1}{e^{-\beta\mu} - 1} = \frac{z}{1-z},$$
which diverges as $z \to 1$. This divergence is an alarm that we cannot really approximate the discrete sum as a continuous integral: the particle number in the ground state must be treated separately. This is where Einstein's genius comes into play! Einstein reinterpreted the integral-approximated part of the summation. If we use the integral over the density of states to estimate the sum, note that the density of states $g(\varepsilon) \propto \varepsilon^{1/2}$ vanishes as $\varepsilon \to 0$, while the ground state has zero energy. Hence the integral does not take the ground state into account at all. This is how the leap of faith is made:

We reinterpret the integral as counting only the particles in excited states, not all bosons, and at a given temperature the excited states have a finite total capacity. When the particle number exceeds this capacity, the excess particles must, regardless of energy considerations, go into the ground state. They condense in the ground state and contribute nothing to the thermodynamic properties of the system. This phenomenon is called Bose-Einstein condensation.

Under this perspective, we can reinterpret the equations above:
$$N_{\mathrm{exc}} = \zeta(3/2)\,\frac{V}{\lambda^3}.$$
This is the maximum capacity of all excited states. Below this number, particles fill the excited states; beyond it, the extra particles must go into the ground state. The total number is $N = N_0 + N_{\mathrm{exc}}$. According to this interpretation, the occupation number of the ground state becomes macroscopic, i.e., proportional to the system volume. We can further relate number density to temperature, since the de Broglie wavelength scales as $\lambda \propto T^{-1/2}$. When $T < T_c$, BEC occurs, and then
$$\frac{N_{\mathrm{exc}}}{N} = \left(\frac{T}{T_c}\right)^{3/2}.$$
Correspondingly,
$$\frac{N_0}{N} = 1 - \left(\frac{T}{T_c}\right)^{3/2}.$$
When $T > T_c$, there is no BEC, $N_0/N \approx 0$, and the two equations no longer hold.
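To get a feel for the numbers, here is a sketch evaluating $T_c$ and the condensate fraction for illustrative cold-atom parameters (Rb-87 atoms at a typical density of $10^{20}\,\mathrm{m^{-3}}$; these values are ours, not from the text):

```python
import numpy as np
from scipy.special import zeta

hbar = 1.054571817e-34    # J s
kB   = 1.380649e-23       # J/K
amu  = 1.66053906660e-27  # kg

# Illustrative values: Rb-87 at a typical cold-atom density
m = 87 * amu
n = 1e20                  # m^-3

# Critical temperature from n * lambda^3 = zeta(3/2)
Tc = (2 * np.pi * hbar**2 / (m * kB)) * (n / zeta(1.5)) ** (2 / 3)
print(f"T_c ≈ {Tc * 1e9:.0f} nK")

# Condensate fraction below T_c: N0/N = 1 - (T/Tc)^(3/2)
for frac in [0.25, 0.5, 0.75]:
    print(f"T/Tc = {frac:.2f}: condensate fraction = {1 - frac**1.5:.3f}")
```

The sub-microkelvin scale of $T_c$ is why BEC in dilute gases had to wait for laser and evaporative cooling.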

Because of the condensation, the system is divided into two phases:

  • Condensate: A macroscopic number of particles occupy the ground state $\varepsilon_0 = 0$. These particles have zero kinetic energy and zero momentum. As a result, they contribute nothing to the energy, pressure, or entropy of the system.

  • Thermal Cloud: All other particles are distributed among excited states with $\varepsilon > 0$, described by the usual Bose-Einstein distribution with $\mu = 0$. These particles determine all thermodynamic properties: energy, pressure, and heat capacity.

When $T > T_c$, the Bose gas exists entirely in the thermal phase with no condensate ($N_0 \approx 0$). The fugacity satisfies $z < 1$, and all thermodynamic properties are governed by the thermal cloud: particles distributed across excited states ($\varepsilon > 0$) according to the Bose-Einstein distribution. For spinless bosons ($g_s = 1$), the key relations are:
$$n = \frac{g_{3/2}(z)}{\lambda^3}, \qquad u = \frac{3}{2}\,\frac{k_B T}{\lambda^3}\, g_{5/2}(z), \qquad P = \frac{k_B T}{\lambda^3}\, g_{5/2}(z).$$
Here, $n$, $u$, and $P$ denote the number density, internal energy density, and pressure of the thermal cloud. To determine the equation of state, solve the first equation for $z$ and substitute into the expressions for $u$ and $P$.
  • Isothermal Compressibility

    The thermal cloud exhibits critical behavior near $T_c$. The isothermal compressibility quantifies its response to pressure changes at fixed temperature:
    $$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T = \frac{1}{n}\left(\frac{\partial n}{\partial P}\right)_T.$$
    With $T$ fixed, variations in $n$ and $P$ arise solely from changes in $z$. Using the identity $z\,\mathrm{d}g_\nu(z)/\mathrm{d}z = g_{\nu-1}(z)$, these become:
    $$\mathrm{d}n = \frac{g_{1/2}(z)}{\lambda^3}\,\frac{\mathrm{d}z}{z}, \qquad \mathrm{d}P = \frac{k_B T}{\lambda^3}\, g_{3/2}(z)\,\frac{\mathrm{d}z}{z}.$$
    The ratio yields $\kappa_T$:
    $$\kappa_T = \frac{1}{n k_B T}\,\frac{g_{1/2}(z)}{g_{3/2}(z)},$$
    where $z$ is fixed by $n\lambda^3 = g_{3/2}(z)$. As $T \to T_c^{+}$, $z \to 1$ and $g_{1/2}(z)$ diverges while $g_{3/2}(z)$ remains finite. Consequently, $\kappa_T$ diverges.
This divergence signifies critical fluctuations, a hallmark of continuous phase transitions. The thermal cloud becomes infinitely compressible near $T_c$, reflecting long-range correlations and heightened sensitivity to external pressure, despite other thermodynamic properties remaining analytic.
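The divergence can be seen directly by evaluating the ratio $g_{1/2}(z)/g_{3/2}(z)$ as $z \to 1$. A minimal sketch using mpmath's `polylog`:

```python
from mpmath import polylog

# The ratio g_{1/2}(z)/g_{3/2}(z) controls kappa_T of the thermal cloud;
# it diverges as the fugacity z -> 1, i.e. as T -> T_c from above.
for z in [0.9, 0.99, 0.999, 0.9999]:
    ratio = float(polylog(0.5, z)) / float(polylog(1.5, z))
    print(f"z = {z}: g_1/2 / g_3/2 = {ratio:.2f}")
```

The numerator grows roughly like $\sqrt{\pi/(1-z)}$ near $z = 1$, while the denominator saturates at $\zeta(3/2)$, so the ratio climbs without bound.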

3.4 Trapped in a Potential

The behavior of quantum gases confined in external potentials exhibits fundamental differences from free-space systems, with harmonic traps serving as a cornerstone for cold-atom experiments. For bosons in an isotropic trap of frequency $\omega$, the discrete single-particle spectrum generates a density of states scaling as $g(\varepsilon) \propto \varepsilon^2$, diverging from the free-space $\varepsilon^{1/2}$ dependence. This restructuring dramatically alters Bose-Einstein condensation (BEC): the critical temperature scales as $k_B T_c = \hbar\omega\,(N/\zeta(3))^{1/3}$, a weaker dependence on particle number than the $n^{2/3}$ density scaling in uniform systems. Below $T_c$, the condensate fraction follows $N_0/N = 1 - (T/T_c)^3$, decaying faster than the free-space $1 - (T/T_c)^{3/2}$ law due to the trap's spectral geometry. The condensate, localized in the ground state, contributes zero energy and momentum, while thermodynamics is governed by the thermal cloud, which exhibits diverging compressibility near the transition. For fermions in identical traps, degeneracy emerges when $k_B T \ll \varepsilon_F$. The Fermi energy at zero temperature, $\varepsilon_F = \hbar\omega\,(6N)^{1/3}$, and the ground-state energy $E_0 = \frac{3}{4}N\varepsilon_F$ reflect the virial theorem's balance between kinetic and potential energies. At low temperatures, the Sommerfeld expansion yields quadratic corrections in $k_B T/\varepsilon_F$, mirroring free fermions but with rescaled exponents. Confinement redistributes phase space, altering the scaling of the degeneracy pressure relative to the free-space result. In the semiclassical regime $k_B T \gg \hbar\omega$, discrete sums converge to phase-space integrals:
$$\sum_k \to \int \frac{\mathrm{d}^3 r\,\mathrm{d}^3 p}{(2\pi\hbar)^3}.$$
This reveals universal scaling laws such as $T_c \propto N^{1/d}$ for $d$-dimensional harmonic traps, demonstrating how confinement reshapes thermodynamics while preserving quantum degeneracy phenomena. Cold-atom experiments directly validate these predictions, probing the interplay between quantum statistics and spatial constraints.
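A sketch putting numbers to the two trap formulas quoted above, assuming an illustrative trap frequency $\omega = 2\pi \times 100\,\mathrm{Hz}$ and $N = 10^6$ atoms (both values are ours):

```python
import numpy as np
from scipy.special import zeta

hbar = 1.054571817e-34  # J s
kB   = 1.380649e-23     # J/K

# Illustrative trap parameters
N = 1e6
omega = 2 * np.pi * 100.0  # rad/s

# Bosons in a 3D isotropic trap: k_B T_c = hbar*omega * (N/zeta(3))^(1/3)
Tc = hbar * omega / kB * (N / zeta(3)) ** (1 / 3)

# Fermions in the same trap: eps_F = hbar*omega * (6N)^(1/3)
TF = hbar * omega / kB * (6 * N) ** (1 / 3)

print(f"BEC critical temperature: {Tc * 1e9:.0f} nK")
print(f"Fermi temperature:        {TF * 1e9:.0f} nK")
```

Both scales sit in the hundreds of nanokelvin, consistent with the regimes reached in typical cold-atom experiments.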

3.5 Massless Quantum Gas

3.5.1 Density of States

Massless bosonic particles such as photons and phonons exhibit a linear dispersion relation $\varepsilon = \hbar c k$, where $c$ is the propagation speed (the speed of light for photons, the speed of sound for phonons). To derive the energy density of states in $d$-dimensional space, consider the number of quantum states in a spherical shell of radius $k$ and thickness $\mathrm{d}k$ in $k$-space. The volume element is proportional to the surface area of a $d$-dimensional sphere:
$$\mathrm{d}N = \frac{V}{(2\pi)^d}\, S_d\, k^{d-1}\,\mathrm{d}k,$$
where $S_d = 2\pi^{d/2}/\Gamma(d/2)$ is the surface area of a unit sphere. Substituting the linear dispersion $k = \varepsilon/\hbar c$ and $\mathrm{d}k = \mathrm{d}\varepsilon/\hbar c$ yields:
$$g(\varepsilon) = \frac{V S_d}{(2\pi)^d}\,\frac{\varepsilon^{d-1}}{(\hbar c)^d}.$$
For photons in three dimensions ($d = 3$), including two polarization states, $g(\varepsilon) = \dfrac{V\varepsilon^2}{\pi^2\hbar^3 c^3}$. For phonons in 3D crystals, accounting for three acoustic branches (two transverse, one longitudinal), $g(\varepsilon) = \dfrac{3V\varepsilon^2}{2\pi^2\hbar^3 c^3}$.

3.5.2 Blackbody Radiation

The electromagnetic field in thermal equilibrium is modeled as a gas of non-interacting photons with zero chemical potential. Each mode of frequency $\omega$ behaves as a quantum harmonic oscillator with average energy $\hbar\omega/\left(e^{\hbar\omega/k_B T} - 1\right)$ (dropping the zero-point contribution). Using the photon density of states, the spectral energy density per unit volume is:
$$u(\omega, T) = \frac{\hbar}{\pi^2 c^3}\,\frac{\omega^3}{e^{\hbar\omega/k_B T} - 1}.$$
Integrating over all frequencies gives the total energy density
$$u = \frac{\pi^2 k_B^4}{15\,\hbar^3 c^3}\, T^4,$$
scaling as $T^4$ (Stefan-Boltzmann law). The pressure of the photon gas satisfies $P = u/3$, consistent with massless relativistic gases. The radiative energy flux emitted by a black surface relates to the energy density via $J = \frac{c}{4}u = \sigma T^4$, where $\sigma = \frac{\pi^2 k_B^4}{60\,\hbar^3 c^2}$ is the Stefan-Boltzmann constant. Historically, Planck's quantization hypothesis resolved the ultraviolet catastrophe of classical electrodynamics and laid the foundation for quantum theory. The cosmic microwave background (CMB), observed at $T \approx 2.725\,\mathrm{K}$, provides a cosmological validation of blackbody radiation theory.
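The Stefan-Boltzmann chain can be reproduced numerically: the dimensionless integral equals $\pi^4/15$, which fixes $\sigma$, and from $\sigma$ one recovers the familiar CMB energy density. A minimal sketch:

```python
import numpy as np
from scipy.integrate import quad

hbar = 1.054571817e-34  # J s
kB   = 1.380649e-23     # J/K
c    = 2.99792458e8     # m/s

# Dimensionless integral in the Stefan-Boltzmann law:
#   \int_0^inf x^3/(e^x - 1) dx = pi^4/15  (tail beyond x=50 is ~e^-50)
I, _ = quad(lambda x: x**3 / np.expm1(x), 1e-12, 50)
print(f"integral = {I:.6f}, pi^4/15 = {np.pi**4 / 15:.6f}")

# Stefan-Boltzmann constant sigma = pi^2 kB^4 / (60 hbar^3 c^2)
sigma = np.pi**2 * kB**4 / (60 * hbar**3 * c**2)
print(f"sigma = {sigma:.4e} W m^-2 K^-4")   # ~5.67e-8

# Energy density of the CMB at T = 2.725 K:  u = 4 sigma T^4 / c
u_cmb = 4 * sigma * 2.725**4 / c
print(f"CMB energy density ≈ {u_cmb:.2e} J/m^3")   # ~4e-14 J/m^3
```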

3.5.3 Lattice Vibration

In solids, lattice vibrations are quantized as phonons with linear dispersion $\omega = ck$. The Debye model approximates the crystal as an isotropic continuum with a cutoff frequency $\omega_D$ ensuring the correct total of $3N$ modes (for $N$ atoms in 3D):
$$\int_0^{\omega_D} g(\omega)\,\mathrm{d}\omega = 3N.$$
The internal energy is:
$$U = 9Nk_B T\left(\frac{T}{\Theta_D}\right)^3 \int_0^{\Theta_D/T} \frac{x^3}{e^x - 1}\,\mathrm{d}x,$$
where $\Theta_D = \hbar\omega_D/k_B$ is the Debye temperature. The heat capacity exhibits distinct limits:
  • At high temperatures ($T \gg \Theta_D$), $C_V \to 3Nk_B$ (Dulong-Petit law).

  • At low temperatures ($T \ll \Theta_D$), $C_V = \frac{12\pi^4}{5}Nk_B\left(\frac{T}{\Theta_D}\right)^3$, due to phonon modes behaving as massless bosons. In metals, the low-temperature specific heat combines electronic and phonon contributions: $C = \gamma T + \beta T^3$, where the linear term $\gamma T$ probes the Fermi surface and $\beta T^3$ reflects phonon properties. The Debye model succeeds by capturing the finite mode density and linear dispersion of acoustic phonons, while Einstein's model (single-frequency oscillators) fails at low $T$ due to its exponential suppression of $C_V$. A numerical evaluation of the Debye heat capacity across both limits follows.
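Here is the evaluation promised above: a sketch of the full Debye heat capacity, using the standard Debye temperature of copper ($\Theta_D \approx 343\,\mathrm{K}$) as an illustrative input; it reproduces the $T^3$ regime at low $T$ and the Dulong-Petit limit at high $T$:

```python
import numpy as np
from scipy.integrate import quad

def debye_cv(T, theta_D):
    """Debye heat capacity, returned as the dimensionless ratio C_V / (3 N k_B)."""
    x_max = theta_D / T
    # C_V = 9 N kB (T/theta)^3 \int_0^{theta/T} x^4 e^x / (e^x - 1)^2 dx
    I, _ = quad(lambda x: x**4 * np.exp(x) / np.expm1(x) ** 2, 1e-12, x_max)
    return 3 * (T / theta_D) ** 3 * I

theta_D = 343.0  # Debye temperature of copper in K (standard tabulated value)
for T in [10, 50, 343, 1000]:
    print(f"T = {T:4d} K: C_V / (3NkB) = {debye_cv(T, theta_D):.4f}")
# Low T follows (4 pi^4/5)(T/theta)^3; high T approaches 1 (Dulong-Petit).
```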



Note: Parts of the mathematical formatting and structural organization of this article were assisted by AI models.
