Everything else (uncertainty principle, eigenstate properties, orthogonality, etc.) can be mathematically derived from these postulates. This page shows both postulates and their derived consequences.
What is the fundamental difference between classical and quantum mechanics?
In classical mechanics, we describe particles as having definite positions and momenta at all times. We describe their motion using Newton’s laws: \(\mathbf{F} = m\mathbf{a}\).
In quantum mechanics, we describe particles by wavefunctions \(\psi(\mathbf{r}, t)\) that encode probabilities. We see that the particle doesn’t have a definite position until measured. Instead, we find that \(|\psi(\mathbf{r}, t)|^2\) gives the probability density of finding the particle at position \(\mathbf{r}\) at time \(t\).
Why do we need quantum mechanics?
We observe that classical mechanics fails at atomic scales. At these scales:
We see that particles exhibit wave-like behavior (diffraction, interference)
We find that energy is quantized (only certain discrete values allowed)
We cannot simultaneously know position and momentum precisely (uncertainty principle)
We observe that particles can tunnel through barriers classically forbidden
Quantum mechanics is the framework that correctly describes what we observe in nature at atomic and subatomic scales.
Louis de Broglie (1924) proposed that all matter has wave properties. We see that a particle with momentum \(p\) has an associated wavelength:
\[\lambda = \frac{h}{p}\]
where \(h = 6.626 \times 10^{-34}\) J·s is Planck’s constant.
Physical meaning: We observe that higher momentum gives shorter wavelength and more particle-like behavior. Lower momentum gives longer wavelength and more wave-like behavior.
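To get a feel for the numbers, here is a minimal Python sketch of the de Broglie relation; the particle parameters (a 100 eV electron and a thrown baseball) are illustrative choices, not values from the text above.

```python
import numpy as np

h = 6.626e-34  # Planck's constant (J*s)

# Electron accelerated through ~100 V: p = sqrt(2*m*E)
m_e = 9.109e-31            # electron mass (kg)
E_e = 100 * 1.602e-19      # kinetic energy (J)
p_e = np.sqrt(2 * m_e * E_e)

# A baseball (illustrative: 0.145 kg at 40 m/s)
p_ball = 0.145 * 40

for name, p in [("electron (100 eV)", p_e), ("baseball", p_ball)]:
    lam = h / p  # de Broglie wavelength
    print(f"{name}: lambda = {lam:.3e} m")
# The electron's wavelength (~1e-10 m) is comparable to atomic spacings,
# while the baseball's (~1e-34 m) is far too small to ever observe.
```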
We see that in the Hamiltonian \(\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\), the first term is the kinetic energy and the second is the potential energy.
What does the wavefunction mean physically?
We observe that the wavefunction \(\psi(\mathbf{r}, t)\) itself is not directly observable. We find that its squared magnitude gives the probability density:
\[P(\mathbf{r}, t) = |\psi(\mathbf{r}, t)|^2\]
This means: the probability of finding the particle in a small volume \(dV\) around position \(\mathbf{r}\) is \(|\psi(\mathbf{r}, t)|^2 dV\).
We require that the wavefunction be normalized: \[\int |\psi(\mathbf{r}, t)|^2 \, dV = 1\] so that the total probability of finding the particle somewhere in space is 1.
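As a concrete check of the normalization condition, the short sketch below normalizes a trial wavefunction numerically; the Gaussian form and the grid are illustrative assumptions, not taken from the text above.

```python
import numpy as np

# Illustrative example: an (unnormalized) Gaussian wavefunction in 1D
x = np.linspace(-10, 10, 2001)
sigma = 1.5
psi = np.exp(-x**2 / (4 * sigma**2))   # unnormalized trial function

# Normalize so that the integral of |psi|^2 dx equals 1
norm = np.sqrt(np.trapz(np.abs(psi)**2, x))
psi_normalized = psi / norm

total_probability = np.trapz(np.abs(psi_normalized)**2, x)
print(f"Total probability = {total_probability:.6f}")   # ~1.000000

# Probability of finding the particle in a small interval around x = 0
mask = (x > -0.1) & (x < 0.1)
print(f"P(-0.1 < x < 0.1) = {np.trapz(np.abs(psi_normalized[mask])**2, x[mask]):.4f}")
```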
We see one of the most fundamental principles in quantum mechanics: if \(\psi_1\) and \(\psi_2\) are solutions to the Schrödinger equation, then any linear combination is also a solution:
\[\psi = c_1 \psi_1 + c_2 \psi_2\]
where \(c_1\) and \(c_2\) are complex constants.
Physical meaning: We observe that a quantum system can exist in a superposition of multiple states simultaneously. The particle isn’t “in state 1” or “in state 2” — we see it’s in both at once until we measure it.
Why does superposition work?
We note that the Schrödinger equation is linear in \(\psi\). If \(\psi_1\) and \(\psi_2\) each satisfy \(i\hbar\,\partial\psi/\partial t = \hat{H}\psi\), then so does any linear combination of them.
We see that the superposition \(\psi = c_1\psi_1 + c_2\psi_2\) also satisfies the Schrödinger equation!
What does this mean for measurements?
If we have the system in a superposition \(\psi = c_1 \psi_1 + c_2 \psi_2\) where \(\psi_1\) and \(\psi_2\) are eigenstates with eigenvalues \(E_1\) and \(E_2\):
We find that measuring the energy will give either \(E_1\) or \(E_2\) (not both, not an average!)
We calculate the probability of getting \(E_1\): \(|c_1|^2/(|c_1|^2 + |c_2|^2)\)
We calculate the probability of getting \(E_2\): \(|c_2|^2/(|c_1|^2 + |c_2|^2)\)
This is what we observe as the essence of quantum mechanics: superposition before measurement, definite outcome after measurement.
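A minimal sketch of these probability rules, using made-up coefficients \(c_1\) and \(c_2\) (the state need not be normalized in advance):

```python
import numpy as np

# Illustrative coefficients for psi = c1*psi_1 + c2*psi_2 (not from the text)
c1, c2 = 1.0 + 0.5j, 0.8 - 0.3j

norm_sq = abs(c1)**2 + abs(c2)**2
P1 = abs(c1)**2 / norm_sq   # probability of measuring E_1
P2 = abs(c2)**2 / norm_sq   # probability of measuring E_2

print(f"P(E1) = {P1:.3f}, P(E2) = {P2:.3f}, sum = {P1 + P2:.3f}")
```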
Famous examples:
Electron going through double slits (we see superposition of “through left slit” and “through right slit”)
Schrödinger’s cat (superposition of “alive” and “dead” — though this is a thought experiment!)
Quantum computing: we work with qubits in superposition of |0⟩ and |1⟩
Where does the kinetic energy operator come from?
We see that the kinetic energy operator comes from quantum mechanical postulates and the correspondence principle.
Classical mechanics: We write kinetic energy as \(T = \frac{p^2}{2m}\) where \(p\) is momentum.
Quantum mechanics postulate: We have that physical observables become operators. Position \(\mathbf{r}\) stays as multiplication, but we find that momentum becomes a derivative:
\[\hat{\mathbf{p}} = -i\hbar \nabla\]
Why this form? We see that it ensures the de Broglie relation \(\lambda = h/p\) is satisfied.
First, we connect de Broglie to wave vector k:
We start from de Broglie’s relation:
\[\lambda = \frac{h}{p}\]
We use the definition of wave number \(k = 2\pi/\lambda\):
\[k = \frac{2\pi}{\lambda} = \frac{2\pi p}{h}\]
We use \(\hbar = h/(2\pi)\) (so \(h = 2\pi\hbar\)): this gives \(p = \hbar k\). Substituting \(\hat{\mathbf{p}} = -i\hbar\nabla\) into \(T = p^2/(2m)\) then gives the kinetic energy operator
\[\hat{T} = \frac{\hat{\mathbf{p}}^2}{2m} = -\frac{\hbar^2}{2m}\nabla^2\]
where \(\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\) is the Laplacian.
Physical meaning: We observe that the kinetic energy depends on the curvature of the wavefunction. We see that more oscillations (higher \(k\)) give higher kinetic energy, consistent with \(E = p^2/(2m) = \hbar^2 k^2/(2m)\).
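We can verify both statements symbolically. The sketch below (assuming SymPy is available; it is not used elsewhere on this page) applies \(\hat{p} = -i\hbar\,\partial/\partial x\) and \(\hat{T} = -\frac{\hbar^2}{2m}\,\partial^2/\partial x^2\) to a plane wave \(e^{ikx}\) and recovers \(p = \hbar k\) and \(E = \hbar^2 k^2/(2m)\):

```python
import sympy as sp

x, k, hbar, m = sp.symbols('x k hbar m', positive=True)

# Plane wave with wave number k (de Broglie: lambda = 2*pi/k)
psi = sp.exp(sp.I * k * x)

# Momentum operator in the position representation: p = -i*hbar*d/dx
p_psi = -sp.I * hbar * sp.diff(psi, x)
print(sp.simplify(p_psi / psi))   # -> hbar*k, i.e. p = h/lambda as de Broglie requires

# Kinetic energy operator T = -hbar^2/(2m) d^2/dx^2
T_psi = -hbar**2 / (2 * m) * sp.diff(psi, x, 2)
print(sp.simplify(T_psi / psi))   # -> hbar**2*k**2/(2*m)
```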
We observe an attractive force between the electron and the nucleus. Solution: atomic orbitals with quantized energies.
Key insight: We see that the potential \(V(\mathbf{r})\) determines:
What regions the particle can access (classically forbidden if \(E < V\))
The shape of allowed wavefunctions
The quantized energy levels for bound states
The forces acting on the particle: \(\mathbf{F} = -\nabla V\)
Quantum tunneling: Unlike classical mechanics, we observe that particles can penetrate into regions where \(E < V\). We see that the wavefunction decays exponentially in these classically forbidden regions but doesn’t vanish completely.
We see that this works! The time dependence \(e^{-iEt/\hbar}\) ensures the Schrödinger equation is satisfied when the Hamiltonian acting on the spatial part gives \(E\psi\).
Physical meaning: We observe that the phase rotates at frequency \(\omega = E/\hbar\). We see that higher energy gives faster rotation in the complex plane.
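A quick numerical illustration of this phase rotation, using an arbitrarily chosen energy of 1 eV: the phase factor \(e^{-iEt/\hbar}\) rotates in the complex plane, but \(|\psi|^2\) is unchanged, which is why these states are called stationary.

```python
import numpy as np

hbar = 1.0545718e-34   # J*s
E = 1.602e-19          # illustrative energy: 1 eV (assumption, not from the text)
omega = E / hbar       # phase rotation frequency

t = np.linspace(0, 2 * np.pi / omega, 5)   # one full rotation period
phase = np.exp(-1j * E * t / hbar)

# The phase rotates through 1, -i, -1, +i, 1 ...
print(np.round(phase, 3))
# ... but |e^{-iEt/hbar}|^2 = 1, so |psi|^2 is constant in time:
print(np.round(np.abs(phase)**2, 12))
```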
We just saw that free particle solutions are plane waves \(\psi = e^{i(kx - \omega t)}\). But we encounter a fundamental problem: plane waves are not normalizable!
We see that a plane wave exists with equal probability everywhere in space (\(|\psi|^2 = 1\) at all \(x\)). We observe that this violates normalization and doesn’t represent a localized particle.
Physical problem:
We find that a plane wave has definite momentum \(p = \hbar k\) but completely undefined position (\(\Delta x = \infty\))
We see this is consistent with uncertainty (\(\Delta x \cdot \Delta p \geq \hbar/2\) with \(\Delta p = 0 \Rightarrow \Delta x = \infty\))
But we observe that real particles are localized somewhere!
What is a wave packet?
We construct a wave packet as a superposition of plane waves with different momenta, built to be localized in space:
\[\psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} A(k)\, e^{i(kx - \omega(k) t)}\, dk\]
Here, we have \(A(k)\) as the amplitude distribution in momentum space.
Key idea: We see that by combining waves with slightly different wavelengths (momenta), we create constructive interference in one region (localized particle) and destructive interference elsewhere.
Physical interpretation:
We find that \(|A(k)|^2\) gives the probability distribution in momentum space
We see that \(|\psi(x,t)|^2\) gives the probability distribution in position space
We observe that the packet represents a particle that is localized but not perfectly defined in position or momentum
We see that this matches the classical momentum-velocity relation! We observe that quantum mechanics reproduces classical motion for the center of wave packets.
Key insight: As the position-space packet spreads (\(\sigma(t)\) increases), the momentum-space width stays constant!
Physical meaning: Spreading doesn’t change the momentum distribution – it’s determined by the initial conditions. The packet spreads because different momenta correspond to different velocities.
Left panel: We see that the momentum distribution \(|A(k)|^2\) is Gaussian, centered at \(k_0\) with width \(\Delta k = 1/(2\sigma)\). We observe that this distribution doesn’t change as the packet spreads!
Right panel: We find the uncertainty product \(\Delta x \cdot \Delta p\) for Gaussian packets. We see that the minimum value \(\hbar/2\) (red line) is achieved, making Gaussians the most localized possible states.
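A short sketch of these statements for a free Gaussian packet, using the standard spreading formula \(\sigma(t) = \sigma_0\sqrt{1 + (\hbar t / 2m\sigma_0^2)^2}\); the electron mass and the initial width \(\sigma_0 = 0.1\) nm are illustrative choices.

```python
import numpy as np

hbar = 1.0545718e-34
m = 9.109e-31          # electron mass (kg) -- illustrative choice
sigma0 = 1e-10         # initial position width: 0.1 nm (assumption)

# Standard results for a free Gaussian packet:
#   sigma(t) = sigma0 * sqrt(1 + (hbar*t / (2*m*sigma0**2))**2)
#   Delta p  = hbar / (2*sigma0)   (constant in time)
t = np.array([0, 1e-16, 1e-15, 1e-14])   # seconds
sigma_t = sigma0 * np.sqrt(1 + (hbar * t / (2 * m * sigma0**2))**2)
delta_p = hbar / (2 * sigma0)

for ti, si in zip(t, sigma_t):
    print(f"t = {ti:.0e} s:  Delta x = {si:.2e} m,  "
          f"Delta x * Delta p = {si * delta_p / hbar:.3f} hbar")
# At t = 0 the product is the minimum hbar/2; it grows as the packet spreads,
# while Delta p itself never changes.
```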
The Coulomb potential energy of the electron in the field of the proton is
\[V(r) = -\frac{k e^2}{r}\]
where \(r\) is the distance between electron and proton, and \(k = 1/(4\pi\epsilon_0) \approx 9 \times 10^9\) N·m²/C².
Key features:
\(V(r) \to 0\) as \(r \to \infty\) (free electron reference)
\(V(r) \to -\infty\) as \(r \to 0\) (strong attraction at nucleus)
Spherically symmetric: \(V\) depends only on \(r = |\vec{r}|\), not on direction
Simplification: Treat the nucleus as infinitely heavy (fixed at origin). More accurately, use the reduced mass \(\mu = m_e m_p/(m_e + m_p) \approx m_e\).
In Cartesian coordinates, \(\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2\). But this is a nightmare to solve!
Key insight: The potential has spherical symmetry, so use spherical coordinates \((r, \theta, \phi)\).
This looks like a 1D Schrödinger equation with effective potential \(V_{\text{eff}}(r)\)!
Boundary conditions:
\(u(0) = 0\) (wavefunction finite at origin)
\(u(\infty) = 0\) (bound state, normalizable)
Energy quantization: The principal quantum number
Solving the radial equation
The radial equation can be solved using power series methods (Frobenius method). The requirement that \(u(r) \to 0\) as \(r \to \infty\) quantizes the energy!
Result: Bound states exist only for discrete energies:
\[E_n = -\frac{13.6\ \text{eV}}{n^2}, \quad n = 1, 2, 3, \ldots\]
Radial nodes: Number of nodes = \(n - l - 1\) (where \(R_{nl} = 0\))
- 1s has 0 nodes, 2s has 1 node, 3s has 2 nodes, etc.
Most probable radius: Peak of \(P(r)\) (red) shows where electron is most likely found
- For 1s: \(r_{\text{max}} = a_0\) (the Bohr radius!)
- Generally increases with \(n\) (higher energy → larger orbit)
Centrifugal barrier: Higher \(l\) → electron pushed away from nucleus (\(R \propto r^l\) near origin)
Exponential decay: All states decay as \(e^{-r/(na_0)}\) at large \(r\)
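We can check the node-count rule and the structure of the radial functions numerically. The sketch below builds \(R_{nl}(r) \propto r^l\, e^{-r/(na_0)}\, L_{n-l-1}^{2l+1}(2r/(na_0))\) (the standard form, written in units of the Bohr radius) and counts sign changes:

```python
import numpy as np
from scipy.special import genlaguerre

a0 = 1.0  # work in units of the Bohr radius

def R_nl(n, l, r):
    """Hydrogen radial wavefunction (unnormalized; fine for counting nodes)."""
    rho = 2 * r / (n * a0)
    return rho**l * np.exp(-rho / 2) * genlaguerre(n - l - 1, 2 * l + 1)(rho)

r = np.linspace(1e-6, 60, 20000)
for n, l, label in [(1, 0, '1s'), (2, 0, '2s'), (3, 0, '3s'), (2, 1, '2p'), (3, 2, '3d')]:
    R = R_nl(n, l, r)
    nodes = np.sum(np.sign(R[:-1]) != np.sign(R[1:]))  # sign changes = radial nodes
    print(f"{label}: radial nodes = {nodes} (expected n - l - 1 = {n - l - 1})")
```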
Let me create comprehensive orbital visualizations:
```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.special import sph_harm

# Define spherical harmonics squared (probability densities)
def Y_00(theta, phi):
    return np.ones_like(theta) / np.sqrt(4 * np.pi)

def Y_10(theta, phi):
    return np.sqrt(3 / (4 * np.pi)) * np.cos(theta)

def Y_1m1(theta, phi):
    return np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi)

def Y_1p1(theta, phi):
    return -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi)

def Y_20(theta, phi):
    return np.sqrt(5 / (16 * np.pi)) * (3 * np.cos(theta)**2 - 1)

fig = plt.figure(figsize=(16, 12))

# Create grid in spherical coordinates
theta = np.linspace(0, np.pi, 100)
phi = np.linspace(0, 2 * np.pi, 100)
theta_grid, phi_grid = np.meshgrid(theta, phi)

# Orbital data: (Y_func, title, subplot_pos)
orbitals = [
    (Y_00, '1s (l=0, m=0)', 1),
    (Y_10, '2pz (l=1, m=0)', 2),
    (lambda t, p: Y_1m1(t, p) - Y_1p1(t, p), '2px (l=1, mx)', 3),
    (lambda t, p: 1j * (Y_1m1(t, p) + Y_1p1(t, p)), '2py (l=1, my)', 4),
    (Y_20, '3dz² (l=2, m=0)', 5),
]

for Y_func, title, pos in orbitals:
    ax = fig.add_subplot(2, 3, pos, projection='3d')

    # Calculate angular function
    Y = Y_func(theta_grid, phi_grid)
    Y_abs = np.abs(Y)

    # Normalize for visualization
    Y_abs_norm = Y_abs / Y_abs.max()

    # Convert to Cartesian (radius determined by |Y|²)
    r = Y_abs_norm
    x = r * np.sin(theta_grid) * np.cos(phi_grid)
    y = r * np.sin(theta_grid) * np.sin(phi_grid)
    z = r * np.cos(theta_grid)

    # Color by sign of real part
    colors = np.real(Y)

    # Plot surface
    surf = ax.plot_surface(x, y, z,
                           facecolors=plt.cm.seismic(colors / colors.max()),
                           alpha=0.8, linewidth=0, antialiased=True, shade=True)

    # Formatting
    ax.set_xlabel('x', fontsize=24)
    ax.set_ylabel('y', fontsize=24)
    ax.set_zlabel('z', fontsize=24)
    ax.set_title(title, fontsize=26, weight='bold')
    ax.tick_params(axis='both', labelsize=22)

    # Set equal aspect ratio
    max_range = 1.0
    ax.set_xlim([-max_range, max_range])
    ax.set_ylim([-max_range, max_range])
    ax.set_zlim([-max_range, max_range])
    ax.set_box_aspect([1, 1, 1])

# Add explanation panel
ax = fig.add_subplot(2, 3, 6)
ax.axis('off')
text = ("Orbital visualization notes:\n\n"
        "• Surface shows |Y(θ,φ)|²\n"
        "• Color: blue (+), red (−) phase\n"
        "• s orbitals: spherical\n"
        "• p orbitals: dumbbell shape\n"
        "• d orbitals: complex lobes\n\n"
        "Quantum numbers:\n"
        "• l = 0: s (1 orbital)\n"
        "• l = 1: p (3 orbitals)\n"
        "• l = 2: d (5 orbitals)\n"
        "• l = 3: f (7 orbitals)\n\n"
        "Each n contains orbitals\n"
        "for l = 0, 1, ..., n−1")
ax.text(0.1, 0.5, text, fontsize=22, family='monospace',
        verticalalignment='center',
        bbox=dict(boxstyle='round', facecolor='lightyellow', alpha=0.9))

plt.tight_layout()
```
s orbitals (\(l=0\)): Spherically symmetric, no angular dependence
p orbitals (\(l=1\)): Dumbbell-shaped, three orientations (px, py, pz)
- \(p_z\) points along z-axis
- \(p_x, p_y\) point along x, y axes (linear combinations of \(m = \pm 1\))
d orbitals (\(l=2\)): More complex lobes, five orientations
- \(d_{z^2}\): Special shape along z-axis
- \(d_{xy}, d_{xz}, d_{yz}, d_{x^2-y^2}\): Four-lobed patterns
Phase/sign: Color indicates sign of wavefunction (important for bonding!)
\[\boxed{\mu = \hbar m, \quad m = -l, -l+1, \ldots, l-1, l}\]
Key results:
Total angular momentum: \(L^2\) has eigenvalues \(\hbar^2 l(l+1)\), not \(\hbar^2 l^2\)!
Quantization: From the commutation relations alone, the quantum number can be integer or half-integer (0, 1/2, 1, 3/2, 2, …); orbital angular momentum realizes only the integer values of \(l\), while the half-integer values appear for spin
z-component: \(L_z = m\hbar\) where \(m\) ranges from \(-l\) to \(+l\) in integer steps
Degeneracy: For each \(l\), there are \(2l+1\) possible values of \(m\)
Why \(l(l+1)\) and not \(l^2\)?
The magnitude of angular momentum is:
\[|\vec{L}| = \hbar\sqrt{l(l+1)}\]
This is always larger than \(|L_z|_{\max} = \hbar l\). Why?
Because \(L_x\) and \(L_y\) are uncertain! The angular momentum vector cannot point exactly along the z-axis. It “precesses” around the z-axis with uncertain \(x\) and \(y\) components.
```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(14, 6))

# Left panel: Vector model for l=2
ax1 = fig.add_subplot(121, projection='3d')
l = 2
L_magnitude = np.sqrt(l * (l + 1))

# Standard color palette for m quantum numbers
standard_colors = ['#C73E1D', '#F18F01', '#6A994E', '#2E86AB', '#A23B72']
colors_map = (standard_colors[:2 * l + 1] if 2 * l + 1 <= len(standard_colors)
              else plt.cm.viridis(np.linspace(0, 1, 2 * l + 1)))

for i, m in enumerate(range(-l, l + 1)):
    L_z = m
    # L must have magnitude sqrt(l(l+1)), with z-component = m
    # So L_perp = sqrt(l(l+1) - m^2)
    L_perp = np.sqrt(L_magnitude**2 - L_z**2)

    # Draw cone at this L_z
    theta_cone = np.linspace(0, 2 * np.pi, 50)
    x_cone = L_perp * np.cos(theta_cone)
    y_cone = L_perp * np.sin(theta_cone)
    z_cone = np.full_like(theta_cone, L_z)
    ax1.plot(x_cone, y_cone, z_cone, color=colors_map[i], linewidth=2, alpha=0.7)

    # Draw one example vector
    angle = i * np.pi / 3
    x_vec = L_perp * np.cos(angle)
    y_vec = L_perp * np.sin(angle)
    ax1.quiver(0, 0, 0, x_vec, y_vec, L_z, color=colors_map[i],
               arrow_length_ratio=0.15, linewidth=2.5, alpha=0.9)

    # Label m value
    label_r = L_perp + 0.3
    ax1.text(label_r, 0, L_z, f'm={m}', fontsize=22, weight='bold', color=colors_map[i])

# Draw z-axis
ax1.plot([0, 0], [0, 0], [-l - 0.5, l + 0.5], 'k--', linewidth=2, alpha=0.5)
ax1.text(0, 0, l + 0.7, '$L_z$', fontsize=26, weight='bold')

# Formatting
ax1.set_xlabel('$L_x$ (uncertain)', fontsize=24, weight='bold')
ax1.set_ylabel('$L_y$ (uncertain)', fontsize=24, weight='bold')
ax1.set_zlabel('$L_z$ (measured)', fontsize=24, weight='bold')
ax1.set_title(f'Angular momentum vector model\n'
              f'$l={l}$, $|\\vec{{L}}| = \\hbar\\sqrt{{{l}({l}+1)}} = {L_magnitude:.2f}\\hbar$',
              fontsize=26, weight='bold')
ax1.set_xlim([-3, 3])
ax1.set_ylim([-3, 3])
ax1.set_zlim([-3, 3])
ax1.tick_params(axis='both', labelsize=22)

# Right panel: Energy level diagram showing degeneracy
ax2 = fig.add_subplot(122)
l_values = [0, 1, 2, 3]
for l in l_values:
    y_pos = l
    # Draw the l level
    ax2.hlines(y_pos, 0, 2 * l + 1, colors='black', linewidth=3, alpha=0.5)
    ax2.text(-0.5, y_pos, f'$l={l}$', fontsize=26, weight='bold',
             verticalalignment='center', horizontalalignment='right')

    # Draw each m state
    for i, m in enumerate(range(-l, l + 1)):
        x_pos = i + 0.5
        state_color = colors_map[i] if l == 2 else '#2E86AB'
        ax2.plot(x_pos, y_pos, 'o', markersize=20, color=state_color, alpha=0.8)
        ax2.text(x_pos, y_pos - 0.15, f'{m}', fontsize=22,
                 horizontalalignment='center', verticalalignment='top')

    # Label degeneracy
    ax2.text(2 * l + 1.5, y_pos, f'{2*l+1} states', fontsize=22, style='italic',
             verticalalignment='center', color='darkred')

ax2.set_xlabel('Magnetic quantum number $m$', fontsize=26, weight='bold')
ax2.set_ylabel('Angular momentum quantum number $l$', fontsize=26, weight='bold')
ax2.set_title('Degeneracy: $(2l+1)$ states per $l$', fontsize=26, weight='bold')
ax2.set_xlim([-1, 8])
ax2.set_ylim([-0.5, 3.5])
ax2.tick_params(axis='both', labelsize=22)
ax2.grid(alpha=0.3, axis='y')
ax2.set_xticks([])

plt.tight_layout()
```
Left panel: The “vector model” of angular momentum for \(l=2\). Each cone represents a possible \(m\) state. The angular momentum vector has definite length \(|\vec{L}| = \hbar\sqrt{6}\) and definite z-component \(L_z = m\hbar\), but \(L_x\) and \(L_y\) are uncertain, so the vector “precesses” around the z-axis.
Right panel: For each \(l\), there are \(2l+1\) degenerate states (different \(m\) values). This degeneracy is lifted by magnetic fields (Zeeman effect).
In the hydrogen atom, we found that the angular wavefunctions are spherical harmonics \(Y_l^m(\theta, \phi)\). These are precisely the eigenfunctions of \(\hat{L}^2\) and \(\hat{L}_z\)!
Suppose we have two angular momenta \(\vec{L}_1\) and \(\vec{L}_2\) (e.g., two electrons). The total angular momentum is:
\[\vec{J} = \vec{L}_1 + \vec{L}_2\]
Question: If we know the quantum numbers \((l_1, m_1)\) and \((l_2, m_2)\), what are the possible values of \((j, m_j)\) for the total angular momentum?
Quantum addition is weird
In classical physics: If \(|\vec{L}_1| = l_1\hbar\) and \(|\vec{L}_2| = l_2\hbar\), then the magnitude of the total can take any value between \(|l_1 - l_2|\hbar\) and \((l_1 + l_2)\hbar\), depending on the relative orientation of the two vectors. In quantum mechanics, only the discrete values \(j = |l_1 - l_2|, |l_1 - l_2| + 1, \ldots, l_1 + l_2\) are allowed.
Key insight: The beam splits into exactly two components, proving that electron spin has only two possible values of \(S_z\): \(+\hbar/2\) (spin up) and \(-\hbar/2\) (spin down).
The spin precesses around the z-axis at frequency \(\omega_0\)! This is the basis for Nuclear Magnetic Resonance (NMR) and MRI.
```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(14, 6))

# Left panel: Spin precession
ax1 = fig.add_subplot(121, projection='3d')

# Time points
t_vals = np.linspace(0, 2 * np.pi, 50)
omega0 = 1

# Spin expectation values
Sx = 0.5 * np.cos(omega0 * t_vals)
Sy = 0.5 * np.sin(omega0 * t_vals)
Sz = np.zeros_like(t_vals)

# Plot trajectory
ax1.plot(Sx, Sy, Sz, '#2E86AB', linewidth=3, alpha=0.7)

# Plot vectors at several time points
for i in [0, 10, 20, 30, 40]:
    ax1.quiver(0, 0, 0, Sx[i], Sy[i], Sz[i], color='#C73E1D',
               arrow_length_ratio=0.2, linewidth=2.5, alpha=0.8)

# Magnetic field direction
ax1.quiver(0, 0, 0, 0, 0, 1, color='#6A994E', arrow_length_ratio=0.15,
           linewidth=4, alpha=0.9, label='$\\vec{B} = B_0\\hat{z}$')

# Formatting
ax1.set_xlabel('$\\langle S_x \\rangle$ (ℏ/2)', fontsize=24, weight='bold')
ax1.set_ylabel('$\\langle S_y \\rangle$ (ℏ/2)', fontsize=24, weight='bold')
ax1.set_zlabel('$\\langle S_z \\rangle$ (ℏ/2)', fontsize=24, weight='bold')
ax1.set_title('Spin precession in magnetic field\n$\\omega_0 = \\gamma B_0$',
              fontsize=26, weight='bold')
ax1.set_xlim([-0.7, 0.7])
ax1.set_ylim([-0.7, 0.7])
ax1.set_zlim([-0.7, 0.7])
ax1.tick_params(axis='both', labelsize=22)
ax1.legend(fontsize=22)

# Right panel: Time evolution of components
ax2 = fig.add_subplot(122)
t_plot = np.linspace(0, 4 * np.pi, 200)
Sx_plot = 0.5 * np.cos(omega0 * t_plot)
Sy_plot = 0.5 * np.sin(omega0 * t_plot)

ax2.plot(t_plot, Sx_plot, '#C73E1D', linewidth=3, label='$\\langle S_x \\rangle / (\\hbar/2)$')
ax2.plot(t_plot, Sy_plot, '#2E86AB', linewidth=3, label='$\\langle S_y \\rangle / (\\hbar/2)$')
ax2.axhline(0, color='gray', linestyle='--', linewidth=2, alpha=0.5)

ax2.set_xlabel('Time $\\omega_0 t$', fontsize=26, weight='bold')
ax2.set_ylabel('Spin expectation value', fontsize=26, weight='bold')
ax2.set_title('Time evolution of spin components', fontsize=26, weight='bold')
ax2.legend(fontsize=22, loc='upper right')
ax2.grid(alpha=0.3)
ax2.tick_params(axis='both', labelsize=22)
ax2.set_xticks([0, np.pi, 2 * np.pi, 3 * np.pi, 4 * np.pi])
ax2.set_xticklabels(['0', 'π', '2π', '3π', '4π'])

plt.tight_layout()
```
In the electron’s rest frame, the nucleus appears to orbit, creating a magnetic field:
\[\vec{B} \propto \vec{L}\]
This field interacts with the electron’s magnetic moment \(\vec{\mu} \propto \vec{S}\):
\[H_{SO} = \xi(r) \vec{L} \cdot \vec{S}\]
where \(\xi(r)\) depends on the radial wavefunction.
Total angular momentum
Spin-orbit coupling means \(\vec{L}\) and \(\vec{S}\) are not separately conserved, but their sum is:
\[\vec{J} = \vec{L} + \vec{S}\]
The good quantum numbers are \((n, l, s, j, m_j)\) where:
\[j = l \pm \frac{1}{2} \quad \text{(for single electron)}\]
Fine structure in hydrogen
Spin-orbit coupling splits the \(n=2, l=1\) state (\(2p\)) into:
\(2p_{1/2}\): \(j = 1/2\) (2 states)
\(2p_{3/2}\): \(j = 3/2\) (4 states)
This creates the famous fine structure in atomic spectra, with splitting \(\Delta E \propto \alpha^2\) (where \(\alpha \approx 1/137\) is the fine structure constant).
We have seen how quantum mechanics describes particles using wavefunctions and the Schrödinger equation. But to truly understand quantum systems, we need a deeper mathematical framework. The concepts we develop next (eigenstates, operators, measurements) form the foundation that applies to every quantum system, whether it is an electron in an atom, a photon in a cavity, or a qubit in a quantum computer. These mathematical tools will allow us to solve any quantum problem systematically.
Before we dive deeper into operators and measurements, we need to understand a crucial point: quantum states can be represented in multiple equivalent ways. The physics is the same, but the mathematical notation differs. This often confuses students, so let’s clarify the relationships between these representations.
The notation: \(|\psi\rangle\) represents a quantum state as an abstract vector in Hilbert space.
What it means: Think of \(|\psi\rangle\) as an arrow in an infinite dimensional vector space. It doesn’t refer to any specific basis (position, momentum, energy, etc.). It is the pure, abstract quantum state itself.
Operations:
Inner product: \(\langle \phi | \psi \rangle\) (overlap between states, gives a complex number)
Outer product: \(|\psi\rangle\langle\phi|\) (gives an operator)
Action of operator: \(\hat{A}|\psi\rangle\) (gives another state)
Advantages:
Basis independent (coordinate free)
Makes symmetries and general principles clear
Compact notation for complex calculations
When to use: Abstract derivations, general theorems, operator algebra
Harmonic oscillator: Energy basis (ladder operators)
Particle in a box: Position representation (boundary conditions)
Quantum computing: Matrix representation (2×2 matrices for qubits)
The physics doesn’t change:
\(\langle \psi | \hat{A} | \psi \rangle\) is the same number regardless of representation
Eigenvalues are the same (these are measurable quantities!)
Transition probabilities \(|\langle \phi | \psi \rangle|^2\) are the same
You can switch representations anytime: If you get stuck in one representation, try another! Fourier transforming between position and momentum is often useful.
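As an illustration of switching representations, the sketch below (all parameter values are arbitrary) Fourier transforms a Gaussian packet from the position to the momentum representation and checks that the norm is preserved and that the momentum distribution is centered at \(\hbar k_0\):

```python
import numpy as np

# Illustrative: position -> momentum representation via a discrete Fourier
# transform, in hbar = 1 units.
N, L = 2048, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]

sigma, k0 = 1.0, 5.0
psi_x = (2 * np.pi * sigma**2)**-0.25 * np.exp(-x**2 / (4 * sigma**2) + 1j * k0 * x)

# Momentum-space wavefunction (FFT with the continuum normalization factor)
k = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d=dx))
psi_k = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi_x))) * dx / np.sqrt(2 * np.pi)

# Both representations carry the same physics
print("norm in x:", np.trapz(np.abs(psi_x)**2, x))   # ~1
print("norm in k:", np.trapz(np.abs(psi_k)**2, k))   # ~1
print("<k> =", np.trapz(k * np.abs(psi_k)**2, k))    # ~k0 = 5
```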
Bottom line: These are all equivalent ways to describe quantum mechanics. Master them all, and you can tackle any problem in the representation that makes it simplest!
In quantum mechanics, physical observables (energy, momentum, position) are represented by Hermitian operators. An operator \(\hat{A}\) is Hermitian if it equals its own adjoint: \(\hat{A}^\dagger = \hat{A}\).
Convention: Operators are applied to the state on their right. In \(\langle \psi | \hat{A} \phi \rangle\), the operator \(\hat{A}\) acts on \(|\phi\rangle\) (to the right), giving \(\hat{A}|\phi\rangle\), and then we take the inner product with \(\langle \psi |\).
Why must observables be Hermitian?
Because measurements must yield real values! Hermitian operators guarantee:
All eigenvalues are real (proven below)
Eigenstates with different eigenvalues are orthogonal (proven below)
These properties are essential for quantum mechanics to make sense physically.
Proving \(\langle \phi | A \psi \rangle = \langle A \phi | \psi \rangle\)
This property is the key to proving that Hermitian operators have real eigenvalues. Let’s see why this equation holds using matrix form, which is how we actually work with quantum systems in practice (finite basis sets, computational calculations, etc.).
The matrix definition of Hermitian
A Hermitian matrix \(A\) satisfies:
\[A = A^\dagger = (A^T)^*\]
This means taking the transpose and complex conjugate gives you back the same matrix.
Starting with the inner product
For two column vectors \(\phi\) and \(\psi\), their inner product is:
\[\langle \phi | \psi \rangle = \phi^\dagger \psi = \sum_i \phi_i^* \psi_i\]
When you apply operator \(A\) to the right vector:
\[\langle \phi | A \psi \rangle = \phi^\dagger A \psi\]
The algebraic manipulation
Take the complex conjugate of this entire expression:
\[(\langle \phi | A \psi \rangle)^* = (\phi^\dagger A \psi)^* = \psi^\dagger A^\dagger \phi\]
Here’s where the Hermitian property becomes useful. Since \(A^\dagger = A\):
\[(\langle \phi | A \psi \rangle)^* = \psi^\dagger A \phi = \langle \psi | A \phi \rangle\]
Taking the complex conjugate one more time:
\[\langle \phi | A \psi \rangle = \langle A \phi | \psi \rangle\]
What this means: You can “move” the operator \(A\) across the inner product from the right side to the left side. This algebraic property is exactly what we’ll use below to prove that eigenvalues must be real.
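A small numerical check of this property (a randomly generated 4×4 Hermitian matrix; the size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random Hermitian matrix: A = (M + M^dagger)/2
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = (M + M.conj().T) / 2

# Two arbitrary state vectors
phi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

# <phi|A psi> versus <A phi|psi>  (np.vdot conjugates its first argument)
lhs = np.vdot(phi, A @ psi)
rhs = np.vdot(A @ phi, psi)
print(np.isclose(lhs, rhs))   # True: the operator can be moved across

# Eigenvalues of a Hermitian matrix are real, and eigenvectors are orthonormal
vals, vecs = np.linalg.eigh(A)
print(vals)                                            # real numbers
print(np.allclose(vecs.conj().T @ vecs, np.eye(4)))    # True
```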
Physical meaning: Different energy states don’t “mix” — they’re completely separate. This is fundamental to quantum mechanics.
Note: If \(\lambda_1 = \lambda_2\) (degenerate eigenvalues), eigenstates aren’t automatically orthogonal, but can be made orthogonal using Gram-Schmidt.
Think of it like this: the eigenstates \(|\phi_n\rangle\) are like basis vectors (north, east, up), and your state \(|\psi\rangle\) is built from these pieces. The coefficients \(c_n\) tell you “how much” of each eigenstate is in your state.
Step 2: Finding the coefficients (projection)
How do we find \(c_n\)? We use the inner product (projection):
\[c_n = \langle \phi_n | \psi \rangle\]
This is literally the overlap between \(|\phi_n\rangle\) and \(|\psi\rangle\). If your state \(|\psi\rangle\) looks a lot like \(|\phi_n\rangle\), then \(c_n\) is large. If they’re perpendicular (orthogonal), then \(c_n = 0\).
Example: If \(|\psi\rangle = 0.6|\phi_1\rangle + 0.8|\phi_2\rangle\), then:
\(c_1 = \langle \phi_1 | \psi \rangle = 0.6\)
\(c_2 = \langle \phi_2 | \psi \rangle = 0.8\)
The projection “extracts” each component from the superposition!
Step 3: Measurement outcomes and probabilities
When you measure \(\hat{A}\):
You will get one of the eigenvalues \(a_n\) (NOT a superposition of values!)
The probability of getting \(a_n\) is \(|c_n|^2\)
After the measurement, the system “collapses” to \(|\phi_n\rangle\)
Continuing the example: For \(|\psi\rangle = 0.6|\phi_1\rangle + 0.8|\phi_2\rangle\):
Probability of measuring \(a_1\): \(|c_1|^2 = |0.6|^2 = 0.36 = 36\%\)
Probability of measuring \(a_2\): \(|c_2|^2 = |0.8|^2 = 0.64 = 64\%\)
Total probability: \(0.36 + 0.64 = 1\) ✓
The key insight: Before measurement, the system is in a superposition (both states at once). After measurement, it’s definitely in one eigenstate. The coefficients \(c_n\) from the superposition determine the measurement probabilities!
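We can simulate this. The sketch below draws many measurement outcomes from the Born-rule probabilities for the example state above; the eigenvalues \(a_1 = 1\) and \(a_2 = 2\) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(42)

# State |psi> = 0.6|phi_1> + 0.8|phi_2>; eigenvalues a_1, a_2 are illustrative
c = np.array([0.6, 0.8])
a = np.array([1.0, 2.0])
probs = np.abs(c)**2              # Born rule: 0.36 and 0.64

# Simulate many measurements on identically prepared systems
outcomes = rng.choice(a, size=100_000, p=probs)
print("empirical frequencies:", [np.mean(outcomes == ai) for ai in a])
print("Born-rule probabilities:", probs)
print("average outcome:", outcomes.mean(), " vs  <A> =", np.sum(probs * a))
```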
This is why we needed real eigenvalues (Hermitian operators) and orthogonal eigenstates!
Expectation values — the average of many measurements
If you repeat the same measurement many times on identically prepared systems, the average result is:
\[\langle \hat{A} \rangle = \langle \psi | \hat{A} | \psi \rangle = \sum_n |c_n|^2 a_n\]
Connection to eigenstates: If \(|\psi\rangle\) is an eigenstate of \(\hat{A}\) with eigenvalue \(a\), then every measurement gives \(a\) with certainty: \(\langle \hat{A} \rangle = a\) and \(\Delta A = 0\).
Heisenberg’s uncertainty principle is not just a statement about measurement limitations — it’s a fundamental property of quantum mechanics arising from commutators.
The general uncertainty relation
For any two observables \(\hat{A}\) and \(\hat{B}\), the uncertainties \(\Delta A\) and \(\Delta B\) satisfy:
\[\Delta A \cdot \Delta B \geq \frac{1}{2} |\langle [\hat{A}, \hat{B}] \rangle|\]
where the uncertainty is defined as the standard deviation:
\[\Delta A = \sqrt{\langle \hat{A}^2 \rangle - \langle \hat{A} \rangle^2}\]
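As a concrete example of these definitions, the sketch below evaluates \(\Delta x \cdot \Delta p\) numerically for the ground state of a particle in a box (discussed later on this page); the analytic value is \(\hbar\sqrt{\pi^2/12 - 1/2} \approx 0.568\,\hbar\), comfortably above \(\hbar/2\).

```python
import numpy as np

hbar = 1.0
L = 1.0
x = np.linspace(0, L, 20001)
psi = np.sqrt(2 / L) * np.sin(np.pi * x / L)   # particle-in-a-box ground state

# <x> and <x^2>
ex = np.trapz(x * psi**2, x)
ex2 = np.trapz(x**2 * psi**2, x)
dx_unc = np.sqrt(ex2 - ex**2)

# <p> and <p^2> via p = -i*hbar d/dx (finite differences)
dpsi = np.gradient(psi, x)
ep = np.trapz(psi * (-1j * hbar) * dpsi, x).real        # ~0 by symmetry
ep2 = np.trapz(np.abs(-1j * hbar * dpsi)**2, x)          # <p^2> = integral |p psi|^2
dp_unc = np.sqrt(ep2 - ep**2)

print(f"Delta x * Delta p = {dx_unc * dp_unc:.4f} hbar  (>= 0.5 hbar)")
# Analytic value: hbar * sqrt(pi^2/12 - 1/2) ~= 0.5679 hbar
```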
Proof of the general uncertainty relation
This is a beautiful result that follows from the Cauchy-Schwarz inequality. Let’s derive it step by step.
Step 1: Define shifted operators
Define operators shifted by their expectation values: \(\delta\hat{A} = \hat{A} - \langle \hat{A} \rangle\) and \(\delta\hat{B} = \hat{B} - \langle \hat{B} \rangle\).
This matches the exact ground state energy \(E_1 = \frac{\hbar^2 \pi^2}{2mL^2}\) within a factor of \(\pi^2 \approx 10\)!
Summary: Why these mathematical foundations matter
Everything we’ve covered forms the complete mathematical framework for quantum mechanics:
1. Eigenstates and eigenvalues: The foundation
States with definite measurement outcomes
Solving QM = finding eigenstates
2. Hermitian operators: What represents observables
Ensure real eigenvalues (physical measurements)
Ensure orthogonal eigenstates (no mixing)
3. Expansion in eigenstates: Building any state
Any state = superposition of eigenstates
Coefficients determine measurement probabilities
4. Measurements: Connecting math to experiments
Measurement outcomes = eigenvalues
Probabilities from superposition coefficients
Expectation values for repeated measurements
5. Commutators and uncertainty: Fundamental limits
Non-commuting operators can’t be simultaneously known
Uncertainty relations from wave-particle duality
Every quantum system (particle in a box, harmonic oscillator, hydrogen atom, quantum computer) uses these principles!
Now that we understand the mathematical foundations (eigenstates, Hermitian operators, measurements, and uncertainty relations), let’s see these principles in action with real quantum systems. We will start with the simplest example: a particle confined to a one dimensional box. This seemingly simple problem contains all the essential features of quantum mechanics: wave particle duality, quantized energy levels, and probability interpretation. Once we master this, we can tackle more complex systems using the same mathematical machinery.
Why is the particle in a box the first quantum mechanics problem we solve?
The particle in a box is the simplest non-trivial quantum system that demonstrates energy quantization. Here’s why it’s pedagogically important:
1. Simplest potential with bound states
The potential is trivial inside the box (\(V = 0\)), so we only deal with the kinetic energy term of the Schrödinger equation:
\[-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} = E \psi\]
There’s no complicated potential function to worry about. This is the minimum complexity needed to see quantization.
2. First system with boundary conditions
The infinite walls at \(x = 0\) and \(x = L\) impose boundary conditions: \(\psi(0) = \psi(L) = 0\).
These boundary conditions force quantization. Not all wavelengths fit in the box—only those satisfying the boundary conditions are allowed. This is the physical origin of discrete energy levels.
3. Builds intuition for all quantum systems
Concepts learned here apply everywhere:
Standing waves → energy eigenstates
Boundary conditions → quantization
Wave nature → probability interpretation
Nodes and antinodes → wavefunction structure
4. Analytically solvable
The solution involves only trigonometric functions (\(\sin\), \(\cos\)). No special functions (Bessel, Hermite, etc.) are needed. This lets us focus on physical concepts rather than mathematical machinery.
5. Real physical applications
Despite its simplicity, this model describes:
Electrons in quantum dots (nanoscale semiconductor crystals)
Conjugated π-electrons in molecules (particle-in-a-box approximation)
Nucleons in nuclear shell model
Conduction electrons in 1D nanowires
Why not start with the hydrogen atom?
The hydrogen atom is physically more important (it’s a real atom!), but mathematically much harder:
Requires spherical coordinates
Needs special functions (Laguerre polynomials, spherical harmonics)
Three quantum numbers instead of one
Coulomb potential \(V(r) = -e^2/(4\pi\epsilon_0 r)\) is more complex
The particle in a box teaches the same fundamental concepts (quantization, wavefunctions, probability) with simpler mathematics.
The figure above shows the infinite square well potential used in the particle-in-a-box model: \(V(x)=0\) for \(0 < x < L\) and \(V(x)=\infty\) outside. This is the model/postulate that enforces the boundary conditions \(\psi(0)=\psi(L)=0\) (the particle cannot be found where the potential is infinite).
What are the boundary conditions?
Since \(V = \infty\) outside the box, the wavefunction must vanish at the walls: \(\psi(0) = \psi(L) = 0\).
Key insight: Energy is quantized! Only discrete values \(E_1, E_2, E_3, \ldots\) are allowed. The boundary conditions force specific wavelengths, which force specific energies.
Not necessarily! The states \(\psi_1, \psi_2, \psi_3, \ldots\) are called energy eigenstates or stationary states. They have definite energy and their probability distributions don’t change with time.
However, a particle can exist in a superposition of multiple states:
\[\Psi(x, t) = \sum_n c_n\, \psi_n(x)\, e^{-iE_n t/\hbar}\]
Physical intuition: Like a guitar string, only wavelengths that “fit” with the boundary conditions are allowed. The walls enforce standing wave patterns, and standing waves only occur at specific frequencies.
Real-world example: electrons in quantum dots
Quantum dots are tiny semiconductor crystals (nanometers in size) that trap electrons in a small region. The electrons behave like particles in a box, with:
Discrete energy levels that depend on the dot size
Larger dots → smaller energy spacing (more classical)
Smaller dots → larger energy spacing (more quantum)
These discrete levels cause quantum dots to emit specific colors of light
The left panel shows the wavefunctions \(\psi_n(x)\) for the first four quantum states. Each wavefunction has \(n\) half-wavelengths fitting in the box, with \(n-1\) nodes (zero crossings). The right panel shows the probability densities \(|\psi_n(x)|^2\), indicating where the particle is most likely to be found. For the ground state (\(n=1\)), the particle is most probable at the center. Higher energy states show multiple peaks and nodes.
The energy levels \(E_n = n^2 E_1\) are shown as horizontal blue lines, where \(E_1 = h^2/(8mL^2)\) is the ground state energy. Notice how the spacing between levels increases with quantum number \(n\): the spacing grows as \(E_{n+1} - E_n = (2n+1)E_1\). This increasing spacing is a key difference from the harmonic oscillator, which has constant spacing.
Why is energy quantized?
Standing wave condition: Only wavelengths that fit an integer number of half-wavelengths in the box are allowed:
\[L = n \frac{\lambda}{2} \quad \Rightarrow \quad \lambda = \frac{2L}{n}\]
Why must the waves fit within the box boundaries?
The particle cannot exist outside the box because \(V = \infty\) in those regions. The wavefunction must be zero at \(x = 0\) and \(x = L\) (boundary conditions).
This means we can only have standing waves that:
Start at zero: \(\psi(0) = 0\)
End at zero: \(\psi(L) = 0\)
Have an integer number of half-wavelengths between the walls
Just like a guitar string fixed at both ends, only certain wavelengths “fit” in the box. Wavelengths that don’t satisfy the boundary conditions would create destructive interference and cannot exist as stable states.
The figure above shows the first four standing wave patterns. Notice:
Each wavefunction goes to zero at the walls (\(x=0\) and \(x=L\))—required by boundary conditions
Gray shaded regions show the forbidden regions where \(V=\infty\) (particle cannot exist there)
\(n=1\): one half-wavelength fits in the box
\(n=2\): two half-wavelengths (one full wavelength) fit in the box
\(n=3\): three half-wavelengths fit in the box
\(n=4\): four half-wavelengths (two full wavelengths) fit in the box
Red dots mark nodes (places where \(\psi=0\) inside the box)
How does this lead to discrete energies?
Using de Broglie (\(\lambda = h/p\)) and \(E = p^2/(2m)\):
\[p = \frac{h}{\lambda} = \frac{nh}{2L}\]
\[E = \frac{p^2}{2m} = \frac{n^2 h^2}{8mL^2}\]
Physical intuition: Like a guitar string, only certain “notes” (energies) are allowed. The box imposes boundary conditions that select discrete wavelengths, which correspond to discrete momenta and therefore discrete energies.
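Putting in numbers: the sketch below evaluates \(E_n = n^2 h^2/(8mL^2)\) for an electron in a 1 nm box (an illustrative, roughly quantum-dot-sized choice) and the photon wavelength for the \(2 \to 1\) transition.

```python
import numpy as np

h = 6.626e-34      # Planck's constant (J*s)
m_e = 9.109e-31    # electron mass (kg)
eV = 1.602e-19     # J per eV
c = 3.0e8          # speed of light (m/s)

L = 1e-9           # illustrative box size: 1 nm

def E_n(n, m=m_e, L=L):
    """Particle-in-a-box energy levels E_n = n^2 h^2 / (8 m L^2)."""
    return n**2 * h**2 / (8 * m * L**2)

for n in range(1, 4):
    print(f"E_{n} = {E_n(n) / eV:.3f} eV")

# Photon emitted in the 2 -> 1 transition
dE = E_n(2) - E_n(1)
print(f"2->1 photon wavelength: {h * c / dE * 1e9:.0f} nm")
```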
An interesting observation: Mass doesn’t appear in the wavefunction
Wait, there’s no mass in \(\psi_n(x) = \sqrt{2/L} \sin(n\pi x/L)\)?
That’s right! The normalized wavefunction depends only on:
The box size \(L\)
The quantum number \(n\)
Position \(x\)
The mass \(m\) does not appear in the wavefunction itself.
Where does mass matter then?
Mass appears in the energies, not the wavefunctions:
\[E_n = \frac{n^2 h^2}{8mL^2}\]
Heavier particle (\(m\) large):
Lower energy levels (energies scale as \(1/m\))
Same wavefunctions (same spatial patterns)
Smaller de Broglie wavelength at same energy
Lighter particle (\(m\) small):
Higher energy levels
Same wavefunctions
Larger de Broglie wavelength at same energy
Why is this interesting?
This reveals a deep fact about quantum mechanics: The spatial structure of allowed states depends only on the boundary conditions (geometry), not on what particle is in the box.
An electron and a proton in the same box have identical wavefunction shapes
But the proton’s energy levels are ~2000 times lower (because \(m_{\text{proton}} \approx 2000 \, m_{\text{electron}}\))
The probability distributions \(|\psi_n(x)|^2\) are identical regardless of particle mass
Physical meaning: The wavefunction describes where the particle can be found (geometry). The energy describes how fast it’s moving (kinetics). Mass couples the two through \(E = p^2/(2m)\), but doesn’t change the allowed spatial patterns.
What about the “shape” of the electron?
In quantum mechanics, elementary particles like electrons are treated as point particles—they have no internal structure or “shape” in the classical sense.
The wavefunction \(\psi(x)\) doesn’t describe the electron’s shape; it describes the probability of finding the electron at different locations. The electron itself is a point, but its probability distribution is spread out according to \(|\psi(x)|^2\).
This is fundamentally different from classical physics, where we might imagine a tiny ball bouncing around. In quantum mechanics:
The electron doesn’t have a trajectory
It doesn’t have a definite position until measured
\(|\psi(x)|^2\) tells us the probability density, not the electron’s “size” or “shape”