Differentiating through sort


Differentiable programming is useful in machine learning research, because it allows for efficient optimization of any desired parameters. Forward- and reverse-mode automatic differentiation are the two major paradigms for finding these derivatives, and differentiable programming is often framed in terms of functional programming. In this post, I will focus on reverse-mode automatic differentiation, because forward-mode is trivial (dual numbers) and inefficient for functions from many variables to few variables, e.g. f:\mathbb{R}^n\rightarrow \mathbb R.

I was curious how derivatives for the more obscure/less functional-appearing operations are determined; one such operation is sort. Assume we have an array A with elements A_i. We perform the operation B = \text{sort}(A). The question is to (efficiently) find the Jacobian

$$J_{ij}=\frac{\partial B_i}{\partial A_j}=\left(\bar J^{-1}\right)_{ij},\qquad \bar J_{ij}=\frac{\partial A_i}{\partial B_j}$$

Without trying to dig through the source code for JAX or another automatic differentiation framework, here are the observations I made. First, sorting an array is equivalent to identifying a particular permutation of the elements. In addition to the traditional jax.numpy.sort operation, there is jax.numpy.argsort, which finds precisely the list of indices specifying the permutation. When we perform reverse-mode automatic differentiation, we are given a computational graph in which each node represents a composition g(f(x)); we are given g' and f(x), and we must use the chain rule to compute (g\circ f)'(x)=g'(f(x))f'(x) (technically, we keep track of the adjoint, denoted by the \bar J above). In multiple dimensions, we must use the multi-dimensional chain rule, so that there is an implicit sum/matrix-multiplication.

In the case of f(A)=\text{sort}(A)=B, the derivatives \frac{\partial B_i}{\partial A_j} are 1 or 0, with it being 1 when A_j is the i^{th} component of the sorted array and 0 otherwise.

Since the derivative is so simple, we just need to propagate the gradient from the sorted elements B_j back to the corresponding elements of A_i, taking care of the chain rule. In other words, if A_i is sorted to B_j, then the gradient from B_j is copied back to A_i. We thus need to invert the permutation provided by jax.numpy.argsort. How do we invert a sort? The trick is to argsort the argsort (https://stackoverflow.com/questions/9185768/inverting-permutations-in-python)!

Assume we have already argsorted A, which returns indices I. If we argsort I, then sorting I with those indices puts it back in the original order of the indices of A, [0,1,2,\cdots]; therefore argsort(I) must be the inverse of the permutation represented by I.
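As a small illustration of the argument above, here is a minimal sketch in plain NumPy rather than JAX (numpy.argsort and jax.numpy.argsort follow the same convention). The helper names sort_forward and sort_backward are made up for this example and are not taken from JAX or from the linked repository.

```python
import numpy as np

def sort_forward(a):
    """Sort a 1-D array and also return the permutation that was applied."""
    perm = np.argsort(a, kind="stable")
    return a[perm], perm

def sort_backward(perm, grad_b):
    """Copy the gradient of each sorted element back to the slot it came from."""
    inverse_perm = np.argsort(perm)  # argsort of the argsort inverts the permutation
    return grad_b[inverse_perm]

a = np.array([3.0, 1.0, 2.0])
b, perm = sort_forward(a)

# Pretend these are the incoming gradients with respect to the sorted array B.
grad_b = np.array([10.0, 20.0, 30.0])
grad_a = sort_backward(perm, grad_b)

print(b, perm, grad_a)
```

Here the largest element of a sits at index 0 and ends up last in b, so it receives the last entry of grad_b.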

I implemented the forward and backward of jax.numpy.sort manually.


After JITting the appropriate functions, the timing is precisely the same. The gradients are also correct (up to some transpositional/dimensional trickery). This strongly suggests, though it does not prove, that this is how the reference implementation differentiates sort.


I thought this was a neat trick for coercing what is typically highly branched and seemingly imperative code (sorting being a prototypical example) into a functional programming form. Essentially, argsort acts as a kind of involution on permutations: applying it to a permutation gives the inverse permutation, and applying it again gives back the original. It preserves only the ordering of a collection of items, though, unless you explicitly keep track of the elements as well.

Code: https://github.com/mikesha2/diffsort/tree/main

Lancaster and Blundell Chapter 26

(26.1) Consider \hat\phi_A^\dagger,\hat\phi_B^\dagger such that $$[\hat Q_N,\hat \phi_A^\dagger]=\hat \phi_B^\dagger$$ for some generator of a symmetry group such that [\hat Q_N,\hat H]=0. Show that e^{i\alpha \hat Q_N}|0\rangle=|0\rangle.

$$\hat H e^{i\alpha\hat Q_N}|0\rangle=e^{i\alpha\hat Q_N}\hat H|0\rangle=0$$

Thus it must be that $$e^{i\alpha\hat Q_N}|0\rangle=|0\rangle$$ as we assume a complete basis of eigenstates in the Fock space.

Show also that E_A=E_B, where \hat H\hat \phi_A^\dagger|0\rangle=E_A\hat \phi_A^\dagger|0\rangle and \hat H\hat \phi_B^\dagger|0\rangle=E_B\hat \phi_B^\dagger|0\rangle.

$$E_B\hat \phi_B^\dagger|0\rangle=\hat H\hat \phi_B^\dagger|0\rangle\\=\hat H\left[\hat Q_N,\hat\phi_A^\dagger\right]|0\rangle\\=\hat H\hat Q_N\hat \phi_A^\dagger|0\rangle-\hat H\hat\phi_A^\dagger\hat Q_N|0\rangle\\=\hat Q_N\hat H\hat \phi_A^\dagger|0\rangle\\=\hat Q_NE_A\hat\phi_A^\dagger|0\rangle\\=E_A(\hat\phi_B^\dagger+\hat\phi_A^\dagger \hat Q_N)|0\rangle=E_A\hat\phi^\dagger_B|0\rangle$$

Thus E_A=E_B.


(26.2) Prove the Fabri-Picasso theorem.

$$\langle 0|\hat J(x)\hat Q|0\rangle=\langle 0|e^{i\hat p\cdot x}\hat J(0)e^{-i\hat p\cdot x}\hat Q|0\rangle$$

We know that [\hat p^\mu,\hat Q]=0, because \hat Q is an internal symmetry. The vacuum has momentum 0, so

$$\langle 0|\hat J(x)\hat Q|0\rangle=\langle 0|\hat J(0)\hat Q|0\rangle$$

Now considering

$$\langle 0|\hat Q\hat Q|0\rangle=\int d^3x\langle 0|\hat J(x)\hat Q|0\rangle=\int d^3x \langle 0|\hat J(0)\hat Q|0\rangle=\langle 0|\hat J(0)\hat Q|0\rangle\int d^3x$$

This quantity diverges unless \hat Q|0\rangle=0, so \hat Q|0\rangle has either zero or infinite norm.


(26.3) Prove Goldstone’s theorem.

Assume there is a continuous symmetry with charge \hat Q where \hat Q|0\rangle\neq 0. Consider a field \hat \phi(y) such that

$$[\hat Q, \hat \phi(y)]=\hat\psi(y)$$

and \langle 0|\psi(y)|0\rangle\neq 0 due to spontaneously broken symmetry.

Show that 

$$\frac{\partial}{\partial x^0} \langle 0|\psi(0)|0\rangle=-\int d\mathbf S\cdot \langle 0|[\hat {\mathbf J}(x),\hat\phi(0)]|0\rangle$$

This follows from the continuity equation, since

$$\frac{\partial}{\partial x^0}\hat J^0=-\nabla\cdot \hat{\mathbf J}$$

The divergence theorem then converts the volume integral of \nabla\cdot\hat{\mathbf J} into the required surface integral.
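Spelled out, the steps are roughly as follows. This is a sketch that assumes the charge is the spatial integral of the time component of the current, \hat Q=\int d^3x\ \hat J^0(x), as was used in (26.2), and that the time derivative may be taken inside the integral and the commutator:

$$\frac{\partial}{\partial x^0}\langle 0|\hat\psi(0)|0\rangle=\frac{\partial}{\partial x^0}\int d^3x\ \langle 0|[\hat J^0(x),\hat\phi(0)]|0\rangle\\=\int d^3x\ \langle 0|[\partial_0\hat J^0(x),\hat\phi(0)]|0\rangle\\=-\int d^3x\ \nabla\cdot\langle 0|[\hat{\mathbf J}(x),\hat\phi(0)]|0\rangle\\=-\int d\mathbf S\cdot\langle 0|[\hat{\mathbf J}(x),\hat\phi(0)]|0\rangle$$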

 

Lancaster and Blundell Chapter 24

(24.1) Verify eqn 24.25 and show that eqn 24.29 solves eqn 24.28.

Integrating the term \partial_\mu A_\nu \partial^\mu A^\nu by parts gives -A_\nu\partial^2 A^\nu as promised, while the term -\partial_\mu A_\nu\partial^\nu A^\mu gives A^\mu \partial_\mu\partial_\nu A^\nu. Therefore,

$$-\frac12(\partial_\mu A_\nu\partial^\mu A^\nu-\partial_\mu A_\nu\partial^\nu A^\mu)+\frac12m^2 A^\mu A_\mu=\frac 12A^\mu\left([\partial^2+m^2]g_{\mu\nu}-\partial_\mu\partial_\nu\right)A^\nu$$

Evaluating

$$[-(p^2-m^2)g^{\mu\nu}+p^\mu p^\nu]\frac{-i(g_{\nu\lambda}-p_\nu p_\lambda/m^2)}{p^2-m^2}$$

$$=ig^{\mu\nu}(g_{\nu\lambda}-p_\nu p_\lambda/m^2)-i\frac{p^\mu p_\lambda-p^\mu p^2 p_\lambda/m^2}{p^2-m^2}\\=ig^\mu_\lambda - i\frac{p^\mu p_\lambda}{m^2}+i\frac{(p^2-m^2) p^\mu p_\lambda}{m^2(p^2-m^2)}=ig^\mu_\lambda$$
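The index gymnastics above can be sanity-checked numerically. The following is a minimal NumPy sketch; the momentum components and the mass are arbitrary illustrative values (not from the book), and the metric signature (+,-,-,-) is assumed.

```python
import numpy as np

# Metric with signature (+, -, -, -); numerically g^{mu nu} and g_{mu nu} coincide.
g = np.diag([1.0, -1.0, -1.0, -1.0])
p_up = np.array([2.0, 0.3, -0.5, 1.1])  # arbitrary off-shell momentum p^mu (illustrative)
m = 1.3                                  # arbitrary mass (illustrative)

p_low = g @ p_up        # p_mu
p2 = p_up @ p_low       # p^2 = p^mu p_mu
m2 = m * m

# Quadratic operator in momentum space: -(p^2 - m^2) g^{mu nu} + p^mu p^nu
operator = -(p2 - m2) * g + np.outer(p_up, p_up)

# Candidate propagator: -i (g_{nu lambda} - p_nu p_lambda / m^2) / (p^2 - m^2)
propagator = -1j * (g - np.outer(p_low, p_low) / m2) / (p2 - m2)

# Contract over nu; the result should be i times the identity, i.e. i g^mu_lambda.
product = operator @ propagator
print(np.round(product, 12))
```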


(24.2) Consider the \phi^4 Lagrangian with a shift

$$\mathcal L=\frac12(\partial_\mu\phi)^2-\frac{m^2}2\phi^2-\frac g8\phi^4+\frac{1}{2g}\left(\sigma-\frac g2\phi^2\right)^2$$

By performing a functional integral over the field \sigma, show that \sigma doesn’t change the dynamics of the theory.

The Lagrangian density becomes

$$\frac12(\partial_\mu\phi)^2-\frac{m^2}2\phi^2-\frac g8\phi^4+\frac g8\phi^4+\frac 1{2g}\sigma^2-\frac12\sigma \phi^2\\=\frac12(\partial_\mu\phi)^2-\frac{m^2}2\phi^2+\frac1{2g}\sigma^2-\frac12\sigma\phi^2$$

Identifying a=\frac{1}{g} and b=-\frac{\phi^2}{2}, so that the \sigma-dependent part is \frac12\sigma a\sigma+b\sigma, the path integral over \sigma gives a factor of

$$B[\det a]^{-\frac 12}e^{-\frac i2\int d^4x d^4y \left(-\frac{\phi(x)^2}{2}\right)g\,\delta^{(4)}(x-y)\left(-\frac{\phi(y)^2}2\right)}$$

Here, we realize that a^{-1} is diagonal, i.e. proportional to a Dirac delta \delta(x-y). The exponent is then -i\int d^4x\ \frac g8\phi^4, so we recover the -\frac g8\phi^4 term in the Lagrangian, and the overall constant out in front cancels.

The Euler-Lagrange equations for \sigma are

$$\frac{\partial \mathcal L}{\partial\sigma}=0$$

since there is no dependence on the derivatives of \sigma.

$$\frac1{g}\sigma-\frac{\phi^2}2=0$$

Thus \sigma is entirely determined by \phi, and thus there are no dynamical degrees of freedom in \sigma. The Feynman diagrams have vertices with 2 \phi particles and 1 \sigma particle.


(24.3) We want to do the integral

$$Z(J)=\int dx\ e^{-\frac12 Ax^2-\frac\lambda{4!}x^4+Jx}$$

$$=\int dx\ e^{-\frac12 Ax^2+Jx}e^{-\frac\lambda{4!}x^4}=\int dx\ \sum_n \frac1{n!}\left(-\frac\lambda{4!}x^4\right)^n e^{-\frac12 Ax^2+Jx}\\=\left[\sum_n\frac1{n!}\left(-\frac\lambda{4!}\frac{\partial^4}{\partial J^4}\right)^n\right]\int dx\ e^{-\frac12 Ax^2+Jx}\\=\left[e^{-\frac\lambda{4!}\frac{\partial^4}{\partial J^4}}\right]\left[\left(\frac{2\pi}A\right)^{\frac12}e^{\frac{J^2}{2A}}\right]$$


(24.4) By analogy, the generating functional for \phi^4 theory is

$$Z[J]=\left[e^{-\frac\lambda{4!}\int d^4z\ \frac{\delta^4}{\delta J(z)^4}}\right]\mathcal Z_0[J]$$

where

$$\mathcal Z_0[J]=e^{-\frac12\int d^4xd^4y\ J(x)\Delta(x-y)J(y)}$$

Act on \mathcal Z_0[J] with the functional derivative four times.

Applying the definition of the functional derivative,

$$\frac{\delta \mathcal Z_0[J]}{\delta J(z)}=\lim_{\epsilon\rightarrow 0}\frac1\epsilon(\mathcal Z_0[J(x)+\epsilon\delta(x-z)]-\mathcal Z_0[J(x)])$$

$$=\lim_{\epsilon\rightarrow 0}\frac1\epsilon \left[e^{-\frac12\int d^4xd^4y\left[\left(J(x)+\epsilon\delta(x-z)\right)\Delta(x-y)\left(J(y)+\epsilon\delta(y-z)\right)\right]}-e^{-\frac12\int d^4xd^4y\ J(x)\Delta(x-y)J(y)}\right]$$

Expanding the exponentials and keeping only the terms to first order in \epsilon (and using \Delta(x-y)=\Delta(y-x)), we indeed find

$$=\left[-\int d^4y\ \Delta(z-y)J(y)\right]\mathcal Z_0[J]$$

The rest of this calculation is very tedious, and I skip it. The functional derivative acts as expected with the product rule and chain rules.

Lancaster and Blundell Chapter 23

(23.1) With Lagrangian L=\frac12x\hat{\mathcal A}x+bx, use the Euler-Lagrange equation to find x and show that the Lagrangian may be expressed equivalently as L=-b\frac{1}{2\hat{\mathcal A}}b.

Since there is no dependence on \dot x,

$$\frac{\partial L}{\partial x}=0$$

$$\frac12(\hat{\mathcal A}x+x\hat{\mathcal A})+b=0$$

Treating \hat{\mathcal A} as a number,

$$x=-\frac b{\hat{\mathcal A}}$$

Plugging this into L,

$$\frac12\left(-\frac b{\hat{\mathcal A}}\right)\hat{\mathcal A}\left(-\frac b{\hat{\mathcal A}}\right)+b\left(-\frac b{\hat{\mathcal A}}\right)=b\frac{1}{2\hat{\mathcal A}}b-b\frac{1}{\hat{\mathcal A}}b=-b\frac 1{2\hat{\mathcal A}}b$$


(23.2) The path integral derivation of Wick’s theorem.

$$\int_{-\infty}^\infty dx\ x^2e^{-\frac12ax^2}=-2\frac{d}{da}\int_{-\infty}^\infty dx\ e^{-\frac12ax^2}=-2\frac{d}{da}\sqrt{\frac{2\pi}{a}}=\sqrt{\frac{2\pi}{a^3}}$$

Define

$$\langle x^n\rangle=\frac{\int_{-\infty}^\infty dx\ x^ne^{-\frac12ax^2}}{\int_{-\infty}^\infty dx\ e^{-\frac12ax^2}}$$

Calculate \langle x^2\rangle, \langle x^4\rangle, \langle x^n\rangle.

$$\langle x^2\rangle=\frac{\sqrt{\frac{2\pi}{a^3}}}{\sqrt{\frac{2\pi}{a}}}=\frac1a$$

$$\langle x^4\rangle=\frac{-2\frac{d}{da}\sqrt{\frac{2\pi}{a^3}}}{\sqrt{\frac{2\pi}{a}}}=\frac{3}{a^2}$$

When n is odd, the numerator is 0 by antisymmetry of the integrand, so \langle x^n\rangle=0. Otherwise when n is even, acting \frac n2 times on the numerator with the operator -2\frac{d}{da} gives

$$\langle x^n\rangle=\frac{(n-1)!!}{a^{\frac n2}}$$

Diagrammatically, each factor of \frac1a comes from contracting one pair of the n different x's, and (n-1)!! counts the number of ways to pair them all up.
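The formula for the even moments can be checked numerically. Below is a minimal NumPy sketch; the value of a, the grid width, and the helper names are arbitrary choices made for this illustration.

```python
import numpy as np

def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def gaussian_moment(n, a, half_width=12.0, num=200001):
    """<x^n> under the weight exp(-a x^2 / 2), as a ratio of Riemann sums."""
    x = np.linspace(-half_width, half_width, num)
    weight = np.exp(-0.5 * a * x**2)
    # The grid spacing cancels between numerator and denominator.
    return np.sum(x**n * weight) / np.sum(weight)

a = 1.7  # arbitrary positive value for illustration
for n in (2, 4, 6):
    numeric = gaussian_moment(n, a)
    wick = double_factorial(n - 1) / a ** (n // 2)
    print(n, numeric, wick)
```

The two printed columns should agree closely for each even n, while odd n gives a numeric value close to zero.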

Consider now the integral

$$\mathcal K=\int dx_1\cdots dx_N e^{-\frac12\mathbf x^T\mathbf A\mathbf x+\mathbf b^T\mathbf x}\\=\left(\frac{(2\pi)^N}{\det \mathbf A}\right)^{\frac12}e^{\frac12\mathbf b^T\mathbf A^{-1}\mathbf b}$$

Find the corresponding moments.

Ignoring the overall factor of \left(\frac{(2\pi)^N}{\det \mathbf A}\right)^{\frac12} which will cancel with the denominator, we need to differentiate with respect to the i,j components of \mathbf b to get $$\langle x_ix_j\rangle=\frac{\int dx_1\cdots dx_N x_ix_j e^{-\frac12\mathbf x^T\mathbf A\mathbf x+\mathbf b^T\mathbf x}}{\int dx_1\cdots dx_N e^{-\frac12\mathbf x^T\mathbf A\mathbf x+\mathbf b^T\mathbf x}}$$

$$\frac{\partial}{\partial b_i}\frac{\partial}{\partial b_j}e^{\frac12 b_m A^{-1}_{mn} b_n}\bigg\vert_{\mathbf b=0}=\frac12\left(\delta_{im}\delta_{jn}+\delta_{in}\delta_{jm}\right)A^{-1}_{mn}$$

Assuming that \mathbf A,\mathbf A^{-1} are symmetric,

$$\langle x_ix_j\rangle=\left(\mathbf A^{-1}\right)_{ij}$$

It's easy to see how this generalizes, and there will be a bunch of Kronecker deltas which give every possible contraction. Just as in the 1D case, odd moments vanish since the integrand is odd at \mathbf b=0. Thus

$$\langle x_ix_jx_kx_l\rangle=\left(\mathbf A^{-1}\right)_{ij}\left(\mathbf A^{-1}\right)_{kl}+\left(\mathbf A^{-1}\right)_{ik}\left(\mathbf A^{-1}\right)_{jl}+\left(\mathbf A^{-1}\right)_{il}\left(\mathbf A^{-1}\right)_{jk}$$


(23.3) Show that the amplitude for the forced harmonic oscillator with constant force f_0 to stay in the ground state from time t=0 to t=T is

This is the interaction picture

$$\mathcal A=e^{-\frac12\int dt' dt\ f(t)G(t,t')f(t')}$$

$$G(t,t')=\frac{\theta(t-t')e^{-i\omega(t-t')}+\theta(t'-t)e^{i\omega(t-t')}}{2m\omega}$$

This is just the value of the path integral, with time ordering as in the Feynman propagator. Since the force is nonzero only for 0<t<T, performing the integral gives

$$\int_{0}^T dt\int_{0}^T dt'\ f_0^2 G(t,t')=-\frac{if_0^2}{m\omega^2}\left(T-\frac{\sin\omega T}{\omega}+i\frac2\omega\sin^2\frac{\omega T}{2}\right)$$

The imaginary part of the integral is the simple phase acquired from normal, unforced time evolution.

Lancaster and Blundell Chapter 21

(21.1) Find the thermal average number of excitations in the quantum harmonic oscillator.

$$\langle \hat n\rangle_t=\text{Tr} \hat \rho\hat n$$

$$\hat\rho=\frac{e^{-\beta\hat H}}{Z}$$

$$Z=\sum_n e^{-\beta E_n}=\sum_n e^{-\beta\hbar\omega n}=\sum_n \left(e^{-\beta\hbar\omega}\right)^n\\=\frac{1}{1-e^{-\beta\hbar\omega}}$$

Use units where \hbar =1, and we also ignore the zero-point energy.

$$\text{Tr} \hat \rho\hat n=\sum_n \langle n|e^{-\beta\hat H}(1-e^{-\beta\omega})\hat n|n\rangle$$

$$=\sum_n n e^{-\beta \omega n}(1-e^{-\beta\omega})\\=-\frac{1}{\beta}(1-e^{-\beta\omega})\frac{\partial}{\partial\omega}\sum_n e^{-\beta\omega n}$$

$$=-\frac1\beta(1-e^{-\beta\omega})\frac{\partial}{\partial\omega}\frac{1}{1-e^{-\beta\omega}}\\=-\frac1\beta(1-e^{-\beta\omega})\frac{-\beta e^{-\beta\omega}}{(1-e^{-\beta\omega})^2}$$

$$=\frac{(1-e^{-\beta\omega})e^{-\beta\omega}}{(1-e^{-\beta\omega})^2}=\frac{1}{e^{\beta\omega}-1}$$
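As a quick check of the algebra, the thermal sum can be evaluated directly and compared against the closed form. This is a minimal sketch; the function names, the truncation n_max, and the sample values of \beta\omega are arbitrary choices for illustration.

```python
import math

def mean_occupation_sum(beta_omega, n_max=2000):
    """Direct evaluation of sum_n n e^{-beta*omega*n} (1 - e^{-beta*omega}), truncated at n_max."""
    weight = math.exp(-beta_omega)
    normalisation = 1.0 - weight
    return sum(n * weight**n for n in range(n_max)) * normalisation

def bose_einstein(beta_omega):
    """Closed form 1 / (e^{beta*omega} - 1)."""
    return 1.0 / math.expm1(beta_omega)

for beta_omega in (0.05, 0.5, 2.0):
    print(beta_omega, mean_occupation_sum(beta_omega), bose_einstein(beta_omega))
```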


(21.2) Consider the Lagrangian $$L=\frac12m\dot x(t)^2-\frac12m\omega^2 x(t)^2+f(t)x(t)$$

$$\langle \psi(t)|\hat x(t)|\psi(t)\rangle=\int_{-\infty}^\infty dt'\chi(t-t')f(t')$$

Using H'=-f(t)\hat x(t) as the interaction part of the Hamiltonian, find the interaction picture state |\psi_I(t)\rangle to first order in f_I(t).

$$f_I(t)=e^{iH_0t}f(t)e^{-iH_0t}$$

$$|\psi_I(t)\rangle=T[e^{-i \int_{-\infty}^t H_I'(t') dt'}]|0\rangle\approx (1-i\int_{-\infty}^t H_I'(t') dt')|0\rangle\\=|0\rangle+i\int_{-\infty}^t dt' f_I(t')\hat x_I(t')|0\rangle$$

If we try to find the expectation of \hat x(t), we indeed find that

$$\langle\psi_I(t)|\hat x_I(t)|\psi_I(t)\rangle=\langle 0|\hat x_I(t)|0\rangle+i\int_{-\infty}^t dt'f_I(t')\langle 0|[\hat x_I(t),\hat x_I(t')]|0\rangle=i\int_{-\infty}^\infty dt'\theta(t-t') f_I(t')\langle 0|[\hat x_I(t), \hat x_I(t')]|0\rangle$$

$$\chi(t-t')=i\theta(t-t')\langle 0|[\hat x_I(t),\hat x_I(t')]|0\rangle$$

The commutator evaluates to $$\frac{-i}{m\omega}\sin(\omega(t-t'))$$


(21.3) Find the Green’s function for the diffusion equation.

Perform a Laplace transform in time and a Fourier transform in space. The Laplace transform is equivalent to a Fourier transform with s=i\omega, since then e^{-st}=e^{-i\omega t}.

$$-s\tilde G-D(i\mathbf q)^2\tilde G=1$$

$$\tilde G = \frac{1}{-i\omega + D|\mathbf q|^2}$$

Lancaster and Blundell Chapter 19

(19.1) Write down the momentum space amplitudes for the processes in Fig. 19.6.

a) $$(2\pi)^4\delta^{(4)}(p-q)\frac{-i\lambda}{2}\int \frac{d^4k}{(2\pi)^4}\frac{i}{k^2-m^2+i\epsilon}$$

b) $$(2\pi)^4\delta^{(4)}(p-q)\frac{-\lambda^2}{4}\left(\int \frac{d^4k}{(2\pi)^4}\frac{i}{k^2-m^2+i\epsilon}\right)^2$$

c) $$\frac{-i\lambda}{8}\int \frac{d^4k_1 d^4k_2}{(2\pi)^8}\frac{i}{k_1^2-m^2+i\epsilon}\frac{i}{k_2^2-m^2+i\epsilon}$$

d) $$(2\pi)^4\delta^{(4)}(p-q)\frac{-\lambda^2}{3!}\int\frac{d^4k_1 d^4k_2}{(2\pi)^8}\frac{i}{k_1^2-m^2+i\epsilon}\frac{i}{k_2^2-m^2+i\epsilon}\frac{i}{(q-k_1-k_2)^2-m^2+i\epsilon}$$

e) $$(2\pi)^4\delta^{(4)}(p-q)\frac{-\lambda^2}{4}\int\frac{d^4k_1d^4k_2}{(2\pi)^8}\frac{i}{k_1^2-m^2+i\epsilon}\left(\frac{i}{k_2^2-m^2+i\epsilon}\right)^2$$

f) $$(2\pi)^4\delta^{(4)}(p_1+p_2-q_1-q_2)\frac{i\lambda^3}{4}\\\int\frac{d^4k_1d^4k_2}{(2\pi)^8}\\\frac{i}{k_1^2-m^2+i\epsilon}\frac{i}{(p_1+p_2-k_1)^2-m^2+i\epsilon}\\\frac{i}{k_2^2-m^2+i\epsilon}\frac{i}{(p_1+p_2-k_2)^2-m^2+i\epsilon}$$

g) $$(2\pi)^4\delta^{(4)}(p_1+p_2-q_1-q_2)\frac{i\lambda^3}{2}\\\int\frac{d^4k_1d^4k_2}{(2\pi)^8}\\\frac{i}{k_1^2-m^2+i\epsilon}\frac{i}{k_2^2-m^2+i\epsilon}\\\frac{i}{k_1^2-m^2+i\epsilon}\\\frac{i}{(q_1+q_2-k_1)^2-m^2+i\epsilon}$$

h) $$(2\pi)^4\delta^{(4)}(p_1+p_2-q_1-q_2)\frac{-\lambda^4}{8}\\\int\frac{d^4k_1d^4k_2d^4k_3}{(2\pi)^{12}}\left(\frac{i}{k_1^2-m^2+i\epsilon}\right)^2\\\frac{i}{k_2^2-m^2+i\epsilon}\frac{i}{k_3^2-m^2+i\epsilon}\\\left(\frac{i}{(q_1+q_2-k_1)^2-m^2+i\epsilon}\right)^2$$

i) $$(2\pi)^4\delta^{(4)}(p_1+p_2-q_1-q_2)\frac{i\lambda}{2}\\\int\frac{d^4k_1d^4k_2}{(2\pi)^8}\\\frac{i}{k_1^2-m^2+i\epsilon}\frac{i}{k_2^2-m^2+i\epsilon}\\\frac{i}{(q_1+q_2-k_2)^2-m^2+i\epsilon}\frac{i}{(k_2-k_1-p_2)^2-m^2+i\epsilon}$$


(19.2) Draw the interaction vertex for \phi^3 theory. Find the contributions to the amplitude \langle q|\hat S|p\rangle up to second order in the interaction strength. What are the symmetry factors?

If I’m ever motivated, I’ll do the remainder of this homework, which is again not very enlightening. Ditto for chapter 20.

Lancaster and Blundell Chapter 18

(18.1) Consider a spin-1/2 particle in a static magnetic field subject to a perpendicular, oscillating magnetic field

$$\hat H=\gamma B_0\hat S_z+\gamma B_1(\hat S_x\cos\gamma B_0t+\hat S_y\sin\gamma B_0 t)$$

where \gamma is the gyromagnetic ratio. Write the problem in the interaction representation.

$$\hat H=\hat H_0+\hat H’$$

$$\hat H_0=\gamma B_0\hat S_z$$

$$\hat H’=\gamma B_1(\hat S_x\cos\gamma B_0t+\hat S_y\sin\gamma B_0 t)$$

Simplify the interaction Hamiltonian using the circular basis \hat S_\pm.

$$\hat S_\pm=\hat S_x\pm i\hat S_y$$

$$\hat S_\pm e^{\pm i\omega t}=e^{i\omega \hat S_z t}\hat S_{\pm}e^{-i\omega \hat S_z t}$$

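With \omega=\gamma B_0,

$$\hat S_x\cos\omega t+\hat S_y\sin\omega t=\frac12\left(\hat S_+e^{-i\omega t}+\hat S_-e^{i\omega t}\right)$$

so the time-dependent phases cancel when we go to the interaction picture: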

$$\hat H_I=\frac12\gamma B_1(\hat S_++\hat S_-)$$

Find the interaction picture evolution operator.

$$\hat U_I(t_2, t_1)=T\left[e^{-i\int_{t_1}^{t_2} dt\hat H_I}\right]$$

Since there is no time dependence,

$$\hat U_I(t_2,t_1)=e^{-\frac i2\gamma B_1(\hat S_++\hat S_-)(t_2-t_1)}$$

What is the probability a particle initially in the |\uparrow\rangle state at t=0 will still be in that state at time t?

We need to compute

$$e^{-i\frac{\omega}{2} t}\langle \uparrow|\hat U_I(t,0)|\uparrow\rangle$$

where the factor out in front comes from the non-interacting Hamiltonian, which has expectation \frac{\omega}{2} for the |\uparrow\rangle state. The only nonzero terms in the Taylor expansion of the exponential are those in which \hat S_- acts on |\uparrow\rangle and \hat S_+ acts on \langle \uparrow|, i.e. the even powers of \hat S_++\hat S_-. Additionally, \hat S_-|\uparrow\rangle=|\downarrow\rangle. We find that

$$e^{-i\frac\omega2t}\langle \uparrow|e^{-\frac i2\gamma B_1(\hat S_++\hat S_-)t}|\uparrow\rangle=e^{-i\frac\omega2t}\cos\frac{\gamma B_1t}{2}$$

The probability is thus $$\cos^2\frac{\gamma B_1t}{2}$$

which of course means the probability of the transition to |\downarrow\rangle is $$1-\cos^2\frac{\gamma B_1t}{2}=\sin^2\frac{\gamma B_1t}{2}$$
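The \cos^2 result can be cross-checked by exponentiating the 2x2 interaction Hamiltonian numerically. This is a minimal NumPy sketch with \hbar=1; the function name and the sample values of \gamma B_1 and t are made up for illustration.

```python
import numpy as np

def survival_probability(gamma_b1, t):
    """|<up| exp(-i H_I t) |up>|^2 for H_I = (gamma_b1 / 2) (S_+ + S_-), spin-1/2, hbar = 1."""
    s_plus = np.array([[0, 1], [0, 0]], dtype=complex)
    s_minus = s_plus.conj().T
    h_int = 0.5 * gamma_b1 * (s_plus + s_minus)
    # Exponentiate the Hermitian matrix through its eigendecomposition.
    energies, states = np.linalg.eigh(h_int)
    u_int = states @ np.diag(np.exp(-1j * energies * t)) @ states.conj().T
    up = np.array([1, 0], dtype=complex)
    return abs(up.conj() @ u_int @ up) ** 2

gamma_b1 = 0.8  # arbitrary illustrative value
for t in (0.0, 1.0, 2.5):
    print(t, survival_probability(gamma_b1, t), np.cos(gamma_b1 * t / 2) ** 2)
```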


(18.2) Show that |\psi(t=\infty)\rangle_I=\sum_\phi\langle \phi|\hat S|\psi\rangle|\phi\rangle, where the states on the RHS are the simple-world states.

$$|\psi_I(\pm \infty)\rangle=|\psi\rangle_{\text{simple world}}=|\psi\rangle$$

By definition,

$$|\psi_I(+\infty)\rangle=\hat S|\psi_I(-\infty)\rangle=\hat S|\psi\rangle$$

Inserting a complete orthonormal basis of states,

$$|\psi_I(+\infty)\rangle=\sum_\phi|\phi\rangle\langle\phi| \hat S|\psi\rangle=\sum_\phi\langle \phi|\hat S|\psi\rangle|\phi\rangle$$


(18.3) Use Wick’s theorem to express the string of Bose operators \hat a_{\mathbf p}\hat a^\dagger_{\mathbf q}\hat a_{\mathbf k} in terms of normal ordered fields and contractions.

These operators are time-independent, so

$$\hat a_{\mathbf p}\hat a^\dagger_{\mathbf q}\hat a_{\mathbf k}=T[ \hat a_{\mathbf p}\hat a^\dagger_{\mathbf q}\hat a_{\mathbf k}]=N[\text{all contractions}]$$

To compute contractions, we see that \overline{\hat b\hat b^\dagger}=[\hat b,\hat b^\dagger], whereas all other contractions are 0.

$$\overline{\hat b\hat b^\dagger}=(\hat b\hat b^\dagger-N[\hat b\hat b^\dagger])=[\hat b,\hat b^\dagger]$$

Computing,

$$\hat a_{\mathbf p}\hat a^\dagger_{\mathbf q}\hat a_{\mathbf k}=\hat a^\dagger_{\mathbf q}\hat a_{\mathbf p}\hat a_{\mathbf k}+[\hat a_{\mathbf p},\hat a^\dagger_{\mathbf q}]\hat a_{\mathbf k}\\=\hat a^\dagger_{\mathbf q}\hat a_{\mathbf p}\hat a_{\mathbf k}+\hat a_{\mathbf k}\delta^{(3)}(\mathbf{p-q})$$


(18.4) Find an expression for \hat b\hat g\hat b\hat b^\dagger\hat b^\dagger in terms of normal-ordered products.

$$\hat b\hat g\hat b\hat b^\dagger\hat b^\dagger=\hat b \hat g(\hat b^\dagger\hat b+[\hat b,\hat b^\dagger])\hat b^\dagger=\hat b\hat g\hat b^\dagger \hat b\hat b^\dagger+\hat b\hat g\hat b^\dagger\\=\hat b^\dagger\hat g\hat b\hat b\hat b^\dagger+\hat g\hat b\hat b^\dagger+\hat b\hat g\hat b^\dagger=\hat b^\dagger\hat g\hat b\hat b^\dagger\hat b+\hat b^\dagger\hat g\hat b+2(\hat g\hat b^\dagger\hat b+\hat g)\\=\hat b^\dagger\hat g\hat b^\dagger\hat b\hat b+\hat b^\dagger\hat g\hat b+3\hat g\hat b^\dagger\hat b+2\hat g\\=\hat b^\dagger\hat b^\dagger\hat b\hat b\hat g+4\hat b^\dagger\hat b\hat g+2\hat g$$

Calculating the same with Wick’s theorem,

$$\hat b\hat g\hat b\hat b^\dagger\hat b^\dagger=T[\hat b\hat g\hat b\hat b^\dagger\hat b^\dagger]=N[\text{all contractions}]$$

The contractions of \overline{\hat b\hat b}, \overline{\hat b\hat g}, \overline{\hat b^\dagger\hat b} are all 0.

$$=N[\hat b\hat g\hat b\hat b^\dagger\hat b^\dagger+4\hat g\hat b\hat b^\dagger+2\hat g]\\=\hat b^\dagger\hat b^\dagger\hat b\hat b\hat g+4\hat b^\dagger\hat b\hat g+2\hat g$$


(18.5) Use Wick’s theorem on

$$\langle 0|\hat c^\dagger_{\mathbf p_1-\mathbf q}\hat c^\dagger_{\mathbf p_2+\mathbf q}\hat c_{\mathbf p_2}\hat c_{\mathbf p_1}|0\rangle$$

$$=-\delta^{(3)}(\mathbf{p_1-q-p_2})\delta^{(3)}(\mathbf{p_2+q-p_1})+\delta^{(3)}(\mathbf{p_1-q-p_1})\delta^{(3)}(\mathbf{p_2+q-p_2})\\=\delta^{(3)}(\mathbf{q})\delta^{(3)}(\mathbf{q})-\delta^{(3)}(\mathbf{p_1-p_2-q})\delta^{(3)}(\mathbf{p_2-p_1+q})$$

(we ignored the overall normalizations).

Lancaster and Blundell Chapter 17

(17.1) Calculate the retarded field propagator for a free particle in momentum space and the time domain.

$$G^+_0(\mathbf p,t_x,\mathbf q, t_y)=\theta(t_x-t_y)\langle 0|\hat a_{\mathbf p}(t_x)\hat a^\dagger_{\mathbf q}(t_y)|0\rangle$$

We go into the Schrodinger picture:

$$=\theta(t_x-t_y)\langle 0|e^{i\hat Ht_x}\hat a_{\mathbf p}e^{-i\hat Ht_x}e^{i\hat Ht_y}\hat a^\dagger_{\mathbf q}e^{-i\hat Ht_y}|0\rangle$$

Since \hat H|0\rangle=0 and \langle 0|\hat H=0, the exponentials are 1.

$$=\theta(t_x-t_y)\langle \mathbf p|e^{-i\hat Ht_x}e^{i\hat Ht_y}|\mathbf q\rangle\\=\theta(t_x-t_y)e^{-i(E_{\mathbf p}t_x-E_{\mathbf q}t_y)}\langle \mathbf p|\mathbf q\rangle\\=\theta(t_x-t_y)e^{-i(E_{\mathbf p}t_x-E_{\mathbf q}t_y)}\delta^{(3)}(\mathbf p-\mathbf q)$$


(17.2) Demonstrate that the (1+1-dimensional) free scalar propagator is the Green’s function of the Klein-Gordon equation.

We write

$$\Delta(x,y)=\langle0|T\hat\phi(x)\hat\phi(y)|0\rangle=\int \frac{d^2p}{(2\pi)^2}e^{-ip\cdot(x-y)}\frac{i}{p^2-m^2+i\epsilon}$$

Operating on this with \partial^2+m^2=\partial_t^2-\partial_x^2+m^2,

$$(\partial_t^2-\partial_x^2+m^2)\Delta(x,y)=\int\frac{d^2p}{(2\pi)^2}\left(-p_0^2+p_x^2+m^2\right)\frac{ie^{-ip\cdot(x-y)}}{p_0^2-p_x^2-m^2+i\epsilon}\\=-i\int\frac{d^2p}{(2\pi)^2}\frac{p_0^2-p_x^2-m^2}{p_0^2-p_x^2-m^2+i\epsilon}e^{-ip\cdot(x-y)}\\=-i\delta^{(2)}(x-y)$$


(17.3) Show that the action for the free scalar field may be written $$S=\frac12\int\frac{d^4p}{(2\pi)^4}\tilde\phi(-p)(p^2-m^2)\tilde\phi(p)$$

We integrate the Lagrangian density by parts to get

$$\mathcal L=-\frac12\phi(x)\partial^2\phi(x)-\frac{m^2}2\phi(x)^2=-\frac12\phi(x)(\partial^2+m^2)\phi(x)$$

We substitute the Fourier transform for each copy of \phi:

$$S=-\frac12\int\frac{d^4xd^4pd^4q}{(2\pi)^8}\tilde\phi(p)e^{ip\cdot x}(\partial^2+m^2)\tilde\phi(q)e^{iq\cdot x}=\frac12\int\frac{d^4xd^4pd^4q}{(2\pi)^8}\tilde\phi(p)e^{i(p+q)\cdot x}\tilde\phi(q)(q^2-m^2)$$

The integral over \frac{d^4x}{(2\pi)^4} gives \delta^{(4)}(p+q), so integrating over q gives

$$\frac12\int\frac{d^4p}{(2\pi)^4}\tilde\phi(-p)(p^2-m^2)\tilde\phi(p)$$

We may thus identify \tilde G_0(p) as \frac i2 times the inverse of the quadratic term in the momentum-space action ((p^2-m^2+i\epsilon)^{-1}).


(17.4) Find the Feynman propagator \tilde G(\omega) for the quantum harmonic oscillator with spring constant m\omega_0^2.

We write the Lagrangian as

$$L=\frac{m\dot x^2}{2}-\frac{m\omega_0^2}{2}x^2$$

Substituting the Fourier transform of x=\int\frac{d\omega}{2\pi}\tilde xe^{i\omega t}, the same steps as before (integrating the first term by parts and identifying the free propagator) gives

$$\tilde G(\omega)=\frac1m\frac{i}{\omega^2-\omega_0^2+i\epsilon}$$


(17.5) Consider the Lagrangian density $$\mathcal L=\frac12(\partial_x\phi)^2+\frac{m^2}2\phi(x)^2$$ Discretize this theory (\phi_j=\frac{1}{\sqrt{Na}}\sum_p\tilde\phi_pe^{ipja}) and find the momentum space expression for the action.

The derivative term becomes a finite difference $$\frac{\phi_{j+1}-\phi_j}{a}=\frac{1}{a\sqrt{Na}}\sum_p\tilde\phi_p(e^{ip(j+1)a}-e^{ipja})$$

The integral becomes a\sum_j. Carrying out some tedious sums, the sum over j sets q=-p, and we find that the momentum is replaced as p^2\rightarrow\frac{2-e^{ipa}-e^{-ipa}}{a^2}=\frac{2-2\cos pa}{a^2}.

$$S=\frac12\sum_p\tilde\phi_{-p}\left(\frac2{a^2}-\frac{2\cos pa}{a^2}+m^2\right)\tilde\phi_p$$

$$\tilde G(p)=\frac{i}{\frac2{a^2}-\frac{2\cos pa}{a^2}+m^2}$$

The Lagrangian for a 1D elastic string in (1+1)-Minkowski space is 

$$\mathcal L=\frac12\left((\partial_0\phi)^2-(\partial_1\phi)^2\right)$$

Discretize this theory in space only and find the propagator, with \omega_0^2=\frac{2}{a^2}.

This is truly unenlightening.

$$\tilde G(\omega,p)=\frac{i}{\omega^2-\omega_0^2(1-\cos pa)}$$

To reason a little, the \partial_t\phi acts as the effective mass in this theory, leading to the \omega^2 in the denominator, whereas the \omega_0^2(1-\cos pa) comes out of the spatial derivative term. There is an extra minus sign on that term since the Lagrangian has the extra minus sign compared to before.

Lancaster and Blundell Chapter 16

(16.1) Solve the Schrodinger equation for a one-dimensional infinite square well.

Separation of variables gives

$$\psi(x,t)=\sum_n e^{-iE_n t}\psi_n(x)$$

The time-independent Schrodinger equation is

$$\hat H\psi_n=E_n\psi_n$$

$$\frac1{2m}\partial_x^2\psi_n=-E_n\psi_n$$

The general solution to this homogeneous ODE is

$$\psi_n=A_n\sin \sqrt{2mE_n}x+B_n\cos\sqrt{2mE_n}x$$

Since we require \psi_n(0)=0, B_n=0. Since we require \psi_n(a)=0, \sqrt{2mE_n}a=n\pi, n\in\mathbb{Z}. Since we require the wavefunction to be nonzero, n>0. The energy thus satisfies \sqrt{2mE_n}=\frac{n\pi}{a}, so

$$E_n=\frac{n^2\pi^2}{2ma^2}$$

Computing the retarded Green’s function,

$$\langle n|\hat U(t_2-t_1)|n\rangle=\langle n|e^{-i\hat H(t_2-t_1)}|n\rangle\\=\langle n|e^{-iE_n(t_2-t_1)}|n\rangle=e^{-i\frac{n^2\pi^2}{2ma^2}(t_2-t_1)}\langle n|n\rangle=e^{-i\frac{n^2\pi^2}{2ma^2}(t_2-t_1)}$$

Finally, we enforce t_2\geq t_1:

$$G^+(n, t_2, t_1)=\theta(t_2-t_1)e^{-i\frac{n^2\pi^2}{2ma^2}(t_2-t_1)}$$

Find G^+(n,\omega).

$$G_0^+(n,\omega)=\int_{-\infty}^\infty dt_2 G^+(n,t_2, t_1)e^{i\omega(t_2-t_1)}\\=\int_{-\infty}^\infty dt_2 \theta(t_2-t_1) e^{i\left(\omega -\frac{n^2\pi^2}{2ma^2}+i\epsilon\right)(t_2-t_1)}$$

We add a factor of e^{-\epsilon(t_2-t_1)} so that the integral converges.

$$=\frac{1}{i\left(\omega-\frac{n^2\pi^2}{2ma^2}+i\epsilon\right)}e^{i\left(\omega-\frac{n^2\pi^2}{2ma^2}+i\epsilon\right)(t_2-t_1)}\bigg\vert_{t_1}^\infty\\=\frac{i}{\omega-\frac{n^2\pi^2}{2ma^2}+i\epsilon}$$


(16.2) Derive G^+(x,y,E).

I suppress the complete sets of states for x,y.

$$G^+_0(x,y,E)=\sum_n\langle n|\int_{-\infty}^\infty dt_2\ \theta(t_2-t_1)e^{-iE_n (t_2-t_1)}e^{iE(t_2-t_1)}|n\rangle\\=\sum_n\langle n|\int_{t_1}^\infty dt_2\ e^{i\left(E-E_n+i\epsilon\right)(t_2-t_1)}|n\rangle\\=\sum_n\langle n|\frac{1}{i\left(E-E_n+i\epsilon\right)}e^{i\left(E-E_n+i\epsilon\right)(t_2-t_1)}\bigg\vert_{t_1}^\infty|n\rangle\\=\sum_n\frac{i\phi_n(x)\phi_n^*(y)}{E-E_n+i\epsilon}$$

Use the Fourier definition of the Heaviside \theta function to derive G^+_0(p,E).

$$G^+_0(p,t)=\theta(t)e^{-iE_pt}$$

$$G^+_0(p,E)=\int_{-\infty}^\infty dt G^+_0(p,t)e^{iE t}\\=\int_{-\infty}^\infty dt \int_{-\infty}^\infty\frac{dz}{2\pi}\frac{ie^{-izt}}{z+i\epsilon}e^{i(E-E_p)t}$$

Performing the t integral,

$$=\int_{-\infty}^\infty dz \frac i{z+i\epsilon}\int_{-\infty}^\infty \frac{dt}{2\pi} e^{-i(z-E+E_p)t}$$

$$=\int_{-\infty}^{\infty} dz \frac{i}{z+i\epsilon}\delta(E-E_p-z)$$

$$=\frac{i}{E-E_p+i\epsilon}$$


(16.3) Consider the one-dimensional simple harmonic oscillator with a forcing function f(t)=\tilde F(\omega)e^{-i\omega(t-u)} described by an equation of motion

$$m\frac{\partial^2}{\partial t^2}A(t-u)+m\omega_0^2 A(t-u)=\tilde F(\omega) e^{-i\omega(t-u)}$$

Solving for A(t-u),

A specific solution is

$$-\frac{\tilde F(\omega)}{m}\frac{e^{-i\omega(t-u)}}{\omega^2-\omega_0^2}$$

Therefore the general solution is

$$A(t-u)=-\frac{\tilde F(\omega)}{m}\frac{e^{-i\omega(t-u)}}{\omega^2-\omega_0^2}+B(t)$$

where B(t) is a solution to the homogeneous equation of motion.

To find the Green’s function, we write

$$\left(m\frac{\partial^2}{\partial t^2}+m\omega_0^2\right)G(t,u)=\delta(t-u)$$

Performing a Fourier transform,

$$\left(m\frac{\partial^2}{\partial t^2}+m\omega_0^2\right)\int \frac{d\omega}{2\pi}\tilde{G}(\omega)e^{-i\omega(t-u)}=\int \frac{d\omega}{2\pi} e^{-i\omega(t-u)}$$

$$m(-\omega^2+\omega_0^2)\tilde G(\omega)=1$$

$$\tilde{G}(\omega)=-\frac{1}{m}\frac{1}{\omega^2-\omega^2_0}$$

Thus

$$G(t,u)=-\frac{1}{m}\int_{-\infty}^\infty\frac{d\omega}{2\pi}\frac{e^{-i\omega(t-u)}}{\omega^2-\omega^2_0}$$

If G(t,u) is subject to the boundary conditions that at t=0 we have G=0 and \dot G=0, show that G^+(t,u)=\frac1{m\omega_0}\sin\omega_0(t-u) for 0<u<t.

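The poles of the integrand are at \omega=\pm \omega_0. For the retarded function we displace both poles slightly below the real axis and, for t>u, close the contour in the lower half-plane, which contributes a factor of -2\pi i times the sum of the residues. By the residue theorem,

$$G^+(t,u)=-\frac {(-2\pi i)}{2\pi m}\left[\frac{e^{-i\omega(t-u)}}{\omega-\omega_0}\bigg\vert_{\omega=-\omega_0}+\frac{e^{-i\omega(t-u)}}{\omega+\omega_0}\bigg\vert_{\omega=\omega_0}\right]\\=\frac{1}{m\omega_0}\frac{e^{i\omega_0(t-u)}-e^{-i\omega_0(t-u)}}{2i}\\=\frac{1}{m\omega_0}\sin\omega_0(t-u)$$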

Given the forcing function f(t)=F_0\sin\omega_0 t, find the trajectory.

$$A(t)=\int du G^+(t,u) F_0\sin\omega_0 u\\=\frac{F_0}{m\omega_0}\int_0^t du \sin(\omega_0(t-u))\sin(\omega_0 u)$$

Using a version of the sum of angles formula for cosine, \sin(\alpha)\sin(\beta)=\frac12(\cos(\alpha-\beta)-\cos(\alpha+\beta)), this integral comes out to

$$A(t)=\frac{F_0}{2m\omega_0^2}\left(\sin\omega_0 t-\omega_0 t\cos\omega_0 t\right)$$
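The closed form can be compared against a direct numerical evaluation of the convolution with the Green's function. This is a minimal NumPy sketch; the parameter values, the midpoint rule, and the function names are illustrative choices.

```python
import numpy as np

def trajectory_numeric(t, f0=1.0, m=1.0, omega0=2.0, steps=20000):
    """Midpoint-rule evaluation of A(t) = int_0^t du G^+(t, u) F_0 sin(omega0 u)."""
    du = t / steps
    u = (np.arange(steps) + 0.5) * du
    green = np.sin(omega0 * (t - u)) / (m * omega0)
    force = f0 * np.sin(omega0 * u)
    return np.sum(green * force) * du

def trajectory_closed_form(t, f0=1.0, m=1.0, omega0=2.0):
    """A(t) = F_0 / (2 m omega0^2) * (sin(omega0 t) - omega0 t cos(omega0 t))."""
    return f0 / (2 * m * omega0**2) * (np.sin(omega0 * t) - omega0 * t * np.cos(omega0 * t))

for t in (0.5, 2.0, 5.0):
    print(t, trajectory_numeric(t), trajectory_closed_form(t))
```

The amplitude grows linearly in t through the t\cos\omega_0 t term, as expected when driving on resonance.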


(16.4) Find the Green’s function for the Helmholtz equation.

$$(\nabla^2+k^2)G_k(x)=\delta^{(3)}(x)$$

Performing the Fourier transform,

$$\int\frac{d^3q}{(2\pi)^3}(\nabla^2+k^2)\tilde G_k(q)e^{-iq\cdot x}=\int\frac{d^3q}{(2\pi)^3}e^{-iq\cdot x}$$

Equating the Fourier transforms,

$$\tilde G_k(q)=\frac{1}{k^2-q^2}$$

Take the Fourier transform of G^+_k(x)=-\frac{e^{i|k||x|}}{4\pi|x|} for outgoing waves.

$$-\int d^3x\frac{e^{i|k||x|}}{4\pi|x|}e^{-iq\cdot x-\epsilon|x|}\\=-\int d^3x\frac{e^{i|k||x|}}{4\pi|x|}e^{-i|q||x|\cos\theta-\epsilon|x|}$$

$$=-\frac{1}{4\pi}\int_0^{2\pi}d\phi\int_{-1}^1d(\cos\theta)\int_0^\infty x^2 dx\frac1x e^{i(|k|+i\epsilon)|x|-i|q||x|\cos\theta}$$

The \phi integral becomes 2\pi, while the \cos\theta integral gives

$$=-\frac{1}{2}\int_0^\infty dx\ xe^{i(|k|+i\epsilon)|x|}\frac{1}{i|q||x|}(e^{i|q||x|}-e^{-i|q||x|})$$

$$=-\frac{1}{|q|}\frac{|q|}{|q|^2-(|k|+i\epsilon)^2}=\frac{1}{|k|^2-|q|^2+i\epsilon}$$

after we ignore the second order \epsilon term, and rewrite the remaining positive infinitesimal as i\epsilon.

Take the inverse Fourier transform of G^+_k(q).

This is very similar, and we perform a contour integral with the residue theorem, whose poles are already nicely separated as shown above.

What do you expect for an incoming wave solution?

We have to shift the poles the opposite way, so we will have an advanced propagator with -i\epsilon in the denominator.

Lancaster and Blundell Chapter 15

(15.1) Why is \pi^0\rightarrow\gamma+\gamma+\gamma disallowed?

Each photon is odd under charge conjugation, so a three-photon state has C=(-1)^3=-1. The \pi^0 has C=+1. Since the electromagnetic interaction conserves C, the decay to three photons is forbidden.


(15.2)

a) Magnetic flux is a pseudoscalar.

b) Angular momentum is a pseudovector.

c) Charge is a scalar.

d) The scalar product of a vector with a pseudovector is a pseudoscalar.

e) The scalar product of two vectors is a scalar.

f) The scalar product of two pseudovectors is a scalar.


(15.3) Find the representations for the spinor rotation matrices.

$$\mathbf R(\hat{\mathbf x},\theta)=\begin{bmatrix}\cos\frac\theta2&-i\sin\frac\theta2\\-i\sin\frac\theta2&\cos\frac\theta2\end{bmatrix}$$

$$\mathbf R(\hat{\mathbf y},\theta)=\begin{bmatrix}\cos\frac\theta2&-\sin\frac\theta2\\\sin\frac\theta2&\cos\frac\theta2\end{bmatrix}$$

$$\mathbf R(\hat{\mathbf z},\theta)=\begin{bmatrix}\cos\frac\theta2-i\sin\frac\theta2&0\\0&\cos\frac\theta2+i\sin\frac\theta2\end{bmatrix}=\begin{bmatrix}e^{-i\frac\theta2}&0\\0&e^{i\frac\theta2}\end{bmatrix}$$