Solving the Laplace Equations with Linear Relations

The Laplace equation is ubiquitous in physics and engineering.

$latex \nabla^2 \phi = 0 = \partial_x^2 \phi + \partial_y^2 \phi = 0

It and slight variants of it describes electrostatics, magnetostatics, steady state heat flow, elastic flex, pressure, velocity potentials.

There are a couple reasons for that.

  • It results from minimizing the squared gradient of a field |\nabla \phi |^2 which can make sense from an energy minimization perspective.
  • Similarly it results from the combination of a flow conservation law and a linear constitutive relation connecting flow and field (such as Ohm’s law, Fick’s law, or Hooke’s law).
  • It also gets used even if not particularly appropriate because we know how to mathematically deal with it, for example in image processing.

There are a couple of questions we may want to ask about a Laplace equation system

  • Given the field on the boundary, determine the field in the interior (Dirchlet problem)
  • Given the normal derivative of the field on the boundary determine the field in the interior (Neumann problem)
  • Given sources in the interior and 0 boundary condition, determine the field. The Laplace equation is called the Poisson equation when you allow a source term on the right hand side. \nabla^2 \phi = \rho.
  • Given the field at the boundary, determine the derivative at the boundary. Dirichlet-to-Neumann map or Poincare-Steklov operator.

Given the Dirichlet to Neumann map, you do not have to consider the interior of a region to use it. The Dirichlet to Neumann map is sort of the same thing as an effective resistance or scattering matrix. It gives you a black box representation of a region based solely on the variables at its boundary.

This linear relation algebra is useful for many things that I’d have considered a use case for the Schur complement. The Schur complement arises when you do Gaussian elimination on a blocked matrix. It is good that this pattern has a name, because once you know about it, you’ll see it in many places. Domain decomposition, marginalized gaussian distributions, low-rank update, Scattering matrices.

By composing the linear relations corresponding to the Dirchlet-Neumann relations of regions, we can build the Dirichlet-Neumann relations of larger regions.

To make this more concrete, let us take the example of electrical circuits like before. A grid of resistors is a finite difference approximation to the continuous problem

-\nabla \phi = E Electric field is gradient of potential

E = \rho j continuum ohm’s law

\nabla\cdot j = 0 current conservation

In this post, I mentioned how you can make reasonable 2 dimensional diagrams out of a monoidal category, by sort of arbitrarily flipping one wire up and one wire down as in the diagram below. This defines a horizontal and vertical composition which have to do the required book-keeping (associations) to keep an arrow in canonical form. I had considered this for the management method of weights in neural networks, but it is way more natural as the actual geometrical layout of a finite difference grid of a laplace equation.

So we can reuse our categorical circuit combinators to build a finite difference Laplace equation.

Just showing how you can bend a 4-wire monoidal box into a 2-d diagram. Ignore the labels.

This can be implemented in Haskell doing the following. Neato.

type HLinRel2D u d l r = HLinRel (Either u l) (Either d r)
A stencil of 2d resistors for tiling
l -/\/\/----/\/\/\- r
stencil :: HLinRel2D IV IV IV IV
stencil = (hpar r10 r10) <<< short <<< (hpar r10 r10) where r10 = resistor 10
horicomp :: forall w w' w'' w''' a b c. (BEnum w, BEnum w', BEnum a, BEnum w''', BEnum b, BEnum w'', BEnum c ) => HLinRel2D w' w'' b c -> HLinRel2D w w''' a b -> HLinRel2D (Either w' w) (Either w'' w''') a c
horicomp f g = hcompose f' g' where
f' :: HLinRel (Either (Either w' w''') b) (Either (Either w'' w''') c)
f' = (first hswap) <<< hassoc' <<< (hpar hid f) <<< hassoc <<< (first hswap)
g' :: HLinRel (Either (Either w' w) a) (Either (Either w' w''') b)
g' = hassoc' <<< (hpar hid g) <<< hassoc
rotate :: (BEnum w, BEnum w', BEnum a, BEnum b) => HLinRel2D w w' a b -> HLinRel2D a b w w'
rotate f = hswap <<< f <<< hswap
vertcomp :: (BEnum w, BEnum w', BEnum a, BEnum d, BEnum b, BEnum w'', BEnum c ) => HLinRel2D w' w'' c d -> HLinRel2D w w' a b -> HLinRel2D w w'' (Either c a) (Either d b)
vertcomp f g = rotate (horicomp (rotate f) (rotate g) )
view raw laplace.hs hosted with ❤ by GitHub

Bits and Bobbles

  • Not the tightest post, but I needed to get it out there. I have a short attention span.
  • Homology and defining simplices as categories. One way of describing Homology is about linear operators that are analogues of finite difference operators (or better yet, discrete differential geometry operators / exterior derivatives). To some degree, it is analyzing the required boundary conditions to fully define differential equations on weirdo topological surfaces, which correspond to geometrical loops/holes. You can figure this out by looking at subspaces and quotients of the difference operators. Here we have here a very category theory way of looking at partial differential equations. How does it all connect?
  • Continuous circuit models – Telegrapher’s equation is classic example.
  • Cody mentioned that I could actually build circuits and measure categorical identities in a sense. That’s kind of cool. Or I could draw conductive ink on carbon paper and actually make my string diagrams into circuits? That is also brain tickling
  • Network circuits
  • I really want to get coefficients that aren’t just doubles. allowing rational functions of a frequency \omega would allow analysis of capacitor/inductor circuits, but also tight binding model systems for fun things like topological insulators and the Haldane model . I may need to leave Haskell. I’m not seeing quite the functionality I need. Use Sympy? HLinear. Flint bindings for haskell? Looks unmaintained. Could also use a grobner basis package as dynamite for a mouse.
  • This is relevant for the boundary element method. Some really cool other stuff relevant here.

Blah Blah blah: The subtlest aspect of differential equations is that of boundary conditions. It is more correct usually to consider the system of the interior differential equation and the boundary conditions as equally essential parts of the statement of the problem you are considering.

The Classical Coulomb Gas as a Mixed Integer Quadratic Program

The coulomb gas is a model of electrostatics where you take the discreteness of charge into account. That is what makes it hard compared to the continuous charge problem. Previously, I’ve used mixed integer programming to find lowest energy states of the ising model. This is even more obvious application of mixed integer programming to a physics model.

We ordinarily consider electric charge to be a continuum, but it isn’t. It comes in chunks of the electron charge. Historically, people didn’t even know that for quite a while. It is usually a reasonable approximation for most purposes to consider electric charge to be continuous

If you consider a network of capacitors cooled to the the level that there is not enough thermal energy to borrow to get an electron to jump, the charges on the capacitors will be observably discretized. With a sufficiently slow cooling to this state, the charges should arrange themselves into the lowest energy state.

The coulomb gas model also is of interest due to its connections to the XY model, which I’ve taken a stab at with mixed integer programming before. The coulomb gas models the energy of vortices in that model. I think the connection between the models actually requires a statistical or quantum mechanical context though, whereas we’ve been looking at the classical energy minimization.

We can formulate the classical coulomb gas problem very straightforwardly as a mixed integer quadratic program. We can easily include an externally applied field and a charge conservation constraint if we so desire within the framework.

We write this down in python using the cvxpy library, which has a built in free MIQP solver, albeit not a very good one. Commercial solvers are probably quite a bit better.

import cvxpy as cvx
import numpy as np
#grid size
N = 5
# charge variables
q = cvx.Variable( N*N ,integer=True)

# build our grid
x = np.linspace(0,1,N) 
y = np.linspace(0,1,N) 
X, Y = np.meshgrid(x,y, indexing='ij')
x1 = X.reshape(N,N,1,1)
y1 = Y.reshape(N,N,1,1)
x2 = X.reshape(1,1,N,N)
y2 = Y.reshape(1,1,N,N)
eps = 0.1 / N #regularization factor for self energy. convenience mostly
V = 1. / ((x1-x2)**2 + (y1-y2)**2 + eps**2)** ( 1 / 2)
V = V.reshape( (N*N,N*N) )

U_external = 100 * Y.flatten() # a constant electric field in the Y direction 
energy = cvx.quad_form(q,V) + U_external*q

# charge conservation constraint
prob = cvx.Problem(cvx.Minimize(energy),[cvx.sum(q)==0])
res = prob.solve(verbose=True)


#plotting junk

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

ax.plot_surface(X, Y, q.value.reshape((N,N)))
A plot of charge in a constant external electric field.

The results seems reasonable. It makes sense for charge to go in the direction of the electric field. Going to the corners makes sense because then like charges are far apart. So this might be working. Who knows.

Interesting places to go with this:

Prof Vanderbei shows how you can embed an FFT to enable making statements about both the time and frequency domain while keeping the problem sparse. The low time/memory N log(N) complexity of the FFT is reflected in the spasity of the resulting linear program.

Here’s a sketch about what this might look like. Curiously, looking at the actual number of nonzeros in the problem matrices, there are way too many. I am not sure what is going on. Something is not performing as i expect in the following code

import cvxpy as cvx
import numpy as np
import scipy.fftpack # import fft, ifft
def swizzle(x,y):
    assert(x.size == y.size)
    N = x.size
    s =  np.exp(-2.j * np.pi * np.arange(N) / N)
    #ret = cvx.hstack( [x + s*y, x - s*y]) 
    return cvx.hstack( [x - s*y, x + s*y]) 

def fft(x):
    N = x.size
    #assert(2**int(log2(N)) == N) # power of 2

    if N == 1:
        return x, []
        y = cvx.reshape(x,(N//2,2))
        c = []
        even, ce = fft(y[:,0])
        c += ce
        odd, co = fft(y[:,1])
        c += co
        z = cvx.Variable(N, complex=True)
        c += [z == swizzle(even,odd)]
        return z, c

N = 256
x = cvx.Variable(N, complex=True)
z, c = fft(x)
v = np.zeros(N) #np.ones(N) #np.random.rand(N)
v[0]= 1
c += [x == v]
prob = cvx.Problem( cvx.Minimize(1), c)
res = prob.solve(verbose=True)
print(scipy.fftpack.fft(v) - z.value)

The equivalent dense DFT:

x = cvx.Variable(N, complex=True)
fred = cvx.Variable(N, complex=True)
c = [fred == np.exp(-2.j * np.pi * np.arange(N).reshape((N,1)) * np.arange(N).reshape((1,N)) / N) * x]
prob = cvx.Problem( cvx.Minimize(1), c)

It would be possible to use a frequency domain solution of the interparticle energy rather than the explicit coulomb law form. Hypothetically this might increase the sparsity of the problem.

It seems very possible to me in a similar manner to embed a fast multipole method or barnes-hut approximation within a MIQP. Introducing explicit charge summary variables for blocks would create a sparse version of the interaction matrix. So that’s fun.

Solving the XY Model using Mixed Integer Optimization in Python

There are many problems in physics that take the form of minimizing the energy. Often this energy is taken to be quadratic in the field. The canonical example is electrostatics. The derivative of the potential \phi gives the electric field E. The energy is given as \int (|\nabla \phi|^2 + \phi \rho) d^3 x . We can encode a finite difference version of this (with boundary conditions!) directly into a convex optimization modelling language like so.

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg
from mpl_toolkits import mplot3d

N = 10

# building a finite difference matrix. It is rectangle of size N x (N-1). It maps from the vertices of our grid to the lines in between them, where derivatives live.
col = np.zeros(N)
col[0] = -1
col[1] = 1
delta = scipy.linalg.toeplitz(col, np.zeros(N-1)).T

gradx = np.kron(delta, np.eye(N))
grady = np.kron(np.eye(N), delta)

# a variable for our potential
phi = cvx.Variable((N, N))

# vectorization is useful. It flattens out the x-y 2Dness.
phivec = cvx.vec(phi)
gradxvec = gradx.reshape(-1, N*N)
gradyvec = grady.reshape(-1, N*N)

V = cvx.sum_squares(gradxvec * phivec) + cvx.sum_squares(gradyvec * phivec)

constraints = []
# boundary conditions. Dirichlet
constraints += [phi[:,0] == 0, phi[0,:] == 0, phi[:,-1] == 0, phi[-1,:] == 0 ]

# fixed charge density rho
rho = np.zeros((N,N))
rho[N//2,N//2] = 1

# objective is energy
objective = cvx.Minimize(V + cvx.sum(cvx.multiply(rho,phi)))
prob = cvx.Problem(objective, constraints)
res = prob.solve()

# Plotting 
x = np.linspace(-6, 6, N)
y = np.linspace(-6, 6, N)

X, Y = np.meshgrid(x, y)
fig = plt.figure()

ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, phi.value, rstride=1, cstride=1,
                cmap='viridis', edgecolor='none')
The resulting logarithm potential

It is noted rarely in physics, but often in the convex optimization world that the barrier between easy and hard problems is not linear vs. nonlinear, it is actually more like convex vs. nonconvex. Convex problems are those that are bowl shaped, on round domains. When your problem is convex, you can’t get caught in valleys or on corners, hence greedy local methods like gradient descent and smarter methods work to find the global minimum. When you differentiate the energy above, it results in the linear Laplace equations \nabla^2 \phi = \rho. However, from the perspective of solvability, there is not much difference if we replace the quadratic energy with a convex alternative.

def sum_abs(x):
    return cvx.sum(cvx.abs(x))
V = cvx.sum_squares(gradxvec * phivec) + cvx.sum_squares(gradyvec * phivec)
V = cvx.sum(cvx.huber(gradxvec * phivec)) + cvx.sum(cvx.huber(gradyvec * phivec))
V = cvx.pnorm(gradxvec * phivec, 3) + cvx.pnorm(gradyvec * phivec, 3)

a = 1 
dxphi = gradxvec * phivec
dyphi = gradyvec * phivec
V = cvx.sum(cvx.maximum( -a - dxphi, dxphi - a, 0 )) + cvx.sum(cvx.maximum( -a - dyphi, dyphi - a, 0 ))

Materials do actually have non-linear permittivity and permeability, this may be useful in modelling that. It is also possible to consider the convex relaxation of truly hard nonlinear problems and hope you get the echoes of the phenomenology that occurs there.

Another approach is to go mixed integer. Mixed Integer programming allows you to force that some variables take on integer values. There is then a natural relaxation problem where you forget the integer variables have to be integers. Mixed integer programming combines a discrete flavor with the continuous flavor of convex programming. I’ve previously shown how you can use mixed integer programming to find the lowest energy states of the Ising model but today let’s see how to use it for a problem of a more continuous flavor.

As I’ve described previously, in the context of robotics, the non-convex constraint that variables lie on the surface of a circle can be approximated using mixed integer programming. We can mix this fairly trivially with the above to make a global solver for the minimum energy state of the XY model. The XY model is a 2d field theory where the value of the field is constrained to lie on a circle. It is a model of a number of physical systems, such as superconductivity, and is the playground for a number of interesting phenomenon, like the Kosterlitz-Thouless phase transition. Our encoding is very similar to the above except we make two copies of the field phi and we then force them to lie on a circle. I’m trying to factor out the circle thing into my library cvxpy-helpers, which is definitely a work in progress.

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg
from mpl_toolkits import mplot3d
from cvxpyhelpers import cvxpyhelpers as mip

N = 6

# building a finite difference matrix. It is rectangle of size Nx(N-1). It maps from the vertices of our grid to the lines in between them, where derivatives live.
col = np.zeros(N)
col[0] = -1
col[1] = 1
delta = scipy.linalg.toeplitz(col, np.zeros(N-1)).T

gradx = np.kron(delta, np.eye(N))
grady = np.kron(np.eye(N), delta)

# a variable for our potential
phix = cvx.Variable((N, N))
phiy = cvx.Variable((N, N))

# vectorization is useful. It flattens out the x-y 2Dness.
phixvec = cvx.vec(phix)
phiyvec = cvx.vec(phiy)
gradxvec = gradx.reshape(-1, N*N)
gradyvec = grady.reshape(-1, N*N)

def sum_abs(x):
    return cvx.sum(cvx.abs(x))

#V = cvx.sum_squares(gradxvec * phixvec) + cvx.sum_squares(gradyvec * phixvec) + cvx.sum_squares(gradxvec * phiyvec) + cvx.sum_squares(gradyvec * phiyvec) 
V = sum_abs(gradxvec * phixvec) + sum_abs(gradyvec * phixvec) + sum_abs(gradxvec * phiyvec) + sum_abs(gradyvec * phiyvec) 
constraints = []
# coundary conditions. Nice and vortexy.
constraints += [phix[:,0] >= 0.9, phiy[0,1:-1] >= 0.9, phix[:,-1] <= -0.9, phiy[-1,1:-1] <= -0.9 ]

for i in range(N):
    for j in range(N):
        x, y, c =
        constraints += c
        constraints += [phix[i,j] == x]
        constraints += [phiy[i,j] == y]

# fixed charge density rho
rho = np.ones((N,N)) * 0.01
rho[N//2,N//2] = 1

# objective is energy
objective = cvx.Minimize(V + cvx.sum(cvx.multiply(rho,phix)))
prob = cvx.Problem(objective, constraints)
print("solving problem")
res = prob.solve(verbose=True, solver=cvx.GLPK_MI)

# Plotting 
x = np.linspace(-6, 6, N)
y = np.linspace(-6, 6, N)

X, Y = np.meshgrid(x, y)
fig = plt.figure()

plt.quiver(X,Y, phix.value, phiy.value)

Now, this isn't really an unmitigated success as is. I switched to an absolute value potential because GLPK_MI needs it to be linear. ECOS_BB works with a quadratic potential, but it was not doing a great job. The commercial solvers (Gurobi, CPlex, Mosek) are supposed to be a great deal better. Perhaps switching to Julia, with it's richer ecosystem might be a good idea too. I don't really like how the solution of the absolute value potential looks. Also, even at such a small grid size it still takes around a minute to solve. When you think about it, it is exploring a ridiculously massive space and still doing ok. There are hundreds of binary variables in this example. But there is a lot of room for tweaking and I think the approach is intriguing.


  • Can one do steepest descent style analysis for low energy statistical mechanics or quantum field theory?
  • Is the trace of the mixed integer program search tree useful for perturbative analysis? It seems intuitively reasonable that it visits low lying states
  • The Coulomb gas is a very obvious candidate for mixed integer programming. Let the charge variables on each grid point = integers. Then use the coulomb potential as a quadratic energy. The coulomb gas is dual to the XY model. Does this exhibit itself in the mixed integer formalism?
  • Coulomb Blockade?
  • Nothing special about the circle. It is not unreasonable to make piecewise linear approximations or other convex approximations of the sphere or of Lie groups (circle is U(1) ). This is already discussed in particular about SO(3) which is useful in robotic kinematics and other engineering topics.

Edit: /u/mofo69extreme writes:


By absolute value potential, I mean using |del phi| as compared to a more ordinary quadratic |del phi|2.

This is where I'm getting confused. As you say later, you are actually using two fields, phi_x and phi_y. So I'm guessing your potential is the "L1 norm"

|del phi| = |del phi_x| + |del phi_y|

? This is the only linear thing I can think of.

I don't feel that the exact specifics of the XY model actually matter all the much.

You should be careful here though. A key point in the XY model is the O(2) symmetry of the potential: you can multiply the vector (phi_x,phi_y) by a 2D rotation matrix and the Hamiltonian is unchanged. You have explicitly broken this symmetry down to Z_4 if your potential is as I have written above. In this case, the results of the famous JKKN paper and this followup by Kadanoff suggest that you'll actually get a phase transition of the so-called "Ashkin-Teller" universality class. These are actually closely related to the Kosterlitz-Thouless transitions of the XY model; the full set of Ashkin-Teller phase transitions actually continuously link the XY transition with that of two decoupled Ising models.

You should still get an interesting phase transition in any case! Just wanted to give some background, as the physics here is extremely rich. The critical exponents you see will be different from the XY model, and you will actually get an ordered Z_4 phase at low temperatures rather than the quasi-long range order seen in the low temperature phase of the XY model. (You should be in the positive h_4 region of the bottom phase diagram of Figure 1 of the linked JKKN paper.)"

These are some interesting points and references.

A Touch of Topological Quantum Computation in Haskell Pt. I

Quantum computing exploits the massive vector spaces nature uses to describe quantum phenomenon.

The evolution of a quantum system is described by the application of matrices on a vector describing the quantum state of the system. The vector has one entry for every possible state of the system, so the number of entries can get very, very large. Every time you add a new degree of freedom to a system, the size of the total state space gets multiplied by the size of the new DOF, so you have a vector exponentially sized in the  number  of degrees of freedom.

Now, a couple caveats. We could have described probabilistic dynamics similarly, with a probability associated with each state. The subtle difference is that quantum amplitudes are complex numbers whereas probabilities are positive real numbers. This allows for interference. Another caveat is that when you perform a measurement, you only get a single state, so you are hamstrung by the tiny amount of information you can actually extract out of this huge vector. Nevertheless, there are a handful of situations where, to everyone’s best guess, you get a genuine quantum advantage over classical or probabilistic computation.

Topological quantum computing is based around the braiding of particles called anyons. These particles have a peculiar vector space associated with them and the braiding applies a matrix to this space. In fact, the space associated with the particles can basically only be manipulated by braiding and other states require more energy or very large scale perturbations to access. Computing using anyons has a robustness compared to a traditional quantum computing systems. It can be made extremely unlikely that unwanted states are accessed or unwanted gates applied. The physical nature of the topological quantum system has an intrinsic error correcting power. This situation is schematically similar in some ways to classical error correction on a magnetic hard disk. Suppose some cosmic ray comes down and flips a spin in your hard disk. The physics of magnets makes the spin tend to realign with it’s neighbors, so the physics supplies an intrinsic classical error correction in this case.

The typical descriptions of how the vector spaces associated with anyons work I have found rather confusing. What we’re going to do is implement these vector spaces in the functional programming language Haskell for concreteness and play around with them a bit.


In many systems, the splitting and joining of particles obey rules. Charge has to be conserved. In chemistry, the total number of each individual atom on each side of a reaction must be the same. Or in particle accelerators, lepton number and other junk has to be conserved.

Anyonic particles have their own system of combination rules. Particle A can combine with B to make C or D. Particle B combined with C always make A. That kind of stuff. These rules are called fusion rules and there are many choices, although they are not arbitrary. They can be described by a table N_{ab}^{c} that holds counts of the ways to combine a and b into c. This table has to be consistent with some algebraic conditions, the hexagon and pentagon equations, which we’ll get to later.

We need to describe particle production trees following these rules in order to describe the anyon vector space.

Fibonacci anyons are one of the simplest anyon systems, and yet sufficiently complicated to support universal quantum computation. There are only two particles types in the Fibonacci system, the I particle and the \tau  particle. The I particle is an identity particle (kind of like an electrically neutral particle). It combines with \tau to make a \tau. However, two \tau particle can combine in two different ways, to make another \tau particle or to make an I particle.

So we make a datatype for the tree structure that has one constructor for each possible particle split and one constructor (TLeaf, ILeaf) for each final particle type. We can use GADTs (Generalize Algebraic Data Types) to make only good particle production history trees constructible. The type has two type parameters carried along with it, the particle at the root of the tree and the leaf-labelled tree structure, represented with nested tuples.

data Tau
data Id
data FibTree root leaves where
   TTT :: FibTree Tau l -> FibTree Tau r -> FibTree Tau (l,r)
   ITT :: FibTree Tau l -> FibTree Tau r -> FibTree Id (l,r) 
   TIT :: FibTree Id l -> FibTree Tau r -> FibTree Tau (l,r)
   TTI :: FibTree Tau l -> FibTree Id r -> FibTree Tau (l,r)
   III :: FibTree Id l -> FibTree Id r -> FibTree Id (l,r)
   TLeaf :: FibTree Tau Tau
   ILeaf :: FibTree Id Id
Free Vector Spaces

We need to describe quantum superpositions of these anyon trees. We’ll consider the particles at the leaves of the tree to be the set of particles that you have at the current moment in a time. This is a classical quantity. You will not have a superposition of these leaf particles. However, there are some quantum remnants of the history of how these particles were made. The exact history can never be determined, kind of like how the exact history of a particle going through a double slit cannot be determined. However, there is still a quantum interference effect left over. When you bring particles together to combine them, depending on the quantum connections, you can have different possible resulting particles left over with different probabilities. Recombining anyons and seeing what results is a measurement of the system .

Vectors can be described in different basis sets. The bases for anyon trees are labelled by both a tree structure and what particles are at the root and leaves. Different tree associations are the analog of using some basis x vs some other rotated basis x’. The way we’ve built the type level tags in the FibTree reflects this perspective.

The labelling of inner edges of the tree with particles varies depending on which basis vector we’re talking about. A different inner particle is the analog of \hat{x} vs \hat{y}.

To work with these bases we need to break out of the mindset that a vector put on a computer is the same as an array. While for big iron purposes this is close to true, there are more flexible options. The array style forces you to use integers to index your space, but what if your basis does not very naturally map to integers?

A free vector space over some objects is the linear combination of those objects. This doesn’t have the make any sense. We can form the formal sum (3.7💋+2.3i👩‍🎨) over the emoji basis for example. Until we attach more meaning to it, all it really means is a mapping between emojis and numerical coefficients. We’re also implying by the word vector that we can add two of the combinations coefficient wise and multiply scalars onto them.

We are going to import free vectors as described by the legendary Dan Piponi as described here:

What he does is implement the Free vector space pretty directly. We represent a Vector space using a list of tuples [(a,b)]. The a are basis objects and b are the coefficients attached to them.

data W b a = W { runW :: [(a,b)] } deriving (Eq,Show,Ord)
instance Semigroup (W b a) where
  (W x) <> (W y) = W (x <> y)
instance Monoid (W b a) where
  mempty = W mempty 
mapW f (W l) = W $ map (\(a,b) -> (a,f b)) l
instance Functor (W b) where
    fmap f (W a) = W $ map (\(a,p) -> (f a,p)) a

instance Num b => Applicative (W b) where
  pure x = W [(x,1)]
  (W fs) <*> (W xs) = W [(f x, a * b) | (f, a) <- fs, (x, b) <- xs] 

instance Num b => Monad (W b) where
   return x = W [(x,1)]
   l >>= f = W $ concatMap (\(W d,p) -> map (\(x,q)->(x,p*q)) d) (runW $ fmap f l)

a .* b = mapW (a*) b

instance (Eq a,Show a,Num b) => Num (W b a) where
     W a + W b = W $ (a ++ b)
     a - b = a + (-1) .* b
     _ * _ = error "Num is annoying"
     abs _ = error "Num is annoying"
     signum _ = error "Num is annoying"
     fromInteger a = if a==0 then W [] else error "fromInteger can only take zero argument"

collect :: (Ord a,Num b) => W b a -> W b a
collect = W . Map.toList . Map.fromListWith (+) . runW

trimZero = W . filter (\(k,v) -> not $ nearZero v) . runW
simplify = trimZero . collect
-- filter (not . nearZero . snd)

type P a = W Double a

type Q a = W (Complex Double) a


The Vector monad factors out the linear piece of a computation. Because of this factoring, the type constrains the mapping to be linear, in a similar way that monads in other contexts might guarantee no leaking of impure computations. This is pretty handy. The function you give to bind correspond to a selector columns of the matrix.

We need some way to zoom into a subtrees and then apply operations. We define the operations lmap and rmap.

lmap :: (forall a. FibTree a b -> Q (FibTree a c)) -> (FibTree e (b,d) -> Q (FibTree e (c,d)))
lmap f (ITT l r) = fmap (\l' -> ITT l' r) (f l)
lmap f (TTI l r) = fmap (\l' -> TTI l' r) (f l)
lmap f (TIT l r) = fmap (\l' -> TIT l' r) (f l)
lmap f (TTT l r) = fmap (\l' -> TTT l' r) (f l)
lmap f (III l r) = fmap (\l' -> III l' r) (f l)

rmap :: (forall a. FibTree a b -> Q (FibTree a c)) -> (FibTree e (d,b) -> Q (FibTree e (d,c)))
rmap f (ITT l r) = fmap (\r' -> ITT l r') (f r)
rmap f (TTI l r) = fmap (\r' -> TTI l r') (f r)
rmap f (TIT l r) = fmap (\r' -> TIT l r') (f r)
rmap f (TTT l r) = fmap (\r' -> TTT l r') (f r)
rmap f (III l r) = fmap (\r' -> III l r') (f r)

You reference a node by the path it takes to get there from the root. For example,  `(rmap . lmap . rmap) f` applies f at the node that is at the right-left-right position down from the root.


For Fibonacci anyons, the only two non trivial braidings happen when you braid two \tau particles. 

braid :: FibTree a (l,r) -> Q (FibTree a (r,l))
braid (ITT l r) = W [(ITT r l,  cis $ 4 * pi / 5)] 
braid (TTT l r) = W [(TTT r l,  (cis $ - 3 * pi / 5))]
braid (TTI l r) = pure $ TIT r l
braid (TIT l r) = pure $ TTI r l
braid (III l r) = pure $ III r l

-- The inverse of braid
braid' :: FibTree a (l,r) -> Q (FibTree a (r,l))
braid' = star . braid

We only have defined how to braid two particles that were split directly from the same particle. How do we describe the braiding for the other cases? Well we need to give the linear transformation for how to change basis into other tree structures. Then we have defined braiding for particles without the same immediate parent also.


We can transform to a new basis. where the histories differs by association. We can braid two particles by associating the tree until they are together. An association move does not change any of the outgoing leaf positions. It can, however, change a particle in an interior position. We can apply an F-move anywhere inside the tree, not only at the final leaves.

fmove :: FibTree a (c,(d,e)) -> Q (FibTree a ((c,d),e))
fmove (ITT  a  (TIT b c)) = pure $ ITT ( TTI  a b) c
fmove (ITT  a  (TTT b c)) = pure $ ITT ( TTT  a b) c
fmove (ITT  a  (TTI b c)) = pure $ III ( ITT  a b) c

fmove (TIT  a  (TTT b c)) = pure $ TTT ( TIT  a b) c
fmove (TIT  a  (TTI b c)) = pure $ TTI ( TIT  a b) c
fmove (TIT  a  (TIT b c)) = pure $ TIT ( III  a b) c

fmove (TTI  a  (III b c)) = pure $ TTI ( TTI  a b) c
fmove (TTI  a  (ITT b c)) = W [(TIT ( ITT  a b) c, tau)         , (TTT ( TTT  a b) c, sqrt tau)]

fmove (TTT  a  (TTT b c)) = W [(TIT ( ITT  a b) c, sqrt tau)  ,   (TTT ( TTT  a b) c, - tau   )]
fmove (TTT  a  (TTI b c)) = pure $ TTI ( TTT  a b) c
fmove (TTT  a  (TIT b c)) = pure $ TTT ( TTI  a b) c 

fmove (III  a  (ITT b c)) = pure $ ITT ( TIT  a b) c
fmove (III  a  (III b c)) = pure $ III ( III  a b) c

fmove' :: FibTree a ((c,d),e) -> Q (FibTree a (c,(d,e)))
fmove' (ITT ( TTI  a b) c) = pure $ (ITT  a  (TIT b c))
fmove' (ITT ( TTT  a b) c) = pure $  (ITT  a  (TTT b c))
fmove' (ITT ( TIT  a b) c) = pure $  (III  a  (ITT b c))

fmove' (TTI ( TTT  a b) c) = pure $ (TTT  a  (TTI b c))
fmove' (TTI ( TTI  a b) c) = pure $ (TTI  a  (III b c))
fmove' (TTI ( TIT  a b) c) = pure $ TIT  a  (TTI b c)

fmove' (TTT ( TTI  a b) c ) = pure $ TTT  a  (TIT b c)
fmove' (TTT ( TIT  a b) c ) = pure $ TIT  a  (TTT b c)
fmove' (TTT ( TTT  a b) c) = W [(TTI  a  (ITT b c), sqrt tau)  , (TTT  a  (TTT b c),   - tau  )]

fmove' (TIT ( ITT  a b) c) = W [(TTI  a  (ITT b c), tau)         , (TTT  a  (TTT b c) , sqrt tau)]
fmove' (TIT ( III  a b) c ) = pure $ TIT  a  (TIT b c)

fmove' (III ( III  a b) c ) = pure $ III  a  (III b c)
fmove' (III ( ITT  a b) c


Fusion / Dot product

Two particles that split can only fuse back into themselves. So the definition is pretty trivial. This is like \hat{e}_i \cdot \hat{e}_j = \delta_{ij}.

dot :: FibTree a (b, c) -> FibTree a' (b, c) -> Q (FibTree a' a)
dot x@(TTI _ _) y@(TTI _ _) | x == y = pure TLeaf
                             | otherwise = mempty
dot x@(TIT _ _) y@(TIT _ _) | x == y = pure TLeaf
                             | otherwise = mempty
dot x@(TTT _ _) y@(TTT _ _) | x == y = pure TLeaf
                            | otherwise = mempty
dot x@(III _ _) y@(III _ _) | x == y = pure ILeaf
                            | otherwise = mempty
dot x@(ITT _ _) y@(ITT _ _) | x == y = pure ILeaf
                            | otherwise = mempty
dot _ _ = mempty

Hexagon and Pentagon equations

The F and R matrices and the fusion rules need to obey consistency conditions called the hexagon and pentagon equations. Certain simple rearrangements have alternate ways of being achieved. The alternative paths need to agree.

pentagon1 ::  FibTree a (e,(d,(c,b))) -> Q (FibTree a (((e,d),c),b))
pentagon1 v =  do 
                 v1 <- fmove v
                 fmove v1

pentagon2 :: FibTree a (b,(c,(d,e))) -> Q (FibTree a (((b,c),d),e))
pentagon2 v = do
                v1 :: FibTree a (b,((c,d),e)) <- rmap fmove v
                v2 :: FibTree a ((b,(c,d)),e) <- fmove v1
                lmap fmove v2

ex1 = TTT TLeaf (TTT TLeaf (TTT TLeaf TLeaf))
-- returns empty
pentagon =  simplify $ ((pentagon1 ex1) - (pentagon2 ex1))

hexagon1 :: FibTree a (b,(c,d)) -> Q (FibTree a ((d,b),c))
hexagon1 v = do
             v1 :: FibTree a ((b,c),d) <- fmove v
             v2 :: FibTree a (d,(b,c)) <- braid v1
             fmove v2  

hexagon2 :: FibTree a (b,(c,d)) -> Q (FibTree a ((d,b),c))
hexagon2 v = do
             v1 :: FibTree a (b,(d,c)) <- rmap braid v
             v2 :: FibTree a ((b,d),c) <- fmove v1
             lmap braid v2

ex2 = (TTT TLeaf (TTT TLeaf TLeaf))
--returns empty
hexagon =  simplify $ ((hexagon1 ex2) - (hexagon2 ex2))

Next Time:

With this, we have the rudiments of what we need to describe manipulation of anyon spaces. However, applying F-moves manually is rather laborious. Next time we’ll look into automating this using arcane type-level programming. You can take a peek at my trash WIP repo here


A big ole review on topological quantum computation:
Ady Stern on The fractional quantum hall effect and anyons:


Another good anyon tutorial:

Mathematica program that I still don’t get, but is very interesting:

Kitaev huge Paper:

Bonderson thesis:

Bernevig review:

More food for thought:

The Rosetta Stone:



Solving the Ising Model using a Mixed Integer Linear Program Solver (Gurobi)

I came across an interesting thing, that finding the minimizer of the Ising model is encodable as a mixed integer linear program.

The Ising model is a simple model of a magnet. A lattice of spins that can either be up or down. They want to align with an external magnetic field, but also with their neighbors (or anti align, depending on the sign of the interaction). At low temperatures they can spontaneously align into a permanent magnet. At high temperatures, they are randomized. It is a really great model that contains the essence of many physics topics.

Linear Programs minimize linear functions subject to linear equality and inequality constraints. It just so happens this is a very solvable problem (polynomial time).

MILP also allow you to add the constraint that variables take on integer values. This takes you into NP territory. Through fiendish tricks, you can encode very difficult problems. MILP solvers use LP solvers as subroutines, giving them clues where to search, letting them step early if the LP solver returns integer solutions, or for bounding branches of the search tree.

How this all works is very interesting (and very, very roughly explained), but barely matters practically since other people have made fiendishly impressive implementations of this that I can’t compete with. So far as I can tell, Gurobi is one of the best available implementations (Hans Mittelman has some VERY useful benchmarks here, and they have a gimped trial license available (2000 variable limit. Bummer.). Shout out to CLP and CBC, the Coin-Or Open Source versions of this that still work pretty damn well.

Interesting Connection: Quantum Annealing (like the D-Wave machine) is largely based around mapping discrete optimization problems to an Ising model. We are traveling that road in the opposite direction.

So how do we encode the Ising model?

Each spin is a binary variable s_i \in {0,1}

We also introduce a variable for every edge. which we will constrain to actually be the product of the spins. e_{ij} \in {0,1}. This is the big trick.

We can compute the And/Multiplication (they coincide for 0/1 binary variables) of the spins using a couple linear constraints. I think this does work for the 4 cases of the two spins.

e_{ij} \ge s_i +s_j - 1

e_{ij} \le s_j

e_{ij} \le s_i

The xor is usually what we care about for the Ising model, we want aligned vs unaligned spins to have different energy. It will have value 1 if they are aligned and 0 if they are anti-aligned. This is a linear function of the spins and the And.

s_i \oplus s_j = s_i + s_j - 2 e_{ij}

Then the standard Hamiltonian is

H=\sum B_i s_i + \sum J_{ij} (s_i + s_j - 2 e_{ij})

Well, modulo some constant offset. You may prefer making spins \pm 1, but that leads to basically the same Hamiltonian.

The Gurobi python package actually let’s us directly ask for AND constraints, which means I don’t actually have to code much of this.

We are allowed to use spatially varying external field B and coupling parameter J. The Hamiltonian is indeed linear in the variables as promised.

After already figuring this out, I found this chapter where they basically do what I’ve done here (and more probably). There is nothing new under the sun. The spatially varying fields B and J are very natural in the field of spin glasses.

For a while I thought this is all we could do, find the lowest energy solution, but there’s more! Gurobi is one of the few solvers that support iteration over the lowest optimal solutions, which means we can start to talk about a low energy expansion.

Here we’ve got the basic functionality. Getting 10,000 takes about a minute. This is somewhat discouraging when I can see that we haven’t even got to very interesting ones yet, just single spin and double spin excitations. But I’ve got some ideas on how to fix that. Next time baby-cakes.

(A hint: recursion with memoization leads to some brother of a cluster expansion.)


from gurobipy import *
import matplotlib.pyplot as plt
import numpy as np

# Create a new model
m = Model("mip1")

m.Params.PoolSearchMode = 2
m.Params.PoolSolutions = 10000

# Create variables
N = 10
spins = m.addVars(N,N, vtype=GRB.BINARY, name='spins')
links = m.addVars(N-1,N-1,2, vtype=GRB.BINARY, name='links')

xor = {}
B = np.ones((N,N)) #np.random.randn(N,N)
J = 1. #antialigned
H = 0.
for i in range(N-1):
	for j in range(N-1):
		#for d in range(2)
		m.addGenConstrAnd(links[i,j,0], [spins[i,j], spins[i+1,j]], "andconstr")
		m.addGenConstrAnd(links[i,j,1], [spins[i,j], spins[i,j+1]], "andconstr")
		xor[i,j,0] = spins[i,j] + spins[i+1,j] - 2*links[i,j,0]
		xor[i,j,1] = spins[i,j] + spins[i,j+1] - 2*links[i,j,1]
		H += J*xor[i,j,0] + J*xor[i,j,1]
for i in range(N):
	#m.addGenConstrAnd(links[i,N-1,0], [spins[i,N-1], spins[i+1,j]], "andconstr")
	#m.addGenConstrAnd(links[N-1,j,1], [spins[i,j], spins[i,j+1]], "andconstr")
	#m.addGenConstrAnd(links[i,N-1,1], [spins[i,N-1], spins[i,0]], "andconstr")
	#m.addGenConstrAnd(links[N-1,j,0], [spins[i,j], spins[i,j+1]], "andconstr")
	for j in range(N):
		H += B[i,j]*spins[i,j]
		#B[i,j] = 1.
#quicksum([2*x, 3*y+1, 4*z*z])


#x = m.addVar(vtype=GRB.BINARY, name="x")
#y = m.addVar(vtype=GRB.BINARY, name="y")
# = m.addVar(vtype=GRB.BINARY, name="z")

# Set objective
m.setObjective(H, GRB.MINIMIZE)

# Add constraint: x + 2 y + 3 z <= 4
#m.addConstr(x + 2 * y + 3 * z <= 4, "c0")

# Add constraint: x + y >= 1
#m.addConstr(x + y >= 1, "c1")


#for v in m.getVars():
#    print(v.varName, v.x)

print('Obj:', m.objVal)
print('Solcount:', m.SolCount)
for i in range(m.SolCount):
	m.Params.SolutionNumber = i #set solution numbers
	print("sol val:", m.Xn)
	print("sol energy:", m.PoolObjVal)

ising = np.zeros((N,N))
for i in range(N):
	for j in range(N):
		ising[i,j] = spins[i,j].Xn



Here’s the ground antiferromagnetic state. Cute.





Variational Method of the Quantum Simple Harmonic Oscillator using PyTorch

A fun method (and useful!) for solving the ground state of the Schrodinger equation is to minimize the energy integral dx \psi^\dagger H \psi while keeping the total probability 1. Pytorch is a big ole optimization library, so let’s give it a go.

I’ve tried two versions, using a stock neural network with relus and making it a bit easier by giving a gaussian with variable width and shift.

We can mimic the probability constraint by dividing by to total normalization \int dx \psi^\dagger \psi. A Lagrange multiplier or penalty method may allows us to access higher wavefunctions.

SGD seems to do a better job getting a rounder gaussian, while Adam is less finicky but makes a harsh triangular wavefunction.

The ground state solution of -\frac{d^2\psi}{dx^2} + x^2\psi=E\psi is e^{-x^2/2}, with an energy of 1/2 (unless I botched up my head math). We may not get it, because we’re not sampling a very good total domain. Whatever, for further investigation.

Very intriguing is that pytorch has a determinant in it, I believe. That opens up the possibility of doing a Hartree-Fock style variational solution.

Here is my garbage

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim
from scipy import linalg
import time

import torch.nn as nn
import torch.nn.functional as F

class Psi(nn.Module):
    def __init__(self):
        super(Psi, self).__init__()

        # an affine operation: y = Wx + b
        self.lin1 = nn.Linear(1, 10) #Takes x to the 10 different hats
        self.lin2 = nn.Linear(10, 1) #add up the hats.
        #self.lin1 = nn.Linear(1, 1) #Takes x to the 10 different hats
        #self.lin2 = nn.Linear(2, 1) #add up the hats.
    def forward(self, x):
        # Max pooling over a (2, 2) window
        shifts = self.lin1(x)
        hats =  F.relu(shifts) #hat(shifts)hat(shifts)
        y = self.lin2(hats)
        #y = torch.exp(- shifts ** 2 / 4)
        return y

#a = torch.ones(10, requires_grad=True)
#z = torch.linspace(0, 1, steps=10)

# batch variable for monte carlo integration
x = torch.randn(10000,1, requires_grad=True) # a hundred random points between 0 and 1

psi = Psi()
y = psi(x)
import torch.optim as optim

# create your optimizer

optimizer =  optim.SGD(psi.parameters(), lr=0.0001, momentum=0.9, nesterov=True)
#y2 = torch.sin(np.pi*x)
#x2 = x.clone()
plt.scatter(x.detach().numpy(),y.detach().numpy(), label="original")
scalar = torch.ones(1,1)
for i in range(4000):
    #y.backward(torch.ones(100,1), create_graph=True)
    x = torch.randn(1000,1, requires_grad=True) # a hundred random points between 0 and 1

    y = psi(x)

    y.backward(torch.ones(1000,1), create_graph=True)
    #torch.autograd.backward(torch.sum(y), grad_tensors=x)
    #print(dir(x)) #x.grad ** 2 +
    E = torch.sum(x.grad ** 2 + x**2 * y**2)#+ 10*(psi(scalar*0)**2 + psi(scalar)**2)
    N = torch.sum(y ** 2)
    L = E/N
for param in psi.parameters():
plt.scatter(x.detach().numpy(),y.detach().numpy(), label="new")

# may want to use the current wavefunction for gibbs style sampling
# we need to differentiate with respect to x for the kinetic energy

Edit: Hmm I didn’t compensate for the fact I was using randn sampling. That was a goof. I started using unifrom sampling, which doesn’t need compensation



Annihilation Creation with Wick Contraction in Python

Trying an alternative approach to dealing with the algebra of qft computationally based on wick contraction. We can easily find an algorithm that finds all possible pairs. Then we just reduce it all accordingly. Some problems: The combinatorics are going to freak out pretty fast.

I think my use of functional stuff like maps and lambdas is intensely unclarifying the code.

import numpy as np

#Needs to return a list of lists of pairs
def pair(mylist):
    #Base case
    if len(mylist) == 2:
        return [[(mylist[0], mylist[1])]]
    #popoff the first element
    element1 = mylist[0]
    pairs = []
    for i in range(1,len(mylist)):
        #Pick one
        element2 = mylist[i]
        #get all the pairs expluidng the already picked pair
        subpairs = pair(mylist[1:i] + mylist[i+1:])
        #Put the picked pair back in x is a list of pairs.
        pairs = pairs + map(lambda x: [(element1,element2)] + x ,subpairs)
    return pairs

#For Fermionic pairing, it might be conveneitn to extend pairs to (element1,element2,+-1)
#With the sign depending on wheter i is even or odd

pairing = pair([1,2,3,4])
print pairing
print len(pairing)

#Here's an idea, I can use a and adag as objects with indices
#and return functions (two point green's functions)
# or could return matrcies that are discretized green's functions.
def contract(twoguys):
    if twoguys[0] == 'a' and twoguys[1] == 'adag':
        return 1
        return 0

#Confusing, but what
pairings = pair(['a','a','a','adag','adag','adag'])
print pairings
#could abstract out multiplication as a function passed in. Then if contraction returns functions, I could
#return a new function that multiplies the inner functions
#or could replace multiply with if returning matrices
def multiplyup(pairing):
    return reduce(lambda acc, val: acc * val,map(contract,pairing))

print sum(map(multiplyup,pairings))


Attaching the Jordan Wigner String in Numpy

Just a fast (fast to write, not fast to run) little jordan wigner string code

import numpy as np
from numpy import kron, identity
import numpy.linalg as la

# sigma functions

sigma_x = np.array([[0, 1],[1, 0]])
sigma_y = np.array([[0, -1j],[1j, 0]])
sigma_z = np.array([[1, 0],[0, -1]])

# standard basis

spin_up = np.array([[1],[0]])
spin_down = np.array([[0],[1]])

# spin ladder operators

sigma_plus = sigma_x + 1j * sigma_y
sigma_minus = sigma_x - 1j * sigma_y

# pauli spin

N = 3
def chainify(mat, pos):
    if pos == 0:
        newmat = mat
        newmat = identity(2)
    for j in range(1,N):
        if j == pos:
            newmat = kron(newmat,mat)
            newmat = kron(newmat,identity(2))
    return newmat

def sx(i):
    return chainify(sigma_x,i)
def sy(i):
    return chainify(sigma_y,i)
def sz(i):
    return chainify(sigma_z,i)
def sp(i):
    return chainify(sigma_plus,i)
def sm(i):
    return chainify(sigma_minus,i)

#print sz(0)
#print sz(1)
#print sz(2)

# sp sm = 2 + 2 sz
#print,sm(0))- 2*identity(2**N) - 2*sz(0)

I = identity(2**N)

fdag = lambda i: sp(i)/2
f = lambda i: sm(i)/2

def stringify(mat, pos):
    if pos == 0:
        newmat = mat
        newmat = sigma_z
    for j in range(1,N):
        if j == pos:
            newmat = kron(newmat,mat)
        elif j<pos:
            newmat = kron(newmat,sigma_z)
            newmat = kron(newmat,identity(2))
    return newmat

def cdag(i):
    return np.mat(stringify(sigma_plus/2, i))

def c(i):
    return np.mat(stringify(sigma_minus/2, i))

#print,c(1)) +,cdag(1)) # This is 1
#print,c(2)) +,cdag(1)) # This is 0

#It does appear to work.

print cdag(1)*c(1) + c(1)*cdag(1) # This is 1
print cdag(1)*c(2) + c(2)*cdag(1) # This anticommutator is 0.

What fun!