Solving the Ising Model using a Mixed Integer Linear Program Solver (Gurobi)

I came across an interesting thing, that finding the minimizer of the Ising model is encodable as a mixed integer linear program.

The Ising model is a simple model of a magnet. A lattice of spins that can either be up or down. They want to align with an external magnetic field, but also with their neighbors (or anti align, depending on the sign of the interaction). At low temperatures they can spontaneously align into a permanent magnet. At high temperatures, they are randomized. It is a really great model that contains the essence of many physics topics.

Linear Programs minimize linear functions subject to linear equality and inequality constraints. It just so happens this is a very solvable problem (polynomial time).

MILP also allow you to add the constraint that variables take on integer values. This takes you into NP territory. Through fiendish tricks, you can encode very difficult problems. MILP solvers use LP solvers as subroutines, giving them clues where to search, letting them step early if the LP solver returns integer solutions, or for bounding branches of the search tree.

How this all works is very interesting (and very, very roughly explained), but barely matters practically since other people have made fiendishly impressive implementations of this that I can’t compete with. So far as I can tell, Gurobi is one of the best available implementations (Hans Mittelman has some VERY useful benchmarks here, and they have a gimped trial license available (2000 variable limit. Bummer.). Shout out to CLP and CBC, the Coin-Or Open Source versions of this that still work pretty damn well.

Interesting Connection: Quantum Annealing (like the D-Wave machine) is largely based around mapping discrete optimization problems to an Ising model. We are traveling that road in the opposite direction.

So how do we encode the Ising model?

Each spin is a binary variable s_i \in {0,1}

We also introduce a variable for every edge. which we will constrain to actually be the product of the spins. e_{ij} \in {0,1}. This is the big trick.

We can compute the And/Multiplication (they coincide for 0/1 binary variables) of the spins using a couple linear constraints. I think this does work for the 4 cases of the two spins.

e_{ij} \ge s_i +s_j - 1

e_{ij} \le s_j

e_{ij} \le s_i

The xor is usually what we care about for the Ising model, we want aligned vs unaligned spins to have different energy. It will have value 1 if they are aligned and 0 if they are anti-aligned. This is a linear function of the spins and the And.

s_i \oplus s_j = s_i + s_j - 2 e_{ij}

Then the standard Hamiltonian is

H=\sum B_i s_i + \sum J_{ij} (s_i + s_j - 2 e_{ij})

Well, modulo some constant offset. You may prefer making spins \pm 1, but that leads to basically the same Hamiltonian.

The Gurobi python package actually let’s us directly ask for AND constraints, which means I don’t actually have to code much of this.

We are allowed to use spatially varying external field B and coupling parameter J. The Hamiltonian is indeed linear in the variables as promised.

After already figuring this out, I found this chapter where they basically do what I’ve done here (and more probably). There is nothing new under the sun. The spatially varying fields B and J are very natural in the field of spin glasses.

For a while I thought this is all we could do, find the lowest energy solution, but there’s more! Gurobi is one of the few solvers that support iteration over the lowest optimal solutions, which means we can start to talk about a low energy expansion.

Here we’ve got the basic functionality. Getting 10,000 takes about a minute. This is somewhat discouraging when I can see that we haven’t even got to very interesting ones yet, just single spin and double spin excitations. But I’ve got some ideas on how to fix that. Next time baby-cakes.

(A hint: recursion with memoization leads to some brother of a cluster expansion.)


from gurobipy import *
import matplotlib.pyplot as plt
import numpy as np

# Create a new model
m = Model("mip1")

m.Params.PoolSearchMode = 2
m.Params.PoolSolutions = 10000

# Create variables
N = 10
spins = m.addVars(N,N, vtype=GRB.BINARY, name='spins')
links = m.addVars(N-1,N-1,2, vtype=GRB.BINARY, name='links')

xor = {}
B = np.ones((N,N)) #np.random.randn(N,N)
J = 1. #antialigned
H = 0.
for i in range(N-1):
	for j in range(N-1):
		#for d in range(2)
		m.addGenConstrAnd(links[i,j,0], [spins[i,j], spins[i+1,j]], "andconstr")
		m.addGenConstrAnd(links[i,j,1], [spins[i,j], spins[i,j+1]], "andconstr")
		xor[i,j,0] = spins[i,j] + spins[i+1,j] - 2*links[i,j,0]
		xor[i,j,1] = spins[i,j] + spins[i,j+1] - 2*links[i,j,1]
		H += J*xor[i,j,0] + J*xor[i,j,1]
for i in range(N):
	#m.addGenConstrAnd(links[i,N-1,0], [spins[i,N-1], spins[i+1,j]], "andconstr")
	#m.addGenConstrAnd(links[N-1,j,1], [spins[i,j], spins[i,j+1]], "andconstr")
	#m.addGenConstrAnd(links[i,N-1,1], [spins[i,N-1], spins[i,0]], "andconstr")
	#m.addGenConstrAnd(links[N-1,j,0], [spins[i,j], spins[i,j+1]], "andconstr")
	for j in range(N):
		H += B[i,j]*spins[i,j]
		#B[i,j] = 1.
#quicksum([2*x, 3*y+1, 4*z*z])


#x = m.addVar(vtype=GRB.BINARY, name="x")
#y = m.addVar(vtype=GRB.BINARY, name="y")
# = m.addVar(vtype=GRB.BINARY, name="z")

# Set objective
m.setObjective(H, GRB.MINIMIZE)

# Add constraint: x + 2 y + 3 z <= 4
#m.addConstr(x + 2 * y + 3 * z <= 4, "c0")

# Add constraint: x + y >= 1
#m.addConstr(x + y >= 1, "c1")


#for v in m.getVars():
#    print(v.varName, v.x)

print('Obj:', m.objVal)
print('Solcount:', m.SolCount)
for i in range(m.SolCount):
	m.Params.SolutionNumber = i #set solution numbers
	print("sol val:", m.Xn)
	print("sol energy:", m.PoolObjVal)

ising = np.zeros((N,N))
for i in range(N):
	for j in range(N):
		ising[i,j] = spins[i,j].Xn



Here’s the ground antiferromagnetic state. Cute.





OSQP and Sparsegrad: Fast Model Predictive Control in Python for an inverted pendulum

OSQP is a quadratic programming solver that seems to be designed for control problems. It has the ability to warm start, which should make it faster.

I heard about it in this Julia talk

They have some really cool technology over there in Julia land.

The problem is setup as a sequential quadratic program. I have a quadratic cost to try to bring the pendulum to an upright position.

The equations of motion are roughly




We don’t include the back reaction of the pole on the cart. It is basically irrelevant for our systems and overly complicated for no reason. The interesting bit is the nonlinear dynamics and influence of the cart acceleration.

I write down obeying the equations of motion as a linear constraint between timesteps. I use sparsegrad to get a sparse Jacobian of the equations of motion. The upper and lower (l and u) bounds are set equal to make an equality constraint.

Our setup has a finite track about a meter long. Our motors basically are velocity control and they can go about +-1m/s. Both of these can be encapsulated as linear constraints on the position and velocity variables. l \le Ax \le u


\phi(x) \approx \phi(x_0) + \partial \phi(x_0) (x - x_0)

A= \partial \phi(x_0)

l=u=\partial \phi(x_0) x_0 - \phi_0(x_0)

Similarly for finding the quadratic cost function in terms of the setpoint x_s. \frac{1}{2}x^T P x + q^Tx= \frac{1}{2}(x-x_s)P(x-x_s) Expanding this out we get

q = - Px_s

I run for multiple iterations to hopefully converge to a solution (it does). The initial linearization might be quite poor, but it gets better.

The OSQP program seems to be able to run in under 1ms. Nice! Initial tests of our system seem to be running at about 100Hz.

Could probably find a way to make those A matrices faster than constructing them entirely anew every time. We’ll see if we need that.

I very inelegantly smash all the variables into x. OSQP and scipy sparse don’t support multiIndex objects well, best as I can figure. Vectors should be single dimensioned and matrices 2 dimensioned.

Still to be seen if it works on hardware. I was already having infeasibility problems. Adjusting the regularization on P seemed to help.


import sparsegrad.forward as forward
import numpy as np
import osqp
import scipy.sparse as sparse
import matplotlib.pyplot as plt
import time

#def f(x):
#	return x**2

N = 1000
NVars  = 5
T = 5.0
dt = T/N
dtinv = 1./dt
#Px = sparse.eye(N)
#sparse.csc_matrix((N, N)) 
# The three deifferent weigthing matrices for x, v, and external force
reg = sparse.eye(N)*0.01
# sparse.diags(np.arange(N)/N)
P = sparse.block_diag([reg,reg ,sparse.eye(N), 1*reg,1*reg])
q = np.zeros((NVars, N))
q[THETA,:] = np.pi
#q[N,0] = -2 * 0.5 * 10
q = q.flatten()
q= -P@q
#u = np.arr

x = np.random.randn(N,NVars).flatten()
#x = np.zeros((N,NVars)).flatten()
#v = np.zeros(N)
#f = np.zeros(N)


g = 9.8
L = 0.5
gL = g * L
m = 1.0 # doesn't matter
I = L**2 / 3
Iinv = 1.0/I

def constraint(var, x0, v0, th0, thd0):
	#x[0] -= 1
	x = var[:N]
	v = var[N:2*N]
	theta = var[2*N:3*N]
	thetadot = var[3*N:4*N]
	a = var[4*N:5*N]
	dynvars = (x,v,theta,thetadot)
	xavg, vavg, thetavg, thdotavg = map(lambda z: (z[0:-1]+z[1:])/2, dynvars)
	dx, dv, dthet, dthdot = map(lambda z: (z[1:]-z[0:-1])*dtinv, dynvars)
	vres = dv - a[1:]
	xres = dx - vavg
	torque = -gL*np.sin(thetavg) + a[1:]*L*np.cos(thetavg)
	thetdotres = dthdot - torque*Iinv
	thetres = dthet - thdotavg

	return x[0:1]-x0, v[0:1]-v0, theta[0:1]-th0, thetadot[0:1]-thd0, xres,vres, thetdotres, thetres
	#return x[0:5] - 0.5

def getAlu(x, x0, v0, th0, thd0):
	gt = np.zeros((2,N))
	gt[0,:] = 0.1 # x is greaer than 0
	gt[1,:] = -1 #veclotu is gt -1m/s
	gt = gt.flatten()
	lt = np.zeros((2,N))
	lt[0,:] = 0.8
	lt[1,:] = 1 # velocity less than 1m/s
	lt = lt.flatten()

	z = sparse.bsr_matrix((N, N))
	ineqA = sparse.bmat([[sparse.eye(N),z,z,z,z],[z,sparse.eye(N),z,z,z]]) #.tocsc()
	cons = constraint(forward.seed_sparse_gradient(x), x0, v0, th0, thd0)
	A = sparse.vstack(map(lambda z: z.dvalue, cons)) #  y.dvalue.tocsc()
	totval = np.concatenate(tuple(map(lambda z: z.value, cons)))
	temp = A@x - totval

	A = sparse.vstack((A,ineqA)).tocsc()

	#print(tuple(map(lambda z: z.value, cons)))
	u = np.concatenate((temp, lt))
	l = np.concatenate((temp, gt))
	return A, l, u

#A = y.dvalue.tocsc()
#sparse.hstack( , format="csc")
A, l, u = getAlu(x, 0.5, 0, 0, 0)
m = osqp.OSQP()
m.setup(P=P, q=q, A=A, l=l, u=u) #  **settings
results = m.solve()

start = time.time()
iters = 100
for i in range(iters):
	A, l, u = getAlu(results.x, 0.5, 0, 0, 0)
	m.update(, l=l, u=u)
	results = m.solve()
end = time.time()
print("Solve in :", iters / (end-start) ,"Hz")

plt.plot(results.x[:N], label="x")
plt.plot(results.x[N:2*N], label="v")
plt.plot(results.x[2*N:3*N], label="theta")
plt.plot(results.x[3*N:4*N], label="thetadot")
plt.plot(results.x[4*N:5*N], label="force")
#m.update(q=q_new, l=l_new, u=u_new)


Approximating Compiling to Categories using Type-level Haskell: Take 2

The previous episode is here .

Summary: I’m trying to use typelevel programming in Haskell to achieve some of the aims of Conal Elliott’s compiling to categories GHC plugin. The types of highly polymorphic tuple functions are enough to specify the implementation. We aren’t going to be able to piggy-back off of GHC optimizations (a huge downside), but we can reify lambdas into other categories and avoid the scariness of plugins.

The current implementation github source is here 


Of course, Oleg already did it. This is a file where he builds the implementation of a polymorphic function from the type signature. Instead of tuples, he is focusing on higher order functions with deep nesting of (->).

The trick that I was missing is in the IsFunction typeclass at the very end, which is only achievable as a Incoherent instances.

I would never have had the courage to use an Incoherent instance if I hadn’t seen a higher authority do it first. It has turned out in my typelevel journey that many instances that I’ve been tempted to make overlapping or incoherent don’t actually have to be, especially with the availability of closed type families. I think you really do need Incoherent in this application because type variables stay polymorphic forever.

To the best of my knowledge, if you need to differentiate between a tuple type (a,b) and an uninstantiated polymorphic value a’ like we do when deconstructing the input type of our lambda, you need to use an Incoherent instance. Since a’ could hypothetically eventually be unified to some (a,b) we should not be able to do different things for the two cases without stepping outside the usual laws of typeclass resolution.

New features of the implementation:

  • The new implementation does not require the V annotation of the input like previous version by using the previously mentioned. This is super cool because now I can throw the stock Prelude.fst into toCcc.
  • I changed how the categorical implementation is built, such that it short circuits with an ‘id’ if a large structure is needed from the input, rather than deconstructing all the way to every piece of the input. Lots of other optimizations would be nice (vital?), but it is a start.
  • I also implemented a FreeCat GADT so that we can see the implementation in ghci.
  • I switched from using Data.Proxy to the type annotations extensions, which is a huge readability win.
  • I added a binary application operator binApp, which is useful for encapsulating categorical literals as infix operators into your lambda expressions.
  • General cleanup, renaming, and refactoring into library files.

A couple typelevel tricks and comments:

You often have to make helper typeclasses, so don’t be afraid to. If you want something like an if-then-else in your recursion, it is very likely you need a form of the typeclass that has slots for ‘True or ‘False to help you pick the instance.

If possible, you often want the form

(a~Something) => MyClass 'True a

rather than

Myclass 'True Something

The type inference tends to be better.

Here are some usage examples of toCcc.

example6 = toCcc (\x -> x) 'a'

example7 = toCcc @FreeCat (\(x, y) -> x)

example8 = toCcc @FreeCat (\(x, y) -> y)
example8andahalf = toCcc' (Proxy :: Proxy FreeCat) (\(x, y) -> y)
example9 = toCcc @FreeCat (\(x, y) -> (y,x)) 
example10 = toCcc @FreeCat (\(( x, z), y) -> (y,x))
swappo = toCcc @FreeCat $ \((x, z), y) -> (x,(z,y))

example11 = toCcc @(->) $ \(x,y) -> binApp addC x y
example12 = toCcc @(->) $ \(x,y) -> App negateC x

-- infix synonyms
(+++) = binApp addC
(***) = binApp mulC
example13 = toCcc @(->) $ \(x,(y,z)) -> x +++ (y *** z)

My partially implemented version of some of Conal’s category typeclasses. Should I switch over to using the constrained-categories package?

{-# LANGUAGE GADTs, StandaloneDeriving, NoImplicitPrelude  #-}

module Cat where
import Control.Category
import Prelude hiding ((.))

class Category k => Monoidal k where
    par :: k a c -> k b d -> k (a,b) (c,d)

class Monoidal k => Cartesian k where
    fst :: k (a,b) a 
    snd :: k (a,b) b 
    dup :: k a (a,a) 

fan f g = (par f g) . dup

data FreeCat a b where
    Comp :: FreeCat b c -> FreeCat a b -> FreeCat a c
    Id :: FreeCat a a
    Fst :: FreeCat (a,b) a
    Snd :: FreeCat (a,b) b
    Dup :: FreeCat a (a,a)
    Par :: FreeCat a b -> FreeCat c d -> FreeCat (a,c) (b,d)

deriving instance Show (FreeCat a b)

instance Category FreeCat where
    (.) = Comp
    id = Id

instance Monoidal FreeCat where
    par = Par

instance Cartesian FreeCat where
    fst = Fst
    snd = Snd
    dup = Dup

instance Monoidal (->) where
    par f g = \(x,y) -> (f x, g y)  

instance Cartesian (->) where
    fst (x,y) = x
    snd (x,y) = y
    dup x = (x,x)

class NumCat k where
    mulC :: Num a => k (a,a) a
    negateC :: Num a => k a a
    addC :: Num a => k (a,a) a

instance NumCat (->) where
    mulC = uncurry (*)
    negateC = negate
    addC = uncurry (+)


The actual implementation of toCcc

{-# LANGUAGE DataKinds, 
    ScopedTypeVariables #-}
module CCC (
    , App (App)
    , binApp
    , toCcc'
    , toCcc ) where

import Cat
import Data.Proxy 
import Data.Type.Equality (type (==))
import Prelude (Bool(..))
import Control.Category

-- The unfortunate Incoherent instances I need to force polymorphic values
class IsTup a b | a -> b
instance {-# INCOHERENT #-} (c ~ 'True) => IsTup (a,b) c
instance {-# INCOHERENT #-} (b ~ 'False) => IsTup a b

data Leaf n = Leaf

data Z
data S a

data App f a = App f a

f $$ x = App f x
binApp f a b = App f (a,b)

class Cartesian k => CCC k a a' b' | a a' -> b' where
   toCcc :: a -> k a' b'
instance (Tag a,
         Build k a b a' b',
         Cartesian k)
    => CCC k (a->b) a' b' where
        toCcc f = build @k @a @b @a' @b' res where  -- build (Proxy :: Proxy labels) (Proxy :: Proxy b) res where
                res = f val 

toCcc' :: CCC k f a' b' => Proxy k -> f -> k a' b' 
toCcc' _ f = toCcc f

class Tag a where
    val :: a

instance (IsTup a flag, Tag' a Z flag n) => Tag a where
    val = val' @a @Z @flag

class Tag' a n (flag :: Bool) n'| a n flag -> n' where
    val' :: a

instance (IsTup a flaga,
          IsTup b flagb,
          Tag' a n flaga n'',
          Tag' b n'' flagb n') => Tag' (a,b) n 'True n'  where
    val' = (val' @a @n @flaga, val' @b @n'' @flagb)

instance (a ~ Leaf n) => Tag' a n 'False (S n)  where
    val' = Leaf

type family Or a b where
    Or 'True b = 'True
    Or a 'True = 'True
    Or 'False 'False = 'False

class In a b flag | a b -> flag where
instance (
    ((a,b) == c) ~ isc, 
    flag' ~ Or flaga flagb, 
    flag ~ Or flag' isc,
    In a c flaga, 
    In b c flagb) => In (a,b) c flag
instance ((Leaf n == c) ~ flag) => In (Leaf n) c flag

class Build k input key a' b' | input key a' -> b' where
   build :: Cartesian k => key -> k a' b'

instance ( 
    iseq ~ ((a,b) == key),
    In a key isinleft,
    In b key isinright,
    Cond k iseq isinleft isinright (a,b) key a' b'
    ) => Build k (a,b) key a' b' where
    build key = cond @k @iseq @isinleft @isinright @(a,b) @key @a' key

instance (Leaf n ~ b, a' ~ b') => Build k (Leaf n) b a' b' where
    build _ = id

class Cond k iseq isinleft isinright input key a b | iseq isinleft isinright input key a -> b where
    cond :: Cartesian k => key -> k a b
-- Find the key is in the input
instance (a ~ b) => Cond k 'True x x input key a b where
    cond _ = id
instance (Build k a key a' c', (a',b') ~ ab) => Cond k 'False 'True x (a,b) key ab c' where -- get those input types inferred baby!
    cond key = (build @k @a @key @a' key) . fst
instance (Build k b key b' c', (a',b') ~ ab) => Cond k 'False 'False 'True (a,b) key ab c' where
    cond key = (build @k @b @key @b' key) . snd

-- Otherwise destruct on the key
instance (Build k input key1 a' c', 
          Build k input key2 a' d') => Cond k 'False 'False 'False input (key1,key2) a' (c',d') where
    cond (key1,key2) = fan (build @k @input @key1 @a' key1) (build @k @input @key2 @a' key2)

instance (Build k input key a' b',
    f ~ k b' c')
 => Cond k 'False 'False 'False input (App f key) a' c' where
    cond (App f key) = f . (build @k @input @key @a' key)

-- Could I replace almost everything with App? A very powerful construct
-- This is a of some relation to defunctionalization like in HList
-- Maybe I should build a typelevel FreeCat and then do compilation passes on it

type family (StripLeaf a) where
    StripLeaf (a,b) = (StripLeaf a, StripLeaf b)
    StripLeaf (Leaf n a) = a