Unification in Julia

Unification is a workhorse of symbolic computations. Comparing two terms (two syntax trees with named variable slots), we can figure out the most general substitution for the variables that makes them syntactically match.

It is a sister to pattern matching, but it has an intrinsic bidirectional flavor that makes it feel more powerful and declarative.

Unification can be implemented efficiently (not that I have done so yet) with some interesting variants of the disjoint set / union-find data type.
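As a side note, here is a minimal sketch of union-find with path halving, the data structure behind those efficient unifiers. It is purely illustrative; the implementation below just uses a plain Dict for the substitution.

# Union-find over integer ids: variables that get unified end up in the same class.
struct UF
    parent::Vector{Int}
end
UF(n::Int) = UF(collect(1:n))

function find(u::UF, i::Int)
    while u.parent[i] != i
        u.parent[i] = u.parent[u.parent[i]] # path halving
        i = u.parent[i]
    end
    return i
end

link!(u::UF, i, j) = (u.parent[find(u, i)] = find(u, j); u)

u = UF(3)
link!(u, 1, 2)
find(u, 1) == find(u, 2) # true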

  • The magic of Prolog is basically built-in unification + backtracking search.
  • The magic of polymorphic type inference in Haskell and OCaml comes from unification of type variables.
  • Part of the magic of SMT solvers using the theory of uninterpreted functions is unification.
  • Automatic and interactive theorem provers have unification built in somewhere.

To describe terms I made simple data types, modelled on those in SymbolicUtils (I probably should just use the definitions in SymbolicUtils, but I was trying to keep it simple).

#variables
struct Sym
    name::Symbol
end

struct Term
    f::Symbol
    arguments::Array{Any} # Array{Union{Term,Sym}} faster/better?
end
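So, for example, the term p(X, a) looks like this, using the Prolog-ish convention (introduced with the macro below) that lowercase names are constants, i.e. zero-argument terms:

# p(X, a): X is a variable, a is a constant (a zero-argument Term)
ex = Term(:p, [Sym(:X), Term(:a, [])])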

The implementation by Norvig and Russell for their AI book is an often-copied, simple implementation of unification. It is small and kind of straightforward: you travel down the syntax trees, and when you hit variables you try to put them into your substitution dictionary. Although, like anything that touches substitution, it can be easy to get wrong. See Norvig's note linked below.

I used multiple dispatch as a kind of pattern matching on algebraic data types, to distinguish whether the arguments are terms or variables. It's kind of nice, but unclear to me whether it is obscenely slow or not. This is not a high-performance implementation of unification in any case.

# does variable x occur in term y, chasing bindings through the substitution s?
occur_check(x::Sym, y::Term, s) = any(occur_check(x, a, s) for a in y.arguments)

function occur_check(x::Sym, y::Sym, s)
    if x == y
        return true
    elseif haskey(s, y)
        return occur_check(x, s[y], s)
    else
        return false
    end
end


function unify(x::Sym, y::Union{Sym,Term}, s)
    if x == y
        return s
    elseif haskey(s, x)
        return unify(s[x], y, s)
    elseif haskey(s, y) # this is the Norvig twist
        return unify(x, s[y], s)
    elseif occur_check(x, y, s)
        return nothing
    else
        s[x] = y
        return s
    end
end

unify(x::Term, y::Sym, s) = unify(y,x,s)

function unify(x :: Term, y :: Term, s)
    if x.f == y.f && length(x.arguments) == length(y.arguments)
        for (x1, y1) in zip(x.arguments, y.arguments)
            if isnothing(unify(x1, y1, s))
                return nothing
            end
        end
        return s
    else
        return nothing
    end
end

unify(x,y) = unify(x,y,Dict())
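As a quick usage sketch, the occurs check is what stops a variable from unifying with a term containing itself:

unify(Sym(:X), Term(:f, [Sym(:X)])) # nothing: X cannot unify with f(X)
unify(Sym(:X), Term(:a, []))        # Dict(Sym(:X) => Term(:a, Any[]))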

I also made a small macro for converting simple Julia expressions into my representation. It uses the Prolog convention that names starting with a capital letter are variables.

function string2term(x)
    if x isa Symbol
        name = String(x)
        if isuppercase(name[1])
            return Sym(x)
        else
            return Term(x, [])
        end
    elseif x isa Expr
        @assert(x.head == :call)
        arguments = [string2term(y) for y in x.args[2:end]]
        return Term(x.args[1], arguments)
    end
end

macro string2term(x)
    return :( $(string2term(x)) )
end

print(unify( @string2term(p(X,g(a), f(a, f(a)))) , @string2term(p(f(a), g(Y), f(Y, Z)))))
# Dict{Any,Any}(Sym(:X) => Term(:f, Any[Term(:a, Any[])]),Sym(:Y) => Term(:a, Any[]),Sym(:Z) => Term(:f, Any[Term(:a, Any[])]))

Links

Unification: A Multidisciplinary Survey by Kevin Knight https://kevincrawfordknight.github.io/papers/unification-knight.pdf

https://github.com/roberthoenig/FirstOrderLogic.jl/tree/master/src A Julia project for first-order logic that also has a unification implementation, among other things

An interesting category-theoretic perspective on unification: What is Unification? by Goguen http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.3615

There is also a slightly hidden implementation in SymPy (it does not appear in the docs?) http://matthewrocklin.com/blog/work/2012/11/01/Unification https://github.com/sympy/sympy/tree/master/sympy/unify

PyRes https://github.com/eprover/PyRes/blob/master/unification.py

Norvig unify
https://github.com/aimacode/aima-python/blob/9ea91c1d3a644fdb007e8dd0870202dcd9d078b6/logic4e.py#L1307

Norvig – a widespread error in unification algorithms
http://norvig.com/unify-bug.pdf

Efficient unification note
ftp://ftp.cs.indiana.edu/pub/techreports/TR242.pdf

blog post
https://eli.thegreenplace.net/2018/unification/

Efficient representations for triangular substitutions
https://users.soe.ucsc.edu/~lkuper/papers/walk.pdf

Conor McBride – first-order unification by structural recursion, using dependent types
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=880725E316FA5E3540EFAD83C0C2FD88?doi=10.1.1.25.1516&rep=rep1&type=pdf

Z3 unifier – an example of an actually performant unifier
https://github.com/Z3Prover/z3/blob/520ce9a5ee6079651580b6d83bc2db0f342b8a20/src/ast/substitution/unifier.cpp

Warren Abstract Machine Tutorial Reconstruction http://wambook.sourceforge.net/wambook.pdf

Handbook of Automated Reasoning – has a chapter on unification

Higher Order Unification – LambdaProlog, Miller unification

Syntax trees with variables in them are a way to represent sets of terms (possibly infinite sets!). In that sense we can ask whether it is possible to form the union or intersection of these sets. The intersection is described by the most general unifier. The union is not expressible via a single term with variables in general; we can only over-approximate it, like how the union of convex sets is not necessarily convex, although its hull is. This is a join on a term lattice, and computing it is the process of anti-unification.
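As a hedged sketch (my own helper, not part of the unifier above), anti-unification / least general generalization can be computed by walking the two trees together and inventing a shared variable wherever they disagree:

# Anti-unification sketch: disagreeing pairs of subterms get a shared
# fresh variable, memoized in s so repeated disagreements reuse it.
function antiunify(x, y, s=Dict())
    if x isa Term && y isa Term && x.f == y.f && length(x.arguments) == length(y.arguments)
        return Term(x.f, [antiunify(a, b, s) for (a, b) in zip(x.arguments, y.arguments)])
    else
        return get!(s, (x, y), Sym(Symbol(:V, length(s) + 1)))
    end
end

antiunify(@string2term(f(a, b)), @string2term(f(b, b)))
# Term(:f, Any[Sym(:V1), Term(:b, Any[])])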

What about the complement of these sets? Not really. With the representation we've chosen, we can't have an interesting negation. What about the difference of two sets?

I had an idea a while back about programming with relations, where I laid out some interesting combinators. I represented only finite relations, as those can be easily enumerated.

Walk on Spheres Method in Julia

I saw a cool tweet (and corresponding conference paper) by Keenan Crane

http://www.cs.cmu.edu/~kmcrane/Projects/MonteCarloGeometryProcessing/index.html

I was vaguely aware that one can use a Monte Carlo method to solve the boundary value problem for the Laplace equation \nabla^2 \phi = 0, but I don't think I had seen the walk on spheres variant of it before. I think Crane's point is how similar all this is to stuff graphics people already do and do well. It's a super cool paper. Check it out.

Conceptually, I think it is plausible that the Laplace equation and a Monte Carlo walk are related, because the static diffusion equation \nabla^2 n = 0 from Fick's law ultimately comes from the Brownian motion of little guys wobbling about from a microscopic perspective.

Slightly more abstractly, both linear differential equations and random walks can be described by matrices: a finite difference matrix K (for concreteness) and a transition matrix of jump probabilities T. The differential equation is discretized to Kx = b, and the stationary probability distribution obeys Tp = b, where b holds the sources and sinks at the boundary.
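As a concrete sketch of the Kx = b side (my own toy discretization, using the same boundary values \phi(0) = 0, \phi(1) = 10 as the code below):

using LinearAlgebra
n = 9 # interior grid points on (0, 1)
K = SymTridiagonal(fill(-2.0, n), fill(1.0, n - 1)) # finite difference Laplacian
b = zeros(n); b[end] = -10.0 # boundary value phi(1) = 10 moved to the right-hand side
phi = K \ b # recovers the straight line 1.0, 2.0, ..., 9.0 between the boundary values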

The mean value property of the Laplace equation allows one to speed this process up. Instead of taking a ton of little steps, you can just walk out by randomly sampling on the surface of big spheres: https://en.wikipedia.org/wiki/Walk-on-spheres_method. Alternatively, you can think of it as: every random walk eventually exits a sphere, and it exits at a uniformly random spot on it.
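Concretely, in 2-d the property says that \phi at the center of any disc contained in the domain equals its average over the bounding circle, which is why sampling uniformly on the circle is an unbiased move:

\phi(x) = \frac{1}{2 \pi r} \oint_{|y - x| = r} \phi(y) \, ds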

So here's the procedure. Pick a point you want the value of \phi at. Make the biggest sphere you can that stays inside the domain. Pick a random point on that sphere. If the point is on the boundary, record the boundary value there; otherwise iterate. Do this many, many times; the average of the boundary values you recorded is the value of \phi.

This seems like a good example for Julia use. It would be somewhat difficult to code this up efficiently in Python using vectorized numpy primitives. Maybe in the future we could try to parallelize it or run it on the GPU? Monte Carlo methods like these are quite parallelizable (a threaded sketch follows the 1-d example below).

The solution of the 1-d Laplace equation is absolutely trivial. If the second derivative is 0, then \phi = a + b x. This line is found by fitting it to the two endpoint values.

So we're gonna get a line out.

phi0 = 0   # boundary value at x = 0
phi1 = 10  # boundary value at x = 1

function monte_run(x)
    while true
        l = rand(Bool) # go left?
        if l && x <= 0.5 # the biggest interval reaches the left edge: exit at 0
            return phi0
        elseif !l && x >= 0.5 # the biggest interval reaches the right edge: exit at 1
            return phi1
        elseif x <= 0.5 # going right: jump by the interval radius x
            x += x
        else # going left: jump by the interval radius 1 - x
            x -= 1 - x
        end
    end
end

monte_runs = [monte_run(x) for run_num = 1:100, x = 0:0.05:1]

import Statistics
avgs = vec(Statistics.mean(monte_runs, dims=1))
stddevs = vec(Statistics.std(monte_runs, dims=1)) ./ sqrt(size(monte_runs)[1]) # something like this right?

using Plots
plot(0:0.05:1, avgs, yerror=stddevs)
plot!(0:0.05:1, (0:0.05:1) * 10)

And indeed we do.
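Since each run is independent, parallelizing is nearly free. A minimal threaded sketch (assuming Julia is started with several threads, e.g. julia -t 4):

# Hedged sketch: distribute independent runs over threads.
results = zeros(1000)
Threads.@threads for i in 1:1000
    results[i] = monte_run(0.75)
end
Statistics.mean(results) # should approach the exact value 7.5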

You can do a very similar thing in 2d. Here I use the boundary values on a disc corresponding to x^2 - y^2 (which is a simple exact solution of the Laplace equation).



using LinearAlgebra # for norm

function monte_run_2d(phi_b, x)
    while true
        r = norm(x) # distance from the center of the disc
        if r > 0.95 # close enough to the boundary: stop
            return phi_b(x)
        else
            dr = 1.0 - r # radius of the biggest disc around x, assuming the unit disc domain
            θ = 2 * pi * rand(Float64) # uniform random angle
            x[1] += dr * cos(θ) # jump to a random point on that disc's boundary
            x[2] += dr * sin(θ)
        end
    end
end


monte_run_2d( x -> x[1],  [0.0 0.0] )


monte_runs = [monte_run_2d(x -> x[1]^2 - x[2]^2 ,  [x 0.0] ) for run_num =1:1000, x=0:0.05:1 ]

import Statistics
avgs = vec(Statistics.mean( monte_runs , dims=1))
stddevs = vec(Statistics.std(monte_runs, dims=1)) ./ sqrt(size(monte_runs)[1]) # something like this right?
plot(0:0.05:1, avgs, yerror=stddevs)
plot!(0:0.05:1,  (0:0.05:1) .^2 )

There’s more notes and derivations in my notebook here https://github.com/philzook58/thoughtbooks/blob/master/monte_carlo_integrate.ipynb

Sum of Squares optimization for Minimax Optimal Differential Eq Residuals

Huh. This doesn’t embed very well. Maybe you’re better off just clicking into the thing. It’s nice not to let things rot too long though. *shrug*

Other ideas: Can I not come up with some scheme to use Sum of Squares for rigorous upper and lower bound regions like in https://github.com/JuliaIntervals/TaylorModels.jl ? Maybe a next post.

Bouncing a Ball with Mixed Integer Programming

Edit: A new version.

Here I made a bouncing ball using mixed integer programming in cvxpy. Currently we are just simulating the bouncing ball inside a mixed integer program. We could turn this into a control program by adding the constraint that the ball must pass through a hoop and having it figure out the appropriate initial shooting velocity.

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

N = 100
dt = 0.05

x = cvx.Variable(N)
v = cvx.Variable(N)
collision = cvx.Variable(N-1,boolean=True)

constraints = []
M = 20 # Big M trick

#initial conditions
constraints += [x[0] == 1, v[0] == 0]

for t in range(N-1):
    predictedpos = x[t] + v[t] * dt
    col = collision[t]
    notcol = 1 - collision[t]
    constraints += [ -M * col <= predictedpos ,  predictedpos <= M * notcol]
    #enforce regular dynamics if col == 0
    constraints +=  [  - M * col <= x[t+1] - predictedpos, x[t+1] - predictedpos  <= M * col   ] 
    constraints +=  [  - M * col <= v[t+1] - v[t] + 9.8*dt, v[t+1] - v[t] + 9.8*dt  <= M * col   ]

    # reverse velocity, keep position the same if it would collide with the wall at x = 0
    constraints +=  [  - M * notcol <= x[t+1] - x[t], x[t+1] - x[t]  <= M * notcol   ] 
    constraints +=  [  - M * notcol <= v[t+1] + 0.8*v[t], v[t+1] + 0.8*v[t]  <= M * notcol   ] #0.8 restitution coefficient

objective = cvx.Maximize(1)
 
prob = cvx.Problem(objective, constraints)
res = prob.solve(solver=cvx.GLPK_MI, verbose=True)
print(x.value)
print(v.value)
plt.plot(x.value, label='x')
plt.plot(v.value, label= 'v')
plt.plot(collision.value, label = 'collision bool')
plt.legend()
plt.xlabel('time')
plt.show()

Pretty cool.

The trick I used this time is to make boolean indicator variables for whether a collision will happen or not. The big-M trick is then used to make the indicator actually reflect whether the predicted position would be inside the wall at x = 0. If it wouldn't, regular gravity dynamics apply. If it would, velocity-reversing bounce dynamics apply.
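Spelled out: provided M is chosen larger than any value |predictedpos| can take, the pair of constraints -M \cdot col \le predictedpos \le M \cdot (1 - col) encodes

col = 0 \Rightarrow 0 \le predictedpos \le M, \qquad col = 1 \Rightarrow -M \le predictedpos \le 0

so the boolean is pinned to the sign of the predicted position (either value is feasible exactly at predictedpos = 0).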


Just gonna dump this draft out there since I’ve moved on (I’ll edit this if I come back to it). You can embed collisions in mixed integer programming.  I did it below using a strong acceleration force that turns on when you enter the floor. What this corresponds to is a piecewise linear potential barrier.

Such a formulation might be interesting for the trajectory optimization of shooting a hoop, playing Pachinko, Beer Pong, or Pinball.

using JuMP
using Cbc
using Plots
N = 50
T = 5
dt = T/N

m = Model(solver=CbcSolver()) # old JuMP (pre-0.19) solver syntax

@variable(m, x[1:N]) # , Bin
@variable(m, v[1:N]) # , Bin
@variable(m, f[1:N-1])
@variable(m, a[1:N-1], Bin) # , Bin


@constraint(m, x[1] == 1)
@constraint(m, v[1] == 0)
M = 10
for t in 1:N-1
	@constraint(m, x[t+1] == x[t] + dt*v[t])
	@constraint(m, v[t+1] == v[t] + dt*(10*(1-a[t])-1)) # gravity -1, plus a strong upward barrier force when a[t] == 0
	#@constraint(m, v[t+1] == v[t] + dt*(10*f[t]-1))
	@constraint(m, M * a[t] >= x[t+1]) # a[t] == 1 if the next position is above the floor
	@constraint(m, M * (1-a[t]) >= -x[t+1]) # a[t] == 0 if the next step projects into the earth
	#@constraint(m, f[t] <=  M*(1-a[t])) # we allow a bouncing force
end

k = 10
# @constraint(m, f .>= 0)
# @constraint(m, f .>= - k * x[2:N])

# @constraint(m, x[:] .>= 0)


E = 1 #sum(f) # 1 #sum(x) #sum(f) # + 10*sum(x) # sum(a)

@objective(m, Min, E)
solve(m)

println(x)
println(getvalue(x))
plotly() 
plot(getvalue(x))
#plot(getvalue(a))
gui()

More things to consider:

Is this method trash? Yes. You can actually embed the mirror law of collisions directly without needing to use a funky barrier potential.

You can extend this to a ball trapped in a polygon, or a ball that is restricted from entering obstacle polygons. Check out the IRIS project – break the region up into convex regions.

https://github.com/rdeits/ConditionalJuMP.jl gives good support for embedding conditional variables.

https://github.com/joehuchette/PiecewiseLinearOpt.jl On a related note, gives a good way of defining piecewise linear functions using Mixed Integer programming.

Pajarito is another interesting Julia project: a mixed integer convex programming solver.

Russ Tedrake papers – http://groups.csail.mit.edu/locomotion/pubs.shtml

Break up obstacle objects into Delaunay-triangulated pieces.

www.mit.edu/~jvielma/presentations/MINLPREPSOLJUL_NORTHE18.pdf