# Egglog Examples: Pullbacks, SKI, Lists, and Arithmetic

Egglog is a bottom up style prolog written in Rust powered by egg egraph library. Because of this you get efficient equality as a native predicate. This enables some very powerful things.

Github repo here https://github.com/philzook58/egglog

Online wasm demo here http://www.philipzucker.com/egglog/

### New stuff

I’ve made some additions

- Made queries more like actual prolog queries. They return pattern variables. This actually was very easy. Queries use the same
`MultiPattern`

mechanism that the conjunctive bodies of rules used. I just print the returned substitutions. This however changes how you query for a simplification to`?- plus(x,zero) = T.`

which returns`[?T = x]`

. I return all the different eclasses available to the pattern, but do not print every possible instantiation of the bound eclasses. - I made
`MultiPattern`

an applier, so now you can have conjunctions in the heads of rules. This makes rules with multiple heads for the same bodies more efficient. - I made
`MultiPattern`

generic as an`Applier`

and`Searcher`

. In principle, this should perhaps be part of the main egg library as it is rather useful. - I added an
`:- include()`

directive to enable me to structure files. It does not work on the web version though, so that sort of cripples that - I switched to using
`clap`

to make my cli https://github.com/clap-rs/clap, which enables me to elegantly add more options as i see fit - I reorganized the code to make a repl possible and added a basic one using
`rustyline`

https://github.com/kkawakam/rustyline. I have barely used it however.

## Examples

I also started spending time just making examples, to understand what I can do and what the pain points are. Why build a thing if you don’t want to use it? Well, that’s a dishonest rhetorical question. I build things I have no use for all the time just for the hell of it. It feels good to build.

### Basic Arithmetic

You can get egglog to perform arithmetic calculations. An interesting aspect of this is you can just spray overcomplete axioms and you won’t hurt much. This is different from a prolog treatement of the same. You also do not have to resort to a peano representation. The naturals presented as a free ring is perfectly acceptable.

```
plus(X,Y) <- plus(Y,X).
plus(plus(X,Y),Z) <-> plus(X,plus(Y,Z)).
mul(X,Y) <- mul(Y,X).
mul(mul(X,Y),Z) <-> mul(X, mul(Y,Z)).
/* distributive */
mul(X,plus(Y,Z)) <-> plus(mul(X,Y),mul(X,Z)).
X <- mul(one, X).
X <- plus(zero, X).
plus(one,X) <- succ(X).
two = succ(one).
three = succ(two).
four = succ(three).
five = succ(four).
six = succ(five).
plus(mul(x, three), mul(four,x)).
plus(mul(x, three), mul(two, plus(one, mul(four,x)))).
mul(two,two).
?- mul(two,two) = plus(two,two).
?- mul(two,two) = plus(one, X).
?- mul(two,two) = Z.
?- plus(mul(x, three), mul(two, plus(one, mul(four,x)))) = Z.
```

### Datalog

To drive home the point that there is a strong similarity to datalog, here are some simple datalog examples. As is, egglog is very inefficient as a datalog, but a very intriguing question is which datalog implementation techniques are transferable to egglog

```
/* Straight from https://en.wikipedia.org/wiki/Datalog */
parent(xerces, brooke).
parent(brooke, damocles).
ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
?- ancestor(xerces, X).
/* https://www.stephendiehl.com/posts/exotic04.html */
/* All men are mortal / valar morghulis */
mortal(X) :- human(X).
/* Socrates is a man. */
human(socrates).
/* Is Socrates mortal? */
?- mortal(socrates).
/* https://souffle-lang.github.io/simple */
path(X, Y) :- edge(X, Y).
path(X, Y) :- path(X, Z), edge(Z, Y).
edge(a,b).
edge(b,c).
?- path(X,Y).
```

### SKI combinators.

The SKI combinators are a turing complete set of combinators. Using defunctionalization (defining an apply function) you can represent them in egglog. This is one possible avenue to having lambda calculus reasoning in an egraph. Pavel Panchekha reports that at scale this chokes the egraph with junk https://twitter.com/pavpanchekha/status/1425952544509947912?s=20

```
/*
Defunctionalized SKI Combinators
https://en.wikipedia.org/wiki/SKI_combinator_calculus
*/
apply(i,X) <-> i(X).
apply(k,X) <-> k(X).
apply(k(X),Y) <-> k(X,Y).
apply(s,X) <-> s(X).
apply(s(X),Y) <-> s(X,Y).
apply(s(X,Y),Z) <-> s(X,Y,Z).
X <- i(X).
Y <- k(X,Y).
apply(apply(X,Z),apply(Y,Z)) <- s(X,Y,Z).
k(i(k),i(i)).
?- k(i(k),i(i)) = A.
s(k,k,s).
?- s(k,k,s) = A.
```

### Lists

A classic prolog example is showing how to derive the possible lists that append together to create a third. Here I replicate that. Another very intriguing angle is to think about equational simplification of list combinators like map, fold, filter, repeat, etc.

```
list(nil).
list(cons(X,Y)) :- list(Y), cons(X,Y).
A <- car(cons(A,L)).
L <- cdr(cons(A,L)).
memb(A,cons(A,L)) :- cons(A,L).
/*
How should this work? Aliasing issues. We may discover K1 is K later.
lookup() <- lookup( K1, cons(kv(K,V),L) ), K1 != K .
V <- lookup( K, cons(kv(K,V),L) ).
*/
append(nil,Y) <-> Y.
append(Y,nil) <-> Y.
append(cons(X,Y),Z) <-> cons(X, append(Y,Z)).
rev(nil) = nil.
X <- rev(rev(X)).
rev(append(X,Y)) <-> append(rev(Y),rev(X)).
cons(x,cons(y,cons(z,nil))) = l.
?- append(X,Y) = l.
nil <- map(F,nil).
cons(apply(F,X), map(F,L)) <- map(F,cons(X,L)).
apply(F,apply(G,X)) <-> apply(comp(F,G), X).
map(F,map(G,L)) <-> map(comp(F,G),L).
nil <- filter(F,nil).
cons(X,filter(F,L)) <- filter( F, cons(X,L) ), apply(F,X) = true.
filter(F,L) <- filter( F, cons(X,L) ), apply(F,X) = false.
/* defunctionalization */
X <- apply(id, X).
/* What is id? a partial function over the union of all types? A relation? */
filter(id, cons(true,cons(false,nil))).
?- filter(id, cons(true,cons(false,nil))) = Res.
```

### The Uniqueness of Identity

A simple proof in category theory is the uniqueness of the identity. Here I encode this into egglog. These categorical examples make clear to me that being able to include libraries, logical syntax like forall, exists, and exists_unique would be very useful. They are a bit long winded in this form. Hey, I’m workin’ on it!

```
type(id(A)) = hom(A,A) :- ob = type(A).
/* ob = type(A) is probabnly slightly more efficient to search for than type(A) = ob */
F <- comp(id(A), F).
F <- comp(F, id(A)).
comp(F,id(B)) = F :- type(F) = hom(A,B).
comp(id(A),F) = F :- type(F) = hom(A,B).
/* associativity of composition */
comp(comp(F,G),H) <-> comp(F, comp(G,H)).
/* Composition exists if types work out */
type(comp(F,G)) = hom(A,C) :- hom(A,B) = type(F), hom(B,C) = type(G).
/* A supposed second identity for object a */
type(a) = ob.
type(id2(a)) = hom(a,a).
F <- comp(F,id2(a)).
F <- comp(id2(a),F).
comp(F,id2(a)) = F :- type(F) = hom(A,a).
comp(id2(a),F) = F :- type(F) = hom(a,B).
?- id2(a) = id(a).
/* sanity check. f should not be the identity a priori */
type(f) = hom(a,a).
?- f = id2(a).
```

### Composition of pullbacks is a pullback

If you paste together two pullback squares, the big square is also a pullback square.

```
/* https://proofwiki.org/wiki/Pullback_Lemma
https://ncatlab.org/nlab/show/pasting+law+for+pullbacks
*/
/* Standard categorical definitions */
type(id(A)) = hom(A,A) :- ob = type(A).
F <- comp(id(A), F).
F <- comp(F, id(A)).
comp(F,id(B)) = F :- type(F) = hom(A,B).
comp(id(A),F) = F :- type(F) = hom(A,B).
/* associativity of composition */
comp(comp(F,G),H) <-> comp(F, comp(G,H)).
comp(F,comp(G,H)) <-> comp(F,G,H).
/* Composition exists if types work out */
type(comp(F,G)) = hom(A,C) :- hom(A,B) = type(F), hom(B,C) = type(G).
/* Pullback definitions */
/* pullback is square */
comp(H,F) = comp(K,G) :- pullback(F,G,H,K).
/* universal morphism exists. Triangles commute
TODO: is univ a function of H1 K1?
*/
comp(univ(F,G,H,K),H1) = H,
comp(univ(F,G,H,K),K1) = K,
type(univ(F,G,H,K)) = hom(Z,E)
:- pullback(F,G,H1,K1), comp(H,F) = comp(K,G), type(F) = hom(A,B), type(H) = hom(Z,A), type(H1) = hom(E,A).
/* uniqueness of universal morphism */
univ(F,G,H,K) = U :- pullback(F,G,H1,K1), comp(H,F) = comp(K,G), H = comp(U,H1), K = comp(U,K1).
/*
a <-H- d p a1
F | | K Q
v v
c <-G -b J e
*/
/*
Ideally users don't have to fill out this table.
It is obnoxious, obvious, and error prone.
*/
type(a) = ob.
type(b) = ob.
type(c) = ob.
type(d) = ob.
type(e) = ob.
type(a1) = ob.
type(f) = hom(a,c).
type(g) = hom(b,c).
type(h) = hom(d,a).
type(k) = hom(d,b).
type(j) = hom(e,b).
type(p) = hom(a1,d).
type(q) = hom(a1,e).
pullback(f,g,h,k).
pullback(k,j,p,q).
/*
Is big square a pullback?
In some world it would be nice to reuse the above definition. I don't know how to do this
1. Does the square commute?
*/
?- comp(p,h,f) = comp(q,j,g).
/*
2: Given another square, is there a morphism that makes the triangles commute
*/
/*
-- r -- z
a <-H- d p a1
F | | K Q w |
v v
c <-G -b J e
*/
type(z) = ob.
type(r) = hom(z,a).
type(w) = hom(z,e).
comp(r,f) = comp(w,j,g). /* is square */
/* exists a morphism for which triangles commute */
?- comp(U,p,h) = r.
?- comp(U,q) = w.
/* 3: and it is unique?
Some questions here about how to phrase this.
Is this right? or am I positing that the require morphism already exists with this?
I think the uniqueness of the eclass actually might do it.
That's interesting.
ALso there might be a
(build morphism, write it down if you find it, now instanatiate it explcitly in this query) semanatics
?- u2 = univ(k,j, univ(f,g,r,comp(w,j)), w).
Eh that doesn't really matter does it?
Well it matters that it succeeded before I inserted this stuff maybe.
-? (comp p h f) = (comp q j g)
[];
-? (comp ?U p h) = r
[?U = (univ k j (univ f g r (comp w j)) w)];
-? (comp ?U q) = w
[?U = (univ k j (univ f g r (comp w j)) w)];
type(u2) = hom(z,a1).
comp(u2,comp(p,h)) = r.
comp(u2,q) = w.
?- u2 = U.
*/
```

## Good Features to Add

- Egraph output - a serialization of the egraph could be useful
- Graphviz output on the web https://github.com/hpcc-systems/hpcc-js-wasm/
- sexp/smtlib-ish input/output
- More logical connectives.
- set node limit and timeout on the command line
- Make a more logic-y flavored input language. Do some simple rearrangements like skolemaization and forall expansion. Harrop formula?
- Hashlog - what if I use just a hashcons to back a datalog? Seems like a helpful thought experiment to figure out how to semi naively evaluate. Combined with the souffle paper of union find equiavlance classes in datalog https://souffle-lang.github.io/pdf/pact2019eqrel.pdf
- The Chase - What is the relationship with the chase? https://en.wikipedia.org/wiki/Chase_(algorithm) https://en.wikipedia.org/wiki/Embedded_dependency