## More Stupid Z3Py Tricks: Simple Proofs

Z3 can be used for proofs. The input language isn’t anywhere near as powerful as interactive theorem provers like Coq, Isabelle, or Agda, but you can ask Z3 to prove pretty interesting things. Although the theorems that follow aren’t hard in interactive theorem provers, they would take beyond complete novice level skills to state or prove.

I like to think of the z3 proving process as “failing to find a counterexample”. Z3py has supplies a function `prove` which is implemented like this.

Basically, it negates the thing you want to prove. It then tries to find a way to instantiate the variables in the expression to make the statement false. If it comes back unsat, then there is no variable assignment that does it. Another way to think about this is rewriting the $\forall y. p(y)$ as $\neg \exists y \neg p (y)$. The first $\neg$ lives at sort of a meta level, where we consider unsat as a success, but the inner $\neg$ is the one appearing in `s.add(Not(claim))`.

We can prove some simple facts. This is still quite cool, let’s not get too jaded. Manually proving these things in Coq does suck (although is easy if you use the ring, psatz, and lra tactics https://coq.inria.fr/refman/addendum/micromega.html, which you DEFINITELY should. It is a great irony of learning coq that you cut your teeth on theorems that you shouldn’t do by hand).

Ok, here’s our first sort of interesting example. Some properties of even and odd numbers. Even and Odd are natural predicates. What are possible choices to represent predictaes in z3?
We can either choose python functions `IntSort -> BoolSort()` as predicates or we can make internal z3 functions `Function(IntSort(), BoolSort())`

All well and good, but try to prove facts about the multiplicative properties of even and odd. Doesn’t go through. 🙁

Here’s a simple inductive proof. Z3 can do induction, but you sort of have to do it manually, or with a combinator. Given a predicate f, inductionNat returns

Here’s another cute and stupid trick. Z3 doesn’t have a built in sine or cosine. Perhaps you would want to look into dreal if you think you might be heavily looking into such things. However, sine and cosine are actually defined implicitly via a couple of their formula. So we can instantiate
A slightly counterintuitive thing is that we can’t use this to directly compute sine and cosine values. That would require returning a model, which would include a model of sine and cosine, which z3 cannot express.
However, we can try to assert false facts about sine and cosine and z3 can prove they are in fact unsatisfiable. In this way we can narrow down values by bisection guessing. This is very silly.

A trick that I like to use sometimes is embedding objects in numpy arrays. Numpy slicing is the best thing since sliced bread. A lot, but not all, of numpy operations come for free, like matrix multiply, dot, sum, indexing, slicing, reshaping. Only some are implemented in terms of overloadable operations. here we can prove the Cauchy Schwartz inequality for a particular vector and some axioms of vector spaces.

Defining and proving simple properties of Min and Max functions

Proving the Babylonian method for calculating square roots is getting close to the right answer. I like the to think of the Babylonian method very roughly this way: If your current guess is low for the square root x/guess is high. If your guess is high, x/guess is low. So if you take the average of the two, it seems plausible you’re closer to the real answer. We can also see that if you are precisely at the square root, (x/res + x)/2 stays the same. Part of the the trick here is that z3 can understand square roots directly as a specification. Also note because of python overloading, `babylonian` with work on regular numbers and symbolic z3 numbers. We can also prove that babylon_iter is a contractive, which is interesting in it’s own right.

A funny thing we can do is define interval arithmetic using z3 variables. Interval arithmetic is very cool. Checkout Moore’s book, it’s good. This might be a nice way of proving facts related to real analysis. Not sure.
This is funny because z3 internally uses interval arithmetic. So what we’re doing is either very idiotically circular or pleasantly self-similar.
We could use a similar arrangement to get complex numbers, which z3 does not natively support

## Stupid Z3Py Tricks: Verifying Sorting Networks off of Wikipedia

Sorting networks are a circuit flavored take on sorting. Although you can build circuits for any size input, any particular circuit works for a fixed sized input. They are like an unrolling of the loops or recursion of more familiar sorting algorithms. They come up also in the context of parallel and gpu sorting

Here’s an interesting thing. We can go to Wikipedia and get a little python snippet for the comparison order of a Batcher odd-even mergesort. Kind of a confusing algorithm. Why does it even work? Is it even right? It’s written in some kind of funky, indexful generator style.

Well we can confirm this relatively straightforwardly using z3 by replacing the implementation of compare_and_swap with its z3 equivalent. We then ask z3 .

This comes back unsat, hence there are no inputs or executions that do not come back sorted. If I delete some elements from pair_to_compare, it comes back sat, showing that it doesn’t always sort.

The trick here is that the circuit is fixed size, so we have no need for induction, one of the main things z3 is rather finicky at.

It’s somewhat interesting to note that the output of odd_even_merge is a sequence of instructions, we can think of this as interpreting a very small 1 instruction programming language.

We can also confirm similarly a simple odd-even bubblesort and other similar algorithms.

Q: What about using uninterpreted sorts rather than integers? Integers is pretty convincing to me.

same_elems is slightly weaker than a permutation predicate. Wasn’t super obvious to me the best way to do a permutation predicate in z3. Would I want to internalize the array?

Edit: Upon further thought, actually the sort IS a nice predicate for permutation. How do we compute if two things are permutations of each other? By sorting them and forcing a zipped equality. Alternatively count the number of each element (a piece of bucket sort). Since this sort is done by composing swaps, it is somewhat intrinsically a permutation

As a bummer though, I think randomized testing on arrays would be equally or perhaps more convincing of the correctness of the algorithm. Oh well.

## Programming and Interactive Proving With Z3Py

I’ve been fiddling with z3py, figuring out some functionality and realizing some interesting things you could do with it. I think I’m at a point where it is nice to checkpoint myself with a blog post.

I’m a little surprised z3py doesn’t overload the & and | operators and some kind of implies operator for BoolRef. You can insert them later using this.

#### Functional Programming

Python is not the best functional programming environment imo. And by functional programming I implicitly mean roughly ML-like FP a la Haskell or OCaml. I don’t venture much into lisp land.

The lack of good algebraic datatypes (the class syntax is so ungainly) and a type system hurts. The lack of pattern matching hurts. The `lambda` keyword is so long it makes me sad.

But you have full access to z3 from the python bindings. Z3 does have algebraic data types, and a type system. It has built in substitution mechanisms and evaluation. And it has insane search procedures and the ability to prove things. Pretty damn cool!

Unfortunately the type system is rather simplistic, being basically simply typed rather than polymorphic or something else. But using python a a schema/macro system for z3 seems a plausible way forward.

To build templated types, you can have constructor functions in python for the appropriate types.

You can access the constructors from the returned types. Check this out. You get detector functions `is_Nothing` and `is_Just` , the extractor function `fromJust` and constructor functions `Nothing` and `Just`. I do a lot of `dir` exploration with z3py. It’s hard to know what’s available sometimes

It’s possible to build a general purpose match combinator on these types since you can introspect the number of constructors of the ADT using `num_constructors`, `constructor`, `recognizer`, and `accessor`. There might be a match inside z3py somewhere? I think it’s part of the SMTLIB standard now.

Example usage:

Z3 has a substitution mechanism built in. This is useful for instantiating `ForAll` and for evaluating `Lambda`. The `substitute_vars` function is what you want like so `substitute_vars(f.body(), x, y, z)`

It is possible to reflect the syntax in a fairly straightforward way back into python via a lambdify function, mimicking the equivalent very useful function from sympy. Lambdify is basically an `interp` function. Here is a start for such a function. I by no means have implemented interpretation of the entirety of z3. Also I feel like this implementation is very clunky. Some kind of CPS?

There is the ability to define recursive functions in z3. It is also plausible to define them via. In this way you can get a reversible functional programming language, maybe some subset of mercury / curry’s power.

### Interactive Theorem Proving

Z3 is awesome at thoerem proving. But somethings it just doesn’t handle right and needs human guidance.

Through searching, there are a couple interesting python interactive theorem prover projects. Cody pointed me to a project he worked on a while back, Boole https://github.com/avigad/boole . It has a dependently typed lambda calculus in it with the purpose of gluing together many systems, I think. He implemented a lot of stuff from scratch. I think I want to try to get less and do less. There is also holpy https://arxiv.org/abs/1905.05970 which appears to be being actively developed. It’s roughly a translation of hol to python I think. It’s available from a strange chinese github on the author’s website if you go looking for it.

This suggests an interesting approach. Most interactive theorem provers start unautomated and add it later. Instead we can iteratively build an interface to de-automate z3.

Altogether, this approach is more HOL flavored than Coq/Agda flavored. z3 terms are our logic and python is our manipulation metal language. Ideally, one would want to verify that every.

Python is so unprincipled that I can’t imagine that you could ever build a system up to the trustworthiness of the other theorem provers. But this is freeing in many ways. Since that is off the table, we can just do the best we can.

Using the z3 syntax tree and the z3 proof automation and z3 substitution mechanisms gives us a HUGE step up from implementing them from scratch. Ideally, we’d want to write as little python as possible, and especially as little python as possible that has to be trusted to be implemented correctly.

One big concern is accidental mutation of the proof under our feet by python. Perhaps using hashes and checking them might be a way to at least detect this. I need to have a good think about how to factor out a trusted core from all possible tactics.

I think it helps a little that z3 often will be able to verify the equivalence of small steps in proofs even if it can’t do the entire proof itself.

I think induction principles will need to be injected by hand. Z3 doesn’t really have that built in. There are definitely situations that after you introduce the induction, z3 can slam all the cases no problem. For example, check this out.

Another thing that might be nice is integration/translation to sympy. Sympy has a ton of useful functionality, at the very least differentiation.

Translation and integration with cvxpy for sum of squares proofs would also be quite neat. I already did something with this using sympy. I’m not super sure how you extract exact proofs from the floating point solutions SCS returns? I think there is a thing. I’ve heard the LLL algorithm can be used for this somehow (finding likely algebraic number matches to floating point numbers)?

So here are some sketched out ideas for tactics.

Another question is how to implement an apply tactic gracefully. Fully deconstructing syntax trees and unifying ourselves is not utilizing z3 well. If you have a good idea how to get unification out of z3, I’d be interested to hear from you here. https://stackoverflow.com/questions/59398955/getting-z3-instantiations-of-quantified-variables/59400838#59400838

Here’s an idea though. In the cold light of day, I am still not sure this reasoning makes much sense. Suppose we’re trying to apply forall x. a(x) -> b(x) to a c(y). If forall x. b(x) -> c(y) we’re good and by assumption that is obvious for some reason, like the syntactic instantiation of b gives c. We can ask z3 to prove that and it will hopefully easy. If we can prove forall x. a(x) in the current context, that would be sufficient, but not true typically. It is an overly difficult request. We really only need to prove a(x) for values pertinent to the proof of c(y). Here’s a suspicious strategem. Any a -> b can be weakened to (q -> a) -> (q -> b). In particular we can choose to weaken forall x. a(x) -> b(x) to forall x. ((c(y) -> b(x)) -> a(x)) -> ((c(y) -> b(x)) -> b(x)). Then we can replace the goal with forall x. ((c(y) -> b(x)) -> a(x)) after we prove that (forall x. (c(y) -> b(x)) -> b(x)) -> c(y). Maybe c(y) -> b(x) is sufficient to restrict the values of x? Not sure.

Another rough sketch of induction on Nat. Not right yet.

We could also make a simple induction for ADTs based on the similar introspection we used for `match` above. It’s ugly but I think it works.

I haven’t really though much about tacticals yet.

## Proving some Inductive Facts about Lists using Z3 python

Z3 is an SMT solver which has a good rep. Here are some excellent tutorials.

https://rise4fun.com/z3/tutorial

https://theory.stanford.edu/~nikolaj/programmingz3.html

http://homepage.cs.uiowa.edu/~ajreynol/VTSA2017/

SMT stands for satisfiability modulo theories. The exact nature of power of these kinds of solvers has been and is still hazy to me. I have known for a long time that they can slam sudoku or picross or other puzzles, but what about more infinite or logic looking things? I think I may always be hazy, as one can learn and discover more and more encoding tricks to get problems and proofs that you previously thought weren’t solvable into the system. It’s very similar to learning how to encode to linear programming solvers in that respect.

SMT solvers combine a general smart search procedure with a ton of specialized solvers for particular domains, like linear algebra, polynomials, linear inequalities and more.

The search procedure goes by the name DPLL(T). It is an adjustment of the procedure of SAT solvers, which are very powerful and fast. SAT solvers find an assignment of boolean variables in order to make a complicated boolean expression true, or to eventually find that it can never be made true. SAT solvers work on the principle of guessing and deducing. If `a OR b` needs to be true and we already know `a` is false, we can deduce `b` must be true. When the deduction process stalls, we just guess and then backtrack if it doesn’t work out. This is the same process you use manually in solving Sudoku.

The modern era of SAT solvers came when a couple new tricks vastly extended their power. In particular Conflict Driven Clause Learning (CDCL), where when the solver finds itself in a dead end, it figures out the choices it made that put it in the dead end and adds a clause to its formula so that it will never make those choices again.

https://sahandsaba.com/understanding-sat-by-implementing-a-simple-sat-solver-in-python.html

SMT works by now having the boolean variables of the SAT solver contain inner structure, like the boolean p actually represents the fact $x + y < 5$. During the search process it can take the pile of booleans that have been set to true and ask a solver (maybe a linear programming solver in this case) whether those facts can all be made true in the underlying theory. This is an extra spice on top of the SAT search.

Something that wasn’t apparent to me at first is how important the theory of uninterpreted formulas is to SMT solvers. It really is their bread and butter. This theory is basically solved by unification, which is the fairly intuitive process of finding assignments to variables to make a set of equations true. If I ask how to make $fred(x,4) = fred(7,y)$, obviously the answer is $y=4$, $x=7$. That is unification. Unification is a completely syntax driven way to deduce facts. It starts to give you something quite similar to first order logic.

https://eli.thegreenplace.net/2018/unification/

https://cs.mtsu.edu/~rbutler/courses/sdd/automated_deduction/unification.pdf

I was also under the impression that quantifiers $\forall, \exists$ were available but heavily frowned upon. I don’t think this is quite true. I think they are sort of a part of the entire point of the SMT solver now, although yes, they are rather flaky. There are a number of methods to deal with the quantifier, but one common one is to look for a pattern or parts of the subformula, and instantiate a new set of free variables for all of the quantified ones and add the theorem every time the patterns match. This is called E-matching.

Here are a couple tutorials on proving inductive facts in SMT solvers. You kind of have to hold their hand a bit.

http://lara.epfl.ch/~reynolds/vmcai15.pdf

http://homepage.divms.uiowa.edu/~ajreynol/pres-ind15.pdf

SMT solvers queries usually have the flavor of finding something, in this case a counterexample. The idea is that you try to ask for the first counterexample where induction failed. Assuming that proposition P was true for (n-1), find n such that P is not true. If you can’t find it, then the induction has gone through.

And here is some code where I believe I’m showing that some simple list functions like reverse, append, and length have some simple properties like $\forall t. rev (rev(t)) == t$.

## Gettin’ that Robot some Tasty Apples: Solving a simple geometrical puzzle in Z3 python

At work there is a monthly puzzler.

“Design a robot that can pick up all 9 apples arranged on a 3 by 3 rectangular grid, and spaced 1m apart. The robot parts they have are faulty. The robot can only turn three times”

I think the intent of the puzzle is that the robot can drive in forward and reverse, but only actually turn 3 times. It’s not very hard to do by hand. I decided to take a crack at this one using Z3 for funzies. Z3 is an SMT solver. It is capable of solving a interesting wide variety of problems.

I interpret this as “find 4 lines that touch all points in the grid, such that each subsequent line intersects.”

It is fairly easy to directly translate this into a Z3 model. If we ask z3 to use only 3 lines, it returns unsat. Try to prove that by hand.

However, If the robot is on the projective plane, it is possible with 3 lines. It only needs to drive to infinity and turn twice. All lines intersect exactly once on the projective plane. How convenient.

The problem only seems somewhat difficult to computerize because of the seemingly infinite nature of geometry. If we only consider the lines that touch at least two points, all possible robot paths becomes extremely enumerable. Is there a proof that we only need these lines?

Another interesting approach might be to note that the points are described by the set of equations $x*(x-1)*(x-2)=0$ and $y*(y-1)*(y-2)=0$. I think we could then possibly use methods of nonlinear algebra (Groebner bases) to find the lines. Roughly an ideal containment question? Don’t have this one fully thought out yet. I think z3 might be doing something like this behind the scenes.