Relational Algebra with Fancy Types

Last time, I tried to give a primer of relations and relational algebra using the Haskell type type Rel a b = [(a,b)]. In this post we’re going to look at these ideas from a slightly different angle. Instead of encoding relations using value level sets, we’ll encode relations in the type system. The Algebra of Programming Agda repo and the papers quoted therein are very relevant, so if you’re comfortable wading into those waters, give them a look. You can find my repo for fiddling here

At this point, depending on what you’ve seen before, you’re either thinking “Yeah, sure. That’s a thing.” or you’re thinking “How and why the hell would you do such a ridiculous thing.”

Most of this post will be about how, so let’s address why first:

  1. Examining relations in this style illuminates some constructions that appear around the Haskell ecosystem, particularly some peculiar fellows in the profunctor package.
  2. More syntactic approach to relations allows discussion of larger/infinite domains. The finite enumerations of the previous post is nice for simplicity, but it seems you can’t get far that way.
  3. Mostly because we can – It’s a fun game. Maybe a useful one? TBD.

With that out of the way, let’s go on to how.

Translating Functions to Relation GADTs

We will be using some Haskell extensions in this post, at the very least GADTs and DataKinds. For an introduction to GADTs and DataKinds, check out this blog post. DataKinds is an extension that reflects every data constructor of data types to a type constructor. Because there are values True and False there are corresponding types created'True and 'False. GADTs is an extension of the type definition mechanism of standard Haskell. They allow you to declare refined types for the constructors of your data and they infer those refined type when you pattern match out of the data as well, such that the whole process is kind of information preserving.

We will use the GADT extension to define relational datatypes with the kind

. That way it has a slot a for the “input” and b for the “output” of the relation. What will goes in these type slots will be DataKind lifted types like 'True, not ordinary Haskell types like Int. This is a divergence from from the uses of similar kinds you see in Category, Profunctor, or Arrow. We’re doing a more typelevel flavored thing than you’ll see in those libraries. What we’re doing is clearly a close brother of the singleton approach to dependently typed programming in Haskell.

Some examples are in order for what I mean. Here are two simple boolean functions, not and and defined in ordinary Haskell functions, and their equivalent GADT relation data type.

You can already start to see how mechanical the correspondence between the ordinary function definition and our new fancy relation type. A function is often defined via cases. Each case corresponds to a new constructor of the relation and each pattern that occurs in that case is the pattern that appears in the GADT. Multiple arguments to the relations are encoded by uncurrying everything by default.

Any function calls that occur on the right hand side of a function definition becomes fields in the constructor of our relation. This includes recursive calls and external function calls. Here are some examples with a Peano style natural number data type.

We can also define things that aren’t functions. Relations are a larger class of things than functions are, which is part of their utility. Here is a “less than equal” relation LTE.

You can show that elements are in a particular relation by finding a value of that relational type. Is ([4,7], 11) in the relation Plus? Yes, and I can show it with with the value PS (PS (PS (PS PZ))) :: Plus (4,7) 11 . This is very much the Curry-Howard correspondence. The type R a b corresponds to the proposition/question is (a,b) \in R .

The Fun Stuff : Relational Combinators

While you need to build some primitive relations using new data type definitions, others can be built using relational combinators. If you avoid defining too many primitive relations like the above and build them out of combinators, you expose a rich high level manipulation algebra. Otherwise you are stuck in the pattern matching dreck. We are traveling down the same road we did in the previous post, so look there for less confusing explanations of the relational underpinnings of these constructions, or better yet some of the references below.

Higher order relational operators take in a type parameters of kind

and produce new types of a similar kind. The types appearing in these combinators is the AST of our relational algebra language.

The first two combinators of interest is the composition operator and the identity relation. An element (a,c) is in R \cdot Q if there exists a b such that (a,b) \in R and (b,c) \in Q. The fairly direct translation of this into a type is

The type of the composition is the same as that of Profunctor composition found in the profunctors package.

Alongside a composition operator, it is a knee jerk to look for an identity relation and we do have one

This is also a familiar friend. The identity relation in this language is the Equality type.

We can build an algebra for handling product and sum types by defining the appropriate relational combinators. These are very similar to the combinators in the Control.Arrow package.

The converse of relations is very interesting operation and is the point where relations really differ from functions. Inverting a function is tough. Conversing a relation always works. This data type has no analog in profunctor to my knowledge and probably shouldn’t.

Relations do not have a notion of currying. The closest thing they have is

Lattice Operators

For my purposes, lattices are descriptions of sets that trade away descriptive power for efficiency. So most operations you’d perform on sets have an analog in the lattice structure, but it isn’t a perfect matching and you’re forced into approximation. It is nice to have the way you perform these approximation be principled, so that you can know at the end of your analysis whether you’ve actually really shown anything or not about the actual sets in question.

? No. No… Yes? Oh. OH! IT IS!

The top relation holds all values. This is represented by making no conditions on the type parameters. They are completely phantom.

Bottom is a relation with no inhabitants.

The meet is basically the intersection of the relations, the join is basically the union.

A Lattice has an order on it. This order is given by relational inclusion. This is the same as the :-> combinator can be found in the profunctors package.

Relational equality can be written as back and forth inclusion, a natural isomorphism between the relations. There is also an interesting indirect form.

Relational Division

If we consider the equation (r <<< p) :-> q with p and q given, in what sense is there a solution for r? By analogy, this looks rather like r*p = q, so we’re asking a kind of division question. Well, unfortunately, this equation may not necessarily have a solution (neither do linear algebraic equations for that matter), but we can ask for the best under approximation instead. This is the operation of relational division. It also appears in the profunctor package as the right Kan Extension. You’ll also find the universal property of the right division under the name curryRan and uncurryRan in that module.

One formulation of Galois connections can be found in the adjunctions file. Galois Connections are very slick, but I’m running out of steam, so let’s leave that one for another day.

Properties and Proofs

We can prove many properties about these relational operations. Here a a random smattering that we showed using quickcheck last time.

Odds and Ends

  • Recursion Schemes – Recursion schemes are a methodology to talk about recursion in a point free style and where the rubber meets the road in the algebra of programming. Here is an excellent series of articles about them. Here is a sample of how I think they go:
  • Higher Order Relations?
  • Examples of use. Check out the examples folder in the AoP Agda repo. These are probably translatable into Haskell.
  • Interfacing with Singletons. Singletonized functions are a specialized case or relations. Something like?
  • A comment to help avoid confusion. What we’ve done here feels confusingly similar to profunctor, but it is in fact distinct I think. Profunctors are described as a categorical generalization of relations , but to be honest, I kind of don’t get it. Despite many of our constructions appearing in the profunctor package, the profunctor typeclass itself appears to not play a role in our formulation. There just isn’t a good way to dimap under our relations as written, unless you construct free profunctors. Converse at the least is a wrench in the works.
  • Star and graphs. Transition relations are a powerful methodology. A transition relation is in some respects the analog of a square matrix. We can iteratively compose it with itself.

References

CAV 2019 Notes: Probably Nothin Interestin’ for You. A bit of noodling with Liquid Haskell

I went to the opening workshops of CAV 2019 in New York this year (on my own dime mind you!) after getting back from joining Ben on the long trail for a bit. The tutorials on Rosette and Liquid Haskell were really fun. Very interesting technology. Also got some good ramen and mochi pudding, so it’s all good. Big Gay Ice Cream was dece.

Day 1 workshops

Calin Belta http://sites.bu.edu/hyness/calin/.Has a new book. Control of Temporal logic systems. Automata. Optimized. Partition space into abstraction. Bisimulation https://www.springer.com/gp/book/9783319507620

Control Lyapunov Function (CLF) – guarantees you are going where you want to go

Control Barrier Function – Somehow controls regions you don’t want to go to.

Lyapunov function based trajectory optimization. You somehow have (Ames 2014) http://ames.gatech.edu/CLF_QP_ACC_final.pdf Is this it?

Differential flatness , input output linearization

Sadradiini worked there.

Temproal logic with

Rise of Temporal Logic

Linear Temporal Logic vs CTL

Fixpoint logic,

Buchi automata – visit accepting state infinite times

equivalency to first order logic

monadic logic, propositions only take 1 agrument. Decidable. Lowenheim. Quantifier elimination. Bounded Mondel property

Languages: ForSpec, SVA, LDL, PSL, Sugar

Monadic second order logic (MSO).

method of tableau

Sadraddini

Polytopic regions. Can push forward the dynmaics around a trajectory and the polytope that you lie in. RRT/LQR polytopic tree. pick random poitn. Run.

Evauating branching heuristics

branch and prune icp. dreal.

branch and prune. Take set. Propagate constraints until none fire.

branching heuristics on variables

largest first, smearing, lookahead. Try different options see who has the most pruning. Non clear that helped that muhc

QF_NRA. dreal benchmarks. flyspeck, control, robotics, SMT-lib

http://capd.sourceforge.net/capdDynSys/docs/html/index.html

Rosette

commands: saolver adied programming

verify – find an input on which the assertions fail. exists x. not safe

debug – Minimal unsat core if you give an unsat query. x=42/\ safe(s,P(x))$ we know thia is unsat because of previous step

solve – exists v si.t safe(v)

synthesis – exists e forall x safe(x,P(x))

define-symbolic, assert, verify, debug, solve, sythesis

Rosette. Alloy is also connected to her. Z Method. Is related to relational logic?

https://homes.cs.washington.edu/~emina/media/cav19-tutorial/index.html

http://emina.github.io/rosette/

Building solver aided programming tool.

symbolic compiler. reduce program all possible paths to a constraint

Cling – symbolic execution engine for llvm

implement intepreter in rosette

Symbolic virtual machine

layering of languages. DSL. library (shallow) embedding. interpreter (deep) embedding.

deep embedding for sythesis.

I can extract coq to rosette?

how does it work?

reverse and filter keeping only positive queries.

symbolic execution vs bounded model checking

symbolic checks every possible branch of the program. Cost is expoentntial

CBMC.

type driven state merging. Merge instances of primitiv types. (like BMC), value types structurally ()

instance Merge Int, Bool, Real — collect up SMT context

vs. Traversable f => Merge (f c) – do using Traversable

symbolic union a set of guarded values with diskoint guard.

merging union. at most one of any shape. bounded by number of possible shapes.

puts some branching in rosette and some branch (on primitives) in SMT.

symbolic propfiling. Repair the encdoing.

tools people have built.

veify radiation

strategy generation. That’s interesting. builds good rewrite rules.

serval.

certikso komodo keystone. fintie programss

IS rosette going to be useful for my work? coooooould be

Liquid Haskell

https://ranjitjhala.github.io/

Liquid Haskell – What is?

another thing we could do is galois connections between refinements. Pos, Zero, Neg <-> Int

Liquid Haskell uses SMT solvers to resolve it’s type checking requirements.

Agda et al also work very much via unification. Unification is a broad term but it’s true.

It also has a horn clause solver for inference. Every language needs some kind of inference or you’d go insane. Also it is piggybacking on haskell

It’s not as magical as I thought? Like seeing the magicians trick. It really does understand haskell code. Like it isn’t interpretting it. When it knows facts about how (+) works, that is because the refined type was put in by hand in the prelude connecting it to SMT facts. What is imported by liquid haskell?

The typing environment is clutch. You need to realize what variables are in scope and what their types are, because that is all the SMT can use to push through type checking requirements.

Installing the stack build worked for me. It takes a while . I couldn’t get cabal install to work, because I am not l33t.

Uninterpeted functions. Unmatchability?

It wouldn’t be haskell without a bunch of compiler directives. It is somewhat difficult to find in a single cohesive place what all the syntax, and directives are from liquid haskell. Poking around it best.

  • ple
  • reflection
  • no-termination
  • higherorder – what is this?

https://github.com/ucsd-progsys/230-wi19-web course notes

https://github.com/ucsd-progsys/liquid-sf some of software foundations

https://nikivazou.github.io/publications.html niki vazou’s pubs. Check out refinement reflection

https://nikivazou.github.io/static/Haskell17/law-abiding-instances.pdf draft work? Shows stuff about typeclasses. This is a haskell 2017 paper though

https://arxiv.org/pdf/1701.03320 intro to liquid haskell. Interesting to a see a different author’s take

http://goto.ucsd.edu/~nvazou/presentations/ presentations. They are fairly similar to one another.

Liquid haskell gives us the ability to put types on stuff that wasn’t possible before.

Linearity :: f :: {a -> b | f (s ^* a) == s ^* (f a) }

Pullback. {(a,b) | f a == g b}

Equalizer

Many things in categoruy theory rely on the exists unique. Do we have functiona extensionality in Liquid haskell?

product : {(a,b) | f q = x, g q = y, => }

Pushing the boundaries on what liquid haskell can do sounds fun.

Equalizer. The eqaulizer seems prominents in sheaves. Pre-sheaves are basically functors. Sheaves require extra conditions. Restriction maps have to work? Open covers seem important

type Equalizer f g a b = {(e :: a , eq :: a -> b) | f (eq e) = g (eq e) }

I think both the type a and eq are special. e is like an explcit parametrization.

type Eq f g a = {e :: a | f e = g e} I think this is more in the spirit. Use f and g both as measures.

presheaf is functor. But then sheaf is functor that

(a, Eq (F a) (G a)). typelevel equalizer? All types a that F and G agree on.

https://ncatlab.org/nlab/show/equalizer

https://blog.functorial.com/posts/2012-02-19-What-If-Haskell-Had-Equalizers.html

Records are sheaves – Jon Sterling. Records have subtyping. This gives you a toplogy feeling thing.

https://www.slideshare.net/jonsterling/galois-tech-talk-vinyl-records-in-haskell-and-type-theory

What about purescript records?

{foo | a} {bar | a} -> intersection = {foo bar | b} can inhabit either

union is
or do you want closed records? union is union of fields. intersection is intersection of fields.

In this case a cover would be a set of records with possibly overlapping fields whose combined labels cover the whle space we want to talk about. consistency condition of sheaf/equalizer is that overlapping records fields have to match. I guess {  q.foo = r.foo } ?There is a way to combine all the stuff up. This is exactly what Ghrist was getting at with tables. Tables with shared columns.

data R1 = R1 {foo :: Int, bar :: Int}

{ (r1 :: R1, r2 :: R2) | (foo r1) = (foo r2) } — we manitain duplicates across records.

{. }

if you have a “cover” {foo bar |} {bar fred} {gary larry} whose in

https://www.sciencedirect.com/science/article/pii/S1571066108005264

Sheaves. As a model of concurrency? Gaguen paper.

sheaves as constraint satisfcation? sheafifcation. Constraint solving as a way of fusing the local constraints to be globally consistent.

sheaves as records

sheaves as data fusion

http://www.cs.bham.ac.uk/~mhe/papers/barbados.pdf

Escardo. Compact data types are those finitely searchable

Continuous funcitons are ~computable? Productive?

http://www.paultaylor.eu/

http://www.paultaylor.eu/ASD/foufct/

http://www.paultaylor.eu/~pt/prafm/

typed recursion theory toplogy

typed computatabiltity theory

Topological notions in computation. Dictionary of terms realted decidable, searchable, semi decidablee

cs.ioc.ee/ewscs/2012/escardo/slides.pdf

https://en.wikipedia.org/wiki/Computable_topology

Through NDArray overloading, a significant fragment of numpy code is probably verifiable.

Start with functional arithmetic programs.

Need to inspect function annotations to know how to build input type.

@verify() tag

Use (Writer a) style monad.

If statements are branching. We are again approaching inspecting functions via probing. But what if we lazily probe. At every __bool__ point, we run a z3 program to determine if there is an avaiable bool left possible (we don’t need to inspect dead code regions. Also would be cool to mention it is a dead region). Curious. We’re reflecting via Z3.

Loops present a problem. Fixed loops are fine. but what about loops that depend on the execution? for i in range(n). I guess again we can hack it…? Maybe. range only takes an integer. we don’t have overload access.

Maybe we need to go into a full eval loop. utterly deconstructing the function and evaluating it statelemnt by statement.

(compare :: a -> a -> Comparison). We could select a choice based on if there is a new one avaialable. Requires access to external store. We lose the thread. How can we know a choice was made? How can we know what the choice was? Did it ask var1 or var2? We can probably do it in python via access to a global store. But in haskell?

while loops take invariant annotations.

It would be cool to have a program that takes

pre conditions. Post ocnditions, but then also a Parameter keyword to declare const variables as deriveable. exists parameter. forall x precondition x => post condition.

Parameter could be of a type to take a dsl of reasonable computations. Perhaps with complexity predicates. and then interpretting the parameter defines the computation.

Or simpler case is parameter is an integer. a magic number.

Proving some Inductive Facts about Lists using Z3 python

Z3 is an SMT solver which has a good rep. Here are some excellent tutorials.

https://rise4fun.com/z3/tutorial

https://theory.stanford.edu/~nikolaj/programmingz3.html

http://homepage.cs.uiowa.edu/~ajreynol/VTSA2017/

SMT stands for satisfiability modulo theories. The exact nature of power of these kinds of solvers has been and is still hazy to me. I have known for a long time that they can slam sudoku or picross or other puzzles, but what about more infinite or logic looking things? I think I may always be hazy, as one can learn and discover more and more encoding tricks to get problems and proofs that you previously thought weren’t solvable into the system. It’s very similar to learning how to encode to linear programming solvers in that respect.

SMT solvers combine a general smart search procedure with a ton of specialized solvers for particular domains, like linear algebra, polynomials, linear inequalities and more.

The search procedure goes by the name DPLL(T). It is an adjustment of the procedure of SAT solvers, which are very powerful and fast. SAT solvers find an assignment of boolean variables in order to make a complicated boolean expression true, or to eventually find that it can never be made true. SAT solvers work on the principle of guessing and deducing. If a OR b needs to be true and we already know a is false, we can deduce b must be true. When the deduction process stalls, we just guess and then backtrack if it doesn’t work out. This is the same process you use manually in solving Sudoku.

The modern era of SAT solvers came when a couple new tricks vastly extended their power. In particular Conflict Driven Clause Learning (CDCL), where when the solver finds itself in a dead end, it figures out the choices it made that put it in the dead end and adds a clause to its formula so that it will never make those choices again.

https://sahandsaba.com/understanding-sat-by-implementing-a-simple-sat-solver-in-python.html

SMT works by now having the boolean variables of the SAT solver contain inner structure, like the boolean p actually represents the fact x + y < 5. During the search process it can take the pile of booleans that have been set to true and ask a solver (maybe a linear programming solver in this case) whether those facts can all be made true in the underlying theory. This is an extra spice on top of the SAT search.

Something that wasn’t apparent to me at first is how important the theory of uninterpreted formulas is to SMT solvers. It really is their bread and butter. This theory is basically solved by unification, which is the fairly intuitive process of finding assignments to variables to make a set of equations true. If I ask how to make fred(x,4) = fred(7,y), obviously the answer is y=4, x=7. That is unification. Unification is a completely syntax driven way to deduce facts. It starts to give you something quite similar to first order logic.

https://eli.thegreenplace.net/2018/unification/

https://cs.mtsu.edu/~rbutler/courses/sdd/automated_deduction/unification.pdf

I was also under the impression that quantifiers \forall, \exists were available but heavily frowned upon. I don’t think this is quite true. I think they are sort of a part of the entire point of the SMT solver now, although yes, they are rather flaky. There are a number of methods to deal with the quantifier, but one common one is to look for a pattern or parts of the subformula, and instantiate a new set of free variables for all of the quantified ones and add the theorem every time the patterns match. This is called E-matching.

Here are a couple tutorials on proving inductive facts in SMT solvers. You kind of have to hold their hand a bit.

http://lara.epfl.ch/~reynolds/vmcai15.pdf

http://homepage.divms.uiowa.edu/~ajreynol/pres-ind15.pdf

SMT solvers queries usually have the flavor of finding something, in this case a counterexample. The idea is that you try to ask for the first counterexample where induction failed. Assuming that proposition P was true for (n-1), find n such that P is not true. If you can’t find it, then the induction has gone through.

And here is some code where I believe I’m showing that some simple list functions like reverse, append, and length have some simple properties like \forall t. rev (rev(t)) == t .