A Short Skinny on Relations & the Algebra of Programming

I’ve been reading about the Algebra of Programming lately and lovin’ it. See J.N. Oliveira’s draft text in particular and the links in the references. I’ve started exploring the stuff from this post and more over here: https://github.com/philzook58/rel

Why and What?

Relations can expand the power of functional programming for the purpose of specification.

The point of a specification is to be able to write down in a very compact and clear way your intent for a program, more clearly and compactly than a full implementation could be written. It therefore makes sense to add to your specification language constructs that are not necessarily executable or efficient for the sake of compactness and clarity. When one needs executability or efficiency, one writes an implementation whose behavior you can connect to the spec via a formal or informal process.

Functional programming, with its focus on the abstraction of the mathematical function, is a good trade-off between executability, efficiency, and expressibility. It lies in a reasonable location between the ideas amenable to reasoning by a human mind and the command-driven requirements of the machines.

Functions are a specialization of relations. Relations extend the mathematical notion of functions with constructs like nondeterministic choice, failure and converse. These constructs are not always obviously executable or efficient. However, they greatly extend the abilities of reasoning and the clarity of expression of a specification.

The point-free style of reasoning about functions extends to a point-free style reasoning about relations, which is known as relation algebra. There are rich analogies with databases, category theory, linear algebra, and other topics.

Plus, I think it is very neato for some reason. If anyone ever thinks something is really neato, isn’t it worth giving it a listen?

A Simple Representation of Relations in Haskell

The simplest description of relations is as a set of tuples. So first let’s talk a bit about the options for sets in Haskell.

Sets in Haskell

There are a couple different reasonable ways to represent sets in Haskell.

  • [a] or Vector a
  • a -> Bool
  • Set a — a tree based Set from the containers package.

These have different performance characteristics and different power. The list [a] is very simple and has specialized pleasant syntax available. The indicator function a -> Bool gives you no ability to produce values of type a, but can easily denote very sophisticated spaces. Set a is a good general purpose data structure with fast lookup. You might also choose to mix and match combinations of these. Interconversion is often possible, but expensive. This is not a complete list of possibilities for sets; for example, you may want a representation with better support for search.
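As a quick sketch of what such interconversion looks like (these helper names are my own, and going from an indicator function back to a list already requires a bounded enumerable domain):

import qualified Data.Set as Set

listToSet :: Ord a => [a] -> Set.Set a
listToSet = Set.fromList

setToPred :: Ord a => Set.Set a -> (a -> Bool)
setToPred s x = Set.member x s

-- recovering a list from an indicator function means enumerating the whole space
predToList :: (Enum a, Bounded a) => (a -> Bool) -> [a]
predToList p = filter p [minBound .. maxBound]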

Relations in Haskell

We can directly use the definition of relations as a set of tuples with the representations above:

type Rel a b = [(a,b)]
type SetRel a b = Set (a,b)
type FunRel a b = (a,b) -> Bool

But we also have the option to “curry” our relation representations, sort of mixing and matching properties of these representations.

type List a b = a -> [b] -- Commonly seen type in List monad/logic programming
type MapRel a b = Map a (Set b)

You might also choose to package up multiples of these representations, choosing the most appropriate as the situation requires, see for example the relation package, whose type holds both Map a (Set b) and Map b (Set a).

Despite fiendishly poor performance, for simplicity and list comprehension syntax we are going to be using type Rel a b = [(a,b)] for the remainder of the post.

I’m also taking the restriction that we’re working in bounded enumerable spaces for ease. I assume such a requirement can be lifted for many purposes, but finite spaces like these are especially well tamed. The following constraint synonym and definition are very useful in this case.

type BEnum a = (Enum a, Bounded a) -- requires the ConstraintKinds extension
enumAll :: (BEnum a) => [a]
enumAll = [minBound .. maxBound]

Functions and Relations

Functions can be thought of as relations with the special property that for each left part of the tuple, there is exactly one right side and every possible left side appears. The relation corresponding to a function f looks like F = \{(x,y) | x \in X, y \in Y, y = f (x)\}.

tabulate :: (BEnum a) => (a -> b) -> Rel a b
tabulate f = [(x, f x) | x <- enumAll]

There is a natural and slightly clever lifting of function composition to relations. We now check whether there exists a value that is in the right side of one and the left side of the other.

rcompose :: Eq b => Rel b c -> Rel a b -> Rel a c
rcompose xs ys = [ (a,c)  | (a, b) <- ys, (b', c) <- xs, b' == b]

rid :: (Enum a, Bounded a) => Rel a a
rid = tabulate id

Because of these two operations (and their associativity and identity properties), Rel is a category. We do however need the Eq b restriction to make Rel an instance of the Category typeclass, so it does not quite fit the definition of Category in base. It is a constrained category.
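Since we only ever need composition at particular types, a minimal workaround (my own sketch) is to just define the composition operator directly; the QuickCheck properties later in the post use this <<< spelling. Note the name clashes with the operator in Control.Category, so keep that import hidden or qualified.

(<<<) :: Eq b => Rel b c -> Rel a b -> Rel a c
(<<<) = rcompose
infixr 1 <<<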

We can lift the common arrow/categorical combinators up to relations for example.

-- arrow/category combinators
rfan :: Eq a => Rel a b -> Rel a c -> Rel a (b,c) 
rfan f g = [ (a, (b,c)) | (a,b) <- f, (a',c) <- g, a == a']

rfst :: BEnum (a,b) => Rel (a,b) a 
rfst = tabulate fst 

rsnd :: BEnum (a,b) => Rel (a,b) b 
rsnd = tabulate snd

rleft :: (Enum a, Bounded a) => Rel a (Either a b) 
rleft = tabulate Left

rright :: BEnum b => Rel b (Either a b) 
rright = tabulate Right

reither :: Eq a => Rel a c -> Rel b c -> Rel (Either a b) c 
reither f g = [(Left a, c) | (a,c) <- f] ++ [(Right b, c) | (b,c) <- g] 

With these combinators, you have access to many functions on basic non-recursive algebraic data types. By combining them in a point free style, you can build some other useful combinators.

--- goofy inefficient definitions
dup :: (Eq a, Eq b, BEnum a, BEnum b) => Rel a (a,a)
dup = rfan rid rid
swap :: (Eq a, Eq b, BEnum (a,b)) => Rel (a,b) (b,a)
swap = rfan rsnd rfst
par :: (Eq a, Eq c, BEnum a, BEnum c) => Rel a b -> Rel c d -> Rel (a,c) (b,d) 
par f g =  rfan (rcompose f rfst) (rcompose g rsnd)

An Aside: Relations, Linear Algebra, Databases

The composition operation described above is not so unfamiliar as it may first appear.

Relation algebra has a great similarity to linear algebra. This connection can be made clearer by considering sparsity patterns of matrices and tensors. Sparsity patterns are a useful abstraction of linear algebraic operations. Instead of considering matrices of numbers, the entries are "zero" and "possibly nonzero", or, if you prefer, a matrix of boolean values answering those same questions.

The ordinary row times column matrix multiplication corresponds to relation composition. Replace * with AND and + with OR. If any of the numbers is zero, then multiplying them will result in zero. In summing two numbers, if either is possibly nonzero, then the result is possibly nonzero.
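To make the analogy concrete, here is a small illustrative sketch (my own code, not from the post's repo) that tabulates a finite relation as a Bool matrix and composes by boolean matrix multiplication:

import Data.List (transpose)

-- rows indexed by a, columns by b
toMatrix :: (BEnum a, BEnum b, Eq a, Eq b) => Rel a b -> [[Bool]]
toMatrix r = [ [ (x, y) `elem` r | y <- enumAll ] | x <- enumAll ]

-- matrix multiplication with (*) replaced by (&&) and (+) replaced by (||)
boolMatMul :: [[Bool]] -> [[Bool]] -> [[Bool]]
boolMatMul m n = [ [ or (zipWith (&&) row col) | col <- transpose n ] | row <- m ]

-- expected correspondence: toMatrix (rcompose s r) == boolMatMul (toMatrix r) (toMatrix s)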

Another interesting way of looking at it is that we are replacing the summation binding form \sum_i with the logical quantifier \exists_i. Both introduce a scoped "dummy variable" i and have a lot of syntactic similarity. Other related forms include \lambda i, \forall i, \int di, \max_i .

There is also an analog of the point free relation algebra in linear algebra. Linear algebra has the most widely used point free notation in the world, matrix notation. Consider the expressions Ax=b and X = ABC as compared to \sum_j A_{ij} x_j = b_i and X_{il} = \sum_{jk} A_{ij} B_{jk} C_{kl} . Matrix notation is SO much better for certain calculations. Other pieces of the matrix notation include transpose, inverse, Kronecker product, the Khatri-Rao product, and Hadamard product. Their properties are more clear in the index free form in my opinion. I believe even massive tensor expressions can be written index free using these operators. There are also analogies to be drawn between the graphical notations in these different fields.

Databases can be thought of very similarly to sparse matrices. In principle, you could enumerate all the possible values for a column of a database. So you could think of a database as a giant matrix with a 1 if the item is in the database and 0 if not. Databases are very, very sparse from this perspective, and you would never store them this way. The join operation is a relative of relational composition; however, join usually operates by looking at the column names, whereas our composition is position based.
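For instance, here is a toy two-table "database" (purely illustrative data of my own); positionally joining the tables on their shared city column is exactly rcompose:

livesIn :: Rel String String     -- (person, city)
livesIn = [("alice", "boston"), ("bob", "paris")]

cityCountry :: Rel String String -- (city, country)
cityCountry = [("boston", "usa"), ("paris", "france")]

personCountry :: Rel String String
personCountry = rcompose cityCountry livesIn
-- [("alice","usa"),("bob","france")]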

Query optimization in databases has interesting analogs in sparse linear algebra. For example, the Taco compiler http://tensor-compiler.org/ is doing something very akin to a query optimizer.

Inverting Relations

Unlike functions, relations are always "invertible". We call this the converse of a relation. When a function is invertible, its inverse corresponds to the converse of its relation. In terms of the tuples underlying our representation, the converse just swaps them. Relations also possess operations trans and untrans that may be thought of as a kind of currying or as a partial inverse on a single parameter.

converse :: Rel a b -> Rel b a
converse r = [(b,a) | (a,b) <- r]

untrans :: Rel a (b,c) -> Rel (a,b) c
untrans r = [((a,b),c)  | (a, (b,c)) <- r]

trans :: Rel (a,b) c -> Rel a (b, c)
trans r = [(a, (b,c))| ((a,b),c)  <- r]

Orderings can also be lifted to relations (\leq) = \{(a,b) | a \leq b\}. Composition of the lifted relations corresponds to the usual composition of orderings.

reflectOrd :: (Ord a, BEnum a) => Rel a a
reflectOrd = [(x,y) | x <- enumAll, y <- enumAll, x <= y]

Nondeterministic choice is sometimes represented in Haskell using set-valued functions, here a -> [b]. You may recall this from the context of the List monad. In fact, in this case we have an isomorphism, as evidenced by tabulateSearch and searchRel.

tabulateSearch :: BEnum a => (a -> [b]) -> Rel a b
tabulateSearch f = [(a,b) | a <- enumAll, b <- f a]

searchRel :: Eq a => Rel a b -> (a -> [b])
searchRel r a = [b | (a', b) <- r, a == a']

Similarly, partial functions can be reflected into relations:

tabulatePartial :: BEnum a => (a -> Maybe b) -> Rel a b
tabulatePartial f = [(a,b) | a <- enumAll, b <- toList (f a)] -- toList from Data.Foldable (or use maybeToList)

A useful trick is to lift sets/subsets to relations as a diagonal relation \{(a,a) | a \in S \}. Projection onto the set can be achieved by composing with this relation. The identity results if you are talking about the entire set S.

diagRel :: [a] -> Rel a a
diagRel = map dup where dup x = (x,x)

leftSet :: Eq a => Rel a b -> [a]
leftSet = nub . (map fst)

rightSet :: Eq b => Rel a b -> [b]
rightSet = nub . (map snd)
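
For example (a small sketch with my own naming), restricting the left side of a relation to a subset is just composition with that subset's diagonal:

-- keep only the tuples whose left element lies in the given subset
restrictLeft :: Eq a => [a] -> Rel a b -> Rel a b
restrictLeft s r = rcompose r (diagRel s)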

Comparing Relations

We can compare sets by asking if one is a subset of the other A\subseteq B . Relations can also be compared by this operation, which we call relational inclusion.

rSub :: (Eq a, Eq b) => Rel a b -> Rel a b -> Bool
rSub xs ys = and [x `elem` ys | x <- xs]

x <~ y = rSub x y

A subservient notion to this is relational equality.


rEq :: (Eq a, Eq b) => Rel a b -> Rel a b -> Bool
rEq xs ys = (xs `rSub` ys) && (ys `rSub` xs)

x ~~ y = rEq x y

Relational algebra is chock-full of inequality-style reasoning, which is richer and slightly more complicated than equality-style reasoning. This is one of the benefits of moving from a functional description to a relational description.

Relations also form a lattice with respect to these comparisons. What the hell are lattices? In the context of finite relations, lattices may be overpowered mathematical machinery, but they really are useful down the line. They give you binary operators that play nice with some kind of ordering, in our case relational inclusion. These two operations are the meet and the join, which find the greatest lower bound and least upper bound of the operands respectively. For our relations, these correspond to the more familiar notions of set intersection and union. The intersection of two sets is the biggest set that is contained in both of them. The union is the smallest set that contains both of them.

meet' :: (Eq a, Eq b) => Rel a b -> Rel a b -> Rel a b
meet' xs ys = [x | x <- xs, x `elem` ys] -- intersection

join' :: (Eq a, Eq b) => Rel a b -> Rel a b -> Rel a b
join' p q = nub (p ++ q) -- union


instance (Eq a, Eq b) => Lattice (Rel a b) where
    x /\ y = meet' x y
    x \/ y = join' x y

Using meet/join vs intersection/union becomes more interesting when the domain is fancier than relations over finite domains. Issues of infinity can make this interesting, or when using a representation that can't explicitly represent arbitrary unions or intersections, but that instead has to approximate them. My favorite example is polyhedra. Polyhedra are not closed under unions. So in this case the join and union do not coincide. You need to take a convex hull of the union instead, which is the best approximation. Concretely, polyhedra can be represented as a list of their vertices, which generate the polyhedron. There is no way to express a union in this representation. Concatenating the lists represents taking the convex hull of the union.

An additional property that a lattice may possess is a largest and a smallest element, called top (\top ) and bottom (\bot). Our finite domain relations do have these.

instance (Eq a, Eq b, BEnum a, BEnum b) => BoundedMeetSemiLattice (Rel a b) where
    top :: (BEnum a, BEnum b) => Rel a b 
    top = [(x,y) | x <- enumAll, y <- enumAll]
-- all possible tuples

-- because of superclass constraints :(
instance (Eq a, Eq b) => BoundedJoinSemiLattice (Rel a b) where
    bottom :: Rel a b -- no tuples
    bottom = [] 

Relational Division

And now finally we get to one of the most interesting, powerful, and confusing operations: relational division. Relational division is a kind of pseudo inverse to relational composition. In linear algebra, the pseudo inverse is a matrix that does the best job it can to invert another matrix in a least squares sense. If the matrix is actually invertible, it equals the inverse. Relational division does the best job it can to invert a relational composition. Instead of taking least squares as a criterion, it ensures that the result doesn't over approximate. If you have the inequality X \cdot Y \subseteq Z and you want to solve for X, relational division is the thing that does that. The right division Q = Z/Y is the largest relation such that Q \cdot Y \subseteq Z.

A helpful example is the similar operation of division in database tables.

And here is an implementation that I think is correct. I've goofed it up a couple of times; it is a rather confusing construct.

rdiv :: (BEnum a, BEnum b, Eq a, Eq b, Eq c) => Rel a c -> Rel b c -> Rel a b
rdiv x y = [ (a,b)  | a <- enumAll, b <- enumAll, all (\c -> ((b,c) `elem` y) `implies` ((a,c) `elem` x)) (rightSet y)]
    where implies p q = not p || q -- boolean implication; defined here since it is not in the Prelude
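
As a sanity check, here is the classic database-style division question, "which students have completed every required course?", phrased with rdiv (the data and names are mine, purely illustrative):

data Student = Alice | Bob deriving (Eq, Enum, Bounded, Show)
data Course = Math | Physics deriving (Eq, Enum, Bounded, Show)

completed :: Rel Student Course
completed = [(Alice, Math), (Alice, Physics), (Bob, Math)]

required :: Rel () Course
required = [((), Math), ((), Physics)]

-- a student is related to () exactly when they completed every required course
completedAll :: Rel Student ()
completedAll = rdiv completed required -- [(Alice, ())]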

There also exists a very similar operation of ldiv.
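I haven't worked ldiv through as carefully, but a sketch analogous to the rdiv above (the largest Q such that composing Q after y stays inside x) would look something like:

ldiv :: (BEnum b, BEnum c, Eq a, Eq b, Eq c) => Rel a c -> Rel a b -> Rel b c
ldiv x y = [ (b,c) | b <- enumAll, c <- enumAll, all (\a -> ((a,b) `elem` y) `implies` ((a,c) `elem` x)) (leftSet y)]
    where implies p q = not p || q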

Relational division encapsulates many notions of searching or optimizing. I invite you to read more about it in J.N. Oliveira's text or the Bird & de Moor text.

Properties and QuickCheck

Oh. Mah. Glob. You guys. So many properties. (Artwork courtesy of David)

Relation algebra is so chock full of properties. This is a perfect opportunity for some QuickCheck, a randomized property testing framework. There are so many more to test. I need to dig through to collect up all the identities.

type R1 = Rel Bool Ordering

prop_ridleft :: Rel Bool Ordering -> Bool
prop_ridleft x = (rid <<< x) ~~ x

prop_ridright :: Rel Bool Ordering  -> Bool
prop_ridright x = (x <<< rid) ~~ x

prop_meet :: R1 -> R1  -> Bool
prop_meet x y = (x /\ y) <~ x

prop_meet' :: R1 -> R1  -> Bool
prop_meet' x y = (x /\ y) <~ y

prop_join_univ :: R1 -> R1 -> R1 -> Bool
prop_join_univ x y z = ((x \/ y) <~ z) == ((x <~ z) && (y <~ z))

prop_join :: R1 -> R1  -> Bool
prop_join x y = y <~ (x \/ y) 

prop_meet_univ :: R1 -> R1 -> R1 -> Bool
prop_meet_univ x y z = (z <~ (x /\ y)) == ((z <~ x) && (z <~ y))

prop_top :: R1 -> Bool
prop_top x = x <~ top

prop_bottom :: R1 -> Bool
prop_bottom x = bottom <~ x

prop_bottom' :: R1 -> Bool
prop_bottom' x = (x <<< bottom) ~~ (bottom :: R1)

prop_trans_iso :: Rel (Bool, Ordering) Word8 -> Bool
prop_trans_iso x = untrans (trans x) == x

prop_rdiv :: Rel Bool Ordering -> Rel Word8 Ordering -> Bool
prop_rdiv g j = (j <<< (rdiv g j)) <~ g

prop_con :: R1 -> Bool
prop_con x = converse (converse x) ~~ x

prop_rdiv' :: Rel Bool Word8 -> Rel Bool Ordering -> Rel Word8 Ordering -> Bool
prop_rdiv' x g j = (x <~ (rdiv g j)) == ((j <<< x) <~ g) 
 

Bits and Bobbles

  • Relations over continuous spaces. Vector subspaces (Linear Relations), Polyhedra (Linear inequality relations).
  • Non Bool-valued Relations. Replace \exists_x with \max_x. The weighted edge list of a graph is a natural relation. By using composition we can ask about paths. We still have a comparison operator \subseteq, which now respects the ordering of weights.
  • Galois connections are cool.
  • Relations combined with recursion schemes. Recursion schemes are the point free way of describing recursion.
  • Moving into infinite spaces. How do we cope?
  • Faster search. Some relations are best specified by functions, Maps, others, mixes and matching.
  • If you "singletonize" relations a la the Agda project https://github.com/scmu/aopa, you get very interesting interconnections with profunctors, which people say are a categorical generalization of relations.
  • Point-free DSLs are interesting and pleasant. Many worries about alpha renaming are gone, at the expense of point-free pain. A DSL like this may be necessary to choose good relational query plans.

Edit: A follow up post on that type level angle here http://www.philipzucker.com/relational-algebra-with-fancy-types/

References

Edit: A Math StackExchange question about the a -> [b] relational type. https://math.stackexchange.com/questions/3360026/can-division-be-expressed-intensionally-in-relation-algebra/3361351#3361351

Edit: An interesting comment and related library from /u/stevana

Why I (as of June 22 2019) think Haskell is the best general purpose language (as of June 22 2019)

My close friends and I have been interested in starting a project together and I suggested we use Haskell. I do not think the suggestion was received well or perhaps in seriousness. I have a tendency to joke about almost everything and have put forward that we use many interesting but not practical languages in the same tone that I suggest Haskell. This was a tactical mistake. I find myself in despair at the idea I can’t convince my personal friends, who are curious and intellectual people, to use Haskell on a fresh start web project we have complete control over. What hope do I have in the world at large? This brain dump post is meant for them. My suggestion to use Haskell is not just me being an asshole, although that does make it more fun for me. I will now try to explain in all seriousness and in all the honesty that I can muster what my opinions on languages are and why I have them.

Pragmatically, can you start a new project using a language you don’t know? This is a problem. A project always has some intrinsic difficulty. Not all projects will survive an extra layer of unnecessary speed bump. But when and how are you supposed to learn new languages? Never is one answer. I disagree with this answer. In this case we have the leg up that I do know Haskell. Perhaps this is a downside in that it will be extra frustrating? It is also easy for me to ask: using Haskell is not a burden for me, since I have already sunk the cost, but it is a massive learning burden for others.

Is Haskell actually practical for a web application? Short answer: yes. Expect pain though. If your web application is so simple you could rip it out in 100 lines of python, this is such a simple project that it is a good opportunity to learn something new. If it will become large and complex, then I believe Haskell does shine, keeping complexity under control. I base this upon professional experience making a web application using a Haskell-Purescript stack. For honesty, it wasn’t all good. I recall ripping my hair out threading shit through monad stacks to where it needed to go. Yet on the whole, it kept the project sane. I also believe this based on word of mouth, which could just be cultish ramblings.

I believe that truly dominating properties of Haskell appear in large complex projects. This is difficult to prove in any other way except empirically and the experiments will be so wildly uncontrolled in terms of the project and people involved that no conclusions can truly be drawn. And yet I have faith, and think that personal experience validates this opinion to myself at least. We have to live this life even though truth does not exist. Choices and opinions must be made.

For programs that are going to be a single manageable file and written in one night, it doesn’t matter much what you use in terms of being choked on your own code. At this scale, I still think Haskell is enjoyable and interesting though. Haskell was my doorway into the world of computer science as I now understand it. I hope there are more doorways.

Things about me that may be different from you. Decide for yourself if these aspects of me make our opinions fundamentally incompatible.

  • I do have a talent and a like for practically oriented mathematical topics (computational methods, linear algebra, formal methods, calculus, geometric algebra, projective geometry, optimization, etc.). I actually have very little taste at all for mathematical topics that I see no purpose to.
  • I do have some desire and taste for esoterica for its own purpose. I cannot exactly characterize what makes some topics acceptable to me and others not.
  • A hard learning curve is not necessarily a downside for me. I enjoy the challenge if it is overcomable and worth it on the other side.
  • I like weird and different. That is a positive for me, but a negative for many. I might just be a millennial hipster idiot.
  • I would LOVE to find a language I think is better than Haskell. I would LOVE to abandon Haskell. Perhaps this already makes me odd. I suspect many people don’t consider the differences between languages to be worth the switch and the wasted knowledge. The people with this opinion may or may not have tried enough languages.
  • I have “drank the koolaid”. I do read what comes out of the Haskell and functional programming community and have a tendency to believe things without strong empirical backing.
  • I have been more deeply entwined with Haskell than any other language. Perhaps if I had reached a level of familiarity in another language I’d be as fervent about that one? I don’t believe this is the case.
  • While I desire performance, I consciously curb this desire. I am strongly in favor of cleanly written, clear, principled code. This is of course a weighted judgement. I would probably take a 100x performance gain in exchange for a 2x decrease in clarity or reusability. This is a result of the problem domains and scale that have interested me in the past and that I anticipate in the future. I STRONGLY believe the world at large overvalues performance at the expense of other good qualities of code. Or at least over-optimizes early.

A Biased List of Pros and Cons

What do I find bad about C++.

  • The feature set is huge and difficult to understand
  • Extreme amounts of legacy features that will even be recommended if you read older documentation. Kitchen sink.
  • The language appears to be almost entirely built out of footguns.
  • The syntax is too verbose.
  • I have a distaste for mutation.
  • I have a distaste for object oriented programming

What do I find good about C++

  • It is commonly used. Large community.
  • It is possible for it to be very fast
  • Kitchen Sink. You can find almost any feature you want here.
  • The high level goals of the mind-leaders sound good.
  • Very interesting projects are written in C++. HPC things. Scientific computation.
  • Template metaprogramming seems very powerful, but arcane.

What I find bad about Java

  • The syntax puts me off as being incredibly verbose
  • Extreme object oriented focus
  • Corporate/Enterprise feel. I am an iconoclast and see myself as a reasonable but slightly rebellious character. Java in my mind brings images of cubicles and white walls. Perhaps this is not fair.

What do I find good about Java

  • ?

I’m joking. Sorry Java. But I also kind of mean it. Yes, there are positive aspects to Java.

What I like about python

  • Very commonly used and understood. Perhaps the lingua franca
  • Incredible library ecosystem
  • Numpy and scipy in particular are marvels of the modern age.
  • Syntax is basically imperative pseudo-code
  • I am personally very familiar with it
  • python is the easiest to use language I know.

What I dislike about python

  • Python has no natural tendency for correctness due to its free-wheeling dynamically typed character. This is patched up with testing and opt-in type systems.
  • I don’t know how to grow as a pythonista. The skill curve flattens out. For some this may be a positive.
  • The main way of building new data types is the class system. I think this is ungainly, overly verbose, and not always a good conceptual fit.
  • Despite being among the most succinct of common imperative languages, I still think it ends up being too verbose.
  • It is slow. This is a negative, although not high on my priorities.

What is bad about Haskell

  • Very difficult learning curve. Let’s get real. Haskell is a very confusing programming language to get started in for common programming tasks.
  • functional programming is weirder than imperative programming.
  • The monad paradigm, even once learned is ungainly. Tracking multiple effects is a pain that does not exist in most languages.
  • The pain is up front. It is easy to get a sketch of what you want ripped out in python faster than in Haskell (for me). If you want a web server, command line tool, optimization problem, curve fitter, I can rip all of these out faster in python than I can in Haskell. As a psychological thing, this feels awful. For a small scale project, unless toying with Haskell itself or one of its domain expertises like implementing DSLs, python is the easier and correct choice. Python is a scripting language. I’d make the switch at two screens worth of code.
  • I think laziness is confusing and easy to shoot yourself with.
  • Haskell is not the fastest language, although faster than python.
  • Concern that there are not many jobs and, interestingly on the converse side, not many people to hire. There is a catch-22 there. There is a set of people that would KILL for a Haskell job.
  • There are vocal smug assholes who use Haskell and push it. I am smug, I hope only mildly an asshole.

What I find good about Haskell

  • The number one reason is that there is something ephemeral that I just like. I am not naturally inclined to analyze such things. When I like a movie or don’t like it, it just happens, and if forced to explain why, I’ll bullshit.
  • Errors and bugs are to a shocking degree caught at compile time. More than any other language I have experience with does Haskell code run correctly without hidden bugs if it compiles. I am not claiming this is absolute but it is all the more incredible the degree to which it is true since it didn’t literally have to be this way. I have done major rewiring of a data structure in a project. The compiler guided my hand to exactly the positions that needed to be adjusted. Could C++ do this? Yes, to some degree. I believe that the tendencies of C++ programming make it less satisfactory on this point.
  • Types are an incredible design tool. I find designing the types of my program to be an extremely enjoyable and helpful activity. Much more so than box and wire class and interface diagrams. A good function name and a type signature basically entirely constrain the behavior of a function. Types can be quickly and completely given to the compiler and machine enforced.
  • The pain that monads cause is pain you should be feeling. It is the pain of explicitness, which I 70% choose over the pain of not knowing what the fuck a function might do, and of not enabling the compiler to enforce that. If something is capable of mutating state, it should say so in goddamn huge purple letters.
  • Haskell is more than fast enough. It isn’t even that people don’t care. The Haskell community at large cares a lot more for performance than I do, and I reap the dividends. The people in charge of the compiler and the main libraries are goddamn wizards whose work I get to benefit from. This is true of all languages perhaps.
  • Laziness is very cool. At the beginning I thought it was INCREDIBLY awesome and inconceivable that I could manipulate an infinite list.
  • The way of specifying new data types is so succinct and so incredible. Pattern matching is SO GODDAMN good for some tasks.
  • Haskell has a paradigm of small combinators. It is common to build a sequence of very small natural functions for the domain and then build larger and larger things out of them. This is good for reusability and conceptual clearness.
  • Extreme preference for Immutability. As part of keeping what you must keep in your head or remember small while programming, immutability is an insane win. You think you know what you want now. You know you could just tweak this variable here, make a special variable over here. You can reason about how to make this all work and be correct now. You will not in a month. Your coworkers will mess it all up too.
  • Haskell code is generic by default. This allows the same code to be reused in many situations
  • The standard typeclass hierarchy is extremely well thought out and powerful. To some degree it is unnecessary in other languages. The difference between Functor, Applicative, Monad, and Traversable makes little sense in languages with unconstrained mutations and effects.
  • Haskell paradigms are inspired by mathematics, and I have great faith in mathematics. The concepts behind Haskell feel closer to discoveries rather than inventions. Imperative programming speaks in a language formed for the accidental nature of the machines we have. Functional programming is a language closer to mathematics, which I believe is closer to how the human mind works, and closer to what the problem at hand actually is.
  • Complexity scales. It is my belief, perhaps unverified, that as a project grows larger, the insane miserable churn and mental overhead grows slower in a Haskell project than in other languages. I do not have strong empirical evidence to this assertion. Word of mouth (of Haskellers).
  • The ceiling on Haskell is extremely high. You can continue to learn and get better, gaining more and more power. I do not currently see the end of this.
  • When I do reach for some new library, I am very often impressed by how thoughtfully built it is. Haskell itself is INCREDIBLY thoughtfully built.
  • The Haskell community is very excited and they have many things to teach. There are many Haskellers out there who are very welcoming, kind, and intelligent.
  • Haskell does have some cachet. I am not immune to wanting to seem smart. If the world thought that only idiots use Haskell, that would put me off some. That the world thinks that only impractical ivory tower smug weenies use Haskell does put me off, although I perhaps embrace it belligerently.
  • The Haskell library ecosystem is strong. Less strong than python, but much better than some of the other languages that intrigue my eye. There is functionality somewhere for most common tasks.
  • Haskell is used industrially. I live in an echo chamber, but by and large the industrial users of Haskell sing its praises.

For context, this is my programming journey:

I first learned BASIC in middle school. I wrote a computer game in Visual Basic. I toyed with the TI-83. I went to a summer camp in high school where I learned the extreme rudiments of C++. Did a small web business with some friends. Did really bad work with big dreams. I took a fairly poor Java course in high school. I learned MATLAB in college. I doinked around with Arduino-level C projects. I learned python in grad school I think. I think I got far more proficient at python than any other language before. At this point, I was on board for object oriented programming, albeit not at a deep level (design patterns or ilk), just as a light organizational principle. Did some Android projects, really didn’t like Java there. Did a small business with my friend and got deep in Javascript. As our web application got bigger, this really started to hurt. Errors all over the place. Unknown junk in objects. Who has access to what? We tried our darndest to read and follow best practices as we could find them, but it felt like we were sinking in quicksand. More and more effort for more and more bugs and less and less progress. It was around this era that I first even heard of Haskell. I was intrigued immediately for some reason, maybe for just how weird it was. It took probably 2 years of forgetting about it and going back to tutorials every 6 months. I didn’t necessarily KNOW that this was a thing that I wanted, I just found it interesting. Currently I am fairly proficient at some things in Haskell (unfortunately I am more proficient at esoterica than more practical things). I have had a professional job writing Haskell, and it seems like my liking of Haskell is working out very well for me professionally. Passion in all forms is powerful.

My history may not be as convincing as someone who spent 20 years as a professional C++ dev and then switched, but I have at least experienced different paradigms and languages and found Haskell the most to my liking.

Random Thoughts

I am now trying to push myself into comfort in Julia, Rust, Agda, Coq, OCaml, all languages I feel show promise for different reasons. To my knowledge Haskell is a better choice than these as a general purpose tool for pragmatic reasons. Haskell’s library ecosystem is strong and performance is good. These are points against Agda and Coq. Julia has a focus on scientific programming.

Rust might be a good compromise. I consider it a trojan horse for useful programming language features that I associate with functional languages. It lays claim to performance and to being an acceptable systems-level language, which appeals to some. The syntax does not scare anyone off. The library ecosystem seems good and the community strong. I did find myself missing Haskell features in Rust, though I am personally much less familiar with it. I think the design of Rust weights performance more heavily than is warranted in most applications and has less descriptive and abstraction power than Haskell, qualities that I prioritize. This opinion is not strongly held.

What makes Haskell’s types any better or worse than C’s? At the beginning many of the features of Haskell seem like magic. But as time has worn on, I can see how the features can be emulated with just some slight syntactic and conceptual overhead in other languages. This slight overhead is enough though. A language is more than just its syntax. It is also its idioms. It is also the way it makes people think. Language is almost a prerequisite for thought. You cannot even conceive of the ways there are to express yourself before learning them.

What exactly makes a language “good”? This is a question poorly phrased enough to have no answer. Excel can be an excellent language for many tasks. Easy to learn. Very powerful. Yet, it is not considered a good general purpose programming language. Library ecosystem is extremely important. Specialized languages are often the best choice for special problem domains, at the expense of learning them, or eventually finding incompatibility of what you want from what they designed for.

What makes abstractions “good”? Why do I have queasiness about object-oriented programming? Java, I think, basically. I have overeagerly gone down the road of trying to design deep subclass hierarchies, which is not OO at its best. Zebra is a Quadruped is an Animal is Alive kind of stuff. I believe object orientation is an interesting principle. I hear about Smalltalk and Common Lisp doing object orientation right and I am duly intrigued. There has been some recent work in Haskell about how to do objects in a way aesthetically compatible with Haskell. I think object orientation has been overused and abused. I think it is a poor conceptual fit for many situations. I think it tends to make non-reusable code. I think the form that it takes in C++ and Java development is arcane horseshit.

I deserve almost no opinion about Java or C++, having not done that much in them. Yet, I must state my opinions, take them as you will, for I do in fact hold them strongly. I have worked on a network simulator and a robotics framework in C++, but begrudgingly. I have done a very small amount of Java development for a personal project and some Processing sketches. My coworker was a 10-year professional Java dev before switching to Scala and then Haskell. He despises Java now. Highly anecdotal, and he is a similar iconoclastic character like me. Nevertheless, this also informs my opinion. I have been reading Bjarne Stroustrup’s book (his stated goals and what he claims C++ achieves are admirable and one can’t argue he hasn’t changed the world) and actually find C++ rather interesting, especially in the sense that many projects that I find interesting are written in C++; I just don’t want to work in the language myself.

Haskell love:

Haskell Criticism (perhaps warranted)

https://news.ycombinator.com/item?id=17114308 Ah Hacker News. Always a sunny worldview.

https://medium.com/@snoyjerk/least-favorite-thing-about-haskal-ef8f80f30733

Hacker news discussion of this post:

https://news.ycombinator.com/item?id=20258218#20271511

Lens as a Divisibility Relation: Goofin’ Off With the Algebra of Types

Types have an algebra very analogous to the algebra of ordinary numbers (video). This is the basic table of correspondences. Code with all the extensions available here.

type a * b = (a,b)
type a + b = Either a b
type b ^ a = a -> b
type O = Void
type I = ()

-- check out typelits implementation
-- http://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.TypeNats.html#%2A

infixl 6 + -- ((a + b) + c) associates to the left
infixl 7 *
infixr 8 ^


-- derived definitions
type Succ a = I + a
type Two = Succ I
type Three = Succ Two
type Four = Succ Three

One way to see that this makes sense is by counting the cardinality of types built out of these combinators. Unit is the type with 1 inhabitant. Void has 0 inhabitants. If a has n and b has m possible values, then Either a b has n + m inhabitants, (a,b) has n*m, and there are m^n possible functions a -> b. We’re gonna stick to just polynomials for the rest of this, ignoring a -> b.
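As a tiny sanity check of the counting (my own illustrative code), Two = I + I has 1 + 1 = 2 inhabitants, the same cardinality as Bool, and the bijection can be written out directly:

twoToBool :: Two -> Bool
twoToBool (Left ())  = False
twoToBool (Right ()) = True

boolToTwo :: Bool -> Two
boolToTwo False = Left ()
boolToTwo True  = Right ()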

Another way of looking at this is if two finitely inhabited types have the same number of inhabitants, then the types can be put into an isomorphism with each other. In other words, types modulo isomorphisms can be thought as representing the natural numbers. Because of this, we can build a curious proof system for the natural numbers using ordinary type manipulation.

In addition, we also get a natural way of expressing and manipulating polynomials. Polymorphic types can be seen as being very similar to polynomial expressions with natural coefficients N[x]. The polymorphic type variables have the ability to be instantiated to any value, corresponding to evaluating the polynomial for some number.

type ExamplePoly a = I + a + a * a * Three 

The Lens ecosystem gives some interesting combinators for manipulating this algebra. The type Iso' a b contains isomorphisms. Since we’re only considering types up to isomorphism, this Iso' represents equality. We can give identity isomorphisms, compose isomorphisms and reverse isomorphisms.

type a ~~ b = Iso' a b

refl ::  a ~~ a
refl = id

compose :: (a ~~ b) -> (b ~~ c) -> (a ~~ c)
compose = (.)

rev :: a ~~ b -> b ~~ a
rev = from

We can already form a very simple proof.

-- a very simple proof. Holds by definition
oneonetwo :: (I + I) ~~ Two
oneonetwo = id

Now we’ll add some more combinators, basically the axioms that the types mod isos are a commutative semiring. Semirings have an addition and multiplication operator that distribute over each other. It is interesting to note that I believe all of these Iso' actually are guaranteed to be isomorphisms ( to . from = id and from . to = id ) because of parametricity. from and to are unique ignoring any issues with bottoms because the polymorphic type signature is so constraining. This is not usually guaranteed to be true in Haskell just from saying it is an Iso'. If I give you an Iso' Bool Bool it might actually be the iso (const True) (const True) for example, which is not an isomorphism.

-- now we start to state our axioms
-- interestingly, I believe the Iso and Lens laws to follow actually follow via parametricity.

-- associativity + a identity object make mul and plus a monoid
plus_assoc :: Iso' (a + (b + c)) ((a + b) + c)
plus_assoc =  iso assoc unassoc  where
                   assoc (Left a) = Left (Left a)
                   assoc (Right (Left b)) = Left (Right b)
                   assoc (Right (Right c)) = Right c
                   unassoc (Left (Left a)) = Left a
                   unassoc (Left (Right b)) = Right (Left b)
                   unassoc (Right c) = (Right (Right c))

mul_assoc :: Iso' (a * (b * c)) ((a * b) * c)
mul_assoc =  iso (\(a,(b,c)) -> ((a,b),c)) (\((a,b),c) -> (a,(b,c)))


-- use `absurd` from Data.Void for empty case.
id_plus :: Iso' (a + O) a
id_plus = iso (\case Left x -> x
                     Right x -> absurd x) Left

id_mul :: Iso' (a * I) a
id_mul = iso (\(x,_) -> x) (\x -> (x,()))


-- they are also commutative
-- specialized version of swapped from Control.Lens.Iso for future clarity
comm_plus :: Iso' (a + b) (b + a)
comm_plus = swapped
comm_mul :: Iso' (a * b) (b * a)
comm_mul = swapped


-- I don't see this one in Lens. Maybe it is there though?
-- distributive property
rdist :: Iso' (a * (b + c)) (a * b + a * c)
rdist = iso (\(a,bc) -> (a,) +++ (a,) $ bc)
            (\case Left (a,b)  -> (a, Left b)
                   Right (a,c) -> (a, Right c))

mul_zero :: Iso' (a,O) O
mul_zero = iso (\(_,y) -> absurd y) absurd

There are also combinators for lifting isomorphisms into bifunctors: firsting, seconding, and bimapping. These are important for indexing into subexpressions of our types in a point-free style.

-- a specialized version of firsting and seconding for clarity
-- firsting and seconding feel to me more like words for tuples
lefting :: (a ~~ b) -> (a + c) ~~ (b + c)
lefting = firsting

righting :: (a ~~ b) -> (c + a) ~~ (c + b)
righting = seconding

Here is a slightly more complicated proof now available to us.

-- a more complicated proof
twotwofour :: Iso' (Two + Two) Four
twotwofour = rev plus_assoc

We can attempt a more interesting and difficult proof. I developed this iteratively, using _ typed-hole expressions so that GHC would tell me what I had manipulated my type into at that point in the proof.
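For instance, a sketch of the workflow (not code from the final file): leave a typed hole at the end of the chain built so far,

-- factorexample = rdist . lefting (comm_mul . rdist) . _

and GHC's "Found hole" error reports the type of the hole, i.e. what shape the expression has been manipulated into so far, which tells you what step to apply next.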

ldist ::   ((b + c) * a) ~~ (b * a + c * a)
ldist = comm_mul . rdist . (lefting comm_mul) . (righting comm_mul)


-- very painful. Using holes _ and error messages absolutely essential
factorexample ::  ((a + I) * (a + I)) ~~ (a * a + Two * a + I)
factorexample = rdist .  -- distribute out the right side 
               (lefting (comm_mul . rdist)) . -- in the left sum term distribute out
               (righting (comm_mul . rdist)) . -- in the right sum term distribute out
                plus_assoc . -- reassociate plus to the left
               (lefting (lefting (righting comm_mul))) . -- turn a * I term to I * a
                (lefting (rev plus_assoc)) . -- associate the two a*I terms together into an (a * I + a * I) term 
                 (lefting (righting (rev ldist))) . -- factor together that subterm into Two * a
                  (righting id_mul) -- convert last I * I term into I

Artwork Courtesy of David. Sorry for any motion sickness.

The proof here is actually pretty trivial and can be completely automated away. We’ll get to that later.

If Iso' is equality, what about the other members of the Lens family? Justin Le says that a Lens s a is a witness to an isomorphism between the type s and a tuple of a with something else. A Prism witnesses a similar thing for sums. Once we are only considering types mod isos, if you think about it, these are expressions of two familiar relations on the natural numbers: the inequality relation and the divisibility relation.

type a >= b = Prism' a b
type a .| b = Lens' a b

Mathematically, these relations can be composed with equalities, just like in the lens hierarchy Lens and Prism can be composed with `Iso`. Both form a category, since they both have id and (.).

{-

the core combinators from the lens library for manipulating these are

_1 :: (a * b) .| a
_2 :: (a * b) .| b

_Left :: (a + b) >= a
_Right :: (a + b) >= b

For example:
-}
twodiv4 :: (Two * Two) .| Two
twodiv4 = _1

onelesstwo :: Two >= I
onelesstwo = _Left

threedivthree :: Three .| Three
threedivthree = id

Here are a couple of identities that I originally thought we couldn’t derive from these basic combinators. Woah-ho, my bad. These are totally derivable using id_mul, id_plus, mul_zero, _1, _2, _Left, _Right.

-- once again, these are true via parametricity, more or less
one_div :: a .| I 
-- one_div = lens (const ()) const
one_div = (rev id_mul) . _2

zero_lte :: a >= O
-- zero_lte = prism' absurd (const Nothing)
zero_lte = (rev id_plus) . _Right

zero_div :: O .| a
-- zero_div = lens absurd const
zero_div = (rev mul_zero) . _1

Pretty neat! Random thoughts and questions before we get into the slog of automation:

  • Traversal is the “is polynomial in” relation, which seems a rather weak statement on its own.
  • Implementing automatic polynomial division is totally possible and interesting
  • What is the deal with infinite types like [a]? Fix. I suppose this is a theory of formal power series / combinatorial species. Fun combinatorics, generatingfunctionology. Brent Yorgey did his dissertation on this stuff. Wow. I’ve never really read this. It is way more relevant than I realized.
  • Multivariate polynomial algorithms would also be interesting to explore (Grobner basis, multivariate division)
  • Derivatives of types and zippers – Conor McBride
  • Negative Numbers and Fractions?
  • Lifting to rank-1 types. Define Negative and Fractions via Galois connection?
-- partially applied +
type P n = (Either n) 
-- partially applied *
type M n = (,) n

Edit: /u/carette (wonder who that is 😉 ) writes:

“You should dig into
J Carette, A Sabry Computing with semirings and weak rig groupoids, in Proceedings of ESOP 2016, p. 123-148. Agda code in https://github.com/JacquesCarette/pi-dual/tree/master/Univalence. A lot of the algebra you develop is there too.

If you hunt around in my repos, you’ll also find things about lenses, exploring some of the same things you mention here.”

Similar ideas taken further and with more sophistication. Very interesting. Check it out.

Automation

Our factor example above was quite painful, yet the theorem was exceedingly obvious by expansion of the left and right sides. Can we automate that? Yes we can!

Here’s the battle plan:

  • Distribute out all expressions like a*(b+c) so that all multiplication nodes appear at the bottom of the tree.
  • Reduce the expression by absorbing all stupid a*1, a*0, a+0 terms.
  • Reassociate everything to the right, giving a list like format
  • Sort the multiplicative terms by power of the variable

Once we have these operations, we’ll combine them into a canonical operation. From there, most equality proofs can be performed using the rewrite operation, which merely puts both sides into canonical form.

-- | The main combinator to expand out any polynomial into canonical form with terms sorted

canonical :: (Dist a b, Absorb b c, RightAssoc c d, SortTerm d e) => a ~~ e
canonical = dist . absorb . rightAssoc . sortTerm

rewrite :: forall a' a b c d e e' d' c' b'. (Dist a b, Absorb b c, RightAssoc c d,
                 SortTerm d e, e ~ e', Dist a' b',
                Absorb b' c', RightAssoc c' d', SortTerm d' e') => a ~~ a'
rewrite = canonical . (rev canonical)

Once we have those, the painful theorem above and harder ones becomes trivial.

ex9 :: (a ~ V a') => ((a + I) * (I + a)) ~~ (a * a + Two * a + I)
ex9 = rewrite 

ex10 :: (a ~ V a') => ((a + I) * (I + a) * (Two + a)) ~~  (a * a * a + a * a * Four + Two + Four * a + a)
ex10 = rewrite 

Now we’ll build the typeclasses necessary to achieve each of these aims in order. The typeclass system is perfect for what we want to do, as it builds terms by inspecting types. It isn’t perfect in the sense that typeclass pattern matching needs to be tricked into doing what we need. I have traded cleverness and elegance for verbosity.

In order to make our lives easier, we’ll need to tag every variable name with a newtype wrapper. Otherwise we won’t know when we’ve hit a leaf node that is a variable. I’ve used this trick before here in an early version of my faking Compiling to Categories series. These wrappers are easily automatically stripped.

-- For working with Variables

-- a newtype to mark variable position
newtype V a = V a

-- Suggested usage, bind the V in a unification predicate to keep expression looking clean
-- (V a' ~ a) => (a + I) * a 

-- multinomials. to be implemented some other day
-- a phantom l labelled newtype for variable ordering. 
newtype VL l a = VL a

A common pattern I exploit is to use a type family to drive complicated recursion. Closed type families allow more overlap and default patterns which is very useful for programming. However, type families do not carry values, so we need to flip flop between the typeclass and type family system to achieve our ends.

Here is the implementation of the distributor Dist. We make RDist and LDist typeclasses that make a sweep of the entire tree, using ldist and rdist as makes sense. It was convenient to separate these into two classes for my mental sanity. I am not convinced even now that I have every case. Then the master control class Dist runs these classes until any node that has a (*) in it has no nodes with (+) underneath, as checked by the HasPlus type family.


class RDist a b | a -> b where
    rdist' :: Iso' a b

instance RDist a a' => RDist (a * I) (a' * I) where
    rdist' = firsting rdist'
instance RDist a a' => RDist (a * O) (a' * O) where
    rdist' = firsting rdist'
instance RDist a a' => RDist (a * (V b)) (a' * (V b)) where
    rdist' = firsting rdist'

instance (RDist (a * b) ab, RDist (a * c) ac) => RDist (a * (b + c)) (ab + ac) where
    rdist' = rdist . (bimapping rdist' rdist')
instance (RDist a a', RDist (b * c) bc) => RDist (a * (b * c)) (a' * bc) where
    rdist' = (bimapping rdist' rdist')
instance (RDist a a', RDist b b') => RDist (a + b) (a' + b') where
    rdist' = bimapping rdist' rdist'

instance RDist O O where
    rdist' = id
instance RDist I I where
    rdist' = id
instance RDist (V a) (V a) where
    rdist' = id

-- can derive ldist from swapped . rdist' . swapped?

class LDist a b | a -> b where
    ldist' :: Iso' a b
instance (LDist (b * a) ab, LDist (c * a) ac) => LDist ((b + c) * a) (ab + ac) where
    ldist' = ldist . (bimapping ldist' ldist')
instance (LDist (b * c) bc, LDist a a') => LDist ((b * c) * a) (bc * a') where
    ldist' = bimapping ldist' ldist'
instance LDist a a' => LDist (I * a) (I * a') where
    ldist' = seconding ldist'
instance LDist a a' => LDist (O * a) (O * a') where
    ldist' = seconding ldist'
instance LDist a a' => LDist ((V b) * a) ((V b) * a') where
    ldist' = seconding ldist'

instance (LDist a a', LDist b b') => LDist (a + b) (a' + b') where
    ldist' = bimapping ldist' ldist'

instance LDist O O where
    ldist' = id
instance LDist I I where
    ldist' = id
instance LDist (V a) (V a) where
    ldist' = id

type family HasPlus a where
    HasPlus (a + b) = 'True
    HasPlus (a * b) = (HasPlus a) || (HasPlus b)
    HasPlus I = 'False
    HasPlus O = 'False
    HasPlus (V _) = 'False

class Dist' f a b | f a -> b where
    dist' :: a ~~ b
instance (f ~ HasPlus ab', LDist (a * b) ab, 
          RDist ab ab', Dist' f ab' ab'') => Dist' 'True (a * b) ab'' where
    dist' = ldist' . rdist' . (dist' @f)
instance Dist' 'False (a * b) (a * b) where
    dist' = id
instance (HasPlus a ~ fa, HasPlus b ~ fb, 
          Dist' fa a a', Dist' fb b b') => Dist' x (a + b) (a' + b') where
    dist' = bimapping (dist' @fa) (dist' @fb)
    -- is that enough though? only dist if 
instance Dist' x I I where
    dist' = id
instance Dist' x O O where
    dist' = id
instance Dist' x (V a) (V a) where
    dist' = id

class Dist a b | a -> b where
    dist :: a ~~ b
-- dist' should distributed out all multiplications
instance (f ~ HasPlus a, Dist' f a b) => Dist a b where
    dist = dist' @f

Next is the Absorb type class. It is arranged somewhat similarly to the above. Greedily absorb, and keep doing it until no absorptions are left. I think that works.


-- RAbsorb matches only on the right hand side of binary operators
-- matching on both sides is ungainly to write
class RAbsorb a b | a -> b where
    rabsorb :: a ~~ b
-- obvious absorptions
instance RAbsorb x x' => RAbsorb (x + O) x' where
    rabsorb = id_plus . rabsorb
instance RAbsorb x x' => RAbsorb (x + I) (x' + I) where
    rabsorb = firsting rabsorb
instance RAbsorb x x' => RAbsorb (x * I) x' where
    rabsorb = id_mul . rabsorb
instance RAbsorb (x * O) O where
    rabsorb = mul_zero
instance RAbsorb x x' => RAbsorb (x * (V a)) (x' * (V a)) where
    rabsorb = firsting rabsorb
instance RAbsorb x x' => RAbsorb (x + (V a)) (x' + (V a)) where
    rabsorb = lefting rabsorb
-- recursion steps
instance (RAbsorb (y * z) yz, RAbsorb x x') => RAbsorb (x * (y * z)) (x' * yz) where
    rabsorb = bimapping rabsorb rabsorb
instance (RAbsorb (y + z) yz, RAbsorb x x') => RAbsorb (x * (y + z)) (x' * yz) where
    rabsorb = bimapping rabsorb rabsorb
instance (RAbsorb (y + z) yz, RAbsorb x x') => RAbsorb (x + (y + z)) (x' + yz) where
    rabsorb = bimapping rabsorb rabsorb
instance (RAbsorb (y * z) yz, RAbsorb x x') => RAbsorb (x + (y * z)) (x' + yz) where
    rabsorb = bimapping rabsorb rabsorb
-- base cases
instance RAbsorb O O where
    rabsorb = id
instance RAbsorb I I where
    rabsorb = id
instance RAbsorb (V a) (V a) where
    rabsorb = id


-- mirror of RAbsorb    
class LAbsorb a b | a -> b where
    labsorb :: a ~~ b
instance LAbsorb x x' => LAbsorb (O + x) x' where
    labsorb = comm_plus . id_plus . labsorb
instance LAbsorb x x' => LAbsorb (I + x) (I + x') where
    labsorb = righting labsorb
instance LAbsorb x x' => LAbsorb (I * x) x' where
    labsorb = comm_mul . id_mul . labsorb
instance LAbsorb (O * x) O where
    labsorb = comm_mul . mul_zero
instance LAbsorb x x' => LAbsorb ((V a) + x) ((V a) + x') where
    labsorb = righting labsorb
instance LAbsorb x x' => LAbsorb ((V a) * x) ((V a) * x') where
    labsorb = seconding labsorb

instance (LAbsorb (y * z) yz, LAbsorb x x') => LAbsorb ((y * z) * x) (yz * x') where
    labsorb = bimapping labsorb labsorb
instance (LAbsorb (y + z) yz, LAbsorb x x') => LAbsorb ((y + z) * x) (yz * x') where
    labsorb = bimapping labsorb labsorb
instance (LAbsorb (y + z) yz, LAbsorb x x') => LAbsorb ((y + z) + x) (yz + x') where
    labsorb = bimapping labsorb labsorb
instance (LAbsorb (y * z) yz, LAbsorb x x') => LAbsorb ((y * z) + x) (yz + x') where
    labsorb = bimapping labsorb labsorb

instance LAbsorb O O where
    labsorb = id
instance LAbsorb I I where
    labsorb = id
instance LAbsorb (V a) (V a) where
    labsorb = id
    
-- labsorb :: (Swapped p, RAbsorb (p b a) (p b' a')) => (p a b) ~~ (p a' b')
-- labsorb = swapped . rabsorb . swapped   

-- a test function to see if an expression is properly absorbed
type family Absorbed a where
    Absorbed (O + a) = 'False
    Absorbed (a + O) = 'False
    Absorbed (a * I) = 'False
    Absorbed (I * a) = 'False
    Absorbed (a * O) = 'False
    Absorbed (O * a) = 'False
    Absorbed (a + b) = (Absorbed a) && (Absorbed b)
    Absorbed (a * b) = (Absorbed a) && (Absorbed b)
    Absorbed I = 'True
    Absorbed O = 'True
    Absorbed (V a) = 'True

-- iteratively rabsorbs and leftabsorbs until tree is properly absorbed.
class Absorb' f a b | f a -> b where
    absorb' :: a ~~ b
instance Absorb' 'True a a where
    absorb' = id
instance (LAbsorb a a', RAbsorb a' a'',
          f ~ Absorbed a'', Absorb' f a'' a''') => Absorb' 'False a a''' where
    absorb' = labsorb . rabsorb . (absorb' @f)

-- wrapper class to avoid the flag.
class Absorb a b | a -> b where
    absorb :: a ~~ b
instance (f ~ Absorbed a, Absorb' f a b) => Absorb a b where
    absorb = absorb' @f

The Associators are a little simpler. You basically just look for the wrong association pattern and call plus_assoc or mul_assoc until they don’t occur anymore, then recurse. We can be assured we’re always making progress if we either switch some association structure or recurse into subparts.


class LeftAssoc a b | a -> b where
    leftAssoc :: Iso' a b

instance LeftAssoc a a' => LeftAssoc (a + I) (a' + I) where
    leftAssoc = firsting leftAssoc 
instance LeftAssoc a a' => LeftAssoc (a + O) (a' + O) where
    leftAssoc = firsting leftAssoc 
instance LeftAssoc a a' => LeftAssoc (a * I) (a' * I) where
    leftAssoc = firsting leftAssoc 
instance LeftAssoc a a' => LeftAssoc (a * O) (a' * O) where
    leftAssoc = firsting leftAssoc 
instance LeftAssoc a a' => LeftAssoc (a * (V b)) (a' * (V b)) where
    leftAssoc = firsting leftAssoc 
instance LeftAssoc a a' => LeftAssoc (a + (V b)) (a' + (V b)) where
    leftAssoc = firsting leftAssoc 

instance (LeftAssoc ((a + b) + c) abc') => LeftAssoc (a + (b + c)) abc' where
    leftAssoc = plus_assoc . leftAssoc 
instance (LeftAssoc ((a * b) * c) abc') => LeftAssoc (a * (b * c)) abc' where
    leftAssoc = mul_assoc . leftAssoc 


instance (LeftAssoc (b * c) bc, LeftAssoc a a') => LeftAssoc (a + (b * c)) (a' + bc) where
    leftAssoc = bimapping leftAssoc leftAssoc
-- a * (b + c) ->  a * b + a * c 
-- This case won't happen if we've already distributed out.
instance (LeftAssoc (b + c) bc, LeftAssoc a a') => LeftAssoc (a * (b + c)) (a' * bc) where
    leftAssoc = bimapping leftAssoc leftAssoc

instance LeftAssoc O O where
    leftAssoc = id
instance LeftAssoc I I where
    leftAssoc = id
instance LeftAssoc (V a) (V a) where
    leftAssoc = id




-- right assoc will completely associate strings of + or *. Mixed terms are not associated.
-- cases  on left hand side of binary expression
-- always makes progress by either reassociating or recursing
class RightAssoc a b | a -> b where
    rightAssoc :: Iso' a b
instance (RightAssoc (a + (b + c)) abc') => RightAssoc ((a + b) + c) abc' where
    rightAssoc = (rev plus_assoc) . rightAssoc 
instance (RightAssoc (a * (b * c)) abc') => RightAssoc ((a * b) * c) abc' where
    rightAssoc = (rev mul_assoc) . rightAssoc 
instance RightAssoc a a' => RightAssoc (I + a) (I + a') where
    rightAssoc = seconding rightAssoc 
instance RightAssoc a a' => RightAssoc (O + a) (O + a') where
    rightAssoc = seconding rightAssoc 
instance RightAssoc a a' => RightAssoc (I * a) (I * a') where
    rightAssoc = seconding rightAssoc 
instance RightAssoc a a' => RightAssoc (O * a) (O * a') where
    rightAssoc = seconding rightAssoc 
instance RightAssoc a a' => RightAssoc ((V b) + a) ((V b) + a') where
    rightAssoc = seconding rightAssoc 
instance RightAssoc a a' => RightAssoc ((V b) * a) ((V b) * a') where
    rightAssoc = seconding rightAssoc 


instance (RightAssoc (b * c) bc, RightAssoc a a') => RightAssoc ((b * c) + a) (bc + a') where
    rightAssoc = bimapping rightAssoc rightAssoc
instance (RightAssoc (b + c) bc, RightAssoc a a') => RightAssoc ((b + c) * a) (bc * a') where
    rightAssoc = bimapping rightAssoc rightAssoc

instance RightAssoc O O where
    rightAssoc = id
instance RightAssoc I I where
    rightAssoc = id
instance RightAssoc (V a) (V a) where
    rightAssoc = id

Finally, the SortTerm routine. SortTerm is a bubble sort. The typeclass Bubble does a single sweep of swaps down the type-level list-like structure we’ve built. SortTerm uses the SortedTerm type family to check whether it is finished. If it isn’t, it calls Bubble again.


type family (a :: k) == (b :: k) :: Bool where
    f a == g b = f == g && a == b
    a == a = 'True
    _ == _ = 'False

type family SortedTerm a :: Bool where
    SortedTerm (a + (b + c)) = (((CmpTerm a b) == 'EQ) || ((CmpTerm a b) == 'GT)) && (SortedTerm (b + c))
    SortedTerm (a + b) = ((CmpTerm a b) == 'EQ) || ((CmpTerm a b) == 'GT)
    --SortedTerm a = 'True 
    SortedTerm I = 'True 
    SortedTerm O = 'True 
    SortedTerm (V a) = 'True 

-- higher powers of V are bigger.
-- CmpTerm compares TimesLists.    
type family CmpTerm a b where
    CmpTerm ((V a) * b) ((V a) * c) = CmpTerm b c
    CmpTerm ((V a) * b) (V a) = 'GT
    CmpTerm (V a) ((V a) * b)  = 'LT
    CmpTerm I ((V a) * b) = 'LT
    CmpTerm ((V a) * b) I = 'GT
    CmpTerm (V a) (V a) = 'EQ
    CmpTerm I (V a) = 'LT
    CmpTerm (V a) I = 'GT
    CmpTerm I I = 'EQ
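
To get a feel for what these families compute, here is a hypothetical GHCi check (Int is an arbitrary stand-in for whatever label V carries; :kind! reduces type families):

ghci> :kind! CmpTerm ((V Int) * (V Int)) (V Int)
= 'GT
ghci> :kind! SortedTerm (((V Int) * (V Int)) + ((V Int) + I))
= 'True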


-- Maybe this is all unnecessary since we'll expand out and absorb to a*a + a*a + a + a + a + a


-- type a == b = TEq.(==) a b


-- Head and Tail of PlusLists
type family PlusHead a where
    PlusHead (x + y) = x
    PlusHead x = x
type family PlusTail a where
    PlusTail (x + y) = y



-- bubble assumes a PlusList of multiplicative terms, i.e. fully distributed, fully right-associated, fully absorbed
-- does one pass of a bubble sort
class Bubble f a b | f a -> b where
    bubble :: a ~~ b
-- more to go
instance (f ~ CmpTerm b (PlusHead c), Bubble f (b + c) bc) => Bubble 'EQ (a + (b + c)) (a + bc) where
    bubble = righting (bubble @f)
instance (f ~ CmpTerm b (PlusHead c), Bubble f (b + c) bc) => Bubble 'GT (a + (b + c)) (a + bc) where
    bubble = righting (bubble @f)
instance (f ~ CmpTerm a (PlusHead c), Bubble f (a + c) ac) => Bubble 'LT (a + (b + c)) (b + ac) where
    bubble = plus_assoc . (lefting comm_plus) . (rev plus_assoc) . righting (bubble @f)


-- The times, or constants shows that we're at the end of our + list.
instance Bubble 'EQ (a + (b * c)) (a + (b * c)) where
    bubble = id
instance Bubble 'GT (a + (b * c)) (a + (b * c)) where
    bubble = id
instance Bubble 'LT (a + (b * c)) ((b * c) + a) where
    bubble = comm_plus

instance Bubble 'EQ (a + I) (a + I) where
    bubble = id
instance Bubble 'GT (a + I) (a + I) where
    bubble = id
instance Bubble 'LT (a + I) (I + a) where
    bubble = comm_plus

instance Bubble 'EQ (a + O) (a + O) where
    bubble = id
instance Bubble 'GT (a + O) (a + O) where
    bubble = id
instance Bubble 'LT (a + O) (O + a) where -- shouldn't happen
    bubble = comm_plus

instance Bubble 'EQ (a + (V b)) (a + (V b)) where
    bubble = id
instance Bubble 'GT (a + (V b)) (a + (V b)) where
    bubble = id
instance Bubble 'LT (a + (V b)) ((V b) + a) where
    bubble = comm_plus
-- goofy base cases in case bubble gets called on a single element
instance Bubble x O O where
    bubble = id
instance Bubble x I I where
    bubble = id
instance Bubble x (V a) (V a) where
    bubble = id


-- sort term assumes rightassociated, fully distributed, fully I O absorbed expressions
class SortTerm' f a b | f a -> b where -- f is flag whether PlusList is sorted.
    sortTerm' :: a ~~ b
instance SortTerm' 'True a a where
    sortTerm' = id
-- a single term with no plus shouldn't get here. That is why PlusTail is ok.
instance (f ~ CmpTerm (PlusHead a) (PlusHead (PlusTail a)), Bubble f a a',
          f' ~ SortedTerm a', SortTerm' f' a' b) => SortTerm' 'False a b where
    sortTerm' = (bubble @f) . (sortTerm' @f')
class SortTerm a b | a -> b where -- f is flag whether PlusList is sorted.
    sortTerm :: a ~~ b
instance (f ~ SortedTerm a, SortTerm' f a a') => SortTerm a a' where
    sortTerm = sortTerm' @f

Hope you thought this was neat!

Proving Addition is Commutative in Haskell using Singletons

Yesterday morning, I was struck with awe at how amazingly cool the dependently typed approach (Agda, Idris, Coq) to proving programs is. It is great that I can still feel that after tinkering around with it for a couple years. It could feel old hat.

In celebration of that emotion, I decided to write a little introductory blog post about how to do some proving like that in Haskell. Haskell can replicate this ability to prove with varying amounts of pain. For the record, there isn’t anything novel in what follows.

One methodology for replicating the feel of dependent types is to use the singletons pattern. The singleton pattern is a way of building a data type so that there is an exact copy of the value of a variable in its type.

For future reference, I have the following extensions turned on, some of which I'll mention when they come up.

{-# LANGUAGE GADTs, DataKinds, TypeFamilies, RankNTypes, UndecidableInstances, PolyKinds #-}

Here’s how the pattern goes.

Step 1: define your ordinary data type. Bool is already defined in the Prelude, but here is what it looks like

data Bool = True | False

Step 2: turn on the DataKinds extension. This automatically promotes any data type constructor like True or False or Just into types that have apostrophes in front of them 'True, 'False, 'Just. This is mostly a convenience. You could have manually defined something similar like so

data True
data False
data Just a

Step 3: Define your singleton datatype. The singleton datatype is a GADT (generalized algebraic data type). GADT declarations take on a new syntax. It helps to realize that ordinary data constructors like Just are just functions. You can use them anywhere you can use functions. Just has the type a -> Maybe a. It might help to show that you can define a lower case synonym.

just :: a -> Maybe a
just = Just

just is clearly just a regular function. What makes constructors a special thing (not quite “just functions”) is that you can also pattern match on them. Data constructors are functions that “do nothing”. They hold the data until eventually you get it back again via pattern matching.

So why don’t you write the type signature of the constructors when you define them? Why does using a data statement look so different than a function definition? The GADT syntax brings the two concepts visually closer.

Letting you write the type signature for the constructor lets you give it a constrained type signature, rather than the inferred most general one. It's similar to the following situation. If you define an identity function id, it has the polymorphic type a -> a. However, you can write a function with an identical implementation and explicitly constrain its type. If you try to use boolid on an Int, it is a type error.

id x = x

boolid :: Bool -> Bool
boolid x = x

The GADT syntax lets you constrain the type signature of the constructor, but also, very interestingly, lets the type checker infer information when you pattern match into the GADT.

With all that spiel, here is the singleton type for Bool as a GADT.

data SBool s where
   STrue :: SBool 'True
   SFalse :: SBool 'False

You have made an exact copy of the value at the type level. When you pattern match into a variable x of type SBool s in the STrue branch, it knows that s ~ 'True and in the SFalse branch it knows that s ~ 'False.

Here’s the analogous construction for a Peano natural number

data Nat = Zero | Succ Nat

data SNat s where
    SZero :: SNat 'Zero
    SSucc :: SNat n -> SNat ('Succ n)
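
For example, the singleton value corresponding to the number 2 carries an exact copy of 2 in its type:

two :: SNat ('Succ ('Succ 'Zero))
two = SSucc (SSucc SZero)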

Now let’s talk about programming.

Addition is straightforward to define for our Nat.

nplus :: Nat -> Nat -> Nat
nplus Zero x = x
nplus (Succ x) y = Succ (nplus x y)

Let’s replicate this definition at the type level. The extension we’ll want is TypeFamilies. Type families enables a syntax and feature for defining functions on types very similarly to how you define regular functions.

type family NPlus x y where
    NPlus 'Zero x = x
    NPlus ('Succ x) y = 'Succ (NPlus x y)
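
As a quick sanity check, GHCi can reduce the type family on concrete arguments with :kind!:

ghci> :kind! NPlus ('Succ 'Zero) ('Succ 'Zero)
= 'Succ ('Succ 'Zero)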

Now finally, we can exactly mirror this definition on singletons

snplus :: SNat n -> SNat n' -> SNat (NPlus n n')
snplus SZero x = x
snplus (SSucc x) y = SSucc (snplus x y) 

In the type signature SNat is kind of like a weirdo forall. It is a binding form that generates a new type variable you need to express the typelevel connections you want. The type variable n is a typelevel thing that represents the value.

Now let's talk about proving. Basically, if your intent is proving things, I think it is simplest if you forget that the original data type ever existed. It is just a vestigial appendix that makes the DataKinds you need. Only work with singletons. You will then need to make a safe layer translating into and out of the singletons if you want to interface with non-singleton code.
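
A minimal sketch of what such a translation layer could look like (SomeNat, fromNat, and toNat are my own names for illustration): an existential wrapper hides the type index when we come from a runtime Nat, and pattern matching recovers it.

data SomeNat where
    SomeNat :: SNat n -> SomeNat

fromNat :: Nat -> SomeNat
fromNat Zero = SomeNat SZero
fromNat (Succ n) = case fromNat n of
                     SomeNat m -> SomeNat (SSucc m)

toNat :: SNat n -> Nat
toNat SZero = Zero
toNat (SSucc n) = Succ (toNat n)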

We’re going to want to prove something about equality. The standard definition of equality is

data Eq1 a b where
    Refl :: Eq1 a a

I put the 1 there just so I wouldn’t clash with the Eq typeclass. It’s ugly, sorry. You can find an identical definition in base http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Type-Equality.html that uses the extension TypeOperators for a much cleaner syntax.

Why is this a reasonable equality? You can construct using Refl only when you are just showing that a equals itself. When you pattern match on Refl, the type checker is learning that a ~ b. It’s confusing. Let’s just try using it.

We can prove a couple facts about equality. First off that it is a symmetric relation. If a = b then b = a.

symm :: Eq1 a b -> Eq1 b a
symm Refl = _

When we pattern match and expose the incoming Refl, the type checker learns that a ~ b in this branch of the pattern match. Now we need to return an Eq1 b a. But we know that a ~ b so this is the same as an Eq1 a a. Well, we can easily do that with a Refl.

symm :: Eq1 a b -> Eq1 b a
symm Refl = Refl

Similarly we can prove the transitivity of equality.

trans :: Eq1 a b -> Eq1 b c -> Eq1 a c 
trans Refl Refl = Refl

Pattern matching on the first equality tells the type checker that a ~ b, the second that b ~ c. Now we need to return a Eq1 a c but this is the same as Eq1 a a because of the ~ we have, so Refl suffices.

It’s all damn clever. I wouldn’t have come up with it.

Now let’s start proving things about our addition operator. Can we prove that

proof1 :: Eq1 'Zero 'Zero
proof1 = Refl

This one is straightforward since obviously 'Zero is 'Zero. How about something slightly more complicated: 1 + 0 = 1.

proof1' :: Eq1 (NPlus ('Succ 'Zero) 'Zero) ('Succ 'Zero)
proof1' = Refl

The typechecker can evaluate addition on concrete numbers to confirm this all works out.

Here's a much more interesting property: \forall x. 0 + x = x

proof2 :: SNat x -> Eq1 (NPlus 'Zero x) x
proof2 x = Refl

This one is also straightforward to prove. Looking at the definition of NPlus knowing that the first argument is 'Zero is enough to evaluate forward.

Here's our first toughie: \forall x. x + 0 = x. This may seem silly, but our definition of NPlus did not treat the two arguments symmetrically. It only pattern matches on the first. Until we know more about x, we can't continue. So how do we learn more? By pattern matching and looking at the possible cases of x.

proof3 :: SNat x -> (Eq1 (NPlus x 'Zero) x)
proof3 SZero = Refl
proof3 (SSucc x) | Refl <- proof3 x = Refl

The first case is very concrete and easy to prove. The second case is more subtle. We learn that x ~ 'Succ x1 for some x1 when we expose the SSucc constructor. Hence we now need to prove Eq1 (NPlus ('Succ x1) 'Zero) ('Succ x1). The system now has enough information to evaluate NPlus a bit, so actually we need Eq1 ('Succ (NPlus x1 'Zero)) ('Succ x1). The term (NPlus x1 'Zero) looks very much like what we were trying to prove in the first place. We can use a recursive call to get an equality proof, which we pattern match to a Refl to learn that (NPlus x1 'Zero) ~ x1. That makes the required result Eq1 ('Succ x1) ('Succ x1), which is obviously true via Refl. I learned a neat-o syntax for this second pattern match, called pattern guards. Another way of writing the same thing is

proof3 (SSucc x) = case (proof3 x) of Refl -> Refl

Follow all that? Haskell is not as friendly a place to do this as Idris or Agda is.

Now finally, the pièce de résistance: the commutativity of addition, which works in a similar but more complicated way.

natcomm :: SNat x -> SNat y -> Eq1 (NPlus x y) (NPlus y x)
natcomm SZero y | Refl <- proof3 y = Refl 
natcomm x@(SSucc x') SZero | Refl <- proof3 x = Refl
natcomm (SSucc x) (SSucc y) | Refl <- natcomm x (SSucc y), Refl <- natcomm (SSucc x) y, Refl <- natcomm x y = Refl


A question: to what degree does this prove that nplus or snplus are commutative? The linkage is not perfect. snplus is type-constrained to return the same result as NPlus, which feels ok. nplus is syntactically identical to the implementation of the other two, but that is the only link; there is nothing compiler-enforced.

The existence of non-termination in Haskell also makes everything done here much less foolproof. It wouldn't be all that hard to accidentally make a non-terminating recursive call in one of our proofs, and the compiler would accept it and say nothing.

I recommend you check out the links below for more.

Source available here https://github.com/philzook58/singleberg

Resources:

https://github.com/RyanGlScott/ghc-software-foundations

http://hackage.haskell.org/package/singletons

https://blog.jle.im/entry/introduction-to-singletons-1.html

http://hackage.haskell.org/package/singleton-nats

Thoughts on Faking Some of GADTs in Rust

I'm a guy who is somewhat familiar with Haskell and is trying to learn Rust. So I thought I'd try to replicate some cool Haskell functionality in Rust. I would love to hear comments, because I'm trying to learn. I have no sense of Rust aesthetics yet, and in particular I have no idea how this interacts with the borrow system. What follows is a pretty rough brain dump.

GADTs (Generalized algebraic data types) are an extension in Haskell that allows you to write constrained type signatures for your data constructors. They also change how the type checking of pattern matching is processed.

GADTs are sometimes described/faked by making data types that hold equality/unification constraints. Equality constraints in Haskell like a ~ Int are fairly magical, and the Rust compiler does not support them in an obvious way. Maybe this is the next project: figure out how to fake 'em if one can. I don't think this is promising though, because faking them will be a little wonky, and then GADTs are a little wonky on top of that. See https://docs.rs/refl/0.1.2/refl/ So we'll go another (related) road.

This is roughly what GADTs look like in Haskell.

data MyGadt a where
  TBool :: Bool -> MyGadt Bool
  TInt :: Int -> MyGadt Int

And here is one style of encoding using smart constructors and a typeclass for elimination (pattern matching is replicated as a function that takes callbacks for the data held in the different cases). Regular functions can have a more restricted type signature than the most general one their implementation implies. The reason to use a typeclass is so that we can write the eliminator as returning the same type that the GADT supplies. There isn't an explicit equality constraint. A kind of Leibniz equality (forall f. f a -> f b) is hiding in the eliminator. The Leibniz equality can be used in place of (~) constraints at some manual cost. http://code.slipthrough.net/2016/08/10/approximating-gadts-in-purescript/
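
For reference, this is roughly what that Leibniz-style equality looks like as a Haskell definition (my own spelling of a standard construction; it needs RankNTypes):

newtype Leibniz a b = Leibniz { subst :: forall f. f a -> f b }

refl :: Leibniz a a
refl = Leibniz id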


data MyGadt2 a = TBool2 Bool | TInt2 Int
-- smart constructors. Hide the original constructors
tInt :: Int -> MyGadt2 Int
tInt = TInt2

tBool :: Bool -> MyGadt2 Bool
tBool = TBool2

-- gadt eliminator
class MyGadtElim a where
   elimMyGadt :: forall f. MyGadt2 a -> (Bool -> f Bool) -> (Int -> f Int) -> f a

instance MyGadtElim Int where
   elimMyGadt (TInt2 x) fb fi = fi x
   elimMyGadt _ _ _ = error "How did TBool2 get type MyGadt2 Int?"
instance MyGadtElim Bool where
   elimMyGadt (TBool2 x) fb fi = fb x
 

The forall f :: * -> * is a problem for Rust. Rust does not have higher kinded types, although they can be faked to some degree. https://gist.github.com/CMCDragonkai/a5638f50c87d49f815b8 There are murmurs of Associated Type Constructors / GATs, whatever those are, that help ease the pain, but I'm pretty sure they are not implemented anywhere yet.

I’m going to do something related, a defunctionalization of the higher kinded types. We make an application trait, that will apply the given type function tag to the argument. What I’m doing is very similar to what happens in the singletons library, so we may be getting some things for free.

https://typesandkinds.wordpress.com/2013/04/01/defunctionalization-for-the-win/

trait App1<A> { type T; }

Then, in order to define a new typelevel function, rip out a quick tag type and an App1 impl.

struct F1 {}
impl<A> App1<A> for F1 {
    type T = Vec<Vec<A>>;
}

It might be possible to sugar this up with a macro. It may also be possible to write typelevel functions in a point-free style without defining new function tag names. The combinators Id, Comp, Par, Fst, Snd, Dup, Const are all reasonably definable and fairly clear for small functions. Also the combinator S, if you want to talk SKI combinator calculus, which is unfit for humans. https://en.wikipedia.org/wiki/SKI_combinator_calculus For currying, I used a number for how many arguments are left to be applied (I'm not sure I've been entirely consistent with these numbers). You need to do currying quite manually. It may be better to work with tupled arguments.

struct Vec1 {} // number is number of applications left.
impl<A> App1<A> for Vec1 {
    type T = Vec<A>;
}

struct Id();

impl<A> App1<A> for Id {
    type T = A;
}

// partially applied const.
// struct Const()
// call this next one const1
struct Const1();
struct Const<B>(PhantomData<B>); // should be Const1
impl<A> App1<A> for Const1 { // partial application
    type T = Const<A>;
}

impl<A, B> App1<A> for Const<B> {
    type T = B;
}



struct Fst {}
impl<A, B> App1<(A,B)> for Fst {
    type T = A;
}

struct Snd {}
impl<A, B> App1<(A,B)> for Snd {
    type T = B;
}


struct Dup {}
impl<A> App1<A> for Dup {
    type T = (A, A);
}

struct Par2 {}
struct Par1<F>(PhantomData<F>);
struct Par<F, G>(PhantomData<F>, PhantomData<G>);
impl<F> App1<F> for Par2 {
    type T = Par1<F>;
}
impl<F, G> App1<G> for Par1<F> {
    type T = Par<F, G>;
}

impl<F, G, X, Y> App1<(X,Y)> for Par<F, G> where F : App1<X>, G : App1<Y> {
    type T = (F::T, G::T);
}

// In order to curry, i think I'd have to define a name for every curried form.


// combinator calculus Const is K, Id is I, and here is the S combinator. Yikes.
type I = Id;
type K = Const1; // the curried form of Const plays the role of K

struct S3 {}
struct S2<A>(PhantomData<A>);
struct S1<A, B>(PhantomData<A>, PhantomData<B>);
// struct S<A, B, C>(PhantomData<A>, PhantomData<B>, PhantomData<C>);
impl<A, B, C> App1<C> for S1<A, B> where A : App1<C>, B : App1<C>, <A as App1<C>>::T : App1< <B as App1<C>>::T > {
    type T = < <A as App1<C>>::T as App1< <B as App1<C>>::T > >::T;
}


struct Comp2();
struct Comp1<F>(PhantomData<F>);
struct Comp<F, G>(PhantomData<F>, PhantomData<G>);
impl<F, G, A> App1<A> for Comp<F, G> where G : App1<A>, F : App1< <G as App1<A>>::T > {
    type T = <F as App1< <G as App1<A>>::T >>::T;
}

Anyway, the following is a translation of the above Haskell (well, I didn’t wrap an actual i64 or bool in there but I could have I think). You need to hide the actual constructors labeled INTERNAL in a user inaccessible module.

enum Gadt<A> {
    TIntINTERNAL(PhantomData<A>),
    TBoolINTERNAL(PhantomData<A>)
}
// then build the specialized constructors
fn TBool() -> Gadt<bool> {
    Gadt::TBoolINTERNAL(PhantomData)
}

fn TInt() -> Gadt<i64> {
    Gadt::TIntINTERNAL(PhantomData)
}

The smart constructors put the right type in the parameter spot.

Then pattern matching is a custom trait per gadtified type. Is it possible to unify the different elimination traits that will come up into a single Elim trait? I’m 50-50 about whether this is possible. What we’re doing is a kind of fancy map_or_else if that helps you.

https://doc.rust-lang.org/std/option/enum.Option.html#method.map_or_else

trait GadtElim {
    type Inner;
    fn gadtElim<F>(&self, case1 : <F as App1<bool>>::T , case2 : <F as App1<i64>>::T ) -> <F as App1<Self::Inner>>::T  where
         F : App1<i64>,
         F : App1<bool>,
         F : App1<Self::Inner>;
}


impl GadtElim for Gadt<i64> {
    type Inner = i64;
    fn gadtElim<F>(&self, case1 : <F as App1<bool>>::T , case2 : <F as App1<i64>>::T ) -> <F as App1<Self::Inner>>::T  where
         F : App1<i64>,
         F : App1<bool>,
         F : App1<Self::Inner> {
        match self {
            Gadt::TIntINTERNAL(PhantomData) => case2,
            Gadt::TBoolINTERNAL(PhantomData) => panic!("Somehow TBool has type Gadt<i64>") // Will never be reached though. god willing
        }
    }
}
impl GadtElim for Gadt<bool> {
    type Inner = bool;
    fn gadtElim<F>(&self, case1 : <F as App1<bool>>::T , case2 : <F as App1<i64>>::T ) -> <F as App1<Self::Inner>>::T  where
         F : App1<i64>,
         F : App1<bool>,
         F : App1<Self::Inner> {
        match self {
            Gadt::TIntINTERNAL(PhantomData) => panic!("Somehow TInt has type Gadt<bool>"), // Will never be reached though. god willing
            Gadt::TBoolINTERNAL(PhantomData) => case1
        }
    }
}

Usage. You have to explicitly pass the return type function to the eliminator. No inference is done for you. It’s like Coq’s match but worse. BTW the dbg! macro is the greatest thing on earth. Well done, Rust.

    let z = TInt().gadtElim::<Id>(true , 34);
    let z2 = TBool().gadtElim::<Id>(true , 34);

    dbg!(z);
    dbg!(z2);
    struct F7 {} // You need to do this. Kind of sucks. macroify? App!(F = Vec)
    impl<A> App1<A> for F7 { type T = Vec<A>; }
    dbg!(TInt().gadtElim::<F7>(vec!(true,false) , vec!(34,45,4,3,46)));

You can make helpers that don’t require explicit types to be given

fn gadtRec<A>(x : impl GadtElim, case1 : A, case2 : A) -> A {
    x.gadtElim::<Const<A>>(case1 , case2)
}

One could also make an Eq a b type with Refl similarly. Then we need typelevel function tags that take two type parameters. Which, with currying or tupling, we may already have.

Questions:

Is this even good? Or is it a road of nightmares? Is this even emulating GADTs or am I just playing phantom type games?

We aren't at full GADTs. We don't have existential types. Rust has some kind of existential story evolving (already there?), but the state of it is confusing to me. Something to play with. Higher rank functions would help?

Are overlapping typeclasses a problem in Rust?

Again, I have given nearly zero thought to borrowing and how it interacts with this. I'm a Rust n00b. I should think about it. Different eliminators based on whether you own or are borrowing?

How much of singleton style dependent types do we get from this? It feels like we have already paid the cost of defunctionalizing. http://hackage.haskell.org/package/singletons

My current playground for this is at https://github.com/philzook58/typo

Edit: Roughly speaking, Rust is an ML with affine types and a syntax facelift. So OCaml is a good place to look for ideas. Oleg Kiselyov had a fascinating approach that sort of smuggles an Option type through an equality type using mutation. I wonder how well that would mesh with Rust. It seems obviously not thread safe unless you can make the mutation and matching atomic.

http://okmij.org/ftp/ML/GADT.txt

okmij.org/ftp/ML/first-class-modules/

https://blog.janestreet.com/more-expressive-gadt-encodings-via-first-class-modules/

A Little Bloop on Typed Template Haskell

I’ve found looking at my statistics that short, to the point, no bull crap posts are the most read and probably most useful.

I've been tinkering around with typed Template Haskell, which a cursory internet search doesn't even make obvious exists. The first thing to come up is a GHC implementation wiki page that seems like it might be stalled in some development branch. No. Typed Template Haskell has been in GHC since GHC 7.8. And the first thing that comes up in a Template Haskell Google search is the style of Template Haskell where you get deep into the guts of the syntax tree, which is much less fun.

Here’s a basic addition interpreter example.

{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH
import Language.Haskell.TH.Syntax

data Expr = Val Int | Add Expr Expr

eval' :: Expr -> Int
eval' (Val n) = n
eval' (Add e1 e2) = (eval' e1) + (eval' e2)

eval :: Expr -> TExpQ Int
eval (Val n) = [|| n ||]
eval (Add e1 e2) = [|| $$(eval e1) + $$(eval e2) ||]

The typed splice $$ takes an a out of a TExpQ a. The typed quote [|| ||] puts it in. I find that you tend to be able to follow the types to figure out what goes where. If you're returning a TExpQ, you probably need to start a quote. If you need to recurse, you often need to splice. Etc.

You tend to have to put the actual use of the splice in a different file. GHC will complain otherwise.

ex1 :: Int
ex1 = $$(eval (Add (Val 1) (Val 1)))

At the top of your file put this to have the template haskell splices dumped into a file

{-# OPTIONS_GHC -ddump-splices -ddump-to-file #-}

Or have your package.yaml look something like this

executables:
  staged-exe:
    main:                Main.hs
    source-dirs:         app
    ghc-options:
    - -threaded
    - -rtsopts
    - -with-rtsopts=-N
    - -ddump-splices
    - -ddump-to-file
    dependencies:
    - staged

If you're using stack, you need to dive into .stack-work/dist/x86/Cabal/build and then /src or into the executable folder something-exe-tmp/app to find .dump-splices files.

app/Main.hs:11:10-35: Splicing expression
    eval (Add (Val 1) (Val 1)) ======> (1 + 1)

Nice. I don’t know whether GHC might have optimized this thing down anyhow or not, but we have helped the process along.

Some more examples: unrolling the recursion on a power function (a classic)

-- ordinary version
power 0 x = 1
power n x = x * (power (n-1) x)

-- templated up
power' :: Int -> TExpQ (Int -> Int)
power' 0 = [|| const 1 ||]
power' n = [|| \x -> x * $$(power' (n-1)) x ||] 
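
Splicing it in (from a separate module, as noted above) might look like this; power4 is just my name for the example.

power4 :: Int -> Int
power4 = $$(power' 4)
-- the recursion is unrolled at compile time; power4 9 == 6561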

You can unroll a fibonacci calculation

-- ordinary fib
fib 0 = 1
fib 1 = 1
fib n = (fib (n-1)) + (fib (n-2))

-- we always need [|| ||] whenever there is a Code
fib' :: Int -> TExpQ Int
fib' 0 = [|| 1 ||]
fib' 1 = [|| 1 ||] 
fib' n = [|| $$(fib' (n-1)) + $$(fib' (n-2)) ||]

This blows up the calculation though (no memoization, you silly head). We can implement sharing in the generated code using let expressions (adapted from https://wiki.haskell.org/The_Fibonacci_sequence). Something to think about.

fib4 :: Int -> TExpQ Int
fib4 n = go n [|| ( 0, 1 ) ||]
                where
                  go :: Int -> TExpQ (Int, Int) -> TExpQ Int
                  go n z | n==0      = [|| let (a,b) = $$(z) in a ||]
                         | otherwise = go (n-1) [|| let (a,b) = $$(z) in (b, a + b) ||]

Tinkering around, you’ll occasionally find GHC can’t splice and quote certain things. This is related to cross stage persistence and lifting, which are confusing to me. You should look in the links below for more info. I hope I’ll eventually get a feel for it.

If you want to see more of my fiddling in context to figure out how to get stuff to compile here’s my github that I’m playing around in https://github.com/philzook58/staged-fun

Useful Link Dump:

Simon Peyton Jones has some extremely useful talk slides: https://www.cl.cam.ac.uk/events/metaprog2016/Template-Haskell-Aug16.pptx

Matthew Pickering's post keyed me in to the fact that Typed Template Haskell is even there.

 http://mpickering.github.io/posts/2019-02-14-stage-3.html

MuniHac 2018: Keynote: Beautiful Template Haskell

https://markkarpov.com/tutorial/th.html Mark Karpov has a useful Template Haskell tutorial

Oleg Kiselyov: I’m still trying to unpack the many things that are going on here. Oleg’s site and papers are a very rich resource.

I don’t think that staged metaprogramming requires finally tagless style, as I first thought upon reading some of his articles. It is just very useful.

http://okmij.org/ftp/meta-programming/

http://okmij.org/ftp/meta-programming/tutorial/

Typed Template Haskell is still unsound (at least with the full power of the Q templating monad) http://okmij.org/ftp/meta-programming/#TExp-problem

You can write things in a finally tagless style such that you write them once and have them interpreted as staged code or in other ways. There is also a way to have a subclass of effects (Applicatives basically?) less powerful than Q that is sound.
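
As a rough toy illustration of that write-once, interpret-many-ways idea (entirely my own sketch, not Oleg's code): a tiny expression language as a typeclass, with one ordinary interpreter and one staged interpreter producing a TExpQ to splice elsewhere.

class ExprSym repr where
  val :: Int -> repr Int
  add :: repr Int -> repr Int -> repr Int

-- ordinary evaluation
newtype Eval a = Eval { runEval :: a }
instance ExprSym Eval where
  val n = Eval n
  add (Eval x) (Eval y) = Eval (x + y)

-- staged: build code instead of a value
newtype Staged a = Staged { runStaged :: TExpQ a }
instance ExprSym Staged where
  val n = Staged [|| n ||]
  add (Staged x) (Staged y) = Staged [|| $$x + $$y ||]

ex2 :: ExprSym repr => repr Int
ex2 = add (val 1) (add (val 2) (val 3))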

A Touch of Topological Quantum Computation 3: Categorical Interlude

Welcome back, friend.

In the last two posts, I described the basics of how to build and manipulate the Fibonacci anyon vector space in Haskell.

As a personal anecdote, trying to understand the category theory behind the theory of anyons is one of the reasons I started learning Haskell. These spaces are typically described using the terminology of category theory. I found it very frustrating that anyons were described in an abstract and confusing terminology. I really wondered if people were just making things harder than they have to be. I think Haskell is a perfect playground to clarify these constructions. While the category theory stuff isn’t strictly necessary, it is interesting and useful once you get past the frustration.

Unfortunately, I can’t claim that this article is going to be enough to take you from zero to categorical godhood

but I hope everyone can get something out of it. Give it a shot if you’re interested, and don’t sweat the details.

The Aroma of Categories

I think Steve Awodey gives an excellent nutshell of category theory in the introductory section to his book:

“What is category theory? As a first approximation, one could say that category theory is the mathematical study of (abstract) algebras of functions. Just as group theory is the abstraction of the idea of a system of permutations of a set or symmetries of a geometric object, so category theory arises from the idea of a system of functions among some objects.”

For my intuition, a category is any collection of “things” that plug together. The “in” of a thing has to match the “out” of another thing in order to hook them together. In other words, the requirement for something to be a category is having a notion of composition. The things you plug together are called the morphisms of the category and the matching ports are the objects of the category. The additional requirement of always having an identity morphism (a do-nothing connection wire) is usually there once you have composition, although it is good to take special note of it.

Category theory is an elegant framework for how to think about these composing things in a mathematical way. In my experience, thinking in these terms leads to good abstractions, and useful analogies between disparate things.

It is helpful for any abstract concept to list some examples to expose the threads that connect them. Category theory in particular has a ton of examples connecting to many other fields because it is a science of analogy. These are the examples of categories I usually reach for. Which one feels the most comfortable to you will depend on your background.

  • Hask. Objects are types. Morphisms are functions between those types
  • Vect. Objects are vector spaces, morphisms are linear maps (roughly matrices).
  • Preorders. Objects are values. Morphisms are the inequalities between those values.
  • Sets. Objects are Sets. Morphisms are functions between sets.
  • Cat. Objects are categories, Morphisms are functors. This is a pretty cool one, although complete categorical narcissism.
  • Systems and Processes.
  • The Free Category of a directed graph. Objects are vertices. Morphisms are paths between vertices.

Generic Programming and Typeclasses

The goal of generic programming is to write programs once and run them in many ways.

There are many ways to approach this generic programming goal, but one way this is achieved in Haskell is by using Typeclasses. Typeclasses allow you to overload names, so that they mean different things based upon the types involved. Adding a vector is different than adding a float or int, but there are programs that can be written that reasonably apply in both situations.

Writing your program in a way that it applies to disparate objects requires abstract ways of talking about things. Mathematics is an excellent place to mine for good abstractions. In particular, the category theory abstraction has demonstrated itself to be a very useful unified vocabulary for mathematical topics. I, and others, find it also to be a beautiful aesthetic by which to structure programs.

There is a Category typeclass defined in the Haskell base library. In order to use it, you need to import the Prelude in an unusual way.

{-# LANGUAGE NoImplicitPrelude #-} 
import Prelude hiding ((.), id) 

The Category typeclass is defined on the type that corresponds to the morphisms of the category. This type has a slot for the input type and a slot for the output type. In order for something to be a category, it has to have an identity morphism and a notion of composition.

class Category cat where
     id :: cat a a
     (.) :: cat b c -> cat a b -> cat a c

The most obvious example of this Category typeclass is the instance for the ordinary Haskell function (->). The identity corresponds to the standard Haskell identity function, and composition to ordinary Haskell function composition.

instance Category (->) where     
   id = \x -> x
   f . g = \x -> f (g x)

Another example of a category that we've already encountered is that of linear operators, which we'll call LinOp. LinOp is an example of a Kleisli arrow, a category built using monadic composition rather than regular function composition. In this case, the monad Q from my first post takes care of the linear pipework that happens between every application of a LinOp. The fish operator <=< is monadic composition from Control.Monad.

newtype LinOp a b = LinOp {runLin :: a -> Q b} 
instance Category LinOp where
   id = LinOp pure 
   (LinOp f) . (LinOp g) = LinOp (f <=< g) 

A related category is the FibOp category. This is the category of operations on Fibonacci anyons, which are also linear operations. It is LinOp specialized to the Fibonacci anyon space. All the operations we've previously discussed (F-moves, braiding) are in this category.

newtype FibOp a b = FibOp {runFib :: (forall c. FibTree c a -> Q (FibTree c b))} 
instance Category (FibOp) where
   id = FibOp pure
   (FibOp f) . (FibOp g) = FibOp (f <=< g)

The "feel" of category theory takes focus away from the objects and tries to place focus on the morphisms. There is a style of functional programming called "point-free" where you avoid ever giving variables explicit names and instead use pipe-work combinators like (.), fst, snd, or (***). This also has a feel of de-emphasizing objects. Many of the combinators that get used in this style have categorical analogs. In order to generically use categorical typeclasses, you have to write your program in this point free style.

It is possible for a program written in the categorical style to be reinterpreted as an ordinary program, a linear algebra operation, a circuit, or a diagram, all without changing the actual text of the program. For more on this, I highly recommend Conal Elliott's compiling to categories work, which also puts forth a methodology to avoid the somewhat unpleasant point-free style using a compiler plug-in. This might be an interesting place to mine for a good quantum programming language. YMMV.

Monoidal Categories.

Putting two processes in parallel can be considered a kind of product. A category is monoidal if it has a product of this flavor, and has isomorphisms for reassociating objects and for producing or consuming a unit object. This will make more sense when you see the examples.

We can sketch out this monoidal category concept as a typeclass, where we use () as the unit object.

class Category k => Monoidal k where
     parC :: k a c -> k b d -> k (a,b) (c,d)
     assoc :: k ((a,b),c) (a,(b,c))
     assoc' :: k (a,(b,c)) ((a,b),c)
     leftUnitor :: k ((),a) a
     leftUnitor' :: k a ((),a)
     rightUnitor :: k (a,()) a
     rightUnitor' :: k a (a,()) 

Instances

In Haskell, the standard monoidal product for regular Haskell functions is (***) from Control.Arrow. It takes two functions and turns them into a single function that does the same stuff, but on a tuple of the original inputs. The associators and unitors are fairly straightforward. We can freely dump the unit () and get it back because there is only one possible value for it.

(***) :: (a -> c) -> (b -> d) -> ((a,b) -> (c,d))
f *** g = \(x,y) -> (f x, g y)
instance Monoidal (->) where
   parC f g = f *** g 
   assoc ((x,y),z) = (x,(y,z))
   assoc' (x,(y,z)) = ((x,y),z)
   leftUnitor (_, x) = x
   leftUnitor' x = ((),x)
   rightUnitor (x, _) = x
   rightUnitor' x = (x,()) 
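
One payoff is that we can already write small programs against nothing but the Category and Monoidal interfaces, and the exact same text will then work for the LinOp and FibOp instances below. A small sketch (firstC and secondC are my own names):

firstC :: Monoidal k => k a b -> k (a,c) (b,c)
firstC f = parC f id

secondC :: Monoidal k => k a b -> k (c,a) (c,b)
secondC f = parC id f

-- with the (->) instance: firstC (+1) (3, "hi") == (4, "hi")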

The monoidal product we'll choose for LinOp is the tensor/outer/Kronecker product.

kron :: Num b => W b a -> W b c -> W b (a,c)
kron (W x) (W y) = W [((a,c), r1 * r2) | (a,r1) <- x , (c,r2) <- y ]

Otherwise, LinOp is basically a monadically lifted version of (->). The one dimensional vector space Q () is completely isomorphic to just a number. Taking the Kronecker product with it is basically the same thing as scalar multiplying (up to some shuffling).

instance Monoidal LinOp where
    parC (LinOp f) (LinOp g) = LinOp $ \(a,b) -> kron (f a) (g b) 
    assoc = LinOp  (pure . assoc)
    assoc' = LinOp (pure . unassoc)
    leftUnitor = LinOp (pure . leftUnitor)
    leftUnitor' = LinOp (pure .leftUnitor')
    rightUnitor = LinOp (pure . rightUnitor)
    rightUnitor' = LinOp (pure . rightUnitor')

Now for a confession. I made a misstep in my first post. In order to make our Fibonacci anyons jibe nicely with our current definitions, I should have defined our identity particle using type Id = () rather than data Id. We'll do that now. In addition, we need some new primitive operations for absorbing and emitting identity particles that did not feel relevant at that time.

rightUnit :: FibTree e (a,Id) -> Q (FibTree e a)
rightUnit (TTI t _) = pure t
rightUnit (III t _) = pure t

rightUnit' :: FibTree e a -> Q (FibTree e (a,Id))
rightUnit' t@(TTT _ _) = pure (TTI t ILeaf)
rightUnit' t@(TTI _ _) = pure (TTI t ILeaf)
rightUnit' t@(TIT _ _) = pure (TTI t ILeaf)
rightUnit' t@(III _ _) = pure (III t ILeaf)
rightUnit' t@(ITT _ _) = pure (III t ILeaf)
rightUnit' t@(ILeaf) = pure (III t ILeaf)
rightUnit' t@(TLeaf) = pure (TTI t ILeaf)

leftUnit :: FibTree e (Id,a) -> Q (FibTree e a)
leftUnit = rightUnit <=< braid

-- braid vs braid' doesn't matter, but it has a nice symmetry.
leftUnit' :: FibTree e a -> Q (FibTree e (Id,a))
leftUnit' = braid' <=< rightUnit' 

With these in place, we can define a Monoidal instance for FibOp. The extremely important and intriguing F-move operations are the assoc operators for the category. While other categories have assoc operators that feel nearly trivial, these F-moves don't feel so trivial.

instance Monoidal (FibOp) where
    parC (FibOp f) (FibOp g) = (FibOp (lmap f)) . (FibOp (rmap g))
    assoc = FibOp  fmove'
    assoc' = FibOp fmove
    leftUnitor = FibOp leftUnit
    leftUnitor' = FibOp leftUnit'
    rightUnitor = FibOp rightUnit
    rightUnitor' = FibOp rightUnit'

This is actually useful

The parC operation is extremely useful to explicitly note in a program. It is an opportunity for optimization. It is possible to inefficiently implement parC in terms of other primitives, but it is very worthwhile to implement it in new primitives (although I haven't here). In the case of (->), parC is an explicit location where actual computational parallelism is available. Once you perform parC, it is no longer obviously apparent that the left and right side of the tuple share no data during the computation. In the case of LinOp and FibOp, parC is a location where you can perform factored linear computations. The matrix-vector product (A \otimes B)(v \otimes w) can instead be performed on the factors individually as (Av) \otimes (Bw). In the first case, where we densify A \otimes B and then perform the multiplication, it costs O((N_A N_B)^2) time, whereas performing them individually on the factors costs O(N_A^2 + N_B^2) time, a significant savings. Applied category theory indeed.

Laws


Like many typeclasses, these monoidal morphisms are assumed to follow certain laws. Here is a sketch (for a more thorough discussion check out the wikipedia page):

  • Functions with a tick at the end like assoc' should be the inverses of the functions without the tick like assoc, e.g. assoc . assoc' = id
  • The parC operation is (bi)functorial, meaning it obeys the commutation law parC (f . f') (g . g') = (parC f g) . (parC f' g') i.e. it doesn't matter if we perform composition before or after the parC.
  • The pentagon law for assoc: Applying leftbottom is the same as applying topright
leftbottom :: (((a,b),c),d) -> (a,(b,(c,d)))
leftbottom = assoc . assoc

topright :: (((a,b),c),d) -> (a,(b,(c,d)))
topright = (id *** assoc) . assoc . (assoc *** id)
  • The triangle law for the unitors:
topright' :: ((a,()),b) -> (a,b)
topright' = (id *** leftUnitor) . assoc
leftside :: ((a,()),b) -> (a,b)
leftside = rightUnitor *** id

String Diagrams

String diagrams are a diagrammatic notation for monoidal categories. Morphisms are represented by boxes with lines.

Composition g . f is made by connecting lines.

The identity id is a raw arrow.

The monoidal product of morphisms f \otimes g is represented by placing lines next to each other.

The diagrammatic notion is so powerful because the laws of monoidal categories are built so deeply into it they can go unnoticed. Identities can be put in or taken away. Association doesn't even appear in the diagram. The boxes in the notation can naturally be pushed around and commuted past each other.

This corresponds to the property

(id \otimes g) \circ (f \otimes id) = (f \otimes id) \circ (id \otimes g)

What expression does the following diagram represent?

Is it (f \circ f') \otimes (g \circ g') (in Haskell notation parC (f . f') (g . g') )?

Or is it (f \otimes g) \circ (f' \otimes g') (in Haskell notation (parC f g) . (parC f' g'))?

Answer: It doesn't matter because the functorial requirement of parC means the two expressions are identical.

There are a number of notations you might meet in the world that can be interpreted as string diagrams. Some that seem particularly pertinent are:

  • Quantum circuits
  • Anyon Diagrams!

Braided and Symmetric Monoidal Categories: Categories That Braid and Swap

Some monoidal categories have a notion of being able to braid morphisms. If so, it is called a braided monoidal category (go figure).

class Monoidal k => Braided k where
    over :: k (a,b) (b,a)
    under :: k (a,b) (b,a)

The over and under morphisms are inverses of each other: over . under = id. The over morphism pulls the left strand over the right, whereas under pulls the left strand under the right. The diagram definitely helps to understand this definition.

These over and under morphisms need to play nice with the associator of the monoidal category. These are laws that valid instance of the typeclass should follow. We actually already met them in the very first post.

If the over and under of the braiding are the same, the category is a symmetric monoidal category. This typeclass needs no extra functions, but it is now intended that the law over . over = id is obeyed.

class Braided k => Symmetric k where

When we draw a braid in a symmetric monoidal category, we don't have to be careful with which one is over and under, because they are the same thing.

The examples that come soonest to mind have this symmetric property, for example (->) is a symmetric monoidal category.

swap :: (a, b) -> (b, a)
swap (x,y) = (y,x)
instance Braided (->) where
    over = swap
    under = swap
instance Symmetric (->)

Similarly, LinOp has a notion of swapping that is just a lifting of swap

instance Braided (LinOp) where
    over = LinOp (pure . swap)
    under = LinOp (pure . swap)
instance Symmetric LinOp 

However, FibOp is not symmetric! This is perhaps at the core of what makes FibOp so interesting.

instance Braided FibOp where
    over = FibOp braid
    under = FibOp braid'

Automating Association

Last time, we spent a lot of time doing weird typelevel programming to automate away the pain of manual association moves. We can do something quite similar to make the categorical reassociation less painful, and more like the carefree ideal of the string diagram, if we replace composition (.) with a slightly different operator

(...) :: ReAssoc b b' => FibOp b' c -> FibOp a b -> FibOp a c
(FibOp f) ... (FibOp g) = FibOp $ f <=< reassoc <=< g

Before defining reassoc, let's define a helper LeftCollect typeclass. Given a typelevel integer n, it will reassociate the tree using a binary search procedure to make sure the left branch l at the root has Count l = n.

leftcollect :: forall n gte l r o e. (gte ~ CmpNat n (Count l), LeftCollect n gte (l,r) o) => FibTree e (l,r) -> Q (FibTree e o)
leftcollect x = leftcollect' @n @gte x

class LeftCollect n gte a b | n gte a -> b where
    leftcollect' :: FibTree e a -> Q (FibTree e b)

-- The process is like a binary search.
-- LeftCollect pulls n leaves into the left branch of the tuple

-- If n is greater than the size of l, we recurse into the right branch with a new number of leaves to collect
-- then we do a final reshuffle to put those all into the left tree.
instance (
   k ~ Count l,
   r ~ (l',r'),
   n' ~ (n - k),
   gte ~ CmpNat n' (Count l'), 
   LeftCollect n' gte r (l'',r'')) => LeftCollect n 'GT (l,r) ((l,l''),r'') where
        leftcollect' x = do       
               x' <- rmap (leftcollect @n') x -- (l,(l'',r'')) -- l'' is size n - k
               fmove x'  -- ((l,l''),r'') -- size of (l,l'') = k + (n-k) = n
instance (
    l ~ (l',r'),
    gte ~ CmpNat n (Count l'), 
    LeftCollect n gte l (l'',r'')) => LeftCollect n 'LT (l,r) (l'',(r'',r)) where
        leftcollect' x = do       
                x' <- lmap (leftcollect @n) x -- ((l'',r''),r) -- l'' is of size n
                fmove' x'  -- (l'',(r'',r)

instance LeftCollect n 'EQ (l,r) (l,r) where
    leftcollect' = pure

Once we have LeftCollect, the typeclass ReAssoc is relatively simple to define. Given a pattern tree, we can count the leaves in its left branch and LeftCollect the source tree to match that number. Then we recursively apply reassoc in the left and right branches of the tree. This means that every node in the two trees ends up with the same number of leaves beneath it, hence the trees will end up in an identical shape (modulo me mucking something up).

class ReAssoc a b where
    reassoc :: FibTree e a -> Q (FibTree e b)
instance (n ~ Count l',
    gte ~ CmpNat n (Count l), 
    LeftCollect n gte (l,r) (l'',r''),    
    ReAssoc l'' l',
    ReAssoc r'' r') => ReAssoc (l,r) (l',r') where
        reassoc x = do 
                    x' <- leftcollect @n x
                    x'' <- rmap reassoc x'
                    lmap reassoc x''
                    
--instance {-# OVERLAPS #-} ReAssoc a a where
 --   reassoc = pure
    
instance ReAssoc Tau Tau where
    reassoc = pure
instance ReAssoc Id Id where
    reassoc = pure

It seems likely that one could write equivalent instances that would work for an arbitrary monoidal category with a bit more work. We are aided somewhat by the fact that FibOp has a finite universe of possible leaf types to work with.

Closing Thoughts

While our categorical typeclasses are helpful and nice, I should point out that they are not going to cover all the things that can be described as categories, even in Haskell, just like the Functor typeclass does not describe all the conceptual functors you might meet. One beautiful monoidal category is that of Haskell Functors under the monoidal product of functor composition. More on this to come, I think. https://parametricity.com/posts/2015-07-18-braids.html

We never even touched the dot product in this post. This corresponds to another doodle in a string diagram, and another power to add to your category. It is somewhat trickier to work with cleanly in familiar Haskell terms, I think because (->) is at least not super obviously a dagger category?

You can find a hopefully compiling version of all my snippets and more in my chaotic mutating Github repo https://github.com/philzook58/fib-anyon

See you next time.

References

The Rosetta Stone paper by Baez and Stay is probably the conceptual daddy of this entire post (and more).

Bartosz Milewski's Category Theory for Programmers blog (online book, really) and YouTube series are where I learned most of what I know about category theory. I highly recommend them (huge Bartosz fanboy).

Catsters - https://byorgey.wordpress.com/catsters-guide-2/

https://en.wikibooks.org/wiki/Haskell/Category_theory

https://www.math3ma.com/blog/what-is-category-theory-anyway

There are fancier embeddings of category theory and monoidal categories than I've shown here. Often you want constrained categories and the ability to choose unit objects. I took a rather simplistic approach here.

http://hackage.haskell.org/package/constrained-categories

http://hackage.haskell.org/package/data-category


https://parametricity.com/posts/2015-07-18-braids.html

Applicative Bidirectional Programming and Automatic Differentiation

I got referred to an interesting paper by a comment of /u/syrak.

http://www2.sf.ecei.tohoku.ac.jp/~kztk/papers/kztk_jfp_am_2018.pdf

Applicative bidirectional programming (PDF), by Kazutaka Matsuda and Meng Wang

In it, they use a couple of interesting tricks to make lens programming more palatable. Lenses often need to be programmed in a point-free style, which is rough, but by using some combinators, they are able to program lenses in a pointful style (with some pain points still left over). It is a really interesting, well-written paper. Lots 'o' Yoneda and laws. I'm not doing it justice. Check it out!

A while back I noted that reverse mode auto-differentiation has a type very similar to a lens and in fact you can build a working reverse mode automatic differentiation DSL out of lenses and lens-combinators. Many things about lenses, but not all, transfer over to automatic differentiation. The techniques of Matsuda and Wang do appear to transfer fairly well.

This is interesting to me for another reason. Their lift2 and unlift2 functions remind me very much of my recent approach to compiling to categories. The lift2 function is fanning a lens pair. This is basically what my FanOutput typeclass automated. unlift2 is building the input for a function by supplying a tuple of projection lenses. This is what my BuildInput typeclass did. I think their style may extend to many cartesian monoidal categories, not just lenses.

lift2 :: Lens (a,b) c -> (forall s. Num s => (Lens s a, Lens s b) -> Lens s c)
lift2 l (x,y) = lift l (fan x y)

unlift2 :: (Num a, Num b) => (forall s. Num s => (Lens s a, Lens s b) -> Lens s c) -> Lens (a,b) c
unlift2 f = f (fst', snd')

One can use the function type b -> a in many of the situations one can use an a in. You can do elegant things by giving b -> a a Num instance, for example. This little fact seems to extend to other categories as well. By making a Num instance for Lens s a when a is a Num, we can use reasonable-looking notation for arithmetic.

t1 :: Num a => Lens (a,a) a
t1 = unlift2 $ \(x,y) -> x + y*y + y * 7
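
Using the grad'' helper from the code listing at the end of this post, this lens-built expression differentiates the way you'd hope (a quick check of my own):

gradT1 :: (Double, Double)
gradT1 = grad'' t1 (3, 4)   -- (1.0, 15.0): d/dx = 1, d/dy = 2*y + 7 = 15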

They spend some time discussing the necessity of a Poset typeclass. For actual lawful lenses, the dup implementation needs a way to recombine multiple adjustments to the same object. In the AD-lens case, dup takes care of this by adding together the differentials. This means that everywhere they needed an Eq typeclass, we can use a Num typeclass. It may be useful to build a wrapped type data NZ a = Zero | NonZero a, like their Tag type, to accelerate the many 0 values that may be propagating through the system.
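
For what it's worth, here is a guess at what a Num instance for such a wrapper could look like, short-circuiting the zeros (entirely my own sketch, not from the paper):

data NZ a = Zero | NonZero a

instance Num a => Num (NZ a) where
    Zero      + y         = y
    x         + Zero      = x
    NonZero x + NonZero y = NonZero (x + y)
    Zero      * _         = Zero
    _         * Zero      = Zero
    NonZero x * NonZero y = NonZero (x * y)
    negate Zero        = Zero
    negate (NonZero x) = NonZero (negate x)
    abs Zero           = Zero
    abs (NonZero x)    = NonZero (abs x)
    signum Zero        = Zero
    signum (NonZero x) = NonZero (signum x)
    fromInteger 0      = Zero
    fromInteger n      = NonZero (fromInteger n)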

Unfortunately, as is, the performance of this is abysmal. Maybe there is a way to fix it? Unlifting and lifting destroys a lot of sharing and often necessitates adding many redundant zeros. Why are you doing reverse mode differentiation unless you care about performance? Forward mode is simpler to implement. In the intended use case of Matsuda and Wang, they are working with actual lawful lenses, which have far less computational content than AD-lenses. Good lawful lenses should just be shuffling stuff around a little. Maybe one can hope GHC is insanely intelligent and can optimize these zeros away. One point in favor of that is that our differentiation is completely pure (no mutation). Nevertheless, I suspect it will not without help. Being careful and unlifting and lifting manually may also help. In principle, I think the Lensy approach could be pretty fast (since all it is is just packing together exactly what you need to differentiate into a data type), but how to make it fast while still being easily programmable? It is also nice that it is pretty simple to implement. It is the simplest method that I know of if you needed to port operable reverse mode differentiation to a new library (Massiv?) or another functional language (Futhark?). And a smart compiler really does have a shot at finding optimizations/fusions.

While I was at it, unrelated to the paper above, I think I made a working generic auto differentiable fold lens combinator. Pretty cool. mapAccumL is a hot tip.

For practical Haskell purposes, all of this is a little silly with the good Haskell AD packages around, the most prominent being

http://hackage.haskell.org/package/ad

It is somewhat interesting to note the similarity of the type forall s. Lens s appearing in the Matsuda and Wang approach to that of the forall s. BVar s appearing in the backprop package. In that case I believe the s type variable plays the same role it does in the ST monad, protecting a mutating Wengert tape held inside, but I haven’t dug much into it. I don’t know enough about backprop to know what to make of this similarity.

http://hackage.haskell.org/package/backprop

The github repo with my playing around and stream of consciousness commentary is here

{-# LANGUAGE NoImplicitPrelude, TypeSynonymInstances, RankNTypes #-}
module Numeric.ADLens.AppBi where
-- import Numeric.ADLens.Lens
import Control.Category
import Prelude hiding (id, (.))
import Control.Arrow ((***))
import Data.Functor.Const
import Data.Traversable
newtype Lens x y = Lens (x -> (y, y -> x)) 
type L s a = Num s => Lens s a

instance Category Lens where
  id = Lens (\x -> (x, id))
  (Lens f) . (Lens g) = Lens $ \x -> let (y, g') = g x in
                                           let (z, f') = f y in
                                           (z, g' . f') 



grad'' (Lens f) x = let (y,j) = (f x) in j 1

lift :: Lens a b -> (forall s. Lens s a -> Lens s b)
lift l l' = l . l'

unlift :: Num a => (forall s. Num s => Lens s a -> Lens s b) -> Lens a b
unlift f = f id


dup :: Num a => Lens a (a,a)
dup = Lens $ \x -> ((x,x), \(dx,dy) -> dx + dy)

par :: Lens a b -> Lens c d -> Lens (a,c) (b,d)
par (Lens f) (Lens g) = Lens l'' where
    l'' (a,c) = ((b,d), f' *** g') where
        (b,f') = f a
        (d,g') = g c

fan :: Num s => Lens s a -> Lens s b -> Lens s (a,b)
fan x y = (par x y) . dup 

-- impredicative polymorphism error when we use L in type sig. I'm just going to avoid that.
lift2 :: Lens (a,b) c -> (forall s. Num s => (Lens s a, Lens s b) -> Lens s c)
lift2 l (x,y) = lift l (fan x y)

unlift2 :: (Num a, Num b) => (forall s. Num s => (Lens s a, Lens s b) -> Lens s c) -> Lens (a,b) c
unlift2 f = f (fst', snd')

instance (Num a, Num b) => Num (a,b) where
	(x,y) + (a,b) = (x + a, y + b)
	(x,y) * (a,b) = (x * a, y * b)
	abs (x,y) = (abs x, abs y)
	fromInteger x = (fromInteger x, fromInteger x)
	-- and so on

fst' :: Num b => Lens (a,b) a
fst' = Lens (\(a,b) -> (a, \ds -> (ds, 0)))

snd' :: Num a => Lens (a,b) b
snd' = Lens (\(a,b) -> (b, \ds -> (0, ds)))

unit :: Num s => Lens s () -- ? This isn't right.
unit = Lens (\s -> ((), const 0))

add :: Num a => Lens (a,a) a 
add = Lens $ \(x,y) -> (x + y, \ds -> (ds, ds))

sub :: Num a => Lens (a,a) a 
sub = Lens $ \(x,y) -> (x - y, \ds -> (ds, -ds))

mul :: Num a => Lens (a,a) a 
mul = Lens $ \(x,y) -> (x * y, \dz -> (dz * y, x * dz))

recip' :: Fractional a => Lens a a 
recip' = Lens $ \x-> (recip x, \ds -> - ds / (x * x))

div :: Fractional a => Lens (a,a) a 
div = Lens $ (\(x,y) -> (x / y, \d -> (d/y,-x*d/(y * y))))

-- They called this "new" in Section 3.2
constLens :: Num s => a -> Lens s a
constLens x = Lens (const (x, const 0))

-- or rather we might define add = unlift2 (+)
instance (Num s, Num a) => Num (Lens s a) where
	x + y = (lift2 add) (x,y)
	x - y = (lift2 sub) (x,y)
	x * y = (lift2 mul) (x,y)
	abs = error "TODO"
	fromInteger x = constLens (fromInteger x)

t1 :: Num a => Lens (a,a) a
t1 = unlift2 $ \(x,y) -> x + y*y + y * 7

-- See the section on lifting list functions from the biapplicative paper
-- These could be Isos.
lcons :: Lens (a,[a]) [a]
lcons =  Lens $ \(a,as) -> (a : as, \xs -> (head xs, tail xs))
lnil :: Lens () [b]
lnil = Lens $ const ([], const ())

lsequence :: Num s => [Lens s a] -> Lens s [a]
lsequence [] = lift lnil unit
lsequence (x : xs) = lift2 lcons (x, lsequence xs)

llift :: Num s => Lens [a] b -> [Lens s a] -> Lens s b
llift l xs = lift l (lsequence xs)


instance (Num a) => Num [a] where
	(+) = zipWith (+)
	(*) = zipWith (*)
	(-) = zipWith (-)
	abs = map abs
	fromInteger x = repeat (fromInteger x)

-- We need to hand f a list of the accessor lenses
-- [Lens [a] a]
-- This feels quite wrong. Indexing into a list is naughty.
-- But that is what they do. Shrug.
lunlift :: Num a => (forall s. Num s => [Lens s a] -> Lens s b) -> Lens [a] b
lunlift f = Lens $ \xs -> 
					let n = length xs in
					let inds = [0 .. n-1] in
					let ls = map (lproj n) inds in
					let (Lens f') = f ls in
					f' xs

t2 :: Num a => Lens [a] a					
t2 = lunlift sum
t3 :: Num a => Lens [a] a					
t3 = lunlift product

lproj :: Num a => Int -> Int -> Lens [a] a
lproj n' ind = Lens $ \xs -> ((xs !! ind), \x' -> replace ind x' zeros) where
	replace 0 x (y:ys) = x : ys
	replace n x (y:ys) = y : (replace (n-1) x ys)
	zeros = replicate n' 0

	
lensmap :: Applicative f => Lens a b -> Lens (f a) (f b)
lensmap (Lens f) = Lens $ \fa ->
								let fbb = fmap f fa in
								let fb = fmap fst fbb in
								let fb2s = fmap snd fbb in
								(fb, \fb' -> fb2s <*> fb')

-- Types work, but does this actually make sense?
lsequenceA :: (Applicative f, Applicative t, Traversable f, Traversable t) => Lens (t (f a)) (f (t a))
lsequenceA = Lens $ \tfa -> (sequenceA tfa, sequenceA)

ltraverse :: (Applicative f, Applicative t, Traversable f, Traversable t) =>
             Lens a (f b) -> Lens (t a) (f (t b))
ltraverse f = lsequenceA . (lensmap f)

lensfoldl :: Traversable t => Lens (a, b) a -> Lens (a, t b) a
lensfoldl (Lens f) = Lens $ \(s, t) -> let (y, tape) = mapAccumL (curry f) s t  in
						  (y,  \db ->  mapAccumR (\db' f -> (f db')) db tape)
lensfoldr :: Traversable t => Lens (a, b) a -> Lens (a, t b) a
lensfoldr (Lens f) = Lens $ \(s, t) -> let (y, tape) = mapAccumR (curry f) s t  in
						(y,  \db ->  mapAccumL (\db' f -> (f db')) db tape)						  

t5 = grad'' (lensfoldl mul) (1, [1,1,2,3])


liftC :: Num a => (Lens a b -> Lens c d) -> (forall s. Num s => Lens s a -> Lens s b) -> (forall t. Num t => Lens t c -> Lens t d)
liftC c f = lift (c (unlift f))

ungrad :: Lens (a,b) c -> (a -> Lens b c)
ungrad (Lens f) a = Lens (\b -> let (c,j) = f (a,b) in (c, snd . j))

A Touch of Topological Quantum Computation in Haskell Pt. II: Automating Drudgery

Last time we built the basic pieces we need to describe anyons in Haskell. Anyon models describe interesting physical systems where a set of particles (Tau and Id in our case) have certain splitting rules and peculiar quantum properties. The existence of anyons in a system is the core physics necessary to support topological quantum computation. In topological quantum computing, quantum gates are applied by braiding the anyons and measurements are performed by fusing anyons together and seeing what particle comes out. Applying gates in this way has inherent error correcting properties.

The tree of particle production with particle-labelled leaves picks a basis (think the collection \{\hat{x}, \hat{y}, \hat{z}\}) for the anyon quantum vector space. An individual basis vector (think \hat{x}) from this basis is specified by labelling the internal edges of the tree. We built a Haskell data type for a basic free vector space and functions for the basic R-moves for braiding two anyons and reassociating the tree into a new basis with F-moves. In addition, you can move around your focus within the tree by using the functions lmap and rmap. The github repo with that and what follows below is here.
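
For reference, the focusing functions have roughly these shapes (recalled from the previous post; the repo has the authoritative definitions):

-- Approximate signatures only. lmap applies a Q-valued rewrite to the left
-- subtree under the focus, rmap to the right subtree.
lmap :: (forall r. FibTree r a -> Q (FibTree r b)) -> FibTree e (a, c) -> Q (FibTree e (b, c))
rmap :: (forall r. FibTree r b -> Q (FibTree r c)) -> FibTree e (a, b) -> Q (FibTree e (a, c))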

Pain Points

We’ve built the atomic operations we need, but they work very locally and are quite manual. You can apply many lmaps and rmaps to zoom in to the leaves you actually wish to braid, and you can manually perform all the F-moves necessary to bring nodes under the same parent, but it will be rather painful.

The standard paper-and-pencil graphical notation for anyons is really awesome. You get to draw little knotty squiggles to calculate. It does not feel as laborious. The human eye and hand are great at applying a sequence of reasonably optimal moves to untangle the diagram efficiently. Our eye can take the whole thing in and our hand can zip around anywhere.

To try and bridge this gap, we need to build functions that work in some reasonable way on the global anyon tree and that automate simple tasks.

A Couple Useful Functions

Our first useful operation is pullLeftLeaf. This operation will rearrange the tree using F-moves to get the leftmost leaf associated all the way to the root. The leftmost leaf will then have the root as a parent.

Because the tree structure is in the FibTree a b data type, we need the tuple tree type of the pulled tree. This is a slightly non-trivial type computation.

In order to do this, we’ll use a bit of typelevel programming. If this is strange and alarming stuff for you, don’t sweat it too much. I am not the most elegant user of these techniques, but I hope that alongside my prose description you can get the gist of what we’re going for.

(Sandy Maguire has a new book on typelevel programming in Haskell out. Good stuff. Support your fellow Haskeller and toss him some buckos.)

class PullLeftLeaf a b | a -> b where 
  pullLeftLeaf :: FibTree c a -> Q (FibTree c b)
instance PullLeftLeaf (Tau,c) (Tau,c) where
  pullLeftLeaf = pure
instance PullLeftLeaf (Id,c) (Id,c) where
  pullLeftLeaf = pure
instance PullLeftLeaf Tau Tau where
  pullLeftLeaf = pure
instance PullLeftLeaf Id Id where
  pullLeftLeaf = pure
instance (PullLeftLeaf (a,b) (a',b'), 
          r ~ (a',(b',c))) => PullLeftLeaf ((a, b),c) r where
  pullLeftLeaf t = do 
           t' <- lmap pullLeftLeaf t
           fmove' t'

The resulting tree type b is an easily computable function of the starting tree type a. That is what the “functional dependency” notation | a -> b in the typeclass definition tells the compiler.

The first 4 instances are base cases. If you’re all the way at the leaf, you basically want to do nothing. pure is the function that injects the classical tree description into a quantum state vector with coefficient 1.

The meat is in the last instance. In the case that the tree type matches ((a,b),c), we recursively call PullLeftLeaf on (a,b) which returns a new result (a',b'). Because of the recursion, this a' is the leftmost leaf. We can then construct the return type by doing a single reassociation step. The notation ~ forces two types to unify. We can use this conceptually as an assignment statement at the type level. This is very useful for building intermediate names for large expressions, as assert statements to ensure the types are as expected, and also occasionally to force unification of previously unknown types. It’s an interesting operator for sure.

The recursion at the type level is completely reflected in the actual function definition. We focus on the piece (a,b) inside t by using lmap. We do a recursive call to pullLeftLeaf, and finally fmove' performs the final reassociation move. It is all rather verbose, but straightforward I hope.
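
To see the recursion pattern without the quantum bookkeeping, here is a self-contained value-level analogue on ordinary nested pairs (my own toy, with Int standing in for a leaf; the real version additionally reassociates the state with fmove'):

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances, UndecidableInstances #-}

class PullLeft a b | a -> b where
  pullLeft :: a -> b

-- Base case: the left child is already a leaf, do nothing.
instance PullLeft (Int, c) (Int, c) where
  pullLeft = id

-- Recursive case: pull within the left subtree, then reassociate one step.
instance PullLeft (a, b) (a', b') => PullLeft ((a, b), c) (a', (b', c)) where
  pullLeft (ab, c) = let (a', b') = pullLeft ab in (a', (b', c))

-- pullLeft (((1, 2), 3), 4) == (1, ((2, 3), 4)) at type (((Int,Int),Int),Int):
-- the leftmost leaf now sits directly under the root.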

You can also build a completely similar PullRightLeaf.

A Canonical Right Associated Basis

One common way of dealing with larger trees is to pick a canonical basis of fully right associated trees. The fully right associated tree is a list-like structure. Its uniformity makes it easier to work with.

By recursively applying pullLeftLeaf, we can fully right associate any tree.

class RightAssoc a b | a -> b where
  rightAssoc :: FibTree c a -> Q (FibTree c b)
instance RightAssoc Tau Tau where
  rightAssoc = pure
instance RightAssoc Id Id where
  rightAssoc = pure
instance (PullLeftLeaf (a,b) (a',b'),
          RightAssoc b' b'',
          r ~ (a', b'')) => RightAssoc (a,b) r where
  rightAssoc t = do 
           t' <- pullLeftLeaf t
           rmap rightAssoc t'

This looks quite similar to the implementation of pullLeftLeaf. It doesn’t actually have much logic to it. We apply pullLeftLeaf, then we recursively apply rightAssoc in the right branch of the tree.

B-Moves: Braiding in the Right Associated Basis

Now we have the means to convert any structure to its right associated canonical basis. In this basis, one can apply braiding to neighboring anyons using B-moves, which can be derived from the braiding R-moves and F-moves.

The B-move applies one F-move so that the two neighboring leaves share a parent, uses the regular braiding R-move, then applies the inverse F-move to return back to the canonical basis. Similarly, bmove' is the same thing except it applies the under-braiding braid' rather than the over-braiding braid.

(Image Source : Preskill’s notes)

bmove :: forall b c d a. FibTree a (b,(c,d)) -> Q (FibTree a (c,(b,d)))
bmove t = do
           t'  :: FibTree a ((b,c),d) <- fmove t
           t'' :: FibTree a ((c,b),d) <-  lmap braid t'
           fmove' t'' 
bmove' :: forall b c d a. FibTree a (b,(c,d)) -> Q (FibTree a (c,(b,d)))
bmove' = fmove' <=< (lmap braid') <=< fmove -- point-free style for funzies. equivalent to the above except for braid'

Indexing to Leaves

We may also want to specify just the integer index of the leaf at which we wish to perform a braid. This can be achieved with another typeclass for iterated rmap-ing. When the tree is in canonical form, this enables us to braid two neighboring leaves by an integer index. This index has to be a typelevel number because the output type depends on it.

In fact there is quite a bit of type computation. Given a total tree type s and an index n this function will zoom into the subpart a of the tree at which we want to apply our function. The subpart a is replaced by b, and then the tree is reconstructed into t. t is s with the subpart a mapped into b.  I have intentionally made this reminiscent of the type variables of the lens type Lens s t a b .

rmapN :: forall n gte s t a b e. (RMapN n gte s t a b, gte ~ (CmpNat n 0)) => (forall r. FibTree r a -> Q (FibTree r b)) -> (FibTree e s) -> Q (FibTree e t)
rmapN f t = rmapN' @n @gte f t

class RMapN n gte s t a b | n gte s b -> a t where
    rmapN' :: (forall r. FibTree r a -> Q (FibTree r b)) -> (FibTree e s) -> Q (FibTree e t)

instance (a ~ s, b ~ t) => RMapN 0 'EQ s t a b where
    rmapN' f t = f t
instance (RMapN (n-1) gte r r' a b, 
               gte ~ (CmpNat (n-1) 0),
               t ~ (l,r')) => RMapN n 'GT (l,r) t a b where
    rmapN' f t = rmap (rmapN @(n-1) f) t

This looks much noisier than it has to because we need to work around some of the unfortunate realities of using the typeclass system to compute types. We can’t just match on the number n in order to pick which instance to use because the patterns 0 and n are overlapping. The pattern n can match the number 0 if n ~ 0. The pattern matching used to select a typeclass instance is not quite the same as the regular Haskell pattern matching we use to define functions. The order of the definitions does not matter, so you can’t have default cases, and the patterns you use cannot be unifiable. In order to fix this, we make the condition of whether n is greater than 0 an explicit type variable gte. Now the different cases cannot unify. Needing a variable to represent some branching condition is a very common trick.

For later convenience, we define rmapN, which lets us avoid supplying the comparison type gte ourselves.

Parentifying Leaves Lazily

While it is convenient to describe anyon computations in a canonical basis, it can be quite inefficient. Converting an arbitrary anyon tree into the standard basis will often result in a dense vector. A natural thing to do for the sake of economy is to only do reassociation on demand.

The algorithm for braiding two neighboring leaves is pretty straightforward. We need to reassociate these leaves so that they have the same parent. First we need the ability to map into the least common ancestor of the two leaves. To reassociate these two leaves under a common parent, we pullRightLeaf the left subtree and then pullLeftLeaf the right subtree. Finally, there is a bit of extra shuffling to actually get them to be neighbors.

As a first piece, we need a type level function to count the number of leaves in a tree. In this case, I am inclined to use type families rather than multi parameter typeclasses as before, since I don’t need value level stuff coming along for the ride.

type family Count a where
  Count Tau = 1
  Count Id = 1
  Count (a,b) = (Count a) + (Count b)

type family LeftCount a where
  LeftCount (a,b) = Count a
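
-- A quick illustrative check (my own, e.g. in GHCi with :kind!):
--   Count ((Tau,(Id,Tau)),Tau)      reduces to 4
--   LeftCount ((Tau,(Id,Tau)),Tau)  reduces to 3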

Next, we make a typeclass for mapping into the least common ancestor position.

lcamap ::  forall n s t a b e gte .
           (gte ~ CmpNat (LeftCount s) n,
           LCAMap n gte s t a b)
           => (forall r. FibTree r a -> Q (FibTree r b)) -> (FibTree e s) -> Q (FibTree e t)
lcamap f t = lcamap' @n @gte f t

class LCAMap n gte s t a b | n gte s b -> t a where
    lcamap' :: (forall r. FibTree r a -> Q (FibTree r b)) -> (FibTree e s) -> Q (FibTree e t)
    

instance (n' ~ (n - Count l), -- We're searching in the right subtree. Subtract the leaf number in the left subtree
            lc ~ (LeftCount r), -- dip one level down to decide which way we have to go next
            gte ~ (CmpNat lc n'), -- Do we go left, right or have we arrived in the next layer?
            LCAMap n' gte r r' a b,  -- recursive call
            t ~ (l,r') -- reconstruct total return type from recursive return type. Left tree is unaffected by lcamapping
            ) => LCAMap n 'LT (l,r) t a b where
        lcamap' f x = rmap (lcamap @n' f) x
    
instance (lc ~ (LeftCount l),
            gte ~ (CmpNat lc n),
            LCAMap n gte l l' a b,
            t ~ (l',r)
            ) => LCAMap n 'GT (l,r) t a b where
        lcamap' f x = lmap (lcamap @n f) x
    
instance (t ~ b, a ~ s) => LCAMap n 'EQ s t a b where -- Base case
    lcamap' f x = f x

We find the least common ancestor position by doing a binary search on the size of the left subtrees at each node. Once the size of the left subtree equals n, we’ve found the common ancestor of leaf n and leaf n+1.

Again, this LCAMap typeclass has a typelevel argument gte that directs it which direction to go down the tree.

class Twiddle s t a b | s b -> t a where
    twiddle :: (forall r. FibTree r a -> Q (FibTree r b)) -> FibTree e s -> Q (FibTree e t)
    
instance Twiddle ((l,x),(y,r)) ((l,c),r) (x,y) c where
    twiddle f x = do
            x'  <- fmove x -- (((l',x),y),r')
            x'' <- lmap fmove' x' -- ((l',(x,y)),r')
            lmap (rmap f) x''
instance Twiddle (Tau, (y,r)) (c,r) (Tau, y) c where
    twiddle f x = fmove x >>= lmap f
instance Twiddle (Id, (y,r)) (c,r)  (Id, y) c where
    twiddle f x = fmove x >>= lmap f
instance Twiddle ((l,x), Tau) (l,c) (x,Tau) c where
    twiddle f x = fmove' x >>= rmap f
instance Twiddle ((l,x), Id) (l,c) (x,Id) c where
    twiddle f x = fmove' x >>= rmap f
instance Twiddle (Tau, Tau) c (Tau,Tau) c where
    twiddle f x = f x 
instance Twiddle (Id, Id) c (Id,Id)  c where
    twiddle f x = f x 
instance Twiddle (Tau, Id) c (Tau,Id)  c where
    twiddle f x = f x 
instance Twiddle (Id, Tau) c (Id,Tau) c where
    twiddle f x = f x

The Twiddle typeclass will perform some final cleanup after we’ve done all the leaf pulling. At that point, the leaves still do not have the same parent. They are somewhere between 0 and 2 F-moves away, depending on whether the left and right subtrees are just leaves or larger trees. twiddle is not a recursive function.

Putting this all together we get the nmap function that can apply a function after parentifying two leaves. By far the hardest part is writing out that type signature.

nmap :: forall (n :: Nat) s t a b a' b' l l' r r' e gte.
    (gte ~ CmpNat (LeftCount s) n,
    LCAMap n gte s t a' b',
    a' ~ (l,r),
    PullRightLeaf l l',
    PullLeftLeaf r r',
    Twiddle (l',r') b' a b) => 
    (forall r. FibTree r a -> Q (FibTree r b)) -> FibTree e s -> Q (FibTree e t)
nmap f z = lcamap @n @s @t @a' @b' (\x -> do
                                           x'  <- lmap pullRightLeaf x
                                           x'' <- rmap pullLeftLeaf x' 
                                           twiddle f x'') z

Usage Example

Here’s some simple usage:

t1 = nmap @2 braid (TTT (TTI TLeaf ILeaf) (TTT TLeaf TLeaf)) 
t5 = nmap @2 pure (TTT (TTI TLeaf ILeaf) (TTT TLeaf TLeaf)) >>= nmap @3 pure
t2 = nmap @1 braid (TTT (TTI TLeaf ILeaf) (TTT TLeaf TLeaf)) 
t4 = nmap @1 braid (TTT TLeaf (TTT TLeaf TLeaf)) 
t3 = nmap @2 braid (TTT (TTT (TTT TLeaf TLeaf) TLeaf) (TTT TLeaf TLeaf)) 
t6 = rightAssoc (TTT (TTT (TTT TLeaf TLeaf) TLeaf) (TTT TLeaf TLeaf)) 
t7 = t6 >>= bmove
t8 = t6 >>= rmapN @0 bmove

Note that rmapN is 0-indexed but nmap is 1-indexed. This is somewhat horrifying, but that is what was natural in the implementation.

Here is a more extended example showing how to fuse some particles.

ttt = TTT TLeaf TLeaf
example = starttree >>=
        nmap @1 braid >>=
        nmap @2 braid >>=
        nmap @1 (dot ttt) >>=
        nmap @2 braid' >>=
        nmap @2 (dot ttt) >>=
        nmap @1 (dot ttt) where
        starttree = pure (TTT (TTT TLeaf
                              (TTT TLeaf 
                                   TLeaf))
                          TLeaf
                         )

I started with the tree at the top and traversed downward implementing each braid and fusion. Implicitly all the particles shown in the diagram are Tau particles. The indices refer to particle position, not to the particles “identity” as you would trace it by eye on the page. Since these are identical quantum  particles, the particles don’t have identity as we classically think of it anyhow.

The particle pairs are indexed by the number on the left particle. First braid 1 over 2, then 2 over 3, fuse 1 and 2, braid 2 under 3, fuse 2 and 3, and then fuse 1 and 2. I got an amplitude for the process of -0.618, corresponding to a probability of 0.382. I would give myself 70% confidence that I implemented all my signs and conventions correctly. The hexagon and pentagon equations from last time being correct gives me some peace of mind.

Syntax could use a little spit polish, but it is usable. With some readjustment, one could use Haskell do notation, removing the need for explicit >>= (see the sketch below).
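
Something like this (a sketch; it relies only on Q being a Monad, which the >>= chain above already uses):

example' = do
    s0 <- starttree
    s1 <- nmap @1 braid s0
    s2 <- nmap @2 braid s1
    s3 <- nmap @1 (dot ttt) s2
    s4 <- nmap @2 braid' s3
    s5 <- nmap @2 (dot ttt) s4
    nmap @1 (dot ttt) s5
  where starttree = pure (TTT (TTT TLeaf (TTT TLeaf TLeaf)) TLeaf)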

Next Time

Anyons are often described in categorical terminology. Haskell has a category culture as well. Let’s explore how those mix!

Compiling to Categories 3: A Bit Cuter

Ordinary Haskell functions form a cartesian closed category. Category means you can compose functions. Cartesian basically means you can construct and deconstruct tuples and Closed means that you have first class functions you can pass around.

Conal Elliott’s Compiling to Categories is a paradigm for reinterpreting ordinary functions as the equivalent in other categories. At an abstract level, I think you could describe it as a mechanism to build certain natural law abiding Functors from Hask to other categories. It’s another way to write things once and have them run many ways, like polymorphism or generic programming. The ordinary function syntax is human friendly compared to writing raw categorical definitions, which look roughly like point-free programming (no named variables). In addition, by embedding it as a compiler pass within GHC, he gets to leverage existing GHC optimizations as optimizations for other categories. Interesting alternative categories include the category of graphs, circuits, and automatically differentiable functions. You can find his implementation here

I’ve felt some hesitance about using a GHC plugin, plus I kind of want to do it in a way I understand, so I’ve done different versions of this using relatively normal Haskell (no Template Haskell, no Core passes, but a butt ton of hackery).

The first used explicit tags. Give them to the function and see where they come out. That is one way to probe a simple tuple rearrangement function.
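
The gist of that first version, heavily simplified and with hypothetical names (not the actual code from that post):

-- Feed distinguishable tags through a tuple-rearranging function and watch
-- where they land; (A, B) coming out as (B, A) reveals a swap.
data Tag = A | B deriving (Show, Eq)

probe :: ((Tag, Tag) -> (Tag, Tag)) -> (Tag, Tag)
probe f = f (A, B)

-- probe (\(x, y) -> (y, x))  ==  (B, A)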

The second version worked almost entirely at the typelevel. It worked on the observation that a completely polymorphic tuple type signature completely specifies the implementation. You don’t have to look at the term level at all. It unified the polymorphic values in the input with a typelevel number indexed Tag type. Then it searched through the input type tree to find the piece it needed. I did end up passing stuff in to the term level because I could use this mechanism to embed categorical literals. The typeclass hackery I used to achieve this all was heinous.

I realized today another way related to both that is much simpler and fairly direct. It has some pleasing aesthetic properties and some bad ones. The typeclass hackery is much reduced and the whole thing fits on a screen, so I’m pleased.

Here are the basic categorical definitions. FreeCat is useful for inspection in GHCi of what comes out of toCCC.

{-# LANGUAGE GADTs, StandaloneDeriving, NoImplicitPrelude, FlexibleInstances  #-}

module Cat where
import Control.Category
import Prelude hiding ((.), id)


class Category k => Monoidal k where
    parC :: k a c -> k b d -> k (a,b) (c,d)


class Monoidal k => Cartesian k where
    fstC :: k (a,b) a 
    sndC :: k (a,b) b 
    dupC :: k a (a,a) 

fanC f g = (parC f g) . dupC

idC :: Category k => k a a
idC = id

data FreeCat a b where
    Comp :: FreeCat b c -> FreeCat a b -> FreeCat a c
    Id :: FreeCat a a
    Fst :: FreeCat (a,b) a
    Snd :: FreeCat (a,b) b
    Dup :: FreeCat a (a,a)
    Par :: FreeCat a b -> FreeCat c d -> FreeCat (a,c) (b,d)
    Add :: FreeCat (a,a) a
    Mul :: FreeCat (a,a) a

deriving instance Show (FreeCat a b)

instance Category FreeCat where
    (.) = Comp
    id = Id

instance Monoidal FreeCat where
    parC = Par

instance Cartesian FreeCat where
    fstC = Fst
    sndC = Snd
    dupC = Dup

instance Monoidal (->) where
    parC f g = \(x,y) -> (f x, g y)  

instance Cartesian (->) where
    fstC (x,y) = x
    sndC (x,y) = y
    dupC x = (x,x)

class Cartesian k => NumCat k where
    mulC :: Num a => k (a,a) a
    negateC :: Num a => k a a
    addC :: Num a => k (a,a) a
    subC :: Num a => k (a,a) a
    absC :: Num a => k a a

instance NumCat (->) where
    mulC = uncurry (*)
    negateC = negate
    addC = uncurry (+)
    subC = uncurry (-)
    absC = abs

instance NumCat FreeCat where
    mulC = Mul
    negateC = error "TODO"
    addC = Add
    subC = error "TODO"
    absC = error "TODO"

instance (NumCat k, Num a) => Num (k z a) where
    f + g = addC . (fanC f g)
    f * g = mulC . (fanC f g)
    negate f = negateC . f
    f - g = subC . (fanC f g)
    abs f = absC . f 
    signum = error "TODO"
    fromInteger = error "TODO"

And here is the basic toCCC implementation

{-# LANGUAGE DataKinds, 
    AllowAmbiguousTypes, 
    TypeFamilies, 
    TypeOperators, 
    MultiParamTypeClasses, 
    FunctionalDependencies, 
    PolyKinds, 
    FlexibleInstances, 
    UndecidableInstances,
    TypeApplications,
    NoImplicitPrelude,
    ScopedTypeVariables #-}
module CCC (
     toCCC ) where

import Control.Category
import Prelude hiding ((.), id)
import Cat


class IsTup a b | a -> b
instance {-# INCOHERENT #-} (c ~ 'True) => IsTup (a,b) c
instance {-# INCOHERENT #-} (b ~ 'False) => IsTup a b


class BuildInput tup (flag :: Bool) path where
    buildInput :: path -> tup

instance (Cartesian k,
          IsTup a fa,
          IsTup b fb,
          BuildInput a fa (k x a'), 
          BuildInput b fb (k x b'),
          ((k x (a',b')) ~ cat)) =>  BuildInput (a,b) 'True cat where
    buildInput path = (buildInput @a @fa patha, buildInput @b @fb pathb) where
                 patha = fstC . path
                 pathb = sndC . path

instance (Category k, a ~ k a' b') => BuildInput a  'False (k a' b') where
    buildInput path = path


class FanOutput out (flag :: Bool) cat | out flag -> cat where
    fanOutput :: out -> cat

instance (Cartesian k,
         IsTup a fa,
         IsTup b fb,
         FanOutput a fa (k x a'),
         FanOutput b fb (k x b'),
         k x (a', b') ~ cat
         )
          => FanOutput (a, b) 'True cat where
    fanOutput (x,y) = fanC (fanOutput @a @fa x) (fanOutput @b @fb y)

instance (Category k, kab ~ k a b) => FanOutput kab 'False (k a b) where
    fanOutput f = f 

toCCC :: forall k a b a' b' fa fb x. (IsTup a fa, IsTup b fb, Cartesian k, BuildInput a fa (k a' a'), FanOutput b fb (k a' b')) => (a -> b) -> k a' b'
toCCC f = fanOutput @b @fb output where
                                      input = buildInput @a @fa (idC @k @a')
                                      output = f input


 

Here is some example usage

example2 = toCCC @FreeCat (\(x,y)->(y,x))

-- You need to give the type signature unfortunately if you want polymorphism in k. k is too ambiguous otherwise and makes GHC sad.
example3 :: Cartesian k => k _ _
example3 = toCCC (\(z,y)->(z,z))

example4 = toCCC @FreeCat (\((x,y),z) -> x)

example5 = toCCC @FreeCat (\(x,y) -> x + y)

example6 = toCCC @FreeCat (\(x,y) -> y + (x * y))

example7 :: Cartesian k => k _ _
example7 = toCCC (\(x,(y,z)) -> (x,(y,z)))

 

What we do is generate a tuple to give to your function. The function is assumed to be polymorphic again, but now not necessarily totally polymorphic (this is important because Num typeclass usage will unify variables). Once we hit a leaf of the input tuple, we place the categorical morphism that would extract that piece from the input. For example, for the input type (a,(b,c)) we pass it the value (fstC, (fstC . sndC, sndC . sndC)). Detecting when we are at a leaf again requires somehow detecting a polymorphic location, which is a weird thing to do. We use the Incoherent IsTup instance from last time to do this. It is close to being safe, because we immediately unify the polymorphic variable with a categorical type, so if something goes awry, a type error should result. We could make it more ironclad by unifying immediately to something that contains the extractor and a user-inaccessible type.
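
Spelling that example out in the ordinary-function instance of Cartesian from the Cat module above:

import Control.Category
import Prelude hiding ((.), id)
import Cat

-- The extractor tuple buildInput constructs for an input of type (a,(b,c)).
extractors :: ((a, (b, c)) -> a, ((a, (b, c)) -> b, (a, (b, c)) -> c))
extractors = (fstC, (fstC . sndC, sndC . sndC))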

We apply the function to this input. Now the output is a tuple tree of morphisms. We recursively traverse down this tree with a fanC for every tuple. This all barely requires any typelevel hackery. The typelevel stuff that is there is just so that I can traverse down tuple trees basically.

One benefit is that we can now use some ordinary typeclasses. We can make a simple implementation of Num for (k z a) like how we would make it for (z -> a). This lets us use the regular (+) and (*) operators, for example.

What is not good is the performance. As it is, the implementation performs many global duplications of the input to create all the `fanC`s. In many categories, this is very wasteful. This may be a fixable problem, either by passing in more sophisticated objects than just the bare extraction morphisms as input (CPS-ified? Path lists?) or via the GHC rewrite rules mechanism. I have started to attempt that, but have not been successful in getting any of my rewrite rules to fire yet, because I have no idea what I’m doing. If anyone could give me some advice, I’d be much obliged. You can check that out here. For more on rewrite rules, check out the GHC user manual and this excellent tutorial by Mark Karpov here.
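
The kind of rule one would want here looks something like the following (a sketch only; GHC typically warns that rules whose left-hand sides go through class methods like these may never fire, which might well be part of the problem):

{-# RULES
"fstC/fanC" forall f g. fstC . fanC f g = f
"sndC/fanC" forall f g. sndC . fanC f g = g
  #-}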

Another possibility is to convert to FreeCat, write regular Haskell function optimization passes over the FreeCat AST and then interpret it. This adds interpretation overhead, which may or may not be acceptable depending on your use case. It would probably not be appropriate for automatically differentiable functions, but may be for outputting circuits or graphs.

-- e.g. interp :: Cartesian k => FreeCat a b -> k a b   (for the Cartesian fragment)
interp (Comp f g) = (interp f) . (interp g)
interp Id = idC
interp Fst = fstC
interp Snd = sndC
interp Dup = dupC
interp (Par f g) = parC (interp f) (interp g)
-- etc

 

Another problem is dealing with boolean operations. The boolean operators and comparison operators are not sufficiently polymorphic in the Prelude. We could define new operators that work as drop-in replacements in the original context, but I don’t really have the ability to overload the originals. It is tough because if we do things like this, it feels like we’re really building a DSL more than we are compiling to categories. We need to write our functions with the DSL in mind and can’t just import and use some function that had no idea about the compiling-to-categories stuff.

I should probably just be using Conal’s concat project. This is all a little silly.