Compiling to Categories 3: A Bit Cuter

Ordinary Haskell functions form a cartesian closed category. Category means you can compose functions. Cartesian basically means you can construct and deconstruct tuples and Closed means that you have first class functions you can pass around.

Conal Elliott’s Compiling to Categories is a paradigm for reinterpreting ordinary functions as the equivalent in other categories. At an abstract level, I think you could describe it as a mechanism to build certain natural law abiding Functors from Hask to other categories. It’s another way to write things once and have them run many ways, like polymorphism or generic programming. The ordinary function syntax is human friendly compared to writing raw categorical definitions, which look roughly like point-free programming (no named variables). In addition, by embedding it as a compiler pass within GHC, he gets to leverage existing GHC optimizations as optimizations for other categories. Interesting alternative categories include the category of graphs, circuits, and automatically differentiable functions. You can find his implementation here

I’ve felt hesitance at using a GHC plugin plus I kind of want to do it in a way I understand, so I’ve done different versions of this using relatively normal Haskell (no template haskell, no core passes, but a butt ton of hackery).

The first used explicit tags. Give them to the function and see where they come out. That is one way to probe a simple tuple rearrangement function.

The second version worked almost entirely at the typelevel. It worked on the observation that a completely polymorphic tuple type signature completely specifies the implementation. You don’t have to look at the term level at all. It unified the polymorphic values in the input with a typelevel number indexed Tag type. Then it searched through the input type tree to find the piece it needed. I did end up passing stuff in to the term level because I could use this mechanism to embed categorical literals. The typeclass hackery I used to achieve this all was heinous.

I realized today another way related to both that is much simpler and fairly direct. It has some pleasing aesthetic properties and some bad ones. The typeclass hackery is much reduced and the whole thing fits on a screen, so I’m pleased.

Here are the basic categorical definitions. FreeCat is useful for inspection in GHCi of what comes out of toCCC.

And here is the basic toCCC implementation

 

Here is some example usage

 

What we do is generate a tuple to give to your function. The function is assumed to be polymorphic again but now not necessarily totally polymorphic (this is important because Num typeclass usage will unify variables). Once we hit a leaf of the input tuple, we place the categorical morphism that would extract that piece from the input. For example for the input type (a,(b,c)) we pass it the value (fstC ,(fstC . sndC, sndC . sndC )). Detecting when we are at a leaf again requires somehow detecting a polymorphic location, which is a weird thing to do. We use the Incoherent IsTup instance from last time to do this. It is close to being safe, because we immediately unify the polymorphic variable with a categorial type, so if something goes awry, a type error should result. We could make it more ironclad by unifying immediately to something that contains the extractor and a user inaccessible type.

We apply the function to this input. Now the output is a tuple tree of morphisms. We recursively traverse down this tree with a fanC for every tuple. This all barely requires any typelevel hackery. The typelevel stuff that is there is just so that I can traverse down tuple trees basically.

One benefit is that we can now use some ordinary typeclasses. We can make a simple implementation of Num for (k z a) like how we would make it for (z -> a). This let’s us use the regular (+) and (*)  operators for example.

What is not good is the performance. As it is, the implementation takes many global duplication of the input to create all the fanCs. In many categories, this is very wasteful.This may be a fixable problem, either via passing in more sophisticated objects that just the bare extraction morphisms to to input (CPS-ified? Path Lists?) or via the GHC rewrite rules mechanism. I have started to attempt that, but have not been successful in getting any of my rewrite rules to fire yet, because I have no idea what I’m doing. If anyone could give me some advice, I’d be much obliged. You can check that out here. For more on rewrite rules, check out the GHC user manual and this excellent tutorial by Mark Karpov here.

Another possibility is to convert to FreeCat, write regular Haskell function optimization passes over the FreeCat AST and then interpret it. This adds interpretation overhead, which may or may not be acceptable depending on your use case. It would probably not be appropriate for automatically differentiable functions, but may be for outputting circuits or graphs.

 

Another problem is dealing with boolean operations. The booleans operators and comparison operators are not sufficiently polymorphic in the Prelude. We could define new operators that work as drop in replacements in the original context, but I don’t really have the ability to overload the originals. It is tough because if we do things like this, it feels like we’re really kind of building a DSL more than we are compiling to categories. We need to write our functions with the DSL in mind and can’t just import and use some function that had no idea about the compiling to categories stuff.

I should probably just be using Conal’s concat project. This is all a little silly.

Resources on String Diagrams, and Adjunctions, and Kan Extensions

I’ve been trying to figure out Kan Extensions

Ralf Hinze on Kan Extensions

https://www.cs.ox.ac.uk/ralf.hinze/Kan.pdf

 

But while doing that I went down a rabbit hole on String Diagrams

This post is the first one on String Diagrams that made sense to me.

https://parametricity.com/posts/2015-07-18-braids.html

I had seen this stuff before, but I hadn’t appreciated it until I saw what Haskell expressions it showed equivalence between. They are not obvious equivalences

This seems like a very useful video on this topic.

https://skillsmatter.com/skillscasts/8807-categories-and-string-diagrams

In Summary, it is an excellent notation for talking about transformations of a long sequence of composed Functors  F G H … into some other long sequence of Functors. The conversion of functors runs from up to down. The composition of functors procedes left to right.  F eta is the fmap of eta, and eta F is eta with the forall’ed type unified to be of the form F a.

Adjunctions L -| R are asymmetric between cups and caps. L is on the left in cups and on the right in caps. That’s what makes squiggles pull straightable

I think I have an interesting idea for a linear algebra library based on this stuff

 

John Baez and Mike Stay’s Rosetta Stone (A touch stone I keep returning to)

math.ucr.edu/home/baez/rosetta.pdf

Dan Piponi gave a talk which is another touch stone of mine that I come back to again and again. There is a set of corresponding blog posts.

Other resources:

NCatLab article

https://ncatlab.org/nlab/show/string+diagram

John Baez hosted seminars

http://www.math.ucr.edu/home/baez/QG.html

Catsters

https://www.youtube.com/view_play_list?p=50ABC4792BD0A086

 

Dan Marsden’s Article

https://arxiv.org/abs/1401.7220

Marsden and Hinze have been collaborating

events.inf.ed.ac.uk/wf2016/slides/hinze.pdf

https://link.springer.com/chapter/10.1007/978-3-319-30936-1_8

Stephen Diehl on Adjunctions

http://www.stephendiehl.com/posts/adjunctions.html

 

A Section From an old Oregon Programming Language Summer School (a rich set of resources)

 

Marsden and Hinze have been collaborating

events.inf.ed.ac.uk/wf2016/slides/hinze.pdf

https://link.springer.com/chapter/10.1007/978-3-319-30936-1_8

 

Mike Stay doing a very interesting series of Category Theory in Javascript. He uses contracts in place of types. Defeats one of the big points of types (static analysis), but still pretty cool

 

 

I think that about covers everything I know about.

Oh yeah, there is the whole Coecke and Abramsky categorical quantum mechanics stuff too.

Contracts and Category Theory

Contracts are both less cryptic and less impressive than a type system.

A type system somehow proves that things will never happen at compile time. You’ll never get a string when you expected an int or even more powerful facts. Contracts just guarantee that at least if certain problems occur at least the program will freak out and tell you at runtime.

I had been vaguely aware of contracts which I considered to be kind of a band aid Racket thing that is their solution to type safety (besides typed racket), but I did not go into depth. And I kind of viewed the thing more as a methodology for showing loop invariants and algorithmic correctness rather than type correctness. I do not know if this is an accurate portrayal of what is in Racket and I’m pretty sure contracts do not actually originate there (Eiffel?).

Mike Stay (who you may know as the co-author of the Rosetta Stone paper https://arxiv.org/abs/0903.0340)made a bunch of videos which I don’t know how I didn’t come across before (they’re kind of old by skateboarding millennial mountain dew front end developer standards. Did Node even exist? Barely.). Javascript (well maybe after python) was my language of greatest comfort a couple years ago and I would have loved this. I absolutely loved Bartosz Milewski’s Category Theory for Programmer’s series. There is a crap-ton of line noise  that kind of muddies the waters though in javascript. I wonder if it makes sense to me because I mostly already understand what he’s going for from a Haskell context. He has some cool patterns here like using objects as typeclasses/interfaces.

https://jscategory.wordpress.com/

The really neat thing he discusses is higher order contracts which I’d bet is a central piece of contract based programming but I had never gotten that far.

I’m still skimming over the vids, but nice job.