Functors, Vectors, and Quantum Circuits

Vectors are dang useful things, and any energy you put into them seems to pay off massive dividends.

Vectors and Linear Algebra are useful for:

  • 2D, 3D, 4D geometry stuff. Computer graphics, physics etc.
  • Least Squares Fitting
  • Solving discretized PDEs
  • Quantum Mechanics
  • Analysis of Linear Dynamical Systems
  • Probabilistic Transition Matrices

There are certain analogies between Haskell Functors and Vectors that correspond to a style of computational vector mathematics that I think is pretty cool and don’t see talked about much.

Due to the expressivity of its type system, Haskell has a first class notion of container that many other languages don’t. In particular, I’m referring to the fact that Haskell has higher kinded types * -> * (types parametrized on other types) that you can refer to directly without filling them in first. Examples in the standard library include Maybe, [], Identity, Const b, and Either b. Much more vector-y feeling examples can be found in Kmett’s linear package: V0, V1, V2, V3, V4. For example, consider the 4 dimensional vector type V4, sketched below.
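
Roughly how V4 is defined in linear (the real package has more instances and strictness annotations; this is a sketch):

{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

-- Four slots, all holding the same element type a.
data V4 a = V4 a a a a
  deriving (Show, Functor, Foldable, Traversable)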

This really isn’t such a strange, esoteric thing as it may appear. You wouldn’t blink an eye at the equivalent four-field struct type in some other language. What makes Haskell special is how compositional and generic it is. We can build thousand element structs with ease via composition. What we have here is an alternative to the paradigm of computational vectors ~ arrays. Instead we have computational vectors ~ structs. In principle, I see no reason why this couldn’t be as fast as arrays, although with current compiler expectations it probably isn’t.

Monoidal categories are a mathematical structure that models this analogy well. They have been designed by mathematicians for aesthetic elegance, and it seems plausible that following their example leads us to interesting, useful, and pleasant vector combinators. And I personally find something about category theory that tickles me.

So to get started, let’s talk a bit about functors.

The Algebra of Functors

Functors in Haskell are a typeclass for containers. They allow you to map functions over all the items in the container. They are related to the categorical notion of functor, which is a mapping between categories.

You can lift the product and sum of types to the product and sum of Functors which you may find in Data.Functor.Product and Data.Functor.Sum. This is analogous to the lifting of ordinary addition and multiplication to the addition and multiplication of polynomials, which are kind of like numbers with a “hole”.
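
For reference, the definitions in Data.Functor.Product and Data.Functor.Sum are roughly:

data Product f g a = Pair (f a) (g a)

data Sum f g a = InL (f a) | InR (g a)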

Functors also compose. A container of containers of a is still a container of a. We can form composite containers by using the Compose newtype wrapper.
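
The wrapper from Data.Functor.Compose is roughly:

newtype Compose f g a = Compose { getCompose :: f (g a) }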

When you use this Compose newtype, instead of having to address the individual elements by using fmap twice, a single application of fmap will teleport you through both layers of the container.
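
The Functor instance is what does the teleporting; it is essentially fmap applied twice:

instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap h (Compose fga) = Compose (fmap (fmap h) fga)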

Product, Sum, and Compose are all binary operators on functors. The type constructors have kind (* -> *) -> (* -> *) -> (* -> *).

Some important other functors from the algebra of types perspective are Const Void a, Const () a, and Identity a. These are identity elements for Sum, Product, and Compose respectively.

You can define mappings between containers that don’t depend on the specifics of their contents. These mappings can only rearrange, copy and forget items of their contained type. This can be enforced at the type level by the polymorphic type signature
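
One common way to write it (requires RankNTypes and TypeOperators; the ~> name is just a convention):

type f ~> g = forall a. f a -> g a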

These mappings correspond in categorical terminology to natural transformations between the functors f and g. There is a category where objects are Functors and morphisms are natural transformations. Sum, Product, and Compose all obey the laws necessary to be a monoidal product on this category.

How the lifting of functions works for Compose is kind of neat.
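
A sketch of what that lifting might look like (the name liftCompose is mine; ntf is a natural transformation applied to the outer layer):

liftCompose :: (forall a. f a -> f' a) -> Compose f g b -> Compose f' g b
liftCompose ntf (Compose fg) = Compose (ntf fg)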

Because the natural transformations require polymorphic types, when you apply ntf to fg the polymorphic variable a in the type of ntf restricts to a ~ g a'.

Product and Sum have a straightforward notion of commutativity ((a,b) is isomorphic to (b,a)). Compose is more subtle. sequenceA from the Traversable typeclass can swap the ordering of composition. sequenceA . sequenceA may or may not be the identity depending on the functors in question, so it has some flavor of a braiding operation. Here is an interesting post on that topic: https://parametricity.com/posts/2015-07-18-braids.html

Combinators of these sorts arise in at least the following contexts:

  • Data types a la carte – A systematic way of building extensible data types
  • GHC Generics – A system for building generic functions that operate on data types that can be described with sums, products, recursion, and holes.
  • In and around the Lens ecosystem

Also see the interesting post by Russell O’Connor on functor oriented programming: http://r6.ca/blog/20171010T001746Z.html. I think the above is part of what he is referring to.

Vector Spaces as Shape

Vector spaces are made of two parts, the shape (dimension) of the vector space and the scalar.

Just as a type of kind * -> * can be thought of as a container modulo its held type, it can also be a vector modulo its held scalar type. The higher kinded type for a vector gives an explicit slot to place the scalar type.

The standard Haskell typeclass hierarchy gives you some of the natural operations on vectors if you so choose to abuse it in that way.

  • Functor ~> Scalar Multiplication: smul s = fmap (* s)
  • Applicative ~> Vector Addition: vadd x y = (+) <$> x <*> y
  • Traversable ~> Transposition. sequenceA has the type of transposition and works correctly for the linear style containers like V4. These are spelled out right below.
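
Spelled out as definitions (a sketch; these are the obvious liftings, not any particular library's API):

import Linear.V2 (V2(..))   -- from the linear package, for the transpose example

smul :: (Functor f, Num a) => a -> f a -> f a
smul s = fmap (* s)

vadd :: (Applicative f, Num a) => f a -> f a -> f a
vadd x y = (+) <$> x <*> y

-- transposition of a 2x2 "matrix" of nested containers:
-- sequenceA (V2 (V2 1 2) (V2 3 4)) == V2 (V2 1 3) (V2 2 4)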

The linear library does use Functor for scalar multiplication, but defines a special typeclass for addition, Additive. I think this is largely for the purpose of bringing Map-like vectors into the fold, but I’m not sure.

Once we’ve got the basics of addition and scalar multiplication down, the next thing I want is operations for combining vector spaces. Two important ones are the Kronecker product and the direct sum. In terms of indices, the Kronecker product is a space that is indexed by the cartesian product (,) of its input space indices, and the direct sum is a space indexed by the Either of its input space indices. Both are very useful constructs. I use the Kronecker product all the time when I want to work on 2D or 3D grids for example. A toy 2-D finite difference Laplace equation is a nice example: we can lift the 1D second derivative matrix K = \partial_x^2 to two dimensions using the Kronecker product, K_2 = K \otimes I + I \otimes K. The direct sum is useful as a notion of stacking matrices.

The following is perhaps the most important point of the entire post.

Compose of vector functors gives the Kronecker product, and Product gives the direct sum (this can be confusing but it’s right. Remember, the sum in direct sum refers to the indices).

We can form the Kronecker product of vectors given a Functor constraint.
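
Here is a sketch of both the type-level and value-level pieces (the Kron/DSum/kron names follow the text; the exact definitions are my reconstruction):

import Data.Functor.Compose (Compose(..))
import Data.Functor.Product (Product(..))

type Kron = Compose
type DSum = Product

kron :: (Num a, Functor f, Functor g) => f a -> g a -> Kron f g a
kron x y = Compose (fmap (\xi -> fmap (xi *) y) x)

dsum :: f a -> g a -> DSum f g a
dsum x y = Pair x y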

Notice we have two distinct but related things called kron: Kron and kron. One operates on vector spaces and the other operates on vector values.

Building vector spaces out of small combinators like V2, V4, DSum, Kron is interesting for a number of reasons.

  • It is well typed. Similar to Nat indexed vectors, the types specify the size of the vector space. We can easily describe vector spaces as powers of 2 as V16 = Kron V2 (Kron V2 (Kron V2 (Kron V2 V1))), similarly in terms of prime factors, or we can do a binary expansion (least significant bit first) V5 = DSum V1 (Kron V2 (DSum V0 (Kron V2 V1))), or other things. We do it without going into quasi-dependently typed land or GADTs.
  • It often has better semantic meaning. It is nice to say Measurements, or XPosition or something rather than just denote the size of a vector space in terms of a nat. It is better to say a vector space is the Kron of two meaningful vector spaces than to just say it is a space of size m*n. I find it pleasant to think of the naturals as a free Semiring rather than as the Peano Naturals and I like the size of my vector space defined similarly.
  • Interesting opportunities for parallelism. See Conal Elliott’s paper on scans and FFT: http://conal.net/papers/generic-parallel-functional/

What do linear operators look like?

In the Vectors as shape methodology, Vectors look very much like Functors.

I have been tempted to lift the natural transformation type above to the following for linear operators.
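
The tempting definition looks something like this (a sketch; requires RankNTypes):

type LinOp f g = forall a. Num a => f a -> g a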

In a sense this works; we could implement kron because many of the container types (V1, V2, V3, etc.) in the linear package implement Num. However, choosing Num is a problem. Why not Fractional? Why not Floating? Sometimes we want those. Why not just specifically Double?

We don’t really want to lock away the scalar in a higher rank polymorphic type. We want to ensure that everyone is working in the same scalar type before allowing things to proceed.

Note also that this type does not constrain us to linearity. Can we form the Kronecker product of linear operators? Yes, but I’m not in love with it. This is not nearly so beautiful as the little natural transformation dance.

This was a nice little head scratcher for me. Follow the types, my friend! I find this particularly true for uses of sequenceA. If I find that I want the containers swapped in ordering, sequenceA is usually the right call. It could be called transpose.

Giving the vector direct access to the scalar feels a bit off to me. I feel like it doesn’t leave enough “room” for compositionality. However, there is another possible definition of morphisms that I think is rather elegant.
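
I believe the form in question is something like the following sketch, where an arbitrary Additive functor k (Additive comes from Linear.Vector) is Kron’d on as a spectator:

type LinOp1 f g a = forall k. Additive k => Kron f k a -> Kron g k a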

Does this form actually enforce linearity? You may still rearrange objects. Great. You can also now add and scalar multiply them with the Additive k constraint. We also expose the scalar, so it can be enforced to be consistent.

One other interesting thing to note is that these forms allow nonlinear operations. fmap, liftU2 and liftI2 are powerful operations, but I think if we restricted Additive to just a correctly implemented scalar multiply and vector addition operation, and zero, we’d be good.

We can recover the previous form by instantiating k to V1. V1, the 1-d vector space, is almost a scalar and can play the scalar’s role in many situations. V1 is the unit object with respect to the monoidal product Kron.

There seems to be a missing instance of Additive that would be useful here. There is probably a good reason it isn’t there, but I need it.

Monoidal Categories

The above analogy can be put into mathematical terms by noting that both vectors and functors form monoidal categories. I talked quite a bit about monoidal categories in a previous post: http://www.philipzucker.com/a-touch-of-topological-computation-3-categorical-interlude/

Categories are the combo of a collection of objects and arrows between the objects. The arrows can compose as long as the head of one is on the same object as the tail of the other. On every object, there is always an identity arrow, which when composed will do nothing.

We need a little extra spice to turn categories into monoidal categories. One way of thinking about it is that monoidal categories have ordinary category composition and some kind of horizontal composition, putting things side to side. Ordinary composition is often doing something kind of sequentially, applying a sequence of functions, or a sequence of matrices. The horizontal composition is often something parallel feeling, somehow applying the two arrows separately to separate pieces of the system.

Why are they called Monoidal?

There is a funny game category theory people play where they want to lift ideas from other fields and replace the bits and pieces in such a way that the entire thing is defined in terms of categorical terminology. This is one such example.

A monoid is a binary operation that is associative and has an identity.

Sometimes people are more familiar with the concept of a group. If not, ignore the next sentence. Monoids are like groups without requiring an inverse.

Numbers are separately monoids under addition, multiplication, and minimization (and more), all of which are associative operations with an identity (0, 1, and infinity respectively).

Exponentiation is a binary operation that is not a monoid, as it isn’t associative.

The Monoid typeclass in Haskell demonstrates this http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Monoid.html

A common example of a monoid is list, where mempty is the empty list and mappend appends the lists.

There are different set-like intuitions for categories. One is that the objects in the category are big opaque sets. This is the case for Hask, Rel and Vect.

A different intuition is that the category itself is like a set, and the objects are the elements of that set. There just so happens to be some extra structure knocking around in there: the morphisms. This is more often the feel for examples like preorders or graphs. The word “monoidal” means that we have a binary operation on the objects. But in the category theory aesthetic, you also need that binary operation to “play nice” with the morphisms that are hanging around too.

Functors are the first thing that has something like this. They have other properties that come along for the ride. A Functor is a map that takes objects to objects and arrows to arrows in a nice way. A binary functor takes two objects to an object, and two arrows to one arrow, in a way that plays nice (commutes) with arrow composition.

String diagrams

String diagrams are a graphical notation for monoidal categories. Again, I discussed this more here.

Morphisms are denoted by boxes. Regular composition is shown by plugging arrows together vertically. Monoidal product is denoted by putting the arrows side to side.

When I was even trying to describe what a monoidal category was, I was already using language evocative of string diagrams.

You can see string diagrams in the documentation for the Arrow library. Many diagrams that people use in various fields can be formalized as the string diagrams for some monoidal category. This is a big chunk of Applied Category Theory.

This is the connection to quantum circuits, which are after all a graphical notation for very Kroneckery linear operations.

example circuit

There is an annoying amount of stupid repetitive bookkeeping with the associative structure of Kron. This can hopefully be largely avoided with coerce, but I’m not sure. I was having trouble with roles when doing it generically.

Bit and Bobbles

  • Woof. This post was more draining to write than I expected. I think there is still a lot left to say. Sorry about the editing everyone! Bits and pieces of this post are scattered in this repo
  • How would you go about this in other languages? C, Rust, OCaml, C++, Agda
  • The discussion of Vect = * -> * is useful for discussion of 2-Vect, coming up next. What if we make vectors of Vect? Wacky shit.
  • Metrics and dual vectors. type Dual f a = f a -> a. type Dual1 f a = forall k. Additive k => Kron f k a -> k a
  • Adjunction diagrams have cups and caps. Since we have been using representable functors, they actually have a right adjunction that is tupling with the vector space index type. This gives us something that almost feels like a metric but a weirdly constrained metric.
  • LinOp1 form is yoneda? CPS? Universally quantified k is evocative of forall c. (a -> c) -> (b -> c)

References

Appendix

Representable/Naperian Functors

Containers that are basically big product types are also known as representable, Naperian, or logarithmic. “Representable” places the emphasis on the isomorphism between such a container type f and the function type (->) i for some index type i, which by the algebra of types is a^i (i copies of a). They are called Naperian/logarithmic because there is a relationship similar to exponentiation between the index type and the container type. If f is indexed by a and g is indexed by b, then Product f g is indexed by the sum Either a b, and Compose f g is indexed by the product (a,b), just as r^a * r^b = r^(a+b) and (r^b)^a = r^(a*b). The function type a -> b is written as an exponential b^a because, for finite enumerable types a and b, that is the number of possible functions. The Sum of two representable functors is no longer representable; regular logarithms of sums Log(f + g) do not have good identities associated with them either.

See Gibbons’ article. There is a good argument to be made that representable functors are a good match for vectors / well typed tensor programming.

But note that there is a reasonable interpretation for container types with sum types in them. These can be thought of as subspaces, different bases, or as choices of sparsity patterns. When you define addition, you’ll need to say how these subspaces reconcile with each other. One example, sketched below, is the same 2-D space expressed in two bases at 45 degrees to each other.
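
A hypothetical container of this sort (the type and constructor names are mine):

data V2' a = XY a a | XY45 a a
-- XY holds coordinates in the axis-aligned basis, XY45 in a basis rotated 45 degrees.
-- Addition has to pick a basis in which to reconcile the two representations.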

Monoidal Products on Hask

Hask is a name for the category that has objects as Haskell types and morphisms as Haskell functions.

Note that it’s a curious mixing of type/value layers of Haskell. The objects are types whereas the function morphisms are Haskell values. Composition is given by (.) and the identity morphisms are given by id.

For Haskell, you can compose functions, but you can also smash functions together side by side. These combinators are held in Control.Arrow.

You can smash together types with tuple (,) or with Either. Both of these are binary operators on types. The corresponding mappings on morphisms are given by (***) and (+++) from Control.Arrow.
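
Specialized to ordinary functions, these have the types:

(***) :: (a -> b) -> (c -> d) -> (a, c) -> (b, d)
(+++) :: (a -> b) -> (c -> d) -> Either a c -> Either b d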

These are binary operators on morphisms that play nice with the composition structure of Haskell.

Monoidal Combinators of Functors

A monoidal category also has unit objects. This is given by the Identity functor
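
The unitor isomorphisms witnessing this are just wrapping and unwrapping (a sketch; Identity is from Data.Functor.Identity):

leftUnitor :: Compose Identity f a -> f a
leftUnitor (Compose (Identity fa)) = fa

leftUnitor' :: f a -> Compose Identity f a
leftUnitor' = Compose . Identity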

There is also a sense of associativity. It is just newtype rearrangement, so it can also be achieved with a coerce (although not polymorphically?).
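
The associator, for example, looks something like:

assoc :: Functor f => Compose (Compose f g) h a -> Compose f (Compose g h) a
assoc (Compose (Compose fgh)) = Compose (fmap Compose fgh)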

Similarly, we can define a monoidal category structure using Product or Sum instead of Compose.

These are all actually just newtype rearrangement, so they should all just be instances of coerce, but I couldn’t get the roles to go through generically?

Concolic Weakest Precondition is Kind of Like a Lens

That’s a mouthful.

Lenses are described as functional getters and setters. The simple lens type, together with its setter and getter, is sketched below.
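
Here is one standard (van Laarhoven) presentation; I am reconstructing this, so take the details as a sketch:

{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity(..))
import Data.Functor.Const (Const(..))

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

set :: Lens' s a -> a -> s -> s
set l a s = runIdentity (l (const (Identity a)) s)

view :: Lens' s a -> s -> a
view l s = getConst (l Const s)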

This type does not constrain lenses to obey the usual laws of getters and setters. So we can use/abuse lens structures for nontrivial computations that have forward and backwards passes that share information. Jules Hedges in particular seems to be a proponent of this idea.

I’ve described before how to encode reverse mode automatic differentiation in this style. I have suspicions that you can make iterative LQR and Gauss-Seidel iteration have this flavor too, but I’m not super sure. My attempts ended somewhat unsatisfactorily a while back, but I think it’s not hopeless. The trouble was that you usually want the whole vector back, not just its ends.

I’ve got another example in imperative program analysis that kind of makes sense and might be useful though. Toy repo here: https://github.com/philzook58/wp-lens

In program analysis it sometimes helps to run a program both concretely and symbolically. Concolic = CONCrete / symbOLIC. Symbolic stuff can slowly find hard things and concrete execution just sprays super fast and can find the dumb things really quick.  

We can use a lens structure to organize a DSL for describing a simple imperative language

The forward pass is for the concrete execution. The backward pass is for transforming the post condition to a pre condition in a weakest precondition analysis. Weakest precondition semantics is a way of specifying what is occurring in an imperative language. It tells how each statement transforms post conditions (predicates about the state after the execution) into pre conditions (predicates about before the execution). The concrete execution helps unroll loops and avoid branching if-then-else behavior that would make the symbolic stuff harder to process. I’ve been flipping through Dijkstra’s book on this. Interesting stuff, interesting man.

I often think of a state machine as a function taking s -> s. However, this is kind of restrictive. It is possible to have heterogeneous transformations s -> s’. Why not? I think I am often thinking about finite state machines, which we really don’t intend to have a changing state size. Perhaps we allocated new memory or something or brought something into or out of scope. We could model this by assuming the memory was always there, but it seems wasteful and perhaps confusing. We would need to know a priori everything we will need, which seems like it might break compositionality.

We could model our language by making some data type like

data Imp = Skip | Print String | Assign String Expr | Seq Imp Imp | ...

and then build an interpreter for Imp.

But we can also cut out the middle man and directly define our language using combinators.

To me this has some flavor of a finally tagless style.


Likewise for expressions. Expressions evaluate to something in the context of the state (they can look up variables), so let’s just use a function from the state.
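
Presumably something like (an assumption on my part, but a natural reading):

type Expr s a = s -> a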

And, confusingly (sorry), I think it makes sense to use Lens in their original getter/setter intent for variables. So Lens structure is playing double duty.

type Var s a = Lens' s a

With that said, here we go.
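
A minimal sketch of what the concrete (forward) half could look like; the combinator names are mine and use the Lens set from above:

type FwStmt s = s -> s

fwAssign :: Var s a -> Expr s a -> FwStmt s
fwAssign v e s = set v (e s) s

fwSeq :: FwStmt s -> FwStmt s -> FwStmt s
fwSeq f g = g . f

fwIf :: Expr s Bool -> FwStmt s -> FwStmt s -> FwStmt s
fwIf c t f s = if c s then t s else f s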

Weakest precondition can be done similarly; instead, we start from the end and work backwards.

Predicates are roughly sets. A simple type for sets is
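
Presumably the indicator function representation (a reconstruction on my part):

type Pred s = s -> Bool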

Now, this doesn’t have much deductive power, but I think it demonstrates the principles simply. We could replace Pred with perhaps an SMT solver expression, or some data type for predicates, for which we’ll need to implement things like substitution. Let’s not today.

A function a -> b is equivalent to forall c. (b -> c) -> (a -> c). This is some kind of CPS / Yoneda transformation thing. A state transformer s -> s’ turning into a predicate transformer Pred s’ -> Pred s is somewhat evocative of that. I’m not being very precise here at all.

Without further ado, here’s how I think a weakest precondition looks roughly.
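
Here is my sketch of those combinators (the names are mine), using the types above:

type WpStmt s = Pred s -> Pred s

wpAssign :: Var s a -> Expr s a -> WpStmt s
wpAssign v e post = \s -> post (set v (e s) s)   -- wp(x := e, Q) = Q[x := e]

wpSeq :: WpStmt s -> WpStmt s -> WpStmt s
wpSeq f g = f . g                                -- wp(S1; S2, Q) = wp(S1, wp(S2, Q))

wpIf :: Expr s Bool -> WpStmt s -> WpStmt s -> WpStmt s
wpIf c t f post = \s -> if c s then t post s else f post s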

Finally here is a combination of the two above that uses the branching structure of the concrete execution to aid construction of the precondition. Although I haven’t expanded it out, we are using the full s t a b parametrization of lens in the sense that states go forward and predicates come back.
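
A sketch of the combined pass (names and details are mine): run the state forward concretely, and hand back a predicate transformer that asserts the branch actually taken.

type Concolic s = s -> (s, Pred s -> Pred s)

cAssign :: Var s a -> Expr s a -> Concolic s
cAssign v e s = (set v (e s) s, \post s0 -> post (set v (e s0) s0))

cSeq :: Concolic s -> Concolic s -> Concolic s
cSeq f g s = let (s1, wp1) = f s
                 (s2, wp2) = g s1
             in (s2, wp1 . wp2)

cIf :: Expr s Bool -> Concolic s -> Concolic s -> Concolic s
cIf c t f s =
  if c s then let (s', wp) = t s in (s', \post s0 -> c s0 && wp post s0)
         else let (s', wp) = f s in (s', \post s0 -> not (c s0) && wp post s0)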

Neat. Useful? Me dunno.

Linear Algebra of Types

It gives my brain a pleasant thrum to learn new mathematics which mimics the algebra I learned in middle school. Basically this means that the new system has operations with properties that match those of regular numbers as much as possible. Two pretty important operations are addition and multiplication with the properties of distributivity and associativity. Roughly this corresponds to the mathematical notion of a Semiring.

Some examples of semirings include 

  • Regular multiplication and addition
  • And-Or
  • Min-plus 
  • Matrices.
  • Types

I have written before about how types also form a semiring, using Either for plus and (,) for times. These constructions don’t obey distributivity or associativity “on the nose”, but instead are isomorphic to the rearranged type, which when you squint is pretty similar to equality.

Matrices are grids of numbers which multiply by “row times column”. You can form matrices out of other semirings besides just numbers. One somewhat trivial but interesting example is block matrices, where the elements of the matrix itself are also matrices. Another interesting example is that of relations, which can be thought of as matrices of boolean values. Matrix multiplication using the And-Or semiring on the elements corresponds to relational composition.

What if we put our type peanut butter in our matrix chocolate and consider matrices of types, using the Either/(,) semiring?

The simplest implementation to show how this could go can be made using the naive list based implementation of vectors and matrices. We can directly lift this representation to the typelevel and the appropriate value-level functions to type families.
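
A small taste of what that lift might look like (a sketch, not the full development): a type-level dot product that multiplies entries with (,) and sums them with Either.

{-# LANGUAGE DataKinds, TypeFamilies, TypeOperators, KindSignatures #-}
import Data.Void (Void)

type family Dot (v :: [*]) (w :: [*]) :: * where
  Dot '[] '[] = Void                                 -- the empty sum
  Dot (x ': xs) (y ': ys) = Either (x, y) (Dot xs ys)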

This was just for demonstration purposes. It is not my favorite representation of vectors. You can lift a large fraction of possible ways to encode vector spaces at the value level up to the type level, such as the linear package, or using dual vectors type V2 a = a -> a -> a. Perhaps more on that another day.

What is the point?

Ok. That’s kind of neat, but why do it? Well, one way to seek an answer to that question is to ask “what are matrices useful for anyway?”

One thing they can do is describe transition systems. You can write down a matrix whose entry a_{ij} describes something about the transition from state i to state j. For example the entry could be:

  • The cost of getting from i to j (min-plus gives shortest path),
  • The count of ways to get from i to j (combinatorics of paths)
  • The connectivity of the system from i to j using boolean values and the and-or semiring
  • The probability of transition from i to j
  • The quantum amplitude of going from i to j if we’re feeling saucy.

If we form a matrix describing a single time step, then multiplying this matrix by itself gives 2 time steps and so on.

Lifting this notion to types, we can build a type exactly representing all the possible paths from state i to j.

Concretely, consider the following humorously bleak transition system: You are going between home and work. Every 1 hour period you can make a choice to do a home activity, commute, or work. There are different options of activities at each.


This is described by the following transition diagram

Stapler.

The transitions are described by the following matrix type:
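
Something along these lines, where the specific activities are invented for illustration:

data HomeChoice = Cooking | Sleeping | Reading
data WorkChoice = Programming | Meetings
data Commute = Driving

-- rows/columns indexed by (Home, Work); entry (i, j) is the type of
-- single-hour activities that take you from state i to state j
type Transition =
  '[ '[ HomeChoice, Commute ]
   , '[ Commute,    WorkChoice ] ]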

What is the data type that describes all possible 4-hour days? You’ll find the appropriate data types in the 4th matrix power of the transition matrix.

Now, time to come clean. I don’t think this is necessarily the best way to go about this problem. There are alternative ways of representing it.

Here are two data types that describe an indefinite number of transition steps.

Another style would hold the current state as a type parameter in the type using a GADT.
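
A sketch of that style (the type and constructor names are mine):

{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Location = Home | Work

-- A single one-hour transition, indexed by where it starts and ends.
data Trans (i :: Location) (j :: Location) where
  Cook     :: Trans 'Home 'Home
  Sleep    :: Trans 'Home 'Home
  GoToWork :: Trans 'Home 'Work
  DoWork   :: Trans 'Work 'Work
  Meeting  :: Trans 'Work 'Work
  GoHome   :: Trans 'Work 'Home

-- A path of transitions from location i to location j.
data Path (i :: Location) (j :: Location) where
  Done :: Path i i
  Step :: Trans i j -> Path j k -> Path i k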

We could construct types that are to the above types as Vec n is to [] by including an explicit step size parameter.

Still, food for thought.

Further Thoughts

The reason I was even thinking about this is because we can lift the above construction to perform a linear algebra of vector spaces. And I mean the spaces, not the vectors themselves. This is a confusing point.

Vector spaces also have two natural operations on them that act like addition and multiplication: the direct sum and the Kronecker product. These operations do form a semiring, although again not on the nose.

This is connected to the above algebra of types picture by considering the index types of these vector spaces. The simplest way to denote this in Haskell is using the free vector space construction as shown in this Dan Piponi post. The Kronecker product makes tuples of the indices and the direct sum has an index that is the Either of the original index types.
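
A sketch of the free vector space construction and the two operations (following the spirit of that post; the names are mine):

newtype Vec b a = Vec { runVec :: [(b, a)] }   -- basis element paired with coefficient

-- Kronecker product: indices pair up, coefficients multiply.
kronV :: Num a => Vec b a -> Vec c a -> Vec (b, c) a
kronV (Vec xs) (Vec ys) = Vec [ ((b, c), x * y) | (b, x) <- xs, (c, y) <- ys ]

-- Direct sum: the index type becomes an Either of the two index types.
dsumV :: Vec b a -> Vec c a -> Vec (Either b c) a
dsumV (Vec xs) (Vec ys) =
  Vec ([ (Left b, x) | (b, x) <- xs ] ++ [ (Right c, y) | (c, y) <- ys ])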

This is by far not the only way to go about it. We can also consider using the Compose-Product semiring on functors (Compose is Kron, Product is DSum) to get a more index-free kind of feel and work with dense vectors.

Going down this road (plus a couple layers of mathematical sophistication) leads to a set of concepts known as 2Vect. Dan Roberts and James Vicary produced a Mathematica package for 2Vect which is basically incomprehensible to me. It seems to me that typed functional programming is a more appropriate venue for this kind of pursuit, given how evocative / well modeled by category theory it can be. These mathematical ideas are applicable to describing anyonic vector spaces. See my previous post below. It is not a coincidence that the Path data type above is so similar to the FibTree data type. The root type variable takes the place of the work/home state, and the tuple structure takes the place of a Vec-like size parameter n.

More on this to come, probably, as I figure out how to explain it cleanly.

Edit: WordPress, your weird formatting is killing me.

Edit: Hoo Boy. This is why we write blog posts. Some relevant material was pointed out to me that I was not aware of. Thanks @DrEigenbastard.

https://link.springer.com/chapter/10.1007/978-3-319-19797-5_6

http://blog.sigfpe.com/2010/08/constraining-types-with-regular.html

http://blog.sigfpe.com/2010/08/divided-differences-and-tomography-of.html

Relational Algebra with Fancy Types

Last time, I tried to give a primer of relations and relational algebra using the Haskell type type Rel a b = [(a,b)]. In this post we’re going to look at these ideas from a slightly different angle. Instead of encoding relations using value level sets, we’ll encode relations in the type system. The Algebra of Programming Agda repo and the papers quoted therein are very relevant, so if you’re comfortable wading into those waters, give them a look. You can find my repo for fiddling here

At this point, depending on what you’ve seen before, you’re either thinking “Yeah, sure. That’s a thing.” or you’re thinking “How and why the hell would you do such a ridiculous thing.”

Most of this post will be about how, so let’s address why first:

  1. Examining relations in this style illuminates some constructions that appear around the Haskell ecosystem, particularly some peculiar fellows in the profunctor package.
  2. A more syntactic approach to relations allows discussion of larger/infinite domains. The finite enumerations of the previous post are nice for simplicity, but it seems you can’t get far that way.
  3. Mostly because we can – It’s a fun game. Maybe a useful one? TBD.

With that out of the way, let’s go on to how.

Translating Functions to Relation GADTs

We will be using some Haskell extensions in this post, at the very least GADTs and DataKinds. For an introduction to GADTs and DataKinds, check out this blog post. DataKinds is an extension that reflects every data constructor of data types to a type constructor. Because there are values True and False, there are corresponding types created, 'True and 'False. GADTs is an extension of the type definition mechanism of standard Haskell. It allows you to declare refined types for the constructors of your data, and it infers those refined types when you pattern match out of the data as well, such that the whole process is kind of information preserving.

We will use the GADT extension to define relational datatypes with the kind a -> b -> *. That way there is a slot a for the “input” and b for the “output” of the relation. What goes in these type slots will be DataKinds lifted types like 'True, not ordinary Haskell types like Int. This is a divergence from the uses of similar kinds you see in Category, Profunctor, or Arrow. We’re doing a more typelevel flavored thing than you’ll see in those libraries. What we’re doing is clearly a close brother of the singleton approach to dependently typed programming in Haskell.

Some examples are in order for what I mean. Here are two simple boolean functions, not and and, defined as ordinary Haskell functions, and their equivalent GADT relation data types.
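
Sketches of both versions (the constructor names are mine):

{-# LANGUAGE GADTs, DataKinds #-}

not' :: Bool -> Bool
not' True  = False
not' False = True

and' :: (Bool, Bool) -> Bool
and' (True, True) = True
and' _            = False

data Not a b where
  NotTF :: Not 'True 'False
  NotFT :: Not 'False 'True

data And ab c where
  AndTT :: And '( 'True,  'True)  'True
  AndTF :: And '( 'True,  'False) 'False
  AndFT :: And '( 'False, 'True)  'False
  AndFF :: And '( 'False, 'False) 'False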

You can already start to see how mechanical the correspondence between the ordinary function definition and our new fancy relation type. A function is often defined via cases. Each case corresponds to a new constructor of the relation and each pattern that occurs in that case is the pattern that appears in the GADT. Multiple arguments to the relations are encoded by uncurrying everything by default.

Any function calls that occur on the right hand side of a function definition become fields in the constructor of our relation. This includes recursive calls and external function calls. Here are some examples with a Peano style natural number data type.
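
For example, addition as a relation over Peano naturals (a sketch; PZ and PS correspond to the two cases of plus):

data Nat = Z | S Nat

-- plus Z n = n ;  plus (S m) n = S (plus m n)
data Plus mn o where
  PZ :: Plus '( 'Z, n) n
  PS :: Plus '( m, n) o -> Plus '( 'S m, n) ('S o)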

We can also define things that aren’t functions. Relations are a larger class of things than functions are, which is part of their utility. Here is a “less than equal” relation LTE.
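
Something like (a sketch):

data LTE a b where
  LTEZero :: LTE 'Z n
  LTESucc :: LTE m n -> LTE ('S m) ('S n)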

You can show that elements are in a particular relation by finding a value of that relational type. Is ([4,7], 11) in the relation Plus? Yes, and I can show it with the value PS (PS (PS (PS PZ))) :: Plus (4,7) 11. This is very much the Curry-Howard correspondence. The type R a b corresponds to the proposition/question is (a,b) \in R .

The Fun Stuff : Relational Combinators

While you need to build some primitive relations using new data type definitions, others can be built using relational combinators. If you avoid defining too many primitive relations like the above and build them out of combinators, you expose a rich high level manipulation algebra. Otherwise you are stuck in the pattern matching dreck. We are traveling down the same road we did in the previous post, so look there for less confusing explanations of the relational underpinnings of these constructions, or better yet some of the references below.

Higher order relational operators take in type parameters of kind a -> b -> * and produce new types of a similar kind. The types appearing in these combinators form the AST of our relational algebra language.

The first two combinators of interest are the composition operator and the identity relation. An element (a,c) is in R \cdot Q if there exists a b such that (a,b) \in R and (b,c) \in Q. The fairly direct translation of this into a type is sketched below, with the intermediate b hidden existentially inside the constructor.
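
A sketch (the name is mine):

data RCompose r q a c where
  RCompose :: r a b -> q b c -> RCompose r q a c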

The type of the composition is the same as that of Profunctor composition found in the profunctors package.

Alongside a composition operator, it is a knee jerk to look for an identity relation and we do have one
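
A sketch of it, which is just the type equality GADT (the same shape as (:~:) from Data.Type.Equality):

data RId a b where
  Refl :: RId a a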

This is also a familiar friend. The identity relation in this language is the Equality type.

We can build an algebra for handling product and sum types by defining the appropriate relational combinators. These are very similar to the combinators in the Control.Arrow package.
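
For example, running two relations side by side and fanning one input into two outputs might look like (a sketch; names are mine):

data RPar r q ab cd where
  RPar :: r a c -> q b d -> RPar r q '(a, b) '(c, d)

data RFan r q a cd where
  RFan :: r a c -> q a d -> RFan r q a '(c, d)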

The converse of relations is a very interesting operation and is the point where relations really differ from functions. Inverting a function is tough. Taking the converse of a relation always works. This data type has no analog in profunctor to my knowledge and probably shouldn’t.
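
Converse just swaps the two type parameters (a sketch):

data RConverse r a b where
  RConverse :: r b a -> RConverse r a b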

Relations do not have a notion of currying. The closest thing they have is
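
A trans-style shuffling of one argument from the input side to the output side (compare the trans operation from the value-level post; the encoding here is my guess):

data RTrans r a bc where
  RTrans :: r '(a, b) c -> RTrans r a '(b, c)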

Lattice Operators

For my purposes, lattices are descriptions of sets that trade away descriptive power for efficiency. So most operations you’d perform on sets have an analog in the lattice structure, but it isn’t a perfect matching and you’re forced into approximation. It is nice to have the way you perform these approximations be principled, so that you can know at the end of your analysis whether you’ve actually really shown anything or not about the actual sets in question.

? No. No… Yes? Oh. OH! IT IS!

The top relation holds all values. This is represented by making no conditions on the type parameters. They are completely phantom.

Bottom is a relation with no inhabitants.
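
Sketches of both:

data Top a b = Top            -- no constraints on a or b; they are phantom

data Bottom a b               -- no constructors at all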

The meet is basically the intersection of the relations, the join is basically the union.
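
Which might look like:

data Meet r q a b = Meet (r a b) (q a b)   -- a proof of both relations

data Join r q a b = JoinL (r a b)          -- a proof of one or the other
                   | JoinR (q a b)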

A lattice has an order on it. This order is given by relational inclusion. This is the same as the :-> combinator that can be found in the profunctors package.
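
In this setting it is a polymorphic mapping from proofs of one relation to proofs of the other:

type r :-> q = forall a b. r a b -> q a b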

Relational equality can be written as back and forth inclusion, a natural isomorphism between the relations. There is also an interesting indirect form.

Relational Division

If we consider the equation (r <<< p) :-> q with p and q given, in what sense is there a solution for r? By analogy, this looks rather like r*p = q, so we’re asking a kind of division question. Well, unfortunately, this equation may not necessarily have a solution (neither do linear algebraic equations, for that matter), but we can ask for the best underapproximation instead. This is the operation of relational division. It also appears in the profunctors package as the right Kan extension. You’ll also find the universal property of the right division under the names curryRan and uncurryRan in that module.

One formulation of Galois connections can be found in the adjunctions file. Galois Connections are very slick, but I’m running out of steam, so let’s leave that one for another day.

Properties and Proofs

We can prove many properties about these relational operations. Here is a random smattering that we showed using QuickCheck last time.

Odds and Ends

  • Recursion Schemes – Recursion schemes are a methodology to talk about recursion in a point free style and where the rubber meets the road in the algebra of programming. Here is an excellent series of articles about them. Here is a sample of how I think they go:
  • Higher Order Relations?
  • Examples of use. Check out the examples folder in the AoP Agda repo. These are probably translatable into Haskell.
  • Interfacing with Singletons. Singletonized functions are a specialized case of relations. Something like?
  • A comment to help avoid confusion. What we’ve done here feels confusingly similar to profunctors, but it is in fact distinct, I think. Profunctors are described as a categorical generalization of relations, but to be honest, I kind of don’t get it. Despite many of our constructions appearing in the profunctor package, the profunctor typeclass itself appears to not play a role in our formulation. There just isn’t a good way to dimap under our relations as written, unless you construct free profunctors. Converse at the least is a wrench in the works.
  • Star and graphs. Transition relations are a powerful methodology. A transition relation is in some respects the analog of a square matrix. We can iteratively compose it with itself.

References

Doing Basic Ass Shit in Haskell

Haskell has lots of fancy weird corners, but you want to get rippin’ and runnin’

The Haskell phrase book is a new useful thingy. Nice and terse.

https://typeclasses.com/phrasebook

This one is also quite good https://lotz84.github.io/haskellbyexample/

I also like what FP complete is up to. Solid set of useful stuff, although a bit more emphasis towards their solutions than is common https://haskell.fpcomplete.com/learn

I was fiddling with making some examples for my friends a while ago, but I think the above do a similar or better job.

https://github.com/philzook58/basic-ass-shit

Highlights include:

Makin a json request

Showing a plot of a sine function

Doing a least squares fit of some randomly created data

Not too complicated stuff to get you excited about Haskell:

I love Power Serious. https://www.cs.dartmouth.edu/~doug/powser.html Infinite power series using the power of laziness in something like 20 lines

https://blog.plover.com/prog/haskell/monad-search.html Using the list monad to solve SEND+MORE=MONEY puzzle.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.8903&rep=rep1&type=pdf Jerzy Karczmarczuk doing automatic differentiation in Haskell before it was cool. Check out Conal Elliott’s stuff after.

Very simple symbolic differentiation example. When I saw this in SICP for the first time, I crapped my pants.
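
Something along these lines (a tiny sketch):

data Expr = X | Const Double | Add Expr Expr | Mul Expr Expr
  deriving Show

diff :: Expr -> Expr
diff X         = Const 1
diff (Const _) = Const 0
diff (Add a b) = Add (diff a) (diff b)
diff (Mul a b) = Add (Mul (diff a) b) (Mul a (diff b))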

https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf Why functional Programming Matters by John Hughes

https://www.cs.cmu.edu/~crary/819-f09/Backus78.pdf John Backus emphasizing escaping the imperative mindset in his 1978 Turing Award speech. A call to arms of functional programming

https://www.cs.tufts.edu/~nr/cs257/archive/richard-bird/sudoku.pdf Richard Bird defining sudoku solutions and then using equation reasoning to build a more efficient solver

https://wiki.haskell.org/Research_papers/Functional_pearls – Functional Pearls

Here’s how I find useful Haskell packages:

I google. I go to hackage (if I’m in a subpage, click on “contents” in the upper right hand corner). Click on a category that seems reasonable (like “web” or something) and then sort by Downloads (DL). This at least tells me what is popular-ish. I look for tutorials if I can find them. Sometimes there is a very useful getting started snippet in the main subfile itself. Some packages are overwhelming, others aren’t.

The Real World Haskell book is kind of intimidating although a lovely resource.

The wiki has a pretty rockin set of tutorials. Has some kind soul been improving it?

https://wiki.haskell.org/Category:Tutorials

I forgot Learn You a Haskell has a chapter on basic IO.

http://learnyouahaskell.com/input-and-output

Learn more

When you’re ready to sit down with Haskell more, the best intro is currently the Haskell Book

You may also be interested in this MOOC: https://www.edx.org/course/introduction-functional-programming-delftx-fp101x-0

or this Data61 course: https://github.com/data61/fp-course

Then there is a fun infinitude of things to learn after that.

______

More ideas for simple examples?

This post is intentionally terse.

IO is total infective poison.

standard output io
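
For example, reading a line and printing a response:

main :: IO ()
main = do
  putStrLn "What's your name?"
  name <- getLine
  putStrLn ("Hello, " ++ name)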

mutation & loops. You probably don’t want these. They are not idiomatic Haskell, and you may be losing out on some of the best lessons Haskell has to offer.

file IO

web requests

http://www.serpentine.com/wreq/tutorial.html

web serving – scotty

image processing

basic data structures
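
For example, Map and Set from the containers package:

import qualified Data.Map as Map
import qualified Data.Set as Set

phoneBook :: Map.Map String Int
phoneBook = Map.fromList [("alice", 1234), ("bob", 5678)]

bobsNumber :: Maybe Int
bobsNumber = Map.lookup "bob" phoneBook

evens :: Set.Set Int
evens = Set.fromList [0, 2, 4, 6]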

command line arguments
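
For example, with System.Environment:

import System.Environment (getArgs)

main :: IO ()
main = do
  args <- getArgs
  mapM_ putStrLn args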

plotting

Parallelism and Concurrency

https://nokomprendo.frama.io/tuto_fonctionnel/posts/tuto_fonctionnel_25/2018-08-25-en-README.html

A Short Skinny on Relations & the Algebra of Programming

I’ve been reading about the Algebra of Programming lately and lovin’ it. See J.N. Oliveira’s draft text in particular and the links in the references. I’ve started exploring the stuff from this post and more over here: https://github.com/philzook58/rel

Why and What?

Relations can expand the power of functional programming for the purpose of specification.

The point of a specification is to be able to write down in a very compact and clear way your intent for a program, more clearly and compactly than a full implementation could be written. It therefore makes sense to add to your specification language constructs that are not necessarily executable or efficient for the sake of compactness and clarity. When one needs executability or efficiency, one writes an implementation whose behavior you can connect to the spec via a formal or informal process.

Functional programming, with its focus on the abstraction of the mathematical function, is a good trade-off between executability, efficiency, and expressibility. It lies in a reasonable location between the ideas amenable to reasoning by a human mind and the command-driven requirements of the machines.

Functions are a specialization of relations. Relations extend the mathematical notion of functions with constructs like nondeterministic choice, failure and converse. These constructs are not always obviously executable or efficient. However, they greatly extend the abilities of reasoning and the clarity of expression of a specification.

The point-free style of reasoning about functions extends to a point-free style reasoning about relations, which is known as relation algebra. There are rich analogies with databases, category theory, linear algebra, and other topics.

Plus, I think it is very neato for some reason. If anyone ever thinks something is really neato, isn’t it worth giving it a listen?

A Simple Representation of Relations in Haskell

The simplest description of relations is as a set of tuples. So first let’s talk a bit about the options for sets in Haskell.

Sets in Haskell

There are a couple different reasonable ways to represent sets in Haskell.

  • [a] or Vector a
  • a -> Bool
  • Set a — a tree based Set from the containers package.

These have different performance characteristics and different power. The list [a] is very simple and has specialized pleasant syntax available. The indicator function a -> Bool gives you no ability to produce values of type a, but can easily denote very sophisticated spaces. Set a is a good general purpose data structure with fast lookup. You might also choose to mix and match combinations of these. Interconversion is often possible, but expensive. This is not a complete list of possibilities for sets, for example you may want a representation with a stronger possibility for search.

Relations in Haskell

We can directly use the definition of relations as a set of tuples with the above
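
Using the list representation of sets, that is just:

type Rel a b = [(a, b)]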

But we also have the option to “curry” our relation representations, sort of mixing and matching properties of these representations.
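
For example (the names are mine), we can represent a relation as a function from the left type to sets of the right type, or as an indicator function on pairs:

type SearchRel a b = a -> [b]          -- nondeterministic-function style
type IndicatorRel a b = a -> b -> Bool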

You might also choose to package up multiples of these representations, choosing the most appropriate as the situation requires, see for example the relation package, whose type holds both Map a (Set b) and Map b (Set a).

Despite fiendishly poor performance, for simplicity and list comprehension syntax we are going to be using type Rel a b = [(a,b)] for the remainder of the post.

I’m also taking the restriction that we’re working in bounded enumerable spaces for ease. I assume such a requirement can be lifted for many purposes, but finite spaces like these are especially well tamed. The following typeclass and definition is very useful in this case.
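
Roughly (a sketch; the constraint synonym needs ConstraintKinds):

type BEnum a = (Enum a, Bounded a)

enumAll :: BEnum a => [a]
enumAll = [minBound .. maxBound]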

Functions and Relations

Functions can be thought of as relations with the special property that for each left part of the tuple, there is exactly one right side and every possible left side appears. The relation corresponding to a function f looks like F = \{(x,y) | x \in X, y \in Y, y = f (x)\}.
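
Tabulating a function into our relation type over a finite domain might look like:

tabulate :: BEnum a => (a -> b) -> Rel a b
tabulate f = [ (x, f x) | x <- enumAll ]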

There is a natural and slightly clever lifting of function composition to relations. We now check whether there exists a value that is in the right side of one and the left side of the other.
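
A sketch of composition and the identity relation for the list representation (names are mine):

rcompose :: Eq b => Rel b c -> Rel a b -> Rel a c
rcompose yz xy = [ (x, z) | (x, y) <- xy, (y', z) <- yz, y == y' ]

rid :: BEnum a => Rel a a
rid = [ (x, x) | x <- enumAll ]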

Because of these two operations (and their properties of associativity and identity), FinRel is a category. We do however need the Eq b restriction to make Rel an instance of the Category typeclass, so it does not quite fit the definition of category in base. It is a constrained category.

We can lift the common arrow/categorical combinators up to relations for example.
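
For instance (names mine):

rfan :: Eq a => Rel a b -> Rel a c -> Rel a (b, c)
rfan r q = [ (x, (y, z)) | (x, y) <- r, (x', z) <- q, x == x' ]

rpar :: Rel a b -> Rel c d -> Rel (a, c) (b, d)
rpar r q = [ ((x, w), (y, z)) | (x, y) <- r, (w, z) <- q ]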

With these combinators, you have access to many functions on basic non-recursive algebraic data types. By combining them in a point free style, you can build some other useful combinators.

An Aside: Relations, Linear Algebra, Databases

The composition operation described above is not so unfamiliar as it may first appear.

Relation algebra has a great similarity to linear algebra. This connection can be made more clear by considering sparsity patterns of matrices and tensors. Sparsity patterns are a useful abstraction of linear algebraic operations. Instead of considering matrices of numbers, instead the entries are “zero” and “possibly nonzero” or, if you prefer, a matrix of boolean values corresponding to those same questions.

The ordinary row times column matrix multiplication corresponds to relation composition. Replace * with AND and + with OR. If any of the numbers is zero, then multiplying them will result in zero. In summing two numbers, if either is possibly nonzero, then the result is possibly nonzero.

Another interesting way of looking at it is that we are replacing the summation binding form \sum_i with the logical quantifier \exists_i. Both introduce a scoped “dummy variable” i and have a lot of syntactic similarity. Other related forms include \lambda i, \forall i, \int di, \max_i .

There is also an analog of the point free relation algebra in linear algebra. Linear algebra has the most widely used point free notation in the world, matrix notation. Consider the expressions Ax=b and X = ABC as compared to \sum_j A_{ij} x_j = b_i and X_{il} = \sum_{jk} A_{ij} B_{jk} C_{kl} . Matrix notation is SO much better for certain calculations. Other pieces of the matrix notation include transpose, inverse, Kronecker product, the Khatri-Rao product, and Hadamard product. Their properties are more clear in the index free form in my opinion. I believe even massive tensor expressions can be written index free using these operators. There are also analogies to be drawn between the graphical notations in these different fields.

Databases can be thought of very similarly to sparse matrices. In principle, you could enumerate all the possible values for a column of a database. So you could think of a database as a giant matrix with a 1 if the item is in the database and 0 if not. Databases are very very sparse from this perspective, and you would never store them this way. The join operation is a relative of relational composition, however join usually operates via looking at the column names, whereas our join is position based.

Query optimization in databases has interesting analogs in sparse linear algebra. For example, the Taco compiler http://tensor-compiler.org/ is doing something very akin to a query optimizer.

Inverting Relations

Unlike functions, Relations are always “invertible”. We call this the converse of a relation. When a function is invertible, it corresponds to the converse. In terms of the tuples underlying our representation, it just swaps them. Relations also possess operations trans and untrans that may be thought of as a kind of currying or as a partial inverse on a single parameter.
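
Sketches for the list representation:

converse :: Rel a b -> Rel b a
converse r = [ (y, x) | (x, y) <- r ]

trans :: Rel (a, b) c -> Rel a (b, c)
trans r = [ (x, (y, z)) | ((x, y), z) <- r ]

untrans :: Rel a (b, c) -> Rel (a, b) c
untrans r = [ ((x, y), z) | (x, (y, z)) <- r ]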

Orderings can also be lifted to relations (\leq) = \{(a,b) | a \leq b\}. The composition of relations also respects the usual composition of ordering.

Nondeterministic choice is sometimes represented in Haskell using Set returning functions a -> [b]. You may recall this from the context of the List monad. In fact in this case, we have an isomorphism as evidenced by tabulateSearch and searchRel.
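
These conversions presumably look something like:

tabulateSearch :: BEnum a => (a -> [b]) -> Rel a b
tabulateSearch f = [ (x, y) | x <- enumAll, y <- f x ]

searchRel :: Eq a => Rel a b -> (a -> [b])
searchRel r x = [ y | (x', y) <- r, x == x' ]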

Similarly partial functions can be reflected into relations
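
For example:

tabulatePartial :: BEnum a => (a -> Maybe b) -> Rel a b
tabulatePartial f = [ (x, y) | x <- enumAll, Just y <- [f x] ]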

A useful trick is to lift sets/subsets to relations as a diagonal relation, \{(a,a) | a \in S \}. Projection onto the set can be achieved by composing with this relation. The identity results if you are talking about the entire set S.

Comparing Relations

We can compare sets by asking if one is a subset of the other A\subseteq B . Relations can also be compared by this operation, which we call relational inclusion.

A subservient notion to this is relational equality.
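
For the list representation these are simple to write down (a sketch):

rSub :: (Eq a, Eq b) => Rel a b -> Rel a b -> Bool
rSub r q = and [ pair `elem` q | pair <- r ]

rEq :: (Eq a, Eq b) => Rel a b -> Rel a b -> Bool
rEq r q = rSub r q && rSub q r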

Relational algebra is chock-full of inequality style reasoning, which is richer and slightly more complicated than equality style reasoning. This is one of the benefits of moving from functional descriptions to a relational description.

Relations also form a lattice with respect to these comparisons. What the hell are lattices? In the context of finite relations, lattices may be over powered mathematical machinery, but it really is useful down the line. They give you binary operators that play nice with some kind of ordering, in our case relational inclusion. These two operations are the meet and the join, which find the greatest lower bound and least upper bound of the operands respectively. For our relations, these correspond to the more familiar notion of set intersection and union. The intersection of two sets is the biggest set that is in both of them. The union is the smallest set for which both sets are a subset of it.

Using meet/join vs intersection/union becomes more interesting when the domain is fancier than relations over finite domains. Issues of infinity can make this interesting, or when using a representation that can’t explicitly represent arbitrary unions or intersections, but that instead has to approximate them. My favorite example is polyhedra. Polyhedra are not closed under unions. So in this case the join and union do not coincide. You need to take a convex hull of the union instead, which is the best approximation. Concretely, polyhedra can be represented as a list of their vertices, which generate the polyhedron. There is no way to express a union in this representation. Concatenating the lists represents taking the convex hull of the union.

An additional property that a lattice may possess is a largest and smallest element, called top (\top ) and bottom (\bot). Our finite domain relations do have these.

Relational Division

And now finally we get to one of the most interesting, powerful, and confusing operations: relational division. Relational division is a kind of pseudo inverse to relational composition. In linear algebra, the pseudo inverse is a matrix that does the best job it can to invert another matrix in a least squares sense. If the matrix is actually invertible, it equals the inverse. Relational division does the best job it can to invert a relational composition. Instead of taking least squares as a criteria, it ensures that the result doesn’t over approximate. If you had the inequality X \cdot Y \subseteq Z and you want to solve for X, relational division is the thing that does that. The right division Q = Z/Y is the largest relation such that Q \cdot Y \subseteq Z.

A helpful example is the similar operation of division in database tables.

And here is an implementation that I think is correct. I’ve goofed it up a couple times, it is a rather confusing construct.
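
My reconstruction of such an implementation for the list representation (take it as a sketch):

-- (a, b) is in z / y exactly when, for every c related to b by y, (a, c) is in z.
rdiv :: (BEnum a, BEnum b, Eq a, Eq b, Eq c) => Rel a c -> Rel b c -> Rel a b
rdiv z y = [ (a, b) | a <- enumAll, b <- enumAll
                    , and [ (a, c) `elem` z | (b', c) <- y, b' == b ] ]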

There also exists a very similar operation of ldiv.

Relational division encapsulates many notions of searching or optimizing. I invite you to read more about it in J.N. Oliveira’s text or the Bird & de Moor text.

Properties and QuickCheck

Oh. Mah. Glob. You guys. So many properties. (Artwork courtesy of David)

Relation algebra is so chock full of properties. This is a perfect opportunity for some QuickCheck, a randomized property testing framework. There are so many more to test. I need to dig through to collect up all the identities.

Bits and Bobbles

  • Relations over continuous spaces. Vector subspaces (Linear Relations), Polyhedra (Linear inequality relations).
  • Non Bool-valued Relations. Replace \exists_x with \max_x. The weighted edgelist of a graph is a natural relation. By using composition we can ask about paths. We still have a comparison operator \subseteq which now respects the ordering of weights
  • Galois connections are cool.
  • Relations combined with recursion schemes. Recursion schemes are the point free way of describing recursion.
  • Moving into infinite spaces. How do we cope?
  • Faster search. Some relations are best specified by functions, Maps, others, mixes and matching.
  • If you “singletonize” relations a la the Agda project https://github.com/scmu/aopa, you get very interesting interconnections with profunctors, which people say are a categorical generalization of relations.
  • Point-free DSLs are interesting and pleasant. Many worries about alpha renaming are gone, at the expense of point-free pain. A DSL like this may be necessary to choose good relational query plans

Edit: A follow up post on that type level angle here http://www.philipzucker.com/relational-algebra-with-fancy-types/

References

Edit : A math exchange question about a -> [b] relational type. https://math.stackexchange.com/questions/3360026/can-division-be-expressed-intensionally-in-relation-algebra/3361351#3361351

Edit: An interesting comment and related library from /u/stevana

Why I (as of June 22 2019) think Haskell is the best general purpose language (as of June 22 2019)

Me and my close friends have been interested in starting a project together and I suggested we use Haskell. I do not think the suggestion was received well or perhaps in seriousness. I have a tendency to joke about almost everything and have put forward that we use many interesting but not practical languages in the same tone that I suggest Haskell. This was a tactical mistake. I find myself in despair at the idea I can’t convince my personal friends, who are curious and intellectual people, to use Haskell on a fresh start web project we have complete control over. What hope do I have in the world at large? This brain dump post is meant for them. My suggestion to use Haskell is not just me being an asshole, although that does make it more fun for me. I will now try to explain in all seriousness and in all the honesty that I can muster what my opinions on languages are and why I have them.

Pragmatically can you start a new project using a language you don’t know? This is a problem. A project always has some intrinsic difficulty. Not all projects will survive an extra layer of unnecessary speedbump. But when and how are you supposed to learn new languages? Never is one answer. I disagree with this answer. In this case we have the leg up that I do know Haskell. Perhaps this is a downside in that it will be extra frustrating? It is also easy for me to ask, as using Haskell is not a burden for me, I have already sunk the cost, but a massive learning burden for others.

Is Haskell actually practical for a web application? Short answer: yes. Expect pain though. If your web application is so simple you could rip it out in 100 lines of python, this is such a simple project that it is a good opportunity to learn something new. If it will become large and complex, then I believe Haskell does shine, keeping complexity under control. I base this upon the professional experience of making a web application using a Haskell-Purescript stack. For honesty, it wasn’t all good. I recall ripping my hair out threading shit through monad stacks to get it where it needed to go. Yet on the whole, it kept the project sane. I also believe this based on word of mouth, which could just be cultish ramblings.

I believe that the truly dominating properties of Haskell appear in large, complex projects. This is difficult to prove in any other way except empirically, and the experiments will be so wildly uncontrolled in terms of the project and people involved that no conclusions can truly be drawn. And yet I have faith, and think that personal experience validates this opinion to myself at least. We have to live this life even though truth does not exist. Choices and opinions must be made.

For programs that are going to be a single manageable file and written in one night, it doesn’t matter much what you use in terms of being choked by your own code. At this scale, I still think Haskell is enjoyable and interesting though. Haskell was my doorway into the world of computer science as I now understand it. I hope there are more doorways.

Things about me that may be different from you. Decide for yourself if these aspects of me make our opinions fundamentally incompatible.

  • I do have a talent and a like for practically oriented mathematical topics (computational methods, linear algebra, formal methods, calculus, geometric algebra, projective geometry, optimization, etc.). I actually have very little taste at all for mathematical topics that I see no purpose to.
  • I do have some desire and taste for esoterica for its own purpose. I cannot exactly characterize what makes some topics acceptable to me and others not.
  • A hard learning curve is not necessarily a downside for me. I enjoy the challenge if it is overcomable and worth it on the other side.
  • I like weird and different. That is a positive for me, but a negative for many. I might just be a millennial hipster idiot.
  • I would LOVE to find a language I think is better than Haskell. I would LOVE to abandon Haskell. Perhaps this already makes me odd. Perhaps I think many people don’t consider the differences between languages to be worth making the switch and the wasted knowledge. The people with this opinion may or may not have tried enough languages.
  • I have “drank the koolaid”. I do read what comes out of the Haskell and functional programming community and have a tendency to believe things without strong empirical backing.
  • I have been more deeply entwined with Haskell than any other language. Perhaps if I had reached a level of familiarity in another language I’d be as fervent about that one? I don’t believe this is the case.
  • While I desire performance, I consciously curb this desire. I am strongly in favor of cleanly written, clear, principled code. This is of course a weighted judgement. I will probably take a 100x performance gain in exchange for a 2x decrease in clarity or reusability. This is a result of the problem domains and scale that have interested me in the past and that I anticipate in the future. I STRONGLY believe the world at large overvalues performance at the expense of other good qualities of code. Or at least over-optimizes early.

A Biased List of Pros and Cons

What do I find bad about C++?

  • The feature set is huge and difficult to understand
  • Extreme amounts of legacy features that will even be recommended if you read older documentation. Kitchen sink.
  • The language appears to be almost entirely built out of footguns.
  • The syntax is too verbose.
  • I have a distaste for mutation.
  • I have a distaste for object oriented programming

What do I find good about C++

  • It is commonly used. Large community.
  • It is possible for it to be very fast
  • Kitchen Sink. You can find almost any feature you want here.
  • The high level goals of the mind-leaders sound good.
  • Very interesting projects are written in C++. HPC things. Scientific computation.
  • Template metaprogramming seems very powerful, but arcane.

What I find bad about Java

  • The syntax puts me off as being incredibly verbose
  • Extreme object oriented focus
  • Corporate/Enterprise feel. I am an iconoclast and see myself as a reasonable but slightly rebellious character. Java in my mind brings images of cubicles and white walls. Perhaps this is not fair.

What do I find good about Java

  • ?

I’m joking. Sorry Java. But I also kind of mean it. Yes, there are positive aspects to Java.

What I like about python

  • Very commonly used and understood. Perhaps the lingua franca
  • Incredible library ecosystem
  • Numpy and scipy in particular are marvels of the modern age.
  • Syntax is basically imperative pseudo-code
  • I am personally very familiar with it
  • Python is the easiest-to-use language I know.

What I dislike about python

  • Python has no natural tendency toward correctness due to its freewheeling, dynamically typed character. This is patched up with testing and opt-in type systems.
  • I don’t know how to grow as a pythonista. The skill curve flattens out. For some this may be a positive.
  • The main way of building new data types is the class system. I think this is ungainly, overly verbose, and not always a good conceptual fit.
  • Despite being among the most succinct of common imperative languages, I still think it ends up being too verbose.
  • It is slow. This is a negative, although not high on my priorities.

What is bad about Haskell

  • Very difficult learning curve. Let’s get real. Haskell is a very confusing programming language to get started in for common programming tasks.
  • Functional programming is weirder than imperative programming.
  • The monad paradigm, even once learned is ungainly. Tracking multiple effects is a pain that does not exist in most languages.
  • The pain is up front. It is easy to get a sketch of what you want ripped out in python faster than in Haskell (for me). If you want a web server, command line tool, optimization problem, curve fitter, I can rip all of these out faster in python than I can in Haskell. As a psychological thing, this feels awful. For a small scale project, unless toying with Haskell itself or one of its domain expertises like implementing DSLs, python is the easier and correct choice. Python is a scripting language. I’d make the switch at two screens worth of code.
  • I think laziness is confusing and easy to shoot yourself with.
  • Haskell is not the fastest language, although faster than python.
  • Concern that there are not many jobs and, interestingly, on the converse side, not many people to hire. There is a catch-22 there. There is a set of people that would KILL for a Haskell job.
  • There are vocal smug assholes who use Haskell and push it. I am smug, I hope only mildly an asshole.

What I find good about Haskell

  • The number one reason is that there is something ephemeral that I just like. I am not naturally inclined to analyze such things. When I like a movie or don’t like it, it just happens, and if forced to explain why, I’ll bullshit.
  • Errors and bugs are to a shocking degree caught at compile time. More than in any other language I have experience with, Haskell code runs correctly without hidden bugs if it compiles. I am not claiming this is absolute, but it is all the more incredible the degree to which it is true since it didn’t literally have to be this way. I have done major rewiring of a data structure in a project. The compiler guided my hand to exactly the positions that needed to be adjusted. Could C++ do this? Yes, to some degree. I believe that the tendencies of C++ programming make it less satisfactory on this point.
  • Types are an incredible design tool. I find designing the types of my program to be an extremely enjoyable and helpful activity. Much more so than box-and-wire class and interface diagrams. A good function name and a type signature basically entirely constrain the behavior of a function. Types can be quickly and completely given to the compiler and machine enforced.
  • The pains that Monads cause are pains you should be feeling. They are the pain of explicitness, which I 70% choose over the pain of not knowing what the fuck a function might do and of not enabling the compiler to enforce that. If something is capable of mutating state, it should say so in goddamn huge purple letters.
  • Haskell is more than fast enough. It isn’t even that people don’t care. The Haskell community at large cares a lot more about performance than I do, and I reap the dividends. The people in charge of the compiler and the main libraries are goddamn wizards whose work I get to benefit from. This is true of all languages perhaps.
  • Laziness is very cool. At the beginning I thought it was INCREDIBLY awesome and inconceivable that I could manipulate an infinite list.
  • The way of specifying new data types is so succinct and so incredible. Pattern matching is SO GODDAMN good for some tasks.
  • Haskell has a paradigm of small combinators. It is common to build a sequence of very small natural functions for the domain and then build larger and larger things out of them. This is good for reusability and conceptual clearness.
  • Extreme preference for Immutability. As part of keeping what you must keep in your head or remember small while programming, immutability is an insane win. You think you know what you want now. You know you could just tweak this variable here, make a special variable over here. You can reason about how to make this all work and be correct now. You will not in a month. Your coworkers will mess it all up too.
  • Haskell code is generic by default. This allows the same code to be reused in many situations.
  • The standard typeclass hierarchy is extremely well thought out and powerful. To some degree it is unnecessary in other languages. The difference between Functor, Applicative, Monad, and Traversable makes little sense in languages with unconstrained mutations and effects.
  • Haskell paradigms are inspired by mathematics, and I have great faith in mathematics. The concepts behind Haskell feel closer to discoveries rather than inventions. Imperative programming speaks in a language formed for the accidental nature of the machines we have. Functional programming is a language closer to mathematics, which I believe is closer to how the human mind works, and closer to what the problem at hand actually is.
  • Complexity scales. It is my belief, perhaps unverified, that as a project grows larger, the insane miserable churn and mental overhead grow more slowly in a Haskell project than in other languages. I do not have strong empirical evidence for this assertion. Word of mouth (of Haskellers).
  • The ceiling on Haskell is extremely high. You can continue to learn and get better, gaining more and more power. I do not currently see the end of this.
  • When I do reach for some new library, I am very often impressed by how thoughtfully built it is. Haskell itself is INCREDIBLY thoughtfully built.
  • The Haskell community is very excited and has many things to teach. There are many Haskellers out there who are very welcoming, kind, and intelligent.
  • Haskell does have some cachet. I am not immune to wanting to seem smart. If the world thought that only idiots use Haskell, that would put me off some. That the world thinks that only impractical ivory-tower smug weenies use Haskell does put me off, although I perhaps embrace it belligerently.
  • The Haskell library ecosystem is strong. Less strong than python, but much better than some of the other languages that intrigue my eye. There is functionality somewhere for most common tasks.
  • Haskell is used industrially. I live in an echo chamber, but by and large the industrial users of Haskell sing its praises.

For context, this is my programming journey:

I first learned BASIC in middle school. I wrote a computer game in Visual Basic. I toyed with the TI-83. I went to a summer camp in high school where I learned the extreme rudiments of C++. Did a small web business with some friends. Did really bad work with big dreams. I took a fairly poor Java course in high school. I learned MATLAB in college. I doinked around with Arduino-level C projects. I learned python in grad school, I think. I think I got far more proficient at python than at any other language before. At this point, I was on board for object oriented programming, albeit not at a deep level (design patterns or the like), just as a light organizational principle. Did some Android projects, really didn’t like Java there. Did a small business with my friend and got deep into Javascript. As our web application got bigger, this really started to hurt. Errors all over the place. Unknown junk in objects. Who has access to what? We tried our darndest to read and follow best practices as we could find them, but it felt like we were sinking in quicksand. More and more effort for more and more bugs and less and less progress. It was around this era that I first even heard of Haskell. I was intrigued immediately for some reason, maybe for just how weird it was. It took probably 2 years of forgetting about it and going back to tutorials every 6 months. I didn’t necessarily KNOW that this was a thing that I wanted, I just found it interesting. Currently I am fairly proficient at some things in Haskell (unfortunately I am more proficient at esoterica than at more practical things). I have had a professional job writing Haskell, and it seems like my liking of Haskell is working out very well for me professionally. Passion in all forms is powerful.

My history may not be as convincing as someone who spent 20 years as a professional C++ dev and then switched, but I have at least experienced different paradigms and languages and found Haskell the most to my liking.

Random Thoughts

I am now trying to push myself into comfort in Julia, Rust, Agda, Coq, and OCaml, all languages I feel show promise for different reasons. To my knowledge Haskell is a better choice than these as a general purpose tool for pragmatic reasons. Haskell’s library ecosystem is strong and its performance is good. These are points against Agda and Coq. Julia has a focus on scientific programming.

Rust might be a good compromise. I consider it a trojan horse for useful programming language features that I associate with functional languages. It lays claim to performance and to being an acceptable systems-level language, which appeals to some. The syntax does not scare anyone off. The library ecosystem seems good and the community strong. I did find myself missing Haskell features in Rust though, and I am personally much less familiar with it. I think the design of Rust weights performance more heavily than is warranted in most applications and has less descriptive and abstraction power than Haskell, qualities that I prioritize. This opinion is not strongly held.

What makes Haskell’s types any better or worse than C’s? At the beginning many of the features of Haskell seem like magic. But as time has worn on, I can see how the features can be emulated with just some slight syntactic and conceptual overhead in other languages. This slight overhead is enough though. A language is more than just its syntax. It is also its idioms. It is also the way it makes people think. Language is almost a prerequisite for thought. You cannot even conceive of the ways there are to express yourself before learning.

What exactly makes a language “good”? This is a question poorly phrased enough to have no answer. Excel can be an excellent language for many tasks. Easy to learn. Very powerful. Yet, it is not considered a good general purpose programming language. Library ecosystem is extremely important. Specialized languages are often the best choice for special problem domains, at the expense of learning them, or of eventually finding that what you want is incompatible with what they were designed for.

What makes abstractions “good”? Why do I have queasiness about object-oriented programming? Java, I think, basically. I have overeagerly gone down the road of trying to design deep subclass hierarchies, which is not OO at its best. Zebra is a Quadruped is an Animal is Alive kind of stuff. I believe object orientation is an interesting principle. I hear about Smalltalk and Common Lisp doing object orientation right and I am duly intrigued. There has been some recent work in Haskell about how to do objects in a way aesthetically compatible with Haskell. I think object orientation has been overused and abused. I think it is a poor conceptual fit for many situations. I think it tends to make non-reusable code. I think the form that it takes in C++ and Java development is arcane horseshit.

I deserve almost no opinion about Java or C++, having not done that much in them. Yet, I must state my opinions, take them as you will, for I do in fact hold them strongly. I have worked on a network simulator and a robotics framework in C++, but begrudgingly. I have done a very small amount of Java development for a personal project and some Processing sketches. My coworker was a 10-year professional Java dev before switching to Scala, then Haskell. He despises Java now. Highly anecdotal, and he is a similarly iconoclastic character to me. Nevertheless, this also informs my opinion. I have been reading Bjarne Stroustrup’s book (his stated goals and what he claims C++ achieves are admirable, and one can’t argue he hasn’t changed the world) and actually find C++ rather interesting, especially in the sense that many projects that I find interesting are written in C++; I just don’t want to work in the language myself.

Haskell love:

Haskell Criticism (perhaps warranted)

https://news.ycombinator.com/item?id=17114308 Ah Hacker News. Always a sunny worldview.

https://medium.com/@snoyjerk/least-favorite-thing-about-haskal-ef8f80f30733

Hacker news discussion of this post:

https://news.ycombinator.com/item?id=20258218#20271511

Lens as a Divisibility Relation: Goofin’ Off With the Algebra of Types

Types have an algebra very analogous to the algebra of ordinary numbers (video). The basic table of correspondences: Void plays the role of 0, () plays 1, Either a b is the sum a + b, the pair (a, b) is the product a * b, and the function type a -> b is the exponential b^a. Code with all the extensions available here.

One way to see that this makes sense is by counting the cardinality of types built out of these combinators. Unit is the type with 1 inhabitant. Void has 0 inhabitants. If a has n and b has m possible values, then Either a b has n + m inhabitants, (a,b) has n*m, and there are m^n possible tabulations of a->b. We’re gonna stick to just polynomials for the rest of this, ignoring a->b.
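To make the counting concrete, here is a small illustration of my own (not code from the original post), enumerating the inhabitants for a = b = Bool, i.e. n = m = 2:

    -- Bool has 2 inhabitants, so Either Bool Bool has 2 + 2 = 4 of them,
    -- (Bool, Bool) has 2 * 2 = 4, and Bool -> Bool has 2 ^ 2 = 4.
    sumValues :: [Either Bool Bool]
    sumValues = [Left False, Left True, Right False, Right True]

    productValues :: [(Bool, Bool)]
    productValues = [(x, y) | x <- [False, True], y <- [False, True]]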

Another way of looking at this is that if two finitely inhabited types have the same number of inhabitants, then the types can be put into an isomorphism with each other. In other words, types modulo isomorphisms can be thought of as representing the natural numbers. Because of this, we can build a curious proof system for the natural numbers using ordinary type manipulation.

In addition, we also get a natural way of expressing and manipulating polynomials. Polymorphic types can be seen as being very similar to polynomial expressions with natural coefficients N[x]. The polymorphic type variables have the ability to be instantiated to any value, corresponding to evaluating the polynomial for some number. For example, Either () (a, a) plays the role of the polynomial 1 + x^2, and instantiating a to Bool evaluates it at x = 2, giving 1 + 4 = 5 inhabitants.

The Lens ecosystem gives some interesting combinators for manipulating this algebra. The type Iso' a b contains isomorphisms. Since we’re only considering types up to isomorphism, this Iso' represents equality. We can give identity isomorphisms, compose isomorphisms and reverse isomorphisms.
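Since the code isn’t reproduced inline here, a minimal sketch of that equality structure using Control.Lens (my names, not necessarily the original ones):

    {-# LANGUAGE RankNTypes #-}

    import Control.Lens (Iso', from, iso)

    -- reflexivity: every type is isomorphic to itself
    refl :: Iso' a a
    refl = iso id id

    -- symmetry: reverse an isomorphism
    sym :: Iso' a b -> Iso' b a
    sym i = from i

    -- transitivity: isomorphisms compose with ordinary (.)
    trans :: Iso' a b -> Iso' b c -> Iso' a c
    trans ab bc = ab . bc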

We can already form a very simple proof.
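For example (not necessarily the proof from the original code), lens’s swapped iso gives commutativity of the product, and composing it with its own reverse is already a tiny proof:

    import Control.Lens (Iso', from, swapped)

    mulCommutes :: Iso' (a, b) (b, a)
    mulCommutes = swapped

    -- (a, b) ~ (b, a) ~ (a, b), the long way around
    roundTrip :: Iso' (a, b) (a, b)
    roundTrip = mulCommutes . from mulCommutes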

Now we’ll add some more combinators, basically the axioms that the types mod isos are a commutative semiring. Semirings have an addition and multiplication operator that distribute over each other. It is interesting to note that I believe all of these Iso' actually are guaranteed to be isomorphisms ( to . from = id and from . to = id ) because of parametricity. from and to are unique ignoring any issues with bottoms because the polymorphic type signature is so constraining. This is not usually guaranteed to be true in Haskell just from saying it is an Iso'. If I give you an Iso' Bool Bool it might actually be the iso (const True) (const True) for example, which is not an isomorphism.
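Here is my reconstruction of those axiom combinators (the names id_mul, id_plus, mul_zero, plus_assoc, mul_assoc are the ones referred to later in this post; the exact orientations in the original code may differ):

    import Control.Lens (Iso', iso)
    import Data.Void (Void, absurd)

    plus_comm :: Iso' (Either a b) (Either b a)
    plus_comm = iso (either Right Left) (either Right Left)

    mul_comm :: Iso' (a, b) (b, a)
    mul_comm = iso (\(x, y) -> (y, x)) (\(x, y) -> (y, x))

    id_plus :: Iso' (Either a Void) a        -- a + 0 = a
    id_plus = iso (either id absurd) Left

    id_mul :: Iso' (a, ()) a                 -- a * 1 = a
    id_mul = iso fst (\x -> (x, ()))

    mul_zero :: Iso' (a, Void) Void          -- a * 0 = 0
    mul_zero = iso snd absurd

    dist :: Iso' (a, Either b c) (Either (a, b) (a, c))
    dist = iso f g
      where
        f (x, Left y)  = Left (x, y)
        f (x, Right z) = Right (x, z)
        g (Left (x, y))  = (x, Left y)
        g (Right (x, z)) = (x, Right z)

    plus_assoc :: Iso' (Either a (Either b c)) (Either (Either a b) c)
    plus_assoc = iso f g
      where
        f (Left x)          = Left (Left x)
        f (Right (Left y))  = Left (Right y)
        f (Right (Right z)) = Right z
        g (Left (Left x))   = Left x
        g (Left (Right y))  = Right (Left y)
        g (Right z)         = Right (Right z)

    mul_assoc :: Iso' (a, (b, c)) ((a, b), c)
    mul_assoc = iso (\(x, (y, z)) -> ((x, y), z)) (\((x, y), z) -> (x, (y, z)))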

There are also combinators for lifting isomorphisms into bifunctors: firsting, seconding, and bimapping. These are important for indexing into subexpressions of our types in a point-free style.
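For instance, combined with the id_mul sketched above, firsting lets us simplify inside only the Left branch of a sum (an illustrative example of mine, not the post’s code):

    import Control.Lens (Iso', firsting)

    -- apply the iso (a, ()) ~ a in the first slot of Either
    simplifyLeft :: Iso' (Either (a, ()) b) (Either a b)
    simplifyLeft = firsting id_mul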

Here is a slightly more complicated proof now available to us.
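For example (again a reconstruction, not necessarily the original proof), right-distributivity follows from commutativity and the left-distributivity iso dist defined above:

    import Control.Lens (Iso', bimapping)

    dist_r :: Iso' (Either a b, c) (Either (a, c) (b, c))
    dist_r = mul_comm . dist . bimapping mul_comm mul_comm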

We can attempt a more interesting and difficult proof. I developed this iteratively using . _ hole expressions so that GHC would tell me what I had manipulated my type to at that point in my proof.

Artwork Courtesy of David. Sorry for any motion sickness.

The proof here is actually pretty trivial and can be completely automated away. We’ll get to that later.

If Iso' is equality, what about the other members of the Lens family? Justin Le says that a Lens s a is a witness to the isomorphism of the type s to the tuple of something and a. Prisms witness a similar thing for sums. Once we are only considering types mod isos, if you think about it, these are expressions of two familiar relations on the natural numbers: the inequality relation and the divisibility relation.

Mathematically, these relations can be composed with equalities, just like in the lens hierarchy Lens and Prism can be composed with Iso. Both form a category, since they both have id and (.).

Here are a couple identities that we can’t derive from these basic combinators. There are probably others. Woah-ho, my bad. These are totally derivable using id_mul, id_plus, mul_zero, _1, _2, _Left, _Right.

Pretty neat! Random thoughts and questions before we get into the slog of automation:

  • Traversal is the “is polynomial in” relation, which seems a rather weak statement on its own.
  • Implementing automatic polynomial division is totally possible and interesting
  • What is the deal with infinite types like [a]? Fix. I suppose this is a theory of formal power series / combinatorial species. Fun combinatorics, generatingfunctionology. Brent Yorgey did his dissertation on this stuff. Wow. I’ve never really read this. It is way more relevant than I realized.
  • Multivariate polynomial algorithms would also be interesting to explore (Grobner basis, multivariate division)
  • Derivatives of types and zippers – Conor McBride
  • Negative Numbers and Fractions?
  • Lifting to rank-1 types. Define Negative and Fractions via Galois connection?

Edit: /u/carette (wonder who that is 😉 ) writes:

“You should dig into
J Carette, A Sabry Computing with semirings and weak rig groupoids, in Proceedings of ESOP 2016, p. 123-148. Agda code in https://github.com/JacquesCarette/pi-dual/tree/master/Univalence. A lot of the algebra you develop is there too.

If you hunt around in my repos, you’ll also find things about lenses, exploring some of the same things you mention here.”

Similar ideas taken further and with more sophistication. Very interesting. Check it out.

Automation

Our factor example above was quite painful, yet the theorem was exceedingly obvious by expansion of the left and right sides. Can we automate that? Yes we can!

Here’s the battle plan:

  • Distribute out all expressions like a*(b+c) so that all multiplication nodes appear at the bottom of the tree.
  • Reduce the expression by absorbing all stupid a*1, a*0, a+0 terms.
  • Reassociate everything to the right, giving a list like format
  • Sort the multiplicative terms by power of the variable

Once we have these operations, we’ll combine them into a canonical operation. From there, most equality proofs can be performed using the rewrite operation, which merely puts both sides into canonical form.
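The payoff has roughly this shape (my own naming, just to show the idea): if both sides canonicalize to the same form c, one canonicalization composed with the reverse of the other is the desired proof.

    {-# LANGUAGE RankNTypes #-}

    import Control.Lens (Iso', from)

    rewriteVia :: Iso' a c -> Iso' b c -> Iso' a b
    rewriteVia canonL canonR = canonL . from canonR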

Once we have those, the painful theorem above and harder ones becomes trivial.

Now we’ll build the Typeclasses necessary to achieve each of these aims in order. The Typeclass system is perfect for what we want to do, as it builds terms by inspecting types. It isn’t perfect in the sense that typeclass pattern matching needs to be tricked into doing what we need. I have traded cleverness and elegance for verbosity.

In order to make our lives easier, we’ll need to tag every variable name with a newtype wrapper. Otherwise we won’t know when we’ve hit a leaf node that is a variable. I’ve used this trick before here in an early version of my faking Compiling to Categories series. These wrappers are easily automatically stripped.
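Something along these lines (my guess at the wrapper; the actual name in the code may differ):

    import Control.Lens (Iso', iso)

    newtype V a = V a

    -- stripping the tag is itself an Iso, so it can participate in proofs
    unV :: Iso' (V a) a
    unV = iso (\(V x) -> x) V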

A common pattern I exploit is to use a type family to drive complicated recursion. Closed type families allow more overlap and default patterns which is very useful for programming. However, type families do not carry values, so we need to flip flop between the typeclass and type family system to achieve our ends.
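Here is a toy standalone instance of that flip-flop pattern (my own example, not code from this post): a closed type family computes a Bool, and a typeclass indexed by that Bool produces the actual value.

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE TypeFamilies #-}
    {-# LANGUAGE MultiParamTypeClasses #-}
    {-# LANGUAGE FlexibleInstances #-}
    {-# LANGUAGE UndecidableInstances #-}
    {-# LANGUAGE ScopedTypeVariables #-}

    import Data.Proxy (Proxy (..))

    -- the type family can use ordering and a default case, but carries no values
    type family IsPair a :: Bool where
      IsPair (x, y) = 'True
      IsPair a      = 'False

    -- so we flip back to a typeclass, indexed by the family's answer
    class DescribePair (flag :: Bool) a where
      describePair :: Proxy flag -> a -> String

    instance DescribePair 'True (x, y) where
      describePair _ _ = "a pair"

    instance DescribePair 'False a where
      describePair _ _ = "not a pair"

    -- the driver class ties them together: the family's result picks the instance
    class Describe a where
      describe :: a -> String

    instance forall a flag. (IsPair a ~ flag, DescribePair flag a) => Describe a where
      describe = describePair (Proxy :: Proxy flag)

    -- describe (True, 'x') == "a pair",  describe 'x' == "not a pair"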

Here is the implementation of the distributor Dist. We make RDist and LDist typeclasses that make a sweep of the entire tree, using ldist and rdist as makes sense. It was convenient to separate these into two classes for my mental sanity. I am not convinced even now that I have every case. Then the master control class Dist runs these classes until any node that has a (*) in it has no nodes with (+) underneath, as checked by the HasPlus type family.
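The HasPlus check is roughly this kind of closed type family (my reconstruction, not the post’s exact code):

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE TypeFamilies #-}

    -- does a type expression still mention a sum (Either) anywhere inside?
    type family HasPlus a :: Bool where
      HasPlus (Either a b) = 'True
      HasPlus (a, b)       = Or (HasPlus a) (HasPlus b)
      HasPlus a            = 'False

    type family Or (x :: Bool) (y :: Bool) :: Bool where
      Or 'False 'False = 'False
      Or x      y      = 'True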

Next is the Absorb type class. It is arranged somewhat similarly to the above. Greedily absorb, and keep doing it until no absorptions are left. I think that works.
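The rules being applied are roughly these, written here as a single-sweep type family purely for illustration (the actual Absorb class carries the Iso' values and iterates until nothing is left to absorb):

    {-# LANGUAGE TypeFamilies #-}

    import Data.Void (Void)

    type family Absorbed a where
      Absorbed (a, ())         = Absorbed a                      -- a * 1 = a
      Absorbed ((), a)         = Absorbed a
      Absorbed (a, Void)       = Void                            -- a * 0 = 0
      Absorbed (Void, a)       = Void
      Absorbed (Either a Void) = Absorbed a                      -- a + 0 = a
      Absorbed (Either Void a) = Absorbed a
      Absorbed (Either a b)    = Either (Absorbed a) (Absorbed b)
      Absorbed (a, b)          = (Absorbed a, Absorbed b)
      Absorbed a               = a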

The Associators are a little simpler. You basically just look for the wrong association pattern and call plus_assoc or mul_assoc until they don’t occur anymore, then recurse. We can be assured we’re always making progress if we either switch some association structure or recurse into subparts.
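With the associativity isos from earlier, a single reassociation step to the right is just the associator read backwards; for example:

    import Control.Lens (Iso', from)

    reassocRight :: Iso' (Either (Either a b) c) (Either a (Either b c))
    reassocRight = from plus_assoc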

Finally, the SortTerm routine. SortTerm is a bubble sort. The typeclass Bubble does a single sweep of swapping down the type-level list-like structure we’ve built. SortTerm uses the Sorted type family to check if it is finished. If it isn’t, it calls Bubble again.