Linear Algebra of Types

It gives my brain a pleasant thrum to learn new mathematics which mimics the algebra I learned in middle school. Basically this means that the new system has operations with properties that match those of regular numbers as much as possible. Two pretty important operations are addition and multiplication with the properties of distributivity and associativity. Roughly this corresponds to the mathematical notion of a Semiring.

Some examples of semirings include 

  • Regular multiplication and addition
  • And-Or
  • Min-plus 
  • Matrices.
  • Types

I have written before about how types also form a semiring, using Either for plus and (,) for times. These constructions don’t obey distributivity or associativity “on the nose”, but instead are isomorphic to the rearranged type, which when you squint is pretty similar to equality.

Matrices are grids of numbers which multiply by “row times column”. You can form matrices out of other semirings besides just numbers. One somewhat trivial but interesting example is block matrices, where the elements of the matrix itself are also matrices. Another interesting example is that of relations, which can be thought of as matrices of boolean values. Matrix multiplication using the And-Or semiring on the elements corresponds to relational composition.

What if we put our type peanut butter in our matrix chocolate and consider matrices of types, using the Either(,) semiring?

The simplest implementation to show how this could go can be made using the naive list based implementation of vectors and matrices. We can directly lift this representation to the typelevel and the appropriate value-level functions to type families.

type a :*: b = (a,b)
type a :+: b = Either a b

type family Dot v v' where
    Dot '[x] '[y] = x :*: y 
    Dot (x : xs) (y : ys) = (x :*: y) :+: (Dot xs ys)

type family MVMult m v where
    MVMult '[r] v = '[Dot r v]
    MVMult (r : rs) v = (Dot r v) : (MVMult rs v)

type family VMMult m v where
    VMMult v ' = '[Dot v c]
    VMMult v (c : cs) = (Dot v c) : (VMMult v cs)

type family MMMult' m m' where
    MMMult' '[r] cs = '[VMMult r cs]
    MMMult' (r : rs) cs = (VMMult r cs) : (MMMult' rs cs)

type family MMMult m m' where
    MMMult m m' = MMMult' m (Transpose m')

type family Transpose m where
    Transpose ((r1 : rs') : rs) = (r1 : (Heads rs)) : (Conss rs' (Transpose (Tails rs)))
    Transpose '[] = '[]

-- some mapped helper functions
-- verrrrrry ugly. Eh. Get 'er dun
type family Heads v where
    Heads ((v : vs) : xs) = v : (Heads xs)
    Heads '[] = '[]
type family Tails v where
    Tails ((v : vs) : xs) = vs : (Tails xs)
    Tails '[] = '[]
type family Conss v vs where
    Conss (x : xs) (y : ys) = (x : y) : (Conss xs ys)
    Conss '[] '[] = '[]

type family Index v i where
    Index (x : xs) 0 = x
    Index (x : xs) n = Index xs (n-1)

This was just for demonstration purposes. It is not my favorite representation of vectors. You can lift a large fraction of possible ways to encode vector spaces at the value level up to the type level, such as the linear package, or using dual vectors type V2 a = a -> a -> a. Perhaps more on that another day.

What is the point?

Ok. That’s kind of neat, but why do it? Well, one way to seek an answer to that question is to ask “what are matrices useful for anyway?”

One thing they can do is describe transition systems. You can write down a matrix whose entire a_{ij} describes something about the transition from state i to state j. For example the entry could be:

  • The cost of getting from i to j (min-plus gives shortest path),
  • The count of ways to get from i to j (combinatorics of paths)
  • The connectivity of the system from i to j using boolean values and the and-or semiring
  • The probability of transition from i to j
  • The quantum amplitude of going from i to j if we’re feeling saucy.

If we form a matrix describing a single time step, then multiplying this matrix by itself gives 2 time steps and so on.

Lifting this notion to types, we can build a type exactly representing all the possible paths from state i to j.

Concretely, consider the following humorously bleak transition system: You are going between home and work. Every 1 hour period you can make a choice to do a home activity, commute, or work. There are different options of activities at each.

data Commute = Drive
data Home = Sleep | Eat
data Work = TPSReport | Bitch | Moan

This is described by the following transition diagram


The transitions are described by the following matrix.type:

type T = '[ '[Home    ,  Commute ],  
            '[Commute ,  Work    ]]

What is the data type that describe all possible 4-hour day? You’ll find the appropriate data types in the following matrix.

type FourHour = MMMult T (MMMult T (MMMult T T))

Now, time to come clean. I don’t think this is necessarily the best way to go about this problem. There are alternative ways of representing it.

Here are two data types that describe an indefinite numbers of transition steps.

data HomeChoice = StayHome Home HomeChoice | GoWork Commute WorkChoice
data WorkChoice = StayWork Work WorkChoice | GoHome Commute HomeChoice

Another style would hold the current state as a type parameter in the type using a GADT.

data Path state where   
   StayWork :: Work -> Path Work -> Path Work
   CommuteHome :: Commute -> Path Home ->  Path Work
   StayHome :: Home -> Path Home -> Path Home
   CommuteWork :: Commute -> Path Work ->  Path Home

We could construct types that are to the above types as Vec n is to [] by including an explicit step size parameter.

Still, food for thought.

Further Thoughts

The reason i was even thinking about this is because we can lift the above construction to perform a linear algebra of vectors spaces. And I mean the spaces, not the vectors themselves. This is a confusing point.

Vector spaces have also have two natural operations on them that act like addition and multiplication, the direct sum and kronecker product. These operations do form a semiring, although again not on the nose.

This is connected to the above algebra of types picture by considering the index types of these vector spaces. The simplest way to denote this in Haskell is using the free vector space construction as shown in this Dan Piponi post. The Kronecker product makes tuples of the indices and the direct sum has an index that is the Either of the original index types.

type Vec b r = [(b, a)]
-- Example 2D vector space type
type V2D = Vec Bool Double

This is by far not the only way to go about it. We can also consider using the Compose-Product semiring on functors (Compose is Kron, Product is DSum) to get a more index-free kind of feel and work with dense vectors.

Going down this road (plus a couple layers of mathematical sophistication) leads to a set of concepts known as 2Vect. Dan Roberts and James Vicary produced a Mathematica package for 2Vect which is basically incomprehensible to me. It seems to me that typed functional programming is a more appropriate venue for something of this kind of pursuit, given how evocative/ well modeled by category theory it can be. These mathematical ideas are applicable to describing anyonic vector spaces. See my previous post below. It is not a coincidence that the Path data type above is so similar to FibTree data type. The root type variable takes the place of the work/home state, and the tuple structure take the place of a Vec-like size parameter n .

More to on this to come probably as I figure out how to explain it cleanly.

Edit: WordPress, your weird formatting is killing me.

Edit: Hoo Boy. This is why we write blog posts. Some relevant material was pointed out to me that I was not aware of. Thanks @DrEigenbastard.

Relational Algebra with Fancy Types

Last time, I tried to give a primer of relations and relational algebra using the Haskell type type Rel a b = [(a,b)]. In this post we’re going to look at these ideas from a slightly different angle. Instead of encoding relations using value level sets, we’ll encode relations in the type system. The Algebra of Programming Agda repo and the papers quoted therein are very relevant, so if you’re comfortable wading into those waters, give them a look. You can find my repo for fiddling here

At this point, depending on what you’ve seen before, you’re either thinking “Yeah, sure. That’s a thing.” or you’re thinking “How and why the hell would you do such a ridiculous thing.”

Most of this post will be about how, so let’s address why first:

  1. Examining relations in this style illuminates some constructions that appear around the Haskell ecosystem, particularly some peculiar fellows in the profunctor package.
  2. More syntactic approach to relations allows discussion of larger/infinite domains. The finite enumerations of the previous post is nice for simplicity, but it seems you can’t get far that way.
  3. Mostly because we can – It’s a fun game. Maybe a useful one? TBD.

With that out of the way, let’s go on to how.

Translating Functions to Relation GADTs

We will be using some Haskell extensions in this post, at the very least GADTs and DataKinds. For an introduction to GADTs and DataKinds, check out this blog post. DataKinds is an extension that reflects every data constructor of data types to a type constructor. Because there are values True and False there are corresponding types created'True and 'False. GADTs is an extension of the type definition mechanism of standard Haskell. They allow you to declare refined types for the constructors of your data and they infer those refined type when you pattern match out of the data as well, such that the whole process is kind of information preserving.

We will use the GADT extension to define relational datatypes with the kind

a -> b -> *
. That way it has a slot a for the “input” and b for the “output” of the relation. What will goes in these type slots will be DataKind lifted types like 'True, not ordinary Haskell types like Int. This is a divergence from from the uses of similar kinds you see in Category, Profunctor, or Arrow. We’re doing a more typelevel flavored thing than you’ll see in those libraries. What we’re doing is clearly a close brother of the singleton approach to dependently typed programming in Haskell.

Some examples are in order for what I mean. Here are two simple boolean functions, not and and defined in ordinary Haskell functions, and their equivalent GADT relation data type.

not True = False
not False = True

data Not a b where
    NotTF :: Not 'True 'False
    NotFT :: Not 'False 'True

and True True = True
and False _ = False
and _ False = False

data And a b where
    AndTT :: And '( 'True, 'True) 'True
    AndFU :: And '( 'False, a) 'False
    AndUF :: And '( a, 'False) 'False

You can already start to see how mechanical the correspondence between the ordinary function definition and our new fancy relation type. A function is often defined via cases. Each case corresponds to a new constructor of the relation and each pattern that occurs in that case is the pattern that appears in the GADT. Multiple arguments to the relations are encoded by uncurrying everything by default.

Any function calls that occur on the right hand side of a function definition becomes fields in the constructor of our relation. This includes recursive calls and external function calls. Here are some examples with a Peano style natural number data type.

data Nat = S Nat | Z

plus Z x = x
plus (S x) y = S (plus x y)

data Plus a b where
    PZ :: Plus '( 'Z, a) a
    PS :: Plus '( a,b) c -> Plus '( 'S a, b) c 

We can also define things that aren’t functions. Relations are a larger class of things than functions are, which is part of their utility. Here is a “less than equal” relation LTE.

data LTE a b where
    LTERefl :: LTE n n
    LTESucc :: LTE n m -> LTE n ('S m)

You can show that elements are in a particular relation by finding a value of that relational type. Is ([4,7], 11) in the relation Plus? Yes, and I can show it with with the value PS (PS (PS (PS PZ))) :: Plus (4,7) 11 . This is very much the Curry-Howard correspondence. The type R a b corresponds to the proposition/question is (a,b) \in R .

The Fun Stuff : Relational Combinators

While you need to build some primitive relations using new data type definitions, others can be built using relational combinators. If you avoid defining too many primitive relations like the above and build them out of combinators, you expose a rich high level manipulation algebra. Otherwise you are stuck in the pattern matching dreck. We are traveling down the same road we did in the previous post, so look there for less confusing explanations of the relational underpinnings of these constructions, or better yet some of the references below.

Higher order relational operators take in a type parameters of kind

a -> b -> *
and produce new types of a similar kind. The types appearing in these combinators is the AST of our relational algebra language.

The first two combinators of interest is the composition operator and the identity relation. An element (a,c) is in R \cdot Q if there exists a b such that (a,b) \in R and (b,c) \in Q. The fairly direct translation of this into a type is

{- rcompose :: Rel b c -> Rel a b -> Rel a c  -}

data RCompose k1 k2 a c where
   RCompose :: k1 b c -> k2 a b -> RCompose k1 k2 a c

type k <<< k' = RCompose k k' 
type k >>> k' = RCompose k' k

The type of the composition is the same as that of Profunctor composition found in the profunctors package.

type RCompose = Procompose

Alongside a composition operator, it is a knee jerk to look for an identity relation and we do have one

data Id a b where
   Refl :: Id a a

-- monomorphic identity. Leave this out?
data IdBool a b where
  ReflTrue :: IdBool 'True 'True
  ReflFalse :: IdBool 'False 'False

This is also a familiar friend. The identity relation in this language is the Equality type.

-- identity function is the same as Equality
type Id a b = Id (a :~: b)

We can build an algebra for handling product and sum types by defining the appropriate relational combinators. These are very similar to the combinators in the Control.Arrow package.

-- Product types

data Fan k k' a b where
    Fan :: k a b -> k' a c -> Fan k k' a '(b,c)

type k &&& k' = Fan k k'

data Fst a b where
    Prj1 :: Fst '(a, b) a

data Snd a b where
    Prj2 :: Snd '(a, b) b

-- Sum type

data Split k k' a b where
    CaseLeft :: k a c -> Split k k' ('Left a) c
    CaseRight :: k' b c -> Split k k' ('Right b) c

type k ||| k' = Split k k'

data Inj1 a b where
    Inj1 :: Inj1 a ('Left a)
data Inj2 a b where
    Inj2 :: Inj2 a ('Right a)

-- some derived combinators
type Par f g = Fan (f <<< Fst) (g <<< Snd)
type Dup  = Fan Id Id
type Swap = Fan Snd Fst

The converse of relations is very interesting operation and is the point where relations really differ from functions. Inverting a function is tough. Conversing a relation always works. This data type has no analog in profunctor to my knowledge and probably shouldn't.

data RConverse k a b where
    RConverse :: k a b -> RConverse k b a
-- Shorter synonym
type RCon = RConverse

Relations do not have a notion of currying. The closest thing they have is

data Trans k a b where
    Trans :: k '(a,b) c -> Trans k a '(b,c)

Lattice Operators

For my purposes, lattices are descriptions of sets that trade away descriptive power for efficiency. So most operations you'd perform on sets have an analog in the lattice structure, but it isn't a perfect matching and you're forced into approximation. It is nice to have the way you perform these approximation be principled, so that you can know at the end of your analysis whether you've actually really shown anything or not about the actual sets in question.

? No. No... Yes? Oh. OH! IT IS!

The top relation holds all values. This is represented by making no conditions on the type parameters. They are completely phantom.

newtype Top a b = Top ()

Bottom is a relation with no inhabitants.

newtype Bottom a b = Bottom Void

The meet is basically the intersection of the relations, the join is basically the union.

newtype RMeet k k' a b = RMeet (k a b, k' a b)
type k /\ k' = RMeet k k'  

newtype RJoin k k' a b = RJoin (Either (k a b) (k' a b))
type k \/ k' = RJoin k k' 

A Lattice has an order on it. This order is given by relational inclusion. This is the same as the :-> combinator can be found in the profunctors package.

type (:->) p q = forall a b. p a b -> q a b
type RSub p q = p :-> q

Relational equality can be written as back and forth inclusion, a natural isomorphism between the relations. There is also an interesting indirect form.

data REq k k' = REq {to' :: k :-> k', from' :: k' :-> k }

Relational Division

If we consider the equation (r <<< p) :-> q with p and q given, in what sense is there a solution for r? By analogy, this looks rather like r*p = q, so we're asking a kind of division question. Well, unfortunately, this equation may not necessarily have a solution (neither do linear algebraic equations for that matter), but we can ask for the best under approximation instead. This is the operation of relational division. It also appears in the profunctor package as the right Kan Extension. You'll also find the universal property of the right division under the name curryRan and uncurryRan in that module.

newtype Ran p q a b = Ran { runRan :: forall x. p x a -> q x b }
type RDiv = Ran

One formulation of Galois connections can be found in the adjunctions file. Galois Connections are very slick, but I'm running out of steam, so let's leave that one for another day.

Properties and Proofs

We can prove many properties about these relational operations. Here a a random smattering that we showed using quickcheck last time.

prop_ridleft ::  (k <<< Id) :-> k
prop_ridleft (RCompose k IdRefl) = k

prop_ridright ::  (Id <<< k) :-> k
prop_ridright (RCompose IdRefl k) = k

prop_meet :: p /\ q :-> p
prop_meet (RMeet (p, q)) = p

prop_join :: p :-> p \/ q
prop_join p = RJoin (Left p)

meet_assoc :: RMeet k (RMeet k' k'') a b -> RMeet (RMeet k k') k'' a b
meet_assoc (RMeet (k, (RMeet (k',k'')))) = RMeet (RMeet (k,k'), k'')

prop_top :: k :-> Top
prop_top _ = top

prop_bottom :: Bottom :-> k
prop_bottom (Bottom x) = absurd x

bottom_compose :: REq (k <<< Bottom) Bottom
bottom_compose = REq (\(RCompose k (Bottom b)) -> absurd b) prop_bottom

data Iso a b = Iso {to :: a -> b, from :: b -> a}
type a <-> b = Iso a b

meet_universal :: (p ::-> RMeet k k') <-> (p ::-> k, p ::-> k')
meet_universal = Iso to from where
    to (RSub f) = (RSub $ \p -> case f p of RMeet (k,k') -> k  , RSub $ \p -> case f p of RMeet (k,k') -> k')
    from (RSub f,RSub g) = RSub $ \p -> RMeet (f p, g p) 

prop_con :: RCon (RCon k) :-> k
prop_con (RConverse (RConverse k)) = k

Odds and Ends

  • Recursion Schemes - Recursion schemes are a methodology to talk about recursion in a point free style and where the rubber meets the road in the algebra of programming. Here is an excellent series of articles about them. Here is a sample of how I think they go:
data MapMaybe k a b where
    MapJust :: k a b -> MapMaybe k ('Just a) ('Just b)
    MapNothing :: MapMaybe k 'Nothing 'Nothing

data Cata map k a b where
    Cata :: k fa a -> map (Cata map k) x fa  -> Cata map k ('Fix x) 
  • Higher Order Relations?
  • Examples of use. Check out the examples folder in the AoP Agda repo. These are probably translatable into Haskell.
  • Interfacing with Singletons. Singletonized functions are a specialized case or relations. Something like?
  •  newtype SFun a b = SFun (Sing a -> Sing b) 
  • A comment to help avoid confusion. What we've done here feels confusingly similar to profunctor, but it is in fact distinct I think. Profunctors are described as a categorical generalization of relations , but to be honest, I kind of don't get it. Despite many of our constructions appearing in the profunctor package, the profunctor typeclass itself appears to not play a role in our formulation. There just isn't a good way to dimap under our relations as written, unless you construct free profunctors. Converse at the least is a wrench in the works.
  • Star and graphs. Transition relations are a powerful methodology. A transition relation is in some respects the analog of a square matrix. We can iteratively compose it with itself.
-- Check out "term rewriting and all that"
-- This is also the reflection without remorse data type
-- TSequence
-- this is also a free instance of Category
data Star k a b where
    Done :: Star k a a
    Roll :: k b c -> Star k a b -> Star k a c

data KPlus k a b where
    PDone :: k a b -> KPlus k a b
    PRoll :: k b c -> KPlus k a b -> KPlus k a c

type SymClos k a b = RJoin k (RCon k) a b
type RefClos k a b = RJoin k Id a b
{- n-fold composition -}
-- similar to Fin.
-- This is also the Vec n is to list and this is to reflection without remorse. Kind of interesting
data NFold n k a b where
    One :: k a b -> NFold ('S n) k a b
    More :: k b c -> NFold n k a b -> NFold ('S n) k a b