For my birthday, we made a new movie and built some robots and went to see Star Wars. Quite a day!

I am pleased to present: The Wassenroid

## Our Latest Film Oeuvre: The Wassenroid

## Solving the Laplace Equations with Linear Relations

### Bits and Bobbles

## Sum of Squares optimization for Minimax Optimal Differential Eq Residuals

## Categorical LQR Control with Linear Relations

### Bits and Bobbles

### References

## Linear Relation Algebra of Circuits with HMatrix

### Circuits

### Bits and Bobbles

#### Refs

## Failing to Bound Kissing Numbers

## Learn Coq in Y

## Neural Networks with Weighty Lenses (DiOptics?)

### Bits and Bobbles

## Gröbner Bases and Optics

## Functors, Vectors, and Quantum Circuits

## The Algebra of Functors

### Vector Spaces as Shape

#### What do linear operators look like?

## Monoidal Categories

**Why are they called Monoidal?**

**String diagrams**

### Bits and Bobbles

### References

## Appendix

### Representable/Naperian Functors

**Monoidal Products on Hask**

### Monoidal Combinators of Functors


The Laplace equation is ubiquitous in physics and engineering.

$\nabla^2 \phi = \partial_x^2 \phi + \partial_y^2 \phi = 0$

It and slight variants of it describe electrostatics, magnetostatics, steady-state heat flow, elastic flex, pressure, and velocity potentials.

There are a couple reasons for that.

- It results from minimizing the squared gradient of a field which can make sense from an energy minimization perspective.
- Similarly it results from the combination of a flow conservation law and a linear constitutive relation connecting flow and field (such as Ohm’s law, Fick’s law, or Hooke’s law).
- It also gets used even if not particularly appropriate because we know how to mathematically deal with it, for example in image processing.

There are a couple of questions we may want to ask about a Laplace equation system

- Given the field on the boundary, determine the field in the interior (the Dirichlet problem).
- Given the normal derivative of the field on the boundary, determine the field in the interior (the Neumann problem).
- Given sources in the interior and a zero boundary condition, determine the field. The Laplace equation is called the Poisson equation when you allow a source term on the right-hand side.
- Given the field at the boundary, determine the derivative at the boundary. This is the Dirichlet-to-Neumann map, or Poincare-Steklov operator.

Given the Dirichlet to Neumann map, you do not have to consider the interior of a region to use it. The Dirichlet to Neumann map is sort of the same thing as an effective resistance or scattering matrix. It gives you a black box representation of a region based solely on the variables at its boundary.

This linear relation algebra is useful for many things that I’d have considered a use case for the Schur complement. The Schur complement arises when you do Gaussian elimination on a blocked matrix. It is good that this pattern has a name, because once you know about it, you’ll see it in many places: domain decomposition, marginalized Gaussian distributions, low-rank updates, scattering matrices.
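As a tiny sanity check of that connection, here is a quick numeric sketch of my own (in Python, not the post's Haskell; the function name `series_via_schur` is made up): eliminating the interior node of two resistors in series via a Schur complement of the graph Laplacian should recover the familiar effective resistance $r_1 + r_2$.

```python
# Schur complement of a resistor chain: eliminating the interior node of
# node0 -r1- node1 -r2- node2 should give effective resistance r1 + r2.
def series_via_schur(r1, r2):
    g1, g2 = 1.0 / r1, 1.0 / r2  # conductances
    # Graph Laplacian blocks: boundary nodes {0, 2}, interior node {1}.
    L_bb = [[g1, 0.0], [0.0, g2]]
    L_bi = [-g1, -g2]   # boundary-to-interior coupling (a column here)
    L_ii = g1 + g2      # interior block (a scalar here)
    # Schur complement S = L_bb - L_bi * L_ii^{-1} * L_ib
    S = [[L_bb[i][j] - L_bi[i] * L_bi[j] / L_ii for j in range(2)]
         for i in range(2)]
    # S = g_eff * [[1, -1], [-1, 1]]; read off the effective conductance.
    return 1.0 / S[0][0]

print(series_via_schur(20.0, 30.0))  # approximately 50.0
```

The Schur complement here is exactly the Dirichlet-to-Neumann map of the chain: a black box relating boundary currents to boundary voltages with the interior projected out.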

By composing the linear relations corresponding to the Dirichlet-to-Neumann relations of regions, we can build the Dirichlet-to-Neumann relations of larger regions.

To make this more concrete, let us take the example of electrical circuits like before. A grid of resistors is a finite difference approximation to the continuous problem

- The electric field is the gradient of the potential: $E = -\nabla \phi$
- Continuum Ohm’s law: $J = \sigma E$
- Current conservation: $\nabla \cdot J = 0$
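To see the finite difference picture in the smallest possible setting, here is an illustrative Python sketch of my own (not the post's Haskell combinators): the discrete Laplace equation in 1D says each interior value is the average of its neighbors, so Jacobi relaxation with fixed Dirichlet endpoints converges to a straight line.

```python
# 1D Dirichlet problem by Jacobi iteration. The discrete Laplace equation
# phi[i] = (phi[i-1] + phi[i+1]) / 2 relaxes to linear interpolation
# between the fixed boundary values.
def laplace_1d(left, right, n, iters=1000):
    phi = [0.0] * n
    phi[0], phi[-1] = left, right
    for _ in range(iters):
        # Jacobi: build the new grid entirely from the old one.
        phi = [phi[0]] \
            + [(phi[i - 1] + phi[i + 1]) / 2 for i in range(1, n - 1)] \
            + [phi[-1]]
    return phi

phi = laplace_1d(0.0, 1.0, 5)
print(phi)  # approximately [0.0, 0.25, 0.5, 0.75, 1.0]
```

A grid of equal resistors discretizes the same averaging property: Kirchhoff's current law at each node forces the node voltage to be the average of its neighbors.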

In this post, I mentioned how you can make reasonable 2-dimensional diagrams out of a monoidal category by sort of arbitrarily flipping one wire up and one wire down, as in the diagram below. This defines a horizontal and a vertical composition, which have to do the required book-keeping (associations) to keep an arrow in canonical form. I had considered this as a way of managing the weights in neural networks, but it is way more natural as the actual geometrical layout of a finite difference grid for a Laplace equation.

So we can reuse our categorical circuit combinators to build a finite difference Laplace equation.

This can be implemented in Haskell doing the following. Neato.

- Not the tightest post, but I needed to get it out there. I have a short attention span.
- Homology and defining simplices as categories. One way of describing Homology is about linear operators that are analogues of finite difference operators (or better yet, discrete differential geometry operators / exterior derivatives). To some degree, it is analyzing the required boundary conditions to fully define differential equations on weirdo topological surfaces, which correspond to geometrical loops/holes. You can figure this out by looking at subspaces and quotients of the difference operators. Here we have here a very category theory way of looking at partial differential equations. How does it all connect?
- Continuous circuit models – https://en.wikipedia.org/wiki/Distributed-element_model Telegrapher’s equation is classic example.
- Cody mentioned that I could actually build circuits and measure categorical identities in a sense. That’s kind of cool. Or I could draw conductive ink on carbon paper and actually make my string diagrams into circuits? That is also brain tickling.
- Network circuits
- I really want to get coefficients that aren’t just doubles. allowing rational functions of a frequency would allow analysis of capacitor/inductor circuits, but also tight binding model systems for fun things like topological insulators and the Haldane model http://www.philipzucker.com/topologically-non-trivial-circuit-making-haldane-model-gyrator/ . I may need to leave Haskell. I’m not seeing quite the functionality I need. Use Sympy? https://arxiv.org/abs/1605.02532 HLinear. Flint bindings for haskell? Looks unmaintained. Could also use a grobner basis package as dynamite for a mouse.
- This is relevant for the boundary element method. Some really cool other stuff relevant here. http://people.maths.ox.ac.uk/martinsson/2014_CBMS/

Blah Blah blah: The subtlest aspect of differential equations is that of boundary conditions. It is more correct usually to consider the system of the interior differential equation and the boundary conditions as equally essential parts of the statement of the problem you are considering.

Huh. This doesn’t embed very well. Maybe you’re better off just clicking into the thing. It’s nice not to let things rot too long though. *shrug*

Other ideas: Can I not come up with some scheme to use Sum of Squares for rigorous upper and lower bound regions like in https://github.com/JuliaIntervals/TaylorModels.jl ? Maybe a next post.

Last time, I described and built an interesting expansion of describing systems via linear functions, that of using linear relations. We remove the requirement that mappings be functional (exactly one output vector for any given input vector), which extends our descriptive capability. This is useful for describing “open” pieces of a linear system like electric circuits. This blog post is about another application: linear dynamical systems and control.

Linear relations are an excellent tool for reachability/controllability. In a control system, it is important to know where it is possible to get the system to. With linear dynamics $x_{t+1} = A x_t + B u_t$, an easy controllability question is “can my system reach into some subspace?”. This isn’t quite the most natural question physically, but it is natural mathematically. My first impulse would be to ask “given the state starts in this little region over here, can it get to this little region over there?”, but the subspace question is a bit easier to answer. It’s a little funky but it isn’t useless. It can detect if the control parameter only touches 1 of 2 independent dynamical systems, for example. We can write the equations of motion as a linear relation and iterate composition on them. See Erbele for more.
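That "control only touches 1 of 2 independent systems" situation can be seen in a toy Krylov-style check (my own illustrative Python, not the post's linear-relation machinery): iterate the dynamics on the control direction and see which coordinates ever get excited.

```python
# Reachability sketch: two decoupled scalar systems, with the control
# entering only the first coordinate. The controllability vectors
# B, A B, A^2 B, ... all stay in span{e1}, so the second system is
# unreachable from the control input.
def krylov(A, B, n):
    vecs = [B]
    for _ in range(n - 1):
        v = vecs[-1]
        vecs.append([sum(A[i][j] * v[j] for j in range(n)) for i in range(n)])
    return vecs

A = [[1.0, 0.0], [0.0, 2.0]]  # block diagonal: independent dynamics
B = [1.0, 0.0]                # control only touches the first coordinate
vecs = krylov(A, B, 2)
print(all(v[1] == 0.0 for v in vecs))  # True: second coordinate never excited
```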

There is a set of different things (and kind of more interesting things) that are under the purview of linear control theory: LQR controllers and Kalman filters. The LQR controller and Kalman filter are roughly (exactly?) the same thing mathematically. At an abstract mathematical level, they both rely on the fact that optimization of a quadratic objective with linear constraints is solvable in closed form via linear algebra. The cost of the LQR controller could be the squared distance to some goal position, for example. When you optimize a function, you set the derivative to 0. This is a linear equation for quadratic objectives. It is a useful framework because it has such powerful computational teeth.

The Kalman filter does a similar thing, except for the problem of state estimation. There are measurements linearly related to the state that you want to match with the prior knowledge of the linear dynamics of the system and the expected errors of measurement and environment perturbations to the dynamics.

There are a couple different related species of these filters. We could consider discrete time or continuous time. We can also consider infinite horizon or finite horizon. I feel that discrete time and finite horizon are the simplest and most fundamental, so that is what we’ll stick with. The infinite horizon and continuous time versions are limiting cases.

We can also consider the dynamics to be fixed for all time steps, or varying with time. Varying with time is useful for the approximation of nonlinear systems, where there are different good local approximations (Taylor expansions of the dynamics) depending on the current state.

There are a couple ways to solve a constrained quadratic optimization problem. In some sense, the most conceptually straightforward is just to solve for a span of the basis (the “V-Rep” of my previous post), plug that into the quadratic objective, and optimize over your now unconstrained problem. However, it is commonplace and elegant in many respects to use the Lagrange multiplier method. You introduce a new parameter $\lambda$ for every linear equality and add a term $\lambda^T (Ax - b)$ to the objective. Hypothetically this term is 0 for the constrained solution, so we haven’t changed the objective in a sense. It’s a funny sleight of hand. If you optimize over $x$ and $\lambda$, the equality constraint comes out as your gradient condition on $\lambda$. Via this method, we convert a linearly constrained quadratic optimization problem into an unconstrained quadratic optimization problem with more variables.

Despite feeling very abstract, the values these Lagrangian variables take on have an interpretation. They are the “cost” or “price” of the equality they correspond to. If you moved the constraint by an amount $\epsilon$, you would change the optimal cost by an amount $\lambda \epsilon$. (Pretty sure I have that right. Units check out. See Boyd https://www.youtube.com/watch?v=FJVmflArCXc)
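The price interpretation can be checked numerically in a minimal example of my own (a hypothetical Python sketch, not from the post): minimize $x^2 + y^2$ subject to $x + y = c$; the multiplier from the stationarity condition should match a finite-difference estimate of d(optimal cost)/dc.

```python
# Shadow-price check: minimize x^2 + y^2 subject to x + y = c.
# Closed form: x = y = c/2, optimal cost c^2/2, multiplier lambda = 2x = c.
def solve(c):
    x = y = c / 2.0
    cost = x * x + y * y
    lam = 2.0 * x  # stationarity condition: gradient 2x equals lambda
    return cost, lam

c = 3.0
eps = 1e-6
cost0, lam = solve(c)
sensitivity = (solve(c + eps)[0] - cost0) / eps  # d(optimal cost)/dc
print(lam, sensitivity)  # both approximately 3.0
```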

The Lagrange multipliers enforcing the linear dynamics are called the co-state variables. They represent the “price” that the dynamics cost, and are derivatives of the optimal value function V(s) (The best value that can be achieved from state s) that may be familiar to you from dynamic programming or reinforcement learning. See references below for more.

Let’s get down to some brass tacks. I’ve suppressed some terms for simplicity. You can also add offsets to the dynamics and costs.

A quadratic cost function with Lagrange multipliers, where Q is a cost matrix associated with the x state variables and R is a cost matrix for the u control variables:

$L = \sum_t (x_t^T Q x_t + u_t^T R u_t) + \sum_t \lambda_{t+1}^T (x_{t+1} - A x_t - B u_t) + \lambda_0^T (x_0 - x_{init})$

The equations of motion result from optimizing with respect to $\lambda$, by design:

$\partial_{\lambda_{t+1}} L = x_{t+1} - A x_t - B u_t = 0$

The initial conditions are enforced by the zeroth multiplier:

$\partial_{\lambda_0} L = x_0 - x_{init} = 0$

Differentiation with respect to the state $x_t$ has the interpretation of backwards iteration of the value derivative, somewhat related to what one finds in the Bellman equation:

$\partial_{x_t} L = 2 Q x_t + \lambda_t - A^T \lambda_{t+1} = 0$

The final condition on the value derivative is the one that makes it clearest that the Lagrange multiplier has the interpretation of the derivative of the value function, as it sets it to exactly that:

$\partial_{x_N} L = 2 Q x_N + \lambda_N = 0$

Finally, differentiation with respect to the control $u_t$ picks the best action given knowledge of the value function at that time step:

$\partial_{u_t} L = 2 R u_t - B^T \lambda_{t+1} = 0$

Ok. Let’s code it up using linear relations. Each of these conditions is a separate little conceptual box. We can build the optimal control step relation by combining these updates together.

The following is a bit sketchy. I am confident it makes sense, but I’m not confident that I have all of the signs and other details correct. I also still need to add the linear terms to the objective and offset terms to the dynamics. These details are all kind of actually important. Still, I think it’s nice to set down in preliminary blog form.
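For comparison, here is a hedged scalar sketch of my own (in Python, not the post's Haskell) of the same finite-horizon problem solved by the more traditional route: a backward Riccati/value-function recursion rather than composition of linear relations. The dynamics and cost values are made up for illustration.

```python
# Finite-horizon scalar LQR by backward Riccati recursion: dynamics
# x' = a*x + b*u, stage cost q*x^2 + r*u^2, value function V_t(x) = p_t*x^2.
def lqr_gains(a, b, q, r, horizon):
    p = q  # terminal value function V_N(x) = q * x^2
    gains = []
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)  # optimal feedback u = -k * x
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)  # value update
        gains.append(k)
    gains.reverse()  # gains[t] is the gain used at time step t
    return gains

gains = lqr_gains(a=1.0, b=1.0, q=1.0, r=1.0, horizon=50)
# Simulate: the feedback should drive the state toward zero.
x = 5.0
for k in gains:
    x = x - k * x  # a*x + b*u with u = -k*x and a = b = 1
print(abs(x) < 1e-3)  # True
```

The co-state story above is hiding in `p`: the derivative of the value function $V_t(x) = p_t x^2$ is what the multipliers track.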

- The code in context https://github.com/philzook58/ConvexCat/blob/master/src/LinRel.hs
- Some of the juicier stuff is nonlinear control. Gotta walk before we can run though. I have some suspicions that a streaming library may be useful here, or a lens-related approach. Also ADMM.
- Reminiscent of Keldysh contour. Probably a meaningless coincidence.
- In some sense H-Rep is the analog of (a -> b -> Bool) and V-rep [(a,b)]
- Note that the equations of motion are relational rather than functional for a control systems. The control parameters describe undetermined variables under our control.
- Loopback (trace) the value derivative for infinite horizon control.
- Can solve the Laplace equation with a Poincare Steklov operator. That should be my next post.
- There is something a touch unsatisfying even with the reduced goals of this post. Why did I have to write down the quadratic optimization problem and then hand translate it to linear relations? It’s kind of unacceptable. The quadratic optimization problem is really the most natural statement, but I haven’t figured out how to make it compositional. The last chapter of Rockafellar?

https://arxiv.org/abs/1405.6881 Baez and Erbele – Categories in Control

http://www.scholarpedia.org/article/Optimal_control

http://www.argmin.net/2016/05/18/mates-of-costate/ – interesting analogy to backpropagation. Lens connection?

https://courses.cs.washington.edu/courses/cse528/09sp/optimality_chapter.pdf

Oooh this is a fun one.

I’ve talked before about relation algebra and I think it is pretty neat. http://www.philipzucker.com/a-short-skinny-on-relations-towards-the-algebra-of-programming/. In that blog post, I used finite relations. In principle, they are simple to work with. We can perform relation algebra operations like composition, meet, and join by brute force enumeration.

Unfortunately, brute force may not always be an option. First off, the finite relations can grow so enormous as to make this infeasible. Secondly, it is not insane to talk about relations or regions with an infinite number of elements, such as some continuous blob in 2D space. In that case, we can’t even in principle enumerate all the points in the region. What are we to do? We need to develop some kind of finite parametrization of regions to manipulate. This parametrization basically can’t be complete in some sense, and we may choose more or less powerful systems of description for computational reasons.

In this post, we are going to be talking about linear or affine subspaces of a continuous space. These subspaces are hyperplanes. Linear subspaces have to go through the origin, while affine spaces can have an offset from the origin.

In the previous post, I mentioned that the finite relations formed a lattice, with operations meet and join. These operations were the same as set intersection and union so the introduction of the extra terminology meet and join felt a bit unwarranted. Now the meet and join aren’t union and intersection anymore. We have chosen to not have the capability to represent the union of two vectors, instead we can only represent the smallest subspace that contains them both, which is the union closed under vector addition. For example, the join of a line and point will be the plane that goes through both.
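A small illustrative example of that join, in a toy Python sketch of my own (not the post's machinery): the affine join of two points is the line through them, the smallest affine set containing both, and membership on the line can be checked with a 2D cross product.

```python
# Affine join of two points p1, p2: generator form x = p1 + l * (p2 - p1).
# A point is in the join iff its offset from p1 is parallel to the direction.
p1, p2 = (0.0, 0.0), (2.0, 4.0)
direction = (p2[0] - p1[0], p2[1] - p1[1])

def on_join(pt, tol=1e-9):
    dx, dy = pt[0] - p1[0], pt[1] - p1[1]
    # 2D cross product vanishes exactly when (dx, dy) is parallel to direction.
    return abs(dx * direction[1] - dy * direction[0]) <= tol

print(on_join((1.0, 2.0)), on_join((1.0, 3.0)))  # True False
```

The midpoint of the two points lies on the join, but so does every affine combination, including ones outside the segment; that is the "closure" that distinguishes join from set union.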

Linear/affine stuff is great because it is so computational. Most questions you want to ask are answerable by readily available numerical linear algebra packages. In this case, we’ll use the Haskell package HMatrix, which is something like a numpy/scipy equivalent for Haskell. We’re going to use type-level indices to denote the sizes and partitioning of these spaces, so we’ll need some helper functions.

In case I miss any extensions, make typos, etc, you can find a complete compiling version here https://github.com/philzook58/ConvexCat/blob/master/src/LinRel.hs

```haskell
type BEnum a = (Enum a, Bounded a) -- cardinality. `size` was already taken by HMatrix :(

card :: forall a. (BEnum a) => Int
card = (fromEnum (maxBound @a)) - (fromEnum (minBound @a)) + 1
```

In analogy with sets of tuples for defining finite relations, we partition the components of the linear spaces into “input” and “output” indices/variables. This partition is somewhat arbitrary and easily moved around, but the weakening of strict notions of input and output as compared to functions is the source of the greater descriptive power of relations.

Relations are extensions of functions, so linear relations are an extension of linear maps. A linear map has the form $y = Ax$. A linear relation has the form $Ax + By = 0$. An affine map has the form $y = Ax + b$ and an affine relation has the form $Ax + By = c$.

There are at least two useful concrete representations for subspaces.

- We can write down a matrix $A$ and vector $b$ corresponding to affine constraints: $Ax = b$. The subspace described is the nullspace of $A$ plus a particular solution of the equation. The rows of $A$ are orthogonal to the space.
- We can hold onto generators of the subspace: $x = Al + b$, where $l$ parametrizes the subspace. In other words, the subspace is generated by / is the span of the columns of $A$. It is the range of $A$, translated by $b$.

We’ll call these two representations the H-Rep and V-Rep, borrowing terminology from similar representations of polytopes (describing a polytope by the inequalities that define its faces, or as the convex combination of its vertices). https://inf.ethz.ch/personal/fukudak/lect/pclect/notes2015/PolyComp2015.pdf These two representations are dual in many respects.

```haskell
-- HLinRel holds A x = b constraint
data HLinRel a b = HLinRel (Matrix Double) (Vector Double) deriving Show

-- x = A l + b. Generator constraint.
data VLinRel a b = VLinRel (Matrix Double) (Vector Double) deriving Show
```

It is useful to have both reps and interconversion routines, because different operations are easy in the two representations. Any operation defined on one can be defined on the other by sandwiching it between these conversion functions. Hence, we basically only need to define operations for one of the reps (if we don’t care too much about efficiency loss, which, fair warning, is out the window for today). The bulk of computation will actually be performed by these interconversion routines. The HMatrix function `nullspace` performs an SVD under the hood and gathers up the space with 0 singular values.

```haskell
-- if A x = b then x is in the nullspace + a vector b' solves the equation
h2v :: HLinRel a b -> VLinRel a b
h2v (HLinRel a b) = VLinRel a' b' where
    b' = a <\> b -- least squares solution
    a' = nullspace a

-- if x = A l + b, then A' . x = A' A l + A' b = A' b because A' A = 0
v2h :: VLinRel a b -> HLinRel a b
v2h (VLinRel a' b') = HLinRel a b where
    b = a #> b' -- matrix multiply
    a = tr $ nullspace (tr a') -- orthogonal space to range of a.
-- tr is transpose and not trace? A little bit odd, HMatrix.
```

These linear relations form a category. I’m not using the Category typeclass because I need BEnum constraints hanging around. The identity relation is $x = y$, aka $Ix - Iy = 0$.

```haskell
hid :: forall a. BEnum a => HLinRel a a
hid = HLinRel (i ||| (- i)) (vzero s) where
    s = card @a
    i = ident s
```

Composing relations is done by combining the constraints of the two relations and then projecting out the interior variables. Taking the conjunction of constraints is easiest in the H-Rep, where we just need to vertically stack the individual constraints. Projection is easily done in the V-Rep, where you just need to drop the appropriate section of the generator vectors. So we implement this operation by flipping between the two.

```haskell
hcompose :: forall a b c. (BEnum a, BEnum b, BEnum c) => HLinRel b c -> HLinRel a b -> HLinRel a c
hcompose (HLinRel m b) (HLinRel m' b') =
    let a'' = fromBlocks [[ ma', mb', 0  ],
                          [ 0  , mb , mc ]] in
    let b'' = vjoin [b', b] in
    let (VLinRel q p) = h2v (HLinRel a'' b'') in -- kind of a misuse
    let q' = (takeRows ca q) -- drop rows belonging to @b
             ===
             (dropRows (ca + cb) q) in
    let [x,y,z] = takesV [ca,cb,cc] p in
    let p' = vjoin [x,z] in -- rebuild without rows for @b
    v2h (VLinRel q' p') -- reconstruct HLinRel
  where
    ca = card @a
    cb = card @b
    cc = card @c
    sb = size b -- number of constraints in first relation
    sb' = size b' -- number of constraints in second relation
    ma' = takeColumns ca m'
    mb' = dropColumns ca m'
    mb = takeColumns cb m
    mc = dropColumns cb m

(<<<) :: forall a b c. (BEnum a, BEnum b, BEnum c) => HLinRel b c -> HLinRel a b -> HLinRel a c
(<<<) = hcompose
```
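The essence of that composition can be seen in a scalar toy model of my own (a hypothetical Python sketch, not the post's code): stack the constraints of the relations {y = 2x} and {z = 3y} over the combined variables (x, y, z), then eliminate the shared variable y; what remains is the composite relation {z = 6x}.

```python
# Composing linear relations by stacking constraints and eliminating the
# middle variable. Constraints over (x, y, z):
c1 = [2.0, -1.0, 0.0]   # 2x - y = 0, i.e. y = 2x
c2 = [0.0, 3.0, -1.0]   # 3y - z = 0, i.e. z = 3y
# Eliminate y: take the combination 3*c1 + c2 so the y-coefficients cancel.
composed = [3.0 * a + b for a, b in zip(c1, c2)]
print(composed)  # [6.0, 0.0, -1.0], i.e. 6x - z = 0
```

This is exactly the "stack, then project out the interior variables" recipe; in the general matrix case the projection is done numerically, via the nullspace computation, rather than by hand-picked combinations.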

We can implement the general cadre of relation operators, meet, join, converse. I feel the converse is the most relational thing of all. It makes inverting a function nearly a no-op.

```haskell
hjoin :: HLinRel a b -> HLinRel a b -> HLinRel a b
hjoin v w = v2h $ vjoin' (h2v v) (h2v w)

-- hmatrix took vjoin from me :(
-- joining means combining generators and adding a new generator
-- Closed under affine combination l * x1 + (1 - l) * x2
vjoin' :: VLinRel a b -> VLinRel a b -> VLinRel a b
vjoin' (VLinRel a b) (VLinRel a' b') = VLinRel (a ||| a' ||| (asColumn (b - b'))) b

-- no constraints, everything
-- trivially true
htop :: forall a b. (BEnum a, BEnum b) => HLinRel a b
htop = HLinRel (vzero (1, ca + cb)) (konst 0 1) where
    ca = card @a
    cb = card @b

-- hbottom?

hconverse :: forall a b. (BEnum a, BEnum b) => HLinRel a b -> HLinRel b a
hconverse (HLinRel a b) = HLinRel ((dropColumns ca a) ||| (takeColumns ca a)) b where
    ca = card @a
    cb = card @b
```

Relational inclusion is the question of subspace inclusion. It is fairly easy to check whether a V-Rep is in an H-Rep (just plug the generators into the constraints and see if they obey them), and by using the conversion functions we can define it for arbitrary combos of H and V.

```haskell
-- forall l. A' ( A l + b) == b'
-- is this numerically ok? I'm open to suggestions.
vhsub :: VLinRel a b -> HLinRel a b -> Bool
vhsub (VLinRel a b) (HLinRel a' b') =
    (naa' <= 1e-10 * (norm_2 a') * (norm_2 a)) &&
    ((norm_2 ((a' #> b) - b')) <= 1e-10 * (norm_2 b')) where
        naa' = norm_2 (a' <> a)

hsub :: HLinRel a b -> HLinRel a b -> Bool
hsub h1 h2 = vhsub (h2v h1) h2

heq :: HLinRel a b -> HLinRel a b -> Bool
heq a b = (hsub a b) && (hsub b a)

instance Ord (HLinRel a b) where
    (<=) = hsub
    (>=) = flip hsub

instance Eq (HLinRel a b) where
    (==) = heq
```

It is useful to use the direct sum of the spaces as a monoidal product.

```haskell
hpar :: HLinRel a b -> HLinRel c d -> HLinRel (Either a c) (Either b d)
hpar (HLinRel mab v) (HLinRel mcd v') =
    HLinRel (fromBlocks [[mab, 0], [0, mcd]]) (vjoin [v, v'])

hleft :: forall a b. (BEnum a, BEnum b) => HLinRel a (Either a b)
hleft = HLinRel (i ||| (- i) ||| (konst 0 (ca, cb))) (konst 0 ca) where
    ca = card @a
    cb = card @b
    i = ident ca

hright :: forall a b. (BEnum a, BEnum b) => HLinRel b (Either a b)
hright = HLinRel (i ||| (konst 0 (cb, ca)) ||| (- i)) (konst 0 cb) where
    ca = card @a
    cb = card @b
    i = ident cb

htrans :: HLinRel a (Either b c) -> HLinRel (Either a b) c
htrans (HLinRel m v) = HLinRel m v

hswap :: forall a b. (BEnum a, BEnum b) => HLinRel (Either a b) (Either b a)
hswap = HLinRel (fromBlocks [[ia, 0, 0, -ia], [0, ib, -ib, 0]]) (konst 0 (ca + cb)) where
    ca = card @a
    cb = card @b
    ia = ident ca
    ib = ident cb

hsum :: forall a. BEnum a => HLinRel (Either a a) a
hsum = HLinRel (i ||| i ||| - i) (konst 0 ca) where
    ca = card @a
    i = ident ca

hdup :: forall a. BEnum a => HLinRel a (Either a a)
hdup = HLinRel (fromBlocks [[i, -i, 0], [i, 0, -i]]) (konst 0 (ca + ca)) where
    ca = card @a
    i = ident ca

hdump :: HLinRel a Void
hdump = HLinRel 0 0

hlabsorb :: forall a. BEnum a => HLinRel (Either Void a) a
hlabsorb = HLinRel m v where
    (HLinRel m v) = hid @a
```

A side note: `Void` causes some consternation. `Void` is the type with no elements and is the index type of a 0-dimensional space. It is the unit object of the monoidal product. Unfortunately, by an accident of the standard Haskell definitions, actual `Void` is not a `BEnum`. So, I did a disgusting hack. Let us not discuss it more.

Baez and Fong have an interesting paper where they describe building circuits using a categorical graphical calculus. We have the pieces to go about something similar. What we have here is a precise way in which circuit diagrams can be thought of as string diagrams in a monoidal category of linear relations.

An idealized wire has two quantities associated with it, the current flowing through it and the voltage it is at.

```haskell
-- a 2d space at every wire of current and voltage.
data IV = I | V deriving (Show, Enum, Bounded, Eq, Ord)
```

When we connect wires, the currents must be conserved and the voltages must be equal. `hid` and `hcompose` from above still achieve that. Composing two independent circuits in parallel is achieved by `hpar`.

We will want some basic tinker toys to work with.

A resistor in series has the same current at both ends and a voltage drop proportional to the current.

```haskell
resistor :: Double -> HLinRel IV IV
resistor r = HLinRel ((2><4) [ 1, 0, -1, 0,
                               r, 1, 0, -1 ]) (konst 0 2)
```

Composing two resistors in series adds their resistances: `(resistor r1) <<< (resistor r2) == resistor (r1 + r2)`

A bridging resistor allows current to flow between the two branches.

```haskell
bridge :: Double -> HLinRel (Either IV IV) (Either IV IV)
bridge r = HLinRel ((4><8) [ 1, 0, 1, 0, -1, 0, -1, 0, -- current conservation
                             0, 1, 0, 0,  0, -1, 0, 0, -- voltage maintained left
                             0, 0, 0, 1,  0, 0, 0, -1, -- voltage maintained right
                             r, 1, 0, -1, -r, 0, 0, 0 ]) (konst 0 4)
```

Composing two bridge circuits puts the bridge resistors in parallel. The conductance of resistors in parallel adds: `hcompose (bridge r1) (bridge r2) == bridge (1 / (1/r1 + 1/r2))`.

An open circuit allows no current to flow and ends a wire. `open ~ resistor infinity`

```haskell
open :: HLinRel IV Void
open = HLinRel (fromList [[1,0]]) (konst 0 1)
```

At branching points, the voltage is maintained, but the current splits.

```haskell
cmerge :: HLinRel (Either IV IV) IV
cmerge = HLinRel (fromList [[1, 0, 1, 0, -1,  0],
                            [0, 1, 0, 0,  0, -1],
                            [0, 0, 0, 1,  0, -1]]) (konst 0 3)
```

This `cmerge` combinator could also be built using a `short == bridge 0`, composing a branch with `open`, and then absorbing the `Void` away.

We can bend wires up or down by using a composition of `cmerge` and `open`.

```haskell
cap :: HLinRel (Either IV IV) Void
cap = hcompose open cmerge

cup :: HLinRel Void (Either IV IV)
cup = hconverse cap

ground :: HLinRel IV Void
ground = HLinRel ((1><2) [ 0, 1 ]) (vzero 1)
```

Voltage and current sources enforce current and voltage to be certain values.

```haskell
vsource :: Double -> HLinRel IV IV
vsource v = HLinRel ((2><4) [ 1, 0, -1, 0,
                              0, 1,  0, -1 ]) (fromList [0, v])

isource :: Double -> HLinRel IV IV
isource i = HLinRel (fromList [[1, 0, -1, 0], -- current conservation
                               [1, 0,  0, 0]]) (fromList [0, i])
```

Measurements of circuits proceed by probes.

```haskell
type VProbe = ()

vprobe :: HLinRel IV VProbe
vprobe = HLinRel ((2><3) [1, 0, 0,
                          0, 1, -1]) (konst 0 2)
```

Inductors and capacitors could be included easily, but would require the entries of the HMatrix values to be polynomials in the frequency $\omega$, which it does not support (but it could!). We’ll leave those off for another day.

We can actually check by computation that the rules suggested above are being followed.

```haskell
r20 :: HLinRel IV IV
r20 = resistor 20

main :: IO ()
main = do
    print (r20 == (hid <<< r20))
    print (r20 == r20 <<< hid)
    print (r20 == (hmeet r20 r20))
    print $ resistor 50 == r20 <<< (resistor 30)
    print $ (bridge 10) <<< (bridge 10) == (bridge 5)
    print $ v2h (h2v r20) == r20
    print $ r20 <= htop
    print $ hconverse (hconverse r20) == r20
    print $ (open <<< r20) == open
```

- Homogeneous systems are usually a bit more elegant to deal with, although a bit more unfamiliar and abstract.
- Could make a pandas like interface for linear relations that uses numpy/scipy.sparse for the computation. All the swapping and associating is kind of fun to design, not so much to use. Labelled n-way relations are nice for users.
- Implicit/Lazy evaluation. We should let the good solvers do the work when possible. We implemented our operations eagerly. We don’t have to. By allowing hidden variables inside our relations, we can avoid the expensive linear operations until it is useful to actually compute on them.
- Relational division = quotient spaces?
- DSL. One of the beauties of the pointfree/categorical approach is that you avoid the need for binding forms. This makes for a very easily manipulated DSL. The transformations feel like those of ordinary algebra and you don’t have to worry about the subtleties of index renaming or substitution under binders.
- Sparse is probably really good. We have lots of identity matrices and simple rearrangements. It is very wasteful to use dense operations on these.
- Schur complements https://en.wikipedia.org/wiki/Schur_complement are the name of the game for projecting out pieces of linear problems. We have some overlap.
- Linear relations -> Polyhedral relations -> Convex Relations. Linear is super computable, polyhedral can blow up. Rearrange a DSL to abuse Linear programming as much as possible for queries.
- Network circuits. There is an interesting subclass of circuits that is designed to be pretty composable.

https://en.wikipedia.org/wiki/Two-port_network Two-port networks are a very useful subclass of electrical circuits. They model transmission lines fairly well and are easily composable for filter construction.

It is standard to describe these networks by giving a linear function between two of the variables and the other two variables. Depending on your choice of which variables depend on which, these are called the z-parameters, y-parameters, h-parameters, scattering parameters, or abcd-parameters. There are tables of formulas for converting from one form to the others. The different parameters suit different use cases for composition and combining in parallel or series. From the perspective of linear relations, this all seems rather silly. The necessity for so many descriptions, and the confusing relationship between them, comes from the unnecessary and overly rigid requirement of having a linear function-like relationship rather than just a general relation, which, depending on the circuit, may not even be available (there are degenerate configurations where two of the variables do not imply the values of the other two). A functional relationship is always a lie (although a sometimes useful one), as there is always back-reaction from new connections.

```haskell
-- voltage divider
divider :: Double -> Double -> HLinRel (Either IV IV) (Either IV IV)
divider r1 r2 = hcompose (bridge r2) (hpar (resistor r1) hid)
```

The relation model also makes clearer how to build lumped models out of continuous ones. https://en.wikipedia.org/wiki/Lumped-element_model

https://en.wikipedia.org/wiki/Transmission_line https://en.wikipedia.org/wiki/Distributed-element_model

- Because the type indices have no connection to the actual data types (they are phantom), it is wise to use smart constructors that check that the sizes of the matrices make sense.

```haskell
-- smart constructor: checks the matrix dimensions against the phantom type indices
hLinRel :: forall a b. (BEnum a, BEnum b) =>
           Matrix Double -> Vector Double -> Maybe (HLinRel a b)
hLinRel m v | cols m == (ca + cb) && (size v == rows m) = Just (HLinRel m v)
            | otherwise = Nothing
  where ca = card @a
        cb = card @b
```

- Nonlinear circuits. Grobner Bases and polynomial relations?
- Quadratic optimization under linear constraints. Can’t get it to come out right yet. Clutch for Kalman filters. Nice for many formulations like least power, least action, minimum energy principles. Edit: I did more in this direction here http://www.philipzucker.com/categorical-lqr-control-with-linear-relations/
- Quadratic Operators -> Convex operators. See last chapter of Rockafellar.
- Duality of controllers and filters. It is well known (I think) that for every controller algorithm there is a filter algorithm that is basically the same thing.
- LQR – Kalman
- Viterbi filter – Value function table
- particle filter – Monte Carlo control
- Extended Kalman – iLQR-ish? Use local approximation of dynamics
- unscented kalman – ?
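The first pairing above can be checked numerically. A sketch of my own, using the textbook substitutions A ↔ Aᵀ, B ↔ Cᵀ, Q ↔ W, R ↔ V: iterating the LQR cost-to-go Riccati recursion on transposed data reproduces the Kalman predicted-covariance recursion step for step.

```python
import numpy as np

def lqr_step(P, A, B, Q, R):
    # discrete-time Riccati recursion for the LQR cost-to-go matrix
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return Q + A.T @ P @ A - A.T @ P @ B @ K

def kalman_step(S, A, C, W, V):
    # Riccati recursion for the Kalman filter predicted covariance
    L = np.linalg.solve(V + C @ S @ C.T, C @ S @ A.T)
    return W + A @ S @ A.T - A @ S @ C.T @ L

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))
S = np.zeros((2, 2))
for _ in range(50):
    P = lqr_step(P, A, B, Q, R)
    # the dual filter: dynamics A.T, "measurement" map B.T
    S = kalman_step(S, A.T, B.T, Q, R)
print(np.allclose(P, S))  # True
```

The two recursions are term-by-term identical under the substitution, which is the LQR-Kalman duality in its most mechanical form.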

- https://golem.ph.utexas.edu/category/2015/04/categories_in_control.html
- https://johncarlosbaez.wordpress.com/2015/04/28/a-compositional-framework-for-passive-linear-networks/
- https://graphicallinearalgebra.net/2015/12/26/27-linear-relations/
- https://arxiv.org/abs/1611.07591
- https://arxiv.org/abs/1504.05625
- Edit: Very relevant and interesting https://www.ioc.ee/~pawel/papers/affinePaper.pdf

https://en.wikipedia.org/wiki/Kissing_number

Cody brought up the kissing number problem the other day. Kissing numbers are the number of equal-sized spheres you can pack around another one in d dimensions. It's fairly self-evident that the number is 2 for 1-d and 6 for 2-d, but 3-d isn't so obvious and in fact puzzled great mathematicians for a while. He was musing that it is interesting that the kissing numbers for some dimensions are not currently known, despite the fact that the first-order theory of the real numbers is decidable https://en.wikipedia.org/wiki/Decidability_of_first-order_theories_of_the_real_numbers

I suggested on knee jerk that Sum of Squares might be useful here. I see inequalities and polynomials and then it is the only game in town that I know anything about.

Apparently that knee jerk was not completely wrong.

https://arxiv.org/pdf/math/0608426.pdf

Somehow SOS/SDP was used for bounds here. I had an impulse that the problem feels SOS-y but I do not understand their derivation.

One way the problem can be formulated is by finding, or proving there is no, solution to a set of equations constraining the centers $latex x_i$ of the surrounding spheres. Set the central sphere at the origin $latex (0,0,0,\ldots)$ and make all the radii 1. Then each surrounding sphere touches the central one, $latex |x_i|^2 = 4$, and no two surrounding spheres overlap, $latex |x_i - x_j|^2 \geq 4$ for $latex i \neq j$.
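As a sanity check of this formulation (my own quick snippet, not part of the original derivation), the hexagonal arrangement that achieves the 2-d kissing number of 6 satisfies exactly these constraints:

```python
import numpy as np

# Six unit circles around a central unit circle: centers on a
# radius-2 circle at 60-degree intervals.
angles = np.arange(6) * np.pi / 3
centers = 2 * np.stack([np.cos(angles), np.sin(angles)], axis=1)

# |x_i|^2 == 4 : each sphere touches the central one
print(np.allclose(np.sum(centers**2, axis=1), 4))  # True

# |x_i - x_j|^2 >= 4 : the surrounding spheres do not overlap
diffs = centers[:, None, :] - centers[None, :, :]
d2 = np.sum(diffs**2, axis=2)
i, j = np.triu_indices(6, k=1)
print(np.all(d2[i, j] >= 4 - 1e-9))  # True (adjacent pairs touch exactly)
```

Adjacent centers sit at distance exactly 2, which is why n=6 is feasible in 2-d and n=7 is not.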

I tried a couple different things and have basically failed. I hope maybe I’ll someday have a follow up post where I do better.

So I had one idea on how to approach this: a convex relaxation.

Make a vector $latex x = (1, x_1, x_2, \ldots)$ containing all the center coordinates, prefixed by a 1. Take the outer product of this vector, $latex X = x x^T$. Then we can write the above equations as linear equalities and inequalities on X. If we forget that we need X to be the outer product of x (the relaxation step), this becomes a semidefinite program. Fingers crossed, maybe the solution comes back as a rank-1 matrix. Other fingers crossed, maybe the solution comes back and says it's infeasible. In either case, we have solved our original problem.

```python
import numpy as np
import cvxpy as cvx

d = 2
n = 6
N = d * n
x = cvx.Variable((N+1, N+1), symmetric=True)
c = []
c += [x >> 0]
c += [x[0,0] == 1]

# x^2 + y^2 + z^2 + ... == 2^2 constraint
x1 = x[1:, 1:]
for i in range(n):
    q = 0
    for j in range(d):
        q += x1[d*i + j, d*i + j]
    c += [q == 4]

# |x_i - x_k|^2 >= 4
for i in range(n):
    for k in range(i):
        q = 0
        for j in range(d):
            q += x1[d*i + j, d*i + j] + x1[d*k + j, d*k + j] - 2 * x1[d*i + j, d*k + j]
        c += [q >= 4]

print(c)
obj = cvx.Maximize(cvx.trace(np.random.rand(N+1, N+1) @ x))
prob = cvx.Problem(obj, c)
print(prob.solve(verbose=True))

u, s, vh = np.linalg.svd(x.value)
print(s)

import matplotlib.pyplot as plt
xy = vh[0, 1:].reshape(-1, 2)
print(xy)
plt.scatter(xy[:, 0], xy[:, 1])
plt.show()
```

Didn’t work though. Sigh. It’s conceivable we might do better if we start packing higher powers into x?

Ok Round 2. Let’s just ask z3 and see what it does. I’d trust z3 with my baby’s soft spot.

It solves for n=5 and below. Z3 grinds to a halt on n=6 and above; it ran for days doing nothing on my desktop.

```python
from z3 import *
import numpy as np

d = 2  # dimensions
n = 6  # number of spheres
x = np.array([[Real("x_%d_%d" % (i, j)) for j in range(d)] for i in range(n)])
print(x)

c = []
ds = np.sum(x**2, axis=1)
c += [d2 == 4 for d2 in ds]  # centers at distance 2 from origin
ds = np.sum((x.reshape((-1, 1, d)) - x.reshape((1, -1, d)))**2, axis=2)
c += [ds[i,j] >= 4 for i in range(n) for j in range(i)]  # spheres at least distance 2 apart
c += [x[0,0] == 2]  # break a symmetry
print(c)
print(solve(c))
```

Ok. A different tack. Try to use a Positivstellensatz proof. Given a bunch of polynomial inequalities and equalities, if you can combine polynomial multiples of the equality constraints with sum-of-squares multiples of the inequality constraints so that the total equals -1, that shows there is no real solution to them. We have the distance from the origin as an equality constraint and the distance from each other as an inequality constraint. I intuitively think of the Positivstellensatz as deriving an impossibility from false assumptions: you can't add a bunch of zeros and positive numbers and get a negative number, hence there is no real solution.
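A toy certificate (mine, not from the paper) shows the mechanics: to prove that the single constraint -(x^2 + 1) >= 0 has no real solution, add the sum-of-squares multiplier x^2 to it and land exactly on -1.

```python
from sympy import symbols, expand

x = symbols('x')

# Claim: the constraint -(x**2 + 1) >= 0 has no real solution.
g = -(x**2 + 1)

# Positivstellensatz certificate: sum-of-squares multipliers
# sigma0 = x**2 (on the trivial constraint 1 >= 0) and sigma1 = 1
# combine the constraints into -1.
sigma0 = x**2
sigma1 = 1
certificate = sigma0 + sigma1 * g

# A nonnegative combination summing to -1 is impossible over the reals,
# so the constraint set must be empty.
print(expand(certificate))  # -1
```

The kissing-number attempt below is the same idea, just with many constraints and unknown polynomial multipliers searched for by an SDP.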

I have a small set of helper functions for combining sympy and cvxpy for sum of squares optimization. I keep it here along with some other cute little constructs https://github.com/philzook58/cvxpy-helpers

```python
import cvxpy as cvx
from sympy import *
import random

'''
The idea is to use raw cvxpy and sympy as much as possible for maximum
flexibility.

Construct a sum of squares polynomial using sospoly. This returns a variable
dictionary mapping sympy variables to cvxpy variables.

You are free to do polynomial operations (differentiation, integration,
algebra) in pure sympy.

When you want to express an equality constraint, use poly_eq(), which takes
the vardict and returns a list of cvxpy constraints.

Once the problem is solved, use poly_value to get back the solution
polynomials.

That some polynomial is a sum of squares can be expressed as equality with a
fresh polynomial that is explicitly a sum of squares.

With this approach, we get the full unbridled power of sympy (including
grobner bases!).

I prefer manually controlling the vardict to having it auto controlled by a
class, just as I prefer manually controlling my constraint sets.
Classes suck.
'''

def cvxify(expr, cvxdict):
    # replaces sympy variables with cvx variables in a sympy expr
    return lambdify(tuple(cvxdict.keys()), expr)(*cvxdict.values())

def sospoly(terms, name=None):
    ''' returns sum of squares polynomial using terms, and vardict mapping to cvxpy variables '''
    if name == None:
        name = str(random.getrandbits(32))
    N = len(terms)
    xn = Matrix(terms)
    Q = MatrixSymbol(name, N, N)
    p = (xn.T * Matrix(Q) * xn)[0]
    Qcvx = cvx.Variable((N, N), PSD=True)
    vardict = {Q: Qcvx}
    return p, vardict

def polyvar(terms, name=None):
    ''' builds sympy expression and vardict for an unknown linear combination of the terms '''
    if name == None:
        name = str(random.getrandbits(32))
    N = len(terms)
    xn = Matrix(terms)
    Q = MatrixSymbol(name, N, 1)
    p = (xn.T * Matrix(Q))[0]
    Qcvx = cvx.Variable((N, 1))
    vardict = {Q: Qcvx}
    return p, vardict

def scalarvar(name=None):
    return polyvar([1], name)

def worker(x):
    (expr, vardict) = x
    return cvxify(expr, vardict) == 0

def poly_eq(p1, p2, vardict):
    ''' returns a list of cvxpy constraints equating two polynomials coefficient-wise '''
    dp = p1 - p2
    polyvars = list(dp.free_symbols - set(vardict.keys()))
    p, opt = poly_from_expr(dp, gens=polyvars, domain=polys.domains.EX)  # This is brutalizing me
    return [cvxify(expr, vardict) == 0 for expr in p.coeffs()]

def vardict_value(vardict):
    ''' evaluate numerical values of vardict '''
    return {k: v.value for (k, v) in vardict.items()}

def poly_value(p1, vardict):
    ''' evaluate polynomial expressions with vardict '''
    return cvxify(p1, vardict_value(vardict))

if __name__ == "__main__":
    x = symbols('x')
    terms = [1, x, x**2]
    p, cdict = sospoly(terms)
    c = poly_eq(p, (1 + x)**2, cdict)
    print(c)
    prob = cvx.Problem(cvx.Minimize(1), c)
    prob.solve()
    print(factor(poly_value(p, cdict)))

    # global poly minimization
    vdict = {}
    t, d = polyvar([1], name='t')
    vdict.update(d)
    p, d = sospoly([1, x, x**2], name='p')
    vdict.update(d)
    constraints = poly_eq(7 + x**2 - t, p, vdict)
    obj = cvx.Maximize(cvxify(t, vdict))
    prob = cvx.Problem(obj, constraints)
    prob.solve()
    print(poly_value(t, vdict))
```

And here is the attempted Positivstellensatz:

```python
import sos
import cvxpy as cvx
from sympy import *
import numpy as np

d = 2
N = 7
xs = np.array([[symbols("x_%d_%d" % (i, j)) for j in range(d)] for i in range(N)])
c1 = np.sum(xs * xs, axis=1) - 1
c2 = np.sum((xs.reshape(-1, 1, d) - xs.reshape(1, -1, d))**2, axis=2) - 1
print(c1)
print(c2)

terms0 = [1]
terms1 = terms0 + list(xs.flatten())
terms2 = [terms1[i] * terms1[j] for j in range(N+1) for i in range(j+1)]

vdict = {}
psatz = 0
for c in c1:
    lam, d = sos.polyvar(terms2)
    vdict.update(d)
    psatz += lam * c
for i in range(N):
    for j in range(i):
        c = c2[i, j]
        lam, d = sos.sospoly(terms2)
        vdict.update(d)
        psatz += lam * c

print("build constraints")
constraints = sos.poly_eq(psatz, -1, vdict)
obj = cvx.Minimize(1)
print("build prob")
prob = cvx.Problem(obj, constraints)
print("solve")
prob.solve(verbose=True, solver=cvx.SCS)
```

It worked in 1-d, but did not work in 2-d. At order-3 polynomials with N=7, I maxed out my RAM.

I also tried doing it in Julia, since sympy was killing me. Julia already has an SOS package.

```julia
using JuMP
using SumOfSquares
using DynamicPolynomials
using SCS

N = 10
d = 2
@polyvar x[1:N,1:d]
X = monomials(reshape(x,d*N), 0:2)
X1 = monomials(reshape(x,d*N), 0:4)

model = SOSModel(with_optimizer(SCS.Optimizer))

acc = nothing
for t in sum(x .* x, dims=2)
    p = @variable(model, [1:1], Poly(X1))
    if acc != nothing
        acc += p * (t - 1)
    else
        acc = p * (t - 1)
    end
end
for i in range(1,stop=N)
    for j in range(1,stop=i-1)
        d = x[i,:] - x[j,:]
        p = @variable(model, [1:1], SOSPoly(X))
        acc += p * (sum(d .* d) - 1)
    end
end
print(typeof(acc))
@constraint(model, acc[1] == -1)
optimize!(model)
```

It was faster to encode, but it’s using the same solver (SCS), so basically the same thing.

I should probably be reducing the system with respect to the equality constraints, since they're already in a Groebner basis. I know that can be really important for reducing the size of your problem.
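For a flavor of what that reduction looks like (a toy of my own, unrelated to the full kissing system), sympy's reduced can divide a polynomial by a sphere constraint, eliminating even powers of one variable before the SOS machinery ever sees them:

```python
from sympy import symbols, reduced, expand

x, y = symbols('x y')

# The equality constraint x**2 + y**2 - 4 == 0 is a one-element
# Groebner basis already (for lex order with x > y).
g = [x**2 + y**2 - 4]

# Reduce x**2 * y modulo the constraint: substituting
# x**2 -> 4 - y**2 gives 4*y - y**3.
quotients, remainder = reduced(x**2 * y, g, x, y)
print(remainder)  # -y**3 + 4*y
```

Every x**2 the problem generates can be rewritten away like this, shrinking the monomial basis the SDP has to carry.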

I dunno.

Blah blah blah: a bunch of unedited notes follows.

https://github.com/peterwittek/ncpol2sdpa Peter Wittek has probably died in an avalanche? That is very sad.

These notes: https://web.stanford.edu/class/ee364b/lectures/sos_slides.pdf

Positivstellensatz.

kissing number

Review of sum of squares

Minimum of a set of samples as an LP. A ridiculous problem:

max t

s.t. f(x_i) - t >= 0

dual -> one dual variable per sample point

The only dual variable that will be nonzero is the one actually selecting the minimum.

Hm. Yeah, that’s a decent analogy.

How does the dual even have a chance of knowing about polynomial arithmetic?

It must be during the SOS conversion process. In building the SOS constraints, we build a finite, limited version of polynomial multiplication.

x as a matrix: x acts as a shift matrix.

In producing the characteristic polynomial, x is a shift matrix, with the last row using the polynomial known to be zero to reduce.

The eigenvectors of this matrix are zeros of the poly.
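A concrete version of the shift-matrix remark (my own numpy snippet): the companion matrix of p(x) = x^2 - 3x + 2 represents multiplication by x modulo p, and its eigenvalues are exactly the roots of p.

```python
import numpy as np

# p(x) = x**2 - 3x + 2 = (x - 1)(x - 2).
# Multiplication by x on the basis {1, x}, reduced modulo p:
#   x * 1 = x            -> column (0, 1)
#   x * x = x**2 = 3x - 2  (using p(x) = 0) -> column (-2, 3)
companion = np.array([[0.0, -2.0],
                      [1.0,  3.0]])

eigvals = np.sort(np.linalg.eigvals(companion).real)
print(eigvals)  # [1. 2.]
```

This is also how np.roots finds polynomial roots under the hood: build the companion matrix, take its eigenvalues.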

SOS does not really rely on polynomials per se. It relies on closure of the squaring operation.

Maybe set one sphere just at x=0, y=2. That breaks some symmetry.

Set the next sphere in a plane somehow. A random plane through the origin?

Order the y components: that breaks some of the permutation symmetry.

No, why not order in a random direction? That seems better for symmetry breaking.

Edit: It’s up! https://learnxinyminutes.com/docs/coq/

I’ve been preparing a Learn X in Y tutorial for Coq. https://learnxinyminutes.com/

I’ve been telling people this and been surprised by how few people have heard of the site. It’s super quick intros to syntax and weirdness for a bunch of languages with inline code tutorials.

I think that for me, a short description of the mundane syntactic and programming constructs of Coq is helpful.

Some guidance on the standard library and what is available by default. And dealing with notation scopes, which is a pretty weird feature that most languages don't have.

The manual actually has all this now. It's really good. Like check this section out https://coq.inria.fr/refman/language/coq-library.html . But the manual is an intimidating document. It starts with a BNF description of the syntax and things like that. The really useful pedagogical stuff is scattered throughout it.

Anyway here is my draft (also here https://github.com/philzook58/learnxinyminutes-docs/blob/master/coq.html.markdown where the syntax highlighting isn’t so janked up). Suggestions welcome. Or if this gets accepted, you can just make pull requests

--- language: Coq filename: learncoq.v contributors: - ["Philip Zucker", "http://www.philipzucker.com/"] --- The Coq system is a proof assistant. It is designed to build and verify mathematical proofs. The Coq system contains the functional programming language Gallina and is capable of proving properties about programs written in this language. Coq is a dependently typed language. This means that the types of the language may depend on the values of variables. In this respect, it is similar to other related languages such as Agda, Idris, F*, Lean, and others. Via the Curry-Howard correspondence, programs, properties and proofs are formalized in the same language. Coq is developed in OCaml and shares some syntactic and conceptual similiarity with it. Coq is a language containing many fascinating but difficult topics. This tutorial will focus on the programming aspects of Coq, rather than the proving. It may be helpful, but not necessary to learn some OCaml first, especially if you are unfamiliar with functional programming. This tutorial is based upon its OCaml equivalent The standard usage model of Coq is to write it with interactive tool assistance, which operates like a high powered REPL. Two common such editors are the CoqIDE and Proof General Emacs mode. Inside Proof General `Ctrl+C <Enter>` will evaluate up to your cursor. ```coq (*** Comments ***) (* Comments are enclosed in (* and *). It's fine to nest comments. *) (* There are no single-line comments. *) (*** Variables and functions ***) (* The Coq proof assistant can be controlled and queried by a command language called the vernacular. Vernacular keywords are capitalized and the commands end with a period. Variable and function declarations are formed with the Definition vernacular. *) Definition x := 10. (* Coq can sometimes infer the types of arguments, but it is common practice to annotate with types. *) Definition inc_nat (x : nat) : nat := x + 1. 
(* There exists a large number of vernacular commands for querying information. These can be very useful. *) Compute (1 + 1). (* 2 : nat *) (* Compute a result. *) Check tt. (* tt : unit *) (* Check the type of an expressions *) About plus. (* Prints information about an object *) (* Print information including the definition *) Print true. (* Inductive bool : Set := true : Bool | false : Bool *) Search nat. (* Returns a large list of nat related values *) Search "_ + _". (* You can also search on patterns *) Search (?a -> ?a -> bool). (* Patterns can have named parameters *) Search (?a * ?a). (* Locate tells you where notation is coming from. Very helpful when you encounter new notation. *) Locate "+". (* Calling a function with insufficient number of arguments does not cause an error, it produces a new function. *) Definition make_inc x y := x + y. (* make_inc is int -> int -> int *) Definition inc_2 := make_inc 2. (* inc_2 is int -> int *) Compute inc_2 3. (* Evaluates to 5 *) (* Definitions can be chained with "let ... in" construct. This is roughly the same to assigning values to multiple variables before using them in expressions in imperative languages. *) Definition add_xy : nat := let x := 10 in let y := 20 in x + y. (* Pattern matching is somewhat similar to switch statement in imperative languages, but offers a lot more expressive power. *) Definition is_zero (x : nat) := match x with | 0 => true | _ => false (* The "_" pattern means "anything else". *) end. (* You can define recursive function definition using the Fixpoint vernacular.*) Fixpoint factorial n := match n with | 0 => 1 | (S n') => n * factorial n' end. (* Function application usually doesn't need parentheses around arguments *) Compute factorial 5. (* 120 : nat *) (* ...unless the argument is an expression. *) Compute factorial (5-1). 
(* 24 : nat *) (* You can define mutually recursive functions using "with" *) Fixpoint is_even (n : nat) : bool := match n with | 0 => true | (S n) => is_odd n end with is_odd n := match n with | 0 => false | (S n) => is_even n end. (* As Coq is a total programming language, it will only accept programs when it can understand they terminate. It can be most easily seen when the recursive call is on a pattern matched out subpiece of the input, as then the input is always decreasing in size. Getting Coq to understand that functions terminate is not always easy. See the references at the end of the artice for more on this topic. *) (* Anonymous functions use the following syntax: *) Definition my_square : nat -> nat := fun x => x * x. Definition my_id (A : Type) (x : A) : A := x. Definition my_id2 : forall A : Type, A -> A := fun A x => x. Compute my_id nat 3. (* 3 : nat *) (* You can ask Coq to infer terms with an underscore *) Compute my_id _ 3. (* An implicit argument of a function is an argument which can be inferred from contextual knowledge. Parameters enclosed in {} are implicit by default *) Definition my_id3 {A : Type} (x : A) : A := x. Compute my_id3 3. (* 3 : nat *) (* Sometimes it may be necessary to turn this off. You can make all arguments explicit again with @ *) Compute @my_id3 nat 3. (* Or give arguments by name *) Compute my_id3 (A:=nat) 3. (*** Notation ***) (* Coq has a very powerful Notation system that can be used to write expressions in more natural forms. *) Compute Nat.add 3 4. (* 7 : nat *) Compute 3 + 4. (* 7 : nat *) (* Notation is a syntactic transformation applied to the text of the program before being evaluated. Notation is organized into notation scopes. Using different notation scopes allows for a weak notion of overloading. *) (* Imports the Zarith module containing definitions related to the integers Z *) Require Import ZArith. (* Notation scopes can be opened *) Open Scope Z_scope. 
(* Now numerals and addition are defined on the integers. *) Compute 1 + 7. (* 8 : Z *) (* Integer equality checking *) Compute 1 =? 2. (* false : bool *) (* Locate is useful for finding the origin and definition of notations *) Locate "_ =? _". (* Z.eqb x y : Z_scope *) Close Scope Z_scope. (* We're back to nat being the default interpetation of "+" *) Compute 1 + 7. (* 8 : nat *) (* Scopes can also be opened inline with the shorthand % *) Compute (3 * -7)%Z. (* -21%Z : Z *) (* Coq declares by default the following interpretation scopes: core_scope, type_scope, function_scope, nat_scope, bool_scope, list_scope, int_scope, uint_scope. You may also want the numerical scopes Z_scope (integers) and Q_scope (fractions) held in the ZArith and QArith module respectively. *) (* You can print the contents of scopes *) Print Scope nat_scope. (* Scope nat_scope Delimiting key is nat Bound to classes nat Nat.t "x 'mod' y" := Nat.modulo x y "x ^ y" := Nat.pow x y "x ?= y" := Nat.compare x y "x >= y" := ge x y "x > y" := gt x y "x =? y" := Nat.eqb x y "x <? y" := Nat.ltb x y "x <=? y" := Nat.leb x y "x <= y <= z" := and (le x y) (le y z) "x <= y < z" := and (le x y) (lt y z) "n <= m" := le n m "x < y <= z" := and (lt x y) (le y z) "x < y < z" := and (lt x y) (lt y z) "x < y" := lt x y "x / y" := Nat.div x y "x - y" := Init.Nat.sub x y "x + y" := Init.Nat.add x y "x * y" := Init.Nat.mul x y *) (* Coq has exact fractions available as the type Q in the QArith module. Floating point numbers and real numbers are also available but are a more advanced topic, as proving properties about them is rather tricky. *) Require Import QArith. Open Scope Q_scope. Compute 1. (* 1 : Q *) Compute 2. (* 2 : nat *) (* only 1 and 0 are interpeted as fractions by Q_scope *) Compute (2 # 3). (* The fraction 2/3 *) Compute (1 # 3) ?= (2 # 6). (* Eq : comparison *) Close Scope Q_scope. Compute ( (2 # 3) / (1 # 5) )%Q. 
(* 10 # 3 : Q *) (*** Common data structures ***) (* Many common data types are included in the standard library *) (* The unit type has exactly one value, tt *) Check tt. (* tt : unit *) (* The option type is useful for expressing computations that might fail *) Compute None. (* None : option ?A *) Check Some 3. (* Some 3 : option nat *) (* The type sum A B allows for values of either type A or type B *) Print sum. Check inl 3. (* inl 3 : nat + ?B *) Check inr true. (* inr true : ?A + bool *) Check sum bool nat. (* (bool + nat)%type : Set *) Check (bool + nat)%type. (* Notation for sum *) (* Tuples are (optionally) enclosed in parentheses, items are separated by commas. *) Check (1, true). (* (1, true) : nat * bool *) Compute prod nat bool. (* (nat * bool)%type : Set *) Definition my_fst {A B : Type} (x : A * B) : A := match x with | (a,b) => a end. (* A destructuring let is available if a pattern match is irrefutable *) Definition my_fst2 {A B : Type} (x : A * B) : A := let (a,b) := x in a. (*** Lists ***) (* Lists are built by using cons and nil or by using notation available in list_scope. *) Compute cons 1 (cons 2 (cons 3 nil)). (* (1 :: 2 :: 3 :: nil)%list : list nat *) Compute (1 :: 2 :: 3 :: nil)%list. (* There is also list notation available in the ListNotations modules *) Require Import List. Import ListNotations. Compute [1 ; 2 ; 3]. (* [1; 2; 3] : list nat *) (* There are a large number of list manipulation functions available, lncluding: • length • head : first element (with default) • tail : all but first element • app : appending • rev : reverse • nth : accessing n-th element (with default) • map : applying a function • flat_map : applying a function returning lists • fold_left : iterator (from head to tail) • fold_right : iterator (from tail to head) *) Definition my_list : list nat := [47; 18; 34]. Compute List.length my_list. (* 3 : nat *) (* All functions in coq must be total, so indexing requires a default value *) Compute List.nth 1 my_list 0. 
(* 18 : nat *) Compute List.map (fun x => x * 2) my_list. (* [94; 36; 68] : list nat *) Compute List.filter (fun x => Nat.eqb (Nat.modulo x 2) 0) my_list. (* [18; 34] : list nat *) Compute (my_list ++ my_list)%list. (* [47; 18; 34; 47; 18; 34] : list nat *) (*** Strings ***) Require Import Strings.String. Open Scope string_scope. (* Use double quotes for string literals. *) Compute "hi"%string. (* Strings can be concatenated with the "++" operator. *) Compute String.append "Hello " "World". (* "Hello World" : string *) Compute "Hello " ++ "World". (* "Hello World" : string *) (* Strings can be compared for equality *) Compute String.eqb "Coq is fun!"%string "Coq is fun!"%string. (* true : bool *) Compute ("no" =? "way")%string. (* false : bool *) Close Scope string_scope. (*** Other Modules ***) (* Other Modules in the standard library that may be of interest: • Logic : Classical logic and dependent equality • Arith : Basic Peano arithmetic • PArith : Basic positive integer arithmetic • NArith : Basic binary natural number arithmetic • ZArith : Basic relative integer arithmetic • Numbers : Various approaches to natural, integer and cyclic numbers (currently axiomatically and on top of 2^31 binary words) • Bool : Booleans (basic functions and results) • Lists : Monomorphic and polymorphic lists (basic functions and results), Streams (infinite sequences defined with co-inductive types) • Sets : Sets (classical, constructive, finite, infinite, power set, etc.) • FSets : Specification and implementations of finite sets and finite maps (by lists and by AVL trees) • Reals : Axiomatization of real numbers (classical, basic functions, integer part, fractional part, limit, derivative, Cauchy series, power series and results,...) 
• Relations : Relations (definitions and basic results) • Sorting : Sorted list (basic definitions and heapsort correctness) • Strings : 8-bits characters and strings • Wellfounded : Well-founded relations (basic results) *) (*** User-defined data types ***) (* Because Coq is dependently typed, defining type aliases is no different than defining an alias for a value. *) Definition my_three : nat := 3. Definition my_nat : Type := nat. (* More interesting types can be defined using the Inductive vernacular. Simple enumeration can be defined like so *) Inductive ml := OCaml | StandardML | Coq. Definition lang := Coq. (* Has type "ml". *) (* For more complicated types, you will need to specify the types of the constructors. *) (* Type constructors don't need to be empty. *) Inductive my_number := plus_infinity | nat_value : nat -> my_number. Compute nat_value 3. (* nat_value 3 : my_number *) (* Record syntax is sugar for tuple-like types. It defines named accessor functions for the components *) Record Point2d (A : Set) := mkPoint2d { x2 : A ; y2 : A }. Definition mypoint : Point2d nat := {| x2 := 2 ; y2 := 3 |}. Compute x2 nat mypoint. (* 2 : nat *) Compute mypoint.(x2 nat). (* 2 : nat *) (* Types can be parameterized, like in this type for "list of lists of anything". 'a can be substituted with any type. *) Definition list_of_lists a := list (list a). Definition list_list_nat := list_of_lists nat. (* Types can also be recursive. Like in this type analogous to built-in list of naturals. *) Inductive my_nat_list := EmptyList | NatList : nat -> my_nat_list -> my_nat_list. Compute NatList 1 EmptyList. (* NatList 1 EmptyList : my_nat_list *) (** Matching type constructors **) Inductive animal := Dog : string -> animal | Cat : string -> animal. Definition say x := match x with | Dog x => (x ++ " says woof")%string | Cat x => (x ++ " says meow")%string end. Compute say (Cat "Fluffy"). (* "Fluffy says meow". 
*) (** Traversing data structures with pattern matching **) (* Recursive types can be traversed with pattern matching easily. Let's see how we can traverse a data structure of the built-in list type. Even though the built-in cons ("::") looks like an infix operator, it's actually a type constructor and can be matched like any other. *) Fixpoint sum_list l := match l with | [] => 0 | head :: tail => head + (sum_list tail) end. Compute sum_list [1; 2; 3]. (* Evaluates to 6 *) (*** A Taste of Proving ***) (* Explaining the proof language is out of scope for this tutorial, but here is a taste to whet your appetite. Check the resources below for more. *) (* A fascinating feature of dependently type based theorem provers is that the same primitive constructs underly the proof language as the programming features. For example, we can write and prove the proposition A and B implies A in raw Gallina *) Definition my_theorem : forall A B, A /\ B -> A := fun A B ab => match ab with | (conj a b) => a end. (* Or we can prove it using tactics. Tactics are a macro language to help build proof terms in a more natural style and automate away some drudgery. *) Theorem my_theorem2 : forall A B, A /\ B -> A. Proof. intros A B ab. destruct ab as [ a b ]. apply a. Qed. (* We can prove easily prove simple polynomial equalities using the automated tactic ring. *) Require Import Ring. Require Import Arith. Theorem simple_poly : forall (x : nat), (x + 1) * (x + 2) = x * x + 3 * x + 2. Proof. intros. ring. Qed. (* Here we prove the closed form for the sum of all numbers 1 to n using induction *) Fixpoint sumn (n : nat) : nat := match n with | 0 => 0 | (S n') => n + (sumn n') end. Theorem sum_formula : forall n, 2 * (sumn n) = (n + 1) * n. Proof. intros. induction n. - reflexivity. (* 0 = 0 base case *) - simpl. ring [IHn]. (* induction step *) Qed. ``` With this we have only scratched the surface of Coq. 
It is a massive ecosystem with many interesting and peculiar topics leading all the way up to modern research.

## Further reading

* [The Coq reference manual](https://coq.inria.fr/refman/)
* [Software Foundations](https://softwarefoundations.cis.upenn.edu/)
* [Certified Programming with Dependent Types](http://adam.chlipala.net/cpdt/)
* [Mathematical Components](https://math-comp.github.io/mcb/)
* [Coq'Art: The Calculus of Inductive Constructions](http://www.cse.chalmers.se/research/group/logic/TypesSS05/resources/coq/CoqArt/)
* [FRAP](http://adam.chlipala.net/frap/)

Bonus: an unedited list of tactics. You’d probably prefer https://pjreddie.com/coq-tactics/

```coq
(*** Tactics ***)

(* Although we won't explain their use in detail, here is a list of common tactics.

 * exact
 * simpl
 * intros
 * apply
 * assert
 * destruct
 * induction
 * reflexivity
 * rewrite
 * inversion
 * injection
 * discriminate
 * fold
 * unfold

 Tacticals
 * try
 * ;
 * repeat

 Automation
 * auto
 * eauto
 * tauto
 * ring
 * ring_simplify
 * psatz
 * lia
 * lra

 Ltac is a logic programming scripting language for tactics.

 From the Tactics chapter of LF:

 intros: move hypotheses/variables from goal to context
 reflexivity: finish the proof (when the goal looks like e = e)
 apply: prove goal using a hypothesis, lemma, or constructor
 apply... in H: apply a hypothesis, lemma, or constructor to a hypothesis in
   the context (forward reasoning)
 apply... with...: explicitly specify values for variables that cannot be
   determined by pattern matching
 simpl: simplify computations in the goal
 simpl in H: ... or a hypothesis
 rewrite: use an equality hypothesis (or lemma) to rewrite the goal
 rewrite ... in H: ... or a hypothesis
 symmetry: changes a goal of the form t=u into u=t
 symmetry in H: changes a hypothesis of the form t=u into u=t
 unfold: replace a defined constant by its right-hand side in the goal
 unfold... in H: ... or a hypothesis
 destruct... as...: case analysis on values of inductively defined types
 destruct... eqn:...: specify the name of an equation to be added to the
   context, recording the result of the case analysis
 induction... as...: induction on values of inductively defined types
 injection: reason by injectivity on equalities between values of inductively
   defined types
 discriminate: reason by disjointness of constructors on equalities between
   values of inductively defined types
 assert (H: e) (or assert (e) as H): introduce a "local lemma" e and call it H
 generalize dependent x: move the variable x (and anything else that depends
   on it) from the context back to an explicit hypothesis in the goal formula
 clear H: delete hypothesis H from the context
 subst x: for a variable x, find an assumption x = e or e = x in the context,
   replace x with e throughout the context and current goal, and clear the
   assumption
 subst: substitute away all assumptions of the form x = e or e = x (where x
   is a variable)
 rename... into...: change the name of a hypothesis in the proof context.
   For example, if the context includes a variable named x, then
   rename x into y will change all occurrences of x to y
 assumption: try to find a hypothesis H in the context that exactly matches
   the goal; if one is found, behave like apply H
 contradiction: try to find a hypothesis H in the current context that is
   logically equivalent to False. If one is found, solve the goal
 constructor: try to find a constructor c (from some Inductive definition in
   the current environment) that can be applied to solve the current goal.
   If one is found, behave like apply c
*)

(* Dependent types. Using dependent types for programming tasks tends to be
   rather unergonomic in Coq. We briefly mention as an advanced topic that
   there exists a more sophisticated match statement that is needed for
   dependently typed programming. See for example the "Convoy" pattern. *)

(*** Other topics ***)

(* Dependently Typed Programming - Most of the above syntax has its
   equivalents in OCaml. Coq also has the capability for full dependently
   typed programming. There is an extended pattern matching syntax available
   for this purpose.

   Extraction - Coq can be extracted to OCaml and Haskell code for their more
   performant runtimes and ecosystems.

   Modules / Typeclasses - Modules and typeclasses are methods for organizing
   code. They allow a different form of overloading than Notation.

   Setoids -

   Termination - Gallina is a total functional programming language. It will
   not allow you to write functions that do not obviously terminate. For
   functions that do terminate but non-obviously, it requires some work to
   get Coq to understand this.

   Coinductive - Coinductive types such as streams are possibly infinite
   values that stay productive. *)
```

I wrote a while back how you can make a pretty nice DSL for reverse mode differentiation based on the same type as Lens. I’d heard some interesting rumblings on the internet around these ideas and so was revisiting them.

```haskell
type Lens s t a b = s -> (a, b -> t)
type AD x dx y dy = x -> (y, dy -> dx)
```

Composition is defined identically for reverse mode just as it is for lens.

After chewing on it a while, I realized this really isn’t that exotic. How it works is that you store the reverse mode computation graph, and all necessary saved data from the forward pass, in the closure of the `(dy -> dx)` function. I also have a suspicion that if you defunctionalized this construction, you’d get the Wengert tape formulation of reverse mode AD.
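As a minimal sketch of that closure-stores-the-forward-pass observation (the names `square` and `compose` here are mine, for illustration, not from any library):

```haskell
-- A minimal sketch of reverse mode AD with the Lens-shaped type.
type AD x dx y dy = x -> (y, dy -> dx)

-- The backward pass closes over the forward value x: exactly the
-- data a Wengert tape would record.
square :: AD Double Double Double Double
square x = (x * x, \dy -> 2 * x * dy)

-- Composition chains the forward passes and pre-composes the backward passes.
compose :: AD y dy z dz -> AD x dx y dy -> AD x dx z dz
compose f g x =
  let (y, dg) = g x
      (z, df) = f y
  in (z, dg . df)

main :: IO ()
main = do
  let (y, back) = compose square square 3.0 -- (x^2)^2 = x^4 at x = 3
  print y          -- 81.0
  print (back 1.0) -- 108.0 = 4 * 3^3
```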

Second, Lens is just a nice structure for bidirectional computation, with one forward pass and one backward pass which may or may not be getting/setting. There are other examples for using it like this.

It is also pretty similar to the standard “dual number” form `type FAD x dx y dy = (x,dx)->(y,dy)` for forward mode AD. We can bring the two closer by a CPS/Yoneda transformation and then some rearrangement.

```haskell
x -> (y, dy -> dx)
==> x -> (y, forall s. (dx -> s) -> (dy -> s))
==> forall s. (x, dx -> s) -> (y, dy -> s)
```

and meet it in the middle with

```haskell
(x,dx) -> (y,dy)
==> forall s. (x, s -> dx) -> (y, s -> dy)
```

I ended the previous post somewhat unsatisfied by how ungainly writing that neural network example was, and I called for Conal Elliott’s compiling to categories plugin as a possible solution. The trouble is piping the weights all over the place. This piping is very frustrating in point-free form, especially when you know it’d be so trivial written pointfully. While the inputs and outputs of layers of the network compose nicely (you no longer need to know about the internal computations), the weights do not. As we get more and more layers, we get more and more weights. The weights are in some sense not as compositional as the inputs and outputs of the layers, or compose in a different way that you need to maintain access to.

I thought of a very slight conceptual twist that may help.

The idea is we keep the weights out to the side in their own little type parameter slots. Then we define composition such that it composes input/outputs while tupling the weights. Basically we throw the repetitive complexity appearing in piping the weights around into the definition of composition itself.

These operations are easily seen as 2 dimensional diagrams.

Here are the core reverse-mode lens AD combinators.

```haskell
import Control.Arrow ((***))

type Lens'' a b = a -> (b, b -> a)

comp :: (b -> (c, (c -> b))) -> (a -> (b, (b -> a))) -> (a -> (c, (c -> a)))
comp f g x = let (b, dg) = g x in
             let (c, df) = f b in
             (c, dg . df)

id' :: Lens'' a a
id' x = (x, id)

relu' :: (Ord a, Num a) => Lens'' a a
relu' = \x -> (frelu x, brelu x) where
        frelu x | x > 0 = x
                | otherwise = 0
        brelu x dy | x > 0 = dy
                   | otherwise = 0

add' :: Num a => Lens'' (a,a) a
add' = \(x,y) -> (x + y, \ds -> (ds, ds))

dup' :: Num a => Lens'' a (a,a)
dup' = \x -> ((x,x), \(dx,dy) -> dx + dy)

sub' :: Num a => Lens'' (a,a) a
sub' = \(x,y) -> (x - y, \ds -> (ds, -ds))

mul' :: Num a => Lens'' (a,a) a
mul' = \(x,y) -> (x * y, \dz -> (dz * y, x * dz))

recip' :: Fractional a => Lens'' a a
recip' = \x -> (recip x, \ds -> - ds / (x * x))

div' :: Fractional a => Lens'' (a,a) a
div' = (\(x,y) -> (x / y, \d -> (d/y, -x*d/(y * y))))

sin' :: Floating a => Lens'' a a
sin' = \x -> (sin x, \dx -> dx * (cos x))

cos' :: Floating a => Lens'' a a
cos' = \x -> (cos x, \dx -> -dx * (sin x))

pow' :: Num a => Integer -> Lens'' a a
pow' n = \x -> (x ^ n, \dx -> (fromInteger n) * dx * x ^ (n-1))

--cmul :: Num a => a -> Lens' a a
--cmul c = lens (* c) (\x -> \dx -> c * dx)

exp' :: Floating a => Lens'' a a
exp' = \x -> let ex = exp x in
             (ex, \dx -> dx * ex)

fst' :: Num b => Lens'' (a,b) a
fst' = (\(a,b) -> (a, \ds -> (ds, 0)))

snd' :: Num a => Lens'' (a,b) b
snd' = (\(a,b) -> (b, \ds -> (0, ds)))

-- some monoidal combinators
swap' :: Lens'' (a,b) (b,a)
swap' = (\(a,b) -> ((b,a), \(db,da) -> (da, db)))

assoc' :: Lens'' ((a,b),c) (a,(b,c))
assoc' = \((a,b),c) -> ((a,(b,c)), \(da,(db,dc)) -> ((da,db),dc))

assoc'' :: Lens'' (a,(b,c)) ((a,b),c)
assoc'' = \(a,(b,c)) -> (((a,b),c), \((da,db),dc) -> (da,(db,dc)))

par' :: Lens'' a b -> Lens'' c d -> Lens'' (a,c) (b,d)
par' l1 l2 = l3 where
    l3 (a,c) = let (b, j1) = l1 a in
               let (d, j2) = l2 c in
               ((b,d), j1 *** j2)

first' :: Lens'' a b -> Lens'' (a, c) (b, c)
first' l = par' l id'

second' :: Lens'' a b -> Lens'' (c, a) (c, b)
second' l = par' id' l

labsorb :: Lens'' ((),a) a
labsorb (_,a) = (a, \a' -> ((),a'))

labsorb' :: Lens'' a ((),a)
labsorb' a = (((),a), \(_,a') -> a')

rabsorb :: Lens'' (a,()) a
rabsorb = comp labsorb swap'
```
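A quick usage sketch of these combinators, redefining just the ones needed so the snippet stands alone (the name `square'` is mine, for illustration):

```haskell
-- Usage sketch of the reverse-mode lens combinators.
type Lens'' a b = a -> (b, b -> a)

comp :: Lens'' b c -> Lens'' a b -> Lens'' a c
comp f g x = let (b, dg) = g x
                 (c, df) = f b
             in (c, dg . df)

mul' :: Num a => Lens'' (a, a) a
mul' (x, y) = (x * y, \dz -> (dz * y, x * dz))

dup' :: Num a => Lens'' a (a, a)
dup' x = ((x, x), \(dx, dy) -> dx + dy)

-- x^2 as "duplicate, then multiply"
square' :: Num a => Lens'' a a
square' = comp mul' dup'

main :: IO ()
main = do
  let (z, grad) = mul' (3.0 :: Double, 4.0)
  print z           -- 12.0
  print (grad 1.0)  -- (4.0,3.0), the two partial derivatives
  let (w, dw) = square' (5.0 :: Double)
  print (w, dw 1.0) -- (25.0,10.0)
```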

And here are the two dimensional combinators. I tried to write them point-free in terms of the combinators above to demonstrate that there is no monkey business going on.

```haskell
type WAD' w w' a b = Lens'' (w,a) (w',b)

type WAD'' w a b = WAD' w () a b -- terminate the weights for a closed network

{- For any monoidal category we can construct this composition? -}

-- horizontal composition
hcompose :: forall w w' w'' w''' a b c. WAD' w' w'' b c -> WAD' w w''' a b -> WAD' (w',w) (w'',w''') a c
hcompose f g = comp f' g' where
       f' :: Lens'' ((w',r),b) ((w'',r),c)
       f' = (first' swap') `comp` assoc'' `comp` (par' id' f) `comp` assoc' `comp` (first' swap')
       g' :: Lens'' ((r,w),a) ((r,w'''),b)
       g' = assoc'' `comp` (par' id' g) `comp` assoc'

rotate :: WAD' w w' a b -> WAD' a b w w'
rotate f = swap' `comp` f `comp` swap'

-- vertical composition of weights
vcompose :: WAD' w' w'' c d -> WAD' w w' a b -> WAD' w w'' (c, a) (d, b)
vcompose f g = rotate (hcompose (rotate f) (rotate g))

-- a double par.
diagpar :: forall w w' a b w'' w''' c d. WAD' w w' a b -> WAD' w'' w''' c d ->
           WAD' (w,w'') (w',w''') (a, c) (b, d)
diagpar f g = t' `comp` (par' f g) `comp` t where
       t :: Lens'' ((w,w''),(a,c)) ((w,a), (w'',c)) -- yikes. just rearrangements.
       t = assoc'' `comp` (second' ((second' swap') `comp` assoc' `comp` swap')) `comp` assoc'
       t' :: Lens'' ((w',b), (w''',d)) ((w',w'''),(b,d)) -- the transpose of t
       t' = assoc'' `comp` (second' (swap' `comp` assoc'' `comp` (second' swap'))) `comp` assoc'

id''' :: WAD' () () a a
id''' = id'

-- rotate :: WAD' w a a w
-- rotate = swap'

liftIO :: Lens'' a b -> WAD' w w a b
liftIO = second'

liftW :: Lens'' w w' -> WAD' w w' a a
liftW = first'

wassoc' = liftW assoc'
wassoc'' = liftW assoc''

labsorb'' :: WAD' ((),w) w a a
labsorb'' = first' labsorb

labsorb''' :: WAD' w ((),w) a a
labsorb''' = first' labsorb'

wswap' :: WAD' (w,w') (w',w) a a
wswap' = first' swap'

-- and so on we can lift all combinators
```

I wonder if this is actually nice?

I asked around and it seems like this idea may be what davidad is talking about when he refers to dioptics

http://events.cs.bham.ac.uk/syco/strings3-syco5/slides/dalrymple.pdf

Perhaps this will initiate a convo.

Edit: He confirms that what I’m doing appears to be a dioptic. Also he gave a better link http://events.cs.bham.ac.uk/syco/strings3-syco5/papers/dalrymple.pdf

He is up to some interesting diagrams

- Does this actually work or help make things any better?
- Recurrent neural nets flip my intended role of weights and inputs.
- Do conv-nets naturally require higher dimensional diagrams?
- This weighty style seems like a good fit for my Gauss-Seidel and iterative LQR solvers. A big problem I hit there was getting all the information to the outside, which is a similar issue to getting the weights around in a neural net.

Geometrical optics is a pretty interesting topic. It really is almost pure geometry/math rather than physics.

Huygens principle says that we can compute the propagation of a wave by considering the wavelets produced by each point on the wavefront separately.

In physical optics, this corresponds to the linear superposition of the waves produced at each point by a propagator function. In geometrical optics, which was Huygens’ original intent I think (these old school guys were VERY geometrical), this is the curious operation of taking the geometrical envelope of the little waves produced by each point.

https://en.wikipedia.org/wiki/Envelope_(mathematics) The envelope is an operation on a family of curves. You can approximate it by a finite difference procedure: take two subsequent curves close together in the family and find their intersection. Keep doing that and then join up all the intersections. This is roughly the approach I took in this post: http://www.philipzucker.com/elm-eikonal-sol-lewitt/

You can describe a geometrical wavefront implicitly with an equation $latex \phi(x_1, y_1) = 0$. Maybe the wavefront is a circle, or a line, or some wacky shape.

The wavelet produced by the point $latex (x_1, y_1)$ after a time $latex t$ is described implicitly by $latex (x_2 - x_1)^2 + (y_2 - y_1)^2 = t^2$.

This describes a family of curves, the circles produced by the different points of the original wavefront. If you take the envelope of this family you get the new wavefront at time t.

How do we do this? Gröbner bases are one way, if we phrase everything as polynomial equations. For today’s purposes, Gröbner bases are a method for solving multivariate polynomial equations. It’s kind of surprising that such a thing even exists. It’s actually a guaranteed terminating algorithm with horrific asymptotic complexity. Sympy has an implementation. For more on Gröbner bases, the links here are useful: http://www.philipzucker.com/dump-of-nonlinear-algebra-algebraic-geometry-notes-good-links-though/. Especially check out the Cox, Little, and O’Shea books.

The algorithm churns on a set of multivariate polynomials and spits out a new set that is equivalent in the sense that the new set is equal to zero if and only if the original set was. However, now (if you ask for the appropriate term ordering) the polynomials are organized in such a way that they have an increasing number of variables in them. So you solve the 1-variable equation (easy), and substitute into the 2-variable equation. Then that is a 1-variable equation, which you solve (easy), and then you substitute into the 3-variable equation, and so on. It’s analogous to Gaussian elimination.

So check this out

```python
from sympy import *

x1, y1, x2, y2, dx, dy = symbols('x1, y1, x2, y2, dx, dy')

def dist(x, y, d):
    return x**2 + y**2 - d**2

e1 = dist(x1, y1, 2)            # the original circle of radius 2
e2 = dist(x1 - x2, y1 - y2, 1)  # the parametrized wavefront after time 1

# The two envelope conditions.
e3 = diff(e1, x1)*dx + diff(e1, y1)*1
e4 = diff(e2, x1)*dx + diff(e2, y1)*1

envelope = groebner([e1, e2, e3, e4], y1, x1, dx, dy, x2, y2, order='lex')[-1]

plot_implicit(e1, show=False)
plot_implicit(envelope, show=True)
```

The envelope conditions can be found by introducing two new differential variables dx and dy. They are constrained to lie tangent to the point on the original circle by the differential equation e3, and then we require that two subsequent members of the curve family intersect by the equation e4. Hence we get the envelope. Asking for the Gröbner basis with that variable ordering gives us an implicit equation for x2 and y2 with no mention of the rest, if we just look at the last equation of the Gröbner basis.

I arbitrarily set dy = 1 because only the direction of (dx, dy) matters, not the overall scale. If you don’t do this, the final equation is scaled homogeneously in dy.

This does indeed show the two new wavefronts at radius 1 and radius 3.

Here’s a different one of a parabola using e1 = y1 - x1 + x1**2

The weird lumpiness here is plot_implicit’s inability to cope, not the actual curve shape. Those funky prongs are from a singularity that forms as the wavefront folds over itself.

I tried using a cubic curve x**3 and the Gröbner basis algorithm seems to crash my computer. 🙁 Perhaps this is unsurprising: https://en.wikipedia.org/wiki/Elliptic_curve ?

I don’t know how to get the wavefront to go in only one direction. As is, it propagates into both the past and the future. Would this require inequalities? Sum of squares optimization perhaps?

Edit:

It’s been suggested on reddit that I’d have better luck using other packages, like Macaulay2, MAGMA, or Singular. Good point.

Also it was suggested I use the Dixon resultant, for which there is an implementation in sympy, albeit hidden. Something to investigate.

https://github.com/sympy/sympy/blob/master/sympy/polys/multivariate_resultants.py

https://nikoleta-v3.github.io/blog/2018/06/05/resultant-theory.html

Another interesting angle might be to try to go numerical with a homotopy continuation method with phcpy

http://homepages.math.uic.edu/~jan/phcpy_doc_html/welcome.html

https://www.semion.io/doc/solving-polynomial-systems-with-phcpy

or pybertini https://ofloveandhate-pybertini.readthedocs.io/en/feature-readthedocs_integration/intro.html

Vectors are dang useful things, and any energy you put into them seems to pay off massive dividends.

Vectors and Linear Algebra are useful for:

- 2D, 3D, 4D geometry stuff. Computer graphics, physics etc.
- Least Squares Fitting
- Solving discretized PDEs
- Quantum Mechanics
- Analysis of Linear Dynamical Systems
- Probabilistic Transition Matrices

There are certain analogies between Haskell Functors and Vectors that correspond to a style of computational vector mathematics that I think is pretty cool and don’t see talked about much.

Due to the expressivity of its type system, Haskell has a first class notion of container that many other languages don’t. In particular, I’m referring to the fact that Haskell has higher kinded types `* -> *` (types parametrized on other types) that you can refer to directly without filling them first. Examples in the standard library include `Maybe`, `[]`, `Identity`, `Const b`, and `Either b`. Much more vector-y feeling examples can be found in Kmett’s linear package: `V0`, `V1`, `V2`, `V3`, `V4`. For example, the 4 dimensional vector type `V4`:

```haskell
data V4 a = V4 a a a a
```

This really isn’t such a strange, esoteric thing as it may appear. You wouldn’t blink an eye at the type

```c
struct V4 {
    double x, y, z, w;
};
```

in some other language. What makes Haskell special is how compositional and generic it is. We can build thousand element structs with ease via composition. What we have here is an alternative to the paradigm of computational vectors ~ arrays. Instead we have computational vectors ~ structs. In principle, I see no reason why this couldn’t be as fast as arrays, although with current compiler expectations it probably isn’t.
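As a small sketch of the struct-as-vector idea, here is a hand-rolled `V4` (standing in for the linear package’s type) where mapping over the struct is just `fmap`:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hand-rolled stand-in for linear's V4, to illustrate
-- computational vectors ~ structs.
data V4 a = V4 a a a a deriving (Show, Eq, Functor)

main :: IO ()
main = print (fmap (* 2) (V4 1 2 3 4 :: V4 Int)) -- V4 2 4 6 8
```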

Monoidal categories are a mathematical structure that models this analogy well. It has been designed by mathematicians for aesthetic elegance, and it seems plausible that following its example leads us to interesting, useful, and pleasant vector combinators. And I personally find something that tickles me about category theory.

So to get started, let’s talk a bit about functors.

Functors in Haskell are a typeclass for containers. They allow you to map functions over all the items in the container. They are related to the categorical notion of functor, which is a mapping between categories.

```haskell
type Container = * -> *  -- Note: This is actually a kind signature.
                         -- Kinds and types are the same thing in Haskell.
```

You can lift the product and sum of types to the product and sum of Functors which you may find in Data.Functor.Product and Data.Functor.Sum. This is analogous to the lifting of ordinary addition and multiplication to the addition and multiplication of polynomials, which are kind of like numbers with a “hole”.

```haskell
newtype Product f g a = Pair (f a , g a)
newtype Sum f g a = Sum (Either (f a) (g a))
```

Functors also compose. A container of containers of `a` is still a container of `a`. We can form composite containers by using the Compose newtype wrapper.

```haskell
newtype Compose f g a = Compose (f (g a))
```

When you use this Compose newtype, instead of having to address the individual elements by using `fmap` twice, a single application of `fmap` will teleport you through both layers of the container.
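For example, a self-contained sketch mirroring `Data.Functor.Compose` from base: one `fmap` reaches through both the list layer and the `Maybe` layer.

```haskell
-- A local Compose, equivalent in shape to Data.Functor.Compose.
newtype Compose f g a = Compose { getCompose :: f (g a) }

instance (Functor f, Functor g) => Functor (Compose f g) where
  fmap h (Compose fga) = Compose (fmap (fmap h) fga)

main :: IO ()
main = print (getCompose (fmap (+ 1) (Compose [Just 1, Nothing, Just 3])))
-- [Just 2,Nothing,Just 4]
```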

`Product`, `Sum`, and `Compose` are all binary operators on functors. The type constructors have kind:

```
{- Enter into ghci -}
:kind Compose ==> (* -> *) -> (* -> *) -> (* -> *)
:kind Product ==> (* -> *) -> (* -> *) -> (* -> *)
:kind Sum     ==> (* -> *) -> (* -> *) -> (* -> *)
```

Some important other functors from the algebra of types perspective are `Const Void a`, `Const () a`, and `Identity a`. These are the identity elements for `Sum`, `Product`, and `Compose` respectively.

You can define mappings between containers that don’t depend on the specifics of their contents. These mappings can only rearrange, copy and forget items of their contained type. This can be enforced at the type level by the polymorphic type signature

```haskell
type f ~> g = forall a. f a -> g a
```

These mappings correspond in categorical terminology to natural transformations between the functors `f` and `g`. There is a category where the objects are Functors and the morphisms are natural transformations. `Sum`, `Product`, and `Compose` all obey the laws necessary to be a monoidal product on this category.
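A runnable sketch of a natural transformation (the name `safeHead` is mine; it is a standard textbook example, not from base): the polymorphic type means it can only rearrange or forget elements, never inspect them.

```haskell
{-# LANGUAGE RankNTypes, TypeOperators #-}

type f ~> g = forall a. f a -> g a

-- A natural transformation from lists to Maybe.
safeHead :: [] ~> Maybe
safeHead []      = Nothing
safeHead (x : _) = Just x

main :: IO ()
main = do
  print (safeHead [1, 2, 3 :: Int]) -- Just 1
  -- The naturality square fmap h . safeHead == safeHead . fmap h holds for free:
  print (fmap show (safeHead "abc") == safeHead (map show "abc")) -- True
```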

How the lifting of functions works for `Compose` is kind of neat.

```haskell
mon_prod :: Functor f' => (f ~> f') -> (g ~> g') -> (Compose f g ~> Compose f' g')
mon_prod ntf ntg (Compose fg) = Compose (fmap ntg (ntf fg))
-- or equivalently Compose (ntf (fmap ntg fg)) with a (Functor f) typeclass requirement.
```

Because the natural transformations require polymorphic types, when you apply `ntf` to `fg`, the polymorphic variable `a` in the type of `ntf` restricts to `a ~ g a'`.

`Product` and `Sum` have a straightforward notion of commutativity (`(a,b)` is isomorphic to `(b,a)`). `Compose` is more subtle. `sequenceA` from the Traversable typeclass can swap the ordering of composition. `sequenceA . sequenceA` may or may not be the identity depending on the functors in question, so it has some flavor of a braiding operation. This is an interesting post on that topic: https://parametricity.com/posts/2015-07-18-braids.html
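For a concrete feel (using only base): `sequenceA` swaps the two layers, and applying it twice need not bring you back where you started.

```haskell
-- sequenceA swaps the order of two composed functors.
main :: IO ()
main = do
  print (sequenceA [Just 1, Just 2 :: Maybe Int]) -- Just [1,2]: [] and Maybe swapped
  print (sequenceA (Just [1, 2 :: Int]))          -- [Just 1,Just 2]: swapped back
  -- Swapping twice is not always the identity:
  print (sequenceA (sequenceA [Just 1, Nothing :: Maybe Int])) -- [Nothing]
```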

Combinators of these sorts arise in at least the following contexts:

- Data types a la carte – A systematic way of building extensible data types
- GHC Generics – A system for building generic functions that operate on data types that can be described with sums, products, recursion, and holes.
- In and around the Lens ecosystem

Also see the interesting post by Russell O’Connor on functor oriented programming: http://r6.ca/blog/20171010T001746Z.html. I think the above is part of that to which he is referring.

Vector spaces are made of two parts, the shape (dimension) of the vector space and the scalar.

Just as a type of kind `* -> *` can be thought of as a container modulo its held type, it can also be a vector modulo its held scalar type. The higher kinded type for vectors gives an explicit slot to place the scalar type.

```haskell
type Vect = * -> *
```

The standard Haskell typeclass hierarchy gives you some of the natural operations on vectors if you so choose to abuse it in that way.

- Functor ~> Scalar Multiplication: `smul s = fmap (* s)`
- Applicative ~> Vector Addition: `vadd x y = (+) <$> x <*> y`
- Traversable ~> Transposition: `sequenceA` has the type of transposition and works correctly for the linear style containers like V4.
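The three readings above can be sketched with a hand-rolled `V2` (the linear package’s `V2` has equivalent instances; `smul` and `vadd` are the illustrative names from the list, not library functions):

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

-- A minimal 2-vector with the instances the list above appeals to.
data V2 a = V2 a a deriving (Show, Eq, Functor, Foldable, Traversable)

instance Applicative V2 where
  pure x = V2 x x
  V2 f g <*> V2 x y = V2 (f x) (g y)

smul :: (Functor f, Num a) => a -> f a -> f a
smul s = fmap (* s)

vadd :: (Applicative f, Num a) => f a -> f a -> f a
vadd x y = (+) <$> x <*> y

main :: IO ()
main = do
  print (smul 2 (V2 1 2 :: V2 Int))                  -- V2 2 4
  print (vadd (V2 1 2) (V2 10 20 :: V2 Int))         -- V2 11 22
  print (sequenceA (V2 (V2 1 2) (V2 3 4 :: V2 Int))) -- V2 (V2 1 3) (V2 2 4), a transpose
```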

The linear library does use `Functor` for scalar multiplication, but defines a special typeclass for addition, `Additive`. I think this is largely for the purpose of bringing `Map`-like vectors into the fold, but I’m not sure.

Once we’ve got the basics down of addition and scalar multiplication, the next thing I want is operations for combining vector spaces. Two important ones are the Kronecker product and the direct sum. In terms of indices, the Kronecker product is a space that is indexed by the cartesian product `(,)` of its input space indices, and the direct sum is a space indexed by the `Either` of its input space indices. Both are very useful constructs. I use the Kronecker product all the time when I want to work on 2D or 3D grids for example. If you’ll excuse my python, here is a toy 2-D finite difference Laplace equation example. We can lift the 1D second derivative matrix $latex K$ to 2D using the Kronecker product: $latex K_2 = K \otimes I + I \otimes K$. The direct sum is useful as a notion of stacking matrices.

```python
import numpy as np
from scipy.linalg import toeplitz  # toeplitz lives in scipy.linalg, not numpy

N = 10  # We're making a 10x10 grid
row = np.zeros(N)
row[0] = -2
row[1] = 1
K = toeplitz(row, row)  # toeplitz makes constant diagonal matrices
I = np.eye(N)           # identity matrix
K2 = np.kron(K, I) + np.kron(I, K)
```

The following is perhaps the most important point of the entire post.

```haskell
type Kron = Compose
type DSum = Product
```

`Compose` of vector functors gives the Kronecker product, and `Product` gives the direct sum (this can be confusing, but it’s right. Remember, the sum in direct sum refers to the *indices*).

We can form the Kronecker product of vectors given a Functor constraint.

```haskell
kron :: (Num a, Functor f, Functor g) => f a -> g a -> Kron f g a
kron f g = Compose $ fmap (\s1 -> fmap (\s2 -> s1 * s2) g) f

dsum :: f a -> g a -> DSum f g a
dsum f g = Pair f g

Notice we have two distinct but related things called kron: `Kron` and `kron`. One operates on vector *spaces* and the other operates on vector *values*.
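On concrete values, `kron` is just the outer product. A self-contained sketch (with local stand-ins for `Compose`, `V2`, and `V3`):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local stand-ins so the snippet runs without the linear package.
newtype Compose f g a = Compose { getCompose :: f (g a) } deriving Functor
data V2 a = V2 a a   deriving (Show, Eq, Functor)
data V3 a = V3 a a a deriving (Show, Eq, Functor)

type Kron = Compose

-- The Kronecker product on values: every pairwise product of entries.
kron :: (Num a, Functor f, Functor g) => f a -> g a -> Kron f g a
kron f g = Compose $ fmap (\s1 -> fmap (\s2 -> s1 * s2) g) f

main :: IO ()
main = print (getCompose (kron (V2 1 2) (V3 10 20 30 :: V3 Int)))
-- V2 (V3 10 20 30) (V3 20 40 60)
```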

Building vector spaces out of small combinators like V2, V4, DSum, Kron is interesting for a number of reasons.

- It is well typed. Similar to Nat indexed vectors, the types specify the size of the vector space. We can easily describe vector spaces as powers of 2 as `V16 = Kron V2 (Kron V2 (Kron V2 (Kron V2 V1)))`, similarly in terms of their prime factors, or we can do a binary expansion (least significant bit first) `V5 = DSum V1 (Kron V2 (DSum V0 (Kron V2 V1)))`, or other things. We do it without going into quasi-dependently typed land or GADTs.
- It often has better semantic meaning. It is nice to say Measurements, or XPosition or something rather than just denote the size of a vector space in terms of a nat. It is better to say a vector space is the Kron of two meaningful vector spaces than to just say it is a space of size m*n. I find it pleasant to think of the naturals as a free Semiring rather than as the Peano Naturals, and I like the size of my vector space defined similarly.
- Interesting opportunities for parallelism. See Conal Elliott’s paper on scans and FFT: http://conal.net/papers/generic-parallel-functional/

In the Vectors as shape methodology, Vectors look very much like Functors.

I have been tempted to lift the natural transformation type above to the following for linear operators.

```haskell
type LinOp f g = forall a. (Num a) => f a -> g a
```

In a sense this works: we could implement `kron` because many of the container types (`V1`, `V2`, `V3`, etc.) in the linear package implement Num. However, choosing Num is a problem. Why not Fractional? Why not Floating? Sometimes we want those. Why not just specifically Double?

We don’t really want to lock away the scalar in a higher rank polymorphic type. We want to ensure that everyone is working in the same scalar type before allowing things to proceed.

```haskell
type LinOp f g a = f a -> g a
```

Note also that this type does not constrain us to linearity. Can we form the Kronecker product of linear operators? Yes, but I’m not in love with it. This is not nearly so beautiful as the little natural transformation dance.

```haskell
kron'' :: (Applicative f', Applicative g', Traversable f, Traversable g') =>
          (f a -> f' a) -> (g a -> g' a) -> (Kron f g a -> Kron f' g' a)
kron'' lf lg (Compose fga) = Compose $ sequenceA $ (fmap lf) $ sequenceA $ (fmap lg fga)
```

This was a nice little head scratcher for me. Follow the types, my friend! I find this particularly true for uses of `sequenceA`. When I find that I want the containers swapped in ordering, `sequenceA` is usually the right call. It could be called `transpose`.

Giving the vector direct access to the scalar feels a bit off to me. I feel like it doesn’t leave enough “room” for compositionality. However, there is another possible definition of morphisms that I think is rather elegant.

```haskell
type LinOp1 f g a = forall k. Additive k => Kron f k a -> Kron g k a
```

Does this form actually enforce linearity? You may still rearrange objects. Great. You can also now add and scalar multiply them with the `Additive k` constraint. We also expose the scalar, so it can be enforced to be consistent.

One other interesting thing to note is that these forms allow nonlinear operations. `fmap`, `liftU2` and `liftI2` are powerful operations, but I think if we restricted `Additive` to just a correctly implemented scalar multiply and vector addition operation, and zero, we’d be good.

```haskell
class Additive' f where
    smul :: Num a => a -> f a -> f a
    vadd :: Num a => f a -> f a -> f a
    zero :: Num a => f a
```
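A quick sketch of what an instance of this restricted class looks like, using a hand-rolled `V2` (the instance definitions here are mine, for illustration):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- The restricted vector-operations class from above.
class Additive' f where
  smul :: Num a => a -> f a -> f a
  vadd :: Num a => f a -> f a -> f a
  zero :: Num a => f a

data V2 a = V2 a a deriving (Show, Eq, Functor)

instance Additive' V2 where
  smul s = fmap (* s)
  vadd (V2 a b) (V2 c d) = V2 (a + c) (b + d)
  zero = V2 0 0

main :: IO ()
main = print (vadd (smul 3 (V2 1 2)) (zero :: V2 Int)) -- V2 3 6
```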

We can recover the previous form by instantiating `k` to `V1`. `V1`, the 1-d vector space, is almost a scalar and can play the scalar’s role in many situations. `V1` is the unit object with respect to the monoidal product `Kron`.

There seems to be a missing Additive instance that would be useful. There is probably a good reason it isn’t there, but I need it.

```haskell
instance (Additive g, Additive f) => Additive (Compose f g) where
    (Compose v) ^+^ (Compose w) = Compose (liftU2 (^+^) v w)
    zero = Compose zero
    liftU2 f (Compose x) (Compose y) = Compose $ liftU2 (liftU2 f) x y
    liftI2 f (Compose x) (Compose y) = Compose $ liftI2 (liftI2 f) x y
```

The above analogy can be put into mathematical terms by noting that both vectors and functors are monoidal categories. I talked quite a bit about monoidal categories in a previous post: http://www.philipzucker.com/a-touch-of-topological-computation-3-categorical-interlude/

Categories are the combo of a collection of objects and arrows between the objects. The arrows can compose as long as the head of one is on the same object as the tail of the other. On every object, there is always an identity arrow, which when composed will do nothing.

We need a little extra spice to turn categories into monoidal categories. One way of thinking about it is that monoidal categories have ordinary category composition and some kind of horizontal composition, putting things side to side. Ordinary composition is often doing something kind of sequentially, applying a sequence of functions, or a sequence of matrices. The horizontal composition is often something parallel feeling, somehow applying the two arrows separately to separate pieces of the system.

There is a funny game category people play where they want to lift ideas from other fields and replace the bits and pieces in such a way that the entire thing is defined in terms of categorical terminology. This is one such example.

A monoid is a binary operation that is associative and has an identity.

Sometimes people are more familiar with the concept of a group. If not, ignore the next sentence. Monoids are like groups without requiring an inverse.

Numbers are separately monoids under addition, multiplication, and minimization (and more), all of which are associative operations with an identity (0, 1, and infinity respectively).

Exponentiation is a binary operation that does not form a monoid, as it isn't associative.
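A quick check of those claims with plain `base` (nothing here is specific to this post):

```haskell
import Data.Monoid (Sum (..), Product (..))

main :: IO ()
main = do
  -- 0 and 1 are the identities of the addition and multiplication monoids
  print $ getSum (mempty <> Sum (5 :: Int))          -- 5
  print $ getProduct (mempty <> Product (5 :: Int))  -- 5
  -- addition is associative; exponentiation is not
  print $ (1 + 2) + 3 == 1 + (2 + 3 :: Int)          -- True
  print $ ((2 :: Integer) ^ 3) ^ 2 == 2 ^ (3 ^ 2)    -- False: 64 /= 512
```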

The Monoid typeclass in Haskell demonstrates this http://hackage.haskell.org/package/base-4.12.0.0/docs/Data-Monoid.html

```haskell
class Semigroup a => Monoid a where
    mempty :: a
    mappend :: a -> a -> a
```

A common example of a monoid is the list, where `mempty` is the empty list and `mappend` appends the lists.
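Concretely (standard `base`):

```haskell
-- Lists form a monoid: mempty is [] and mappend is (++)
main :: IO ()
main = do
  print (mempty :: [Int])                                     -- []
  print ([1, 2] `mappend` [3, 4] :: [Int])                    -- [1,2,3,4]
  -- identity law: mempty is a left (and right) identity
  print $ mempty `mappend` [1, 2, 3] == ([1, 2, 3] :: [Int])  -- True
```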

There are different set-like intuitions for categories. One is that the objects in the category are big opaque sets. This is the case for Hask, Rel and Vect.

A different intuition is that the category itself is like a set, and the objects are the elements of that set. There just so happens to be some extra structure knocking around in there: the morphisms. This is more often the feel for the examples of preorders or graphs. The word "monoidal" means that we have a binary operation on the objects. But in the category theory aesthetic, you also need that binary operation to "play nice" with the morphisms that are hanging around too.

Functors are the first thing that has something like this. A Functor is a map that takes objects to objects and arrows to arrows in a nice way; other properties come along for the ride. A binary functor takes two objects to an object, and two arrows to one arrow, in a way that plays nice (commutes) with arrow composition.
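`(,)` together with `bimap` from `Data.Bifunctor` is one such binary functor on Hask; the comment states the composition law it satisfies (standard `base`, not code from this post):

```haskell
import Data.Bifunctor (bimap)

-- A binary functor sends two arrows to one arrow, and this commutes
-- with composition: bimap (f . g) (h . k) == bimap f h . bimap g k
main :: IO ()
main = do
  let p = (3, 4) :: (Int, Int)
  print $ bimap (+ 1) show p                       -- (4,"4")
  print $ bimap ((+ 1) . (* 2)) show p
       == (bimap (+ 1) id . bimap (* 2) show) p    -- True
```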

String diagrams are a graphical notation for monoidal categories. Again, I discussed this more here.

Morphisms are denoted by boxes. Regular composition is shown by plugging arrows together vertically. Monoidal product is denoted by putting the arrows side to side.

When I was even trying to describe what a monoidal category was, I was already using language evocative of string diagrams.

You can see string diagrams in the documentation for the Arrow library. Many diagrams that people use in various fields can be formalized as the string diagrams for some monoidal category. This is a big chunk of Applied Category Theory.

This is the connection to quantum circuits, which are after all a graphical notation for very Kroneckery linear operations.

```haskell
type Qubit = V2
type C = Complex Double

assoc :: Functor f => (Kron (Kron f g) h) ~> (Kron f (Kron g h))
assoc = Compose . (fmap Compose) . getCompose . getCompose

assoc' :: Functor f => (Kron f (Kron g h)) ~> (Kron (Kron f g) h)
assoc' (Compose x) = Compose $ Compose $ (fmap getCompose x)

kron'' :: (Additive f, Additive g, Additive k, Additive f', Additive g') =>
          LinOp1 f f' a -> LinOp1 g g' a ->
          Kron (Kron f g) k a -> Kron (Kron f' g') k a
kron'' lf lg fgk = let v = assoc fgk in
                   assoc' (Compose $ fmap lg $ getCompose (lf v))

sigx' :: LinOp1 Qubit Qubit C
sigx' (Compose (V2 up down)) = Compose $ V2 down up

sigz' :: LinOp1 Qubit Qubit C
sigz' (Compose (V2 up down)) = Compose $ V2 up ((-1) *^ down)

sigy' :: LinOp1 Qubit Qubit C
sigy' (Compose (V2 up down)) = Compose $ V2 ((-i) *^ down) (i *^ up)
    where i = 0 :+ 1

swap' :: (Traversable f, Applicative g) => LinOp1 (Kron f g) (Kron g f) a
swap' (Compose (Compose fgk)) = Compose $ Compose $ sequenceA fgk

cnot :: LinOp1 (Kron Qubit Qubit) (Kron Qubit Qubit) a
cnot (Compose (Compose (V2 (V2 up1 down1) v))) = Compose $ Compose $ V2 (V2 down1 up1) v

phase :: Double -> LinOp1 Qubit Qubit C
phase phi (Compose (V2 up down)) = Compose $ V2 up ((cis phi) *^ down)

lefting :: (Additive f, Additive k, Additive g) => LinOp1 f g a -> LinOp1 (Kron f k) (Kron g k) a
lefting l = kron'' l id -- Qubit.assoc' . l . Qubit.assoc

righting :: (Additive k, Additive f, Additive g) => LinOp1 f g a -> LinOp1 (Kron k f) (Kron k g) a
righting l = kron'' id l -- (Compose (Compose fkk)) = Compose $ Compose $ (fmap (getCompose . l . Compose) fkk)

example :: LinOp1 (Kron Qubit Qubit) (Kron Qubit Qubit) C
example = (lefting sigx') . (lefting sigy') . (righting sigz') . swap'
```

There is an annoying amount of stupid repetitive bookkeeping with the associative structure of `Kron`. This can hopefully largely be avoided with `coerce`, but I'm not sure. I was having trouble with roles when doing it generically.

- Woof. This post was more draining to write than I expected. I think there is still a lot left to say. Sorry about the editing everyone! Bits and pieces of this post are scattered in this repo
- How would you go about this in other languages? C, Rust, OCaml, C++, Agda
- The discussion of Vect = * -> * is useful for discussion of 2-Vect, coming up next. What if we make vectors of Vect? Wacky shit.
- Metrics and dual vectors. `type Dual f a = f a -> a`. `type Dual1 f a = forall k. Additive k => Kron f k a -> k a`

- Adjunction diagrams have cups and caps. Since we have been using representable functors, they actually have a right adjunction that is tupling with the vector space index type. This gives us something that almost feels like a metric but a weirdly constrained metric.
- LinOp1 form is Yoneda? CPS? The universally quantified k is evocative of `forall c. (a -> c) -> (b -> c)`
- I got the LinOp1 construction from Kitaev https://arxiv.org/abs/cond-mat/0506438 . Presumably there are better references, as this is not the thrust of this massive article.
- Jeremy Gibbons Naperian Functor – Very interesting and related to this post. https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/aplicative.pdf
- Bartosz Milewski – String Diagrams https://www.youtube.com/watch?v=eOdBTqY3-Og
- Heunen and Vicary – Introduction to Categorical Quantum Mechanics. http://www.cs.ox.ac.uk/people/jamie.vicary/IntroductionToCategoricalQuantumMechanics.pdf
- http://hackage.haskell.org/package/linear Kmett’s linear package
- https://graphicallinearalgebra.net/ – Graphical Linear Algebra
- http://www.philipzucker.com/resources-string-diagrams-adjunctions-kan-extensions/
- http://www.philipzucker.com/functor-vector-part-2-function-vectors/
- http://www.philipzucker.com/functor-vector-part-1-functors-basis/
- Baez and Stay. Rosetta Stone. A classic http://math.ucr.edu/home/baez/rosetta.pdf

Containers that are basically big product types are also known as representable, Naperian, or logarithmic. "Representable" places emphasis on the isomorphism between such a container type `f` and the function type `(->) a` for some index type `a`; by the algebra of types, that function type is isomorphic to (`a` copies of the element type). They are called Naperian/logarithmic because there is a relationship similar to exponentiation between the index type `a` and the container type `f`: `(f r) ~ r^a`. If you take `Product f g`, the resulting container is indexed by the sum `Either a b` (a + b), while `Compose f g` is indexed by the product `(a,b)`. The arrow type is written as an exponential `b^a` because, if you have finite enumerable types `a` and `b`, that is the number of possible tabulations of `a -> b`. The Sum of two representable functors is no longer representable; regular logarithms of sums log(f + g) do not have good identities associated with them.

See Gibbons' article. There is a good argument to be made that representable functors are a good match for vectors/well-typed tensor programming.
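To make the exponential analogy concrete, here is a self-contained sketch; the `tabulate`/`index` names echo the `Representable` class in Kmett's `adjunctions` package, but nothing below depends on it:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A length-2 container is representable: Pair a ~ (Bool -> a), i.e. a^2
data Pair a = Pair a a deriving (Show, Eq, Functor)

tabulate :: (Bool -> a) -> Pair a
tabulate f = Pair (f False) (f True)

index :: Pair a -> Bool -> a
index (Pair x _) False = x
index (Pair _ y) True  = y

-- tabulate and index are mutually inverse, witnessing the isomorphism
main :: IO ()
main = print $ tabulate (index (Pair 1 2)) == Pair (1 :: Int) 2  -- True
```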

But note that there is a reasonable interpretation for container types with sum types in them. These can be thought of as subspaces, different bases, or as choices of sparsity patterns. When you define addition, you’ll need to say how these subspaces reconcile with each other.

```haskell
data V2_45 a = XY a a | XY' a a      -- two bases at 45 degrees to each other
data Maybe a = Just a | Nothing      -- a 1-d vector space with a special marker for the zero vector
data Maybe2 a = Just2 a a | Nothing2 -- a 2-d vector space with a special zero marker
```

Hask is a name for the category that has Haskell types as objects and Haskell functions as morphisms.

Note that it's a curious mixing of the type/value layers of Haskell. The objects are types, whereas the function morphisms are Haskell values. Composition is given by `(.)` and the identity morphisms are given by `id`.
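The category laws are easy to spot-check at concrete values:

```haskell
-- id is an identity for (.), and (.) is associative
main :: IO ()
main = do
  let f = (+ 1)      :: Int -> Int
      g = (* 2)      :: Int -> Int
      h = subtract 3 :: Int -> Int
  print $ (id . f) 10 == f 10 && (f . id) 10 == f 10  -- True
  print $ ((f . g) . h) 10 == (f . (g . h)) 10        -- True
```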

For Haskell, you can compose functions, but you can also smash functions together side by side. These combinators are held in Control.Arrow.

You can smash together types with the tuple `(,)` or with `Either`. Both of these are binary operators on types. The corresponding mappings on morphisms are given by

```haskell
(***) :: Arrow a       => a b c -> a b' c' -> a (b, b') (c, c')
(+++) :: ArrowChoice a => a b c -> a b' c' -> a (Either b b') (Either c c')
```
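Specialized to ordinary functions (the `(->)` instances of `Arrow` and `ArrowChoice`), they behave like this:

```haskell
import Control.Arrow ((***), (+++))

main :: IO ()
main = do
  -- (***) applies two functions to the two components of a tuple
  print $ ((+ 1) *** not) ((3, True) :: (Int, Bool))      -- (4,False)
  -- (+++) applies one of two functions, depending on the Either tag
  print $ ((+ 1) +++ not) (Left 3 :: Either Int Bool)     -- Left 4
  print $ ((+ 1) +++ not) (Right True :: Either Int Bool) -- Right False
```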

These are binary operators on morphisms that play nice with the composition structure of Haskell.

A monoidal category also has unit objects. This is given by the Identity functor

```haskell
rightUnitor :: Functor f => Compose f Identity a -> f a
rightUnitor (Compose f) = fmap runIdentity f

rightUnitor' :: Functor f => f ~> Compose f Identity
rightUnitor' = Compose . fmap Identity

leftUnitor' :: f ~> Compose Identity f
leftUnitor' = Compose . Identity

leftUnitor :: Identity *** f ~> f
leftUnitor = runIdentity . getCompose
```

There is also a sense of associativity. It is just newtype rearrangement, so it can also be achieved with a `coerce` (although not polymorphically?).

```haskell
assoc :: Functor f => ((f *** g) *** h) ~> (f *** (g *** h))
assoc = Compose . (fmap Compose) . getCompose . getCompose

assoc' :: Functor f => (f *** (g *** h)) ~> ((f *** g) *** h)
assoc' (Compose x) = Compose $ Compose $ (fmap getCompose x)
```

Similarly, we can define a monoidal category structure using Product or Sum instead of Compose.

These are all actually just newtype rearrangements, so they should all just be instances of `coerce`, but I couldn't get the roles to go through generically.
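For what it's worth, `coerce` does go through at a monomorphic type, where GHC can check the representational roles; it's the polymorphic version that the roles obstruct. A small sketch using standard `base` (the `rightUnitorList` name is mine):

```haskell
import Data.Coerce (coerce)
import Data.Functor.Compose (Compose (..))
import Data.Functor.Identity (Identity (..))

-- the right unitor at a concrete type, as a zero-cost coercion
rightUnitorList :: Compose [] Identity Int -> [Int]
rightUnitorList = coerce

main :: IO ()
main = print $ rightUnitorList (Compose [Identity 1, Identity 2])  -- [1,2]
```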