The post Approximating Compiling to Categories using Type-level Haskell: Take 2 appeared first on Hey There Buddo!.

Summary: I’m trying to use typelevel programming in Haskell to achieve some of the aims of Conal Elliott’s compiling to categories GHC plugin. The types of highly polymorphic tuple functions are enough to specify the implementation. We aren’t going to be able to piggy-back off of GHC optimizations (a huge downside), but we can reify lambdas into other categories and avoid the scariness of plugins.

The current implementation github source is here

JESUS CHRIST.

http://okmij.org/ftp/Haskell/de-typechecker.lhs

Of course, Oleg already did it. This is a file where he builds the implementation of a polymorphic function from the type signature. Instead of tuples, he is focusing on higher order functions with deep nesting of (->).

The trick that I was missing is in the IsFunction typeclass at the very end, which is only achievable with an Incoherent instance.

I would never have had the courage to use an Incoherent instance if I hadn’t seen a higher authority do it first. It has turned out in my typelevel journey that many instances that I’ve been tempted to make overlapping or incoherent don’t actually have to be, especially with the availability of closed type families. I think you really do need Incoherent in this application because type variables stay polymorphic forever.

To the best of my knowledge, if you need to differentiate between a tuple type (a,b) and an uninstantiated polymorphic value a’ like we do when deconstructing the input type of our lambda, you need to use an Incoherent instance. Since a’ could hypothetically eventually be unified to some (a,b) we should not be able to do different things for the two cases without stepping outside the usual laws of typeclass resolution.

New features of the implementation:

- The new implementation does not require the V annotation of the input like the previous version did, thanks to the Incoherent instance trick mentioned above. This is super cool because now I can throw the stock Prelude.fst into toCcc.
- I changed how the categorical implementation is built, such that it short circuits with an ‘id’ if a large structure is needed from the input, rather than deconstructing all the way to every piece of the input. Lots of other optimizations would be nice (vital?), but it is a start.
- I also implemented a FreeCat GADT so that we can see the implementation in ghci.
- I switched from using Data.Proxy to the TypeApplications extension, which is a huge readability win.
- I added a binary application operator binApp, which is useful for encapsulating categorical literals as infix operators into your lambda expressions.
- General cleanup, renaming, and refactoring into library files.

A couple typelevel tricks and comments:

You often have to make helper typeclasses, so don’t be afraid to. If you want something like an if-then-else in your recursion, it is very likely you need a form of the typeclass that has slots for 'True or 'False to help you pick the instance.

If possible, you often want the form

(a~Something) => MyClass 'True a

rather than

MyClass 'True Something

The type inference tends to be better.

Here are some usage examples of toCcc.

example6 = toCcc (\x -> x) 'a'
example7 = toCcc @FreeCat (\(x, y) -> x)
example8 = toCcc @FreeCat (\(x, y) -> y)
example8andahalf = toCcc' (Proxy :: Proxy FreeCat) (\(x, y) -> y)
example9 = toCcc @FreeCat (\(x, y) -> (y,x))
example10 = toCcc @FreeCat (\(( x, z), y) -> (y,x))
swappo = toCcc @FreeCat $ \((x, z), y) -> (x,(z,y))
example11 = toCcc @(->) $ \(x,y) -> binApp addC x y
example12 = toCcc @(->) $ \(x,y) -> App negateC x

-- infix synonyms
(+++) = binApp addC
(***) = binApp mulC
example13 = toCcc @(->) $ \(x,(y,z)) -> x +++ (y *** z)

My partially implemented version of some of Conal’s category typeclasses. Should I switch over to using the constrained-categories package?

{-# LANGUAGE GADTs, StandaloneDeriving, NoImplicitPrelude #-}
module Cat where

import Control.Category
import Prelude hiding ((.))

class Category k => Monoidal k where
    par :: k a c -> k b d -> k (a,b) (c,d)

class Monoidal k => Cartesian k where
    fst :: k (a,b) a
    snd :: k (a,b) b
    dup :: k a (a,a)

fan f g = (par f g) . dup

data FreeCat a b where
    Comp :: FreeCat b c -> FreeCat a b -> FreeCat a c
    Id :: FreeCat a a
    Fst :: FreeCat (a,b) a
    Snd :: FreeCat (a,b) b
    Dup :: FreeCat a (a,a)
    Par :: FreeCat a b -> FreeCat c d -> FreeCat (a,c) (b,d)

deriving instance Show (FreeCat a b)

instance Category FreeCat where
    (.) = Comp
    id = Id

instance Monoidal FreeCat where
    par = Par

instance Cartesian FreeCat where
    fst = Fst
    snd = Snd
    dup = Dup

instance Monoidal (->) where
    par f g = \(x,y) -> (f x, g y)

instance Cartesian (->) where
    fst (x,y) = x
    snd (x,y) = y
    dup x = (x,x)

class NumCat k where
    mulC :: Num a => k (a,a) a
    negateC :: Num a => k a a
    addC :: Num a => k (a,a) a

instance NumCat (->) where
    mulC = uncurry (*)
    negateC = negate
    addC = uncurry (+)

The actual implementation of toCcc

{-# LANGUAGE DataKinds, AllowAmbiguousTypes, TypeFamilies, TypeOperators,
             MultiParamTypeClasses, FunctionalDependencies, PolyKinds,
             FlexibleInstances, UndecidableInstances, TypeApplications,
             NoImplicitPrelude, ScopedTypeVariables #-}
module CCC (
      CCC
    , App (App)
    , binApp
    , toCcc'
    , toCcc
    ) where

import Cat
import Data.Proxy
import Data.Type.Equality (type (==))
import Prelude (Bool(..))
import Control.Category

-- The unfortunate Incoherent instances I need to force polymorphic values
class IsTup a b | a -> b
instance {-# INCOHERENT #-} (c ~ 'True) => IsTup (a,b) c
instance {-# INCOHERENT #-} (b ~ 'False) => IsTup a b

data Leaf n = Leaf

data Z
data S a

data App f a = App f a

f $$ x = App f x
binApp f a b = App f (a,b)

class Cartesian k => CCC k a a' b' | a a' -> b' where
    toCcc :: a -> k a' b'

instance (Tag a, Build k a b a' b', Cartesian k) => CCC k (a->b) a' b' where
    toCcc f = build @k @a @b @a' @b' res where -- build (Proxy :: Proxy labels) (Proxy :: Proxy b) res where
        res = f val

toCcc' :: CCC k f a' b' => Proxy k -> f -> k a' b'
toCcc' _ f = toCcc f

class Tag a where
    val :: a

instance (IsTup a flag, Tag' a Z flag n) => Tag a where
    val = val' @a @Z @flag

class Tag' a n (flag :: Bool) n'| a n flag -> n' where
    val' :: a

instance (IsTup a flaga, IsTup b flagb, Tag' a n flaga n'', Tag' b n'' flagb n') => Tag' (a,b) n 'True n' where
    val' = (val' @a @n @flaga, val' @b @n'' @flagb)

instance (a ~ Leaf n) => Tag' a n 'False (S n) where
    val' = Leaf

type family Or a b where
    Or 'True b = 'True
    Or a 'True = 'True
    Or 'False 'False = 'False

class In a b flag | a b -> flag where
instance ( ((a,b) == c) ~ isc, flag' ~ Or flaga flagb, flag ~ Or flag' isc, In a c flaga, In b c flagb) => In (a,b) c flag
instance ((Leaf n == c) ~ flag) => In (Leaf n) c flag

class Build k input key a' b' | input key a' -> b' where
    build :: Cartesian k => key -> k a' b'

instance ( iseq ~ ((a,b) == key), In a key isinleft, In b key isinright, Cond k iseq isinleft isinright (a,b) key a' b' ) => Build k (a,b) key a' b' where
    build key = cond @k @iseq @isinleft @isinright @(a,b) @key @a' key

instance (Leaf n ~ b, a' ~ b') => Build k (Leaf n) b a' b' where
    build _ = id

class Cond k iseq isinleft isinright input key a b | iseq isinleft isinright input key a -> b where
    cond :: Cartesian k => key -> k a b

-- Find the key is in the input
instance (a ~ b) => Cond k 'True x x input key a b where
    cond _ = id

instance (Build k a key a' c', (a',b') ~ ab) => Cond k 'False 'True x (a,b) key ab c' where -- get those input types inferred baby!
    cond key = (build @k @a @key @a' key) . fst

instance (Build k b key b' c', (a',b') ~ ab) => Cond k 'False 'False 'True (a,b) key ab c' where
    cond key = (build @k @b @key @b' key) . snd

-- Otherwise destruct on the key
instance (Build k input key1 a' c', Build k input key2 a' d') => Cond k 'False 'False 'False input (key1,key2) a' (c',d') where
    cond (key1,key2) = fan (build @k @input @key1 @a' key1) (build @k @input @key2 @a' key2)

instance (Build k input key a' b', f ~ k b' c') => Cond k 'False 'False 'False input (App f key) a' c' where
    cond (App f key) = f . (build @k @input @key @a' key)

-- Could I replace almost everything with App? A very powerful construct
-- This is a of some relation to defunctionalization like in HList
-- Maybe I should build a typelevel FreeCat and then do compilation passes on it

{-
type family (StripLeaf a) where
    StripLeaf (a,b) = (StripLeaf a, StripLeaf b)
    StripLeaf (Leaf n a) = a
-}

The post Variational Method of the Quantum Simple Harmonic Oscillator using PyTorch appeared first on Hey There Buddo!.

I’ve tried two versions, using a stock neural network with relus and making it a bit easier by giving a gaussian with variable width and shift.

We can mimic the probability constraint by dividing by the total normalization $\int \psi^2\,dx$. A Lagrange multiplier or penalty method may allow us to access higher wavefunctions.

SGD seems to do a better job getting a rounder gaussian, while Adam is less finicky but makes a harsh triangular wavefunction.

The ground state solution of $-\frac{1}{2}\frac{d^2\psi}{dx^2} + \frac{1}{2}x^2\psi = E\psi$ is $\psi = e^{-x^2/2}$, with an energy of 1/2 (unless I botched up my head math). We may not get it, because we’re not sampling a very good total domain. Whatever, for further investigation.
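As a sanity check on that head math, here is a minimal NumPy sketch that evaluates the energy of a Gaussian trial function $e^{-a x^2}$ on a uniform grid and scans over the width. It assumes the Hamiltonian $-\frac{1}{2}\partial_x^2 + \frac{1}{2}x^2$; the function name `trial_energy`, the parameter `a`, and the grid limits are made up for illustration and are not part of the post's code.

```python
import numpy as np

def trial_energy(a, half_width=10.0, n=20001):
    """<psi|H|psi> / <psi|psi> for psi = exp(-a x^2), H = -1/2 d^2/dx^2 + 1/2 x^2."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    psi = np.exp(-a * x**2)
    dpsi = np.gradient(psi, dx)                   # finite-difference derivative
    kinetic = 0.5 * np.sum(dpsi**2) * dx          # after integrating by parts
    potential = 0.5 * np.sum(x**2 * psi**2) * dx
    norm = np.sum(psi**2) * dx
    return (kinetic + potential) / norm

# Scan trial widths and keep the one with the lowest energy
widths = np.linspace(0.1, 2.0, 39)
energies = [trial_energy(a) for a in widths]
best = widths[int(np.argmin(energies))]
print(best, min(energies))
```

For this trial family the energy works out analytically to $a/2 + 1/(8a)$, which is smallest at $a = 1/2$ with value 1/2, so the scan should land there.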

Very intriguing is that pytorch has a determinant in it, I believe. That opens up the possibility of doing a Hartree-Fock style variational solution.

Here is my garbage

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim
from scipy import linalg
import time
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Psi(nn.Module):
    def __init__(self):
        super(Psi, self).__init__()
        # an affine operation: y = Wx + b
        self.lin1 = nn.Linear(1, 10) #Takes x to the 10 different hats
        self.lin2 = nn.Linear(10, 1) #add up the hats.
        #self.lin1 = nn.Linear(1, 1) #Takes x to the 10 different hats
        #self.lin2 = nn.Linear(2, 1) #add up the hats.

    def forward(self, x):
        # one hidden layer of relus, summed by a linear layer
        shifts = self.lin1(x)
        hats = F.relu(shifts) #hat(shifts)hat(shifts)
        y = self.lin2(hats)
        #y = torch.exp(- shifts ** 2 / 4)
        return y

#a = torch.ones(10, requires_grad=True)
#z = torch.linspace(0, 1, steps=10)
# batch variable for monte carlo integration
x = torch.randn(10000,1, requires_grad=True) # 10000 standard normal sample points
psi = Psi()
y = psi(x)

# create your optimizer
optimizer = optim.SGD(psi.parameters(), lr=0.0001, momentum=0.9, nesterov=True) #optim.Adam(psi.parameters())
#optim.Adam(psi.parameters())
#y2 = torch.sin(np.pi*x)
#print(y)
#x2 = x.clone()
plt.scatter(x.detach().numpy(),y.detach().numpy(), label="original")
#plt.scatter(x.detach().numpy(),x.grad.detach().numpy())
scalar = torch.ones(1,1)
for i in range(4000):
    #print(x.requires_grad)
    #y.backward(torch.ones(100,1), create_graph=True)
    x = torch.randn(1000,1, requires_grad=True) # 1000 fresh standard normal sample points
    y = psi(x)
    psi.zero_grad()
    # fills x.grad with dpsi/dx; create_graph lets us differentiate through it later
    y.backward(torch.ones(1000,1), create_graph=True)
    #torch.autograd.backward(torch.sum(y), grad_tensors=x)
    #print(x.grad)
    #print(dir(x))
    #x.grad ** 2 +
    E = torch.sum(x.grad ** 2 + x**2 * y**2)#+ 10*(psi(scalar*0)**2 + psi(scalar)**2)
    N = torch.sum(y ** 2)
    L = E/N
    print(L)
    psi.zero_grad()
    L.backward()
    optimizer.step()

for param in psi.parameters():
    print(param)
    print(param.grad)
plt.scatter(x.detach().numpy(),y.detach().numpy(), label="new")
plt.legend()
#plt.scatter(x.detach().numpy(),x.grad.detach().numpy())
plt.show()
# may want to use the current wavefunction for gibbs style sampling
# we need to differentiate with respect to x for the kinetic energy

Edit: Hmm, I didn’t compensate for the fact I was using randn sampling. That was a goof. I started using uniform sampling, which doesn’t need compensation.
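To make the compensation concrete, here is a small sketch of Monte Carlo integration with normal versus uniform samples. It is not the post's code: the integrand, the sample count, and the interval half-width `L` are illustrative choices. With normally distributed points each term has to be divided by the sampling density; with uniform points the density is a constant that only rescales the sum.

```python
import numpy as np

rng = np.random.default_rng(0)

def integrand(x):
    # example integrand: psi(x)^2 for psi = exp(-x^2 / 2)
    return np.exp(-x**2)

n = 200_000

# Normal samples: weight each term by 1 / (sampling density)
xs = rng.standard_normal(n)
density = np.exp(-xs**2 / 2) / np.sqrt(2 * np.pi)
normal_estimate = np.mean(integrand(xs) / density)

# Uniform samples on [-L, L]: the density is the constant 1 / (2L)
L = 6.0
xu = rng.uniform(-L, L, n)
uniform_estimate = 2 * L * np.mean(integrand(xu))

# Both should approximate the exact integral, sqrt(pi)
print(normal_estimate, uniform_estimate, np.sqrt(np.pi))
```

In a ratio like E/N the constant factor from uniform sampling cancels between numerator and denominator, which is why no explicit correction is needed in that case.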

The post Shit Compiling to Categories using Type level Programming in Haskell appeared first on Hey There Buddo!.

Conal Elliott’s approach of using a GHC plugin is better for a couple of reasons. One really important thing is that he gets to piggy-back on GHC optimizations for the lambdas. I have only implemented a very bad evaluation strategy. Perhaps we could recognize shared subexpressions and stuff, but it is more work. I seem somewhat restricted in what I can do and sometimes type inference needs some help. Not great. However, GHC plugins definitely bring their own difficulties.

What I’ve got I think still has roots in my thinking from this previous post

There are a couple insights that power this implementation

- A fully polymorphic tuple to tuple converter function is uniquely defined by its type. For example, swap :: (a,b) -> (b,a), id :: a -> a, fst :: (a,b) -> a, snd :: (a,b) -> b are all unique functions due to the restrictions of polymorphism. Thus typelevel programming can build the implementation from the type.
- Getting a typeclass definition to notice polymorphism is hard though. I haven’t figured out how to do it, if it is even possible. We get around it by explicitly newtyping every pattern match on a polymorphic variable like so \(V x, V y) -> (y,x). Two extra characters per variable. Not too shabby.
- You can abuse the type unification system as a substitution mechanism. Similar to HOAS, you can make a lambda interpreter at the type level that uses polymorphism as a way of generating labels for variables. This is probably related to Oleg Kiselyov’s type level lambda calculus, but his kind of confuses me http://okmij.org/ftp/Computation/lambda-calc.html#haskell-type-level
- You can inject a categorical literal morphism using a wrapping type to be extracted later using an apply operator and type App f a. An infix ($$) makes it feel better.

class Eval e r | e -> r

data App f b
newtype V a = V a
type Lit a = V a
type Lam a b = a -> b -- Let's borrow arrow for Lambda

instance (Eval b d, a ~ c) => Eval (App (a -> b) c) d
instance (Eval b c) => Eval (a -> b) (a -> c) -- Evaluate inside the body or stop?
instance Eval (Lit a) (Lit a)

So here is the rough structure of what the typelevel programming is doing

You can do a depth first traversal of the input tuple structure, when you hit V, unify the interior type with a Nat labelled Leaf. At the same time, you can build up the actual value of the structure so that you can apply the lambda to it and get the output which will hold a tree that has morphism literals you want.

Then inspect the output type of the lambda which now has Nat labels, and traverse the labelled tree again to find the appropriate sequence of fst and snd to extract what you need. If you run into an application App, you can pull the literal out now and continue traversing down.
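The two passes described above can be mimicked at the value level. The following Python sketch is only an analogue, not the type-level Haskell: it labels the input leaves, applies the lambda to the labelled structure, and then walks the output to emit a point-free expression as a string. All names (`Leaf`, `tag`, `build`, `to_ccc`) are hypothetical, and it finds leaves with a simple containment search rather than by comparing Nat labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    n: int  # label assigned during the depth-first traversal

def tag(shape, n=0):
    """Replace every non-tuple slot of a nested-pair shape with a fresh Leaf."""
    if isinstance(shape, tuple):
        left, n = tag(shape[0], n)
        right, n = tag(shape[1], n)
        return (left, right), n
    return Leaf(n), n + 1

def contains(tree, key):
    if tree == key:
        return True
    return isinstance(tree, tuple) and any(contains(t, key) for t in tree)

def compose(f, g):
    # drop redundant identities so the output stays readable
    return g if f == "id" else f"{f} . {g}"

def build(inp, key):
    """Point-free expression (as a string) that extracts `key` from `inp`."""
    if inp == key:
        return "id"
    if isinstance(inp, tuple):
        if contains(inp[0], key):
            return compose(build(inp[0], key), "fst")
        if contains(inp[1], key):
            return compose(build(inp[1], key), "snd")
    if isinstance(key, tuple):
        # the key is not in the input as a whole, so build each half separately
        return f"fan({build(inp, key[0])}, {build(inp, key[1])})"
    raise ValueError("output mentions something that is not in the input")

def to_ccc(f, shape):
    tagged, _ = tag(shape)
    return build(tagged, f(tagged))

print(to_ccc(lambda p: (p[1], p[0]), (None, None)))
print(to_ccc(lambda p: (p[1], p[0][0]), ((None, None), None)))
```

The shape argument stands in for what the Haskell version infers from the lambda's type, which is exactly the part Python cannot do for us.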

At the moment I’ve only tested with the (->) Category, which is a bit like demonstrating a teleporter by deconstructing a guy and putting back in the same place, but I think it will work with others. We’ll see. I see no reason why not.

At the moment, I’m having trouble getting GHC to not freak out unless you explicitly state what category you’re working in, or explicitly state that you’re polymorphic in the Category k.

Some future work thoughts: My typelevel code is completely awful spaghetti written like I’m a trapped animal. It needs some pruning. I think all those uses of Proxy can be cleaned up by using TypeApplications. I need to implement some more categories. Should I conform to the constrained-categories package? Could I use some kind of hash consing to find shared structures? Could Generic or Generic1 help us autoplace V or locate polymorphism? Maybe a little Template Haskell spice to inject V?

Here are the most relevant bits, with my WIP git repository here

{-# LANGUAGE FunctionalDependencies, FlexibleInstances, GADTs, DataKinds,
             TypeOperators, KindSignatures, PolyKinds, FlexibleContexts,
             UndecidableInstances, ScopedTypeVariables, NoImplicitPrelude #-}
module Main where

import Lib
import GHC.TypeLits
import Data.Proxy
import Prelude hiding (id, fst, snd, (.))

main :: IO ()
main = someFunc

-- We will use these wrappers to know when we've hit polymorphism
newtype V a = V a

data Z
data S a

data Leaf n a = Leaf
data Node n a b = Node a b

ccc' :: Top a b c k => Proxy k -> a -> k b c
ccc' _ f = ccc f

class Tag a b c d mono | a b mono -> d c where
    val :: Proxy a -> Proxy b -> Proxy mono -> a

instance (Tag a n n'' r1 a', Tag b n'' n' r2 b', (a', b') ~ q) => Tag (a, b) n n' (Node n'' r1 r2) q where
    val _ _ _ = (val (Proxy :: Proxy a) (Proxy :: Proxy n) (Proxy :: Proxy a'), val (Proxy :: Proxy b) (Proxy :: Proxy n'') (Proxy :: Proxy b'))

instance (a ~ Leaf n a') => Tag (V a) n (S n) (Leaf n a') a' where
    val _ _ _ = V Leaf

class CartesianCategory k => Top a b c k | a b -> c where
    ccc :: a -> k b c

instance (Tag a Z n labels c, Build labels b c d k, CartesianCategory k) => Top (a->b) c d k where
    ccc f = build (Proxy :: Proxy labels) (Proxy :: Proxy b) res where
        res = f (val (Proxy :: Proxy a) (Proxy :: Proxy Z) (Proxy :: Proxy c))

fan f g = (par f g) . dup

-- Once you've labelled, traverse the output type and extract those pieces
-- and put them together
class CartesianCategory k => Build labels b c d k | labels b -> c d where
    build :: Proxy labels -> Proxy b -> b -> k c d

instance (Build labels b i o1 k, Build labels c i o2 k) => Build labels (b,c) i (o1,o2) k where
    build pl pbc (x,y) = fan (build pl (Proxy :: Proxy b) x) (build pl (Proxy :: Proxy c) y)

instance (Extract labels n a b, CartesianCategory k) => Build labels (Leaf n c) a b k where
    build pl pb _ = extract pl (Proxy :: Proxy n)

instance (Build labels c a b k, CartesianCategory k) => Build labels (App (k b d) c) a d k where
    build pl pb (App f x) = f . (build pl (Proxy :: Proxy c) x)

class StripN a b | a -> b
instance (StripN a a', StripN b b') => StripN (Node n a b) (a',b')
instance StripN (Leaf n a) a

-- Builds the extractor function
class Extract a n d r | a n -> d r where
    extract :: CartesianCategory k => Proxy a -> Proxy n -> k d r

instance (LT n n' gt,              -- which one is greater
          StripN (Node n' a b) ab,
          FstSnd gt ab r1,         -- get value level rep of this
          ITE gt a b c,            -- Select to go down branch
          Extract c n r1 r)        -- recurse
         => Extract (Node n' a b) n ab r where
    extract _ p = (extract (Proxy :: Proxy c) p) . (fstsnd (Proxy :: Proxy gt))

instance Extract (Leaf n a) n a a where
    extract _ _ = id

arrccc :: (Top a b c (->)) => a -> b -> c
arrccc = ccc' (Proxy :: Proxy (->))

-- applying the category let's us imply arrow
example6 = ccc (\(V x) -> x) 'a'
--example7 :: CartesianCategory k => k _ _
example7 = arrccc (\(V x, V y) -> x) -- ('a','b')
example8 = arrccc (\(V x, V y) -> y) -- ('a','b')
example9 = arrccc (\(V x, V y) -> (y,x)) -- ('a','b')
example10 = arrccc (\((V x,V z), V y) -> (y,x)) -- ((1,'b'),'c')
swappo = arrccc $ \((V x,V z), V y) -> (x,(z,y))

class FstSnd a d r | a d -> r where
    fstsnd :: CartesianCategory k => Proxy a -> k d r
instance FstSnd 'True (a,b) a where
    fstsnd _ = fst
instance FstSnd 'False (a,b) b where
    fstsnd _ = snd

class Fst a b | a -> b
instance Fst (a,b) a
class Snd a b | a -> b
instance Snd (a,b) b

class ITE a b c d | a b c -> d
instance ITE 'True a b a
instance ITE 'False a b b

class GT a b c | a b -> c
instance GT a b d => GT (S a) (S b) d
instance GT Z (S a) 'False
instance GT (S a) Z 'True
instance GT Z Z 'False

class LT a b c | a b -> c
instance LT a b d => LT (S a) (S b) d
instance LT Z (S a) 'True
instance LT (S a) Z 'False
instance LT Z Z 'False

-- For external function application
data App f a = App f a
f $$ x = App f x

plus :: (Int, Int) -> Int
plus (x,y) = x + y
plus' (x,y) = x + y
inc :: Int -> Int
inc = (+ 1)

--example11 = ccc (\(x,y) -> App plus (x,y))
example11 = arrccc (\(V x) -> App inc x) -- $ (1 :: Int)
example12 = arrccc (\(V x,V y) -> plus $$ (x,y))
example13 = arrccc (\(V x,V y) -> inc $$ (plus $$ (x,y)))
example14 :: Num a => (a,a) -> a -- Without this annotation it inferred Integer? Monomorphization?
example14 = ccc (\(V x,V y) -> plus' $$ (x,y))

class CartesianCategory k where
    (.) :: k b c -> k a b -> k a c
    id :: k a a
    fst :: k (a,b) a
    snd :: k (a,b) b
    dup :: k a (a,a)
    par :: k a c -> k b d -> k (a,b) (c,d)

instance CartesianCategory (->) where
    id = \x -> x
    fst (x,y) = x
    snd (x,y) = y
    dup x =(x,x)
    f . g = \x -> f (g x)
    par f g = \(x,y) -> (f x, g y)

The post Deducing Row Sparse Matrices from Matrix Vector Product Samples appeared first on Hey There Buddo!.

If every row of a matrix has at most N nonzero entries (in known positions), you can back out that matrix from N matrix-vector samples of it. You have many choices for the possible sampling vectors. Random works well.

The unknown in this case is the row of the matrix, represented in green. We put a known set of inputs into it and get outputs. Each row of the output, represented in red, can tell us the matrix row. We have to invert the matrix that consists of all the sample elements that the nonzero elements of that row touch, represented in blue. That black T is a transpose. To turn those row vectors into column vectors, we have to transpose everything.
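Before specializing to banded matrices, the same idea can be sketched for an arbitrary known sparsity pattern. This is an illustration under assumptions, not the post's code: the test matrix, the `support` list of known column positions, and the sizes `n` and `k` are all invented here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3                      # matrix size, nonzeros per row

# Hypothetical test matrix: each row has k nonzeros at known column positions
support = [np.sort(rng.choice(n, size=k, replace=False)) for _ in range(n)]
A = np.zeros((n, n))
for i, cols in enumerate(support):
    A[i, cols] = rng.standard_normal(k)

samples = rng.standard_normal((n, k))   # k random sampling vectors, as columns
y = A @ samples                         # the only thing we get to observe

recovered = np.zeros_like(A)
for i, cols in enumerate(support):
    # y[i, :] = A[i, cols] @ samples[cols, :], so solve the transposed k x k system
    recovered[i, cols] = np.linalg.solve(samples[cols, :].T, y[i, :])

print(np.allclose(recovered, A))
```

Each row needs its own small solve because each row touches a different set of sample entries; the banded case is the special case where that set just slides down by one per row.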

For a banded matrix, we have a shifting sample matrix that we have to invert, one for each row.

import numpy as np
from scipy.linalg import toeplitz as toep

N =10
bandn = 3
row = np.zeros(N)
row[0] = -2
row[1] = 1
banded = toep(row) #np.eye(N)
print(banded)

samples = np.random.randn(N,bandn)
print(samples)
y = banded @ samples
print(y)

band = np.zeros((N,bandn))
circsamples = np.random.randn(N+bandn,bandn)
circsamples[bandn//2:-bandn//2, :] = samples
for j in range(N):
    band[j,:] = np.linalg.solve(circsamples[j:j+bandn, :].T, y[j,:])
print(band)

'''
for j in range(N-bandn):
    band[j+bandn//2,:] = np.linalg.solve(samples[j:j+bandn, :].T, y[j+bandn//2,:])
'''
#corners
'''
for j in range(bandn//2):
    band[j,j+bandn//2:] = np.linalg.solve(samples[0:j+bandn//2+1, 0:j+bandn//2+1].T, y[j,:j+bandn//2+1])
'''
#print(band)

A cute trick we can use to simplify the edge cases where we run off the ends is to extend the sample matrix with some random trash. That will actually put the entries in the appropriate place and will keep the don’t cares close to zero also.

In my previous post I used a stack of identity matrices. These are nice because a shifted identity matrix is a circular permutation of the entries, which is very easy to undo. That was what the loop that used numpy.roll was doing. Even better, it is easy to at least somewhat vectorize the operation and you can produce those sampling vectors using some clever use of broadcasting of an identity matrix.
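Here is a rough reconstruction of that identity-stack idea, written from the description above rather than copied from the earlier post. It assumes a tridiagonal test matrix; the variable names are illustrative. Because sample column c is 1 exactly on rows c, c + bandn, c + 2*bandn, ..., each output row contains the band entries of that row in cyclically shifted order, and `np.roll` undoes the shift.

```python
import numpy as np

N, bandn = 10, 3
# Tridiagonal test matrix: -2 on the diagonal, 1 next to it
banded = -2 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)

# Stack identity matrices on top of each other and trim to N rows
samples = np.tile(np.eye(bandn), (N // bandn + 1, 1))[:N]
y = banded @ samples

half = bandn // 2
band = np.zeros((N, bandn))
for j in range(N):
    # row j's band entries show up in y[j], cyclically shifted by (j - half)
    band[j] = np.roll(y[j], -(j - half))

print(band)
```

No linear solve is needed with these samples, which is what makes them attractive when you do get to choose the inputs.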

An alternative formulation that is a little tighter. I still want the random-sample version because the samples aren’t always random or of our choosing. Often they won’t really be under our control.

import numpy as np
from scipy.linalg import toeplitz as toep

N =10
bandn = 3
row = np.zeros(N)
row[0] = -2
row[1] = 1
banded = toep(row) #np.eye(N)
print(banded)

samples = np.random.randn(N,bandn)
print(samples)
y = banded @ samples
print(y)

band = np.zeros((N,bandn))
circsamples = np.random.randn(N+bandn,bandn)
circsamples[bandn//2:-bandn//2, :] = samples
for j in range(N):
    band[j,:] = np.linalg.solve(circsamples[j:j+bandn, :].T, y[j,:])
print(band)

This all still doesn’t address the hermitian problem. The constraint that A is hermitian hypothetically allows you to take about half the samples. But the constraints make it tougher to solve. I haven’t come up with anything all that much better than vectorizing the matrix and building a matrix out of the samples in the appropriate spots.

I think such a matrix will be banded if A is banded, so that’s something at least.

The post Pytorch Trajectory Optimization Part 4: Cleaner code, 50Hz appeared first on Hey There Buddo!.

Added backtracking. It will backtrack on the dx until the function is actually decreasing.
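Stripped of the PyTorch details, the backtracking rule is small enough to sketch on its own. This is a simplified stand-in for the `line_search` function in the listing further down, with a hypothetical name and a toy objective: start with a full step and halve it until the objective actually goes down.

```python
import numpy as np

def backtracking_step(f, x, dx, shrink=0.5, min_alpha=1e-8):
    """Try x - alpha*dx with alpha = 1, 1/2, 1/4, ... until f decreases."""
    f0 = f(x)
    alpha = 1.0
    while alpha >= min_alpha:
        candidate = x - alpha * dx
        if f(candidate) < f0:
            return candidate, alpha
        alpha *= shrink
    return x, 0.0   # give up: no decrease found

# Toy objective with a deliberately oversized step
f = lambda x: float(np.sum(x**2))
x0 = np.array([1.0])
x1, alpha = backtracking_step(f, x0, dx=np.array([10.0]))
print(x1, alpha)
```

This only asks for a plain decrease. A stricter variant would require a decrease proportional to the Newton decrement, which the listing below leaves commented out.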

Prototyped the online part with shifts. Seems to work well with a fixed penalty parameter rho~100. Runs at ~50Hz with pretty good performance at 4 optimization steps per time step. Faster or slower depending on the number of newton steps per time step we allow ourselves. Still to see if the thing will control an actual cartpole.

The majority of time is spent just backwards calculating the hessian still (~50%).

I’ve tried a couple different schemes (direct projection of the delLy terms or using y = torch.eye). None particularly seem to help.

The line search is also fairly significant (~20% of the time) but it really helps with both stability and actually decreasing the number of hessian steps, so it is an overall win. Surprisingly during the line search, projecting out the batch to 0 doesn’t matter much. How could this possibly make sense?

What I should do is pack this into a class that accepts new state observations and initializes with the warm start. Not clear if I should force the 4 newton steps on you or let you call them yourself. I think if you use too few it is pretty unstable (1 doesn’t seem to work well. 2 might be ok and gets you up to 80Hz maybe.)

The various metaparameters should be put into the __init__: the stopping cutoff (1e-7), starting rho (~0.1), rho increase (x10), backtrack alpha decrease factor (0.5 right now), and the online rho (100). Hopefully none of these matter too much. I have noticed that going too small with the cutoff leads to endless loops.

Could swapping the ordering of time step vs variable number maybe help?

For inequality constraints like the track length and forces, exponential barriers seem like a more stable option compared to log barriers. Log barriers at least require me to check if they are going NaN.
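The difference is easy to see on a single number. The sketch below compares the two barrier shapes for the track limit, using plain NumPy instead of torch; the `sharpness` value and the test points are arbitrary choices for illustration.

```python
import numpy as np

xlim = 0.4

def log_barrier(x):
    # blows up at the limit and is undefined (NaN) beyond it
    return -np.log(x + xlim) - np.log(xlim - x)

def exp_barrier(x, sharpness=5.0):
    # grows quickly past the limit but stays finite everywhere
    return np.exp((-x - xlim) * sharpness) + np.exp((x - xlim) * sharpness)

inside, outside = 0.3, 0.5
with np.errstate(invalid="ignore"):     # silence the log-of-negative warning
    print(log_barrier(inside), log_barrier(outside))
print(exp_barrier(inside), exp_barrier(outside))
```

An iterate that steps outside the track gets a NaN cost from the log barrier, so the line search has to detect and recover from it, while the exponential barrier just returns a large but usable penalty.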

I attempted the pure Lagrangian version where lambda is just another variable. It wasn’t working that great.
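For contrast, the augmented Lagrangian loop that the listing below uses (minimize, then `l += 2 * rho * xres`, then raise rho) can be shown on a toy problem small enough to solve by hand: minimize x² + y² subject to x + y = 1. The toy problem and the closed-form inner step are assumptions made for this sketch; only the update rule and the rho schedule are taken from the code.

```python
def inner_minimize(lam, rho):
    """Minimize x^2 + y^2 + lam*c + rho*c^2 with c = x + y - 1.

    Setting the gradient to zero gives x == y and x*(2 + 4*rho) = 2*rho - lam.
    """
    x = (2 * rho - lam) / (2 + 4 * rho)
    return x, x

lam, rho = 0.0, 0.0
for _ in range(6):
    x, y = inner_minimize(lam, rho)
    c = x + y - 1            # constraint residual
    lam += 2 * rho * c       # multiplier update
    rho = rho * 10 + 0.1     # make the penalty harsher
    print(x, y, lam)
```

The iterates approach x = y = 1/2 with multiplier -1. The multiplier is never an optimization variable here; it is only nudged between solves, which is the difference from the pure Lagrangian version.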

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim
from scipy import linalg
import time

N = 100
T = 10.0
dt = T/N
NVars = 4
NControls = 1
# Enum values
X = 0
V = 1
THETA = 2
THETADOT = 3

#The bandwidth number for solve_banded
bandn = (NVars+NControls)*3//2
# We will use this many batches so we can get the entire hessian in one pass
batch = bandn * 2 + 1

def getNewState():
    #we 're going to also pack f into here
    #The forces have to come first for a good variable ordering the the hessian
    x = torch.zeros(batch,N,NVars+NControls, requires_grad=True)
    l = torch.zeros(1, N-1,NVars, requires_grad=False)
    return x, l

#Compute the residual with respect to the dynamics
def dynamical_res(x):
    f = x[:,1:,:NControls]
    x = x[:,:,NControls:]
    delx = (x[:,1:,:] - x[:, :-1,:]) / dt
    xbar = (x[:,1:,:] + x[:, :-1,:]) / 2
    #dxdt = torch.zeros(x.shape[0], N-1,NVars)
    dxdt = torch.zeros_like(xbar)
    dxdt[:,:,X] = xbar[:,:,V]
    dxdt[:,:,V] = f[:,:,0]
    dxdt[:,:,THETA] = xbar[:,:,THETADOT]
    dxdt[:,:,THETADOT] = -torch.sin(xbar[:,:,THETA]) + f[:,:,0]*torch.cos(xbar[:,:,THETA])
    xres = delx - dxdt
    return xres

def calc_loss(x, l, rho):
    xres = dynamical_res(x)
    # Some regularization. This encodes sort of that all variables -100 < x< 100
    cost = 0.1*torch.sum(x**2)
    # The forces have to come first for a good variable ordering the the hessian
    f = x[:,1:,:NControls]
    x = x[:,:,NControls:]
    lagrange_mult = torch.sum(l * xres)
    penalty = rho*torch.sum(xres**2)
    #Absolute Value craps it's pants unfortunately.
    #I tried to weight it so it doesn't feel bad about needing to swing up
    cost += 1.0*torch.sum((x[:,:,THETA]-np.pi)**2 * torch.arange(N) / N )
    cost += 0.5*torch.sum(f**2)
    xlim = 0.4
    #Some options to try for inequality constraints. YMMV.
    #(xbar is not defined in this function; compute it as in dynamical_res before enabling these)
    #cost += rho*torch.sum(-torch.log(xbar[:,:,X] + xlim) - torch.log(xlim - xbar[:,:,X]))
    #The softer inequality constraint seems to work better.
    # the log loses it's mind pretty easily
    # tried adding ln rho in there to make it harsher as time goes on?
    #cost += torch.sum(torch.exp((-xbar[:,:,X] - xlim)*(5+np.log(rho+0.1))) + torch.exp((xbar[:,:,X]- xlim)*(5+np.log(rho+0.1))))
    #Next one doesn't work?
    #cost += torch.sum(torch.exp((-xbar[:,:,X] - xlim)) + torch.exp((xbar[:,:,X]- xlim)))**(np.log(rho/10+3))
    total_cost = cost + lagrange_mult + penalty
    return total_cost

def getGradHessBand(loss, B, x):
    # get gradient. create_graph allows higher order derivatives
    delL0, = torch.autograd.grad(loss, x, create_graph=True)
    delL = delL0[:,1:,:].view(B,-1,B) #remove x0
    #y is used to sample the appropriate rows
    #y = torch.zeros(B,N-1,NVars+NControls, requires_grad=False).view(B,-1)
    # There is probably a way to do it this way.
    # Would this be a speed up?
    y = torch.eye(B).view(B,1,B)
    #print(y.shape)
    #print(delL.shape)
    #delL = delL.view(B,-1)
    #y = torch.zeros(B,N-1,NVars+NControls, requires_grad=False).view(B,-1)
    #for i in range(B):
    #    y[i,i::B]=1
    #delL = delL.view(B,-1)
    #temp = 0
    #for i in range(B):
    #    temp += torch.sum(delL[i,:,i])
    #Direct projection is not faster
    delLy = torch.sum(delL * y)
    delL = delL.view(B,-1)
    delLy.backward()
    #temp.backward()
    nphess = x.grad[:,1:,:].view(B,-1).detach().numpy()
    #reshuffle columns to actuall be correct
    for i in range(B):
        nphess[:,i::B] = np.roll(nphess[:,i::B], -i+B//2, axis=0)
    #returns gradient and hessian flattened
    return delL.detach().numpy()[0,:].reshape(-1), nphess

def line_search(x, dx, total_cost, newton_dec):
    # Note: reads the module-level l and rho
    with torch.no_grad():
        #x1 = torch.unsqueeze(x[0],0)
        xnew = x.clone() #Make a copy
        alpha = 1
        prev_cost = total_cost.clone() #Current total cost
        done = False
        # do a backtracking line search
        while not done:
            try:
                xnew[:,1:,:] = x[:,1:,:] - alpha * dx
                #print(xnew.shape)
                total_cost = calc_loss(xnew, l, rho)
                if alpha < 1e-8:
                    print("Alpha small: Uh oh")
                    done = True
                if total_cost < prev_cost: # - alpha * 0.5 * batch * newton_dec:
                    done = True
                else:
                    print("Backtrack")
                    alpha = alpha * 0.5
            except ValueError: #Sometimes you get NaNs if you have logs in cost func
                print("Out of bounds")
                alpha = alpha * 0.1
        x[:,1:,:] -= alpha * dx #Commit the change
    return x

def opt_iteration(x, l, rho):
    total_cost = calc_loss(x, l, rho)
    gradL, hess = getGradHessBand(total_cost, (NVars+NControls)*3, x)
    #Try to solve the linear system. Sometimes, it fails
    # in which case just defualt to gradient descent
    # you're probably fucked though
    try:
        dx = linalg.solve_banded((bandn,bandn), hess, gradL, overwrite_ab=True)
    except ValueError:
        print("ValueError: Hess Solve Failed.")
        dx = gradL
    except linalg.LinAlgError: # LinAlgError lives in the linalg module
        print("LinAlgError: Hess Solve Failed.")
        dx = gradL
    x.grad.data.zero_() # Forgetting this causes awful bugs. I think this has to be here
    newton_dec = np.dot(dx,gradL) # quadratic estimate of cost improvement
    dx = torch.tensor(dx.reshape(1,N-1,NVars+NControls)) # return to original shape
    x = line_search(x, dx, total_cost, newton_dec)
    # If newton decrement is a small percentage of cost, quit
    done = newton_dec < 1e-7*total_cost.detach().numpy()
    return x, done

#Initial Solve
x, l = getNewState()
rho = 0.0
count = 0
for j in range(6):
    while True:
        count += 1
        print("Count: ", count)
        x, done = opt_iteration(x,l,rho)
        if done:
            break
    with torch.no_grad():
        xres = dynamical_res(x[0].unsqueeze(0))
        print(xres.shape)
        print(l.shape)
        l += 2 * rho * xres
    print("upping rho")
    rho = rho * 10 + 0.1

#Online Solve
start = time.time()
NT = 10
for t in range(NT): # time steps
    print("Time step")
    with torch.no_grad():
        x[:,0:-1,:] = x[:,1:,:] # shift forward one step
        l[:,0:-1,:] = l[:,1:,:]
        #x[:,0,:] = x[:,1,:] + torch.randn(1,NVars+NControls)*0.05 #Just move first position
    rho = 100
    for i in range(1): # how many penalty pumping moves
        for m in range(4): # newton steps
            print("Iter Step")
            x, done = opt_iteration(x,l,rho)
        with torch.no_grad():
            xres = dynamical_res(x[0].unsqueeze(0))
            l += 2 * rho * xres
        rho = rho * 10
end = time.time()
print(NT/(end-start), "Hz" )

plt.plot(xres[0,:,0].detach().numpy(), label='Xres')
plt.plot(xres[0,:,1].detach().numpy(), label='Vres')
plt.plot(xres[0,:,2].detach().numpy(), label='THeres')
plt.plot(xres[0,:,3].detach().numpy(), label='Thetadotres')
plt.legend(loc='upper right')
plt.figure()
#plt.subplot(132)
plt.plot(x[0,:,1].detach().numpy(), label='X')
plt.plot(x[0,:,2].detach().numpy(), label='V')
plt.plot(x[0,:,3].detach().numpy(), label='Theta')
plt.plot(x[0,:,4].detach().numpy(), label='Thetadot')
plt.plot(x[0,:,0].detach().numpy(), label='F')
#plt.plot(cost[0,:].detach().numpy(), label='F')
plt.legend(loc='upper right')
#plt.figure()
#plt.subplot(133)
#plt.plot(costs)
print("hess count: ", count)
plt.show()

The post Pytorch Trajectory Optimization Part 4: Cleaner code, 50Hz appeared first on Hey There Buddo!.

The post Pytorch Trajectory Optimization 3: Plugging in the Hessian appeared first on Hey There Buddo!.

When I profiled it using the oh-so-convenient https://github.com/rkern/line_profiler, I found almost all of my time was spent in the delLy.backward() step. For each Hessian I needed to run this B times (B being the bandwidth), and each call cost ~0.6 ms. The entire run took about 70 iterations to converge and about 1000 calls to this backward step, which came out to 0.6 seconds. It is insane, but calculating the band of the Hessian actually costs considerably more time than inverting it.

So to speed this up, I did a bizarre thing. I replicated the entire system B times. Then I can get the entire Hessian band in a single call to backward. Remarkably, although B ~ 15, this only slowed backward down by 3x. This is a huge savings, even though it is obviously inefficient. The entire program has gone down from 1.1 s to 0.38 s, roughly a 3x improvement. All in all, this puts us at 70/0.38 ~ 185 Hz for a Newton step. Is that good enough? I could trim some more fat. The Fast MPC paper http://web.stanford.edu/~boyd/papers/fast_mpc.html says we need about 5 iterations to tune up a solution, which would mean running at about 40 Hz. I think that might be ok.
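The throughput estimate is just the following arithmetic, written out as a small Python check. The variable names are mine; the numbers are the ones quoted above.

```python
# Numbers quoted in the text above.
newton_steps = 70        # Newton iterations needed for the full run to converge
old_run_time_s = 1.1     # wall-clock time before replicating the system
run_time_s = 0.38        # wall-clock time after replicating the system
steps_per_update = 5     # iterations per control update suggested by the Fast MPC paper

speedup = old_run_time_s / run_time_s
newton_rate_hz = newton_steps / run_time_s
control_rate_hz = newton_rate_hz / steps_per_update

print(f"speedup: {speedup:.1f}x")
print(f"Newton steps per second: {newton_rate_hz:.0f}")
print(f"control updates per second: {control_rate_hz:.0f}")
```

So "about 40 Hz" is a slightly generous rounding of roughly 37 Hz.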

Since the Hessian is symmetric (Hermitian), it should be possible to use roughly half the calls (~B/2), but then actually extracting the Hessian is not so simple. I haven’t figured out a way to comfortably do such a thing yet. I think I could figure out the first column and then subtract it off to get the next one (roughly some kind of Gaussian elimination procedure).
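Here is a minimal sketch of that "figure out the first column and then subtract" idea, for the simplest possible case only: a symmetric tridiagonal matrix (B = 3), recovered from two matrix-vector products instead of three. It is plain Python with a dense toy matrix, and the names matvec and tridiag_from_two_probes are made up for illustration. I have not worked out the general band, where the bookkeeping is the hard part.

```python
def matvec(h, y):
    """Dense matrix-vector product; stands in for one backward() call."""
    return [sum(hij * yj for hij, yj in zip(row, y)) for row in h]


def tridiag_from_two_probes(hess_vec, n):
    """Recover the diagonal and first off-diagonal of a symmetric
    tridiagonal matrix from two Hessian-vector products."""
    even = [1.0 if j % 2 == 0 else 0.0 for j in range(n)]
    odd = [1.0 - e for e in even]
    h_even = hess_vec(even)
    h_odd = hess_vec(odd)

    diag = [0.0] * n
    mixed = [0.0] * n  # mixed[i] = H[i, i-1] + H[i, i+1]
    for i in range(n):
        if i % 2 == 0:
            diag[i], mixed[i] = h_even[i], h_odd[i]
        else:
            diag[i], mixed[i] = h_odd[i], h_even[i]

    # Forward substitution using symmetry: H[i, i-1] == H[i-1, i].
    off = [0.0] * (n - 1)  # off[i] = H[i, i+1]
    prev = 0.0             # there is no H[0, -1]
    for i in range(n - 1):
        off[i] = mixed[i] - prev
        prev = off[i]
    return diag, off


# Toy symmetric tridiagonal matrix to test against.
n = 7
true_diag = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
true_off = [-1.0, 0.5, -2.0, 1.5, -0.25, 3.0]
h = [[0.0] * n for _ in range(n)]
for i in range(n):
    h[i][i] = true_diag[i]
for i in range(n - 1):
    h[i][i + 1] = h[i + 1][i] = true_off[i]

diag, off = tridiag_from_two_probes(lambda y: matvec(h, y), n)
print(diag)
print(off)
```

One caveat with this kind of substitution: each off-diagonal entry is computed from the previous one, so rounding errors can accumulate along the band.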

It has helped stability to regularize everything with a surprising amount of weight in the cost. I guess since I anticipate all values being in the range [-10, 10], maybe this makes sense.

Now I need to try not using this augmented Lagrangian method and just switching to a straight Newton step.

Edit: Ooh. Adding a simple backtracking line search really improves stability.
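For reference, a backtracking line search looks roughly like the following. This is a generic sketch on a 1-D toy function, not the code from this post: the function sqrt(1 + x^2), the shrink factor 0.5, and the sufficient-decrease constant 0.1 are all assumptions chosen for illustration. The toy function is a standard example where a pure Newton step overshoots when started at |x| > 1.

```python
import math


def f(x):
    return math.sqrt(1.0 + x * x)


def grad(x):
    return x / math.sqrt(1.0 + x * x)


def hess(x):
    return (1.0 + x * x) ** -1.5


def backtracking_newton_step(x, shrink=0.5, c=0.1, min_alpha=1e-8):
    dx = grad(x) / hess(x)      # full Newton step (to be subtracted)
    newton_dec = dx * grad(x)   # scale of the predicted decrease, >= 0 here
    alpha = 1.0
    while alpha > min_alpha:
        # Accept the step once the cost has dropped "enough".
        if f(x - alpha * dx) <= f(x) - c * alpha * newton_dec:
            break
        alpha *= shrink         # otherwise back off and try again
    return x - alpha * dx


x_plain = 2.0
x_safe = 2.0
for _ in range(3):
    x_plain = x_plain - grad(x_plain) / hess(x_plain)  # no line search
for _ in range(10):
    x_safe = backtracking_newton_step(x_safe)
print(x_plain, x_safe)
```

Without the line search the iterate runs away from the minimum at 0; with it, the first step gets shrunk and the later steps are accepted at full length.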

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim
from scipy import linalg

N = 100
T = 10.0
dt = T/N
NVars = 4
NControls = 1
batch = (NVars+NControls)*3

def getNewState():
    #we 're going to also pack f into here
    x = torch.zeros(batch,N,NVars+NControls, requires_grad=True)
    #f = torch.zeros(batch, N-1, requires_grad=True)
    l = torch.zeros(batch, N-1,NVars, requires_grad=False)
    return x, l

def calc_loss(x, l , rho, prox=0): # l,
    #depack f, it has one less time point
    cost = 0.1*torch.sum(x**2)
    #cost += prox * torch.sum((x - x.detach())**2)
    f = x[:,1:,:NControls]
    #leftoverf = x[:,0,:NControls]
    x = x[:,:,NControls:]
    delx = (x[:,1:,:] - x[:, :-1,:]) / dt
    xbar = (x[:,1:,:] + x[:, :-1,:]) / 2
    dxdt = torch.zeros(batch, N-1,NVars)
    THETA = 2
    THETADOT = 3
    X = 0
    V = 1
    dxdt[:,:,X] = xbar[:,:,V]
    #print(dxdt.shape)
    #print(f.shape)
    dxdt[:,:,V] = f[:,:,0]
    dxdt[:,:,THETA] = xbar[:,:,THETADOT]
    dxdt[:,:,THETADOT] = -torch.sin(xbar[:,:,THETA]) + f[:,:,0]*torch.cos(xbar[:,:,THETA])
    xres = delx - dxdt
    lagrange_mult = torch.sum(l * xres)
    #cost = torch.sum((x+1)**2+(x+1)**2, dim=0).sum(0).sum(0)
    #cost += torch.sum((f+1)**2, dim=0).sum(0).sum(0)
    #cost += 1
    penalty = rho*torch.sum( xres**2)
    #cost += 1.0*torch.sum((abs(x[:,:,THETA]-np.pi)), dim=1)
    #cost = 1.0*torch.sum((x[:,:,:]-np.pi)**2 )
    cost += 1.0*torch.sum((x[:,:,THETA]-np.pi)**2 * torch.arange(N) / N )
    cost += 0.5*torch.sum(f**2)
    #cost = 1.0*torch.sum((x[:,:,:]-np.pi)**2 )
    #cost = cost1 + 1.0
    #cost += 0.01*torch.sum(x**2, dim=1).sum(0).sum(0)
    #xlim = 3
    #cost += 0.1*torch.sum(-torch.log(xbar[:,:,X] + xlim) - torch.log(xlim - xbar[:,:,X]))
    #cost += 0.1*torch.sum(-torch.log(xbar[:,:,V] + 1) - torch.log(1 - xbar[:,:,V]),dim=1)
    #cost += (leftoverf**2).sum()
    #total_cost = cost + lagrange_mult + penalty
    return cost, penalty, lagrange_mult, xres

def getFullHess(): #for experimentation
    pass

def getGradHessBand(loss, B, x):
    #B = bandn
    delL0, = torch.autograd.grad(loss, x, create_graph=True)
    delL = delL0[:,1:,:].view(B,-1) #remove x0
    print("del ", delL[:,:10])
    #hess = torch.zeros(B,N-1,NVars+NControls, requires_grad=False).view(B,B,-1)
    y = torch.zeros(B,N-1,NVars+NControls, requires_grad=False).view(B,-1)
    #y = torch.eye(B).view(B,1,B)
    #print(y.shape)
    for i in range(B):
        #y = torch.zeros(N-1,NVars+NControls, requires_grad=False).view(-1)
        y[i,i::B]=1
    #print(y[:,:2*B])
    print(y.shape)
    print(delL.shape)
    delLy = torch.sum(delL * y)
    #print(delLy)
    delLy.backward() #(i != B-1)
    #torch.autograd.grad(loss, x, create_graph=True)
    #print(hess.shape)
    #print(x.grad.shape)
    #hess[i,:] = x.grad[:,1:,:].view(-1) #also remove x0
    #print(hess[i,:])
    #print(x.grad)
    #print(hess)
    nphess = x.grad[:,1:,:].view(B,-1).detach().numpy()# .view(-1)# hess.detach().numpy()
    #print(nphess[:,:4])
    #print(nphess)
    for i in range(B):
        nphess[:,i::B] = np.roll(nphess[:,i::B], -i+B//2, axis=0)
    print(nphess[:,:4])
    #hessband = removeX0(nphess[:B//2+1,:])
    #grad = removeX0(delL.detach().numpy())
    return delL.detach().numpy()[0,:], nphess #hessband

x, l = getNewState()
rho = 0.1
prox = 0.0
for j in range(10):
    while True:
        try:
            cost, penalty, lagrange_mult, xres = calc_loss(x, l, rho, prox)
            #print(total_cost)
            print("hey now")
            #print(cost)
            total_cost = cost + lagrange_mult + penalty
            #total_cost = cost
            gradL, hess = getGradHessBand(total_cost, (NVars+NControls)*3, x)
            #print(hess)
            #print(hess.shape)
            gradL = gradL.reshape(-1)
            #print(gradL.shape)
            #easiest thing might be to put lagrange multipliers into x.
            #Alternatively, use second order step in penalty method.
            bandn = (NVars+NControls)*3//2
            print(hess.shape)
            print(gradL.shape)
            dx = linalg.solve_banded((bandn,bandn), hess, gradL) #
            x.grad.data.zero_()
            #print(hess)
            #print(hess[bandn:,:])
            #dx = linalg.solveh_banded(hess[:bandn+1,:], gradL, overwrite_ab=True)
            newton_dec = np.dot(dx,gradL)
            #df0 = dx[:NControls].reshape(-1,NControls)
            dx = dx.reshape(1,N-1,NVars+NControls)
            with torch.no_grad():
                x[:,1:,:] -= torch.tensor(dx)
                print(x[:,:5,:])
            #print(x[:,0,NVars:].shape)
            #print(df0.shape)
            costval = cost.detach().numpy()
            #break
            if newton_dec < 1e-10*costval:
                break
        except np.linalg.LinAlgError:
            print("LINALGERROR")
            prox += 0.1
            #break
    #print(x)
    with torch.no_grad():
        l += 2 * rho * xres
    rho = rho * 2 #+ 0.1

#print(x)
#plt.subplot(131)
plt.plot(xres[0,:,0].detach().numpy(), label='Xres')
plt.plot(xres[0,:,1].detach().numpy(), label='Vres')
plt.plot(xres[0,:,2].detach().numpy(), label='THeres')
plt.plot(xres[0,:,3].detach().numpy(), label='Thetadotres')
plt.legend(loc='upper right')
plt.figure()
#plt.subplot(132)
plt.plot(x[0,:,1].detach().numpy(), label='X')
plt.plot(x[0,:,2].detach().numpy(), label='V')
plt.plot(x[0,:,3].detach().numpy(), label='Theta')
plt.plot(x[0,:,4].detach().numpy(), label='Thetadot')
plt.plot(x[0,:,0].detach().numpy(), label='F')
#plt.plot(cost[0,:].detach().numpy(), label='F')
plt.legend(loc='upper right')
#plt.figure()
#plt.subplot(133)
#plt.plot(costs)
plt.show()


The post Cartpole Camera System – OpenCV + PS EYE + IR appeared first on Hey There Buddo!.

We bought some retroreflective tape and put it on the pole. http://a.co/0A9Otmr

We removed the IR filter from our PS EYE. The PS EYE is really cheap (~$7) and has a high framerate mode (100+ fps). People have been using it for a while for computer vision projects.

http://wiki.lofarolabs.com/index.php/Removing_the_IR_Filter_from_the_PS_Eye_Camera

We followed the instructions, except that we did not add the floppy disk, and we sanded down the base of the lens to bring the image back into focus.

We bought an IR LED ring light that fit over the camera with the plastic cover removed, and rubber banded it in place.

If you snip the photoresistor, the ring light is always on, since the photoresistor has high resistance in the dark. We powered it from a spare 12V power supply that we soldered a connector onto.

We had also bought an IR pass filter on Amazon, but it does not appear to help.

Useful utilities: qv4l2, v4l2-ctl and v4l2-utils. You can change lots of stuff.

qv4l2 -d 1 is very useful for experimentation.

Useful options to v4l2-ctl: -d selects the camera, -p sets the framerate, and -l gives a list of changeable options. You have to turn off the automatic settings before the manual ones become changeable. Counterintuitively, auto-exposure seems to have 1 as off.

There has been a recent update to OpenCV to let the v4l2 buffer size be changed. We’re hoping this will really help with our latency issues.

A useful blog. We use v4l2-ctl for controlling the exposure programmatically

http://www.jayrambhia.com/blog/capture-v4l2

Oooh. The contour method + rotated rectangle is working really well for matching the retroreflective tape.

https://docs.opencv.org/3.3.1/dd/d49/tutorial_py_contour_features.html

You need to reduce the video size to 320×240 if you want to go to the highest framerate of 187 fps.

In regards to the frame delay problem from before, it’s not clear that we’re really seeing it? We are attempting both the screen timestamp technique and also comparing to our rotary encoder. In the screen timestamp technique, it is not so clear that what we measure there is latency, and if it is, it includes the latency of the monitor itself, which is irrelevant.


The post Extracting a Banded Hessian in PyTorch appeared first on Hey There Buddo!.

One could sample out every column of the Hessian, for example. Performance-wise, I don’t know how bad this might be.

For a banded Hessian, which will occur in a trajectory optimization problem (the bandedness being a reflection of the finite difference scheme), you don’t need that many samples. This feels more palatable. You only need to sample the Hessian roughly the bandwidth number of times, which may be quite small. Plus, you can then solve with that banded Hessian very quickly using special-purpose banded matrix solvers. I’m hoping that once I plug this into the trajectory optimization, I can use a Newton method (or SQP?) which will perform better than straight gradient descent.

If you pulled just a single column using [1,0,0,0,0,0..] for example, that would be wasteful, since there are so many known zeros in the banded matrix. Instead, something like [1,0,0,1,0,0,1,0,0..] (for a bandwidth of 3) wastes almost nothing: the columns it picks out do not overlap, so each entry of the result is a single Hessian entry rather than a known zero. This gets us every 3rd row of the matrix (equivalently every 3rd column, since the Hessian is symmetric). Then we can sample with shifted versions like [0,1,0,0,1,0,0,1,0,0..] until we have all the rows somewhere. Then there is some index shuffling to put the thing into a sane ordering, especially so that we can use https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.solveh_banded.html which requires the banded matrix to be given in a particular form.

An alternative approach might be to use an FFT with some phase twiddling. Also, it feels like since the Hessian is symmetric we ought to be able to use about half the samples, since half are redundant, but I haven’t figured out a clean way to do this yet. I think that perhaps sampling with random vectors and then solving for the coefficients would work, but again, how to organize the code for such a thing?

Here’s a snippet simulating extracting the band matrix from matrix products.

import numpy as np

N = 6
B = 5
h = np.diag(np.random.randn(N))
h = h + h.T # symmetrize our matrix
print(h)
band = y = np.zeros((B, N))
for i in range(B):
    y = np.zeros(N)
    y[i::B]=1
    band[i,:] = h @ y
print(band)
for i in range(B):
    band[:,i::B] = np.roll(band[:,i::B], -i, axis=0) #B//2
print(band)
print(band[:B//2+1,:])

And here is the full PyTorch implementation, including a linear banded solve.

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim
from scipy import linalg
import matplotlib.pyplot as plt

N = 12
x = torch.zeros(N, requires_grad=True)
L = torch.sum((x[1:] - x[ :-1])**2)/2 + x[0]**2/2 + x[-1]**2/2 #torch.sum((x**2))
#L.backward()
B = 3
delL, = torch.autograd.grad(L, x, create_graph=True)
print(delL)
print(x.grad)
hess = torch.zeros(3,N, requires_grad=False)
for i in range(3):
    y = torch.zeros(N, requires_grad=False)
    y[i::3]=1
    delLy = delL @ y
    #delLy._zero_grad()
    delLy.backward(retain_graph=True)
    hess[i,:] = x.grad
    print(x.grad)
    x.grad.data.zero_()
print(hess)
nphess = hess.detach().numpy()
print(nphess)
for i in range(B):
    nphess[:,i::B] = np.roll(nphess[:,i::B], -i, axis=0)
print(nphess)
print(nphess[:B//2+1,:])
hessband = nphess[:B//2+1,:]
b = np.zeros(N)
b[4]=1
x = linalg.solveh_banded(hessband, b, lower=True)
print(x)
plt.plot(x)
plt.show()

Output:

tensor([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
None
tensor([ 2., -1., -1., 2., -1., -1., 2., -1., -1., 2., -1., 0.])
tensor([-1., 2., -1., -1., 2., -1., -1., 2., -1., -1., 2., -1.])
tensor([ 0., -1., 2., -1., -1., 2., -1., -1., 2., -1., -1., 2.])
tensor([[ 2., -1., -1., 2., -1., -1., 2., -1., -1., 2., -1., 0.],
        [-1., 2., -1., -1., 2., -1., -1., 2., -1., -1., 2., -1.],
        [ 0., -1., 2., -1., -1., 2., -1., -1., 2., -1., -1., 2.]])
[[ 2. -1. -1. 2. -1. -1. 2. -1. -1. 2. -1. 0.]
 [-1. 2. -1. -1. 2. -1. -1. 2. -1. -1. 2. -1.]
 [ 0. -1. 2. -1. -1. 2. -1. -1. 2. -1. -1. 2.]]
[[ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. 0.]
 [ 0. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.]]
[[ 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. 0.]]
[0.61538462 1.23076923 1.84615385 2.46153846 3.07692308 2.69230769
 2.30769231 1.92307692 1.53846154 1.15384615 0.76923077 0.38461538]
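As a sanity check on that last line of output, the same system (2 on the diagonal, -1 on the off-diagonals, a 1 at index 4 on the right-hand side) can be solved with the textbook Thomas algorithm in plain Python. This is a sketch of a generic tridiagonal solver, not what scipy does internally, and solve_tridiagonal is a name made up here.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: O(N) solve of a tridiagonal system.
    lower[i] = A[i, i-1] (lower[0] unused), upper[i] = A[i, i+1] (upper[-1] unused).
    No pivoting, so it assumes a well-behaved (e.g. diagonally dominant) matrix."""
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


N = 12
b = [0.0] * N
b[4] = 1.0
x = solve_tridiagonal([-1.0] * N, [2.0] * N, [-1.0] * N, b)
print([round(v, 8) for v in x])
```

For this particular matrix the exact answer is a tent function: it rises linearly in steps of 8/13 up to index 4 and then falls linearly in steps of 5/13, which is the shape of the numbers printed above.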


The post PyTorch Trajectory Optimization Part 2: Work in Progress appeared first on Hey There Buddo!.

Alternating the Lagrange multiplier steps and the state variable steps seems to have helped with convergence. Adding a cost to the dynamical residuals seems to have helped clean them up also.

I should attempt some kind of analysis rather than making shit up. Assuming quadratic costs (and dynamics), the problem is tractable. The training procedure is basically a dynamical system.

Changed the code a bit to use more variables. Actually trying the cart pole problem now. The results seem plausible. A noisy but balanced dynamical residual around zero. And the force appears to flip its direction as the pole crosses the horizontal.

Polyak’s step length

http://stanford.edu/class/ee364b/lectures/subgrad_method_notes.pdf

The idea is that if you know the optimal value you’re trying to achieve, that gives you a scale of gradient to work with. Not as good as a Hessian maybe, but it’s somethin’. If you use a gradient step of $\Delta x = -\frac{f(x) - f^*}{\|\nabla f\|^2} \nabla f$, where $f^*$ is the optimal value, it at least has the same units as x and not f/x. In some simple models of f, this might be exactly the step size you’d need. If you know you’re far away from optimal, you should be taking some big step sizes.
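A tiny 1-D illustration of Polyak's step length, using the formula from the notes linked above (step length = (f(x) - f*) / |grad f|^2). The two test functions are my own picks: a kink, where one Polyak step lands exactly on the optimum, and a quadratic bowl, where each step only halves the distance.

```python
def polyak_step(x, f, grad, f_opt):
    """One gradient step with Polyak's step length (1-D for simplicity)."""
    g = grad(x)
    if g == 0.0:
        return x                      # already stationary
    alpha = (f(x) - f_opt) / (g * g)  # alpha * g has the units of x
    return x - alpha * g


# A kink: f(x) = 3|x - 2|, optimal value 0 at x = 2.
kink = lambda x: 3.0 * abs(x - 2.0)
kink_grad = lambda x: 3.0 if x > 2.0 else -3.0
x_kink = polyak_step(10.0, kink, kink_grad, 0.0)

# A bowl: f(x) = x^2 / 2, optimal value 0 at x = 0.
bowl = lambda x: 0.5 * x * x
bowl_grad = lambda x: x
x_bowl = polyak_step(8.0, bowl, bowl_grad, 0.0)

print(x_kink, x_bowl)
```

In more than one dimension the only change is that g * g becomes the squared norm of the gradient.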

The Polyak step length has not been useful so far. Interesting idea though.

import torch
import matplotlib.pyplot as plt
import numpy as np

batch = 1
N = 50
T = 7.0
dt = T/N
NVars = 4

x = torch.zeros(batch,N,NVars, requires_grad=True)
#print(x)
#v = torch.zeros(batch, N, requires_grad=True)
f = torch.zeros(batch, N-1, requires_grad=True)
l = torch.zeros(batch, N-1,NVars, requires_grad=True)
#with torch.no_grad():
#    x[0,0,2] = np.pi

'''
class Vars():
    def __init__(self, N=10):
        self.data = torch.zeros(batch, N, 2)
        self.data1 = torch.zeros(batch, N-1, 3)
        self.lx = self.data1[:,:,0]
        self.lv = self.data1[:,:,1]
        self.f = self.data1[:,:,2]
        self.x = self.data[:,:,0]
        self.v = self.data[:,:,1]
'''

def calc_loss(x,f, l):
    delx = (x[:,1:,:] - x[:, :-1,:]) / dt
    xbar = (x[:,1:,:] + x[:, :-1,:]) / 2
    dxdt = torch.zeros(batch, N-1,NVars)
    THETA = 2
    THETADOT = 3
    X = 0
    V = 1
    dxdt[:,:,X] = xbar[:,:,V]
    dxdt[:,:,V] = f
    dxdt[:,:,THETA] = xbar[:,:,THETADOT]
    # NOTE: the later version of this script uses f*torch.cos(xbar[:,:,THETA]) here
    dxdt[:,:,THETADOT] = -torch.sin(xbar[:,:,THETA]) + torch.cos(f)
    xres = delx - dxdt
    #dyn_err = torch.sum(torch.abs(xres) + torch.abs(vres), dim=1) #torch.sum(xres**2 + vres**2, dim=1) # + Abs of same thing?
    lagrange_mult = torch.sum(l * xres, dim=1).sum(1)
    cost = 0.1*torch.sum((x[:,:,X]-1)**2, dim=1) + .1*torch.sum( f**2, dim=1) + torch.sum( torch.abs(xres), dim=1).sum(1)*dt + torch.sum((x[:,:,THETA]-np.pi)**2, dim=1)*0.01
    #cost = torch.sum((x-1)**2, dim=1)
    total_cost = cost + lagrange_mult #100 * dyn_err + reward
    return total_cost, lagrange_mult, cost, xres

#print(x.grad)
#print(v.grad)
#print(f.grad)

import torch.optim as optim
'''
# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad() # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step() # Does the update
'''

#Could interleave an ODE solve step - stinks that I have to write dynamics twice
#Or interleave a separate dynamic solving
# Or could use an adaptive line search. Backtracking
# Goal is to get dyn_err quite small

'''
learning_rate = 0.001
for i in range(40):
    total_cost=calc_loss(x,v,f)
    #total_cost.zero_grad()
    total_cost.backward()
    while dyn_loss > 0.01:
        dyn_loss.backward()
        with torch.no_grad():
            learning_rate = dyn_loss / (torch.norm(x.grad[:,1:]) + (torch.norm(v.grad[:,1:])
            x[:,1:] -= learning_rate * x.grad[:,1:] # Do not change Starting conditions
            v[:,1:] -= learning_rate * v.grad[:,1:]
    reward.backward()
    with torch.no_grad():
        f -= learning_rate * f.grad
'''

learning_rate = 0.005
costs= []
for i in range(4000):
    total_cost, lagrange, cost, xres = calc_loss(x,f, l)
    costs.append(total_cost[0])
    print(total_cost)
    #print(x)
    #total_cost.zero_grad()
    total_cost.backward()
    with torch.no_grad():
        #print(f.grad)
        #print(lx.grad)
        #print(x.grad)
        #print(v.grad)
        f -= learning_rate * f.grad
        #l += learning_rate * l.grad
        #print(x.grad[:,1:])
        x[:,1:,:] -= learning_rate * x.grad[:,1:,:] # Do not change Starting conditions
    #x.grad.data.zero_()
    #f.grad.data.zero_()
    l.grad.data.zero_()

    total_cost, lagrange, cost, xres = calc_loss(x,f, l)
    costs.append(total_cost[0])
    print(total_cost)
    #print(x)
    #total_cost.zero_grad()
    total_cost.backward()
    with torch.no_grad():
        #print(f.grad)
        #print(lx.grad)
        #print(x.grad)
        #print(v.grad)
        #f -= learning_rate * f.grad
        l += learning_rate * l.grad
        #print(x.grad[:,1:])
        #x[:,1:,:] -= learning_rate * x.grad[:,1:,:] # Do not change Starting conditions
    x.grad.data.zero_()
    f.grad.data.zero_()
    #l.grad.data.zero_()

print(x)
#print(v)
print(f)
plt.plot(xres[0,:,0].detach().numpy(), label='Xres')
plt.plot(xres[0,:,1].detach().numpy(), label='Vres')
plt.plot(xres[0,:,2].detach().numpy(), label='THeres')
plt.plot(xres[0,:,3].detach().numpy(), label='Thetadotres')
plt.plot(x[0,:,0].detach().numpy(), label='X')
plt.plot(x[0,:,1].detach().numpy(), label='V')
plt.plot(x[0,:,2].detach().numpy(), label='Theta')
plt.plot(x[0,:,3].detach().numpy(), label='Thetadot')
plt.plot(f[0,:].detach().numpy(), label='F')
#plt.plot(costs)
#plt.plot(l[0,:,0].detach().numpy(), label='Lx')
plt.legend(loc='upper left')
plt.show()

Problems:

- The step size is ad hoc.
- Lagrange multiplier technique does not seem to work
- Takes a lot of steps
- diverges
- seems to not be getting an actual solution
- Takes a lot of iterations

On the table: Better integration scheme. Hermite collocation?

Be more careful with scaling, what are the units?

Multiplier smoothing. Temporal derivative of Lagrange multiplier in cost?

alternate more complete solving steps

huber on theta position cost. Square too harsh? Punishes swing away too much?

more bullshit optimization strats as alternatives to grad descent

weight sooner more than later. Care more about earlier times since want to do model predictive control

Just solve eq of motion don’t worry about control as simpler problem

Pole up balancing

logarithm squeezing method – nope

The lambda * x model of Lagrange multiplier. Leads to oscillation

Damping term?

This learning rate is more like a discretization time step than a decay parameter. Well the product of both actually.

Heat equation model. Kind of relaxing everything into place
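A toy version of the lambda * x note and the damping question, under my own reading of it: take the Lagrangian L(x, lam) = lam * x, do gradient descent in x and gradient ascent in lam with the same step size, and watch the distance from the solution (0, 0). The optional 0.5 * damping * x**2 term is the "damping term"; the step size and step count are arbitrary choices.

```python
import math


def simulate(damping, eta=0.1, steps=1000):
    """Simultaneous descent in x / ascent in lam on
    L(x, lam) = lam * x + 0.5 * damping * x**2.
    Returns the final distance from the saddle point (0, 0)."""
    x, lam = 1.0, 0.0
    for _ in range(steps):
        gx = lam + damping * x  # dL/dx
        glam = x                # dL/dlam
        x, lam = x - eta * gx, lam + eta * glam
    return math.hypot(x, lam)


print(simulate(damping=0.0))  # spirals outward
print(simulate(damping=1.0))  # spirals inward
```

Without damping, each update is a rotation combined with a stretch by sqrt(1 + eta^2), so the iterates circle the solution and slowly drift away. This is the "learning rate as a discretization time step" picture: the continuous-time flow would orbit forever, and the discretization makes it worse. With the quadratic term the orbit decays.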

______________________________

Made some big adjustments

Switched to using pytorch optimizers. Adam seems to work the best. Maybe 5x as fast convergence as my gradient descent. Adagrad and Adadelta aren’t quite as good. Should still try momentum. Have to reset the initial conditions after every iteration. A better way? Maybe pass x0 in to calc_loss separately?

Switched over to using the method of multipliers http://www.cs.cmu.edu/~pradeepr/convexopt/Lecture_Slides/Augmented-lagrangian.pdf

The idea is to increase the quadratic constraint cost slowly over time, while adjusting a Lagrange multiplier term to compensate also. Seems to be working better. The scheduling of the increase is still fairly ad hoc.
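To see the mechanics in isolation, here is the same loop on a toy problem small enough to solve by hand: minimize (x - 2)^2 subject to x = 1. The toy problem is my own; the multiplier update and the additive rho schedule mirror the ones used in the code for this post, and the inner minimization is done in closed form instead of with an optimizer.

```python
def argmin_x(l, rho):
    """Exact minimizer over x of (x - 2)**2 + l*(x - 1) + rho*(x - 1)**2."""
    return (4.0 - l + 2.0 * rho) / (2.0 + 2.0 * rho)


l, rho = 0.0, 0.1
for outer in range(15):
    x = argmin_x(l, rho)   # inner solve with multiplier and penalty held fixed
    res = x - 1.0          # constraint residual, want 0
    l += 2.0 * rho * res   # multiplier update
    rho += 0.5             # ad hoc penalty schedule
    print(outer, round(x, 6), round(l, 6))
```

The residual shrinks by a factor of 1/(1 + rho) per outer iteration here, so x heads to 1 and the multiplier heads to 2 (the value that cancels the gradient of the cost at x = 1) even though rho never gets very large.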

import torch
import matplotlib.pyplot as plt
import numpy as np
import torch.optim

batch = 1
N = 50
T = 10.0
dt = T/N
NVars = 4

def getNewState():
    x = torch.zeros(batch,N,NVars, requires_grad=True)
    f = torch.zeros(batch, N-1, requires_grad=True)
    l = torch.zeros(batch, N-1,NVars, requires_grad=False)
    return x,f,l

def calc_loss(x,f, l, rho):
    delx = (x[:,1:,:] - x[:, :-1,:]) / dt
    xbar = (x[:,1:,:] + x[:, :-1,:]) / 2
    dxdt = torch.zeros(batch, N-1,NVars)
    THETA = 2
    THETADOT = 3
    X = 0
    V = 1
    dxdt[:,:,X] = xbar[:,:,V]
    dxdt[:,:,V] = f
    dxdt[:,:,THETA] = xbar[:,:,THETADOT]
    dxdt[:,:,THETADOT] = -torch.sin(xbar[:,:,THETA]) + f*torch.cos(xbar[:,:,THETA])
    xres = delx - dxdt
    #dyn_err = torch.sum(torch.abs(xres) + torch.abs(vres), dim=1) #torch.sum(xres**2 + vres**2, dim=1) # + Abs of same thing?
    lagrange_mult = torch.sum(l * xres, dim=1).sum(1)
    #cost = 0
    cost = 1.0*torch.sum(torch.abs(x[:,:,THETA]-np.pi), dim=1) # 0.1*torch.sum((x[:,:,X]-1)**2, dim=1) +
    #cost += 2.0 * torch.sum((x[:,30:-1,THETA] - np.pi)**2,dim=1)
    #cost += 7.0*torch.sum( torch.abs(xres)+ xres**2 , dim=1).sum(1)
    penalty = rho*torch.sum( xres**2 , dim=1).sum(1) # + 1*torch.sum( torch.abs(xres)+ xres**2 , dim=1).sum(1)
    # 5.0*torch.sum( torch.abs(xres)+ xres**2 , dim=1).sum(1) +
    cost += 0.01*torch.sum( f**2, dim=1)
    #cost += torch.sum(-torch.log(f + 1) - torch.log(1 - f),dim=1)
    cost += 0.1*torch.sum(-torch.log(xbar[:,:,X] + 1) - torch.log(1 - xbar[:,:,X]),dim=1)
    cost += 0.1*torch.sum(-torch.log(xbar[:,:,V] + 1) - torch.log(1 - xbar[:,:,V]),dim=1)
    #cost += torch.sum(-torch.log(xres + .5) - torch.log(.5 - xres),dim=1).sum(1)
    # torch.sum( torch.abs(xres), dim=1).sum(1)*dt +
    #cost = torch.sum((x-1)**2, dim=1)
    total_cost = cost + lagrange_mult + penalty #100 * dyn_err + reward
    return total_cost, lagrange_mult, cost, xres

import torch.optim as optim

learning_rate = 0.001
x, f, l = getNewState()
optimizers = [torch.optim.SGD([x,f], lr= learning_rate), torch.optim.Adam([x,f]), torch.optim.Adagrad([x,f])]
optimizerNames = ["SGD", "Adam", "Adagrad"]
optimizer = optimizers[1]
#optimizer = torch.optim.SGD([x,f], lr= learning_rate)
#optimizer = torch.optim.Adam([x,f])
#optimizer = torch.optim.Adagrad([x,f])
costs= []
path = []
mults = []
rho = 0.1
prev_cost = 0
for j in range(15):
    prev_cost = None
    for i in range(1,10000):
        total_cost, lagrange, cost, xres = calc_loss(x,f, l, rho)
        costs.append(total_cost[0])
        if i % 5 == 0:
            #pass
            print(total_cost)
        optimizer.zero_grad()
        total_cost.backward()
        optimizer.step()
        with torch.no_grad():
            x[0,0,2] = 0#np.pi+0.3 # Initial Conditions
            x[0,0,0] = 0
            x[0,0,1] = 0
            x[0,0,3] = 0
        #print(x.grad.norm()/N)
        #print((x.grad.norm()/total_cost/N).detach().numpy() < 0.01)
        #if (x.grad.norm()).detach().numpy()/N < 0.1: #Put Convergence condition here
        if i > 2000:
            break
        if prev_cost:
            if ((total_cost - prev_cost).abs()/total_cost).detach().numpy() < 0.000001:
                pass #break
        prev_cost = total_cost
    total_cost, lagrange, cost, xres = calc_loss(x,f, l, rho)
    costs.append(total_cost[0])
    with torch.no_grad():
        l += 2 * rho * xres
    rho = rho + 0.5
    print(rho)

plt.subplot(131)
plt.plot(xres[0,:,0].detach().numpy(), label='Xres')
plt.plot(xres[0,:,1].detach().numpy(), label='Vres')
plt.plot(xres[0,:,2].detach().numpy(), label='THeres')
plt.plot(xres[0,:,3].detach().numpy(), label='Thetadotres')
plt.legend(loc='upper right')
#plt.figure()
plt.subplot(132)
plt.plot(x[0,:,0].detach().numpy(), label='X')
plt.plot(x[0,:,1].detach().numpy(), label='V')
plt.plot(x[0,:,2].detach().numpy(), label='Theta')
plt.plot(x[0,:,3].detach().numpy(), label='Thetadot')
plt.plot(f[0,:].detach().numpy(), label='F')
#plt.plot(cost[0,:].detach().numpy(), label='F')
plt.legend(loc='upper right')
#plt.figure()
plt.subplot(133)
plt.plot(costs)
plt.show()

The left is residuals of obeying the equations of motion, the middle is the force and trajectories themselves and the right is cost vs iteration time. Not entirely clear that a residual of 0.04 is sufficient. Integrated over time this could be an overly optimistic error of 0.2 ish I’d guess. That is on the edge of making me uncomfortable. Increase rho more? Also that force schedule seems funky and overly complex. Still, improvement from before. Feels like we’re cookin’ with gas


The post Garbage Can Compiling to Categories with Inspectable Lambdas appeared first on Hey There Buddo!.

Linear functions can be reconstituted into a matrix if you give a basis of vectors.

Functions from enumerable types can be turned into a lookup table

Sufficiently polymorphic functions are another example though. forall a. a -> a is commonly known to only be id. The same goes for fst :: forall a b. (a,b) -> a and snd and swap and all the nestings of tuples. These functions have exactly one inhabiting value (excluding internal churning and the possibility of going into an infinite loop).

So the type directly tells us the implementation

forall a. (a,a)->a is similar. It can only be fst or snd. Types that reuse a type parameter in the input can only be permutations.

I’ve been trying to find a way to take a written lambda and convert it to data automatically and have been having trouble.

An opaque type T whose constructors we have hidden behaves the same way: (T,T)->T can only be fst or snd specialized to T, since we can’t possibly destructure a T.

We can figure out which one by giving a labeled example to that function and then inspecting a single output. This gives the permutation and duplication that was done.

Similarly for T -> Either T T

Once we have this, we can (Hopefully) reinterpret this lambda in terms of a monoidal category.
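The labeled-example trick is easy to see in a dynamically typed setting. Below is a small Python analogue (not a translation of the Haskell that follows): fill the input shape with unique integer tags, run the function once, and read off which projections each output leaf must have come from. The names label, paths and reify, and the "fan"/"proj" tuples used to describe the result, are made up for this sketch.

```python
from itertools import count


def label(shape, fresh=None):
    """Fill a nested-pair shape (None at the leaves) with unique integer tags."""
    fresh = count() if fresh is None else fresh
    if shape is None:
        return next(fresh)
    left, right = shape
    return (label(left, fresh), label(right, fresh))


def paths(value, here=()):
    """Map each tag to the chain of projections that reaches it from the input."""
    if not isinstance(value, tuple):
        return {value: here}
    left, right = value
    table = paths(left, here + ("fst",))
    table.update(paths(right, here + ("snd",)))
    return table


def reify(f, shape):
    """Run f once on a labeled input and describe its output as data."""
    example = label(shape)
    table = paths(example)

    def describe(out):
        if isinstance(out, tuple):
            # A pair in the output: build both halves from the same input.
            return ("fan", describe(out[0]), describe(out[1]))
        # A leaf in the output: look up where that tag sat in the input.
        return ("proj",) + table[out]

    return describe(f(example))


swap = lambda p: (p[1], p[0])
print(reify(swap, (None, None)))

dig = lambda p: (p[0][1], p[1])
print(reify(dig, ((None, None), None)))
```

Like the Haskell version, this only makes sense for functions that do nothing but shuffle, drop and duplicate the pieces of their input; anything that inspects or computes with a leaf would produce nonsense.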

{-# LANGUAGE RankNTypes, GADTs, FlexibleInstances, DataKinds, TypeFamilies,MultiParamTypeClasses, FlexibleContexts, ScopedTypeVariables, FunctionalDependencies, GADTs, TypeOperators #-}
--AllowAmbiguousTypes,
-- OverlappingInstances,
-- UndecidableInstances,

import Data.Proxy
import Unsafe.Coerce

data Tag = Tag Int deriving Show

type family (MonoMorphTag a) :: * where
  MonoMorphTag (a,b) = (MonoMorphTag a, MonoMorphTag b)
  MonoMorphTag (a->b) = (MonoMorphTag a) -> (MonoMorphTag b)
  MonoMorphTag Int = Int
  MonoMorphTag [a] = [a]
  MonoMorphTag (a,b,c) = (a,b,c)
  MonoMorphTag (a,b,c,d) = (a,b,c,d)
  MonoMorphTag Double = Double
  MonoMorphTag () = ()
  MonoMorphTag Char = Char
  MonoMorphTag _ = Tag

unsafeMonoTag :: a -> MonoMorphTag a
unsafeMonoTag = unsafeCoerce

-- unsafeTagLeaves :: forall a. MonoMorphTag a -> Tag
-- unsafeTagLeaves = unsafeCoerce

type T = Tag

class GetVal a where
  val :: Int -> Proxy a -> (a, Int)

instance GetVal Tag where
  val n _ = (Tag n, n+1)

instance (GetVal a, GetVal b) => GetVal (a,b) where
  val n _ = ((v1, v2), n'')
    where
      (v1 , n') = val n (Proxy :: Proxy a)
      (v2 , n'') = val n' (Proxy :: Proxy b)

data TagTree a = Node (TagTree a) (TagTree a) | Leaf a deriving Show
-- | Apply (k a b) TagTree

class Treeify a b where
  treeify :: a -> TagTree b

instance Treeify Tag Tag where
  treeify x = Leaf x

instance (Treeify a Tag, Treeify b Tag) => Treeify (a,b) Tag where
  treeify (a,b) = Node (treeify a) (treeify b)

class MonoMorph a where
  type Mono a :: *

instance MonoMorph (a,b) where
  type Mono (a,b) = (Mono a, Mono b)

{-
instance MonoMorph (MonoMorphTag a) where
  type Mono a = Tag
-}

{-
-- Hmm I'm not sure how to monomorphize this.
fst' :: (TagTup a) => (a, b) -> a
fst' = fst
-}

{-
class AutoCurry a b | a -> b where
  autocurry :: a -> b

instance AutoCurry (a->b->Tag) ((a,b)->Tag) where
  autocurry f = uncurry f

instance AutoCurry c (a->c') => AutoCurry (b->c) ((b,a) -> c') where
  autocurry f = uncurry (\b -> autocurry (f b))
-}

data Monoidal = Dup | Mon Monoidal Monoidal | Par Monoidal Monoidal | Fst | Snd | Id | Comp Monoidal Monoidal deriving Show

data Monoidal' a b where
  Id' :: Monoidal' a a
  Dup' :: Monoidal' a (a,a)
  Fst' :: Monoidal' (a,b) a
  Snd' :: Monoidal' (a,b) b
  Comp' :: Monoidal' b c -> Monoidal' a b -> Monoidal' a c
  Mon' :: Monoidal' a a' -> Monoidal' b b' -> Monoidal' (a,b) (a',b')

data FunData = FunData {inval :: TagTree Tag, outval :: TagTree Tag} deriving Show

class TestIdea a b where
  works :: (a -> b) -> (a, b)

instance (GetVal a) => TestIdea a b where
  works f = (inval, f inval)
    where inval = fst $ val 0 (Proxy :: Proxy a) -- fst $ val 0 (Proxy :: Proxy b)

fuckmyshitup :: (GetVal a, Treeify a Tag, Treeify b Tag) => (a -> b) -> FunData
fuckmyshitup f = let (a, b) = works f in FunData ((treeify a) :: TagTree Tag) ((treeify b):: TagTree Tag)

ccc :: FunData -> Monoidal
ccc (FunData x (Node y z)) = Mon (ccc $ FunData x y) (ccc $ FunData x z)
ccc (FunData (Leaf _) (Leaf _)) = Id
-- The tag may sit anywhere inside x, so search the whole left subtree
-- (the original called inleft n x, which only looked in the left half of x).
ccc (FunData (Node x y) z@(Leaf (Tag n))) = if ineither n x then Comp Fst (ccc (FunData x z)) else Comp Snd (ccc (FunData y z))

ineither :: Int -> TagTree Tag -> Bool
ineither n (Node x y) = (ineither n x) || (ineither n y)
ineither n (Leaf (Tag n')) = n == n'

inleft :: Int -> TagTree Tag -> Bool
inleft n (Node l r) = ineither n l
inleft n (Leaf (Tag n')) = n == n'

inright :: Int -> TagTree Tag -> Bool
inright n (Node l r) = ineither n r
inright n (Leaf (Tag n')) = n == n'

-- Then we can compile to categories. Replacing the entire structure with dup and par and
-- fst, snd, etc.

-- Make an infix operator $'
--data Apply k a b c = Apply (FreeCat k a b) c
--type ($$) = Apply

-- No, don't need getval.
-- We'll just need it for treeify?
{-instance GetVal c => GetVal (Apply k a b c) where
  val n _ = where x, n' = val n Proxy c
-}

-- Another Option
data A
data B
data C
-- This is basically a lambda calculus
-- I could probably finitely enumerate through all the typeclasses for all the variables
example = Proxy :: Proxy ((A,B) -> B)

-- Hmm this would allow you to force duplicate input types though.
{-
class (Tagify a ~ a, Tagify b ~ b) => TestIdea a b where
  works :: (a -> b) -> (a, b)

instance (GetVal a) => TestIdea a b where
  works f = (inval, f inval)
    where inval = fst $ val 0 (Proxy :: Proxy a) -- fst $ val 0 (Proxy :: Proxy b)
-}

--thisworks :: String
--thisworks = works id
-- fst . (val 0)

{-
instance (F a ~ flag, GetVal' flag a) => GetVal a where
  val = val' (Proxy :: Proxy flag)

class GetVal' (flag :: Bool) a where
  val' :: Proxy flag -> a -> Tagify a

instance (GetVal a, GetVal b) => GetVal' 'True (a,b) where
  val' _ (x,y) = (val x, val y)

instance GetVal' 'False a where
  val' _ x = Tag 0
-}

What about TH? Also the new quantified constraints extensions might be helpful?

Ok. A different approach. This works much closer to what I had in mind. You can write arbitrary tuple-like lambdas such as (\(x,y) -> (y,x)) and it will convert them to a category. I really had to hack around to get the thing to compile. Like that Pick typeclass, what the heck? Why can I get default values in type families but not in typeclasses?

It is all decidedly not typesafe. You can get totally nonsensical things to compile to something. However if you stick to lambdas, you’ll be ok. Maybe.

No, on further review this does not work. I got tricked because the type seemed ok at a certain point. A couple of problems arise upon actual application. Since the idea is to drive the form based on the type variables, upon actual application to something that has types of the same form it gets all screwed up. Also, tons of instances are overlapping, although I think this is fixable.

Maybe what I need is existential types that can’t ever unify together accidentally.

A couple of thoughts on typelevel programming principles:

- It is hard to get default cases in typeclasses. If you want a default case, use a (closed) type family instead.
- Typeclass instances need something unique to appear on the right-hand side (the instance head). Only one pattern should match. You might need to add extra parameters to match on, which you can then force from the left-hand side (the context) of the instance.
- The ~ type equality constraint is really useful.
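A small sketch of how these three points fit together, along the lines of the commented-out GetVal' code above. Describe, Describe' and this IsTup are illustrative names for this example only. The closed type family supplies the default case, the extra flag parameter makes each instance head unique, and ~ forces the flag from the context.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators,
             MultiParamTypeClasses, FlexibleInstances, FlexibleContexts,
             ScopedTypeVariables, UndecidableInstances #-}

import Data.Proxy (Proxy (..))

-- Closed type family: equations are tried top to bottom,
-- so the last one acts as the default case.
type family IsTup a :: Bool where
  IsTup (a, b) = 'True
  IsTup a      = 'False

-- Front-end class that users call.
class Describe a where
  describe :: a -> String

-- Helper class with an extra flag parameter to match on.
class Describe' (flag :: Bool) a where
  describe' :: Proxy flag -> a -> String

-- The two heads cannot both match because the flags differ.
instance (Describe a, Describe b) => Describe' 'True (a, b) where
  describe' _ (x, y) = "(" ++ describe x ++ "," ++ describe y ++ ")"

instance Describe' 'False a where
  describe' _ _ = "_"

-- A single catch-all instance; the flag is forced in the context via ~.
instance (flag ~ IsTup a, Describe' flag a) => Describe a where
  describe = describe' (Proxy :: Proxy (IsTup a))
```

For example, describe ((1 :: Int, 'a'), True) walks the tuple structure and prints a placeholder for every non-tuple leaf. This only works when the argument type is concrete enough for IsTup to reduce, which is exactly the limitation that bites with polymorphic lambdas.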

An alternative to using lambda is to use an explicit Proxy. The type variables are basically just as good for syntactic purposes (a touch more noisy).
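Here is a rough sketch of what the Proxy route could look like, using distinct empty types A, B, C as stand-ins for the variables (like the example = Proxy :: Proxy ((A,B) -> B) line above). Everything else (Mor, InLeft, Pick, Compile, compile) is a hypothetical name made up for this sketch, not the code from this post. Because the tags are concrete types that can never unify with each other, no overlapping or incoherent instances are needed.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators,
             MultiParamTypeClasses, FlexibleInstances, FlexibleContexts,
             ScopedTypeVariables, UndecidableInstances #-}

import Data.Proxy (Proxy (..))
import Data.Type.Bool

-- Distinct empty types standing in for the lambda's variables.
data A
data B
data C

-- First-order description of a cartesian morphism (hypothetical).
data Mor = Id | Fst | Snd | Comp Mor Mor | Fan Mor Mor
  deriving (Eq, Show)

type family IsTup a :: Bool where
  IsTup (a, b) = 'True
  IsTup a      = 'False

type family In a c :: Bool where
  In a a      = 'True
  In a (l, r) = In a l || In a r
  In a c      = 'False

-- Is the tag somewhere in the left branch of the shape?
type family InLeft tag shape :: Bool where
  InLeft tag (l, r) = In tag l
  InLeft tag c      = 'False

-- Project a tag out of the input shape.
class Pick (inLeft :: Bool) tag shape where
  pick :: Proxy inLeft -> Proxy tag -> Proxy shape -> Mor

instance Pick flag tag tag where
  pick _ _ _ = Id

instance Pick (InLeft tag l) tag l => Pick 'True tag (l, r) where
  pick _ t _ = Comp (pick (Proxy :: Proxy (InLeft tag l)) t (Proxy :: Proxy l)) Fst

instance Pick (InLeft tag r) tag r => Pick 'False tag (l, r) where
  pick _ t _ = Comp (pick (Proxy :: Proxy (InLeft tag r)) t (Proxy :: Proxy r)) Snd

-- Walk the output shape: tuples fan out, leaves are projections.
class Compile (isTup :: Bool) i o where
  compile' :: Proxy isTup -> Proxy i -> Proxy o -> Mor

instance (Compile (IsTup o1) i o1, Compile (IsTup o2) i o2)
      => Compile 'True i (o1, o2) where
  compile' _ i _ =
    Fan (compile' (Proxy :: Proxy (IsTup o1)) i (Proxy :: Proxy o1))
        (compile' (Proxy :: Proxy (IsTup o2)) i (Proxy :: Proxy o2))

instance Pick (InLeft o i) o i => Compile 'False i o where
  compile' _ i o = pick (Proxy :: Proxy (InLeft o i)) o i

compile :: forall i o. Compile (IsTup o) i o => Proxy (i -> o) -> Mor
compile _ = compile' (Proxy :: Proxy (IsTup o)) (Proxy :: Proxy i) (Proxy :: Proxy o)
```

With this, compile (Proxy :: Proxy ((A, B) -> (B, A))) evaluates to Fan (Comp Id Snd) (Comp Id Fst). Asking for a tag that is not in the input, such as Proxy (A -> B), should fail at compile time with a missing Pick instance. The price is the noise of writing the type out by hand instead of a lambda.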

{-# LANGUAGE RankNTypes, GADTs, FlexibleInstances, DataKinds, TypeFamilies,
             MultiParamTypeClasses, ImpredicativeTypes, FlexibleContexts,
             ScopedTypeVariables, FunctionalDependencies, UndecidableInstances,
             GADTs, TypeOperators #-}
-- OverlappingInstances, NoImplicitPrelude
-- --UndecidableInstances, --OverlappingInstances,

import Data.Type.Bool
import Data.Proxy
--import Control.Category
--import GHC.Base hiding (id,(.))

class IsId a where
  val :: a -> a
  -- toCat

instance forall a. IsId (a -> a) where
  val _ = id

{-
class Catable f a b | f -> a,b where
  toCat :: forall k. CartesianCategory k => k a b

instance forall a b. Catable ((a,b)->a) (a,b) a where
  toCat = fst
-}

class Fst ab a | ab -> a where
  -- toCat :: forall k. k ab a

instance forall a b. Fst (a,b) a

class Anything b where
  fun :: b -> b

class Stringly a where
  stringly :: a -> String

instance (Stringly a, Stringly b) => Stringly (a,b) where
  stringly (x,y) = "(" ++ (stringly x) ++ "," ++ (stringly y) ++ ")"

{-
instance (Stringly a, Stringly b) => Stringly (a -> b) where
  stringly f = "(" ++ (stringly x) ++ "->" ++ (stringly y) ++ ")"
-}

class Category k where
  dot' :: k b c -> k a b -> k a c
  id' :: k a a

instance Category (->) where
  dot' = (.)
  id' = id

class Category k => CartesianCat k where
  fst' :: k (a,b) a
  snd' :: k (a,b) b
  join' :: k a b -> k a c -> k a (b,c)

instance CartesianCat (->) where
  fst' = fst
  snd' = snd
  join' = join''

class Catable a b where
  toCat :: CartesianCat k => (a -> b) -> (k a b)

-- toCat (\x -> ((x,x),x)) . id
-- it's not INSANE to just list out a finite list of possibilities ((a,b),c) etc.

{-
data HeldApply k a b = HeldApply (k a b) a

($$) :: Category k => k a b -> b -> HeldApply k a b
f $$ x = HeldApply f

instance Catable a (HeldApply a b) where
  toCat

Doesn't seem to work. We don't have an a to get the HeldApply out of the function.
Maybe we could pass in the appropriate function as a lambda \f x -> Apply f x

instance ExponentialCategory k where
  apply :: k (k a b, a) b
-}

instance Catable a a where
  toCat _ = id'

-- why is this okay? should these be covered by the other cases?
instance Catable (a,b) a where
  toCat _ = fst'

instance Catable (a,b) b where
  toCat _ = snd'

dup x = (x,x)

{-
instance Catable a (a,a) where
  toCat _ = dup
-}

join'' f g x = (f x, g x)

-- iterates down through the output
instance (Catable a b, Catable a c) => Catable a (b,c) where
  toCat f = join' (toCat (fst . f)) (toCat (snd . f))

{-
instance (InL c (a,b), Catable a c) => Catable (a,b) c where
  toCat f = (toCat (f . fst))

instance (InR c (a,b), Catable a c) => Catable (a,b) c where
  toCat f = (toCat (f . snd))
-}

instance (Catable a c, Catable b c, Pick' c (a,b) (In a c)) => Catable (a,b) c where
  toCat f = pick' (Proxy :: Proxy (In a c))

{-
instance (Catable a c, Catable b c, Pick c (a,b) (In a c)) => Catable (a,b) c where
  toCat f = (toCat (pick (Proxy :: Proxy (In a c))))
-}

{-
class In a c where
  find :: c -> a

instance In a a
  find = id

instance In a b => In a (b,c)
  find = find . fst

instance In a c => In a (b,c)
  find = find . snd
-}

{-
type family (LorR a c) :: Nat where
  LorR a (a,_) = 1
  LorR a (_,a) = 2
  LorR a ((b,c),d) = (LorR a (b,c)) + (LorR a d)
  LorR a (d,(b,c)) = (LorR a (b,c)) + (LorR a d)
  LorR a _ = 0
-}

type family (In a c) :: Bool where
  In a a = 'True
  In a (a,_) = 'True
  In a (_,a) = 'True
  In a ((b,c),d) = In a (b,c) || In a d
  In a (d,(b,c)) = In a (b,c) || In a d
  In a _ = 'False

{-
type Snd = forall a b. (a,b) -> b

type family (FstSnd a) :: * where
  FstSnd 'True = Snd
  FstSnd 'False = Snd
-}

class Pick a c (d :: Bool) where
  pick :: Proxy d -> c -> a

instance (Pick a (e,f) (In a e), (e,f) ~ b) => Pick a (b,c) 'True where
  pick _ = (pick (Proxy :: Proxy (In a e))) . fst

instance (Pick a (e,f) (In a e), (e,f) ~ c) => Pick a (b,c) 'False where
  pick _ = (pick (Proxy :: Proxy (In a e))) . snd

instance Pick a (a,b) 'True where
  pick _ = fst

instance Pick a (b,a) 'False where
  pick _ = snd

instance Pick a a d where
  pick _ = id

-- The bool is true if in the left branch
class Pick' a c (d :: Bool) where
  pick' :: CartesianCat k => Proxy d -> k c a

instance (Pick' a (e,f) (In a e), (e,f) ~ b) => Pick' a (b,c) 'True where
  pick' _ = dot' (pick' (Proxy :: Proxy (In a e))) fst'

instance (Pick' a (e,f) (In a e), (e,f) ~ c) => Pick' a (b,c) 'False where
  pick' _ = dot' (pick' (Proxy :: Proxy (In a e))) snd'

instance Pick' a (a,b) 'True where
  pick' _ = fst'

instance Pick' a (b,a) 'False where
  pick' _ = snd'

instance Pick' a a d where
  pick' _ = id'

{-
class InL a c where

instance InL a a

instance In a b => InL a (b,c)

class InR a c

instance InR a a

instance In a b => InR a (c,b)
-}

{-
instance (Catable a c, Catable b c) => Catable (a,b) c where
  toCat f =

instance (Catable a c, Catable b c) => Catable a (b,c) where
  toCat f =
-}

{-
instance (Stringly a, Stringly b, (a,b) ~ c, IsTup c ~ 'True) => Stringly c where
  stringly (x,y) = "(" ++ (stringly x) ++ "," ++ (stringly y) ++ ")"
-}

--instance (IsTup a ~ 'False, IsArr a ~ 'False) => Stringly a where
--  stringly _ = "_"

instance forall a. Anything a where
  fun = id

example :: a -> a
example = val id

type family (IsTup a) :: Bool where
  IsTup (a,b) = 'True
  IsTup _ = 'False

type family (IsArr a) :: Bool where
  IsArr (a->b) = 'True
  IsArr _ = 'False

The post Garbage Can Compiling to Categories with Inspectable Lambdas appeared first on Hey There Buddo!.

]]>