LQR with CVXPY

Holy crap this was easy.

And I was able to easily add a constraint on the available force. Hot damn. It is a ridiculously tiny problem I guess, but still pretty damn cool. 0.002 second runtime.

 

STM32F411 Discovery Board Getting started

Bought one of these Discovery boards for $15 from Digikey. I like the built-in stuff, like the 9-axis MEMS device. I don’t particularly enjoy wiring up little boards, and it makes every project so ephemeral.

http://www.st.com/content/st_com/en/products/evaluation-tools/product-evaluation-tools/mcu-eval-tools/stm32-mcu-eval-tools/stm32-mcu-discovery-kits/32f411ediscovery.html

I am concerned that I should have gotten the older board.

The user manual has the pin connections

http://www.st.com/content/ccc/resource/technical/document/user_manual/e9/d2/00/5e/15/46/44/0e/DM00148985.pdf/files/DM00148985.pdf/jcr:content/translations/en.DM00148985.pdf

It’s in platform.io as

disco_f411ve

Huh it only supports STM32Cube? I kind of wanted to try libOpenCM3

start a project with

platformio init --board disco_f411ve

platformio run

The examples are invaluable

https://github.com/platformio/platform-ststm32/tree/develop/examples

ok. There is a blink for the Hardware abstraction layer (HAL) and Low level (LL)

Hmm. Neither blink example works off the bat. SYS_CLOCK is undefined for the low-level one

and the board is not supported in the HAL one.

 

 

Alright let’s try libopencm3

https://github.com/libopencm3/libopencm3-examples

make in top directory first

follow directions to init submodule

Alright, I guess I need to download the ARM toolchain

https://developer.arm.com/open-source/gnu-toolchain/gnu-rm/downloads

I put them on my path

export PATH=~/Downloads/gcc-arm-none-eabi-7-2017-q4-major/bin:$PATH

Then ran make

also need to apt install gawk

Edited the Makefile of miniblink

https://github.com/libopencm3/libopencm3/blob/master/ld/devices.data

stm32f411re stm32f4 ROM=512K RAM=128K

That’s the right config even though the “re” suffix is wrong for this board

Okay was getting weird error

sudo make flash V=1

the V=1 gives verbose

I may have had to apt install openocd?

Need to run under sudo. Go fig.

Alright got some blinking! WOOO

Ah I see. PD12 is PortD12. Makes sense.

 

Trying Platform.io

PlatformIO and Visual Studio Take over the World

http://platformio.org/

Somehow I was not aware of this thing. It is a build tool for microcontrollers

Seems like people basically like it. 1000+ stars on github

python -m pip install -U platformio

 

make a folder

platformio init --board icestick

Holy crap. Is this thing going to download and setup the tools? THAT. IS. AWESOME. If it works.

Better yet clone this bad boy

https://github.com/platformio/platformio-examples

https://github.com/platformio/platformio-examples/tree/develop/lattice_ice40/lattice_ice40-leds

go to the blink folder.

platformio run

platformio run --target upload

Holy. Hell. It worked. THAT IS NUTS.

The commands it ran to compile

Hmm. I’m puzzled. Where did this come from? How did it know counter.v?

 

Mecrisp has an icestick version. Intriguing (Mecrisp is a forth implementation)

 

 

https://github.com/platformio/platform-lattice_ice40/tree/develop/examples/leds

had to sudo apt install libreadline6 and gtkwave to run simulation

I had to follow these instructions to get the FTDI device to work

https://stackoverflow.com/questions/36633819/iceprog-cant-find-ice-ftdi-usb-device-linux-permission-issue

and change the platformio.ini file to say icestick instead of icezum. Actually I don’t think that is necessary.

 

 

 

Cart Pole using Lyapunov and LQR control, OpenAI gym

We’re having a lot of trouble hacking together a reinforcement learning version of this, so we are taking an alternative approach, inspired by watching the MIT underactuated robotics course.

http://underactuated.csail.mit.edu/underactuated.html?chapter=acrobot

It took some pen and paper to get the equations of motion (which are maybe right?).

openai gym has

We switch over to LQR when the y position of the pole is above a certain height

https://en.wikipedia.org/wiki/Linear%E2%80%93quadratic_regulator

This scipy function solves the algebraic Riccati equation from the continuous-time, infinite-horizon section

https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.linalg.solve_continuous_are.html
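A rough sketch of how the Riccati solution turns into a feedback gain, plus the height-based switch. The linearized cart-pole model (point-mass pole, no friction, theta measured from upright), all parameter values, the cost weights, and the switching threshold are assumptions for illustration, not the equations or numbers actually used here.

```python
import math

import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed physical parameters: gravity, cart mass, pole mass, pole length.
g, M, m, l = 9.8, 1.0, 0.1, 0.5

# Linearization about upright, state = [x, x_dot, theta, theta_dot].
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -m * g / M, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, (M + m) * g / (M * l), 0.0],
])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * l)]])

Q = np.diag([1.0, 1.0, 10.0, 1.0])   # weights on x, x_dot, theta, theta_dot (assumed)
R = np.array([[0.1]])                # weight on force (assumed)

P = solve_continuous_are(A, B, Q, R)   # solves the algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)        # K = R^-1 B^T P


def lqr_force(state):
    """Linear state feedback u = -K x."""
    return -float((K @ np.asarray(state, dtype=float))[0])


def use_lqr(theta, height_threshold=0.8 * l):
    """Hand over to LQR once the pole tip is above a chosen height (threshold assumed)."""
    return l * math.cos(theta) > height_threshold
```

The theta and thetadot entries of `Q` are the weights that need tuning; the closed-loop matrix `A - B @ K` should have all eigenvalues in the left half plane.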

Things that helped: Trying to balance the pole first from the upright position, then from the hanging-down position.

Tuning weights for theta and thetadot. A thetadot weight that was too small made it unstable.

Hacked in the LQR control by adjusting the force_mag variable. Nasty.

 

Put in some slight compensation for a delayed observation, which reflects our actual sensor system
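One simple way to compensate for a delayed observation is to extrapolate the reading forward by the delay. This is only a sketch of that idea, assuming constant velocity over the delay and an observation ordered as (x, x_dot, theta, theta_dot); the function names are hypothetical and the actual compensation used may differ.

```python
def extrapolate(value, rate, delay):
    """First-order prediction of where `value` is now, `delay` seconds after it was measured."""
    return value + rate * delay


def compensate_observation(obs, delay):
    """Shift the position-like entries forward in time; velocities are left unchanged."""
    x, x_dot, theta, theta_dot = obs
    return (
        extrapolate(x, x_dot, delay),
        x_dot,
        extrapolate(theta, theta_dot, delay),
        theta_dot,
    )
```

The controller is then fed `compensate_observation(obs, delay)` instead of the raw reading.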

 

 

 

 

 

Linksprite CNC machine

I’m impressed with the package in the box. Well organized.

Putting it together took maybe 4 hours, half watching The Wire.

The x-axis screw is not fitting. I’m hoping the end is just ground incorrectly.

Nope. This has been a huge pain. They sent me the wrong screw. There are 8mm, 4mm, and 2mm pitch T8 rods. They have 1, 2, and 4 starts to their threading. I have bought the maximal number of incorrect rods. I now have in my possession the 4mm pitch rod I need. Figuring this out has set me back a month and another $30. I am annoyed.

There is a crack in the z-axis printed part. Epoxy should fix it.

open up arduino serial monitor

115200 baud both NL & CR

 

Hmm. LinuxCNC is more complicated than I thought. It seems installing it on my regular Ubuntu 14.04 is not an option without a lot of dangerous futzing. It uses a special real-time kernel.

http://linuxcnc.org/docs/devel/html/code/building-linuxcnc.html

Whenever it didn’t work I apt-get installed whatever was missing.

I also had to add a non redistributable when that error came up.

make

 

 

 

Two control guys. One in Python, one in Java

https://github.com/vlachoudis/bCNC

https://github.com/winder/Universal-G-Code-Sender

 

Pycam and Blendercam do 3d models. Pycam appears to be defunct

http://pycam.sourceforge.net/

go to folder and make.

The website seems wrong. The links aren’t.

Build error

https://sourceforge.net/p/pycam/mailman/pycam-devel/

 

 

http://blendercam.blogspot.com/

 

Pcb

https://github.com/pcb2gcode/pcb2gcode

 

Inkscape gcodetools.

https://github.com/cnc-club/gcodetools#gcodetools

An Intro to G-code and How to Generate It Using Inkscape

 

Openscam (now called camotics?) is a path simulator.

 

http://jscut.org/

 

S1000 sets the spindle speed

M3 turns spindle on

M5 turns off

even 0.2 depth is a little deep

home. reset

turn on the spindle
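The spindle commands above can be strung together into a tiny program. The sketch below just builds the G-code lines as strings; the depth, cut length, and feed rate are made-up illustration values, and G21/G90/G0/G1 are standard G-code words that are not mentioned in these notes, so check them against the controller before sending anything.

```python
def shallow_pass(depth_mm=0.1, length_mm=10.0, spindle_rpm=1000, feed_mm_min=100):
    """Return G-code lines for one shallow straight cut along X (all values assumed)."""
    return [
        "G21",                                   # millimetre units
        "G90",                                   # absolute coordinates
        f"S{spindle_rpm}",                       # set spindle speed
        "M3",                                    # spindle on
        f"G1 Z{-depth_mm:.2f} F{feed_mm_min}",   # plunge to cutting depth
        f"G1 X{length_mm:.2f} F{feed_mm_min}",   # cut along X
        "G0 Z5.00",                              # retract above the work
        "M5",                                    # spindle off
    ]


program = shallow_pass()
```

The default depth is 0.1 because even 0.2 turned out to be a little deep.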

 

3d Printed Soda Can Stirling Engine

Success!

After a very disappointing day trying to use hand twisted wire crankshafts and bottle cap piston tops, we shifted over to 3d printing to make a well dimensioned soda can Stirling engine. We found that the crankshaft sizes have to be really surprisingly small. Remember that things move twice that radial distance over a full revolution.

Things that may have helped: We got those Sterno heat things. Actually it was too aggressive. The steel wool displacer was nearly the width of the.

We kept the floss hole pretty tight, by poking it with a small object and threading the floss through with a needle and running it back and forth until the steel wool weight is enough. It is basically air tight and a little too sticky, but it works.

The files can be found here https://cad.onshape.com/documents/65e85cdc777333555f580b5a/w/e8a82a796b8d794bd914d077/e/82b3f39011e3b6223a8eb6be

Mostly everything is 3d printed, with some 3mm screws and 1/8in home depot steel rod for axles.

Currently we hot glued the crankshaft together. A better idea possibly is to print them a little tight and then heat the shaft with a lighter and sort of melt it through a bit. Might make for easier assembly.

The free wheeling floss pulley is clutch. We had a lot of problems with binding on our crappy version.

The displacer and piston rod are 90 degrees out of phase. Check out the assembly in the Onshape document to see.

We do have difficulty with melting parts. The stand in particular keeps melting. We’re thinking about it.

Haskell Gloss is awesome

Gloss is a super simple binding for drawing 2D stuff in a window for Haskell

https://hackage.haskell.org/package/gloss

Given how relatively bewildering Hello World feels in Haskell, it’s surprising how easy it is to get an animated window going. I think that may be because of expectations. You expect Hello World to be really easy with no confusion, and yet there is an inexplicable IO monad-what?, whereas you expect drawing to involve a bunch of inexplicable boilerplate bullshit.

This simulates a pendulum. Drawing it as a line.

simulate takes a pile of arguments

first thing describes a window, with title text, size and screen position I think?

then background color

frames per second

initial state (0 angle 0 angular velocity)

a drawing function from state to a picture, which is a gloss type of lines and crap

and a state update function taking a time step dt and current state.

 

I feel like there is room for a completely introductory tutorial to Haskell using gloss. It’s so rewarding to see stuff splash up on the screen.

MAYBE I’LL DO IT. or maybe not.

An ignoramus thinking about Compiling to Categories

http://conal.net/papers/compiling-to-categories/

I haven’t thoroughly read this (in fact barely at all) and yet I have some thoughts.

Conal Elliott is up to some good shit. Maybe ignore me and just check out the links.

Simply Typed Lambda Calculus (STLC)

Hask is a category of Haskell types with functions as arrows between them, but there is a lot going on in Haskell. Polymorphism for example, which is often modeled as some kind of Natural Transformation of EndoFunctors from Hask to Hask (but I don’t think this covers every possible use of polymorphism. Maybe it does).

It is commonly mentioned that STLC is the more direct mapping. STLC is kind of a subset of Haskell with no polymorphism or with the polymorphism stripped out (Every time you use a polymorphic function just monomorphize it. This blows up the number of functions floating around).

STLC is a Cartesian Closed Category (CCC), which means it is always possible to create pairs between any two types and functions between any two types.

data STLCType a = PrimTy a | Pair (STLCType a) (STLCType a) | Arr (STLCType a) (STLCType a)

data STLCTerm a = Lam (Var a) (STLCTerm a) | App (STLCTerm a) (STLCTerm a) | V (Var a) | Prim a

data Var a = Var String (STLCType a)

 

which may be extensible with a bunch of primitive operations and types (primitives might include addition, integers, bits, bit operations, etc.). It isn’t clear to me where it is most elegant to put type annotations. Maybe it is best to keep them separate and compare them.

Apparently it is possible to compile this in a point free form

data CatTerm a = Comp (CatTerm a) (CatTerm a) | App (CatTerm a) (CatTerm a) | Prim a

or maybe

data CatTerm a = App (CatTerm a) (CatTerm a) | Prim a | Comp

Dealing with the labeling of variables correctly is a real pain in the ass, so this is a win from the compiler standpoint. It is a pain to manually write functions using this style.

The compiling to categories project I think is using Haskell as the language and GHC to do the parsing and some other stuff, but then grabbing Core (the GHC intermediate language) and converting it into the Category form. I don’t see why you couldn’t use an STLC DSL and do the same. It would be less ergonomic to the user but also much less complicated. I wonder. I’ve written interpreters for STLC and they are very simple.

Circuits form a CCC. Basic Prim type is a wire with a Boolean value on it. Pairs of these make a bus. Composition is just attaching wires between subunits. PrimOps might include Ands and Ors and Nands and multiplexers and such. Or maybe you want to work at the level where 32-bit integers are primitives and addition and subtraction and other arithmetic functions are primops.

The Arrow type is more questionable. Can you really do higher order functions in fixed circuitry? In principle, yes. Since every prim type is finite and enumerable, arrows are also finitely enumerable. You could use a varying lookup table for example as the incoming data. This is an exponential blowup though. Maybe you ought to be working in the category where arrows are types of functions that are compactly encodable and runnable. You don’t want really stupidly massive circuits anyhow. Some kind of complexity theory restriction. For example, maybe you want all functions encodable in a tight BDD. You really need the shape of the BDD to be fixed. Maybe a BDD whose width is bounded by some polynomial of the depth? If you don’t need the full width, maybe you could leave sections dead. Just spitballin’

Or in many cases I assume the higher order functions will come applied at compile time, in which case you can just substitute the given circuit in to the locations it is needed or share it somehow (probably needs a clock to time multiplex its use) at the multiple locations it is needed to save space.

Also of extreme interest:

http://conal.net/papers/generic-parallel-functional/

He’s using this compiling to categories perspective with the intent to layout parallel circuits.

He uses Generic DataTypes which are very functory, which implies type parameters, which tempts one into polymorphism. But I think again he is using polymorphism as a scheme which upon compilation gets turned into a bunch of different legit types. Maybe you’d want

data STLCType f a = Prim (f a) | Pair (STLCType f a) (STLCType f a) | Arr (STLCType f a) (STLCType f a)

 

You could do well to make way more lightweight operator synonyms for this lambda calculus

Lam String LC | App LC LC | Var String

(-->) = Lam

or

(\\) = Lam

and some synonyms for common variables

x = "x"

y = "y"

etc

 

($$) = App

to dereference

(**) = Var  -- bind this very close. Trying to look kind of like pointer dereferencing?

maybe also add pattern matching into the language

(Pat) =

And some kind of autocurrying

Curry [Var] |

Maybe use Vec instead of list for compile time size.

I guess this is gonna be funky, and having the actual lambda syntax compile is super cool and nice. But it is also nice to have a userland Haskell concept of this stuff without weird plugins. An OverloadLambda extension would be cool. I don’t know how it would work. Type directed inference of which kind of lambda to use? Is that possible?

 

Also, could one piggyback on the Haskell type system to make the well typing lighter weight, a la the well-typed interpreter? Maybe using GADTs. http://docs.idris-lang.org/en/latest/tutorial/interp.html

It does seem unlikely that one could use raw lambdas in a good way

 

Checkin out the OpenAI Baselines

These are my notes on trying to edit the OpenAI baselines codebase to balance a cartpole from the down position. They are pretty scattered.

First I just run the built in examples to get a feel and try out deepq networks.

The PPO algorithm at the bottom is still the recommended one, I think. I got the pole kind of upright and it would balance for a couple seconds maybe. Work in progress. The ppo example has some funky batching going on that you need to reshape your observations for.

 

 

https://github.com/openai/baselines

Some of these are ready to roll on anything you throw at them

They use python3.

pip3 install tensorflow-gpu

pip3 install cloudpickle

running a deepq cartpole example

checkout source

https://github.com/openai/baselines/blob/master/baselines/deepq/experiments/train_cartpole.py

what is all this shit.

Has quickly suppressed exploration

has a clear upward trend. But the increase dies off at episode 300 or so. Learning rate decay?

took around 300 episodes to reach reward 200

Pretty good looking

The pong trainer is now called run_atari.py

sudo apt install python3-opencv

Hmm. Defaults to playing breakout actually. Highly exploratory at first. Still basically totally random after a couple mins

The buffer size is actually smaller than I’d expected. 10000?

simple.py is where the learn function is defined.

Hmmm. Exploration fraction is the fraction of the total training time over which random play gets turned off.
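A sketch of a linear exploration schedule of this kind. The function name and the default numbers are assumptions for illustration, not values read out of the baselines source.

```python
def exploration_rate(step, total_steps, exploration_fraction=0.1,
                     initial_eps=1.0, final_eps=0.02):
    """Probability of taking a random action at a given training step.

    Epsilon falls linearly from initial_eps to final_eps over the first
    exploration_fraction * total_steps steps, then stays at final_eps.
    """
    anneal_steps = exploration_fraction * total_steps
    progress = min(step / anneal_steps, 1.0)
    return initial_eps + progress * (final_eps - initial_eps)
```

With `total_steps=1000` and the defaults, play is fully random at step 0, about half random at step 50, and down to 2% random from step 100 onward. A run that is still at 90% random is only a small way through its annealing window.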

Is it getting better? Maybe. Still at 90% random. reward ~0.2

After about 2 hours 13,000 episodes still at 32% exploration. reward ~ 2

How much is from reduction of randomness, vs improvement of q-network? I suppose I could make a graph of reward vs exploration % using current network

 

trying ppo2 now (this is apparently OpenAI’s go-to method at the moment for a first attempt)

https://blog.openai.com/openai-baselines-ppo/

I don’t know what to attribute this to necessarily but it is very quickly outperforming the q learning

like in 30 seconds

clipfrac is probably how often the clipping bound gets hit

approxkl is the approximate Kullback-Leibler divergence
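A sketch of how diagnostics like these are commonly computed from the log-probabilities of the sampled actions under the old and new policy. The formulas below are an assumption about what the logged numbers mean, not something confirmed from the ppo2 source, and the function name is hypothetical.

```python
import math


def ppo_diagnostics(old_logp, new_logp, clip_range=0.2):
    """Return (clipfrac, approxkl) for a batch of action log-probabilities."""
    n = len(old_logp)
    # Probability ratio pi_new(a|s) / pi_old(a|s) for each sample.
    ratios = [math.exp(new - old) for old, new in zip(old_logp, new_logp)]
    # Fraction of samples whose ratio falls outside [1 - clip, 1 + clip].
    clipfrac = sum(abs(r - 1.0) > clip_range for r in ratios) / n
    # Cheap KL estimate: half the mean squared log-probability difference.
    approxkl = 0.5 * sum((new - old) ** 2 for old, new in zip(old_logp, new_logp)) / n
    return clipfrac, approxkl
```

If the policy has not changed, both numbers are zero; a large clipfrac means most of the batch is hitting the clipping bound.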

looking inside the networks,

pi is the move probability distribution layer

vf is the value function

The MlpPolicy class will be useful for lower dimensional tasks. That is the multilayer perceptron, using a bunch of fully connected layers

on a non simulated system, lower processor count to 1. The -np option on mpirun

Or just follow the mujoco example which is using only 1 cpu

Ah. This whole time it was saving in the /tmp folder in a time stamped folder

 

 

without threshold stopping, the thing went absolutely insane. You need to box in the pole

 

trying to make the reward the square of the height, so that it more emphasizes the total height achieved? Maybe only give reward for a new record height? + any height nearly all the way up

beefing up the force magnitude. I think it is a little too wimpy to get it over vertical

Maybe lower episode end so it spends more time trying to get higher than trying to just stay in bounds

Wait, should I not add those vector wrapper guys?

 

Hey! Sort of kind of success… It gets it up there sometimes, but then it can’t really keep it up there… That’s odd.

Wow. This is not a good integrator. Under no force the pendulum is obviously gaining energy. Should that matter much? I switched around the ordering to get a leapfrog. Much better.
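The energy gain is easy to reproduce on a bare pendulum with no force. The sketch below compares plain explicit Euler against a reordered update where the new velocity is used to move the position (often called semi-implicit or symplectic Euler, a close cousin of leapfrog). Whether this matches the exact reordering made in the env is an assumption, and the pendulum parameters are made up.

```python
import math

G_OVER_L = 9.8   # g / l for a 1 m pendulum (assumed)


def energy(theta, omega):
    """Energy per unit m*l^2, with theta measured from hanging straight down."""
    return 0.5 * omega ** 2 + G_OVER_L * (1.0 - math.cos(theta))


def explicit_euler(theta, omega, dt):
    # Both updates use the OLD state.
    alpha = -G_OVER_L * math.sin(theta)
    return theta + dt * omega, omega + dt * alpha


def semi_implicit_euler(theta, omega, dt):
    # Velocity first, then the position update uses the NEW velocity.
    omega = omega + dt * (-G_OVER_L * math.sin(theta))
    return theta + dt * omega, omega


def energy_drift(step, theta0=0.5, steps=500, dt=0.02):
    """Energy at the end of the run minus energy at the start."""
    theta, omega = theta0, 0.0
    e0 = energy(theta, omega)
    for _ in range(steps):
        theta, omega = step(theta, omega, dt)
    return energy(theta, omega) - e0


euler_drift = energy_drift(explicit_euler)
symplectic_drift = energy_drift(semi_implicit_euler)
```

Explicit Euler pumps energy into the pendulum on every step, while the reordered version only wobbles around the true energy. That matters for swing-up, since a free energy source makes getting over the top look easier than it really is.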

Custom cartpole instance to work from the down position, edited to work with ppo2. To more closely match the mujoco example, I switched from discrete actions to a continuous choice. I was trying to shape the reward to keep it in bounds. We’re getting there maybe. Picking the starting position happens in reset. Most everything else is a straight copy from the gym env

 

A script to view resulting network