Categorical LQR Control with Linear Relations

Last time, I described and built an interesting expansion of describing systems via linear functions, that of using linear relations. We remove the requirement that mappings be functional (exactly one output vector for any given input vector), which extends our descriptive capability. This is useful for describing “open” pieces of a linear system like electric circuits. This blog post is about another application: linear dynamical systems and control.

Linear relations are an excellent tool for reachability/controllability. In a control system, it is important to know where it is possible to get the system to. With linear dynamics x_{t+1} = Ax_{t} + B u_{t}, an easy controllability question is “can my system reach into some subspace”. This isn’t quite the most natural question physically, but it is natural mathematically. My first impulse would be to ask “given the state starts in this little region over here, can it get to this little region over here”, but the subspace question is a bit easier to answer. It’s a little funky but it isn’t useless. It can detect if the control parameter only touches 1 of 2 independent dynamical systems for example. We can write the equations of motion as a linear relation and iterate composition on them. See Erbele for more.

There is a set of different things (and kind of more interesting things) that are under the purview of linear control theory, LQR Controllers and Kalman filters. The LQR controller and Kalman filter are roughly (exactly?) the same thing mathematically. At an abstract mathematical level, they both rely on the fact that optimization of a quadratic objective x^T Q x + r^T x + c with a linear constraints Ax=b is solvable in closed form via linear algebra. The cost of the LQR controller could be the squared distance to some goal position for example. When you optimize a function, you set the derivative to 0. This is a linear equation for quadratic objectives. It is a useful framework because it has such powerful computational teeth.

The Kalman filter does a similar thing, except for the problem of state estimation. There are measurements linear related to the state that you want to match with the prior knowledge of the linear dynamics of the system and expected errors of measurement and environment perturbations to the dynamics.

There are a couple different related species of these filters. We could consider discrete time or continuous time. We can also consider infinite horizon or finite horizon. I feel that discrete time and finite horizon are the most simple and fundamental so that is what we’ll stick with. The infinite horizon and continuous time are limiting cases.

We also can consider the dynamics to be fixed for all time steps, or varying with time. Varying with time is useful for the approximation of nonlinear systems, where there are different good nonlinear approximation (taylor series of the dynamics) depending on the current state.

There are a couple ways to solve a constrained quadratic optimization problem. In some sense, the most conceptually straightforward is just to solve for a span of the basis (the “V-Rep” of my previous post) and plug that in to the quadratic objective and optimize over your now unconstrained problem. However, it is commonplace and elegant in many respects to use the Lagrange multiplier method. You introduce a new parameter for every linear equality and add a term to the objective \lambda^T (A x - b). Hypothetically this term is 0 for the constrained solution so we haven’t changed the objective in a sense. It’s a funny slight of hand. If you optimize over x and \lambda, the equality constraint comes out as your gradient conditions on \lambda. Via this method, we convert a linearly constrained quadratic optimization problem into an unconstrained quadratic optimization problem with more variables.

Despite feeling very abstract, the value these Lagrangian variables take on has an interpretation. They are the “cost” or “price” of the equality they correspond to. If you moved the constraint by an amount Ax - b + \epsilon, you would change the optimal cost by an amount \lambda \epsilon. (Pretty sure I have that right. Units check out. See Boyd

The Lagrange multipliers enforcing the linear dynamics are called the co-state variables. They represent the “price” that the dynamics cost, and are derivatives of the optimal value function V(s) (The best value that can be achieved from state s) that may be familiar to you from dynamic programming or reinforcement learning. See references below for more.

Let’s get down to some brass tacks. I’ve suppressed some terms for simplicity. You can also add offsets to the dynamics and costs.

A quadratic cost function with lagrange multipliers. Q is a cost matrix associated with the x state variables, R is a cost matrix for the u control variables.

C = \lambda_0 (x_0 - \tilde{x}_0) + \sum_t x_t^T Q x_t + u_t^T R u_t + \lambda_{t+1}^T (x_{t+1} - A x_t - B u_t  )

Equations of motion results from optimizing with respect to \lambda by design.

\nabla_{\lambda_{t+1}} C = x_{t+1} - A x_t - B u_t  = 0.

The initial conditions are enforced by the zeroth multiplier.

\nabla_{\lambda_0} C = x_i - x_{0} = 0

Differentiation with respect to the state x has the interpretation of backwards iteration of value derivative, somewhat related to what one finds in the Bellman equation.

\nabla_{x_t} C = Q x_{t} + A^T \lambda_{t+1} - \lambda_{t} = 0 \Longrightarrow \lambda_{t} =  A^T \lambda_{t+1} + Q x_{t}

The final condition on value derivative is the one that makes it the most clear that the Lagrange multiplier has the interpretation of the derivative of the value function, as it sets it to that.

\nabla_{x_N} C = Q x_N - \lambda_{N} = 0

Finally, differentiation with respect to the control picks the best action given knowledge of the value function at that time step.

\nabla_{u_t} C = R u_{t} - B^T \lambda_t  = 0

Ok. Let’s code it up using linear relations. Each of these conditions is a separate little conceptual box. We can build the optimal control step relation by combining these updates together

The string diagram for a single time step of control

The following is a bit sketchy. I am confident it makes sense, but I’m not confident that I have all of the signs and other details correct. I also still need to add the linear terms to the objective and offset terms to the dynamics. These details are all kind of actually important. Still, I think it’s nice to set down in preliminary blog form.

-- state of an oscillator
data SHOState = X | P deriving (Show, Enum, Bounded, Eq, Ord)
data Control = F deriving (Show, Enum, Bounded, Eq, Ord)
-- Costate newtype wrapper
newtype Co a = Co a deriving (Show, Enum, Bounded, Eq, Ord)
type M = Matrix Double
dynamics :: forall x u. (BEnum x, BEnum u) => Matrix Double -> Matrix Double ->
HLinRel (Either x u) x
dynamics a b = HLinRel (a ||| b ||| -i ) (vzero cx) where
cx = card @x
cu = card @u
i = ident cx
initial_cond :: forall x. BEnum x => Vector Double-> HLinRel Void x
initial_cond x0 = HLinRel i x0 where
cx = card @x
i = ident cx
valueUpdate :: forall x l. (BEnum x, BEnum l) => M -> M -> HLinRel (Either x l) l
valueUpdate a q = HLinRel ((tr a) ||| q ||| i) (vzero cl) where
cl = card @l
i = ident cl
optimal_u :: forall u l. (BEnum u, BEnum l) =>
M -> M -> HLinRel u l
optimal_u r b = HLinRel (r ||| tr b) (vzero cu) where
cu = card @u
step :: forall x u l. (BEnum x, BEnum u, BEnum l) => M -> M -> M -> M
-> HLinRel (Either x l) (Either x l)
step a b r q =
f5 <<< f4 <<< hassoc' <<< f3 <<< f2 <<< hassoc <<< f1 where
f1 :: HLinRel (Either x l) (Either (Either x x) l)
f1 = first hdup
f2 :: HLinRel (Either x (Either x l)) (Either x l)
f2 = second (valueUpdate a q)
f3 :: HLinRel (Either x l) (Either x (Either l l))
f3 = second hdup
f4 :: HLinRel (Either (Either x l) l) (Either (Either x u) l)
f4 = first (second (hconverse (optimal_u r b)))
f5 :: HLinRel (Either (Either x u) l) (Either x l)
f5 = first (dynamics a b)
-- iterate (hcompose (step a b r q)) :: [HLinRel (Either x l) (Either x l)]
view raw optimal_relation.hs hosted with ❤ by GitHub
Initial conditions and final conditions. Final conditions fix the final value derivative to Qx. Initial conditions set x to some value. Should there be a leg for lambda on the initial conditions?

Bits and Bobbles

  • The code in context
  • Some of the juicier stuff is nonlinear control. Gotta walk before we can run though. I have some suspicions that a streaming library may be useful here, or a lens-related approach. Also ADMM.
  • Reminiscent of Keldysh contour. Probably a meaningless coincidence.
  • In some sense H-Rep is the analog of (a -> b -> Bool) and V-rep [(a,b)]
  • Note that the equations of motion are relational rather than functional for a control systems. The control parameters u describe undetermined variables under our control.
  • Loopback (trace) the value derivative for infinite horizon control.
  • Can solve the Laplace equation with a Poincare Steklov operator. That should be my next post.
  • There is something a touch unsatisfying even with the reduced goals of this post. Why did I have to write down the quadratic optimization problem and then hand translate it to linear relations? It’s kind of unacceptable. The quadratic optimization problem is really the most natural statement, but I haven’t figured out to make it compositional. The last chapter of Rockafellar?

References Baez and Erbele – Categories in Control – interesting analogy to backpropagation. Lens connection?

Flappy Bird as a Mixed Integer Program

My birds.

Mixed Integer Programming is a methodology where you can specify convex (usually linear) optimization problems that include integer/boolean variables.

Flappy Bird is a game about a bird avoiding pipes.

We can use mixed integer programming to make a controller for Flappy Bird. Feel free to put this as a real-world application in your grant proposals, people.

While thinking about writing a MIP for controlling a lunar lander game, I realized how amenable to mixed integer modeling flappy bird is. Ben and I put together the demo on Saturday. You can find his sister blog post here.

The bird is mostly in free fall, on parabolic trajectories. This is a linear dynamic, so it can directly be expressed as a linear constraint. It can discretely flap to give itself an upward impulse. This is a boolean force variable at every time step. Avoiding the ground and sky is a simple linear constraint. The bird has no control over its x motion, so that can be rolled out as concrete values. Because of this, we can check what pipes are relevant at time points in the future and putting the bird in the gap is also a simple linear constraint.

There are several different objectives one might want to consider and weight. Perhaps you want to save the poor birds energy and minimize the sum of all flaps cvx.sum(flap). Or perhaps you want to really be sure it doesn’t hit any pipes by maximizing the minimum distance from any pipe. Or perhaps minimize the absolute value of the y velocity, which is a reasonable heuristic for staying in control. All are expressible as linear constraints. Perhaps you might want a weighted combo of these. All things to fiddle with.

There is a pygame flappy bird clone which made this all the much more slick. It is well written and easy to understand and modify. Actually figuring out the appropriate bounding boxes for pipe avoidance was finicky. Figuring out the right combo of bird size and pipe size is hard, combined with computer graphics and their goddamn upside down coordinate system.

We run our solver in a model predictive control configuration. Model predictive control is where you roll out a trajectory as an optimization problem and resolve it at every action step. This turns an open loop trajectory solve into a closed loop control, at the expense of needing to solve a perhaps very complicated problem in real time. This is not always feasible.

My favorite mip modeling tool is cvxpy. It gives you vectorized constraints and slicing, which I love. More tools should aspire to achieve numpy-like interfaces. I’ve got lots of other blog posts using this package which you can find in my big post list the side-bar 👀.

The github repo for the entire code is here:

And here’s the guts of the controller:

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt

N = 20 # time steps to look ahead
path = cvx.Variable((N, 2)) # y pos and vel
flap = cvx.Variable(N-1, boolean=True) # whether or not the bird should flap in each step
last_solution = [False, False, False]
last_path = [(0,0),(0,0)]

SKY = 0
GROUND = (512*0.79)-1

def getPipeConstraints(x, y, lowerPipes):
    constraints = []
    for pipe in lowerPipes:
        dist_from_front = pipe['x'] - x - BIRDDIAMETER
        dist_from_back = pipe['x'] - x + PIPEWIDTH
        if (dist_from_front < 0) and (dist_from_back > 0):
            #print(pipe['y'] + BIRDDIAMETER,  pipe['y'] + PIPEGAPSIZE)
            constraints += [y <= (pipe['y'] - BIRDDIAMETER)] # y above lower pipe
            constraints += [y >= (pipe['y'] - PIPEGAPSIZE)] # y below upper pipe
    #if len(constraints) > 0:
    return constraints

def solve(playery, playerVelY, lowerPipes):
    global last_path, last_solution

    pipeVelX = -4
    playerAccY    =   1   # players downward accleration
    playerFlapAcc =  -14   # players speed on flapping

    # unpack variables
    y = path[:,0]

    vy = path[:,1]

    c = [] #constraints
    c += [y <= GROUND, y >= SKY]
    c += [y[0] == playery, vy[0] == playerVelY]

    x = PLAYERX
    xs = [x]
    for t in range(N-1):
        dt = t//10 + 1
        #dt = 1
        x -= dt * pipeVelX
        xs += [x]
        c += [vy[t + 1] ==  vy[t] + playerAccY * dt + playerFlapAcc * flap[t] ]
        c += [y[t + 1] ==  y[t] + vy[t + 1]*dt ]
        c += getPipeConstraints(x, y[t+1], lowerPipes)

    #objective = cvx.Minimize(cvx.sum(flap)) # minimize total fuel use
    objective = cvx.Minimize(cvx.sum(flap) + 10* cvx.sum(cvx.abs(vy))) # minimize total fuel use

    prob = cvx.Problem(objective, c)
        prob.solve(verbose = False, solver="GUROBI")
        last_path = list(zip(xs, y.value))
        last_solution = np.round(flap.value).astype(bool)
        return last_solution[0], last_path
        last_solution = last_solution[1:]
        last_path = [((x-4), y) for (x,y) in last_path[1:]]
        return last_solution[0], last_path

I think it is largely self explanatory but here are some notes. The dt = t//10 + 1 thing is about decreasing our time resolution the further out from the current time step. This increases the time horizon without the extra computation cost. Intuitively modeling accuracy further out in time should matter less. The last_solution stuff is for in case the optimization solver fails for whatever reason, in which case it’ll follow open-loop the last trajectory it got.

Bits and Bobbles

  • We changed the dynamics slightly from the python original to make it easier to model. We found this did not change the feel of the game. The old dynamics were piecewise affine though, so are also modelable using mixed integer programming. . Generally check out the papers coming out of the Tedrake group. They are sweet. Total fanboy over here.
  • The controller as is is not perfect. It fails eventually, and it probably shouldn’t. A bug? Numerical problems? Bad modeling of the pipe collision? A run tends to get through about a hundred pipes before something gets goofy.
  • Since we had access to the source code, we could mimic the dynamics very well. How robust is flappy bird to noise and bad modeling? We could add wind, or inaccurate pipe data.
  • Unions of Convex Region. Giving the flappy bird some x position control would change the nature of the problem. We could easily cut up the allowable regions of the bird into rectangles, and represent the total space as a union of convex regions, which is also MIP representable.
  • Verification – The finite difference scheme used is crude. It is conceivable for the bird to clip a pipe. Since basically we know the closed form of the trajectories, we could verify that the parabolas do not intersect the regions. For funzies, maybe use sum of squares optimization?
  • Realtime MIP. Our solver isn’t quite realtime. Maybe half real time. One might pursue methods to make the mixed integer program faster. This might involve custom branching heuristics, or early stopping. If one can get the solver fast enough, you might run the solver in parallel and only query a new path plan every so often.
  • 3d flappy bird? Let the bird rotate? What about a platformer (Mario) or lunar lander? All are pretty interesting piecewise affine systems.
  • Is this the best way to do this? Yes and no. Other ways to do this might involve doing some machine learning, or hardcoding a controller that monitors the pipe locations and has some simple feedback. You can find some among the forks of FlapPyBird. I have no doubt that you could write these quickly, fiddle with them and get them to work better and faster than this MIP controller. However, for me there is a difference between could work and should work. You can come up with a thousand bizarre schemes that could work. RL algorithms fall in this camp. But I have every reason to believe the MIP controller should work, which makes it easier to troubleshoot.

Some Notes on Drake: A Robotic Control ToolBox

Dumping these notes out there as is. Some material is out of date. Hope they’re useful.

What is Drake?

From the Drake Website (

“Drake (“dragon” in Middle English) is a C++ toolbox started by the Robot Locomotion Group at the MIT Computer Science and Artificial Intelligence Lab (CSAIL). The development team has now grown significantly, with core development led by the Toyota Research Institute. It is a collection of tools for analyzing the dynamics of our robots and building control systems for them, with a heavy emphasis on optimization-based design/analysis.

While there are an increasing number of simulation tools available for robotics, most of them function like a black box: commands go in, sensors come out. Drake aims to simulate even very complex dynamics of robots (e.g. including friction, contact, aerodynamics, …), but always with an emphasis on exposing the structure in the governing equations (sparsity, analytical gradients, polynomial structure, uncertainty quantification, …) and making this information available for advanced planning, control, and analysis algorithms. Drake provides interfaces to high-level languages (MATLAB, Python, …) to enable rapid-prototyping of new algorithms, and also aims to provide solid open-source implementations for many state-of-the-art algorithms. Finally, we hope Drake provides many compelling examples that can help people get started and provide much needed benchmarks. We are excited to accept user contributions to improve the coverage.”

Drake is a powerful toolkit for the control of dynamical systems, and I hope I lower some of the barrier to entry with this post. Be forewarned, Drake changes quickly with time, and some of the following may be out of date (especially the rigid body trees) or ill advised. Use your judgement.

Getting Drake

Drake may be installed from binaries or source. Both may be gotten here:

Using Bazel

Drake uses Bazel as a build tool. Bazel is an open-source build and test tool similar to Make, Maven, and Gradle.

There are three commands that you need to know:

  • bazel build
  • bazel run
  • bazel query

Query is very useful for investigating available binaries within the examples folder and elsewhere.

The notation // is used to refer to the build’s main directory. This corresponds to the drake folder.
... is the notation for everything in the subdirectory


  • bazel build //... will build everything in the project.
  • To query all the binaries available for tools run bazel query //tools/...

Using Pydrake

Drake can be run natively in C++, or by using its MATLAB, python, or Julia bindings. This manual will be focusing on using Drake in python. By default pydrake will not be in the path. You can put pydrake into the path after building by running the following line or adding it to your .bashrc

export PYTHONPATH=~/drake/drake-build/install/lib/python2.7/site-packages:${PYTHONPATH}

The following line will import everything in Drake into the python namespace.

from pydrake.all import *

Drake is only compatible with python 2.7. If your default system install is python3, you may need to explicitly run the command python2.
The following message may be thrown if you inadvertently use python3.

Traceback (most recent call last):
File "", line 2, in 
from pydrake.all import


ImportError: dynamic module does not define module export function (PyInit__common_py)

Pydrake itself has generated documentation available here:

This is an extremely useful resource.

A very useful tool for exploring and confirming the Drake functionality and syntax is the python REPL. From the python REPL or within your code, it is useful to inspect an object using either help(obj) or print(dir(obj)) which will print a list of all the properties available on your object.

Drake resources


Besides the source code itself, the most accurate and up to date information is available in searchable form on the Doxygen page. This is a useful reference, but can be overwhelming.

Underactuated Robotics textbook and examples

Drake has a textbook being built alongside it by Russ Tedrake.

In particular the python source used to generate some of the examples in the book is worth examining.

The full course is available online both on Edx and more recent versions on Youtube.

Examples directory.

A useful source of use cases for drake can be found in the examples directory within the Drake project. As of October 2018 the contents include:

  • Cartpole – The cartpole is a classic control system consisting of a pendulum attached to a linearly actuated cart. In this directory you can find examples
  • Pendulum
  • Double Pendulum
  • Acrobot – The Acrobot is a double pendulum system actuated at the shoulder
  • Kinova Jaco Arm – A commercially available robotic arm
  • Kuka Iiwa Arm – A commerically available robotic arm
  • Particles
  • Bouncing Ball
  • Contact Model – Contains bowling pins, a gripper, and sliding bricks demonstrating Drakes ability to simulate contact dynamics
  • Rimless Wheel – A very simple model of a walking robot
  • Compass Gait – A slightly less simple model of a walking robot
  • Atlas – Examples concerning the Atlas humanoid robot
  • zmp
  • quadcopter
  • and others

Additional usage examples for pydrake can be found in the drake/bindings/pydrake/test folder.

Periscope Tutorials

There is a set of Jupyter notebook based tutorials for some basic Drake functionality in python.

Drake Concepts


For dynamic simulation, Drake exposes a Lego block interface for building complex systems out of smaller pieces, a paradigm similar to Simulink and other modeling software.
Objects possess input and output ports. One wires up input ports to output ports to build composite systems.

To build a simple forward simulation, construct a builder object. Then add all subsystems to the builder object. Explicitly connected input and output ports together. One possible cause of crashes may be leaving ports unconnected.

Once the entire system has been built, a Simulator object can be constructed from it. You may select an integration scheme and set initial conditions by getting a context from the Simulator object. The context holds state information.

There following excerpt from shows how to define a simple system and simulate it.

from import VectorSystem

# Define the system.
class SimpleContinuousTimeSystem(VectorSystem):
def __init__(self):
0, # Zero inputs.
1) # One output.
self._DeclareContinuousState(1) # One state variable.

# xdot(t) = -x(t) + x^3(t)
def _DoCalcVectorTimeDerivatives(self, context, u, x, xdot):
xdot[:] = -x + x**3

# y(t) = x(t)
def _DoCalcVectorOutput(self, context, u, x, y):
y[:] = x

import matplotlib.pyplot as plt
from pydrake.all import (DiagramBuilder, SignalLogger, Simulator)
from continuous_time_system import *

# Create a simple block diagram containing our system.
builder = DiagramBuilder()
system = builder.AddSystem(SimpleContinuousTimeSystem())
logger = builder.AddSystem(SignalLogger(1))
builder.Connect(system.get_output_port(0), logger.get_input_port(0))
diagram = builder.Build()

# Create the simulator.
simulator = Simulator(diagram)

# Set the initial conditions, x(0).
state = simulator.get_mutable_context().get_mutable_continuous_state_vector()

# Simulate for 10 seconds.

# Plot the results.


Rigid Body Trees

Edit : I think the Rigid Body Trees interface is deprecated. Use Multi Body Trees.

There are two complementary perspectives to take of the degrees of freedom of a robot, intrinsic coordinates and extrinsic coordinates. The intrinsic coordinates have one variable per degree of freedom of the robot. A common example is a set of joint angles. The dynamics are simply expressed in the intrinsic coordinates and can derived using Lagrangian mechanics. The extrinsic coordinates specify the spatial locations and orientation of frames attached to the robot. These are called frames. These spatial coordinates may be constrained in a relationship by a rigid mounting or gearing, so there may be more extrinsic frames available than intrinsic coordinates. Extrinsic coordinates are particularly pertinent for discussions of geometry, contact, and external forces.

Drake uses the URDF (Universal Robot Description Format) format. This is a common robot format originating in the ROS community for which you can find models of many commercial robots online. It is an XML based format with visualization, inertial, and collision tags.

This is an example URDF for a pendulum cart system.

Example: Pendulum URDF








The RigidBodyTree is a Drake class that describes both the intrinsic and extrinsic properties of a linkage. RigidBodyTrees may be built from a URDF file, some of which are packaged inside of Drake. For example, the Jaco arm URDF is available packaged with Drake.

jaco_urdf = FindResourceOrThrow(
tree = RigidBodyTree(jaco_urdf, FloatingBaseType.kFixed)

The full properties of the RigidBodyTree class are available at , but it will be useful to highlight some of the most commonly used functionality.

You can probe the RigidBodyTree for useful properties about the linkage, for example the number of intrinsic coordinates, or the number of bodies in the tree.


For many operations, one needs to perform the forward dynamics to build a kinematic cache that needs to be supplied later.

q = np.zeros(tree.get_num_positions())
v = np.zeros(tree.get_num_velocities())
cache = tree.doKinematics(q,v)

Drake describes the dynamics in terms of the intrinsic coordinates. The robot manipulator equations have the common form (See )

M(q)\ddot{q} + C(q,\dot{q})\dot{q} = \tau_g(q) + Bu

where q is the state vector, M is the inertia matrix, C captures Coriolis forces, and \tau_g is the gravity vector. The matrix B maps control inputs u into generalized forces.

External forces on the tree are described as wrenches. Wrenches are an object that combines forces and torques. In Drake, they are specified by 6 dimensional vectors. From the Drake docs:
“A column vector consisting of one wrench (spatial force) = [r X f; f], where f is a force (translational force) applied at a point P and r is the position vector from a point O (called the “moment center”) to point P.”

Drake will also compute the quantities in the manipulator equation. For example, to compute the term C(q,v,f_{ext}) in the manipulator equations with no externally applied wrenches.

bodies = [tree.get_body(j) for j in range(tree.get_num_bodies())]
no_wrench = { body : np.zeros(6) for body in bodies}
print(tree.dynamicsBiasTerm(cache, no_wrench))

Drake can be asked to compute the other terms in the manipulator equation as well. Drake can compute the center of mass of the entire tree.


Drake provides a couple of useful mappings between the intrinsic and extrinsic coordinates.

First, given a set of internal coordinates, Drake can transform these into frames, which may be expressed in yet another coordinate system.

The extrinsic frames can be expressed as a function of the intrinsic coordinates

X_i = f_i(x_j)

tree.relativeTransform(cache, 0, 9)

Returns a 4\times 4 transformation matrix between body 0 and 9 of the tree. The upper 3\times 3 block corresponds to a rotation matrix, and the last column a translation vector of the frame.

The Jacobian of this mapping,

J_{ij} = \frac{\partial f_i}{\partial x_j}

is useful for translating externally described small displacements, velocities, forces, and torques, into the equivalent terms for the intrinsic coordinates.

The time derivative or differential of a frame will possess a linear and angular velocity. These come packed in a 6 dimensional vector called the Twist.

The geometric Jacobian function returns the Jacobian in a sparse format. It returns a tuple of a m\times 6 matrix and a 1\times m vector of indices to which coordinates these correspond.
The function takes the id of three different frames. In this example, it computes the differential of the transformation from frame 0 to frame 9 expressed in frame 3.

tree.geometricJacobian(cache, 0, 9, 3)

The dense matrix can be constructed by

jsparse, inds = tree.geometricJacobian(cache, 0, 9, 0)
jdense = np.zeros((tree.get_num_positions(), 6))
jdense[inds, :] = jsparse


A Mathematical Introduction to Robotic Manipulation –


The Drake visualizer may be found in the tools folder. The Drake visualizer needs to be running before any applications that needs to communicate with it are started. To get the drake visualizer running run the following command from the drake directory

bazel run //tools:drake_visualizer

May need to run commands here to make LCM work and visualizer not crash immediately

sudo ifconfig lo multicast
sudo route add -net netmask dev lo

Kinova ROS package

The Kinova Jaco arm is easily communicated to through the Kinova ROS package

The package as of October 2018 supports Ubuntu 14.04 and ROS Indigo. The package includes bindings to the Jaco SDK for reading sensor data and giving motion commands. The package supports a variety of control modes, including torque control.

To update functionality to ubuntu 16.04 and ROS Kinetic you may wish to pull from this external branch

or try the beta branch

The Gazebo simulator seems to have a great deal of trouble.
One suggestion is that you have to add a small nonzero size parameter to your URDF files.

The topics made available by the ROS Package are


ROS Intercommunication

The communication stack used by Drake is Lightweight Communications and Marshalling (LCM). For interconnection to the rest of the ROS ecosystem, this adds a layer of friction.

One option is to use the lcm_to_ros project

This is a generator for building republishers of messages going between ROS and LCM

Another option is to build custom subscribers and publishers as in the following example.

Example: Connecting Drake Visualizer to External Jaco Arm

In another terminal turn on the Jaco driver

roslaunch kinova_bringup kinova_robot.launch kinova_robotType:=j2n6s300

Also get the drake visualizer running from the drake directory before running the script

bazel run //tools:drake_visualizer

import rospy
from pydrake.all import *
import numpy as np
from sensor_msgs.msg import JointState
jaco_urdf = FindResourceOrThrow("drake/manipulation/models/jaco_description/urdf/j2n6s300.urdf")
tree = RigidBodyTree(jaco_urdf , FloatingBaseType.kFixed)

builder = DiagramBuilder()

lc = DrakeLcm()
vis = DrakeVisualizer(tree, lc)
robot = builder.AddSystem(RigidBodyPlant(tree))
publisher = builder.AddSystem(vis)
builder.Connect(robot.get_output_port(0), publisher.get_input_port(0))

diagram = builder.Build()
simulator = Simulator(diagram)
context = simulator.get_mutable_context()
state = context.get_mutable_continuous_state_vector()

q = None
v = None
f = None
def callback(data):
        global q, v, f
        q = np.array(data.position)
        v = np.array(data.velocity)
        f = np.array(data.effort)
        # Some of the coordinates are offset between the Kinova Ros package and the Drake model
        q[1] = np.pi + q[1]
        q[2] = np.pi + q[2]

rospy.init_node('listener', anonymous=True)
rospy.Subscriber("/j2n6s300_driver/out/joint_state", JointState, callback)

rate = rospy.Rate(4)
while not rospy.is_shutdown():
        state = context.get_mutable_continuous_state_vector()
The Drake Visualizer

Example: Crumpling Jaco

A simple example of the simulation capabilities is the simulation of an unactuated Jaco arm.

import rospy
from pydrake.all import *
import numpy as np
from sensor_msgs.msg import JointState
jaco_urdf = FindResourceOrThrow("drake/manipulation/models/jaco_description/urdf/j2n6s300.urdf")
tree = RigidBodyTree(jaco_urdf , FloatingBaseType.kFixed)

builder = DiagramBuilder()

lc = DrakeLcm()
vis = DrakeVisualizer(tree, lc)
robot = builder.AddSystem(RigidBodyPlant(tree))
publisher = builder.AddSystem(vis)
builder.Connect(robot.get_output_port(0), publisher.get_input_port(0))
force = builder.AddSystem(ConstantVectorSource(np.zeros(9)))
builder.Connect(force.get_output_port(0), robot.get_input_port(0))

diagram = builder.Build()
simulator = Simulator(diagram)
context = simulator.get_mutable_context()
state = context.get_mutable_continuous_state_vector()

Optimization Solvers

Drake provides a common interface to many optimization solvers. Because of the tight integration, Drake has the capability to directly build the equations of motion for a system into a form the solver can comprehend.

The Mathematical Program class provides a high level interface to the different solvers. This class can be constructed as an object on it’s own or as returned by other helper classes such as trajectory optimization builders.

Drake provides a symbolic expression modeling language in which to describe constraints and costs.

from pydrake.all import *
import numpy as np

m = MathematicalProgram()

xs = m.NewContinuousVariables(2, 'x')
c = np.array([-1,-1])
m.AddCost( c[0]*xs[0] + c[1]*xs[1] )

m.AddLinearConstraint(xs[0] == 0)
m.AddLinearConstraint(3*xs[1] + xs[0] <= -1)


There are a large variety of solvers out there for problems of different structure.

A Mathematical Program is generally of the form

\min f(x) \   s.t.
g(x) \ge 0
h(x) = 0

One of the oldest and most studied class of these programs are known as Linear Programs which has linear cost and constraints. This mathematical program has very efficient solvers available for it.

A large class of optimization problems that are tractable are known as Convex Optimization problems. The cost functions must be bowl shaped (convex), the inequality constraints must define convex regions and the equality constraints must be affine (linear + offset). For this class, gradient descent roughly works and refinements of gradient descent like Newton's method work even better. Convexity implies that greedy local moves are also acceptable global moves and there are no local minima or tendril regions to get caught in.

The common reference for convex optimization is the textbook by Boyd and Vandenberghe freely available at There is also an accompanying video course available online.

Subclasses of Convex programming may have solvers tuned to them. Important subclasses include:

  • Linear Programming - Linear objective, linear equality, and linear inequality constraints.
  • Quadratic Programming - Linear Programming + quadratic objective
  • Second Order Cone Programming
  • Semidefinite Programming - Optimization allowing for the constraint of a SemiDefinite matrix. This means the matrix is constrained to have all nonnegative eigenvalues or equivalently the quadratic form it defines q^T Xq is non-negative for all possible vectors q.
  • Sum of Squares Programming - Optimization over polynomials with the constraint that the polynomial can be written as a sum of squares, a manifestly positive form.

Many problems cannot be put into this form. If the inherent nature of the problem at hand requires it, you may choose to use a nonlinear programming solver or Mixed Integer Programming Solver.

The solvers Ipopt and Snopt are nonlinear programming solvers. Ipopt is an open source and Snopt is proprietary. These solvers use local convex approximations to the problem to heuristically drive the solution to a local minimum.

Mixed Integer Linear Programming is Linear Programming with additional ability to require variables to take on integer values. This additional constraint type takes the problem from polynomial time solvable to NP-complete. A surprisingly large number of discrete and continuous optimization problems can be encoded into this framework. Internally, these solvers use linear programming solvers to guide the discrete search.

Part of the art and fun of the subject comes from manipulating your problem into a form amenable to powerful available solvers and theory.

Hans Mittelmann at the University of Arizona maintains benchmarks for a variety of optimization tools. A rule of thumb is that commercial solvers perform better than open source solvers.

Available Solvers in Drake

  • Mosek - Mosek is a proprietary optimization solver package offering solvers for
    • Linear.
    • Conic quadratic.
    • Semi-definite (Positive semi-definite matrix variables).
    • Quadratic and quadratically constrained.
    • General convex nonlinear.
    • Mixed integer linear, conic and quadratic.
  • Gurobi - Gurobi is a proprietary optimization solver package offering solvers for
    • linear programming solver (LP)
    • mixed-integer linear programming solver (MILP)
    • mixed-integer quadratic programming solver (MIQP)
    • quadratic programming solver (QP)
    • quadratically constrained programming solver (QCP)
    • mixed-integer quadratically constrained programming solver (MIQCP)
  • Snopt - Snopt is a nonlinear optimization solver using sequential convex optimization (SQP)
  • Ipopt - Ipopt is a open source nonlinear optimization solver using sequential convex optimization (SQP)
  • Operator Splitting Quadratic Program (OSQP) - Open source quadratic programming package
  • ik
  • LCP
  • dReal - An SMT solver for reals

Automatic Differentiation

Automatic Differentiation is the capability to have derivatives computed alongside code that computes the values. It is largely based upon application of the chain rule. There are two modes, forward and reverse mode.

Forward mode is the simplest to describe. Functions can be overloaded so that they take a dual number, a value combined in a tuple with it's derivative information. As the value of a function is computed, the Jacobian of that function is matrix multiplied on the derivative concurrently.

Drake exposes automatic differentiation capability for manual use

from pydrake.all import *
import numpy as np

# second parameter initializes forward mode derivative information
x = AutoDiffXd(4.0, np.array([1,1]))
f = x * x * x

More importantly Drake uses automatic differentiation internally to marshal symbolic problems into forms acceptable to external solvers and to calculate the various Jacobians we've already seen.

Trajectory Optimization

Trajectory optimization is a framework in which one uses Mathematical programming to solve optimal trajectory problems. The input to the system
is considered to be a decision variable in a Mathematical programming problem.

The combination of dynamical system modeling, automatic differentiation, and bindings to mathematical programming solvers makes Drake an excellent platform for trajectory optimization.

In Direct Collocation, both the path and the input variables are discretized along time. The trajectories are described by cubics and force curves are described by piecewise linear. One way of performing direct collocation is to take the path position at a discrete number of time points and make a decision variable for each. The discretized equations of motion become constraints that neighboring time points have to obey. One may then inject any other desired requirements (staying inside some safe region for example) as additional constraints.

The Drake docs state:
DirectCollocation implements the approach to trajectory optimization as described in C. R. Hargraves and S. W. Paris. Direct trajectory optimization using nonlinear programming and collocation. J Guidance, 10(4):338-342, July-August 1987. It assumes a first-order hold on the input trajectory and a cubic spline representation of the state trajectory, and adds dynamic constraints (and running costs) to the midpoints as well as the knot points in order to achieve a 3rd order integration accuracy.

Drake provides a useful interface for talking about trajectories. For both describing the cost function and the constraint functions, you want them to apply at all time steps. You can ask drake for variables representing the position or forces at all time steps. Drake will then build the appropriate mathematical program applying the cost of constraint at all time steps.

ctx = robot.CreateDefaultContext()
dircol = DirectCollocation(robot, ctx, 10, 0.01, 0.1) # timesteps, minimum time step, max timestep

state = dircol.state
u = dircol.input
t = dircol.time

dircol.AddConstraintToAllKnotPoints(u[0] >= 0)

You may want to select the initial and final state specifically to specify goals and initial conditions

init = dircol.initial_state
final = dircol.final_state


Example: Trajectory Optimization of a Pendulum

This example comes from the Underactuated Robotics textbook source

import math
import numpy as np
import matplotlib.pyplot as plt

from pydrake.examples.pendulum import (PendulumPlant, PendulumState)
from pydrake.all import (DirectCollocation, PiecewisePolynomial,
#from visualizer import PendulumVisualizer

plant = PendulumPlant()
context = plant.CreateDefaultContext()

dircol = DirectCollocation(plant, context, num_time_samples=21,
                            minimum_timestep=0.2, maximum_timestep=0.5)


torque_limit = 3.0  # N*m.
u = dircol.input()
dircol.AddConstraintToAllKnotPoints(-torque_limit <= u[0])
dircol.AddConstraintToAllKnotPoints(u[0] <= torque_limit)

initial_state = PendulumState()
# More elegant version is blocked on drake #8315:
# dircol.AddLinearConstraint(dircol.initial_state()
#                               == initial_state.get_value())

final_state = PendulumState()
# dircol.AddLinearConstraint(dircol.final_state() == final_state.get_value())

R = 10  # Cost on input "effort".

initial_x_trajectory = \
        PiecewisePolynomial.FirstOrderHold([0., 4.],
result = dircol.Solve() 
assert(result == SolutionResult.kSolutionFound) 
x_trajectory = dircol.ReconstructStateTrajectory() 
#vis = PendulumVisualizer() 
#ani = vis.animate(x_trajectory, repeat=True) 
x_knots = np.hstack([x_trajectory.value(t) for t in                                          
                      x_trajectory.end_time(), 100)]) 
plt.plot(x_knots[0, :], x_knots[1, :]) 

Example Application: Estimating end-effector force from Jaco motor torques

End effector forces become part of the equations of motion.

The geometric Jacobian transforms the extrinsic force into intrinsic coordinates. It is in general a rectangular matrix, since the number of extrinsic coordinates does not need to match the number of intrinsic coordinates.

A force f applied to the end effector appears linearly in the manipulator equations as J^Tf. This will be canceled by the torques of the motors during static or quasi-static movement. Hence, we can can determinethe external force from the motor torques if we assume it is the only external force at play.

The pseudo-inverse is the best possible solution to an overdetermined set of linear equations, in the least squares sense. We use this operation due to the non square nature of the Jacobian.

\min (Jx - X)^2 \rightarrow J^T Jx = J^T X

The following script prints both the end effector force as supplied by the Jaco SDK and the force as computed by Drake from the internal motor torques.

import rospy
from pydrake.all import *
import numpy as np
from sensor_msgs.msg import JointState
from geometry_msgs.msg import WrenchStamped
urdf_tree = FindResourceOrThrow("drake/manipulation/models/jaco_description/urdf/j2n6s300.urdf")
tree = RigidBodyTree( urdf_tree,
bodies = [tree.get_body(j) for j in range(tree.get_num_bodies())]
no_wrench = { body : np.zeros(6)  for body in bodies}

body_n = 9 #end_effector body number

def end_effector_force(q, v, u):
        cache = tree.doKinematics(q,v)
        # returns the external torque term for only gravity and no other external wrenches
        Cgrav = tree.dynamicsBiasTerm(cache, no_wrench )
        C = u - Cgrav # subtract off gravity of robot itself
        print(u - Cgrav)
        j = tree.geometricJacobian(cache, 0 , body_n, 0) # returns a tuple of 6 by n matrix in j[0] and the indices to which it applied in j[1]
        jjt =[0] ,j[0].T) # 
        Wext = np.linalg.solve(jjt,[0],  C[j[1]])) # C[j[1]] is cryptic numpy sub indexing for densifying the Jacobian
        return Wext[3:]

q = None
v = None
f = None
fext = np.zeros(3)

def callback(data):
        global q, v, f, initu, start
        q = np.array(data.position)
        v = np.array(data.velocity)
        q[1] = np.pi + q[1] #The drake conventions and the kinova conventions for angles are off
        q[2] = np.pi + q[2]
        f = np.array(data.effort)

def wrenchcallback(data):
        global fext
        fext[0] = data.wrench.force.x
        fext[1] = data.wrench.force.y
        fext[2] = data.wrench.force.z

rospy.init_node('listener', anonymous=True)
rospy.Subscriber("/j2n6s300_driver/out/joint_state", JointState, callback)
rospy.Subscriber("/j2n6s300_driver/out/tool_wrench", WrenchStamped, wrenchcallback)

rate = rospy.Rate(1)
while not rospy.is_shutdown():

2D Robot Arm Inverse Kinematics using Mixed Integer Programming in Cvxpy

Mixed Integer programming is crazy powerful. You can with ingenuity encode many problems into it. The following is a simplification of the ideas appearing in . They do 3d robot arms, I do 2d. I also stick to completely linear approximations.

The surface of a circle is not a convex shape. If you include the interior of a circle it is. You can build a good approximation to the circle as polygons. A polygon is the union of it’s sides, each of which is a line segment. Line sgements are convex set. Unions of convex sets are encodable using mixed integer programming. What I do is sample N regular positions on the surface of a circle. These are the vertices of my polygon. Then I build boolean indicator variables for which segment we are on. Only one of them is be nonzero \sum s_i == 1. If we are on a segment, we are allowed to make positions x that interpolate between the endpoints x_i of that segment x = \lambda_1 x_1 + \lambda_2 x_2, where \lambda_i >= 0 and \sum \lambda=1. These \lambda are only allowed to be nonzero if we are on the segment, so we suppress them with the indicator variables \lambda_i <= s_i + s_{i+1}. That’s the gist of it.

image link

Given a point on the circle (basically sines and cosines of an angle) we can build a 2d rotation matrix R from it. Then we can write down the equations connecting subsequent links on the arm. p_{i+1}=p_{i} +Rl. By using global rotations with respect to the world frame, these equations stay linear. That is a subtle point. p and R are variables, whereas l is a constant describing the geometry of the robot arm. If we instead used rotation matrices connecting frame i to i+1 these R matrices would compound nonlinearly.

All in all, pretty cool!

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt

# builds a N sided polygon approximation of a circle for MIP. It is the union of the segments making up the polygon
# might also be useful to directly encode arcs. for joint constraint limits.
def circle(N):
    x = cvx.Variable()
    y = cvx.Variable()
    l = cvx.Variable(N) #interpolation variables
    segment = cvx.Variable(N,boolean=True) #segment indicator variables, relaxing the boolean constraint gives the convex hull of the polygon
    angles = np.linspace(0, 2*np.pi, N, endpoint=False)
    xs = np.cos(angles) #we're using a VRep
    ys = np.sin(angles)

    constraints = []
    constraints += [x == l*xs, y == l*ys] # x and y are convex sum of the corner points
    constraints += [cvx.sum(l) == 1, l <= 1, 0 <= l] #interpolations variables. Between 0 and 1 and sum up to 1
    constraints += [cvx.sum(segment) == 1] # only one indicator variable can be nonzero

    constraints += [l[N-1] <= segment[N-1] + segment[0]] #special wrap around case
    for i in range(N-1):
        constraints += [l[i] <= segment[i] + segment[i+1]] # interpolation variables suppressed
    return x, y, constraints
x, y, constraints = circle(8)
objective = cvx.Maximize(x-0.8*y)
prob = cvx.Problem(objective, constraints)
res = prob.solve(solver=cvx.GLPK_MI, verbose=True)

# build a 2d rotation matrix using circle
def R(N):    
    constraints = []
    c, s, constraint = circle(N) # get cosines and sines from a circle
    constraints += constraint

    r = cvx.Variable((2,2)) # build rotation matrix
    constraints += [r[0,0] == c, r[0,1] == s] 
    constraints += [r[1,0] == -s, r[1,1] == c]
    return r, constraints
    # np.array([[c , s],                [-s, c]])

#robot linkage of differing arm length
link_lengths = [0.5,0.2,0.3,0.4]
pivots = []
Rs = []
N = 8
constraints = []
origin = np.zeros(2)

p1 = origin
for l in link_lengths:
    R1, c = R(8)    
    constraints += c

    p2 = cvx.Variable(2)
    constraints += [p2 == p1 + R1*np.array([l,0])] # R1 is global rotation with respect to world frame. This is important. It is what makes the encoding linear.

    p1 = p2


end_position = np.array([-0.5, .7])
constraints += [p2 == end_position]

objective = cvx.Maximize(1)
prob = cvx.Problem(objective, constraints)
res = prob.solve(solver=cvx.GLPK_MI, verbose=True)


print(list(map(lambda r: r.value, Rs)))

p1 = origin
for l, r in zip(link_lengths, Rs):
    p2 = p1 + r.value@np.array([l,0])
    plt.plot([p1[0],p2[0]], [p1[1],p2[1]], marker='o'),

    p1 = p2


plt.plot(x.value, label='x')
plt.plot(v.value, label= 'v')
plt.plot(collision.value, label = 'collision bool')

Giving the Mostly Printed CNC a try (MPCNC)

Declan had the good idea to make a CNC machine. There is a popular plan available here

A Doge

The really cute part of it is using electrical conduit as rails, which are shockingly inexpensive. Like a couple bucks for 4 feet! Holy shnikes!

We’ve been printing up a storm for the last couple weeks. A ton of parts!

We already had a lot of motors and stuff lying around. Declan bought a lot of stuff too just for this. Assorted bearings and bolts. The plans have a bill of materials.

Repetier host seemed to work pretty well for controlling the board

Used the RAMPS branch of the mpcnc marlin repo

Edited the header files as described in this post so that we could use both extruders as extra x and y motor drivers. It did not seem like driving two motors from the same driver board was acceptable. Our bearings are gripping the rails a little too tight. It is tough to move.

Some useful links on the thingiverse version of the mpccnc

He suggests using this program Seem like windows only? ugh.

The mpcnc plans don’t contain actual tool mounts but here are some examples

A pen holder:

A dewalt mount:

This is an interesting web based g-code maker. Ultimately a little to janky though. It works good enough to get started . Not entirely clear what pocket vs interior vs whatever is. engrave sort of seemed like what I wanted. Went into inkscape with a reasonable png and traced bitmapped it, then object to path. It’s also nice to just find an svg on the internet

The following code was needed to zero repetier and the RAMPS at the touch off point. We added it as a macro. It is doing some confusing behavior though.

G92 X0 Y0 Z0

pycam is the best I can find for 3d machining. Haven’t actually tried it yet

We should probably upgrade the thing to have limit switches. It pains me every time we slam it into the edge.

All in all, a very satisfying project. Hope we build something cool with it.

A horsie

Casadi – Pretty Damn Slick

Casadi is something I’ve been aware of and not really explored much. It is a C++ / python / matlab library for modelling optimization problems for optimal control with bindings to IPOpt and other solvers. It can produce C code and has differentiation stuff. See below for some examples after I ramble.

I’ve enjoyed cvxpy, but cvxpy is designed specifically for convex problems, of which many control problems are not.

Casadi gives you a nonlinear modelling language and easy access to IPOpt, an interior point solver that works pretty good (along with some other solvers, many of which are proprietary however).

While the documentation visually looks very slick I actually found it rather confusing in contents at first. I’m not sure why. Something is off.

You should download the “example pack” folder. Why they don’t have these in html on the webpage is insane to me.

It also has a bunch of helper classes for DAE building and other things. They honestly really put me off. The documentation is confusing enough that I am not convinced they give you much.

The integrator classes give you access to external smart ode solvers from the Sundials suite. They give you good methods for difficult odes and dae (differential algebraic equations, which are ODEs with weird constraints like x^1 + y^1 == 1) Not clear to me if you can plug those in to an optimization, other than by a shooting method.

Casadi can also output C which is pretty cool.

I kind of wondered about Casadi vs Sympy. Sympy has lot’s of general purpose symbolic abilities. Symbolic solving, polynomial smarts, even some differential equation understanding. There might be big dividends to using it. But it is a little harder to get going. I feel like there is an empty space for a mathemtical modelling language that uses sympy as it’s underlying representation. I guess monkey patching sympy expressions into casadi expression might not be so hard. Sympy can also output fast C code. Sympy doesn’t really have any support for sparseness that I know of.

As a side note, It can be useful to put these other languages into numpy if you need extended reshaping abilities. The other languages often stop at matrices, which is odd to me.

Hmm. Casadi actually does have access to mixed integer programs via bonmin (and commercial solvers). That’s interesting. Check out lotka volterra minlp example!topic/casadi-users/8xCHmP7UmpI

The optim interface makes some of this look better. optim.minimize and subject_to. Yeah, this is more similar to the interfaces I’m used to. It avoids the manual unpacking of the solution and the funky feel of making everything into implicit == 0 expressions.

Here is a simple harmonic oscillator example using the more raw casadi interface. x is positive, v is velocity, u is a control force. I’m using a very basic leap frog integration. You tend to have to stack things into a single vector with vertcat when building the final problem.

from casadi import *
import matplotlib.pyplot as plt

g = 9.8
N = 100

x = SX.sym('x',N)
v = SX.sym('v', N)
u = SX.sym('u', N-1)
#theta = SX('theta', N)
#thdot = SX('thetadot', N)

dt = 0.1
constraints = [x[0]-1, v[0]] # expressions that must be zero
for i in range(N-1):
    constraints += [x[i+1] - (x[i] + dt * v[i]) ]
    constraints += [v[i+1] - (v[i] - dt * x[i+1] + u[i] * dt)]

cost = sum([x[i]*x[i] for i in range(N)]) + sum([u[i]*u[i] for i in range(N-1)])

nlp = {'x':vertcat(x,v,u), 'f':cost, 'g':vertcat(*constraints)}
S = nlpsol('S', 'ipopt', nlp)
r = S(lbg=0, ubg=0) # can also give initial solutiuon hint, some other things
x_opt = r['x']
x = x_opt[:N]
v = x_opt[N:2*N]
u = x_opt[2*N:]
#u_opt = r['u']
print('x_opt: ', x_opt)

Let’s use the opti interface, which is pretty slick. Here is a basic cartpole

from casadi import *
import matplotlib.pyplot as plt

g = 9.8
N = 100

opti = casadi.Opti()

x = opti.variable(N)
v = opti.variable(N)
theta = opti.variable(N)
dtheta = opti.variable(N)
u = opti.variable(N-1)

opti.subject_to( u <= 1) 
opti.subject_to( -1 <= u) 
opti.subject_to( x <= 2) 
opti.subject_to( -2 <= x) 
opti.subject_to(x[0] == 0)
opti.subject_to(v[0] == 0)
opti.subject_to(theta[0] == 0)
opti.subject_to(dtheta[0] == 0)

dt = 0.05
for i in range(N-1):
    opti.subject_to( x[i+1] == x[i] + dt * (v[i]))
    opti.subject_to( v[i+1] == v[i] + dt * (x[i+1] + u[i]))
    opti.subject_to( theta[i+1] == theta[i] + dt * (dtheta[i]))
    opti.subject_to( dtheta[i+1] == dtheta[i] + dt * (u[i] * cos(theta[i+1]) - sin(theta[i+1]) ))

opti.minimize( sum1(sin(theta)))

opti.solver("ipopt") #,p_opts, s_opts)
sol = opti.solve()
plt.plot(sol.value(x), label="x")
plt.plot(sol.value(u), label="u")
plt.plot(sol.value(theta), label="theta")
p = opti.parameter()
opti.set_value(p, 3)

Very fast. Very impressive. Relatively readable code. I busted this out in like 15 minutes. IPopt solves the thing in the blink of an eye (about 0.05s self reported). Might be even faster if I warm start it with a good solution, as I would in online control (which may be feasible at this speed). Can add the initial condition as a parameter to the problem

I should try this on an openai gym.

Haskell has bindings to casadi.

Bouncing a Ball with Mixed Integer Programming

Edit: A new version.

Here I made a bouncing ball using mixed integer programming in cvxpy. Currently we are just simulating the bouncing ball internal to a mixed integer program. We could turn this into a control program by making the constraint that you have to shoot a ball through a hoop and have it figure out the appropriate initial shooting velocity.

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

N = 100
dt = 0.05

x = cvx.Variable(N)
v = cvx.Variable(N)
collision = cvx.Variable(N-1,boolean=True)

constraints = []
M = 20 # Big M trick

#initial conditions
constraints += [x[0] == 1, v[0] == 0]

for t in range(N-1):
    predictedpos = x[t] + v[t] * dt
    col = collision[t]
    notcol = 1 - collision[t]
    constraints += [ -M * col <= predictedpos ,  predictedpos <= M * notcol]
    #enforce regular dynamics if col == 0
    constraints +=  [  - M * col <= x[t+1] - predictedpos, x[t+1] - predictedpos  <= M * col   ] 
    constraints +=  [  - M * col <= v[t+1] - v[t] + 9.8*dt, v[t+1] - v[t] + 9.8*dt  <= M * col   ]

    # reverse velcotiy, keep position the same if would collide with x = 0
    constraints +=  [  - M * notcol <= x[t+1] - x[t], x[t+1] - x[t]  <= M * notcol   ] 
    constraints +=  [  - M * notcol <= v[t+1] + 0.8*v[t], v[t+1] + 0.8*v[t]  <= M * notcol   ] #0.8 restitution coefficient

objective = cvx.Maximize(1)
prob = cvx.Problem(objective, constraints)
res = prob.solve(solver=cvx.GLPK_MI, verbose=True)
plt.plot(x.value, label='x')
plt.plot(v.value, label= 'v')
plt.plot(collision.value, label = 'collision bool')

Pretty cool.

The trick I used this time is to make boolean indicator variables for whether a collision will happen or not. The big M trick is then used to actually make the variable reflect whether the predicted position will be outside the wall at x=0. If it isn’t, it uses regular gravity dynamics. If it will, it uses velocity reversing bounce dynamics

Just gonna dump this draft out there since I’ve moved on (I’ll edit this if I come back to it). You can embed collisions in mixed integer programming.  I did it below using a strong acceleration force that turns on when you enter the floor. What this corresponds to is a piecewise linear potential barrier.

Such a formulation might be interesting for the trajectory optimization of shooting a hoop, playing Pachinko, Beer Pong, or Pinball.

using JuMP
using Cbc
using Plots
N = 50
T = 5
dt = T/N

m = Model(solver=CbcSolver())

@variable(m, x[1:N]) # , Bin
@variable(m, v[1:N]) # , Bin
@variable(m, f[1:N-1])
@variable(m, a[1:N-1], Bin) # , Bin

@constraint(m, x[1] == 1)
@constraint(m, v[1] == 0)
M = 10
for t in 1:N-1
	@constraint(m, x[t+1] == x[t] + dt*v[t])
	@constraint(m, v[t+1] == v[t] + dt*(10*(1-a[t])-1))
	#@constraint(m, v[t+1] == v[t] + dt*(10*f[t]-1))
	@constraint(m, M * a[t] >= x[t+1]) #if on the next step projects into the earth
	@constraint(m, M * (1-a[t]) >= -x[t+1])
	#@constraint(m, f[t] <=  M*(1-a[t])) # we allow a bouncing force


k = 10
# @constraint(m, f .>= 0)
# @constraint(m, f .>= - k * x[2:N])

# @constraint(m, x[:] .>= 0)

E = 1 #sum(f) # 1 #sum(x) #sum(f) # + 10*sum(x) # sum(a)

@objective(m, Min, E)


More things to consider:

Is this method trash? Yes. You can actually embed the mirror law of collisions directly without needing to using a funky barrier potential.

You can extend this to ball trapped in polygon, or a ball that is restricted from entering obstacle polygons. Check out the IRIS project – break up region into convex regions Gives good support for embedding conditional variables. On a related note, gives a good way of defining piecewise linear functions using Mixed Integer programming.

Pajarito is another interesting Julia project. A mixed integer convex programming solver.

Russ Tedrake papers –

Break up obstacle objects into delauney triangulated things.

Trajectory Optimization of a Pendulum with Mixed Integer Linear Programming

There is a reasonable piecewise linear approximation for the pendulum replacing the the sinusoidal potential with two quadratic potentials (one around the top and one around the bottom). This translates to a triangle wave torque.

Cvxpy curiously has support for Mixed Integer Programming.

Cbc is probably better than GLPK MI. However, GLPK is way easier to get installed. Just brew install glpk and pip install cvxopt.

Getting cbc working was a bit of a journey. I had to actually BUILD Cylp (god forbid) and fix some problems.

Special Ordered Set constraints are useful for piecewise linear approximations. The SOS2 constraints take a set of variables and make it so that only two consecutive ones can be nonzero at a time. Solvers often have built in support for them, which can be much faster than just blasting them with general constraints. I did it by adding a binary variable for every consecutive pair. Then these binary variables suppress the continuous ones. Setting the sum of the binary variables to 1 makes only one able to be nonzero.

def SOS2(n):
	z = cvx.Variable(n-1,boolean=True)
	x = cvx.Variable(n)
	constraints = []
	constraints += [0 <= x]
	constraints += [x[0] <= z[0]]
	constraints += [x[-1] <= z[-1]]
	constraints += [x[1:-1] <= z[:-1]+z[1:]]
	constraints += [cvx.sum(z) == 1]
	constraints += [cvx.sum(x) == 1]
	return x, z, constraints


One downside is that every evaluation of these non linear functions requires a new set of integer and binary variables, which is possibly quite expensive.

For some values of total time steps and step length, the solver can go off the rails and never return.

At the moment, the solve is not fast enough for real time control with CBC (~ 2s). I wonder how much some kind of warm start might or more fiddling with heuristics, or if I had access to the built in SOS2 constraints rather than hacking it in. Also commercial solvers are usually faster. Still it is interesting.

import cvxpy as cvx
import numpy as np
import matplotlib.pyplot as plt

def SOS2(n):
	z = cvx.Variable(n-1,boolean=True)

	x = cvx.Variable(n)
	constraints = []
	constraints += [0 <= x]
	constraints += [x[0] <= z[0]]
	constraints += [x[-1] <= z[-1]]
	constraints += [x[1:-1] <= z[:-1]+z[1:]]
	constraints += [cvx.sum(z) == 1]
	constraints += [cvx.sum(x) == 1]
	return x, z, constraints

N = 20
thetas = cvx.Variable(N)
omegas = cvx.Variable(N)
torques = cvx.Variable(N)
tau = 0.3
c = [thetas[0] == +0.5, omegas[1] == 0, torques <= tau, torques >= -tau]
dt = 0.5

D = 5
thetasN = np.linspace(-np.pi,np.pi, D, endpoint=True)
zs = []
mods = []
xs = []
reward = 0
for t in range(N-1):
	c += [thetas[t+1] == thetas[t] + omegas[t]*dt]
	mod = cvx.Variable(1,integer=True)
	x, z, c1 = SOS2(D)
	c += c1
	c += [thetas[t+1] == thetasN@x + 2*np.pi*mod]
	sin = np.sin(thetasN)@x
	reward += -np.cos(thetasN)@x
	c += [omegas[t+1] == omegas[t] - sin*dt + torques[t]*dt ]

objective = cvx.Maximize(reward)
constraints = c

prob = cvx.Problem(objective, constraints)
res = prob.solve(solver=cvx.CBC, verbose=True)

print([mod.value for mod in mods ])
print([thetasN@x.value for x in xs ])
print([x.value for x in xs ])


# Other things to do: BigM. 
# Disjunctive constraints.
# Implication constraints.

Blue is angle, orange is the applied torque. The torque is running up against the limits I placed on it.