We can use mixed integer programming to make a controller for Flappy Bird. Feel free to put this as a real-world application in your grant proposals, people.
While thinking about writing a MIP for controlling a lunar lander game, I realized how amenable to mixed integer modeling flappy bird is. Ben and I put together the demo on Saturday. You can find his sister blog post here.
The bird is mostly in free fall, on parabolic trajectories. This is a linear dynamic, so it can directly be expressed as a linear constraint. It can discretely flap to give itself an upward impulse. This is a boolean force variable at every time step. Avoiding the ground and sky is a simple linear constraint. The bird has no control over its x motion, so that can be rolled out as concrete values. Because of this, we can check what pipes are relevant at time points in the future and putting the bird in the gap is also a simple linear constraint.
There are several different objectives one might want to consider and weight. Perhaps you want to save the poor birds energy and minimize the sum of all flaps cvx.sum(flap). Or perhaps you want to really be sure it doesn’t hit any pipes by maximizing the minimum distance from any pipe. Or perhaps minimize the absolute value of the y velocity, which is a reasonable heuristic for staying in control. All are expressible as linear constraints. Perhaps you might want a weighted combo of these. All things to fiddle with.
There is a pygame flappy bird clone which made this all the much more slick. It is well written and easy to understand and modify. Actually figuring out the appropriate bounding boxes for pipe avoidance was finicky. Figuring out the right combo of bird size and pipe size is hard, combined with computer graphics and their goddamn upside down coordinate system.
We run our solver in a model predictive control configuration. Model predictive control is where you roll out a trajectory as an optimization problem and resolve it at every action step. This turns an open loop trajectory solve into a closed loop control, at the expense of needing to solve a perhaps very complicated problem in real time. This is not always feasible.
My favorite mip modeling tool is cvxpy. It gives you vectorized constraints and slicing, which I love. More tools should aspire to achieve numpy-like interfaces. I’ve got lots of other blog posts using this package which you can find in my big post list the side-bar 👀.
#objective = cvx.Minimize(cvx.sum(flap)) # minimize total fuel use
objective=cvx.Minimize(cvx.sum(flap)+10*cvx.sum(cvx.abs(vy)))# minimize total fuel use
I think it is largely self explanatory but here are some notes. The dt = t//10 + 1 thing is about decreasing our time resolution the further out from the current time step. This increases the time horizon without the extra computation cost. Intuitively modeling accuracy further out in time should matter less. The last_solution stuff is for in case the optimization solver fails for whatever reason, in which case it’ll follow open-loop the last trajectory it got.
The controller as is is not perfect. It fails eventually, and it probably shouldn’t. A bug? Numerical problems? Bad modeling of the pipe collision? A run tends to get through about a hundred pipes before something gets goofy.
Since we had access to the source code, we could mimic the dynamics very well. How robust is flappy bird to noise and bad modeling? We could add wind, or inaccurate pipe data.
Unions of Convex Region. Giving the flappy bird some x position control would change the nature of the problem. We could easily cut up the allowable regions of the bird into rectangles, and represent the total space as a union of convex regions, which is also MIP representable.
Verification – The finite difference scheme used is crude. It is conceivable for the bird to clip a pipe. Since basically we know the closed form of the trajectories, we could verify that the parabolas do not intersect the regions. For funzies, maybe use sum of squares optimization?
Realtime MIP. Our solver isn’t quite realtime. Maybe half real time. One might pursue methods to make the mixed integer program faster. This might involve custom branching heuristics, or early stopping. If one can get the solver fast enough, you might run the solver in parallel and only query a new path plan every so often.
3d flappy bird? Let the bird rotate? What about a platformer (Mario) or lunar lander? All are pretty interesting piecewise affine systems.
Is this the best way to do this? Yes and no. Other ways to do this might involve doing some machine learning, or hardcoding a controller that monitors the pipe locations and has some simple feedback. You can find some among the forks of FlapPyBird. I have no doubt that you could write these quickly, fiddle with them and get them to work better and faster than this MIP controller. However, for me there is a difference between could work and should work. You can come up with a thousand bizarre schemes that could work. RL algorithms fall in this camp. But I have every reason to believe the MIP controller should work, which makes it easier to troubleshoot.
“Drake (“dragon” in Middle English) is a C++ toolbox started by the Robot Locomotion Group at the MIT Computer Science and Artificial Intelligence Lab (CSAIL). The development team has now grown significantly, with core development led by the Toyota Research Institute. It is a collection of tools for analyzing the dynamics of our robots and building control systems for them, with a heavy emphasis on optimization-based design/analysis.
While there are an increasing number of simulation tools available for robotics, most of them function like a black box: commands go in, sensors come out. Drake aims to simulate even very complex dynamics of robots (e.g. including friction, contact, aerodynamics, …), but always with an emphasis on exposing the structure in the governing equations (sparsity, analytical gradients, polynomial structure, uncertainty quantification, …) and making this information available for advanced planning, control, and analysis algorithms. Drake provides interfaces to high-level languages (MATLAB, Python, …) to enable rapid-prototyping of new algorithms, and also aims to provide solid open-source implementations for many state-of-the-art algorithms. Finally, we hope Drake provides many compelling examples that can help people get started and provide much needed benchmarks. We are excited to accept user contributions to improve the coverage.”
Drake is a powerful toolkit for the control of dynamical systems, and I hope I lower some of the barrier to entry with this post. Be forewarned, Drake changes quickly with time, and some of the following may be out of date (especially the rigid body trees) or ill advised. Use your judgement.
Drake may be installed from binaries or source. Both may be gotten here:
Drake uses Bazel as a build tool. Bazel is an open-source build and test tool similar to Make, Maven, and Gradle.
There are three commands that you need to know:
Query is very useful for investigating available binaries within the examples folder and elsewhere.
The notation // is used to refer to the build’s main directory. This corresponds to the drake folder. ... is the notation for everything in the subdirectory
bazel build //... will build everything in the project.
To query all the binaries available for tools run bazel query //tools/...
Drake can be run natively in C++, or by using its MATLAB, python, or Julia bindings. This manual will be focusing on using Drake in python. By default pydrake will not be in the path. You can put pydrake into the path after building by running the following line or adding it to your .bashrc
The following line will import everything in Drake into the python namespace.
from pydrake.all import *
Drake is only compatible with python 2.7. If your default system install is python3, you may need to explicitly run the command python2.
The following message may be thrown if you inadvertently use python3.
Traceback(most recent call last):
from pydrake.all import
ImportError:dynamic module does notdefine module export function(PyInit__common_py)
Pydrake itself has generated documentation available here:
A very useful tool for exploring and confirming the Drake functionality and syntax is the python REPL. From the python REPL or within your code, it is useful to inspect an object using either help(obj) or print(dir(obj)) which will print a list of all the properties available on your object.
Besides the source code itself, the most accurate and up to date information is available in searchable form on the Doxygen page. This is a useful reference, but can be overwhelming.
For dynamic simulation, Drake exposes a Lego block interface for building complex systems out of smaller pieces, a paradigm similar to Simulink and other modeling software.
Objects possess input and output ports. One wires up input ports to output ports to build composite systems.
To build a simple forward simulation, construct a builder object. Then add all subsystems to the builder object. Explicitly connected input and output ports together. One possible cause of crashes may be leaving ports unconnected.
Once the entire system has been built, a Simulator object can be constructed from it. You may select an integration scheme and set initial conditions by getting a context from the Simulator object. The context holds state information.
Edit : I think the Rigid Body Trees interface is deprecated. Use Multi Body Trees.
There are two complementary perspectives to take of the degrees of freedom of a robot, intrinsic coordinates and extrinsic coordinates. The intrinsic coordinates have one variable per degree of freedom of the robot. A common example is a set of joint angles. The dynamics are simply expressed in the intrinsic coordinates and can derived using Lagrangian mechanics. The extrinsic coordinates specify the spatial locations and orientation of frames attached to the robot. These are called frames. These spatial coordinates may be constrained in a relationship by a rigid mounting or gearing, so there may be more extrinsic frames available than intrinsic coordinates. Extrinsic coordinates are particularly pertinent for discussions of geometry, contact, and external forces.
Drake uses the URDF (Universal Robot Description Format) format. This is a common robot format originating in the ROS community for which you can find models of many commercial robots online. It is an XML based format with visualization, inertial, and collision tags.
This is an example URDF for a pendulum cart system.
The RigidBodyTree is a Drake class that describes both the intrinsic and extrinsic properties of a linkage. RigidBodyTrees may be built from a URDF file, some of which are packaged inside of Drake. For example, the Jaco arm URDF is available packaged with Drake.
where q is the state vector, M is the inertia matrix, C captures Coriolis forces, and is the gravity vector. The matrix B maps control inputs u into generalized forces.
External forces on the tree are described as wrenches. Wrenches are an object that combines forces and torques. In Drake, they are specified by 6 dimensional vectors. From the Drake docs:
“A column vector consisting of one wrench (spatial force) = [r X f; f], where f is a force (translational force) applied at a point P and r is the position vector from a point O (called the “moment center”) to point P.”
Drake will also compute the quantities in the manipulator equation. For example, to compute the term in the manipulator equations with no externally applied wrenches.
The geometric Jacobian function returns the Jacobian in a sparse format. It returns a tuple of a matrix and a vector of indices to which coordinates these correspond. The function takes the id of three different frames. In this example, it computes the differential of the transformation from frame 0 to frame 9 expressed in frame 3.
The Drake visualizer may be found in the tools folder. The Drake visualizer needs to be running before any applications that needs to communicate with it are started. To get the drake visualizer running run the following command from the drake directory
The package as of October 2018 supports Ubuntu 14.04 and ROS Indigo. The package includes bindings to the Jaco SDK for reading sensor data and giving motion commands. The package supports a variety of control modes, including torque control.
To update functionality to ubuntu 16.04 and ROS Kinetic you may wish to pull from this external branch
Drake provides a common interface to many optimization solvers. Because of the tight integration, Drake has the capability to directly build the equations of motion for a system into a form the solver can comprehend.
The Mathematical Program class provides a high level interface to the different solvers. This class can be constructed as an object on it’s own or as returned by other helper classes such as trajectory optimization builders.
Drake provides a symbolic expression modeling language in which to describe constraints and costs.
from pydrake.all import *
import numpy asnp
There are a large variety of solvers out there for problems of different structure.
A Mathematical Program is generally of the form
One of the oldest and most studied class of these programs are known as Linear Programs which has linear cost and constraints. This mathematical program has very efficient solvers available for it.
A large class of optimization problems that are tractable are known as Convex Optimization problems. The cost functions must be bowl shaped (convex), the inequality constraints must define convex regions and the equality constraints must be affine (linear + offset). For this class, gradient descent roughly works and refinements of gradient descent like Newton’s method work even better. Convexity implies that greedy local moves are also acceptable global moves and there are no local minima or tendril regions to get caught in.
The common reference for convex optimization is the textbook by Boyd and Vandenberghe freely available at http://web.stanford.edu/~boyd/cvxbook/. There is also an accompanying video course available online.
Subclasses of Convex programming may have solvers tuned to them. Important subclasses include:
Linear Programming – Linear objective, linear equality, and linear inequality constraints.
Quadratic Programming – Linear Programming + quadratic objective
Second Order Cone Programming
Semidefinite Programming – Optimization allowing for the constraint of a SemiDefinite matrix. This means the matrix is constrained to have all nonnegative eigenvalues or equivalently the quadratic form it defines is non-negative for all possible vectors .
Sum of Squares Programming – Optimization over polynomials with the constraint that the polynomial can be written as a sum of squares, a manifestly positive form.
Many problems cannot be put into this form. If the inherent nature of the problem at hand requires it, you may choose to use a nonlinear programming solver or Mixed Integer Programming Solver.
The solvers Ipopt and Snopt are nonlinear programming solvers. Ipopt is an open source and Snopt is proprietary. These solvers use local convex approximations to the problem to heuristically drive the solution to a local minimum.
Mixed Integer Linear Programming is Linear Programming with additional ability to require variables to take on integer values. This additional constraint type takes the problem from polynomial time solvable to NP-complete. A surprisingly large number of discrete and continuous optimization problems can be encoded into this framework. Internally, these solvers use linear programming solvers to guide the discrete search.
Part of the art and fun of the subject comes from manipulating your problem into a form amenable to powerful available solvers and theory.
Hans Mittelmann at the University of Arizona maintains benchmarks for a variety of optimization tools. http://plato.asu.edu/bench.html A rule of thumb is that commercial solvers perform better than open source solvers.
Available Solvers in Drake
Mosek – Mosek is a proprietary optimization solver package offering solvers for
Automatic Differentiation is the capability to have derivatives computed alongside code that computes the values. It is largely based upon application of the chain rule. There are two modes, forward and reverse mode.
Forward mode is the simplest to describe. Functions can be overloaded so that they take a dual number, a value combined in a tuple with it’s derivative information. As the value of a function is computed, the Jacobian of that function is matrix multiplied on the derivative concurrently.
Drake exposes automatic differentiation capability for manual use
from pydrake.all import *
import numpy asnp
# second parameter initializes forward mode derivative information
f=x *x *x
More importantly Drake uses automatic differentiation internally to marshal symbolic problems into forms acceptable to external solvers and to calculate the various Jacobians we’ve already seen.
Trajectory optimization is a framework in which one uses Mathematical programming to solve optimal trajectory problems. The input to the system
is considered to be a decision variable in a Mathematical programming problem.
The combination of dynamical system modeling, automatic differentiation, and bindings to mathematical programming solvers makes Drake an excellent platform for trajectory optimization.
In Direct Collocation, both the path and the input variables are discretized along time. The trajectories are described by cubics and force curves are described by piecewise linear. One way of performing direct collocation is to take the path position at a discrete number of time points and make a decision variable for each. The discretized equations of motion become constraints that neighboring time points have to obey. One may then inject any other desired requirements (staying inside some safe region for example) as additional constraints.
The Drake docs state:
DirectCollocation implements the approach to trajectory optimization as described in C. R. Hargraves and S. W. Paris. Direct trajectory optimization using nonlinear programming and collocation. J Guidance, 10(4):338-342, July-August 1987. It assumes a first-order hold on the input trajectory and a cubic spline representation of the state trajectory, and adds dynamic constraints (and running costs) to the midpoints as well as the knot points in order to achieve a 3rd order integration accuracy.
Drake provides a useful interface for talking about trajectories. For both describing the cost function and the constraint functions, you want them to apply at all time steps. You can ask drake for variables representing the position or forces at all time steps. Drake will then build the appropriate mathematical program applying the cost of constraint at all time steps.
dircol=DirectCollocation(robot,ctx,10,0.01,0.1)# timesteps, minimum time step, max timestep
You may want to select the initial and final state specifically to specify goals and initial conditions
Example Application: Estimating end-effector force from Jaco motor torques
End effector forces become part of the equations of motion.
The geometric Jacobian transforms the extrinsic force into intrinsic coordinates. It is in general a rectangular matrix, since the number of extrinsic coordinates does not need to match the number of intrinsic coordinates.
A force applied to the end effector appears linearly in the manipulator equations as . This will be canceled by the torques of the motors during static or quasi-static movement. Hence, we can can determinethe external force from the motor torques if we assume it is the only external force at play.
The pseudo-inverse is the best possible solution to an overdetermined set of linear equations, in the least squares sense. We use this operation due to the non square nature of the Jacobian.
The following script prints both the end effector force as supplied by the Jaco SDK and the force as computed by Drake from the internal motor torques.
The surface of a circle is not a convex shape. If you include the interior of a circle it is. You can build a good approximation to the circle as polygons. A polygon is the union of it’s sides, each of which is a line segment. Line sgements are convex set. Unions of convex sets are encodable using mixed integer programming. What I do is sample N regular positions on the surface of a circle. These are the vertices of my polygon. Then I build boolean indicator variables for which segment we are on. Only one of them is be nonzero . If we are on a segment, we are allowed to make positions that interpolate between the endpoints of that segment , where and . These are only allowed to be nonzero if we are on the segment, so we suppress them with the indicator variables . That’s the gist of it.
Given a point on the circle (basically sines and cosines of an angle) we can build a 2d rotation matrix from it. Then we can write down the equations connecting subsequent links on the arm. . By using global rotations with respect to the world frame, these equations stay linear. That is a subtle point. and are variables, whereas is a constant describing the geometry of the robot arm. If we instead used rotation matrices connecting frame i to i+1 these R matrices would compound nonlinearly.
All in all, pretty cool!
import cvxpy ascvx
import numpy asnp
import matplotlib.pyplot asplt
# builds a N sided polygon approximation of a circle for MIP. It is the union of the segments making up the polygon
# might also be useful to directly encode arcs. for joint constraint limits.
segment=cvx.Variable(N,boolean=True)#segment indicator variables, relaxing the boolean constraint gives the convex hull of the polygon
xs=np.cos(angles)#we're using a VRep
constraints+=[x==l*xs,y==l*ys]# x and y are convex sum of the corner points
constraints+=[cvx.sum(l)==1,l<=1,0<=l]#interpolations variables. Between 0 and 1 and sum up to 1
constraints+=[cvx.sum(segment)==1]# only one indicator variable can be nonzero
constraints+=[l[N-1]<=segment[N-1]+segment]#special wrap around case
The really cute part of it is using electrical conduit as rails, which are shockingly inexpensive. Like a couple bucks for 4 feet! Holy shnikes!
We’ve been printing up a storm for the last couple weeks. A ton of parts!
We already had a lot of motors and stuff lying around. Declan bought a lot of stuff too just for this. Assorted bearings and bolts. The plans have a bill of materials.
Repetier host seemed to work pretty well for controlling the board
Used the RAMPS branch of the mpcnc marlin repo
Edited the header files as described in this post so that we could use both extruders as extra x and y motor drivers. It did not seem like driving two motors from the same driver board was acceptable. Our bearings are gripping the rails a little too tight. It is tough to move.
This is an interesting web based g-code maker. Ultimately a little to janky though. It works good enough to get started http://jscut.org/jscut.html . Not entirely clear what pocket vs interior vs whatever is. engrave sort of seemed like what I wanted. Went into inkscape with a reasonable png and traced bitmapped it, then object to path. It’s also nice to just find an svg on the internet
The following code was needed to zero repetier and the RAMPS at the touch off point. We added it as a macro. It is doing some confusing behavior though.
G92 X0 Y0 Z0
pycam is the best I can find for 3d machining. Haven’t actually tried it yet
Casadi is something I’ve been aware of and not really explored much. It is a C++ / python / matlab library for modelling optimization problems for optimal control with bindings to IPOpt and other solvers. It can produce C code and has differentiation stuff. See below for some examples after I ramble.
I’ve enjoyed cvxpy, but cvxpy is designed specifically for convex problems, of which many control problems are not.
Casadi gives you a nonlinear modelling language and easy access to IPOpt, an interior point solver that works pretty good (along with some other solvers, many of which are proprietary however).
While the documentation visually looks very slick I actually found it rather confusing in contents at first. I’m not sure why. Something is off.
It also has a bunch of helper classes for DAE building and other things. They honestly really put me off. The documentation is confusing enough that I am not convinced they give you much.
The integrator classes give you access to external smart ode solvers from the Sundials suite. They give you good methods for difficult odes and dae (differential algebraic equations, which are ODEs with weird constraints like x^1 + y^1 == 1) Not clear to me if you can plug those in to an optimization, other than by a shooting method.
Casadi can also output C which is pretty cool.
I kind of wondered about Casadi vs Sympy. Sympy has lot’s of general purpose symbolic abilities. Symbolic solving, polynomial smarts, even some differential equation understanding. There might be big dividends to using it. But it is a little harder to get going. I feel like there is an empty space for a mathemtical modelling language that uses sympy as it’s underlying representation. I guess monkey patching sympy expressions into casadi expression might not be so hard. Sympy can also output fast C code. Sympy doesn’t really have any support for sparseness that I know of.
As a side note, It can be useful to put these other languages into numpy if you need extended reshaping abilities. The other languages often stop at matrices, which is odd to me.
Hmm. Casadi actually does have access to mixed integer programs via bonmin (and commercial solvers). That’s interesting. Check out lotka volterra minlp example
The optim interface makes some of this look better. optim.minimize and subject_to. Yeah, this is more similar to the interfaces I’m used to. It avoids the manual unpacking of the solution and the funky feel of making everything into implicit == 0 expressions.
Here is a simple harmonic oscillator example using the more raw casadi interface. x is positive, v is velocity, u is a control force. I’m using a very basic leap frog integration. You tend to have to stack things into a single vector with vertcat when building the final problem.
from casadi import *
import matplotlib.pyplot asplt
#theta = SX('theta', N)
#thdot = SX('thetadot', N)
constraints=[x-1,v]# expressions that must be zero
Very fast. Very impressive. Relatively readable code. I busted this out in like 15 minutes. IPopt solves the thing in the blink of an eye (about 0.05s self reported). Might be even faster if I warm start it with a good solution, as I would in online control (which may be feasible at this speed). Can add the initial condition as a parameter to the problem
Here I made a bouncing ball using mixed integer programming in cvxpy. Currently we are just simulating the bouncing ball internal to a mixed integer program. We could turn this into a control program by making the constraint that you have to shoot a ball through a hoop and have it figure out the appropriate initial shooting velocity.
The trick I used this time is to make boolean indicator variables for whether a collision will happen or not. The big M trick is then used to actually make the variable reflect whether the predicted position will be outside the wall at x=0. If it isn’t, it uses regular gravity dynamics. If it will, it uses velocity reversing bounce dynamics
Just gonna dump this draft out there since I’ve moved on (I’ll edit this if I come back to it). You can embed collisions in mixed integer programming. I did it below using a strong acceleration force that turns on when you enter the floor. What this corresponds to is a piecewise linear potential barrier.
Such a formulation might be interesting for the trajectory optimization of shooting a hoop, playing Pachinko, Beer Pong, or Pinball.
@constraint(m,M*a[t]>=x[t+1])#ifon the next step projects into the earth
#@constraint(m,f[t]<=M*(1-a[t]))#we allowabouncing force
More things to consider:
Is this method trash? Yes. You can actually embed the mirror law of collisions directly without needing to using a funky barrier potential.
You can extend this to ball trapped in polygon, or a ball that is restricted from entering obstacle polygons. Check out the IRIS project – break up region into convex regions
There is a reasonable piecewise linear approximation for the pendulum replacing the the sinusoidal potential with two quadratic potentials (one around the top and one around the bottom). This translates to a triangle wave torque.
Cvxpy curiously has support for Mixed Integer Programming.
Cbc is probably better than GLPK MI. However, GLPK is way easier to get installed. Just brew install glpk and pip install cvxopt.
Getting cbc working was a bit of a journey. I had to actually BUILD Cylp (god forbid) and fix some problems.
Special Ordered Set constraints are useful for piecewise linear approximations. The SOS2 constraints take a set of variables and make it so that only two consecutive ones can be nonzero at a time. Solvers often have built in support for them, which can be much faster than just blasting them with general constraints. I did it by adding a binary variable for every consecutive pair. Then these binary variables suppress the continuous ones. Setting the sum of the binary variables to 1 makes only one able to be nonzero.
One downside is that every evaluation of these non linear functions requires a new set of integer and binary variables, which is possibly quite expensive.
For some values of total time steps and step length, the solver can go off the rails and never return.
At the moment, the solve is not fast enough for real time control with CBC (~ 2s). I wonder how much some kind of warm start might or more fiddling with heuristics, or if I had access to the built in SOS2 constraints rather than hacking it in. Also commercial solvers are usually faster. Still it is interesting.