A Basic Branch and Bound Solver in Python using Cvxpy

Branch and bound is a useful problem solving technique. The idea is, if you have a minimization problem you want to solve, maybe there is a way to relax the constraints to an easier problem. If so, the solution of the easier problem is a lower bound on the possible solution of the hard problem. If the solution of the easier problem just so happens to also obey the more constrained hard problem, then it must also be the solution to the hard problem. You can also use the lower bound coming from a relaxed problem to prune your search tree for the hard problem. If even the relaxed problem doesn’t beat the current best found, don’t bother going down that branch.

A standard place this paradigm occurs is in mixed integer programming. The relaxation of a binary constraint (either 0 or 1) can be all the values in between (any number between 0 and 1). If this relaxed problem can be expressed in a form amenable to a solver like a linear programming solver, you can use that to power the branch and bound search, also using returned solutions for possible heuristics.

I built a basic version of this that uses cvxpy as the relaxed problem solver. Cvxpy already has much much faster mixed integer solvers baked in (which is useful to make sure mine is returning correct results), but it was an interesting exercise. The real reason I’m toying around is I kind of want the ability to add custom branching heuristics or inspect and maintain the branch and bound search tree, which you’d need to get into the more complicated guts of the solvers bound to cvxpy to get at. Julia might be a better choice.

A somewhat similar (and better) project is https://github.com/oxfordcontrol/miosqp which doesn’t use cvxpy explicitly, but does have the branch and bound control in the python layer of the solver. There are also other projects that can use fairly arbitrary solvers like Bonmin

As a toy problem I’m using a knapsack problem where we have objects of different sizes and different values. We want to maximize the value while keeping the total size under the capacity of the bag. This can be phrased linearly like so: \max v \cdot x s.t. \sum_i s_i x_i<= capacity , x \in {0,1}. The basic heuristic I’m using is to branch on variables that are either 0 or 1 in even the relaxed solution. The alternative branch hopefully gets pruned fast.

This is at least solving the problem fairly quickly. It needs better heuristics and to be sped up, which is possible in lots of ways. I was not trying to avoid all performance optimizations. It takes maybe 5 seconds, whereas the cvxpy solver is almost instantaneous.

Mixed Integer Programming & Quantization Error

I though of another fun use case of mixed integer programming the other day. The quantization part of a digital to analog converter is difficult to analyze by the techniques taught in a standard signals course (linear analysis, spectral techniques, convolution that sort of thing). The way it is usually done is via assuming the quantization error is a kind of randomized additive noise.

Mixed Integer programming does have the ability to directly encode some questions about this quantization though. We can directly encode the integer rounding relations by putting the constraint that the quantized signal is exactly +-1/2 a quantization interval away from the original signal. Then we can run further analysis on the signals and compare them. For example, I wrote down a quick cosine transform. Then I ask for the worst case signal that leads to the most error on the quantized transform versus the transform of the unquantized signal. My measure of worst case performance was the sum of the difference of the two transforms. I chose this because it is tractable as a mixed integer linear program. Not all reasonable metrics one might want will be easily encodable in a mixed integer framework it seems. Some of them are maximizing over a convex function, which is naughty. (for example trying to maximize the L2 error \sum|x-y|^2 )

In a variant of this, it is also possible to directly encode the digital signal process in terms of logic gate construction and compare that to the intended analog transform, although this will be a great deal more computational expensive.

This is interesting as a relatively straightforward technique for the analysis of quantization errors.

This also might be an interesting place to use the techniques of Vanderbei https://vanderbei.princeton.edu/tex/ffOpt/ffOptMPCrev4.pdf . He does a neato trick where he partially embeds the FFT algorithm into an optimization problem by adding auxiliary variables. Despite the expense of adding these variables, it greatly increases the sparsity of the constraint matrices, which will probably be a win. I wonder if one might do something similar with a Fast Multipole Method , Barnes Hut, or Wavelet transform? Seems likely. Would be neat, although I’m not sure what for. I was thinking of simulating the coulomb gas. That seems like a natural choice. Oooh. I should do that.

Solving the XY Model using Mixed Integer Optimization in Python

There are many problems in physics that take the form of minimizing the energy. Often this energy is taken to be quadratic in the field. The canonical example is electrostatics. The derivative of the potential \phi gives the electric field E. The energy is given as \int (|\nabla \phi|^2 + \phi \rho) d^3 x . We can encode a finite difference version of this (with boundary conditions!) directly into a convex optimization modelling language like so.

The resulting logarithm potential

It is noted rarely in physics, but often in the convex optimization world that the barrier between easy and hard problems is not linear vs. nonlinear, it is actually more like convex vs. nonconvex. Convex problems are those that are bowl shaped, on round domains. When your problem is convex, you can’t get caught in valleys or on corners, hence greedy local methods like gradient descent and smarter methods work to find the global minimum. When you differentiate the energy above, it results in the linear Laplace equations \nabla^2 \phi = \rho. However, from the perspective of solvability, there is not much difference if we replace the quadratic energy with a convex alternative.

Materials do actually have non-linear permittivity and permeability, this may be useful in modelling that. It is also possible to consider the convex relaxation of truly hard nonlinear problems and hope you get the echoes of the phenomenology that occurs there.

Another approach is to go mixed integer. Mixed Integer programming allows you to force that some variables take on integer values. There is then a natural relaxation problem where you forget the integer variables have to be integers. Mixed integer programming combines a discrete flavor with the continuous flavor of convex programming. I’ve previously shown how you can use mixed integer programming to find the lowest energy states of the Ising model but today let’s see how to use it for a problem of a more continuous flavor.

As I’ve described previously, in the context of robotics, the non-convex constraint that variables lie on the surface of a circle can be approximated using mixed integer programming. We can mix this fairly trivially with the above to make a global solver for the minimum energy state of the XY model. The XY model is a 2d field theory where the value of the field is constrained to lie on a circle. It is a model of a number of physical systems, such as superconductivity, and is the playground for a number of interesting phenomenon, like the Kosterlitz-Thouless phase transition. Our encoding is very similar to the above except we make two copies of the field phi and we then force them to lie on a circle. I’m trying to factor out the circle thing into my library cvxpy-helpers, which is definitely a work in progress.

Now, this isn’t really an unmitigated success as is. I switched to an absolute value potential because GLPK_MI needs it to be linear. ECOS_BB works with a quadratic potential, but it was not doing a great job. The commercial solvers (Gurobi, CPlex, Mosek) are supposed to be a great deal better. Perhaps switching to Julia, with it’s richer ecosystem might be a good idea too. I don’t really like how the solution of the absolute value potential looks. Also, even at such a small grid size it still takes around a minute to solve. When you think about it, it is exploring a ridiculously massive space and still doing ok. There are hundreds of binary variables in this example. But there is a lot of room for tweaking and I think the approach is intriguing.

Musings:

  • Can one do steepest descent style analysis for low energy statistical mechanics or quantum field theory?
  • Is the trace of the mixed integer program search tree useful for perturbative analysis? It seems intuitively reasonable that it visits low lying states
  • The Coulomb gas is a very obvious candidate for mixed integer programming. Let the charge variables on each grid point = integers. Then use the coulomb potential as a quadratic energy. The coulomb gas is dual to the XY model. Does this exhibit itself in the mixed integer formalism?
  • Coulomb Blockade?
  • Nothing special about the circle. It is not unreasonable to make piecewise linear approximations or other convex approximations of the sphere or of Lie groups (circle is U(1) ). This is already discussed in particular about SO(3) which is useful in robotic kinematics and other engineering topics.

Edit: /u/mofo69extreme writes:

By absolute value potential, I mean using |del phi| as compared to a more ordinary quadratic |del phi|2.

This is where I’m getting confused. As you say later, you are actually using two fields, phi_x and phi_y. So I’m guessing your potential is the “L1 norm”

|del phi| = |del phi_x| + |del phi_y|

? This is the only linear thing I can think of.

I don’t feel that the exact specifics of the XY model actually matter all the much.

You should be careful here though. A key point in the XY model is the O(2) symmetry of the potential: you can multiply the vector (phi_x,phi_y) by a 2D rotation matrix and the Hamiltonian is unchanged. You have explicitly broken this symmetry down to Z_4 if your potential is as I have written above. In this case, the results of the famous JKKN paper and this followup by Kadanoff suggest that you’ll actually get a phase transition of the so-called “Ashkin-Teller” universality class. These are actually closely related to the Kosterlitz-Thouless transitions of the XY model; the full set of Ashkin-Teller phase transitions actually continuously link the XY transition with that of two decoupled Ising models.

You should still get an interesting phase transition in any case! Just wanted to give some background, as the physics here is extremely rich. The critical exponents you see will be different from the XY model, and you will actually get an ordered Z_4 phase at low temperatures rather than the quasi-long range order seen in the low temperature phase of the XY model. (You should be in the positive h_4 region of the bottom phase diagram of Figure 1 of the linked JKKN paper.)”

These are some interesting points and references.