Deep Learning Basic Shapes

Our Github Source:

We are trying to get some basic deep learning to happen. Don some MNIST tutorials before, but there in lies the problem. These tutorials have a lot already done for you.

So we’re going to try an even simpler task. Classifying computer generated shapes.

We generate images of shapes using scikit-image.

We started off using Caffe (because it has support for non cuda gpu) and were displeased. It’s heavily configuration file based. The python api mostly is about generating configuration files for you. You need to dip into C to get some stuff done. You need to put your files into an intermediate database file. The docs mostly consist of tutorial python notebooks. It was just all too much barrier for getting started. We may return for various reasons though at some point. It has a database of pretrained networks. Also supposedly Caffe is fast.

Once we switched over to Keras, things started going a lot smoother.

We copied the VGG convolutional example network and chopped out some of it’s layers and lessened the size of the hidden layers. The more complicated the network, the more time and data its going to need. This is a simple task so it shouldn’t need so much.

We had an initial problem with the loss diverging. Our learning rate was too high. We had to cut it down. I read somewhere that a rule of thumb is to decrease the learning rate by factors of 2 until the loss stops diverging.

We also weren’t sure what cross-entropy was. This helped.

The cross entropy is the number of bits you’d need to determine the actual answer after the neural network gave you its guess for the probabilities of the answer given the input. If the neural network gives you a 1 for one answer then it is absolutely certain and you need no more bits. If it gives you a 1 for the wrong answer, you’d need infinitely many bits if you were encoding answers using the networks guesses. In efficient coding, impossible answers take infinite bits. Kind of funny thing.

If I understand it right, the cross entropy of a randomly guessing neural network would be \log _2 (3 ), since you’d expect it to take that many bits to encode 3 options. The starting point tends to be around 6 or so though?

The tensorflow vs theano backend is a constant problem. I suppose we’re going to use the tensorflow backend. Tensorflow has momentum behind it.

You can change the backend by changing keras file.

But you also need to constantly be switching the channel ordering.

The decay parameter is a thing that decreases the learning rate. Basically I think it is 1/iteration at which you want the learning to slow down?

I think we’re going to use AWS for now rather than buy a nice graphics card. Unfortunately, it is important to have CUDA capability and our card is a Radeon.

AWS isn’t so bad. We experienced maybe a 10x speedup in a very unofficial test compared to local cpu on a g2.xlarge.

This is useful. Although the software is constantly going out of date. It has enough to get running.

The smallest g2 instances are currently 0.60$/hour for a regular instance.

Spot instances are often suggested. They cannot be stopped, only terminated which sucks. You can mount an external EBS. Spot instance are somewhere between 0.20$ and 0.6$ per hour.

You can check here

Or on your AWS console.



Chebyshev Polynomials

Chebyshev polynomials are really cool.

No please. Don’t leave. I’m so lonely.

They really are. I remember them coming up a lot during my Gilbert Strang phase (his numerical computing course online is really awesome). When in doubt use Chebyshev polynomials.

They have a lot of great properties. A lot of special functions are just a bunch of random garbage.

The key to Chebyshev polynomials is that they are really \cos in disguise. and they inherit a ton of properties just from that. \cos is a very special function.


What this says is that if you transformed  the domain [-1,1] using x=\cos(\theta) , then the Chebyshev polynomials are just cosines. A couple of properties fall out of this.

First, the equioscillation property. One thing that defines chebyshev polynomials is that they are polynomials with the minimum maximum values on the domain, i.e. they oscillate between -1 and 1. This is why they are good for function approximation. Using them tends to make the approximation error have a minimum maximum value over the interval. Other polynomials of the same degree overshoot the bounds.

This is not surprising from the cosine perspective.

An equally spaced grid is the best one in a periodic domain. This domain transformation transforms the equally spaced grid \theta_n = n/2\pi N to the chebyshev points x_n = \cos(n/ 2\piN). It is somewhat surprising that these are actually much better points to sample functions at rather than the equally spaced points.

A fast n \ln n transform between the chebyshev coefficients and the sampled point values can be achieved using the discrete cosine transform. The discrete cosine transform takes an expansion in cosines to samples on the equally spaced grid and again the chebyshev functions are just cosines in disguise.

They inherit orthonormality from the orthonormality of cosines.

An advantage that chebyshev functions have over cosines is that they are polynomials. This makes them easy to manipulate and compute. The correct way to evaluate a Chebyshev series is using the recurrence.


Once you’ve sampled a function into Chebyshev form, it is simple to differentiate, integrate, multiply, and add functions. Composition of functions may be a bit harder. There is the very interesting relation that T_n(T_m(x))=T_{nm}(x), but it is not clear to me that there is an easy way to find a relation $latexT_n(a+b)$ in order to evaluate the composition of full Chebyshev series efficiently.

For differentiation and integration it is important to remember the \sin factor that occurs due to the change of domain from the unit circle to the unit interval.

Transformation between the samples at Chebyshev points and the polynomial coefficients can be achieved using a DCT.

There are two sets of Chebyshev points, one for roots of T_n one for the extrema.

The Cosine Series works with an implicit even periodic extension of the function you want to approximate. This is why it can work better than approximating a non periodic function with the Fourier series, where the implicit periodic version will have discontinuities.

Root finding becomes possible.

scipy has some solid chebyshev capabilities.

It surprises me that Haskell does not have an appropriate translation of chebfun as far as I can find. Chebfun is such a functional kind of thing, a beautiful programming pattern.

This library has some basic chebyshev capabilities

I’d like to take moment here to comment how important the vector type is in Haskell. It is not mainstream in the sense that any Haskell tutorial mentions it as I recall, but it gives you access into the world of c-like arrays (which is ugly and unhaskelly I guess).




Some Resources on Automatic Differentiation & Power Series in Haskell

Ed Kmett’s ad package. Probably everything you could ever want if you understand it.

Conal Elliott – Beautiful Differentiation – There is an implementation in vector-space

Functional Differentiation of Computer Programs J Karczmarczuk

GR raytracing uses ad package in branch

general backpropagation







Some simple ST Monad examples

The ST monad is what you use to do real mutable state in Haskell basically.

The State monad is a more pure guy that just automatically threads the state through pure functions for you.

The ST monad, and structures in it, to my understanding is actually backed by computer memory that is changing. Some things that should be fast and easy become actually fast and easy. Like changing a single element in an array without rebuilding the whole damn thing (or a lot of it).

The ST monad is probably bad style. You’re supposed to bend over backward to avoid mutability. It’s a gun. Don’t shoot yourself. Maybe better style is to use the C FFI (foreign function interface) if you really really need mutability.

Unlike the IO monad the ST monad can be escaped. Sometimes this is called thawing and freezing, the process of going into and out of the monad.

Here’s a couple snippets that demo how it works.

I recommend not thinking about the actual monad laws of this thing. The type signature of ST is messed up. It uses some kind of weird type argument to guarantee in the typechecker that the ST monad reference can’t leak outside the monad. In terms of just pretending the do notation is imperative like, it makes sense though.

makeSTRef puts that value you give it into the variable n.

readSTRef pulls out. Modify Let’s you modify.

runST ubiquitously has to be used to rip off the ST monad layer. You don’t want it if you want to combine a bunch of little ST functions. makeArray’ doesn’t have it so if you look at it in ghci you don’t see 10, you see <<ST Action>>. If you haven’t read the reference or frozen the vector, you can’t use runST. You’ll get an error because that would leak the mutable thing out of the monad.

Vectors are how Haskell has actual c style arrays. It’s kind of like numpy if you’re familiar with that. Unboxed means you’re using raw ints and floats and stuff.

M.replicate builds a size 3 vector filled with doubles of value 1.2. Then I modify the second element and freeze it into an immutable vector to escape the monad.

Storable vectors might be the better default. They are the same really, except they can be passed through the C FFI. I believe hmatrix uses them (and other c interfacing libraries) for passing arrays into Haskell.



Scientific Programming in Haskell with hmatrix

I’ve been puttering around with some ideas about how linear algebra could work in Haskell.

So I decided to check out what has been done. hmatrix binds to the gnu scientific libraries and lapack.

There is also htensor and repa which I need to check out.

This is the hello word of scientific programs. I could write this is numpy, scipy, matplotlib FAST. But I’m super used to those. Maybe not a fair comparison.

I build a circulant function for the 1d periodic Laplace equation matrix. I did not find good constructors that would elegantly construct the thing so I make a circulant routine using the fft. That’s ok I guess.

I got the eigenvalues and eigenvectors with eigSH.

Then I plot it out to an external svg file using Cairo. I do not like the mental overhead of a monad.But I guess you can just copy and paste and ignore all that.

The Plt.line function takes a label and a list of (x,y) tuples


Ok. Actually on reflection it’s not bad. Scratching my head around the circulant matrix constructor took some time. And so did installing stuff. I did also keep slamming into type mismatches. and it didn’t like eigSH’ which I don’t get.

Maybe Haskell for Mac would work well here? I did buy a copy.

Maybe Julia would be better? I think Julia has lots of functional and pythony things.

I should refactor the fftRow stuff into a higher order combinator that maps into rows and columns





Topological Bands I.


Topology is the study of geometry minus the wiggles. More specifically topology studies continuous maps.

Topology separates into a number of sub fields.

  • General Topology – Studies the set theoretic properties and definitions of topology
  • Algebraic Topology – Studies

The piece of topology that is most relevant to physics (partially because it is the most computational) is the notion of the notion of the topological invariant. Topological invariants are numbers that can be computed. If they disagree for two objects, then the two objects are not topologically equivalent. The reverse is not true.

An example of a topological invariant is the winding number of a closed curve around a point in 2d. This can also be interpreted as a classification of mappings from circles to circles.

Another example is the Euler number. The Euler number kind of counts the number of holes in a surface. It is defined as \chi = V-E+F , the number of vertices minus edges plus the number of faces in a tessellation of a surface. If one considers the Euler number of possibly disconnected surfaces or edges or vertices, the Euler number does a decent job of counting the number of objects.

Nick Read has published a review article in Physics Today. Physics Todays is a pretty excellent introduction to modern topics if you can find an article on that topic.

Topological Matter has one or more of the following

  1. An energy gap above the ground state
  2. A number of degenerate ground states that depends on the topology of the surface on which they live (Torus vs Sphere).
  3. Anyonic Quasiparticles.
  4. Quantized Response functions (quantized Hall conductance for example)
  5. Unavoidable Edge Modes


Two objects that are far apart need to be brought together in order to be compared. A Connection is a specification on how to transport something around in space. This connection is defined differentially. Given a tiny displacement dx, what is the tiny change $dO$ in the corresponding object.

There are three examples of connections of different flavor.

The first is the connection defining parallel transport of vectors in a curved geometry.

The game goes like so: Suppose you want to transport a little arrow in the surface of a sphere  to a new point on a sphere. You want a procedure to keep the arrow in the surface of the sphere. The simplest procedure is to project the arrow back into the sphere after moving it a tiny bit dx. Move, project. Move, project. This procedure keeps the arrow in the surface of the sphere the whole way. This procedure is the connection for parallel transporting the arrow on the sphere. (This procedure defines what it means for two arrows to be parallel at nearby points).

When one actually does this however, a curious thing results. The arrow upon returning to the original point after a closed loop, may have been rotated relative to the original arrow. Different paths of transport result in different amounts of turning. This turning is a result of the curvature of the sphere and the parallel transport procedure is a method by which to probe the curvature.

Rather that integrating the connection along the entire closed route, there is an alternative but equivalent way of doing the accounting. Consider a tiny loop or area dA. This loop causes a correspondingly tiny rotation d\theta . Because the rotation gets cancelled from traverse the same path in opposite directions, the sum of all the tiny loop rotations in an area will equal the rotation calculated for the bounding edge of that area. This is Stokes theorem about the curl if that helps.

Another connection is that of the electromagnetic vector potential A. The vector potential specifies how to transport quantum amplitudes and compare them at different positions. This is the Aharonov-Bohm effect. When a particle travels through the vector potential it gets an extra phase \Delta \phi = q/\hbar \int A\cdot dx.

A final connection is that connected to Berry Phase. Instead of talking about transport in physical geometrical space, Berry phase refers to transport in parameter space. This is analogous to the transition from cartesian coordinates in Newtonian mechanics to that of generalized coordinates in Lagrangian mechanics, which may not have a simple geometrical interpretation necessarily.

Berry phase can be interpreted as being similar to the extra phase that an oscillator might receive upon a cyclic change in its parameters for example slowly changing the length and mass of a pendulum. This seems like a small effect and it typically is a drop in the bucket compared to all the ordinary dynamic phase accumulated \int \omega(t)dt but it nevertheless exists and actually has a lot of conceptual importance.

A particularly important example of Berry phase is that of the spin-1/2. This occurs when you rotate the magnetic field that the spin is in for example.

u=\begin{array}{c}\cos (\theta/2)\\ \sin (\theta/2) e^{i \phi}\end{array}

A=i<u|\partial|u>=(0, - sin^2(\theta/2) )

\int A_\phi d\phi = -2\pi sin^2(\theta/2) =-\Omega/2

$\Omega=\int \sin(\theta) d\theta \phi $ plus use some half angle identities.

In summary, the Berry phase is half of the solid angle \Omega enclosed by the path.


Next Time: Discretizing the Schrodinger Equation






Based on

767 was the value checked for at 25Mhz. This is a count of 768.

I doubled it to 1535. and added one to all references to Counter X.

I am confused on these values. References I’m finding say to shoot for 31.5kHz but this works out to 32.5khz?

50e6/31.46875e3 gives 1588.87785501. Ok. That works too. I guess the monitor is just flexible.





I got an error for can’t use pin 101 which is V sync

This helped

Need to disable nCEO which is using that pin


Alright. It appears to work. That is a good start.


FEM again

  • Sfepy
  • Fenics
  • BEM++
  • OpenFoam
  • Elmer
  • Moose
  • Code-Aster
  • Gerris
  • ASL
  • meshpy
  • SolidPython
  • Salome
  • FreeCad
  • KiCad
  • Qucs
  • Skidl

I kind of cribbed from the wikipedia CAE listing

I think it would be cool to integrate all of these better. Simulate full EM equations.

Fenics uses Docker now? My how the world turns.

So that’s nice. Docker for Mac got slick as hell.

docker ps -a lists all guys

docker run -it –rm ubuntu

docker info

-v shares folders

They wrote stuff to this all for you

Sfepy also seems nice though. I hope that two seemingly equal options doesn’t cripple me.

I had to reinstall mayavi and vtk in order to get it working. I was just ripping at my distros. Hope that doesn’t bite me later.


Installed openfoam docker image

$FOAM_TUTORIALS is a variable with tutorial location

cp -R elbow ~/OpenFOAM/elbow

The 0 directory is the starting time drectory

has a file with the name of fields

pressure and velocity in this case



icoFoam runs simulation




switch to geometry mode.

STEP files preferred?

Create new entity

select surfaces that you want special

go into mesh mode

msh -> create mesh


Hypothesis sets paramters

Warning triangle right click compute mesh

create group


group on geometry

select wall from before

right click and export unv file

ideasUNVToFoam convertst this to an openfoam file






OpenCV Android

Setup OpenCV SDK in Android Studio project

A summary of the video

Get the SDK from

Make new project with regular settings

new > import module

select java folder in sdk folder

got an erro

had to install android-sdk by clickign isntall missing platforms

copy all stuff from native > libs into a JNILibs folder by dragging and dropping

Changed the SDK version to after 21 to remove camera2 error


I was having an NDK error

Go to tools > SDK manager and download cmake lldb and ndk

still have errors.

Have to go into files as far as I can tell. This gradle isn’t around