AWS and Computing Clusters and MPI

Just been curious about parallel computation. Clusters. Gives me a little nerd hard-on.

Working my way up to running some stuff on AWS (Amazon Web Services).

So I’ve been goofing around with mpi. Mpi (message passing interface) is sort of an instant messager for programs to pass around data . It’s got some convenient functions but its mostly pretty low level.

I’ll jot some fast and incomplete notes and examples

Tried to install mpi4py.

sudo pip install mpi4py

but it failed, first had to install openmpi

To install on Mac I had to follow these instructions here. Took about 10 minutes to compile

so mpi4py

give this code a run

#mpirun -np 3 python helloworld.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
print "Hello. This is rank " + str(rank) + " of " + str(size) + " on processor " + name

the command mpirun runs a couple instances. You know which instance you are by checking the rank number which in this case is 0 through 2.

Typically rank 0 is some kind of master

lower case methods in mpi4py work kind of like how you’d expect. ¬†You can communicate between with comm.send and comm.recv

#mpirun -np 2 python helloworld.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()


if rank == 0:
comm.send("fred",dest=1)
else:
counter = comm.recv(source=0)
print counter

However I think the these are toy methods. Apparently they use pickle (python’s fast and dirty file storage library) in the background. On the other hand, maybe since you’re writing in python anyhow, you don’t need the ultimate in performance and just want things to be easy. On the third hand, why are you doing parallel programming if you want things to be easy? On the fourth hand, maybe you

The capital letter mpi functions are the ones that are better, but they are not pythony. They are direct translations of the C api which uses no returns values. Instead you pass along pointers to the variables you want to be filled.
from mpi4py import MPI
import numpy as np
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()

nprank = np.array(float(rank))
result = np.zeros(1)
comm.Reduce(nprank, result, op=MPI.SUM, root=0)

if rank == 0:
print result

Leave a Reply

Your email address will not be published. Required fields are marked *