AWS Server Type Performance Comparison for a Deep Learning/Convolutional Neural Network Project

Udacity has a Deep Learning course. One of the projects is to set up and train an image-classifying convolutional neural network using TensorFlow. The tensors can get rather large (GB size). The computational requirements are such that if you want to play around with things, tear them apart and put them back together again to really dig into it, you need server resources that don’t take forever to give you back some results. Since it’s easy to switch an AWS server’s machine type, I thought I’d run a short experiment to compare how different AWS server types perform on a deep learning/convolutional neural network I was working on.

I provisioned a server at AWS based on the machine image “udacity-dl (ami-60f24d76)” that the Udacity Deep Learning folks provided. Once you have this up and running you can easily change the server’s instance type (the machine type, not the image). See https://aws.amazon.com/ec2/instance-types/ for some information on AWS instance types. The cost per hour values in the table below are from this page. Side note: unless you’re rich you need to pay close attention to the cost per hour. It’s easy to burn through a significant chunk of money over the days and days and days of playing around and trying new things. It’s a blast but not cheap.
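If you’d rather script the switch than click through the console, here is a rough sketch of how it could look with boto3. This is not what I ran for the tests; it assumes an EBS-backed instance (which has to be stopped before its type can be changed), and the function name, instance ID, and target type in the usage note are placeholders.

```python
def change_instance_type(ec2, instance_id, new_type):
    """Stop an EBS-backed instance, change its type, then start it again.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be modified while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Switch the machine type; the AMI and attached volumes stay the same.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the server back up and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Example usage (placeholder ID and type, needs AWS credentials configured):
#   import boto3
#   change_instance_type(boto3.client("ec2"), "i-0123456789abcdef0", "p2.xlarge")
```

Keep in mind that the public IP address usually changes after a stop/start unless you have an Elastic IP attached.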

I tried several different machine types to compare performance (how fast the model would complete a training iteration) and here’s what I found. I limited the servers I tested to those that cost less than a dollar per hour. Every test ran the exact same script, so the only variable was the machine type.
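To make “how fast the model would complete a training iteration” concrete, here is a minimal sketch of the kind of timing harness you could wrap around a training step. It is an illustration, not my actual test script: `train_step`, `time_iterations`, and `cost_per_iteration` are hypothetical names, and the price in the usage note is a made-up example value.

```python
import time


def time_iterations(train_step, n_iters=5, warmup=1):
    """Return the average wall-clock seconds per call to train_step()."""
    # Warm-up runs are excluded so one-time setup cost doesn't skew the average.
    for _ in range(warmup):
        train_step()

    durations = []
    for _ in range(n_iters):
        start = time.perf_counter()
        train_step()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def cost_per_iteration(seconds_per_iter, dollars_per_hour):
    """Convert an hourly price into dollars spent per training iteration."""
    return seconds_per_iter * dollars_per_hour / 3600.0


# Example usage (train_step is whatever runs one iteration of your model;
# 0.90 is a placeholder hourly price, not a quoted AWS rate):
#   secs = time_iterations(train_step, n_iters=10)
#   print(secs, cost_per_iteration(secs, 0.90))
```

Looking at cost per iteration alongside raw speed is handy because a pricier machine can still come out cheaper overall if it finishes each iteration fast enough.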

One server type should jump out at you. That’s what I’m using now.