Showing posts with label GCE. Show all posts

Tuesday, October 30, 2018

Buy, Rent, or Borrow Computing Power for Machine Learning

I am going through an interesting exercise at work. There are a number of us going through a Reinforcement Learning book and we all want to play with the code described in the various chapters. While I have the luxury of having a Linux machine sitting under my desk, others are not so fortunate. One of my coworkers has a rather large Linux box with a lot of computing power and so several people are running the exercises there. Unfortunately it can only handle about 3 really large jobs concurrently and so we are looking at other options for getting compute power for our Machine Learning exercises.

We tried the first option, buying a large computer, but there are a number of problems with it. The first, as I explained, is that only about 3 people can use it at the same time: the resources are fixed and don't scale well. The second problem is that we have to maintain the computer. During one particularly large job, it stopped running at 4am and we have no idea why. We believe it might have been a hardware failure because the logs just suddenly stopped recording anything. It would be nice to have someone monitoring the computer 24 hours a day, but that is not possible, especially for a simple learning exercise.

The next option is to rent space on a cloud service such as Amazon (AWS), Google (GCE), or Microsoft (Azure). We do not require graphics processing units (GPUs), so we can get enough computing power for all our experiments for around $650/month. Reading the book will take 2 to 3 months, so we would need about $2,000 in total. That is significantly less than the price we paid for the computer mentioned previously. Furthermore, the hardware will scale nicely. If we want to run more experiments, we increase the number of servers we rent. When we don't need them any more, we shut them down and don't pay to keep them running.
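The rental estimate is simple arithmetic, and a quick sketch makes it easy to rerun with different numbers. The $650/month rate and the 2 to 3 month duration come from the figures above; the function and constant names are just illustrative choices.

```python
# Monthly rental estimate quoted above, in US dollars.
MONTHLY_RENTAL_USD = 650


def rental_cost(months, monthly_rate=MONTHLY_RENTAL_USD):
    """Return the total cost of renting for the given number of months."""
    return months * monthly_rate


# The book should take 2 to 3 months to read.
for months in (2, 3):
    print(f"{months} months: ${rental_cost(months):,}")
```

The 3-month case lands just under the roughly $2,000 budget.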

Finally, there is the option of borrowing computing power. A number of other groups within the company have spare compute cycles we could use for our learning exercises. This is the ideal solution if we only factor in cost. In reality, however, we could be kicked off the hardware whenever higher-priority tasks need to run, and we would have to deal with spreading our experiments evenly across the company.

We will probably end up renting servers from one of the public cloud companies, as that seems to strike the best balance between cost and hassle. Your situation may be different, and it is always worth considering all 3 options.

Tuesday, September 22, 2015

Software Development Trends

Today I had an experience at work that underscored the evolution of software development. When I was in school working on programming assignments for various classes, there really wasn't a need to use a formal methodology. I looked at the problem, sketched out my algorithm, sat down and coded it. Life was pretty easy but so were the assignments, relative to the projects I am involved with now.

I got out into the software development field and joined a team that kept up on all of the latest software development methods. We were one of the leaders in implementing the Capability Maturity Model (CMM). It worked for us and we were able to create some pretty amazing computer software that was on time and under budget.

Fast forward a few years and I found myself doing consulting for various organizations. As I was a lone coder working on projects, there really wasn't a need to completely embrace CMM. Instead I took what felt good and coupled it with what worked in college. Once again life was easy but my projects were rather small and not very complex.

Eventually I found myself working for a very small software organization in Southern California and was introduced to agile software development. My first thought was that "agile" must have been developed by 3M because you use a lot of sticky notes. It felt like the next evolution of software development and seemed fairly efficient.

Then "the cloud" came along and the next bit of evolution was needed, so DevOps was born. DevOps doesn't replace "agile" but is coupled with it. In addition to creating software, a group of software developers gets tasked with writing tools to deploy the software. This isn't a large task for small and simple programs. However, bigger software projects that require large numbers of computing nodes or instances running on services such as Amazon Web Services (AWS) or Google Compute Engine (GCE) need a lot more care to ensure the code gets deployed correctly.

In my meeting today we discussed DevOps and whether it should be done by a specific team. The group consensus was that each developer should have a hand in the deployment process and understand DevOps. I wouldn't say we are ready to disband the DevOps team yet, as each project needs at least one person responsible for architecting the entire deployment system. However, this is a shift from how we have done things in the past and seems to be the next step in the evolution of software development. Of course, now that I have written this down it is obsolete. Tomorrow there will be yet another twist.