Thursday, September 22, 2011

Keeping Time

I spent this week rebuilding and configuring servers in our research and development private cloud at work. Some time ago, our company's IT department decided to remove a machine that the rest of us used to keep all of our computers' clocks synchronized. One of my favorite sayings is: "A man with one watch knows what time it is; a man with two isn't quite sure." The same is true for computers, so when servers in a private cloud can't agree on the time, it wreaks havoc. In our case, it brought the cloud down and all of the machines with it.

In reality, not everything in the cloud came to a halt. The servers were still out there running just fine; we had simply lost all network connectivity to them, so it was impossible to log into them. The only solution was to stop the servers and restart them. If they were physical machines, a power cycle (turning them off and on) would do the trick. Because they are part of a cloud system, I also have to reconfigure them once they come back up. All of the data on the database machines was preserved, so we didn't lose anything important. Furthermore, I have simple scripts (small programs) that reconfigure the servers, so rebuilding everything didn't require much thought, just time.

When I found out that our private cloud had crashed because of clocks not being in sync, I was pretty upset. I wondered how a few seconds' difference could have such a negative effect. I guess it underscores the importance of keeping the correct time.
