Early in my career I heard the story of a software developer who had the unique practice of putting on a pith helmet whenever he was forced to debug his own code. I'm not sure how true the story is, but it evokes the image of an early explorer trudging through the jungles of a distant continent. In my current role, I am often asked to search for system problems and help resolve them.
Last Friday I received a cry for help and immediately looked into the issue. My manager didn't like how I responded, as he wanted me to find the source of the problem before bothering the group that reported it. I saw it differently: I wanted them to know that we had started looking right away, and I wanted to ask whether they had any more information that could help diagnose the problem. Ultimately, the only thing I could do was log a bug with another group in the company and have them research the root cause.
We told the group that reported the issue that we would work to get it resolved by Monday. Such is the luxury of working for a worldwide company like Sony. I can go enjoy my weekend and hope another team working in Japan or India has enough time to find the issue and resolve it before Monday even starts in the United States. Unfortunately, some issues are harder to troubleshoot, so it is important to gather as much information as possible on Friday so we don't lose time to back-and-forth questions.
Fortunately, one of our excellent engineers saw the problem on Saturday and worked through the weekend to get it fixed. The root cause boiled down to "too much data." Since that means the problem may arise again, we have a two-pronged approach to solving it. The first is to monitor for larger-than-normal data sets, which will alert the operations team before the system users get a hint that something has gone wrong. The second is to modify the architecture to handle larger volumes of data.
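To give a feel for the monitoring half of that plan, here is a minimal sketch of how a "larger-than-normal" check might look. It is not our actual implementation: the function name `is_unusually_large`, the sample sizes, and the threshold of three standard deviations are all assumptions I made up for illustration.

```python
from statistics import mean, stdev


def is_unusually_large(history, latest, threshold=3.0):
    """Return True when `latest` sits more than `threshold` standard
    deviations above the mean of `history`.

    Hypothetical helper: the name and the default threshold are
    illustrative assumptions, not part of any real system.
    """
    if len(history) < 2:
        # stdev() needs at least two points; with less history we
        # cannot say what "normal" is, so do not raise an alert.
        return False
    avg = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Every past data set had the same size; anything bigger stands out.
        return latest > avg
    return (latest - avg) / spread > threshold


# Made-up record counts from recent runs (assumed sample data).
recent_sizes = [1000, 1100, 950, 1050, 1020]

for size in (1080, 5000):
    if is_unusually_large(recent_sizes, size):
        print(f"ALERT: data set of {size} records is larger than normal")
    else:
        print(f"OK: data set of {size} records looks normal")
```

A check like this would run before the data reaches the part of the system that struggled, so the operations team hears about an oversized data set before the users do. A real monitor would likely need a smarter baseline than a simple mean and standard deviation.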
This morning I came in, saw everything that had taken place over the weekend, and could report back to the system user who noticed the problem. I could also share the changes we plan to implement so it doesn't happen again. The user confirmed the problem was fixed, and I closed the bug.
While I like the idea of wearing a pith helmet to signify that I am trudging into the jungles of code, I am glad I have not adopted the practice. First, I work out of my home office, so nobody would have any idea of the significance other than my wife, who would just laugh at me. Second, I would have gotten a lot of strange looks and questions wearing the unique hat through the weekend.
