Thursday, March 10, 2011

Too Much Data

Yesterday I started a database report and let it run all day. When I was ready to leave work and the report was still not done, I let it run overnight. I figured if I came in this morning and the report was still running, I would try to figure out what the problem was and see if I could fix it.

I got into the office this morning and the report was still running. It was due to management this afternoon, and I had no faith it would complete in time, so I started looking into other ways of creating it. On the off chance my original report would eventually complete, I let it keep running.

Most modern databases are fairly smart, have efficient algorithms, and are pretty good at figuring out the best way of doing things, but they are not perfect. I was able to come up with another way of getting the information, and in the process discovered that an assumption I had made about the data was false. Instead of asking for a subset of the data from a single source, I was asking for all the data from multiple sources, joined together in a way that produced an overwhelming amount of data. Discovering this problem allowed me to finish the report and get it out on time.
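To give a feel for the difference, here is a minimal sketch using Python's built-in sqlite3 module. It is not my actual report: the tables (orders, shipments), the region column, and the row counts are all made up for illustration. It only compares how many rows the database has to produce when two sources are joined with nothing tying them together versus when a subset is taken from one source.

```python
import sqlite3

# Hypothetical in-memory tables; names and sizes are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
""")
conn.executemany(
    "INSERT INTO orders (id, region) VALUES (?, ?)",
    [(i, "east" if i % 2 else "west") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO shipments (id, order_id) VALUES (?, ?)",
    [(i, i) for i in range(1, 101)],
)

# All the data from multiple sources: with no join condition, every
# order is paired with every shipment.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM orders, shipments"
).fetchone()[0]

# A subset of the data from a single source.
subset_rows = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE region = 'east'"
).fetchone()[0]

print("joined rows:", joined_rows)
print("subset rows:", subset_rows)
conn.close()
```

With only 100 rows in each table the unconstrained join already produces 100 x 100 rows. With production-sized tables, the same mistake can easily keep a report running for a day or more.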

This whole experience underscored the importance of not making assumptions.
