Thursday, November 12, 2015

A Use for Square Root

Earlier this week I volunteered to come up with an algorithm to spread test data across a number of days. We had run a test for about an hour and then needed to make the data look as if it had come in over a series of days instead of all on the same day. "Well, that's easy," I thought, "just subtract a random number of days from the entry date." It turned out that wasn't going to work, because we needed to show a change in the amount of traffic. Statistically, a uniform random number would put roughly the same number of entries on each day. I quickly adjusted my algorithm to use the square root of a random number: I subtract the integer part of the square root from the existing date.

So how does that work? It is actually quite simple. If you generate random integers from 0 to 35 and keep only the integer part of their square roots, you get the following:
  • sqrt(0)  to sqrt(0)  = 0 with  1 chance
  • sqrt(1)  to sqrt(3)  = 1 with  3 chances
  • sqrt(4)  to sqrt(8)  = 2 with  5 chances
  • sqrt(9)  to sqrt(15) = 3 with  7 chances
  • sqrt(16) to sqrt(24) = 4 with  9 chances
  • sqrt(25) to sqrt(35) = 5 with 11 chances
As you can see, there is a higher chance that the random number will produce a 5 (11 chances out of 36) than a 1 (3 chances out of 36). I created the following SQL statement for a MySQL database:

UPDATE my_table
  SET date_column = ADDDATE(date_column,
      -- FLOOR(RAND() * 36) is a uniform random integer from 0 to 35
      INTERVAL -1 * FLOOR(SQRT(FLOOR(RAND() * 36))) DAY);
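The chances listed earlier can be double-checked outside the database. The following is only a rough Python sketch, not the code we actually ran: the names `day_offset` and `backdate`, the seed, and the 10,000 simulated entries are all made up for illustration.

```python
import math
import random
from collections import Counter
from datetime import date, timedelta

MAX_N = 35  # largest random integer, matching the 0-to-35 example


def day_offset(rng=random):
    """Integer part of the square root of a uniform random integer in 0..MAX_N."""
    return math.isqrt(rng.randint(0, MAX_N))


def backdate(entry_date, rng=random):
    """Move entry_date back by 0 to 5 days, favoring the larger offsets."""
    return entry_date - timedelta(days=day_offset(rng))


# Exact number of inputs that map to each offset (no randomness involved).
exact_counts = Counter(math.isqrt(n) for n in range(MAX_N + 1))

# Simulated spread of 10,000 hypothetical entries (seed chosen arbitrarily).
rng = random.Random(2015)
simulated = Counter(day_offset(rng) for _ in range(10_000))

# Columns: offset in days, exact chances out of 36, simulated entry count.
for offset in sorted(exact_counts):
    print(offset, exact_counts[offset], simulated[offset])
```

The exact counts come out as the odd numbers 1, 3, 5, 7, 9 and 11, because the integers from k*k up to (k+1)*(k+1) - 1 all share the integer square root k, and there are 2k + 1 of them.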

We ran the UPDATE statement and it behaved exactly as we expected: more entries landed five days back than one day back. Other than using the Pythagorean theorem to calculate the length of a diagonal line, this is the first time I have ever had a valid use for the square root function. I guess I can't wear my One-more-day-that-I-didn't-use-Algebra shirt.
