Showing posts with label Analysis. Show all posts
Showing posts with label Analysis. Show all posts

Tuesday, September 23, 2025

Digging Into the Data

I spend my days working with data. Sometimes I do analysis and sometimes I am the guy the analysts go to when they need help solving a problem. While on vacation last week I had one of my coworkers ask me about some of the data in our system. I explained that I didn't really want to dive into the problem until I got back to work but would thoroughly investigate her question.

Yesterday I had way too many pressing tasks and so I couldn't look into the issue until today. This is one of those types of problems I love to solve and so I attacked it first thing this morning. I am going to simplify the problem for multiple reasons, not the least of which relates to my confidentiality agreement. The problem description is that I have a database with a CUSTOMER table without a date indicating when the customer entered the database. It is always nice to know how long someone has been a customer and the analyst wanted to know when that happened.

I figured there might be another table that might have that information and so I dug around a bit and found an EVENT table. I first looked at all possible event types and discovered about a half dozen. Unfortunately none of them indicated when someone became a customer. To help make sense of the EVENT table, I looked up all possible events for a specific customer and then sorted them by the event date column. A trend quickly appeared and it became obvious that all users would have a specific event once they became a customer. That event would occur frequently but all one needs to do is look at the earliest one and you have a rough approximation of when that person became a customer.

I sent my findings to the analyst that asked the question. About an hour later I received word that my technique works for only about a third of the customers. That is fine for some preliminary analysis but if we want to have beginning dates for each customer, we will have to go back to the group sending us the data and have them add an appropriate column.

I had hoped to solve the problem for all of the customers but fell short. I did find a preliminary workaround and that will allow us to test some ideas we have, which will then justify asking for the data from the source which will take a few days. Until then, the workaround will have to suffice. 

Wednesday, December 7, 2022

A Neat Trick in Excel: VLOOKUP

Lately I have been doing a lot of data analysis. One of the tools I have been using is Microsoft Excel and I recently learned a new trick using VLOOKUP. I thought I would put the information here in case I ever need it again. Oh and it may help you as well.

Most of the time I work with data, I use a relational database. That allows me to join various tables to replace cryptic values with more descriptive ones. A classic example is that of an EMPLOYEE table that uses a department number instead of the name. The database will then have a lookup table that converts the number into the name. This saves a certain amount of space and makes for much more efficient queries.

You can do the same thing in Excel using the VLOOKUP function. In my spreadsheet I have 2 tabs with the first being the employee information and the second being the department. Let's assume the employee tab has the following information:


A

B

C

1

Firstname

Lastname

Deptno

2

Matthew

Bennett

10

3

Mike

Smith

20

4

Sara

King

30

 Now let's assume that those department numbers translate to something much more readable like:


A

B

1

Deptno

Name

2

10

Research

3

20

Operations

4

30

Finance

You can use the VLOOKUP function in the 4th column of the table to include values from the department tab. It has the following syntax:

VLOOKUP(LookupCell, TabName!StartCell:EndCell, PasteColumn, Match)

So the LookupCell is that of the number that we want to convert to descriptive text. In the example above, we would use the column C values. 

The next values to fill in are that of the lookup table. While we could have them on the same page of the workbook, I find it easier to list them on a different one and call it the "department" tab. There are only 3 values and so we would start in column A on the second row. We would then run to B4.

The PasteColumn corresponds to the column number on the lookup tab that contains the value we want to show. Our descriptive text is in the 2nd column and so we would use the value "2".

The Match parameter indicates if we want to use an approximate or exact match. This example uses an exact match and so it should be set to "FALSE". I haven't tried doing an approximate match and so I am not sure of a use case for setting it to TRUE. Perhaps I can play with it for another blog entry.

Now let's put it all together. In cell D2, I would use the following value:

=VLOOKUP(C2, department!$A$2:$B$4, 2, FALSE)

I would then copy that formula and paste it into cells D3 and D4.

If those dollar signs confuse you, that is to hold the values constant regardless of what cell you paste the formula into. Otherwise Excel will adjust the reference. In the above example, the C2 reference doesn't have the dollar sign and will be converted into C3 when you paste it in the C3 cell.

This really helped me with some analysis I did and so I hope it helps you too.