Overly Complicated Excel: Rate of the (Rate of the (Unemployment Rate))

This is the second of (possibly) several pieces I'll do on the unemployment rate. If you think I missed something important, by all means comment on it, but there is a chance it just hasn't come up yet.

A Little Bit of Math (and Philosophy)

A train leaves Chicago at 6 PM heading to LA (2000 miles away) at 60 miles per hour. A man, starting 1 hour later in LA, walks at 3 miles per hour towards Chicago along the train tracks. How far from Chicago do they meet?

Trick question. Nobody walks in LA.

Rates are funny things. We tend to think of them as units per time (miles/hour, beats/min, dollars/year), but really they are just units per different unit. (Technically they could be units per same unit but that would be quite boring.) So any fraction is a rate. The ratio 0.77 euro / dollar is an exchange rate. Similarly, any percent is a rate. Cent is 100 of something, so per-cent is 1/100 or .01 of something. You could say, "I pay 31% of my income in taxes," and I would say:

You need a better tax attorney.
Your Tax = 31 * 1/100 * Your Income

Because I could stick the equals sign in there to make this an equation, we can harness the power of algebra and calculus to affect both sides. We start with:

Your Tax = 31/100 * Your Income

If we add money to one side of the equation (you earn another $100 of Your Income), algebra says we must increase the other side to balance it (by increasing Your Tax by $31):

(Your Tax + 31) = 31/100 * (Your Income + 100)

Calculus says that we can do this for any size change in any variable:

(Your Tax) + (Change in Your Tax) = 31/100 * ((Your Income) + (Change in Your Income))

Which simplifies to:

(Change in Your Tax) = 31/100 * (Change in Your Income)

Often written:

d(Your Tax) = 31/100 * d(Your Income)

or 31% = d(Your Tax)/d(Your Income)
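If you would rather check that than take my word for it, here is a minimal sketch in Python (not Excel, and not part of my spreadsheet). The function names and the incomes are made up for illustration; the only thing taken from the example is the flat 31% rate.

```python
from fractions import Fraction

TAX_RATE = Fraction(31, 100)  # the 31% from the example above


def tax(income):
    """Tax owed on a given income under a flat 31% rate."""
    return TAX_RATE * income


def change_ratio(income, change):
    """(Change in Your Tax) / (Change in Your Income) for one change in income."""
    return (tax(income + change) - tax(income)) / change


# Hypothetical incomes and changes, chosen only for illustration.
ratios = [
    change_ratio(income, change)
    for income in (20_000, 50_000, 90_000)
    for change in (1, 100, 5_000)
]
```

Every entry in `ratios` comes out as exactly 31/100, no matter the starting income or the size of the change. Exact fractions are used instead of decimals only so that rounding doesn't muddy the comparison.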

Don't be afraid of the d() terminology. It's just an easy way of saying "small change in," and the ratio of two such small changes is called a derivative. The equation 31% = d(Your Tax)/d(Your Income), read out loud, would sound like:

"Change in Your Tax per Change in Your Income is thirty-one percent"

And because we already agreed that fractions, ratios and percents were rates you could also say:

"Your Tax rate is 31% of Your Income"

The Unemployment Rate

So since we agreed that any percent or fraction is indeed a rate... (I assume we agreed. You must have agreed implicitly. I assume you would have said something if you didn't. Ok, you agree.) ... the unemployment rate (reported as "U3") is a real rate. It is defined by the form:

Unemployed People = U3 * Workforce

Where Unemployed People are strictly defined as 16 or older and actively searching for work. The Workforce is defined as Unemployed People + Employed People. Of course we (you implicitly) just agreed that we can do this too:

U3 = d(Unemployed People)/d(Workforce)
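As a minimal sketch of that definition (Python rather than Excel, with a hypothetical function name and made-up head counts, not BLS figures):

```python
def unemployment_rate(unemployed, employed):
    """U3 as a fraction: Unemployed People divided by the Workforce."""
    workforce = unemployed + employed  # Workforce = Unemployed + Employed
    if workforce == 0:
        raise ValueError("the workforce is empty, so the rate is undefined")
    return unemployed / workforce


# Made-up head counts: a workforce of 10 and a workforce of 20.
small_group = unemployment_rate(unemployed=1, employed=9)
large_group = unemployment_rate(unemployed=2, employed=18)
```

Both groups give the same 10%, and nothing in the function knows anything about time.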

I like the mental image of a giant game of duck-duck-goose. If U3 is 10%, and our circle of the 16-and-over crowd (for some reason tricked into playing duck-duck-goose by the little kids) is 10 people, the chooser runs around the circle and selects one person. If there are 20 people in the circle, they select two. It doesn't matter how long it takes them to choose the person; they can pat you on the head as fast or slow as they like. Time is not in the equation at all. It's not that kind of rate.

Until it is.

Historical Data

While thinking about our current unemployment rate's extremely linear trajectory, I was curious as to whether this was a common phenomenon. Maybe it was just this recovery? Maybe it was just this portion of this recovery? The Bureau of Labor Statistics (BLS) has the monthly unemployment rate from their survey going back to 1948. I threw this into Excel and plotted it. (I also added a bar to denote recessions based on dates from the National Bureau of Economic Research, though this doesn't play into the current post's theme.)

Oh, and you can track my work if you like.

What jumps out at you?

I was immediately drawn to the right side of the graph, with our recent unemployment jump and current BBQ recovery (low and slow). I also note that there's never really been a "golden decade" of low unemployment. It hit nice low levels in the early 50's, late 60's and late 90's, but has never been sustainably low. Maybe that is the first history lesson here: enjoy the good times, because they won't last. Our economy and job outlook have been consistently inconsistent.

So what else jumps out? Well, for one thing the graph kind of looks like a mountain range. And like mountains, the sides of many of the peaks differ in their steepness. The easier way to say this is that the slopes are different. A slope is a change in something with respect to a change in something else, which, you guessed it... is a rate.

Now what we are looking at is something (U3) that is changing over time. (My apologies if this is already obvious to you. I'm trying to get everyone up to speed together.) If there were an equation that modeled the following function, you could just use calculus to take the derivative:

(and there is an equation... it's just really complicated)

A much easier way to find slopes is to use the data we have (U3 for each given month), and assume that U3 moves in a straight line to the next data point (U3 for the next month), so the slope between the two is constant. Since there are no data points between them, for our purposes this is absolutely true. This makes it simple to calculate the slope as (Change in U3)/(Change in time) or if you like d's it is d(U3)/dt. If you are following along in my spreadsheet, Column A is Date (month) and B is U3. To this I add a Column C with the following equation (type the inside brackets portion):

(Cell C3) [=B3-B2]
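For anyone following along outside Excel, the same subtraction can be sketched in Python. This is only an illustration: the function name is hypothetical and the U3 values below are made up, not the BLS series.

```python
def monthly_change(u3):
    """Month-over-month change in U3, like filling [=B3-B2] down Column C.

    The first month has no previous month to subtract, so its entry is
    None, mirroring the empty cell at the top of Column C.
    """
    if not u3:
        return []
    return [None] + [curr - prev for prev, curr in zip(u3, u3[1:])]


# Made-up U3 values in percent, oldest first. NOT real BLS data.
u3_sample = [5.0, 5.2, 5.1, 5.1, 5.6]
changes = monthly_change(u3_sample)
```

Each entry of `changes` lines up with the month it ends on, so the list stays the same length as the input.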

This is of course assuming your data moves forward in time as your column value increases. It doesn't really matter if you flipped the data, just flip your equations too. You might notice I leave out time (Column A) from the equation. Excel counts dates in days, and if I already know that the difference I am calculating is over 1 month then all I need to do is complete the column (drag down or double click the right bottom corner) and I have a nicely defined Column C of d(U3)/month. A graph of it looks like this:

This is what I would call "noisy data." It has lots of random fluctuations, and when you get a data point like the one in January 1975 it is unclear whether it is meaningful, because the next month there was a much smaller increase in U3. What we need is a smoothing algorithm. The easiest way to smooth data is to take a simple moving average (SMA) of the trailing 12 months.

(Cell D13) [=AVERAGE(C2:C13)]
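Here is a rough Python equivalent of that trailing average, as a sketch only. The function name and `window` parameter are my own labels. One assumption to flag: this version leaves a position empty until it has a full 12 numbers behind it, whereas Excel's AVERAGE quietly skips empty cells, so the first filled-in value may not match the spreadsheet exactly.

```python
def trailing_sma(values, window=12):
    """Simple moving average over the trailing `window` entries.

    A position gets None unless all `window` entries ending at that
    position are real numbers (no None placeholders, no short windows).
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out
```

For example, `trailing_sma([1, 2, 3, 4], window=2)` gives `[None, 1.5, 2.5, 3.5]`: nothing for the first position, then the average of each adjacent pair.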

Complete the column and you have data that is much better behaved. There are clear trends that match up nicely with the original data. (Technically our SMA is 6 months behind though, which can be solved by plotting the data with an adjusted axis, shown below, or making your Cell D13 [=AVERAGE(C7:C18)]. For our purposes we are just looking at trends though.)

This looks much better! If you conditionally format the data for positive and negative (go to Format -> Conditional Formatting, and make two rules: [if cell is < 0, green] and [if cell is > 0, red]) you see long unbroken runs of each color. The average for this data [=AVERAGE(D2:D784)] is almost 0 (though not quite, because we are at higher U3 than when the data start). The standard deviation [=STDEV(D2:D784)] is about 0.1, which makes sense: about 95% of the data fall within the range of -0.2 to 0.2, or two standard deviations either side of the average.
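The same three summary numbers can be sketched in Python. The function name is hypothetical, and the inputs would be whatever smoothed series you computed. `statistics.stdev` is the sample standard deviation, which is the same estimator Excel's STDEV uses.

```python
import statistics


def summarize(rates):
    """Return (mean, sample standard deviation, share within 2 SD of the mean).

    None entries (months with no smoothed value yet) are skipped.
    Needs at least two real numbers, or statistics.stdev raises an error.
    """
    data = [r for r in rates if r is not None]
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    within = sum(1 for r in data if abs(r - mean) <= 2 * sd) / len(data)
    return mean, sd, within
```

The third number is the one to compare against the "about 95%" figure: it is the fraction of months whose value sits within two standard deviations of the average.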

What Does It Mean?

A couple notes on this graph. All the positive peaks are brief and sharp, while the negative peaks are mostly low and elongated (almost not peaks at all). Yes, there have been exceptions to this, but U3 has increased at a high rate (> 0.2% per month) about twice as often as it decreased at a high rate (< -0.2% per month). In the last 50 years we really have had only one deep valley (while we were recovering from a double dip recession in the early 80's). Most often what we see is a prolonged period of d(U3)/dt of about -0.05% per month.
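If you want to do that count yourself, here is a small Python sketch. The function name and the `threshold` parameter are my own labels; the 0.2 cutoff is the one used above.

```python
def count_extremes(rates, threshold=0.2):
    """Count months where the rate is above +threshold or below -threshold.

    Returns (fast_increases, fast_decreases). None entries are skipped.
    """
    data = [r for r in rates if r is not None]
    fast_increases = sum(1 for r in data if r > threshold)
    fast_decreases = sum(1 for r in data if r < -threshold)
    return fast_increases, fast_decreases
```

Feeding it the smoothed d(U3)/dt column gives the two counts to compare.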

Our most recent unemployment jump (2007-2008) and corresponding recovery (2009-now) fits this model exactly, and as I mentioned in a previous post, our current d(U3)/dt is about -0.06% per month. It is anyone's guess as to whether it will continue, but if you look at the chart, most of our history is a long slow grind towards full employment, punctuated by blips of uncontrolled unemployment spikes.

But Wait... There's More!

We have already found a value for the change in U3 with respect to time, and plotted it on a graph... against time. We could take the derivative again using our linear-by-subtraction method. In this case we take our SMA smoothed data and find the difference per month (Cell E13 [=D13-D12] and complete the column). Here I show the full plot and a zoomed-in section where I only plot every 4th data point, along with the original U3 so you can see the correlation.

This data is already smoothed and still looks noisy, but the message I want you to take away from this is that the "change in the change" is almost always near zero. When a second derivative equals zero, the data series is a straight line (as opposed to curving up or down). This basically means that U3 is moving along not as a curvy line, but more like a series of straight lines. You can see this just by eye.
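To see why a zero "change in the change" means a straight line, here is a toy Python sketch. Both series are invented for illustration and have nothing to do with the BLS data; the function names are hypothetical.

```python
def diff(series):
    """Difference between each value and the one before it."""
    return [b - a for a, b in zip(series, series[1:])]


def second_difference(series):
    """The "change in the change": difference the series twice."""
    return diff(diff(series))


# A straight line that falls half a unit per step, and a curve (t squared).
straight = [10 - 0.5 * t for t in range(6)]
curved = [t * t for t in range(6)]

straight_accel = second_difference(straight)  # all zeros
curved_accel = second_difference(curved)      # constant, but not zero
```

The straight line loses the same amount every step, so its second difference is zero everywhere. The curve changes by a larger amount each step, so its second difference is not.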

The portions of that graph that least match up to the straight black lines are where the second derivative is non-zero. You might notice that the upswings in U3 have a worse fit (by eye) than the slow descent. We can show this in the data too. First we will separate our second derivative data into two columns.

Up-swings in U3 (Cell F12) [=IF(D12>0,E12," ")]

Down-swings in U3 (Cell G12) [=IF(D12<0,E12," ")]

The average of each column is near zero, but more importantly, the standard deviation of the up-swings [=stdev(F12:F786)] is higher (0.031) than the down-swings [=stdev(G12:G786)] (0.024). If unemployment is going up, it is more likely to be accelerating (followed by a deceleration, followed by a steadier descent).
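The same split can be sketched in Python. The function name is hypothetical; the logic mirrors the two IF formulas, so a month where the first derivative is exactly zero lands in neither group.

```python
import statistics


def split_by_direction(first_diff, second_diff):
    """Sort second-derivative values by the sign of the first derivative.

    Returns (up_swings, down_swings). Months where either value is
    missing (None) are skipped.
    """
    up_swings, down_swings = [], []
    for rate, accel in zip(first_diff, second_diff):
        if rate is None or accel is None:
            continue
        if rate > 0:
            up_swings.append(accel)
        elif rate < 0:
            down_swings.append(accel)
    return up_swings, down_swings


def spread(values):
    """Sample standard deviation, or None with fewer than two values."""
    return statistics.stdev(values) if len(values) >= 2 else None
```

Calling `spread` on each of the two lists gives the pair of numbers to compare.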

Take Home Message

Sixty-odd years of data seems like a lot, with nearly 800 data points to make our graphs, but really we are looking at only a handful of economic events. From the first derivative it looks like these events cause a rapid increase in U3, followed by a slow and steady decrease that has historically lasted two to eight years before a new challenge. For the large economic events such as recessions, you get brief second derivative blips to the positive then negative meaning the rate is no longer linear. When the graph does this it is the signal that the U3 rate is about to change very significantly. Of course with the smoothing in this model you get that signal 6 months after it has happened. Maybe it is the signal to run?

--------

My humor may be low, but I'm trying to keep the quality of my posts high. That means updating only once a week, or whenever I get a really good idea. This, in combination with the imminent demise of Google Reader, has led people to ask if I could email them when the blog is updated. A compromise that I think will work well is a subscription to this Google Group (an email list). Send an email to that link (overly-complicated-excel+subscribe@googlegroups.com) and you will be added to the list. The blog will email that list whenever posts are published. If you want your money back you can always unsubscribe the same way.

Monday, April 29, 2013
