Sunday, September 28, 2014

Cash or Charge (It's Still Miles per Hour)

In which we do things to units that we may regret the next morning.

Car Talk

So my friend Brian and I were talking the other day about his car.  This is not unusual.  It is, after all, a nice car.  Actually, it's a Tesla Model S, so beyond being nice it is also the type of car that people who enjoy modeling in Excel can talk about in terms of physics-y stuff.  One thing he mentioned is that he could get a much better charge from a 240V plug than a 120V plug.  The difference was something like 26 miles per hour.

Here is where it makes it a much better story if I lie and say we were in the car at the time driving 26 miles per hour.  We weren't, though.  It's too bad.  It would have made it an excellent story.

Anyways, I was confused.

"You can measure your car charge in miles per hour?" I asked dubiously.  He explained that the car has a similar efficiency curve to a gas car, where as you go faster you get worse "milage", but that the onboard computer could easily figure out your average driving style and with whatever charge you had on your battery give you an approximate number of miles you could travel.  Apparently you don't think of your car as half-full of charge, but rather that you have ~150 miles of charge left.  Well, the same type of thinking works when you charge the battery up.  You don't think about how you are filling it up overnight, but that you are adding ~200 miles of charge.  If you do that in 10 hours, you are charging at 20 miles per hour.

This obviously breaks down near the edges.  A battery that is almost fully charged will charge more slowly.  Driving very slowly and very fast both ruin efficiency, as do cold weather, open windows, AC, and (because it is not a combustion engine) even the heater.  Anyways, imagine all that stuff cancels out or doesn't matter.  What I get is a mental image of a car on a racetrack, tethered by an extension cord to the middle, and (unlike this car) continually able to keep going.  Kind of like this, but bigger.

According to the Tesla website the charging rates are as follows:
120V: 3 mi per charging hour
240V: 29 mi per charging hour
Supercharger: 170 mi in as little as 30 min (much slower thereafter)

Those are all totally drivable speeds (although both 3 mph and 340 mph might get tiring after more than a few minutes).  I'm sure there is some technical reason this couldn't be done, probably due to battery magic, err, physics, but a very reasonable way to view this is as an equitemporal pairing.  If you drove for one hour at 29 miles per hour you would have to charge one hour at 29 miles per hour to recoup the energy.

Power Ballad

As an aside, you can use this to figure out "How much energy does a Tesla use?"

The company states that you can use the 240V, 40 Amp outlet to charge at 29 miles per hour.  Some back-of-the-envelope calculations (240 volts x 40 amps = 9.6 kW, and 29 mph / 9.6 kW, or 29 miles / 9.6 kWh) and you can get about 3 miles per kWh.

As an aside from this aside, if a regular car gets 21 mpg and energy rates stay at about $0.12 per kWh, gas would need to be (21 mpg / 3 mi per kWh x $0.12 per kWh) about $0.84 per gallon to make the cost of driving equal.  Interestingly, that is about the (nominal) price of gas in the early 90's when GM pulled the plug on electric cars.

Back to our regularly scheduled aside... But wait! you might say, that is a lot of numbers.  How do I know I can trust your math?  Good point, hypothetical you.  We should come at this from a different direction that doesn't use volts x amps mumbo jumbo.  They very explicitly state that you can get in the range of 300 miles from their larger-sized battery.  This battery happens to be called the 85 kWh battery.  Assuming they didn't wildly misname it, you should be able to get (300 miles / 85 kWh = 3.5 miles / kWh).  So pretty close.
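
If you'd rather check my arithmetic outside of Excel, here is a minimal sketch in Python (the numbers are just the ones quoted above; the variable names are mine):

power_kw = 240 * 40 / 1000               # 240 V x 40 A outlet = 9.6 kW
mi_per_kwh = 29 / power_kw               # 29 mi per charging hour -> ~3.0 mi/kWh
mi_per_kwh_battery = 300 / 85            # ~300 mi on the 85 kWh pack -> ~3.5 mi/kWh
gas_breakeven = 21 / mi_per_kwh * 0.12   # 21 mpg at $0.12/kWh -> ~$0.84 per gallon
print(power_kw, mi_per_kwh, mi_per_kwh_battery, gas_breakeven)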

Sun Roof

So where was I going with that aside?  No clue.  Didn't even really need Excel.  Ah, but you read it anyways, so we should probably do something with that.  How about this: we've been assuming that you charge from a wall outlet, but what about solar panels?  Aren't they the future?  What would happen if you could collect all the sun's energy that radiates down on a certain area (conveniently about 1 kW per square meter) and used it to power a Tesla?  You would get about 3 miles of driving per m^2 of perfectly efficient solar panel per hour of sunlight.  That's right, if you put up perfect solar panels on the roof of your 1000 sq ft one-story house, your roof could conceivably be rated at 276 miles per hour (Tesla Model S miles per hour, to be exact).

Wow.  Brian might be able to make use of that, but I certainly can't drive that fast.  But here is an idea that might work: charge the car off of its parking spot.  A car is either driving or it is parked.  The more it drives, the more power it needs, and also the more its parking spot is vacant.  Can you get the power for driving just from the parking spot it vacated? (For this exercise, assume that most driving, and therefore most solar charging, happens during the day.) A parking spot is roughly (2 m x 5 m) 10 m^2, and could therefore be collecting 10 kW for every hour it is not covered by the car, which works out to 30 miles of range for each hour of charge.  If you never break 30 mph, you will always have enough solar panel charge when you get home to get your battery back to full.  (If you take this one step further and use the same 10 m^2 of solar panels, but give the car much less "getupandgo" such that it uses less energy and can go 500 miles, you'd pretty much have this.)  Improve the solar panels to 100%, stick them on the car itself, and you'd be driving around that racetrack at 15 mph indefinitely, without the tether to a power outlet.
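
Here is a minimal Python sketch of that solar math, assuming the idealized 1 kW per square meter and the ~3 mi/kWh figure from above:

M2_PER_SQFT = 0.0929                 # square feet to square meters
MI_PER_KWH = 3.0                     # from the aside above
roof_kw = 1000 * M2_PER_SQFT * 1.0   # 1000 sq ft of roof at 1 kW/m^2 -> ~93 kW
print(roof_kw * MI_PER_KWH)          # ~276-279 mph, depending on rounding
spot_kw = 2 * 5 * 1.0                # a 10 m^2 parking spot collects 10 kW
print(spot_kw * MI_PER_KWH)          # ~30 miles of range per uncovered hour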

So it is kind of quaint that you can measure both driving and charging in miles per hour, but can we extrapolate this to anything else?


Apologies, and Not For My Puns This Time

So, I guess I have some 'splaining to do with the whole 11-months-between-posts thing.  The thing is, I got a new job where I get paid to do science!  While this is great for many reasons, the one thing it shorts me on is free time.  I might try to fit in a few shorter blog entries and also space things out more evenly, but bear with me if I miss a few months.

So, with that out of the way, let's figure out a way to use mph to measure how much I get paid!

The main idea here is that there is a limited range of money someone could be paid for work.  (Let's say that this is $1-100,000,000 per year, or about 8 orders of magnitude.  More realistically people make between $20k and $200k, or just 1 order of magnitude.)  Depending on your diet, this will buy a certain number of calories.  If you subtract your basal metabolism (needed for survival), you get an excess number of calories.  What do we do with these calories?  We walk miles!  If you are paid hourly, then you are paid in possible walking miles per hour!

This is about to (as I found out while researching it) get very complicated, so let's go stepwise in building our model.  First we need our initial inputs.  The obvious one is wage.  And here we hit our first stumbling block.  If you work hourly, you probably know your hourly wage, but if you are salaried, it can be convoluted.  If you work 40 hours a week for a year (52 weeks) you end up putting in a nice-and-close-to-round 2080 hours, which lends itself to this rule of thumb: divide your salary (in thousands of $'s) by 2 and that is your hourly wage... almost.  It will always be (2080/2000) 4% too high.  If you take two weeks of vacation it is exact.  But many people don't work 40 hours a week, so for them I made this handy chart:

[Chart: hourly wage as a function of annual salary and hours worked per week]

While we are plugging in numbers, let's use the annual median income of $28,500 (as of 2012), which works out to $13.70 per hour.  Yay data!  Leave some space for labeling things, and add the value [13.70] to cell B2.  (For those of you new to the blog, things [in brackets] should be entered in Excel.)

How many calories is that?  The average American spends a little over $6k a year to consume 3800 calories a day.  I know.  That means calories cost (cell B3 [=6000/(365*3800)]) $0.0043 each, or about (cell B4 [=1/B3]) 231 calories per dollar.  (As a sanity check this value kind of makes sense.  One dollar of candy bar might have more calories, but one dollar of lettuce will have fewer.)  Of course this will change with individual diets, but this means that our caloric wage is (cell B5 [=B2*B4]) 3167 calories per hour.  An average American could probably survive on 1 hour of work a day to buy their food if all their other needs were provided for.  This fits with the statistic that the average American spends around 12% of their income on food (although this varies widely at the outer quintiles of the wage range).
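
The same estimate as a quick Python sketch (the $6k and 3800-calorie figures are the ones quoted above):

FOOD_PER_YEAR, CAL_PER_DAY = 6000, 3800             # average American figures
cal_per_dollar = 365 * CAL_PER_DAY / FOOD_PER_YEAR  # ~231 calories per dollar (cell B4)
print(13.70 * cal_per_dollar)                       # ~3167 calories per hour (cell B5)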

But how much of that is excess?  To figure that out we need to find out how many calories we burn in an hour of doing nothing (aka resting metabolic rate, or RMR).  This depends on several factors, is empirically determined, and is a different equation for male or female.  Enter the following:

Cell E2:  Gender
Cell E3:  Weight (lbs.)        Cell F3:  [=0.4536*E3] kg
Cell E4:  Height (inches)    Cell F4:  [=2.54*E4] cm
Cell E5:  Age

Male RMR (cell E6):  [=(F3*10+F4*6.25-5*E5+5)]
Female RMR (cell E7): [=(F3*10+F4*6.25-5*E5-161)]
My RMR (cell E8): [=IF(E2="Male",E6,E7)]
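
Those cell formulas are the Mifflin-St Jeor equation.  For anyone following along outside of Excel, a Python version might look like this (a sketch; the function name is mine):

def rmr_cal_per_day(gender, weight_lbs, height_in, age):
    # Mifflin-St Jeor equation, same as cells E6:E8
    kg, cm = weight_lbs * 0.4536, height_in * 2.54
    base = 10 * kg + 6.25 * cm - 5 * age
    return base + 5 if gender == "Male" else base - 161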

Hopefully I haven't lost you yet.  Basically, the equation says that the heavier, taller, and younger you are, the more calories your body burns to stay alive.  For some reason women also burn fewer calories even holding all the rest constant.  As an aside, oxidative metabolism produces radical species that are a component of aging.  Is this an aspect of why women live longer than men?  This is the type of coincidence I'd love to actually assign a correlation to... but not for this (already way too late in coming) post!

So if we throw in some numbers for a 32 y/o male of average American weight (195 lbs) and height (5'10" = 70 inches), his RMR is 1841 calories a day, or (cell F8 [=E8/24]) about 77 calories per hour.  His caloric wage accrual rate is actually now only (cell B6 [=B5-F8]) 3090 calories per hour.  This resting metabolism only takes up 2-3% of your wage!

Ok, one more aside.  Imagine you were a hunter-gatherer picking, oh, I don't know... blueberries (which have about 0.78 calories each).  You would need to consume 100 blueberries an hour just to satisfy your RMR.  That need would increase as you spent energy picking.  You would need 4872 blueberries a day to consume an average American diet of calories.  If you wanted to "make the median income" in calories, you would need to pick just over a berry per second.  As anyone who has gone berry picking knows, that is easy to do for a few minutes but very hard to sustain for an hour (let alone 8 hours!).  It kind of puts it into perspective that not only was it hard for a nomadic people to maintain and transport a surplus, but it probably wasn't practical to form one in the first place!

So back to the mph wage.  We now have two values of caloric wage that could be converted to walking miles.  The values aren't that different, but just different enough that we should probably try to choose the adjusted one.  After all, that would be the more overly complicated choice.  The biggest complication there is that your RMR is a constant need, but work is not constant.  If someone works one hour a week, they still have the RMR to fulfill for the entire week.  For the man working the median income wage ($13.70 an hour), the surplus calories from work will be a function of the amount of time worked.  It would look something like this:

[Chart: surplus caloric wage vs. hours worked per week]

To build in this adjustment, we need to tweak the wage one more time. For this we must know how many hours per week you work (cell B7) and don't work (cell B8 [=24*7-B7]), plus how many calories per hour you need to make up for the hours you aren't working (cell B9 [=B8*F8/B7]).  If we subtract that from the RMR wage, we get the full adjustment (cell B10 [=B6-B9]).  For our average guy, the numbers might be 40 hrs per week worked, 128 hours per week not worked, resulting in a 245 cal/hr deduction to make up for the unworked time.  His final caloric wage is 2845 cal/hr.
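
Here is the whole adjustment collapsed into one hypothetical Python function (the ~231 calories per dollar default comes from the earlier estimate):

def adjusted_caloric_wage(hourly_wage, rmr_per_day, hours_worked=40,
                          cal_per_dollar=365 * 3800 / 6000):
    rmr_per_hour = rmr_per_day / 24                            # cell F8
    working = hourly_wage * cal_per_dollar - rmr_per_hour      # cells B5:B6
    off_hours = 24 * 7 - hours_worked                          # cell B8
    return working - off_hours * rmr_per_hour / hours_worked   # cells B9:B10
print(adjusted_caloric_wage(13.70, 1841))                      # -> ~2845 cal/hr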

So let's use those calories!

Walka Walka Walka

It turns out that walking is complicated!  While running (on flat ground), the caloric cost is independent of pretty much everything except how long you do it and how heavy you are.  (You are basically jumping up and down repeatedly.)  Walking is different.  It still depends on how heavy you are, how long you do it, and the gradient, but it also depends on the speed of the walking... with linear, quadratic, and cubic components!  Walking very fast (> 5 mph) can even burn more calories than running.  It's hard to do, and even speed walkers rarely get to 10-minute miles.  Additionally, all the equations for caloric cost break down at the extremes of the velocity spectrum, so let's preempt those problems and ask people how fast they walk (cell E9) and use data validation to constrain it to 0.5 to 5 mph, where the equations are mostly linear with weight and time.

(Cell E10) [=0.6*E3] gross calories per mile

Our average 195 lb guy burns about 117 calories per mile.  This agrees with many of the online calculators out there, but this value is still describing gross calories burned.  For the net calories per mile, we will need to subtract out what he would have burned anyway in that amount of time (the hours per mile).  So when we account for this (cell E11 [=E10-F8/E9]), we find that his net calories per mile is quite variable.  If he walks 3 mph, it is reduced to 91 calories per mile.

[Chart: net calories per mile vs. walking speed]

Putting it All Together

So now that we have all the components, let's figure out what this average man's wage is in miles per hour.  With his physical characteristics, working a 40-hour workweek, and walking 3 mph, he gets paid (cell B12 [=B10/E11]) 31 mph.

[Chart: the assembled mph-wage calculator]

Cool!  How does it vary if he changes weight?  Losing 20 lbs could actually earn him roughly a 15% raise.
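
Pulling the Python sketches together (using the hypothetical rmr_cal_per_day and adjusted_caloric_wage helpers from above):

def mph_wage(hourly_wage, gender, weight_lbs, height_in, age, walk_mph=3):
    rmr = rmr_cal_per_day(gender, weight_lbs, height_in, age)
    net_cal_per_mile = 0.6 * weight_lbs - (rmr / 24) / walk_mph        # cells E10:E11
    return adjusted_caloric_wage(hourly_wage, rmr) / net_cal_per_mile  # cell B12
print(mph_wage(13.70, "Male", 195, 70, 32))   # -> ~31 mph
print(mph_wage(13.70, "Male", 175, 70, 32))   # -> ~35 mph after losing 20 lbs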

[Chart: mph wage vs. body weight]

And what about gender inequality?  If we swap in an average woman (5'4", 165 lbs) for our average man, her mph wage actually comes out ahead, so we make some steps in the right direction.  Other takeaways are that additional height has a very small positive impact on your wage and additional age has a very small negative one.  Walking very slowly does help, but you'd run out of time to do your walking!

[Chart: mph wage varying gender, height, age, and walking speed]

And of course, as always, you have to make wise food choices.

[Chart: mph wage for different food choices]

-----

Have an interesting idea you want modeled?  Didn't answer your burning question about a random topic?  Think I did my math wrong?  Let me know in the comments!   If you want to get email notifications when new posts go up, send an email to subscribe+overly-complicated-excel@googlegroups.com to subscribe to the mailing list. Also, you can click the link in the right hand sidebar.

Tuesday, October 15, 2013

I Feel the Need. The Need... For Speed

In which Tacoma tries to solve their traffic problems, and I join the chorus of people telling them to "Jump on it."

There and Back Again

Six hundred seventy-one miles is a long way to drive in a day. I know that seems pretty obvious, but it is the kind of drive that seems longer once you are an hour into it, when the "hundreds of miles left" digit hasn't changed. That kind of drive gets you to thinking. About speeding.

It was on a drive like this that I started thinking about the speed I was driving and what minor changes would mean. I know there are a lot of attitudes about speeding, based on who you are, where you are from, and how much control you feel over your machine. I think I fall into a group with a large number of people who are comfortable at 5-10 mph over the limit but uncomfortable in the 15-20 range. That's not to say I don't enjoy the occasional ride in my friend's Tesla. Or singing along when my pile shakes as I hit 80 on the open road. I do. I just don't think of driving that way most of the time.

But back to my long drive and thoughts on speed. Speed is funny.  We normally use units of miles per hour or meters per second or other various distance over time measures. But in practice this time-in-the-denominator thinking is made burdensome by one small truth. We don't say:

"I'll meet you in Tacoma. Its a half hour away, so I'll see you in 30 miles."

Correctly, we say:

"I'll meet you in Tacoma. It is 30 miles away, so I'll see you in a half hour."

We acknowledge the independence of distance and the dependence of time, yet give them opposite roles in our measurement. (And we wonder why people have trouble determining which train gets to Chicago first!) All this is made uniquely interesting by freeway speeds that are generally close to 60 miles / h, making for the super easy conversion of 1 min / mile. If you go from Seattle (I-5 exit 163) to Tacoma (I-5 exit 133), you don't need the internets to tell you that the 30 miles between those exits should take about 30 minutes (just to know that it is 2.5 h in current traffic). But when you diverge from that easily invertible speed, things get crazy. Twenty mph over (80) saves you 7.5 min, but 20 mph under (40) loses you 15 min. It's a kind of beautiful asymmetry in a hope-you-don't-get-stuck-in-Tacoma sort of way.
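
If you want to play with that asymmetry yourself, a tiny Python sketch does it:

def trip_min(miles, mph):
    return 60 * miles / mph
base = trip_min(30, 60)           # Seattle to Tacoma: 30 min at 60 mph
print(base - trip_min(30, 80))    # 20 over saves 7.5 min
print(trip_min(30, 40) - base)    # 20 under costs 15 min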


But Does It Make Sense?

So really, back to those six hundred seventy-one miles. At 65 MPH the inverted speed is 0.923 min / mile. Can I save any time by speeding? Is it going to kill me to go slower than the posted speed? The curve is going to look the same, but bigger.


The big prize here would be to cut 5 hours off of the trip by going 125 MPH the whole time. I'm not sure my car can get to 125, let alone maintain it if I hit a hill. So, second prize? What about saving nearly an hour and a half by going 10 MPH over the limit (which my friend from NJ says they aren't even allowed to pull you over for)? That seems doable!

But what does it get us? Depending on how much you value your time, maybe not much. That hour and a half could be made up by eating lunch in the car and skipping the stop.  It could be made up by leaving a bit early.  It could be made up by not leaving late.

And the downside case? Stuck in light traffic the whole time and losing 5 hours as my speed drops to 45 MPH. So to summarize, this long trip had a small achievable upside and a large achievable downside. What happens when the distance changes? Can we keep any of that time we saved by speeding, and perhaps fit in one extra activity before starting our trip? I know it's been a while, so let's dust off that old spreadsheet and start modeling!

First let us come up with a list of activities:

Time (min)    Activity
0.25              Tie your shoes.
0.33              Tie your shoes (double knotted).
1                   Clap your hands 802 times (expert).
2                   Find your wallet (expert).
5                   Clap your hands 802 times (novice).
10                 Find your wallet (novice).
          or        Clap your hands 802 times (beginner).
22                 Watch an episode of Colbert Report (no commercials).
30                 Watch an episode of Colbert Report (w/ commercials).
60                 Watch an episode of 60 Minutes.

Enter the times in cells A4:A12 of the new sheet. Convert to hours by making cell B4 [=A4/60] (entering the portion [in brackets]), and complete the column (either dragging down or double clicking the bottom right corner).

Don't Forget to Log Out!

Next we need to think about our output. I want to know if I can still make up time given various distances. But 671 miles is very different from 1 mile, so chances are I'll want to use a logarithmic axis to display the data. Since I realize that now, I can choose distances to make the graph look pretty. This can be generalized to any log graph, using the range and number of data points you want. Enter the range in A1. For me that will be [=1000-1] because I'll have the log axis go from 1 to 1000. (Don't try to make it go to zero or you will have problems.) Now decide how many data points you want and use that as the root, i.e., the denominator of the exponent. In my case I'll choose 16, so it looks like [=A1^(1/16)] = about 1.5. This is the factor that each distance will be increased by (basically counting in "one-point-fiveary" instead of decimal or binary). Make C1:S1 the numbers 1, 2, 3, 4...17, and make C2 [=1.5^C1], then complete the row. Copy and paste special "values" into C3:S3, then delete all that stuff you needed to get your values. In my case I have the values 1.5, 2.25, 3.375... 985.261253. These values aren't nice and round, but on the graph they will be perfectly spaced on the log axis.
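
The same log-spacing trick outside of Excel, for reference (a sketch):

span = 1000 - 1                                # log axis will run from 1 to 1000
factor = span ** (1 / 16)                      # ~1.54, which I round to 1.5
distances = [1.5 ** n for n in range(1, 18)]   # 1.5, 2.25, 3.375, ... ~985.26
print(factor, distances[-1])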

Type the reference speed in cell A1. Let's start with 65 MPH. Next, fill in cell C4 with [=-C$3/($B4-C$3/$A$1)], which is an algebraic way of solving for the speed at which you can make up that much time in that much distance (vs. your reference speed). Complete the column. Complete the row. (The $ should take care of keeping fixed values in place but I usually spot check. The form should still follow [= - distance / (time - distance / RefSpeed)] in all cells.) Immediately you might notice that the output is somewhat nonsensical. There are negative speeds in there! Those occur because of the physical impossibility of making up more time than the drive would take. You can't make up an hour on a 1 h drive, because you'd be going infinitely fast.  You can't make up 2 h either (hence the negative speed).
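
That cell formula is easier to stare at as a function (a sketch; I take the time saved in minutes, where the sheet stores hours):

def makeup_speed(miles, minutes_saved, ref_mph):
    # speed needed to cover `miles` while arriving `minutes_saved` minutes
    # early, relative to driving at ref_mph (same algebra as the cell formula)
    return -miles / (minutes_saved / 60 - miles / ref_mph)
print(makeup_speed(30, 7.5, 60))   # -> 80.0; a negative result means "impossible"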

One way to get this out of the way is to do an "if" gate, where a negative value returns a ridiculously fast speed, of, say 1000 MPH. This value will never appear on our graphs, but gives them the appearance of the infinite asymptote they are approaching. Copy cells B4:B12 and paste special "values" in B14. For cell C14 enter [=IF(C4<0,1000,C4)] and complete the column. Complete the row. Yay! Graph that!


Looks like at freeway speeds it is tough to make up time. Anything under 10 miles and you need some major speeding (the kind that can get you pulled over even in NJ) to make up just 2 minutes. Alternatively, drives over 100 miles lead to nice time gains from your speeding, such that you might even catch up on your TV watching.

Things get a bit more interesting on surface streets though. Bump the reference speed down to 30.


The lower speeds here make the curves look much sharper. Driving ten miles at our new reference speed would take about 20 minutes if we hit all green lights, but to shave even 5 minutes off that would require us to increase our speed by a third.  On the other hand, that same level of speeding can save an hour when you go over a hundred miles.

Actually, now that I mention it, stoplights are a much better reason to be late. Forget that double-knotting of your shoe... hitting just one stoplight could put you a minute or two behind schedule. I wanted to see how much time you might lose to stoplights, so I played around with modeling traffic light timing in Excel before realizing:

A) It's hard.
B) It depends on what kind of pattern you have, and
C) It's been done before.  Hundreds of thousands of times.

A much easier path is to make some assumptions:

Assumption 1. The number of lights you pass through increases linearly with distance you drive. (If you drive 10 miles you will pass through twice as many intersections as a 5 mile drive.)

Assumption 2. For every X lights you pass through, you will be stopped for N minutes. (I'll choose 4 intersections, stopping for 1 minute. I don't know which intersections, just total stop time.)

Assumption 3. The spacing for intersections is similar to that in Seattle. Every 5 blocks = major intersection = quarter mile.

This results in something quite interesting. There is no effect of speed on the amount of time you are stopped at lights. Of course this is a little simplistic, and does not account for running yellow lights (which would favor speeding) or timed lights for a given stretch (which would discourage speeding), but it lets us calculate a new ratio of "stoplight time / mile".  With the conditions I mentioned in assumption 2, that ratio would be 1 min/mile. Glancing back up at the 30 MPH chart, I can see that the speed needed to save 1 min on 1 mile (or 10 min on 10 miles, etc.) would be... 60 MPH?

Let me double check that. Ten miles at 30 MPH is 20 min. Adding 1 min/mile of stoplight time would bring the total to 30 min. If we tried to get back down to 20 min total trip time we'd subtract 10 min for sitting at stoplights and try to make the 10 miles in the remaining 10 minutes.  Yup, that's 60 MPH.
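
Or, reusing the hypothetical makeup_speed sketch from above:

print(makeup_speed(10, 10, 30))   # -> 60.0 mph to absorb 1 min/mile of stoplights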

And the sad thing is that I was trying to be conservative with my estimate of stoplight time. Downtown it could be much higher, and remember, this stoppage accrues per mile, so we can't just drive 1000 miles to dilute it (although after we leave the city the stoplight time/mile should drop significantly). And here's a spooky thought (my hedge against whether or not I get in a Halloween post): what if the stoplight time gets to 2 min/mile?


That time is unrecoverable. At a 30 MPH reference speed you already spend 2 minutes covering each mile, so if stoplights eat another 2 min/mile you would need to drive each mile in zero time just to break even.

Tuesday, July 16, 2013

Baby You Can Drive My Population Growth

In which we watch the baby-making process over time.  Um, no... in which we analyze how babies are brought into the world.  Nope, still creepy.  In which we stare at data.

Rollin' In Their Graves

The number of people alive today is over 1/10th of the number of people who have ever lived.

I'm not sure where I first heard that curious statistic, but it is exactly the kind of thing I like to repeat.  Is it true?  I mean, I know there are a lot of people in China and India, as well as a bunch of people in the US, Indonesia, Brazil, Pakistan, Nigeria, etc., but to compare that to all the people?  Ever?  In all 6,000, er, 30,000, er, 200,000 years of human existence on Earth?  Assuming the zombie apocalypse, would we really just need to destroy 9 zombies each?  That sounds quite doable!  (Side note, it feels really good to write a statistical blog and start a sentence with, "Assuming the zombie apocalypse".)



So our first task today is to see if this is true. If so, will our odds of surviving the zombie apocalypse get better or worse over time?

Oh, The Humanity

One problem we run into is the definition of what it means to have "all of humanity".  It means there was a beginning.  We separated from our closest surviving primate relatives (chimps) a few million years ago, but were still mingling and struggling against other hominids until much more recently. Our small populations at the time make the annual human contribution quite tiny, but the sheer variation in possible start dates magnifies the importance of the question. One solution is to use our pinch point. About 100,000 years ago the population of early humans was reduced to possibly as few as 2,000 individuals. After this, we were clearly our own species, so I will use 100,000 BCE as our start date and 2,000 as our starting population. Other data I pulled off the US Census website, where they have nice global estimates for 10,000 BCE through 1950, at which point they have precise annual estimates.

First let's model their data. I put date in Column A (using negatives for BCE) and population (in millions) in Column B.  The initial plot looks kind of exponential.


Replotting on a log scale reveals distinct phases.


You can see how big a role artificial fertilizers played. Since their introduction, the doubling period of global population growth has been 50-ish years. This is obviously not sustainable, but we will come back to that later.

But we don't want to know how many people there were in any given year. We want to know how many people there were total. To do this we will count the one thing everyone has had... a birth. Even if they only lived a few minutes or are still alive today, at some point they were born. (I hadn't thought about it until now but a significant portion of the zombie apocalypse will be children. It is sad, but it does improve our chances.)  The number of births in a given year can be calculated by the crude birth rate (CBR) which is total births per 1,000 people per year. This number can be found for modern eras and estimated for ancient ones. It ranges from 50-ish in cultures with high infant mortality, limited women's rights, and limited contraception to 10-ish in cultures with the opposite. Both extremes put pressure on society (to care for the young or the elderly, respectively), but if you are the leader of a culture you would probably prefer the latter. 

Anyways, put CBR in Column C and we'll estimate that pre-1950 it is about 37.2 (our earliest data point). If we average the population over our different time spans and multiply by years and CBR, we get the approximate number of people born during that time span. We make that equation by typing the [in brackets] portion.  Move everything down so data starts in Row 4 and we have room for calculations.

(Cell D4) [=0.002] Million People Initially
(Cell D5) [=(C5/1000)*AVERAGE(B4:B5)*(A5-A4)]

Complete the column. Column E will be how many people are born up to that point. And Column F will be the ratio of current population to people ever.

(Cell E4) [=sum(D$4:D4)]
(Cell F4) [=B4/E4*100]

Complete the columns.  Cool! We really are about 1/8th of the population ever.  Here are some representative data points.


Making the CBR of nomadic hunter-gatherer peoples 20 or 50 (which are both defensible positions without much real data) only changes this by about 3%.  Interestingly, back when I heard the statistic, it might have been 1/10th of the population.  The fraction of "All People" who are currently alive has been steadily growing.
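
The whole births model is a short loop outside of Excel, too.  This is only a sketch with three made-up stub rows; the real sheet has on the order of a hundred Census-derived rows, and needs them for the averaging to work:

rows = [(-100000, 0.002, 37.2),   # (year, population in millions, CBR): stub rows
        (-10000, 5.0, 37.2),
        (1950, 2557.6, 37.2)]
ever_born = rows[0][1]                        # start from the 2,000-person pinch point
for (y0, p0, _), (y1, p1, cbr) in zip(rows, rows[1:]):
    ever_born += (cbr / 1000) * (p0 + p1) / 2 * (y1 - y0)   # cell D5's formula
print(rows[-1][1] / ever_born * 100)          # percent of All People Ever now alive
# (With rows this coarse the answer comes out far too low; the linear averaging
#  needs the real sheet's many rows to track the exponential growth.)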

But what will happen in the future?

One Hundred... Billion... Dollars.  What? People?  That Doesn't Make Sense.

To answer that, let's first have a little aside.  If you thought the growth of the twentieth century was scary, the following graph should be downright terrifying.


Keeping with the 50-year doubling trend would mean that in 200 years we could be looking at 100 billion people.  Even discounting the Earth's ability to sustain such a population, each person would only have a third of an acre of Earth's land-surface area.  Unless we started building underground.  This is obviously silly, right?

It May Be A Growing Problem

At what point will global growth slow?  Let's look forward to the futuristic date of... 1970.  Yeah, this totally threw me for a loop too. I was taking the second derivative of population with respect to time, you know, like you do, and bam!  It is glaring. Here is an even easier way to look at it. Find the average annual growth rate for each period:

(Cell G5) [=EXP(LN(B5/B4)/(A5-A4))-1]

Complete the column and you see that for most of human history there has been low or no growth in population followed by furious activity in the last thousand (and especially hundred) years.
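
That cell formula is just the geometric average annual growth rate.  In Python terms (the 1950 and 1960 populations here are rough Census figures):

import math
def annual_growth(pop0, pop1, years):
    # geometric average annual growth rate, same as cell G5
    return math.exp(math.log(pop1 / pop0) / years) - 1
print(annual_growth(2557.6, 3043.0, 10))   # 1950 to 1960 -> roughly 1.8% per year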


Lets zoom in.


Correlation is not causation, but it certainly looks like artificial fertilizer helped increase the growth rate.  As for the decline, it could be many things.  China instituted its one-child policy, many economies slowed down, but importantly, lots of countries modernized their view about contraception and women's rights.  If we extrapolate the rate of growth, we eventually get down to zero.  At that point our population would stabilize at right around 10 billion people.


So that slowing is good news for the planet (also confirmed by the US Census and UN independent estimations), but bad news for our odds in the zombie apocalypse.  Even worse is that we start losing our edge even before the population stabilizes.

Undead Reckoning

Let's take a minute to think about what we really want to know and what we already know.  We know there is a ratio of "Current Population" to "All People Ever".  We know this ratio changes.  We want to know when that ratio is going to stop changing in the positive direction, and start going down.  In math or physics, we would call this "finding the maximum of a curve" and would accomplish this by defining the derivative of the ratio with respect to time and finding where it goes to zero.  That is where it stops changing.  If you dislike reading the maths, this might be a good time to skip to the graph.  Basically we are doing a related rates problem: writing P for "Current Population" and E for "All People Ever", the ratio P/E peaks when d(P/E)/dt = (P'E - PE')/E^2 = 0, which happens exactly when the two growth rates P'/P and E'/E cross.

We already have the rate of current population growth, but to bring this forward, I first need to tell you how I made the extrapolations.  Start by taking a derivative of population growth. This is just the change in population growth over a change in time.

(Cell H6) [=(G6-G5)/(A6-A5)]

Complete the column, and you have the increase or decrease in population growth per year.  This data is often very noisy, but recently it has been remarkably stable, with a value of around -0.03%.  The negative means that the rate of population growth is decreasing each year.  To extend it past 2013, I calculate the moving average of the previous 40 years:

(Cell H107) [=AVERAGE(H67:H106)]

When you complete the column (and add a hundred years to column A) you can extrapolate the growth rate:
(Cell G107) [=G106+H107]

And you can increase (or decrease) your future population and total of "All People Ever":

(Cell B107) [=B106*(1+G107)]
(Complete all other rows)
(Add in estimates of CBR for future years)
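
In Python terms, the extrapolation is one loop (a sketch; the list names mirror columns A, B, G, and H and are mine):

def extrapolate(year, pop, growth, delta, horizon=100):
    # year/pop/growth/delta are lists mirroring columns A, B, G, and H;
    # assumes at least 40 rows of history are already filled in
    for _ in range(horizon):
        delta.append(sum(delta[-40:]) / 40)     # cell H107: 40-year moving average
        growth.append(growth[-1] + delta[-1])   # cell G107: extend the growth rate
        pop.append(pop[-1] * (1 + growth[-1]))  # cell B107: grow the population
        year.append(year[-1] + 1)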

Now we are back to the graph I showed previously, so let's look at the growth rate for "All People Ever":
(Cell I5) [=EXP(LN(E5/E4)/(A5-A4))-1]

Plotting these two on the same graph demonstrates that the "Current Population" rate passes over the "All People Ever" rate well before the "Current Population" rate reaches zero (and our population stops growing).  


So our best chances are in the year 2059, when 15.21% of All People Ever will currently be alive.  This may seem counterintuitive since "Current Population" will still be growing, but one rationalization is that while "All People Ever" is just based on births, "Current Population" is based on births and deaths, meaning that "All People Ever" will start growing faster than "Current Population".  (And theoretically, integrating the area under the Current Population curve between the "maximum ratio" date and the "population stabilizes" date will yield the number of people who will die after the best zombie apocalypse date but before current population decreases.)

Boneheaded Oversight

Of course, as my wife mentioned to me when looking over my first draft, we shouldn't really be afraid of people who are dead for such a long time that they are just bones.  Those aren't real zombies.  Let's limit it to just people who might still be creepy looking.  What is the ratio of people alive today to people who died in the last 50 years?

(Cell J44:J206) = Crude Death Rate (CDR)
(Cell K44) [=(J44/1000)*AVERAGE(B43:B44)*(A44-A43)]
(Cell L93) [=B93/SUM(K44:K93)]

Complete the columns and graph.


Looking at it this way our odds are much better.  There are currently 2.86 people alive for each person dead in the last 50 years.  On the other hand, our advantage goes away even sooner, due to the stabilization of our CDR.  Here's hoping that the zombie apocalypse is right around the corner?

-----



Wednesday, July 10, 2013

Finding Your Rebalance

In which I use a lot of fancy math to tell you to buy low and sell high.


Aaaaaaand We're Back

So much for sticking to my once-a-week schedule, but thanks for coming back!  I've been thinking about today's post for a long time, but have been too busy to actually model anything until the last couple days.  While I've been teasing with several recent finance posts, this one comes closest to my actual philosophy.  That said...

I am not a financial planner.  You should do your own research and make your own financial decisions based on what is best for you.  Also, I highly recommend making a leveraged buyout offer for Dell Computers.  Everyone I know who has done it is now a billionaire.

Winning!

Anyways, one of the biggest problems with investing is that we don't know the future.  Sure, we can Buy and Hold, but unless someone from the future tells us what to Buy and Hold, we are stuck with our best guesses and a distinct lack of hoverboards.  So we diversify.  By that I mean we make lots of guesses and end up with an average yield.  Most people are of two minds on this.  Their rational risk-averse scared-mammal mind thinks, "Whew, glad I am safe!  And look at all these gains!" while their greedy reward-centered hungry-reptile mind thinks, "Wow, if I had just put a bigger bet on the winners I could have had so much more!"

Well, why don't people put a bigger bet on the winners?  Mainly because, well, most people only have so much money and they don't know the future.  They hedge their bets to mitigate the chance of betting on a loser.  Once you know it is a winner, it is already too late to "Buy Low".

So what if you follow the mantra of "Buy Low, Sell High"?  After all, that is what investing in the markets is all about.  Once your winners are winning, sell high and bet on some potential winners that may go either way.  Rebalancing your assets this way can be dangerous with individual stocks.  If you sell a winner and end up putting all your money on penny stocks that go to zero, your investment hasn't done much for you.  You want assets that are safe and won't go to zero but still show some growth.

One asset that fits the bill is an index fund.  These mutual funds (or exchange traded funds- ETFs) simply follow a basket of stocks (or more recently bonds, real estate, gold, or bitcoins) and reflect the value of the diverse set of underlying assets.  So this is today's challenge: apply the idea of rebalancing to index funds.

A few more notes on index funds.  The oldest ETFs have only been around since the 90's, and most are less than ten years old.  For this analysis I'll only look at the last 10 years of data, and even though I'd love to throw in some alternative assets, the only tradable ones that fit our needs cover things like the NASDAQ (QQQ) and S&P 500 (SPY).  They charge a small fee (~0.05%) to rebalance the stocks inside the fund (not always holding everything, but mimicking the results) and distribute the dividends, but for our purposes we will ignore the slightly different fees and distributions as well as the cost of trading them and tax implications.

Whew!  Let's do some modeling.

An Interesting Start (get it? compound growth? anyone?)

We will start with the NASDAQ and S&P 500 ETFs, which were both assets you could have invested in 10 years ago.  This puts our start date as July 2003, conveniently after the dot-com bust, but far enough back that we should be able to see some divergence.  NASDAQ (QQQ) consists of holdings that are concentrated in the tech sector.  Additionally, it is "market cap weighted" so the bigger companies comprise a larger portion of the assets.  Nearly 20% of it is just Apple and Microsoft, and another 30% is the next 8 largest companies.  Contrast that with S&P 500 (SPY) which consists of holdings spread across more industrial companies.  Yes, they still have Apple and Microsoft, but they also have Exxon, Wells Fargo, General Electric, and Johnson & Johnson.  They are also market cap weighted, but with so many more companies, the top ten don't even make up 20% of the index.  (As an aside, this market cap weighting also means that the ETF only needs to buy or sell stock components when it needs money, as an increase in the price of a company will automatically make it a larger component of the index.  When the ETF does sell assets, selling 0.1% of every company means selling more of the expensive stocks than the cheap ones... already initiating a form of rebalancing!)

So given the information above, which ETF would you have chosen as the best asset for your, say, $10k investment?  The right answer is... you don't know.  Or rather, you didn't know.   You can't predict the future nor should you.  Over that 10 years, the NASDAQ returned 8.2% annually and the S&P 500 returned 5.9%.  If you invested 50:50 in both, you returned just over 7%.


That difference between the two ETFs can be attributed mostly to the growth of two companies: Apple and Google.  While both indices had them, NASDAQ had more.

Oh, how did I make that graph?  You can get the daily close price for QQQ and SPY from July 7, 2003 to July 3, 2013.  This turns out to be over 2500 data points each, which tends to slow down Excel when I try to graph it.  I take the data, crop out the unnecessary bits and sort it in ascending order with:
Column A = Date
Column B = QQQ
Column C = SPY

I then make a Column D which is a series of ascending numbers a fixed number of rows apart.


You can sort by Column D (ascending), and effectively sample your data.  I used every 11 trading days for the graph, but you can easily do every month (21 trading days) or year (252 trading days).  Delete Column D and any excess data, as you won't need it any more.  I also insert three rows at the top to run calculations.  To get the value of each portfolio type the [In Brackets] portion:


Complete the columns and graph.  That 50:50 Buy and Hold portfolio is what we aim to beat.  Hopefully we get up to the yield of the NASDAQ, but without the risk of choosing the wrong asset.

When we rebalance, we are trying to get back to our initial distribution of assets because they have grown at different rates.  This only works if the two data sets are not perfectly correlated with each other.  We can check for this in Excel by looking at our full data set and probing with

Correlation Coefficient [=CORREL(B2:B2530,C2:C2530)] = 0.72

So SPY and QQQ clearly move together, but not in lockstep (strictly speaking, an r of 0.72 means movement in QQQ explains about half the variance in SPY, since r^2 is about 0.52).  This makes a lot of sense, as there are several underlying stocks in common between the indices, but not all of them.  Again, this is where some alternative assets would be nice, as they may have even lower correlations.  The other thing you need is for some force other than momentum to be pushing the prices.  It seems silly to add this caveat, but you can imagine that if a stock only went up because it went up the day before, you will never get any benefit from selling that stock to buy one that isn't doing as well.  I don't think this should be a problem.

Stock of the Month Club

As a first pass, let's sample our data monthly (see above) so that we have Date, QQQ, and SPY in Columns A:C with our 50:50 in Column D.  We can have a simple reporter above them that tells us the (geometric) average annual yield over ten years.

Annual Yield of QQQ [=(EXP(LN(B2533/B5)/10)-1)*100]
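
Or, outside of the spreadsheet (a sketch):

def annual_yield_pct(initial, final, years=10):
    # geometric average annual return in percent, same as the cell above
    return ((final / initial) ** (1 / years) - 1) * 100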

Basically you figure out how much growth you have had (final/initial value), spread it over 10 years, and use algebra to find the yield you would need to produce that growth.  For the actual "Monthly Rebalanced" portfolio we will need three columns (E:G).  Each month we will sell all shares, then buy back a balanced portfolio.  (In reality you would only sell some of the overweight shares to buy more of the underweight ones, but this is easier to model/explain and gives the same result.)  To do this, put 10000 in cell E5.  Then determine how many shares you will buy using:

(Cell F5) [=E5/2/B5]
(Cell G5) [=E5/2/C5]

We now have equal value in shares of the two ETFs.  Next month we need to sell our shares at that month's current price.
(Cell E6) [=F5*B6+G5*C6]

Complete the columns.  (Completing columns F and G down to row 6 first is the easiest way to do this without getting an error.)  Let's check out the results!
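
(While that recalculates, here is the same end-of-period rebalance as a Python sketch, where qqq and spy would be your lists of close prices:)

def rebalanced_value(qqq, spy, start=10000.0):
    value = start
    for i in range(1, len(qqq)):
        shares_q = value / 2 / qqq[i - 1]   # column F: half the money into QQQ
        shares_s = value / 2 / spy[i - 1]   # column G: half into SPY
        value = shares_q * qqq[i] + shares_s * spy[i]   # column E: mark to market
    return value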


Confused?  Yeah, I was too.  It just so happens that you get no benefit at all.  But this should work?!?  I'm supposed to make fun of other people's silly financial strategies, not disprove my own!

One way to wrap your head around this is that when we model the Monthly Rebalance method, we aren't doing it smartly.  We really don't know why we rebalance or even if it's necessary.  We could have a completely balanced set of assets that we sell and buy back in the exact same ratio.  Alternatively, we might be missing out on huge imbalances simply because it isn't time to change yet.  When it is time to change, the imbalances may have resolved themselves.  All this means is that the small amount of momentum inherent in the price swings can eat away at our inept rebalancing attempts.  There are two solutions to this.  We can rebalance smarter or more often.

If At First You Don't Succeed, Repeat As Necessary

Let's try more often!  Bring back the full set of data with date, QQQ (daily), and SPY (daily) in Columns A:C.  Make a 50:50 portfolio in Column D as well.  Now apply the same E:G equations and complete the columns.  This is modeling what would happen if we sold everything at the end of the day and immediately bought back a balanced portfolio.  Let's check the results... again!


Now that's what I'm talking about!  This time we were able to take advantage of all the imbalances brought on by the volatile market.  An extra 0.7% yield is nothing to sneeze at, and all it cost us was... wait (doing maths in head: cheap $5 trades, selling all of two ETFs, buying back two ETFs, each of 252 trading days a year... $5*(2+2)*252 =) a little over $5k a year... to handle our $10k portfolio.  Ouch.  Granted it would still be $5k for a $10mm portfolio (0.05%), but I don't have ten million dollars.

I Can't Believe It's Not Smart Balance

So what if we instead rebalanced smarter?  Only when we needed to.  Only when things were really out of whack.  First, let's define really out of whack.

Out of whack (Cell A1): [=0.05]

So "out of whack" currently is set at 5%.  We'll be changing this later on.  I'll just add this Smart Rebalance to our Daily Rebalance worksheet.  Start out the same way:

(Cell H5) [=10000]
(Cell I5) [=H5/2/B5]
(Cell J5) [=H5/2/C5]
(Cell H6) [=I5*B6+J5*C6]

So you start with the same shares.  Now we throw in some IF statements:

(Cell I6) [=IF(ABS((I5*B6-J5*C6)/H6)>A$1,H6/2/B6,I5)]
(Cell J6) [=IF(ABS((I5*B6-J5*C6)/H6)>A$1,H6/2/C6,J5)]

So now if the difference in our two assets is more than 5% of our portfolio, we rebalance to 50:50, otherwise we keep the shares we have.  Complete the columns.  And the results?
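
(Before peeking, the same logic as a Python sketch, with the "out of whack" threshold playing the role of cell A1:)

def smart_rebalanced_value(qqq, spy, threshold=0.05, start=10000.0):
    shares_q, shares_s = start / 2 / qqq[0], start / 2 / spy[0]
    value = start
    for pq, ps in zip(qqq[1:], spy[1:]):
        value = shares_q * pq + shares_s * ps
        if abs(shares_q * pq - shares_s * ps) / value > threshold:   # out of whack?
            shares_q, shares_s = value / 2 / pq, value / 2 / ps      # back to 50:50
    return value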


Not too bad!  How much did it cost us?  Well, we can make a quick reporter:

(Cell K5) [=SUM(K6:K2533)]
(Cell K6) [=IF(J6=J5," ",1)]

This returns 1 each time we rebalance, then adds them up.  For the 5% threshold, we end up with an average of 1.7 trades per year (doing math again... man I should really get a computer program that can do this instead $5*1.7*(2 sells + 2 buys) =) about $34 a year to manage our portfolio and squeeze another 0.15% out.  Again, this doesn't really help a $10k portfolio (since it only squeezes out $15), but right around a $23k portfolio it does pay off.  Additionally, many brokerage firms give you a few free trades a year or free trades on ETFs if you hold them for more than a month.  Also, if we really just sold our overperformer and bought the underperformer, that cuts our costs in half.  Anyways, you probably noticed by now that the Smart Rebalancing didn't quite get back to the Daily Rebalancing.  This changes if you change your stringency threshold.


As you can see, moving to the 2% range can squeeze out nearly 0.5% extra yield and still stay near one rebalance a month.  Unlike our previous Monthly Rebalance, these Smart Rebalances are distributed over the ten years as necessary.


Though you can't tell it just from this graph (as rebalancing can go either way), NASDAQ fared much better during the '08 collapse (once again, due to Apple and Google being more heavily weighted).  Benefiting from one fund's awesome stretch is what rebalancing does best.

Foreign vs. 'Merican

But as I mentioned before, NASDAQ and S&P 500 have a lot of correlation.  Let's look at some other ETFs that have ten years of data.  I won't bore you with the Excel details: basically I imported the data the same way and ran [=CORREL(B5:B2533,C5:C2533)] on each relationship.


As you can see, most index funds of stocks (name on the side, ETF ticker symbol on top) have significant correlation.  (As an aside, you can see why the only good use for the Dow Jones is that it predicts movement in the S&P 500.  If only we had an indicator for that... such as the S&P 500.) One nice low correlation pairing is the Consumer Discretionary Fund (XLY) which features stocks like Home Depot and Ford, with the Emerging Market Fund (EEM) featuring stocks like Samsung and China Mobile. 


As before, a 50:50 mix gives an average yield between the two individual ETFs that can be significantly improved by Daily Rebalancing.  Interestingly, it takes very little Smart Rebalancing to achieve similar results.  This may be because both these funds are more volatile than the ones we were looking at previously, but the end result is that you can set your threshold high and end up trading only once or twice a year.  This meant that when EEM reached new highs at the beginning of the year, a 10% threshold allowed you to sell (maybe missing the true peak) so that you would instead be invested in XLY when EEM came crashing down in June.

One, Two... Many

Realistically, it is unlikely that you are invested in just two ETFs that you have to balance.  In a way that makes things easier.  Now we are just looking for the huge imbalances where one component is, you guessed it, way out of whack.  Let's try it with all four ETFs.

The 50:50 portfolio and Daily Rebalance portfolio are pretty much the same.  Just change any "/2" to "/4" and you are basically there.  (I guess that technically makes it a 25:25:25:25 portfolio.)  For the Smart Rebalance I'm going to need to get a bit tricky.  Fill in Columns A:E with the date and the four different datasets.  Column F will be our portfolio size, and we will still make F5 [=10000], but instead of implicitly valuing each component, let's make Columns G:J be the value of our four assets.  Initially we can assign G5:J5 [=2500].  Now we need to determine how many shares we have to start and insert them in Columns K:N similar to before:

(Cell K5) [=G5/B5]
(Cell L5) [=H5/C5]
(Cell M5) [=I5/D5]
(Cell N5) [=J5/E5]

The value of the share the next day contributes to the total portfolio:

(Cell F6) [=sum(G6:J6)]
(Cell G6) [=K5*B6]
(Cell H6) [=L5*C6]
(Cell I6) [=M5*D6]
(Cell J6) [=N5*E6]

Now for the IF statement.  We will rebalance only if the difference between the most valuable and least valuable components is more than (A1= out of whack) percent of the portfolio:

(Cell K6) [=IF((MAX(G6:J6)-MIN(G6:J6))/(F6)>A$1,(F6/4/B6),K5)]
(Cell L6) [=IF((MAX(G6:J6)-MIN(G6:J6))/(F6)>A$1,(F6/4/C6),L5)]
(Cell M6) [=IF((MAX(G6:J6)-MIN(G6:J6))/(F6)>A$1,(F6/4/D6),M5)]
(Cell N6) [=IF((MAX(G6:J6)-MIN(G6:J6))/(F6)>A$1,(F6/4/E6),N5)]

Complete the columns!  Let's look at different thresholds:


Now we're talking!  With a 3% threshold, trading just 5 times a year you can beat three of the four components of your portfolio.  While you don't get the huge results of your best ETF, your transaction costs are very reasonable to squeeze out a 0.5% increase over a simple Buy and Hold.  In reality, you aren't going to have an alarm going off when your portfolio starts to go out of whack.  You aren't going to have this exact schedule:


Hopefully you just keep an eye on things.  Hopefully you aren't paying someone else 1% of your portfolio just so they can "beat the market" by 0.5% doing what you could do in 5 minutes every two months.  Over the last 10 years it seems like you could invest quite well just by diversifying and rebalancing.  Just remember, past events may or may not be indicative of future ones.
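
And if you would rather loop than drag cell formulas, the whole four-asset Smart Rebalance fits in a few lines of Python (a sketch; prices would be a list of rows of the four close prices):

def smart_rebalance_n(prices, threshold=0.03, start=10000.0):
    n = len(prices[0])
    shares = [start / n / p for p in prices[0]]    # equal slices to start
    for row in prices[1:]:
        values = [s * p for s, p in zip(shares, row)]
        total = sum(values)
        if (max(values) - min(values)) / total > threshold:   # way out of whack?
            shares = [total / n / p for p in row]             # rebalance evenly
    return sum(s * p for s, p in zip(shares, prices[-1]))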

-----

So apparently Blogger (read: Google Overlords) now allows people to actually subscribe to blogs!  It feels so 2008!  The google groups email list is still active, but if you want to subscribe to the blog now I'd suggest using the link on the right panel.