Bias of many types…and a walk

Today’s workout: end of “leisure” workout. I did my 8.1 cornstalk course in 2 hours (some rain…I didn’t get that wet) and then 2 more miles on the treadmill: 12:00/11:20 to get 23:20. I wanted to do at least a little faster than marathon pace.

RIP: BKS Iyengar, famous yogi and author of Light on Yoga.

Here he is in 1977 when he was in his late 50’s. What flexibility, strength, and body control!

Survivorship bias: this is the annoying tendency to see, say, a dozen successful companies, see what they have in common, and then conclude that what they have in common is what made them successful. Nope; you have to see how many companies did those same things and WERE NOT successful, among other things. From the article:

This is what Pomona College economist Gary Smith calls the “survivor bias,” which he highlights as one of many statistically related cognitive biases in his deeply insightful book Standard Deviations (Overlook, 2014). Smith illustrates the effect with a playing card hand of three of clubs, eight of clubs, eight of diamonds, queen of hearts and ace of spades. The odds of that particular configuration are about three million to one, but Smith says, “After I look at the cards, the probability of having these five cards is 1, not 1 in 3 million.” [...]

Smith found a similar problem with the 1982 book In Search of Excellence (more than three million copies sold), in which Tom Peters and Robert Waterman identified eight common attributes of 43 “excellent” companies. Since then, Smith points out, of the 35 companies with publicly traded stocks, 20 have done worse than the market average.
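A rough way to see the effect is to simulate it. The sketch below is only an illustration and is not taken from Smith’s book: it assumes that company performance in two periods is pure, independent luck (the function name and the pool of 1,000 companies are made up for the example), picks the 43 best performers from the first period, and counts how many of them fall below the market average in the second period.

```python
import random


def survivor_followup(n_companies=1000, n_top=43, seed=0):
    """Pick period-1 winners, then count how many lag the market in period 2."""
    rng = random.Random(seed)
    # Assumption: performance in each period is independent random luck.
    period1 = [rng.gauss(0, 1) for _ in range(n_companies)]
    period2 = [rng.gauss(0, 1) for _ in range(n_companies)]
    # The "excellent" companies are simply the top performers in period 1.
    ranked = sorted(range(n_companies), key=lambda i: period1[i], reverse=True)
    top = ranked[:n_top]
    # Market average in period 2, taken over every company in the pool.
    market_average = sum(period2) / n_companies
    worse = sum(1 for i in top if period2[i] < market_average)
    return worse, n_top


worse, n_top = survivor_followup()
print(f"{worse} of {n_top} past winners did worse than the market average")
```

With luck alone, roughly half of the past winners should end up below average later, which is about what the 20 out of 35 figure looks like.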

Depression: I talked about depression in an earlier post. Here is some of what science knows about it right now:


See the subtle racism here? The idea is that this black Attorney General who has spoken out about race relations is somehow too “emotionally invested” or biased to be even handed. Why would a black Attorney General be any less evenhanded than a white one? And shouldn’t we be far more concerned with an Attorney General who did NOT see race relations as a problem?

Here: Kansas City police officer posts a snarky post about Michael Brown’s character (the dead teenager in Ferguson) and shows a photo of a young black man with a gun and money in his mouth. But this black man is some guy in Oregon…not Michael Brown. It is amusing that police officers everywhere are telling us not to rush to judgment but… :-)

I suppose that given that we have 300+ million people in this country and a lot of police officers, a few are bound to be crackpots.

Racism in sports
Sadly, some African American athletes have racist stuff directed at them. Here is an example (Eddie Chambers, an elite boxer)

Politics: emotional issues rob us of abstract reasoning ability…

Good Vox article here. Moral (for me): mathematical and statistical reasoning really disciplines our thinking, BUT does not convince non-technical people.

This is one reason discussing issues with people outside of math, science and engineering departments is so difficult for me.

Early 538 Senate forecast and some dissent on the concept

Sadly, I have to agree with Nate Silver’s Senate election forecast: the Democrats are slight underdogs to keep the Senate at 50-50. We have too many seats in red states up for election.

However, some are taking shots at Mr. Silver’s website (NOT the election forecast):

Timothy Egan joins the chorus of those dismayed by Nate Silver’s new FiveThirtyEight. I’m sorry, but I have to agree: so far it looks like something between a disappointment and a disaster.

But I’d argue that many of the critics are getting the problem wrong. It’s not the reliance on data; numbers can be good, and can even be revelatory. But data never tell a story on their own. They need to be viewed through the lens of some kind of model, and it’s very important to do your best to get a good model. And that usually means turning to experts in whatever field you’re addressing.

Yes: knowing how to crunch data does NOT replace knowing a field. Of course, this next Krugman comment is epic:

Unfortunately, Silver seems to have taken the wrong lesson from his election-forecasting success. In that case, he pitted his statistical approach against campaign-narrative pundits, who turned out to know approximately nothing. What he seems to have concluded is that there are no experts anywhere, that a smart data analyst can and should ignore all that.

This made me laugh. Sure, I see the pundits as being mostly, well…entertainers. But I am not sure that Mr. Silver is competing with experts; rather, he is trying to introduce some data into journalism.

I don’t think that his target audience is the readers of Scientific American.

Five Thirty Eight Starts up again….the discussion

I admit that while I haven’t read every new 538 article, I’ve enjoyed most of what I’ve read.

What I see here isn’t really new, but it is well put together. For example: this article correctly asserts that the increase in storm/disaster damage really isn’t due to global warming (at least there is no evidence for it) but rather because there is more there to damage in the first place.

This article talks about the arguments about “potential GDP” versus “actual GDP”: is what we are now seeing “the new normal“?

I honestly think that articles that talk about the “subject being debated” have some value; it is nice to know what the argument is about. So far, I’ve liked what I’ve seen.

Nate Silver further explains what he has in mind for this new site:

FiveThirtyEight is a data journalism organization. Let me explain what we mean by that, and why we think the intersection of data and journalism is so important.

If you’re a casual reader of FiveThirtyEight, you may associate us with election forecasting, and in particular with the 2012 presidential election, when our election model “called” 50 out of 50 states right.

Certainly we had a good night. But this was and remains a tremendously overrated accomplishment. Other forecasters, using broadly similar methods, performed just as well or nearly as well, correctly predicting the outcome in 48 or 49 or 50 states. It wasn’t all that hard to figure out that President Obama, ahead in the overwhelming majority of nonpartisan polls in states such as Ohio, Pennsylvania, Nevada, Iowa and Wisconsin, was the favorite to win them, and was therefore the favorite to win the Electoral College.

Instead, our forecasts stood out in comparison to others in the mainstream media. Commentators as prestigious as George F. Will and Michael Barone predicted not just a Mitt Romney win, but a Romney sweep in most or all of the swing states. Meanwhile, some news reporters defaulted to characterizing the races as “toss-ups” when the evidence suggested otherwise.

So, why the need for data journalism? One important point:

Students who enter college with the intent to major in journalism or communications have above-average test scores in reading and writing, but below-average scores in mathematics. Furthermore, young people with strong math skills will normally have more alternatives to journalism when they embark upon their careers and may enter other fields.

This is problematic. The news media, as much as it’s been maligned, still plays a central role in disseminating knowledge. More than 80 percent of American adults spend at least some time with the news each day. (By comparison, about 25 percent of Americans of all ages are enrolled in educational programs.)

Meanwhile, almost everything from our sporting events to our love lives now leaves behind a data trail. Much of this data is available freely or cheaply. There is no lack of interest in exploring and exploiting it: Google searches for terms like “big data” and “data analytics” have grown at exponential rates, almost as quickly as the quantity of data itself has grown.

But this new website is not without critics.

For example: Paul Krugman has this to say about one of the economics articles:

I feel bad about picking on a young staffer, but I think this piece on corporate cash hoards — which is the site’s inaugural economic analysis — is a good example. The post tells us that the much-cited $2 trillion corporate cash hoard has been revised down by half a trillion dollars. That’s kind of interesting, I guess, although it’s striking that the post offers neither a link to the data nor a summary table of pre- and post-revision numbers; I’m supposed to know my way around these numbers, and I can’t figure out exactly which series they’re referring to. (Use FRED!)

More to the point, however, what does this downward revision tell us? We’re told that the “whole narrative” is gone; which narrative? Is the notion that profits are high, but investment remains low, no longer borne out by the data? (I’m pretty sure it’s still true.) What is the model that has been refuted?

“Neener neener, people have been citing a number that was wrong” is just not helpful. Tell me something meaningful! Tell me why the data matter!

Krugman goes further in another blog post:

Now, about FiveThirtyEight: I hope that Nate Silver understands what it actually means to be a fox. The fox, according to Archilocus, knows many things. But he does know these things — he doesn’t approach each topic as a blank slate, or imagine that there are general-purpose data-analysis tools that absolve him from any need to understand the particular subject he’s tackling. Even the most basic question — where are the data I need? — often takes a fair bit of expertise; I know my way around macro data and some (but not all) trade data, but I turn to real experts for guidance on health data, labor market data, and more.

What would be really bad is if this turns into a Freakonomics-type exercise, all contrarianism without any appreciation for the importance of actual expertise. And Michael Mann reminds me that Nate’s book already had some disturbing tendencies in that direction.

Yes: there is no substitute for knowing what you are talking about….and the problem with the “opinion pundits” is that, frequently, they don’t know what they are talking about.

On Being a New Republican: day one.

Well, Bruce Rauner won the Republican primary by 3 points over Dillard. Now the real race begins in earnest.

2014 will not be kind to us:


Of course, the above is sanitized; the poorer people won’t show up either. Give the Republicans credit: they vote more often, and that is a positive for them and a negative for us.

Metaphors and Mathematics

This illustrates a few mathematical concepts:

1. This is an example of a “projection map” in mathematics.

2. This shows how information is ALWAYS lost when one takes a projection. Something very similar happens when one does a statistical regression.
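To make the point about lost information concrete, here is a minimal sketch, assuming the simplest projection there is: dropping the z coordinate of a point in 3D. The function name and the two sample points are made up for the illustration.

```python
def project_to_plane(point):
    """Project a 3D point (x, y, z) onto the xy-plane by dropping z."""
    x, y, z = point
    return (x, y)


# Two different points that differ only in height.
a = (1.0, 2.0, 0.0)
b = (1.0, 2.0, 5.0)

shadow_a = project_to_plane(a)
shadow_b = project_to_plane(b)

# Both points cast the same "shadow", so the shadow alone
# cannot tell us which point it came from.
print(a != b, shadow_a == shadow_b)
```

Once the height is gone, nothing in the projected data can bring it back.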

These different data sets have almost identical regression coefficients:


(more here)
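A well-known illustration of this is Anscombe’s quartet. The sketch below assumes that this is the kind of example meant here (the figure itself is not reproduced in the text) and uses the first two of Anscombe’s four data sets: one is roughly linear with noise, the other is a smooth curve, yet an ordinary least squares line fit gives nearly the same slope and intercept for both. The helper name `fit_line` is made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# First two data sets of Anscombe's quartet (they share the same x values).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_noisy_line = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_smooth_curve = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

fit_a = fit_line(x, y_noisy_line)
fit_b = fit_line(x, y_smooth_curve)
print("noisy line:   slope %.3f, intercept %.3f" % fit_a)
print("smooth curve: slope %.3f, intercept %.3f" % fit_b)
```

Both fits come out at a slope of about 0.5 and an intercept of about 3, even though a plot of the two data sets looks nothing alike.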

Back to politics


Once again, all over the place: videos, denial, mammograms

Workout notes Treadmill: 6 mile run in 1:02:50. Started off at 11:0x mpm and did 2 minutes each in the following pattern: 0-.5-1-1.5-2 then 10:42 (same pattern) then 10:31 for most of the rest: 0-.5-1-1.5-2-2-1.5-1-.5-0 then 5 minutes each at 2-1.5-1 then I finished the rest at .5, increasing the pace each minute.

Then 2 miles (16 laps of lane 3) of walking in 29:37 (14:23 for the last mile).

What I’ve noticed: while my legs aren’t classically “dead”, it is almost as if someone sucked out my quad muscles with a straw. They are, well, not doing a thing.

Physical Stuff

Since we are talking gym: this “gym stereotype” clip is funny. I am the old man in the locker room; I suppose that comes from the fact that many of us don’t look at others…so what is the fuss? It just doesn’t register any more.

Now for some physical craziness. Yes, the law-and-order person in me wondered if these people had the proper permissions to do this. But, well, the video is rather incredible. Physically, these guys are much of what I am not.

Evidence-based medicine and science are hard. We create models and then go with our best educated guess…and sometimes it takes years to gather data. Here is a vast study about mammograms and their effectiveness:

One of the largest and most meticulous studies of mammography ever done, involving 90,000 women and lasting a quarter-century, has added powerful new doubts about the value of the screening test for women of any age.

It found that the death rates from breast cancer and from all causes were the same in women who got mammograms and those who did not. And the screening had harms: One in five cancers found with mammography and treated was not a threat to the woman’s health and did not need treatment such as chemotherapy, surgery or radiation.

The study, published Tuesday in The British Medical Journal, is one of the few rigorous evaluations of mammograms conducted in the modern era of more effective breast cancer treatments. It randomly assigned Canadian women to have regular mammograms and breast exams by trained nurses or to have breast exams alone.

Researchers sought to determine whether there was any advantage to finding breast cancers when they were too small to feel. The answer is no, the researchers report.

Unfortunately, this study will probably be pilloried by those whose lives were saved, so they think, by mammograms. Remember: this is NOT a study about regular breast exams; it is about mammograms, which are supposed to catch the cancer at the early stages.

So, someone who had a genuinely harmful cancer detected by a mammogram and was saved may well have been saved by a later detection via a conventional exam.

I suggest reading the whole article; much of the data that shows “x out of 1000 were saved by mammograms” came out before the newer drugs came out.

I don’t know what to think because this isn’t my field of expertise. But it is interesting, to say the least. I just hope that science and statistics determine the best policy and not emotion.

Now about statistics and onto politics: remember the morons and their “unskewed Presidential race polls”? Well, these people haven’t learned a thing; they are refusing to believe the current data about the Affordable Care Act.

I suppose that instead of breaking people down by “conservative/liberal”, we should break them down by “convinced by evidence/not convinced by evidence”.

Social Views: Did you know that people who won lotteries changed their economic views in the conservative direction? Now there are some caveats in this study (e. g. people who are likely to play a lottery might have a different mentality than those who don’t; and yes, the lottery really is a tax on those who can’t do math). But Paul Krugman has a ton of fun with this finding.

Stock market graph pattern

I saw an article which posted this graph:


Oh, the text admits that the scales aren’t the same (left versus right) though the time lines are. This is supposed to mean something?

Well, I took the liberty of looking at longer trends (1922 to 1930 and then 2007 to 2014) (using this tool)



I don’t see a whole lot of similarity.

Looking at 2 years (as in the “scary graph”):


Hmmm, I suppose that this 1935 to 1937 could be made to fit too.

And one can look at the various cycles, this time scaled in percentages:


I suppose it is human to recognize patterns even where none exists.

Now, I am not a market expert; there might be other signs of an impending crash. But I kind of doubt that pattern fitting is a legitimate sign.
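Here is a small sketch of why a good-looking match proves little. It assumes nothing about real markets: every series below is a pure random walk (made-up data, hypothetical function names). One walk plays the role of the “current” chart, and the code searches 200 unrelated walks for the one that correlates best with it.

```python
import random


def random_walk(rng, n_steps):
    """Cumulative sum of independent Gaussian steps."""
    level = 0.0
    path = []
    for _ in range(n_steps):
        level += rng.gauss(0, 1)
        path.append(level)
    return path


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_lookalike(n_candidates=200, n_steps=100, seed=1):
    """Highest correlation between one random walk and many unrelated ones."""
    rng = random.Random(seed)
    target = random_walk(rng, n_steps)
    candidates = [random_walk(rng, n_steps) for _ in range(n_candidates)]
    return max(pearson(target, c) for c in candidates)


best = best_lookalike()
print(f"best correlation among unrelated random walks: {best:.2f}")
```

If you get to pick the comparison window after the fact, some stretch of history will usually line up nicely, whether or not it means anything.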

Contempt for elementary education and other topics

Workout notes
Shorter weight workout followed by a cold 4 mile road walk (Bradley Park hill course). It was cold (15 F, or -9 C), somewhat breezy and sunny; there were isolated 50 to 100 meter stretches that were completely “frozen snow/ice” covered. But I wanted to get outside a bit.

The weight workout was a bit different today: part of the rotator cuff (dumbbells), hip hikes, Achilles:
pull ups: 15, 15, 10, 10 (good)
super set with dumbbells: 3 sets each of:
seated military (sets of 12 with 50’s)
upright rows (sets of 10 with 25’s)
bench presses (sets of 10 with 70’s)
bent over rows (sets of 10 with 65’s)
curls: (sets of 10 with 30’s)

Then an ab super set; 3 sets of 10 with crunch, twist, sit back, vertical crunch.

Then came the outdoor walk.

Posts of the day
The NSA sometimes put tracking/control devices in computers that were going overseas; hence they could easily spy on or manipulate computer activity.

Fun with statistics:
Of course correlation and causation are not the same. Then again, sometimes there are good reasons for a non-causal correlation (e. g. my time to run the mile slowing down with years of marriage or the years that Obama has been in office) and sometimes the correlation is simply spurious. Here is a “fun” collection of them.

Oh yes, sometimes there is a valid correlation but the cause and effect are reversed: for example basketball players tend to be tall. So, your “how to get taller” program involves getting your client to take up basketball. :-)
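As a toy version of the mile time example, here is a sketch with made-up numbers (the mile times below are invented for illustration, not real data). Both quantities simply drift upward as the years go by, so they correlate strongly even though neither causes the other.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: years of marriage and mile time in minutes.
# The shared driver is the passage of time (aging), not the marriage.
years_married = list(range(10))
mile_minutes = [7.0, 7.1, 7.3, 7.2, 7.5, 7.6, 7.6, 7.9, 8.0, 8.2]

r = pearson(years_married, mile_minutes)
print(f"correlation: {r:.2f}")
```

Any other quantity that rises steadily over the same years would correlate with the mile time just as well.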

Educational matters

Some time ago I remember seeing a poster outside of a student affairs office; I believe the poster had a picture of various women yelling at a man in the middle; one of the things being said by the females was “how we dress has nothing to do with sex.” Really? Check this out. This is about a sorority “twerk off”.

So, now we’ll hear stuff about “sexualization” and…oh yes, “slut shaming”. Seriously. :-)

My view: this twerking contest is young people being, well, young people. It is all part of the human mating ritual. It neither surprises nor outrages me. No, these women aren’t doing this for me or with people like me in mind; for me, “twerking” is, say, my wife bending over to get her pills out of the lower cabinets or bending over in the garden, etc.

Our society is too tense about these matters, IMHO. The only thing that I ask: if this is going to end up in “new kids”, make sure that you can SUPPORT those kids BEFORE having them, ok? I am not a conservative, but the old saying “you breed ‘em, you feed ‘em” makes sense to me.

And speaking of kids, they need to be educated too.
In the local paper, there has been a series of articles about cheating on standardized tests for “special needs” students. Here is one such article:

■ “Charter Oak staff violated ISAT testing protocol in providing inappropriate testing accommodations to special education students during the administration of the ISAT.” Teachers directed students to correct answers in a variety of ways, going as far as to erase answers themselves.

■ “All staff members interviewed reported they did not receive any formal training on ISAT administration on a yearly basis.”

So, they appear to be saying “I’m sorry we cheated, but we weren’t trained enough to know that changing the pupil’s answers or erasing their wrong answers was cheating.” You need to be TRAINED to know that is wrong?

Then there is this little gem from the St. Louis paper (about a month ago):

The proud parents who attended Lincoln Elementary’s honor roll assemblies years ago assumed the school was a shining example of academic achievement.
Kids by the dozens lined up to be celebrated for earning grades that put them on the honor roll.

Then the school in St. Charles got state test results.

Most of the students failed, casting doubt on the school’s success and challenging the validity of many of its students’ glowing report cards. Administrators knew they had a problem.

What they did next upended everything parents, teachers and students thought they knew about grading.

St. Charles joined a national movement that — sometimes amid a formidable backlash — is rebuilding how a child’s performance in a class or course is calculated.

It’s a switch that seeks to move away from rewarding students merely for completing work, and instead bases grades on mastery of a subject.

Swept away are points for finished homework assignments, or good behavior and class participation. Instead, grades are more heavily based on exam results and the quality of work.

Oh my goodness: you mean making a good grade in the subject should imply having some demonstrated ACTUAL KNOWLEDGE of the said subject???? Who knew?

But reading this was useful to me. Some time ago, a “business calculus” student came up to me in anguish. She showed me her homework paper with 0 points on it. She said “I did all this work here, and it was marked WRONG.” I said: “Yes, it was marked wrong because the ‘work’ was totally incorrect; there was no correct work here.” She gave me the “are you serious?” look; it was as if having to be correct to get credit was a new concept for her.

Maybe this is why?

So, none of this is flattering to our grade school educators or educational system. But….yes, I know, this isn’t ALL school districts; these aren’t ALL of the educators and yes, much of the blame might be put on what happens to the pupil BEFORE they get to school (at home) and on this as well:


So yes, I know that there are good, dedicated teachers and educators who are busting their rear ends to do something about it, and these people need good pay and our moral support.

Some halftime stats and science

Yes, I am blogging at halftime of the Oklahoma versus Alabama game. OU leads 31-17 but Alabama has the type of team that can overcome adversity…and I remember the Chick-Fil-A Peach Bowl where Duke led 38-17 at the half only to lose to the (ugh) Aggies.

The quality of this blog has suffered recently due to…well, increasing busyness. First it was the super busy semester and then it was vacation.

Hopefully, I can talk about a few things of substance this time.

Weather: yes, it is very cold in Illinois this “winter”. The jet stream has dipped and we are paying the price as the Jet Stream holds back the Arctic Air Mass.

Now of course, Republicans deny global warming…and now an increasing number are denying evolution:

There also are sizable differences by party affiliation in beliefs about evolution, and the gap between Republicans and Democrats has grown. In 2009, 54% of Republicans and 64% of Democrats said humans have evolved over time, a difference of 10 percentage points. Today, 43% of Republicans and 67% of Democrats say humans have evolved, a 24-point gap.

Paul Krugman says that this reflects increasing tribalism (“what does a good conservative believe?”) which, of course, has consequences in other public policy matters (e. g. macroeconomics). Hence Republican candidates have to be very careful not to present the unvarnished truth if they want to keep their base (e. g., Mitt Romney walking back his statements about cutting spending during a recession limiting growth).

Now, there is peril for liberals here too: this is one reason those of us who are scientifically literate must speak out for science, even when it goes against what many of our liberal political allies might think:

What this tells us is that elite opinions matter a lot in public discourse. The gap between liberals and non-liberals is not really there on this issue (GMO) at the grassroots. That could change, as people of various ideologies tend to follow elite cues. This is why the strong counter-attack from within the Left elite is probably going to be effective, as it signals that being against GMO is not the “liberal position.”

The same applies to woo-woo, “alternative medicine”, the irrational attacks against “fracking” (some attacks about it being improperly or inappropriately used ARE legitimate), etc.

I don’t want liberal leaning media to be at the point where it makes the reader more ignorant than before; here is an example of the Wall Street Journal doing exactly that (on income inequality).

Aging and time to failure curves
It is well known that as we age, the probability of dying in a given year goes up. In fact, the probability of dying in a given year doubles with every 8 years of life. Example: if you are married to someone who is 16 years older than you are, they are 4 times more likely to die in a given year than you are.
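The arithmetic behind that example is just repeated doubling. A minimal sketch, assuming the “doubles every 8 years” rule holds exactly (the function name is made up, and the rule is only an approximation that cannot hold once the probability gets close to 1):

```python
def relative_mortality(age_gap_years, doubling_time_years=8.0):
    """How many times higher the yearly death probability is for the
    older person, assuming it doubles every `doubling_time_years`."""
    return 2.0 ** (age_gap_years / doubling_time_years)


print(relative_mortality(8))   # one doubling
print(relative_mortality(16))  # two doublings
```

A 16 year gap is two doublings, which gives the factor of 4 above; a 24 year gap would give a factor of 8.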

This article discusses the various mechanisms of why this might be true; it makes for interesting reading.

The bottom line: the model of the attacks on the body being produced at a constant rate, but the body’s ability to fight those attacks being reduced at a linear rate DOES fit this model.

Now as far as the bathtub curve, the lead in to this reliability engineering blog post gives a nice introduction to it, though this article deals with how current reliability engineering deals with “burn in failures” and how “time to obsolescence” affects the curve.

Embarrassment on the Hike and Bike trail

The good news: I enjoyed my run. I did have to stop once to smooth the tongue in my shoe (it was bunching and putting pressure on my instep) and I had a rock.

The ok news: 45 F, and the 8 miles took me 1:20:05.

The bad news: this run was work…not a race effort, but work. And my goodness, I must have gotten passed scores of times. 15 years ago (or longer), I only got passed by “team” members. Then it was the fitter looking guys passing me. Then it was the fitter looking men and women. Then it was average looking men. Now: it is average looking men and women; it is as if I am running in place. And you see the trajectory…

But there is an interesting mathematical modeling thing going on. Imagine the pace of the runners being plotted on a normal density curve; faster paces to the right, slower to the left. Also remember that, given that people start at different places on the trail, run in different directions and start at different times, you’ll neither be passed by nor pass most of the runners out there. So, I’ll have to work on this and see how one might model this. But what I do know is that I get passed by runners far more than I pass other runners. Does this mean that my pace is slower than the mean pace? My first guess is yes, but I have to think about this one.
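One simple place to start, offered as a sketch under strong assumptions and not as a finished model: suppose the runners going my direction are spread evenly along a long trail and each holds a constant speed. Then I meet a given runner at a rate proportional to the difference between our speeds. The function name and the sample speeds below are made up, and the code works with speed (miles per hour) instead of pace.

```python
def pass_rates(my_speed, other_speeds):
    """Relative rates of being passed and of passing, per unit time.

    Assumes same-direction runners are evenly spread along a long trail
    and hold constant speeds, so each encounter rate is proportional to
    the speed difference.
    """
    n = len(other_speeds)
    passed_by = sum(v - my_speed for v in other_speeds if v > my_speed) / n
    passing = sum(my_speed - v for v in other_speeds if v < my_speed) / n
    return passed_by, passing


speeds_mph = [5.0, 6.0, 7.0]  # hypothetical speeds of the other runners

print(pass_rates(6.0, speeds_mph))  # running at the mean speed
print(pass_rates(5.5, speeds_mph))  # running slower than the mean
```

Under these assumptions the difference between the two rates works out to the mean speed of the other runners minus my own speed, so being passed more often than I pass would indeed mean that I am slower than the mean. Two caveats: the mean here is over the runners who are on the trail at a given moment, and the mean speed is not the same thing as the mean pace.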

December 28, 2013 | running, statistics


