Tuesday, March 31, 2009

The timelines of the Terminator

io9 has saved me the trouble of figuring out the many different timelines created by all of the jumping back in time in the Terminator movies and series. I took a crack at compiling when the various Judgement Days are in Postponing Judgement Day - Terminator style, with an obligatory graph.

The writers at io9 have figured on at least 10 different timelines developed from before the first Terminator movie through T2 and T3, the series and the upcoming Terminator movie.

Given the thesis of their arguments - that every jump back in time creates a new timeline - I would contend that there are really many, many more than ten timelines. The next question is how flexible history and the timelines are, and whether you believe in the Great Man Theory of history or a people's history approach. Do men make the times or do the times make the men? And how large do the changes made by the time travellers in the Terminator movies and series have to be to change the course of history?

Two sides to the current financial crisis political discussion

I can't say it any better than SMBC.

My favorite lines are:

"I believe in a constitutional republic with slightly more government intervention" - "Well, I believe in a constitutional republic with slightly less government intervention"

(via Saturday Morning Breakfast Cereal)

MEENIME license plate spotted in Delaware

This MEENIME license plate seems to be the obvious choice for this Maryland Mini Cooper driver. It appears that MINIME was already taken.

Perhaps this person is a meany me and this refers not to their choice of transportation but to their meanness.

Perhaps the person is attempting to make a choice between vehicles. There might be an EeniMe, a MynieMe and a MoMe to go with MeeniMe.

Myers-Briggs Personality announcement license plate

I am not sure why this 56-year-old INTP feels the need to announce their personality type on their license plate (INTP56). It is possible that I am completely wrong in the interpretation. Any other suggestions?

In the Myers-Briggs system INTP means Introverted Intuitive Thinking Perceiver. It seems contradictory that an introvert would want to announce that they are one in such a public forum as a license plate.

Friday, March 27, 2009

Another graphical analysis of something or other

It is really quite clear from the chart above that my conclusions are flawless and that I am very clever.

(maybe too many of this, this, these or this recently)

(via Saturday Morning Breakfast Cereal)

Wednesday, March 25, 2009

Hey Einstein? What do you know about Einstein?

Mental Floss has a quiz about Einstein. If you think you are so smart, see how well you can do.

I only got 70% for not knowing enough about his early life and believing urban legends. The average is 44% so I feel OK. One of my goals in school was to do just well enough to get an A, no better. I always did well on the tests where everybody had difficulty. The curve then put me in the A range.

March Madness statistical analysis does NOT guarantee good pool performance

After all of the statistical analysis from 24 years of the NCAA Basketball tournament, why is my pool doing so poorly? The pool type favors upsets, and the analysis says to pick upsets, but it doesn't say which to pick. To win, the bracket must contain the correct upsets. Mine, however, does not.

Potential reasons for this:
  • I don't know anything about college basketball. I admit to this as a first principle.
  • My technique picked every upset beyond a certain threshold of difference in the Sagarin ratings. This is probably too aggressive and resulted in two 6 seeds in the Final Four, which I should have corrected as this is too unlikely.
  • The average number of upsets is different from the fluctuations in upsets from year to year. Picking the average number is picking a certain outcome from the many different outcomes over the years. Average value is different from most likely value in a histogram.
For fluctuations vs. average it is instructive to look at the outcome of the round of 64. Over 24 years of tournaments the graph below (reproduced from an earlier post) shows the fraction of times a given matchup resulted in either the expected outcome or an upset where the lower seed wins.
As an example, 55 out of 96 historical matchups between 8 and 9 seed teams resulted in upsets where the 9 seed won. That is more than half the time. That does not mean that every year there are two upsets; it means that on average over all the years roughly half of the outcomes are upsets. This has implications for what we can expect each year.

Each year in the round of 64 a given matchup occurs 4 times, once for each region. A different presentation of the round of 64 data from the past 24 years shows that for each given matchup different years have different numbers of upsets. The possible range is from no upsets to four upsets. The chart tallies the number of years with each particular number of upsets for each matchup. Comparing this variability data to the outcome chart shows that while 9 seeds beat 8 seeds more than half the time, in any given year every possibility has occurred. In fact, in only 9 of 24 years (~38%) have there been exactly two upsets. It is the most common value, but still more likely to be wrong than right.
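As a rough cross-check on that variability, one can treat the four 8 vs 9 games in a year as independent coin flips with the historical upset rate of 55/96. The independence assumption is mine, not something taken from the data, and the function name is made up for this sketch; it only shows what spread such a simple model would imply.

```python
from math import comb

def upset_count_distribution(p_upset, games_per_year=4):
    """Binomial probability of k upsets in one year, for k = 0..games_per_year.

    Assumes every game is independent with the same upset probability,
    which is a simplification and not a claim about the real tournament.
    """
    n = games_per_year
    return [comb(n, k) * p_upset**k * (1 - p_upset)**(n - k) for k in range(n + 1)]

# Historical 8 vs 9 upset rate quoted in the post: 55 upsets in 96 games.
p_8_vs_9 = 55 / 96
distribution = upset_count_distribution(p_8_vs_9)

for k, prob in enumerate(distribution):
    print(f"{k} upsets: {prob:.1%}")
```

Under that assumption exactly two upsets is the single most likely count at roughly 36%, close to the 9 of 24 years observed, and still less likely than not.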

I am not sure how to present a similar analysis for the later rounds, since the number of opportunities is determined by the outcome of the round before. It is just as important in those rounds to realize that the average outcome over 24 years is not the same as the most likely outcome from those 24 years.

Perhaps a more systematic process that seeks to maximize the number of points even in the face of these uncertainties is needed.

Sunday, March 22, 2009

Daffodils and Crocuses

The first daffodil of Spring is about to bloom.

Tiny crocuses.

Crocus and spider visitor.

Friday, March 20, 2009

NCAA March Madness bracket submission

A commenter asked to see the bracket produced after all of the machinations and analysis of the statistics of the past history of the NCAA March Madness basketball tournament. Here it is. Click for larger.

I am already in last place in my pool after 16 games. This allows the illustration of an important point. There will always be upsets in the NCAA tournament; the key to winning this particular office pool is to pick the correct upsets, and sometimes to be the only one who picked a particular correct upset. More games await.

Wednesday, March 18, 2009

Picking the 2009 March Madness Basketball Brackets with Statistics

I don't know anything about college basketball (unlike President Obama). I don't mean that I don't know the main rules, or how many players there are, or what's allowed and not allowed. I mean that I don't know what teams are good or bad, which players are destined for the NBA, which coaches are the best or who won what game in 1985. I am not even sure of what teams are in what division so I always check. I suspect that this ignorance is exactly the correct approach to take when picking the game winners in the NCAA Division I playoffs, March Madness.

Instead of learning the teams and the players, I have explored the statistics of the past 24 years (data can be found here) (Last year's picks, Round of 64 and 32 upsets, Final Four and Championship probabilities) combined with others' specialized knowledge like the Sagarin ratings. This year I am updating some of my charts to include data from 2006, 2007 and 2008.

The pool I enter favors upsets. The points for each round are the round multiplier times the seed of the winning team that you picked. Thus if a 10 seed wins Round 2 and you pick it you get 2*10 for points. To win this type of pool it is imperative that you pick upsets. Game results for 24 years of Round 1 are shown graphically below.
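The scoring rule is simple enough to write down directly. This is a minimal sketch with a function name of my own choosing; the only multiplier it relies on is the one from the Round 2 example above.

```python
def pick_points(round_multiplier, winning_seed):
    """Points for one correct pick: round multiplier times the winner's seed."""
    return round_multiplier * winning_seed

# The example from the text: a 10 seed correctly picked in Round 2.
ten_seed_round_2 = pick_points(2, 10)

# For comparison, a correctly picked 1 seed in the same round.
one_seed_round_2 = pick_points(2, 1)

print(ten_seed_round_2, one_seed_round_2)
```

A correct 10 seed is worth ten times a correct 1 seed in the same round, which is why this kind of pool rewards picking upsets.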

Some quick points for Round 1, the round of 64:
  • No 16 seed team has ever won in the first round. Don't be the first to pick one.
  • 2 seeds are also very safe and normally win their games against 15 seeds.
  • History shows that there will be at least one, and in some years two, upsets favoring a 10, 11, or 12 seed.
  • One could make the case for one upset a year favoring a 13 or 14 seed as well.
  • 9 seeds win against 8 seeds more than half of the time. Pick two upsets.
Even in Round 2, with 32 teams, upset picking is important. 24 years of Round of 32 matchups are shown below with expected and upset outcomes tabulated.

Because this round depends on the outcome of the first round, the number of opportunities is different for each matchup. In the most extreme case, no 16 seed team has ever beaten a 1 seed to advance to this round, so there is no data for that matchup. Only once has a 15 seed beaten a 2 seed and then played a 7 seed, thus there is only one occurrence on the chart.

Some lessons from the Round of 32 chart:
  • 1 seed teams typically win in this round as well, rarely being beaten by 8 or 9 seeds.
  • Matchups with 5 vs 4 seeds, 6 vs 3 seeds and even 10 vs 2 seeds and 12 vs 4 seeds (surprisingly) seem to be toss-ups over the 24 years of data. Almost half the time there is an upset and the lower seed wins. If you have them in your bracket pick the correct underdog half of the time.
  • Matchups with 7 vs 2 seeds do result in upsets about a third of the time. Look for opportunities to pick one.
The results of this chart show what teams advanced to the Sweet Sixteen Round and should help to determine which upsets to pick according to past history.

Below is a matchup outcome chart for the Sweet Sixteen round which is similar to the earlier charts, but much more complicated.
As each round progresses there are more combinations of possible matchups, though most of them have never actually occurred in the history of the tournament in its modern form. No 16 seed has ever advanced so those matchups are not represented. 15 seeds rarely advance, so many of those matchups also have no data.

Some lessons gleaned from the Round of 16 outcome chart:
  • 1 seeds usually win. They always beat 12 seeds that make it through.
  • The closer the distance between seeds the more the outcome is a tossup. This is true for all of these charts.
  • In the three times that 11 seeds have made it to this round they have beaten the 7 seed they played. Whether that is statistically significant or not is the question.
On the other side of the range, rather than add combinatorial complexity, it is easier to compile the results of past years for the late rounds to see how likely it is that certain seeds reach the Sweet Sixteen, Elite Eight, Final Four, The Championship Game and finally win the championship. These frequency charts are easier to read than matchup charts at these rounds because the combinations of matchups grow large as the tournament progresses.

Sweet Sixteen frequency chart below
These frequencies are determined by who succeeds in the Round of 32 and are reflected in the Round of 32 outcomes chart above. Look at the lump for the 10, 11, and 12 seeds. In years where these teams move forward, knowing to pick them results in a large multiplicative effect on your score. Correctly picking #10 Davidson last year won me the pool.

Elite Eight seed frequency chart below.

Final Four seed frequency chart below.
Championship game seed frequency chart below.

Championship winner seed frequency chart below.

Some points for the Final Four, Championship game, and winner:
  • Every other year or so a 5, 6, 8, 10, or 11 seed makes it to the Elite Eight.
  • One 11 seed, three 8 seeds, three 6 seeds and four 5 seeds have appeared in the Final Four in 96 opportunities over 24 years. Choose these upsets sparingly, but if you get them right you might just win the pool.
  • In the Championship game, one 8 seed and two each of the 4, 5, and 6 seeds have made it that far. Use sparingly.
  • No team lower seeded than 8 has won the whole Championship. A 6, 8 and 4 seed have won it once each. The Final winner has been a 1 seed more than half of the time.
I have also taken the point totals for the past 24 years assuming a perfect sheet and plotted them.

I try to make sure that the potential points on my Playoff sheet add up to a reasonable number based on the past history of the tournament. The histogram below is a simple way to compare the past data to a current bracket selection.

It provides a way of ensuring that I haven't picked too many upsets, or worse, been too cautious and picked too few. Last year this method caused me to adjust my sheet to have more upsets and pick #10 Davidson to make it to the Elite Eight. I won the pool so handily that I was already uncatchable at that round.

All of this discussion of picking upsets and examination of the data indicates that upsets happen and are the key to winning the pool, but which upsets, and where? This is where we resort to the expertise of others. I use the Sagarin ratings (click on 2008-09 NCAA men's ratings by team), which are essentially a least squares ranking of all of the teams, based on all of the games that a team has played in the year. Sagarin suggests using the Predictor ranking to predict the outcome of a game rather than the ranking itself. Every year I match the teams to their rankings. The rankings represent the number of points a team is expected to score in a game, so the difference of these rankings is the expected point difference in the game. Since there is some error in the rankings, I choose a value below which I will pick the lower seeded team to win (picking upsets) and generate my bracket.

This year I automated the process in Excel. If a Predictor difference fell below the chosen factor I set the lower ranked team as the winner. Only for the Final Four does the model let the best team (higher Predictor score) win regardless of seed. A plot of the resulting expected points versus this factor shows some interesting cutoffs. Realize also that the home advantage for the Sagarin ratings this year is 3.79, almost two baskets, so the factors listed below are not out of the question. Always assuming that close games default to the upset is unreasonable, but it is called for to maximize point possibilities for this particular bracket.
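The Excel rule described above can be sketched in a few lines of Python. This is my paraphrase of the logic, not the spreadsheet itself: the `Team` record, the `pick_winner` name, the sample teams, their ratings and the factor of 2.0 are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Team:
    name: str
    seed: int         # 1 is the best seed, 16 the worst
    predictor: float  # Sagarin Predictor value (sample values below are made up)

def pick_winner(team_a, team_b, factor, final_four=False):
    """Pick a winner with a spread-threshold rule.

    If the Predictor gap is below `factor`, take the worse-seeded team
    (the upset). Otherwise, or in the Final Four, take the higher Predictor.
    """
    favorite = max(team_a, team_b, key=lambda t: t.predictor)
    if final_four or team_a.seed == team_b.seed:
        return favorite
    gap = abs(team_a.predictor - team_b.predictor)
    if gap < factor:
        return max(team_a, team_b, key=lambda t: t.seed)  # worse seed = upset pick
    return favorite

# Hypothetical teams and ratings, for illustration only.
eight = Team("Eight Seed U", seed=8, predictor=84.0)
nine = Team("Nine Seed State", seed=9, predictor=82.5)
two = Team("Two Seed Tech", seed=2, predictor=91.0)

close_game = pick_winner(eight, nine, factor=2.0)  # gap 1.5 is under 2.0: upset
clear_game = pick_winner(two, nine, factor=2.0)    # gap 8.5 is over 2.0: favorite
print(close_game.name, "|", clear_game.name)
```

Raising the factor flips more games to the worse seed, which is what drives the jumps in potential points.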

In a similar manner to the inflection points from my earlier football simulations, certain values for the factor make the potential points jump between values as losing teams win and winning teams lose at certain rounds, only to be swept away at higher rounds. This leads to a high sensitivity of the final potential points to small changes in the game spread factor. An earlier plot shows that the 50% median value for the potential points was 631 and that 90% of the years had total pool points of less than 796. With this in mind I set the factor to 2.33, just below the first step change from 665 to 845, and then I examined the pool for reasonableness according to the statistics shown above. One caution with this model is that it might allow improbable events like too low a seed making it through to a high round, so I used it merely to cause me to push the limits on upsets.
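The reasonableness check against history can be expressed as a simple comparison. The sketch below uses the two historical figures quoted above (median 631, 90th percentile 796); the function name and the wording of the labels are my own.

```python
HISTORICAL_MEDIAN = 631  # half of the past years fell below this total
HISTORICAL_P90 = 796     # 90% of the past years fell below this total

def rate_potential_points(points):
    """Place a bracket's potential points relative to the historical spread."""
    if points < HISTORICAL_MEDIAN:
        return "below the median: possibly too cautious"
    if points < HISTORICAL_P90:
        return "between the median and the 90th percentile"
    return "above the 90th percentile: probably too many upsets"

# The two plateau values on either side of the first step change.
below_step = rate_potential_points(665)
above_step = rate_potential_points(845)
print(below_step)
print(above_step)
```

That is the reasoning behind stopping just short of the step: 665 sits inside the historical range, while 845 would exceed what 90% of past tournaments produced.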

All that being said, be aware that on any day, any given team can beat any other, thus the format of March Madness is given to upsets and surprises and picking a bracket is still as much luck as skill. These models are an attempt to quantify this uncertainty and use it to drive bracket picks that will take advantage of luck, upsets and surprises when they occur.

Hubble captures transit of Saturn's moons

The Hubble telescope has taken some very nice pictures of four of Saturn's moons transiting at once.

Titan is the largest visible at the top. Enceladus, Dione, and their shadows on Saturn are on the left, while Mimas is visible on the right. The moons, Saturn and Earth with the Hubble telescope orbiting it have to be aligned just right to get these images. Someone had to do the calculations of place and time so that Hubble could be pointed at Saturn to catch it.

More pictures and videos are available at the HubbleSite.

The end of the semester dream

I am glad that someone else still has this dream. It has been 15 years since I was in grad school and I still have the assignment-not-done or the final-is-today-but-I-didn't-go-to-class-all-semester nightmares, especially at the beginning of the school year. Education trained my body well.

Now I know I am not the only one.

(via xkcd)

Friday, March 13, 2009

More crocuses poke up through the leaves

The crocuses in the yard think that Spring is coming. I helped free some of them this week from the leaf piles that encroach on their area, and they are now bursting forth at every location.

I also like to get closeups of the inside.

These are crocuses of a different color that haven't quite opened yet. Can you tell that purple is my favorite color?

Finally some snowdrops that I planted. Last year these didn't do much; I am hoping for better this year.

Spring is springing.

Thursday, March 12, 2009

Understanding the World Baseball Classic Brackets

For those of you starved for baseball and not excited about Spring Training, there is the World Baseball Classic this year. Sixteen teams from around the world compete. It is a truly international cast, perhaps more of a world series than the World Series. However, there are strict rules about player participation, especially for pitchers, since afterward most of these players must go back to their real jobs in the major leagues. The excitement this year seems to be that the Netherlands is doing better than expected. Remembering that the Kingdom of the Netherlands includes Aruba and the Netherlands Antilles, and recalling all of the baseball talent from the Caribbean, perhaps explains this, though they had to beat the Dominican Republic twice to move to Round Two.

A friend and I were wondering how the games would progress and who would be in or out if they win or lose, so I went to the website to figure it out. After many minutes spent deciphering the pool play procedures I realized that a diagram for each round would be the best way to understand it. You are welcome to read the rules to determine it for yourself. I thought the brackets on the website were less than self-explanatory, but easier to understand now that Round One is almost completed.

The diagrams below are state diagrams (or directed graphs or maybe tree diagrams) for the World Baseball Classic. The diagram below shows the order of play for each of the Pools A, B, C, D.

Each game is labeled, with, for example, W2 referring to the team that won game 2 or L4 referring to the team that lost game 4. Two teams from each Pool advance to Round Two, a winner and a runner up.

In Round Two the first two games match a winner from one pool to the runner up from another. Pool A and B from Round One go into Pool 1 for Round Two and Pool C and D from Round One go into Pool 2 for Round Two. Play then follows much the same process as above. Below is the diagram from Round Two.

Two teams advance from each of Pools 1 and 2, a winner and a runner up. As displayed in the diagram below, in the semifinal the winner of Pool 1 plays the runner up of Pool 2 and vice versa. The winners of these games go on to the Finals.

The winner of the Finals receives accolades as shown, while the loser gets a hearty handshake and a "good job."

Tuesday, March 10, 2009

Morbidity of Saints named Richard

In naming little Linus, I chose to give him my name, Richard, as his middle name. For Catholics the middle name has the potential to become the child's confirmation name and it should be a saint's name. Technically, so should the first name, but as always tradition and innovation yield some creative tension. Linus is safe both ways in that Linus is a saint's name (and mentioned in the Bible) and it is interesting and unique. As Richard is the middle name, I thought that I would investigate a little further the various Saint Richards and Blessed Richards to see whom Linus might use for inspiration in life and especially at his confirmation.

The most famous Saint Richard is Richard of Chichester (pictured to the left), a kindly bishop who refused his family's wealth for a life of study and the church. He had some minor troubles with King Henry III over the supremacy of the Pope over the King, but died a natural death. The topic of these troubles would be much more trouble for St. Richards later in history.

There are others. One of my favorites is a little-known Saint Richard sometimes called King of the Saxons (pictured to the right), whose claim to fame lies primarily in being the father of other saints (Willibald, Winebald, and Walpurga). Saint Richard was a pilgrim who died on his journey; the King rumor grew up around him after his death and is unsubstantiated.

What was concerning is that as the list moves towards the Reformation one realizes that Richard was a popular English name, and that the persecutions under Henry VIII and Elizabeth I generated many English Catholic martyrs named Richard. The chart below shows that, at least for the lists I could find, being English and Richard and Catholic seemed to guarantee martyr and saintly status after 1500 AD.

Red in the chart above indicates a martyred death. Most deaths were due to being hanged, drawn and quartered. I suppose a theme for a Reformation Saint Richard would be to defend Catholicism against Protestantism and be executed for it. I am pleased that in modern times the enmity between Richards and English monarchs has lessened some. However, in the 1900s being named Richard and Catholic was especially dangerous during the Spanish Civil War and World War II. Of 35 Richards, 28 were martyred. I wonder how that would compare to the entire list of saints. This is a clear case of selection bias, as it is very much easier to become an official saint through martyrdom than after a natural death.

Linus will have many bloody stories of Richards defending the faith for his confirmation class.

(information for the St. and Bl. Richards came from Catholic Online Saints R page, the Saints.SQPN saints R page, and eCatholic Hub saints database saints with first name Richard with much reference to Wikipedia)

Sunday, March 08, 2009

Crocuses and Snowdrops out before Spring

In flagrant defiance of the calendar, these crocuses and snowdrops in the garden have decided to show their blooms even before the first day of Spring.

Yellow Crocus

White snowdrops.

Spring is coming! Prepare yourself.

Wednesday, March 04, 2009

Guess the voices - Simpsons Guest Star Quiz


Care to test your knowledge of voice guest stars on The Simpsons? Mental Floss has a quiz that does just that. In my opinion they picked some easy ones, but maybe I have watched too much of the Simpsons over the twenty years they have been on (half my life!). The average on the quiz is only 57%.

I only got one wrong for a score of 93% (that's still an A, right?). Howard will get them all correct.

(via Mental Floss, via Neatorama)

Tuesday, March 03, 2009

Snow makes birds crazy for the birdfeeder

Yesterday's snow brought hordes of hungry black birds to the bird feeder. This mixed group was made up of both grackles and starlings. There were many more than this picture indicates, perched on all of the trees and across the creek. I guess the snow made them a little crazy. Rarely have I ever seen more than one bird at the feeder at a time, and those were tiny birds. Yesterday had two grackles feeding and birds perched on top.

Even the dark-eyed juncos got into the act. They rarely go to the feeder, preferring to let one of their avian brethren from a different species peck and throw seed from the feeder to the ground where they forage. This time they had to do it themselves. Actually feeding at the feeder is an unusual behavior.