Monday, January 12, 2009

Playoff Fantasy Football - Simulating the playoffs

Each year, RKBfantasyfootball (my sister is a commissioner, so I must play) runs a playoff fantasy football league. Since you don't have to maintain your concentration for 17 weeks, it is a good way for non-football fanatics to play without a huge time commitment. It is a points-only league, so the rules are straightforward: pick a kicker, a quarterback, two running backs, a tight end, three wide receivers and a team defense from the list of players and teams in the playoffs. Your score is the sum of the points those players and that defense (special teams included) score during the playoffs (the detailed, but simple rules are here).

The key to winning is to pick the players who generate the most points. The problem breaks down into two parts. One: predicting which teams will play the most games in the playoffs. Two: picking high-scoring players on those teams for a multiplicative effect. My first focus is on picking the teams.

I have long wanted to simulate the football playoffs to help me make my picks for the RKB playoff fantasy football. I am not a football expert like many of the participants, so I try to make do with statistics and data. I had proposed using genetic algorithms to search for the best possible player picks, but I was stuck on how to simulate the playoffs themselves. I only half-jokingly suggested using a video game like Madden2008 to play enough games to collect the data, but that would take far too long to be practical. I have used the Sagarin ratings to good effect in previous football predictions, so I decided to use them here.

The Sagarin ratings are essentially a least-squares rating of the NFL teams based on the games they win or lose. It is slightly more complicated than that, but he suggests using his pure points rating as the best way to predict the outcome of a game between two teams. I use the ratings from the end of the regular season; it appears that he updates them with the playoff games after that. He also calculates a home advantage factor, which tends to be about a field goal; this year it is 2.81 at the end of the regular season. The extra wrinkle I add is a plus-or-minus random factor, since we know that each game still has elements of chance. We can then look at how sensitive the outcome of the playoffs is to both this random factor and the home advantage.

I built a simulation of each week of the playoffs through to the Superbowl, using the pure points, the home advantage and the random factor. The winner of each game is determined by the margin: home team pure points + home advantage - away team pure points +/- the random factor. A positive margin means the home team wins; a negative margin means the away team wins. The winning team goes on. The simulation also keeps track of each team's playoff rank so that the correct rank match-ups occur each week no matter how the pure points compare. Finally, there is no home advantage for the Superbowl. I collected the outcomes of 1000 simulations at each combination of home advantage and random factor values. The 3D chart below shows which team wins the Superbowl according to these simulations.
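The procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the charts. The team names and ratings are hypothetical placeholders rather than real Sagarin numbers, the random factor is assumed to be uniform on [-r, +r] (the post does not pin down a distribution), and the bracket follows the six-seeds-per-conference format with byes for seeds 1 and 2.

```python
import random
from collections import Counter

def home_wins(home_rating, away_rating, home_adv, rand_factor, rng):
    """One game: the home team wins if the margin is positive."""
    # Assumption: random factor drawn uniformly from [-rand_factor, +rand_factor].
    noise = rng.uniform(-rand_factor, rand_factor)
    return home_rating + home_adv - away_rating + noise > 0

def play(a, b, ratings, home_adv, rand_factor, rng):
    """a and b are (seed, team) pairs; the lower-numbered seed hosts."""
    home, away = sorted([a, b])
    if home_wins(ratings[home[1]], ratings[away[1]], home_adv, rand_factor, rng):
        return home
    return away

def conference_champion(seeds, ratings, home_adv, rand_factor, rng):
    """seeds lists six team names, seed 1 first. Returns (seed, team)."""
    t = [(i + 1, name) for i, name in enumerate(seeds)]
    args = (ratings, home_adv, rand_factor, rng)
    # Wild card round: 3 hosts 6 and 4 hosts 5; seeds 1 and 2 have a bye.
    wild = sorted([play(t[2], t[5], *args), play(t[3], t[4], *args)])
    # Divisional round: seed 1 hosts the lowest remaining seed, seed 2 the other.
    d1 = play(t[0], wild[1], *args)
    d2 = play(t[1], wild[0], *args)
    # Conference championship: the better remaining seed hosts.
    return play(d1, d2, *args)

def super_bowl_winner(afc, nfc, ratings, home_adv, rand_factor, rng):
    a = conference_champion(afc, ratings, home_adv, rand_factor, rng)
    n = conference_champion(nfc, ratings, home_adv, rand_factor, rng)
    # Neutral site: no home advantage in the final game.
    if home_wins(ratings[a[1]], ratings[n[1]], 0.0, rand_factor, rng):
        return a[1]
    return n[1]

def simulate(afc, nfc, ratings, home_adv, rand_factor, runs=1000, seed=0):
    """Count Superbowl winners over many simulated playoffs."""
    rng = random.Random(seed)
    return Counter(
        super_bowl_winner(afc, nfc, ratings, home_adv, rand_factor, rng)
        for _ in range(runs)
    )

# Hypothetical seeds and ratings, for illustration only.
AFC = ["AFC1", "AFC2", "AFC3", "AFC4", "AFC5", "AFC6"]
NFC = ["NFC1", "NFC2", "NFC3", "NFC4", "NFC5", "NFC6"]
RATINGS = {"AFC1": 28, "AFC2": 26, "AFC3": 22, "AFC4": 20, "AFC5": 24, "AFC6": 21,
           "NFC1": 27, "NFC2": 25, "NFC3": 23, "NFC4": 19, "NFC5": 22, "NFC6": 20}

if __name__ == "__main__":
    print(simulate(AFC, NFC, RATINGS, home_adv=2.81, rand_factor=5.0))
```

With the random factor set to 0 every run gives the same winner, so sweeping `home_adv` and `rand_factor` over a grid and calling `simulate` at each point would produce the kind of data the chart summarizes.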


The left side of the chart shows the expected outcome with a 2.81 home advantage factor: Tennessee wins the Superbowl. They play the New York Giants, who you can see winning the Superbowl sometimes as the random factor is increased. On the other side of the spectrum, with no home advantage, the Pittsburgh Steelers win the Superbowl and they play the Philadelphia Eagles, who you can see winning the Superbowl sometimes as the random factor is increased on that side. Also notice that Baltimore starts to appear as a winner as Pittsburgh diminishes. If the random factor is increased to large values, the simulation in effect acts as if every game is a 50/50 shot. Both scenarios, home advantage 2.81 and 0, converge on each other as the advantage becomes swamped by the randomness. In that case teams without a bye in the first round have a 1 in 16 chance of winning the Superbowl, and teams with a bye have a 1 in 8 chance, because they play one fewer game and have one fewer chance of losing.
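The 1 in 16 and 1 in 8 figures follow directly from treating every game as a coin flip. A quick check, assuming twelve playoff teams of which four have a first-round bye (the function name is just for illustration):

```python
from fractions import Fraction

def title_chance(games_to_win):
    """Chance of winning every game when each one is a 50/50 shot."""
    return Fraction(1, 2) ** games_to_win

no_bye = title_chance(4)    # wild card, divisional, conference, Superbowl
with_bye = title_chance(3)  # skips the wild card round

# Eight teams without a bye plus four with a bye should account for all outcomes.
total = 8 * no_bye + 4 * with_bye
print(no_bye, with_bye, total)
```

The total coming out to exactly 1 is a handy sanity check that the two probabilities are consistent with the bracket.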

Randomness aside, it is clear that the home advantage has an important effect on the playoff outcome in the simulation so it is necessary to look for tipping points in this factor. The chart below reveals the effect of the home advantage in the absence of the random factor and some tipping points where the AFC and NFC champs and thus the Superbowl winners change.

With no home advantage, Philadelphia plays Pittsburgh in the Superbowl. Once the factor reaches 1.08, Pittsburgh loses to Tennessee, and Tennessee plays Philadelphia instead. Increasing the factor to 1.3 and beyond has the New York Giants playing the Tennessee Titans in the Superbowl. Philadelphia is a wild card team that will play four games if it reaches the Superbowl, and this scenario is fairly robust across a range of the home advantage. It also seems to be the scenario playing out in the current playoffs, as three of the four teams with home field advantage lost this weekend.
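One way to see where tipping points like these come from: without the random factor, the home team wins exactly when home rating + home advantage exceeds the away rating, so a game between a lower-rated host and a higher-rated visitor flips once the home advantage passes the rating gap. A small sketch with made-up ratings (the numbers below are hypothetical, not Sagarin values):

```python
def flip_threshold(home_rating, away_rating):
    """Home advantage above which the host wins a noise-free game."""
    return away_rating - home_rating

def home_wins_no_noise(home_rating, away_rating, home_adv):
    """Game rule with the random factor switched off."""
    return home_rating + home_adv - away_rating > 0

# Hypothetical example: the visitor is rated 1.25 points higher than the host.
host, visitor = 25.0, 26.25
threshold = flip_threshold(host, visitor)

for adv in (0.0, 1.0, 1.5, 2.81):
    print(adv, home_wins_no_noise(host, visitor, adv))
```

Under this reading, each tipping point in the chart would correspond to the rating gap in some game where the visiting team is the stronger one on pure points.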

These simulations are important because we need to determine how many games each team will play, so that we can see how many chances our player picks have to score. A great player who only plays in one game may not yield as many points as a fair player who gets three or even four games in which to contribute to the score. Next we will examine the distribution of the number of games each team plays and how sensitive the results are to the home advantage and random factor.

1 comment:

newman said...

love the graph and the statistics matey,

keep it up



newman