RealClearPolitics - HorseRaceBlog - Predict the Race for Yourself
The Wayback Machine - https://web.archive.org/web/20131013212936/http://www.realclearpolitics.com/horseraceblog/2008/03/predict_the_dem_race.html
RealClearPolitics HorseRaceBlog

By Jay Cost

Predict the Race for Yourself

Political analysts have been scaling down Hillary Clinton's chances of victory. Many have taken to offering up numerical odds of her success. A source for the Politico affiliated with the Clinton campaign pegs it at 10%. David Brooks puts it at 5%. The InTrade market has it higher - at about 20%. Mike Allen and Jim Vandehei do not offer a number of their own, but they claim she has "virtually no chance of winning."

I agree that Clinton is more likely to lose than win. I also do not necessarily disagree with these low estimates. However, I disagree with the way these estimates are occasionally presented. There is sometimes an implication that these are precise predictions - when in fact a prediction like this must be very imprecise. This is why I was so vague in offering my own estimate last week.

There are reasons to expect imprecision in this kind of situation. Precision depends in part on the number of variable factors that produce the outcome we are predicting. The more things that must happen for the prediction to come true, the less precise it can be. Take an example. Suppose we are predicting whether a pitcher will strike out a batter. We can be reasonably precise. After all, there are just two factors to account for - the pitcher and the batter. Suppose, on the other hand, we're predicting who will win the World Series. Precision is very difficult here. After all, our prediction depends on thousands of factors shaking out in a certain way.

The situation is similar in this election. We can make a prediction of what will happen, and we should predict that Obama is more likely to win than Clinton. However, there are so many factors that will go into who wins the nomination that speaking more precisely than this becomes quite problematic.

Let's examine this in detail. A key issue in determining the nominee is who is seen to have won more votes. Many important factors will go into this perception. They can be organized under three questions. How shall the votes be counted? Who will win the remaining contests, and by how much? What will turnout be? Varying our answers to just a few of these questions can dramatically alter which candidate is favored in the popular vote count. This makes prognosticating a very imprecise endeavor.

First, there are many reasonable ways to count the popular vote. None is obviously superior to the rest. Of course, it does not matter which we think is most appropriate. What matters is what the superdelegates think, as they will be the "tie-breakers" in the nomination battle.

They could approach it in many ways. They could take the basic vote count and choose to exclude or include Michigan, Florida, or caucus estimates. Assuming they want to include the Michigan results and the caucus estimates (for IA, ME, NV, and WA, whose state parties do not supply actual vote totals), they could account for them in different ways. With Michigan, they could (a) give Obama the "unaffiliated" vote, (b) not give Obama the "unaffiliated" vote, or (c) reallocate the vote based upon whom voters claimed in the exit poll they would support if all candidates had been on the ballot. If they include caucus estimates, they could (i) count the non-binding Washington primary instead of the caucus, or (ii) count the Washington caucus instead of the primary.

This implies more than a dozen ways to count the votes. Different counts would achieve different goals - beyond favoring one candidate or another. For instance, the "Exclude Michigan and Florida" counts hew closely to the position of the Democratic National Committee. If the superdelegates think the DNC's posture toward the delegates should also apply to the votes, they might prefer those counts. If they have the normative principles of McGovern-Fraser in mind, and want to include as many votes as possible while being fair to both candidates, they might account for Florida, Michigan (giving Obama some share of the vote there), the caucuses, and the Washington state primary.
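To make the "more than a dozen" figure concrete, here is a minimal sketch that enumerates the counting choices described above. It assumes the three decisions (Florida, Michigan, caucus estimates) can be made independently of one another, which is my simplification; the labels and the function name are illustrative and not taken from any official count.

```python
from itertools import product

# Each list is one decision a superdelegate might make, as described above.
# Treating the three decisions as independent is an assumption of this sketch.
FLORIDA = ["exclude Florida", "include Florida"]
MICHIGAN = [
    "exclude Michigan",
    "Michigan, 'unaffiliated' vote given to Obama",
    "Michigan, 'unaffiliated' vote not given to Obama",
    "Michigan, reallocated by exit poll",
]
CAUCUSES = [
    "exclude caucus estimates",
    "caucus estimates, counting the Washington primary",
    "caucus estimates, counting the Washington caucus",
]


def counting_methods():
    """Return every combination of the three decisions as a list of tuples."""
    return list(product(FLORIDA, MICHIGAN, CAUCUSES))


methods = counting_methods()
print(len(methods))  # 2 * 4 * 3 = 24 combinations
for method in methods[:3]:
    print("; ".join(method))
```

Under these assumptions there are 24 combinations. Some of them may strike you as less plausible than others, so the number of counts worth taking seriously could be smaller, but it stays comfortably above a dozen.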

Importantly, changing the count could turn this from a close race to an Obama blowout, and vice-versa.

Second, what results should we expect in the remaining states? It seems to me that we have very little purchase on this question. Personally, I have been "lost" for the last few days trying to get a handle on North Carolina. It is a highly complicated state that cannot be predicted easily. Yet it will be incredibly determinative. A swing of 5 points in a state like North Carolina could make a difference of more than 60,000 votes.

It's easy to get an illusory sense of certainty on these state contests. An example can be found in this article on Indiana by Anne Kornblut. She sees Obama and Clinton as being "roughly equal" in the Hoosier State - but her reasoning is unpersuasive. One of Obama's big advantages is that he is from a neighboring state with an overlapping media market - but this did not help Clinton in Connecticut or Romney in New Hampshire. In Clinton's favor is the fact that Evan Bayh has endorsed her, but Ted Kennedy's endorsement did Obama no good in Massachusetts. Generally, it seems to me that this estimate is so "rough" that we should not make too much of the perceived "equality."

The biggest problem is with Puerto Rico. We are literally without precedent there. It's never voted in a presidential election of any kind. It is therefore extremely difficult to get an idea of who will win, let alone by how much. An even bigger question with Puerto Rico is turnout. Puerto Ricans are some of the most active voters in the world, and turnout could be very high. But how high? 100,000, 500,000, 1 million, 2 million? Again, we have no precedent for it.

Turnout stateside is more predictable than in Puerto Rico - but there are still limitations. We know that turnout has risen since Super Tuesday. In open primaries, it averaged 66% of the 2004 Kerry vote on Super Tuesday. On March 4, turnout averaged 83%. Will it level off, taper off, or increase? What about closed contests? On Super Tuesday, turnout in those was much lower than in open contests. But Maryland and DC (the last two closed contests) voted at about the same rate as Virginia (an open contest held on the same day). What will happen next?

Once again, varying our answers can dramatically affect the results. For instance, if Clinton wins Pennsylvania by the same margin she carried Ohio, a 10% increase in turnout will provide her a net gain of 29,000 votes.
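The arithmetic behind figures like these is simple: net votes are roughly turnout multiplied by the margin. The sketch below shows that relationship. The turnout numbers in it are round, hypothetical values chosen only so the arithmetic is easy to follow; they are not the inputs behind the North Carolina or Pennsylvania figures in this post, and `net_votes` is a name I made up for illustration.

```python
def net_votes(voters: float, margin_points: float) -> float:
    """Net votes for the leading candidate.

    voters:        number of voters being considered
    margin_points: leader's margin over the runner-up, in percentage points
    """
    return voters * margin_points / 100.0


# Hypothetical example 1: a 5-point change in the margin in a state
# where 1,250,000 people vote (made-up turnout figure).
print(net_votes(1_250_000, 5))   # 62500.0

# Hypothetical example 2: 290,000 additional voters (made-up figure)
# splitting with a 10-point margin for the leader.
print(net_votes(290_000, 10))    # 29000.0
```

Notice that the two inputs multiply. A modest change in turnout and a modest change in margin, taken together, can move the count by far more than either does alone.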

Here's the broader point. We have a large number of unknown factors. For many of them, we have very little idea what values they will ultimately take. What we do know is that small changes in several of them could induce large changes in the vote count. This makes it extremely difficult to be as precise as many commentators have been. We need to be wary of all the uncertainty we face here.

It is for this reason that I offer for public consumption the following Excel spreadsheet. It is set up to enable you to plug turnout and vote margins in, and see what effect the changes will have on the different vote counts. It seems to me that, rather than have Politico, the Times, or the Post outline which outcomes are possible, all of us should just take a look for ourselves.
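For readers who prefer code to Excel, here is a rough sketch of the kind of calculation such a spreadsheet performs. It is not a copy of the spreadsheet's formulas. It assumes turnout is expressed as a share of the 2004 Kerry vote, as in the turnout discussion above. Every number, state name, and count label in it is a made-up placeholder, not a real vote total and not a prediction.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Contest:
    name: str
    kerry_2004_vote: int    # baseline for scaling turnout (placeholder value)
    turnout_rate: float     # turnout as a share of the 2004 Kerry vote
    clinton_margin: float   # Clinton minus Obama, in percentage points

    def clinton_net(self) -> float:
        """Net votes Clinton gains (negative if Obama wins the contest)."""
        turnout = self.kerry_2004_vote * self.turnout_rate
        return turnout * self.clinton_margin / 100.0


def project(obama_leads: Dict[str, float], contests: List[Contest]) -> Dict[str, float]:
    """Obama's projected lead under each counting method after the contests."""
    clinton_gain = sum(c.clinton_net() for c in contests)
    return {method: lead - clinton_gain for method, lead in obama_leads.items()}


# Placeholder inputs only: hypothetical leads under two hypothetical counts.
obama_leads = {"count A": 700_000, "count B": 100_000}
contests = [
    Contest("State X", 2_000_000, 0.80, 10.0),
    Contest("State Y", 1_000_000, 0.80, -5.0),
]

for method, lead in project(obama_leads, contests).items():
    leader = "Obama" if lead > 0 else "Clinton"
    print(f"{method}: {leader} leads by {abs(lead):,.0f}")
```

With these placeholder inputs, the same set of remaining results leaves Obama comfortably ahead under one count and narrowly behind under the other. Change a turnout rate or a margin and rerun it to see how quickly the picture shifts.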

So, predict the Democratic race for yourself.*

Enjoy!

* - Note that the initial values in the spreadsheet are not to be interpreted as my predictions. Rather than make predictions, I decided to publish the spreadsheet! The initial values are only meant to illustrate the effect that a not-unbelievable swing in the popular vote toward Clinton could have on the race.

-Jay Cost