In Paul Burka’s latest post, he questions the methodology behind the poll conducted by the Department of Government and the Texas Politics project at UT. Here is their response, in full (courtesy of professors Jim Henson and Daron Shaw).
In the Friday afternoon Texas Monthly podcast, in a post on his blog the following day, and in the comment fields following that entry, Paul Burka made a series of inaccurate characterizations of the poll released by the Department of Government and the Texas Politics project last week. Consequently, we feel compelled to respond. In so doing, we hope to give Mr. Burka, readers of the blog, and the broader public a clearer idea of how the poll works.
Mr. Burka’s skepticism concerning some of our results seems based on a combination of his misunderstanding of where our sample comes from and how we use the Internet to administer the survey. Let us begin by explaining our decision to conduct an online survey. Put another way, why didn’t we just do another phone poll? In our view, the issues preventing effective online polling are receding while those plaguing traditional phone polling are becoming increasingly troublesome. In particular, phone polls have had lower response rates in recent years, which exacerbate widely recognized response biases. Weighting the data is the typical response, but how reliable are estimates when you have to weight low incidence populations (for example, young African American males) by a factor of 8 or 12 or even 16? Perhaps more problematic is the spread of cell phone use and the decline of landlines. Finally, talking to people over the phone also places constraints on the sort of question frames and response options you can use; these problems are reduced or removed when you use the web.
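To put a rough number on the weighting concern, the sketch below uses Kish's approximation for effective sample size, (sum of weights)² / (sum of squared weights). The weights are hypothetical and chosen only for illustration; they are not taken from any actual poll.

```python
def kish_effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical sample of 800: 790 respondents at weight 1,
# and 10 low-incidence respondents weighted up by a factor of 12.
weights = [1.0] * 790 + [12.0] * 10
n_eff = kish_effective_sample_size(weights)
print(f"Nominal n = {len(weights)}, effective n = {n_eff:.0f}")
```

Under these assumed weights, a handful of heavily weighted respondents cuts the effective sample to less than half of the nominal 800, which is the kind of reliability loss the question above points to.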
So how do we deal with the difficulties of obtaining a representative online sample? For starters, the University of Texas Poll is NOT an “opt-in” survey or a traditional Internet survey in which one randomly samples from a large list of online users. This is apparently what Mr. Burka thinks, judging from his podcast and blog. It is difficult for us to imagine where he got this impression, given that our methodology statement clearly states that we (a) identify a random sample via traditional phone techniques (that is, we draw a stratified cluster sample), and then (b) “match” those identified by traditional techniques with someone from our pool of online respondents (the online pool contains approximately 2 million people). The match is based on six characteristics, so that a “selected” young, college-educated, Hispanic, Democrat, male, from Laredo is matched with someone from our pool who has these characteristics. There are still potential representativeness issues, of course, most notably the quality of the matches among lower incidence populations. But these issues plague political polling no matter the approach, and ours at least holds the promise of significant improvement (as the pool expands). Mr. Burka’s response to all of this is a series of ad-hoc observations, an admission that he isn’t an expert in methodology, and then a dismissive declaration that the poll is “strange.”
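The matching step can be pictured with a minimal sketch. It assumes exact matching on six fields inferred from the Laredo example (age group, education, ethnicity, party, gender, location); the field names, the `Profile` type, and the `match_targets` helper are all hypothetical, and the procedure actually used for the poll may well be more sophisticated (for instance, nearest-neighbor rather than exact matching).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    # Six assumed matching characteristics (names are illustrative).
    age_group: str
    education: str
    ethnicity: str
    party: str
    gender: str
    location: str

def match_targets(targets, pool):
    """For each target profile, pick one unused panelist with an identical profile.

    `pool` is a list of (panelist_id, Profile) pairs. Returns a list of
    panelist ids, with None where no match is available.
    """
    available = {}
    for panelist_id, profile in pool:
        available.setdefault(profile, []).append(panelist_id)
    matches = []
    for target in targets:
        candidates = available.get(target, [])
        matches.append(candidates.pop(0) if candidates else None)
    return matches

target = Profile("18-29", "college", "Hispanic", "Democrat", "male", "Laredo")
other = Profile("65+", "high school", "Anglo", "Republican", "female", "Lubbock")
pool = [("p1", other), ("p2", target)]
result = match_targets([target, target], pool)
print(result)  # the second target goes unmatched once "p2" is used
```

The unmatched second target shows why match quality for low incidence groups depends on the size of the pool.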
Mr. Burka then goes on to question the size and make-up of the poll. From this, we suspect he means to critique the “universe” from which we draw the sample and perhaps also the sample size. Once again, we are puzzled by these criticisms. Our universe is made up of all adults in the state of Texas, which is appropriate given our focus on estimating popular opinion on a wide set of issues. The sample size is 800, which brings with it a margin of error of approximately +/- 3.5 percentage points. Had Mr. Burka looked at the numerous polls he routinely quotes, he would see that a sample of 800 adults is typical and is certainly large enough to produce statistically sound inferences.
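For readers who want to check the arithmetic, the +/- 3.5 figure is consistent with the textbook margin of error for a proportion. The sketch below assumes simple random sampling, a 95% confidence level (z = 1.96), and the worst-case proportion p = 0.5; a matched or weighted sample would not follow this formula exactly.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(800)
print(f"n = 800: +/- {100 * moe:.1f} percentage points")  # +/- 3.5
```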
Apparently, Mr. Burka is echoing objections from some political operatives to the specific results of the trial ballot item pitting Governor Rick Perry against Senator Kay Bailey Hutchison. He would have preferred that we sample from a list of registered voters (or from a list of people with a primary voting history) rather than from the adult population. As indicated above, our primary focus in this poll is not election forecasting. If it were, we would not be opposed to list-based sampling, as are some in our profession. On the other hand, we do not simply ask the candidate preference items of all people: we only ask registered voters who say they intend to vote in the Republican or Democratic primary. Not all of these individuals will vote in the primary, of course. But the relevant question then becomes whether or not the over-statement of intent creates any systematic bias favoring a certain candidate. The demographic characteristics of the primary electorate we project match very closely with those that existed in 2006 and 2008 (compared with exit polls from those elections). Furthermore, in this particular case there is no reason to think that Governor Perry would be systematically advantaged by a larger, self-reported pool of Republican primary voters. In fact, we could make an argument that Senator Hutchison would be advantaged by such an enlarged pool (given her better overall numbers). At any rate, we find the contention that our estimates are biased ironic coming from those who rely on phone surveys that (1) have response rates in the 20%-30% range, (2) are heavily weighted, and (3) come from registered voter lists that are hardly up-to-date reflections of the current pool of registered voters (a fact that certainly could have produced problems in 2008).
Mr. Burka does not limit his critique to the poll’s results, however. He also refers to some of our questions as “odd.” This particular reference is to the use of 0-100 thermometer ratings to gauge respondents’ emotional reactions to public figures. Mr. Burka’s objection is that an average thermometer score of, say, 52 is not the same thing as a 52% approval rating. True. One score represents the mean level of affect towards a public figure while the other represents the aggregate level of performance approval. Our poll provides data on both, and they are different measures that need to be understood in different ways. We never pretend or suggest otherwise.
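A small worked example makes the distinction concrete. The ten responses below are invented purely for illustration: each hypothetical respondent gives a 0-100 thermometer rating and, on a separate item, says whether they approve of the official's job performance.

```python
# Hypothetical responses (not poll data): (thermometer rating, approves of job performance)
responses = [
    (100, True), (95, True), (90, True),
    (45, False), (45, False), (45, False), (45, False), (45, False),
    (5, False), (5, False),
]

mean_thermometer = sum(rating for rating, _ in responses) / len(responses)
approval_pct = 100 * sum(approves for _, approves in responses) / len(responses)

print(f"Mean thermometer score: {mean_thermometer:.0f}")  # 52
print(f"Job approval: {approval_pct:.0f}%")               # 30%
```

In this invented case the mean thermometer score is 52 while only 30% approve, so the two numbers cannot be read interchangeably.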
But let us take the criticism more broadly. If by “odd” Mr. Burka means that we sometimes use two-sided issue frames to give respondents an even-handed representation of issue debates, then yes, we do ask odd questions. If by “odd” he means that we offer scaled response options rather than dichotomous, yes/no options, then yes, we have odd response options. But these are odd only by Mr. Burka’s standard; they are quite common in polling as well as in social scientific circles. In addition, we still include standard measures of performance, such as job approval and most important question items.
We should also point out that the exact wording of our questions, the specific results, a detailed explanation of our methodology, and the data set itself are available to the public on our website, which also contains contact information for the principal investigators as well as information about the company we use to do the actual data collection, YouGov/Polimetrix. The goal of the University of Texas Poll is transparency, and we believe that we have provided that. The point is not that we are “right” or that the poll is “perfect.” It is still difficult to “get” low incidence respondents; we still have a margin of error; question frames can affect results. But some of the criticisms leveled by Mr. Burka and others are not of this stripe; they are, in fact, flat-out wrong.
One final note on methodology—Mr. Burka’s reporting methodology. Neither of us was contacted by Mr. Burka about the survey. We were, therefore, surprised that he chose to inaccurately question the methodology of the poll in his podcast. When gently chided by some during the podcast, he forged ahead and headlined his blog post with a declaration that the poll was “not legitimate.” This is, in our opinion, an irresponsible and lazy characterization. We hope to hear from him next time around.