The Opinion Poll Simulator

Opinion Polls

Opinion polls are a powerful tool for gauging public opinion on a wide range of issues. They are surveys designed to measure the opinions, attitudes, and beliefs of a representative sample of individuals on a particular topic. Opinion polls are conducted by polling organizations or market research firms, which use a variety of methods to collect data, such as phone interviews, online surveys, or face-to-face interviews.

Opinion polls provide insight into public opinion on a wide range of issues, including politics, social issues, and consumer preferences. They can be used to measure attitudes toward political candidates, government policies, and social issues. Businesses also use them to measure consumer preferences and attitudes towards products and services.

The accuracy of opinion polls depends on the methodology used to collect the data and the representativeness of the sample. Polling organizations must ensure that the sample of individuals surveyed is diverse and representative of the population they are trying to measure. Polls can also be affected by a range of factors, including the timing of the poll, the wording of the questions, and the overall climate of public opinion.

Despite these limitations, opinion polls remain a valuable tool for researchers, policymakers, and businesses to better understand public opinion and make informed decisions based on this information.

Why do opinion polls work? (without the math!)

You may have heard people express doubt about the accuracy of opinion polls. "How can they know what the entire population thinks by asking 1000 people?" they might ask. But the truth is that you can get a very good idea of what the population as a whole thinks by asking a very small fraction of it. What matters is that the sample asked is representative of the population. A representative sample will be a good approximation of the population.

But what is a representative sample? It is a subset of a larger population that accurately reflects the key characteristics of the entire population. Ideally it is a randomly selected group of individuals, objects, or data points that matches the broader population in demographics and other relevant variables of interest.

What might surprise you is that the size of the population from which you sample doesn't really have an effect on the accuracy of the results when the sample size stays the same. A small mental exercise can help you understand why this is the case.

Imagine you have 50 thousand red balls and 50 thousand blue balls in a (large) bag. You pick 1000 balls at random and arrange them in a rectangle, in the order you picked them. Ignoring the exact order and counts of red and blue in each row and column, an overall pattern or "texture" will emerge. Now imagine you have 50 million red and 50 million blue balls in the bag. You again pick 1000 balls at random and arrange them in a rectangle the same way. Is there any reason to believe that the overall pattern will be any different? No, there isn't.

The "texture" of the population depends on the proportion of red and blue balls in the bag, not on the total count of balls. As long as your sample is large enough to pick up that texture, it will be reasonably accurate. The same is true for opinion polls: the "texture" of the population depends on the proportion of people with different opinions, not on the total count of people. Opinion polls look at a sample that is large enough to "look like" the overall pattern when seen from a bird's-eye view.
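If you would rather check the bag exercise than take it on faith, it can be sketched in a few lines of Python. This is only an illustration, not how the simulator on this page is written: the function names, the number of repeats, and the seed are all assumptions made for the example. It draws 1000 balls without replacement many times, once from the 100 thousand ball bag and once from the 100 million ball bag, and reports how much the red share wobbles from sample to sample.

```python
import random

def draw_red_count(red, blue, sample_size, rng):
    """Draw sample_size balls without replacement and return how many are red."""
    red_left, total_left = red, red + blue
    reds_drawn = 0
    for _ in range(sample_size):
        # Chance that the next ball is red, given what is still in the bag.
        if rng.random() < red_left / total_left:
            reds_drawn += 1
            red_left -= 1
        total_left -= 1
    return reds_drawn

def spread_of_samples(red, blue, sample_size=1000, repeats=1000, seed=1):
    """Return the average red share and its typical deviation across many samples."""
    rng = random.Random(seed)
    shares = [draw_red_count(red, blue, sample_size, rng) / sample_size
              for _ in range(repeats)]
    mean = sum(shares) / repeats
    spread = (sum((s - mean) ** 2 for s in shares) / repeats) ** 0.5
    return mean, spread

# Same proportions, same sample size, very different bag sizes.
small_bag = spread_of_samples(50_000, 50_000)
large_bag = spread_of_samples(50_000_000, 50_000_000)
print(f"100 thousand balls: mean {small_bag[0]:.3f}, spread {small_bag[1]:.4f}")
print(f"100 million balls:  mean {large_bag[0]:.3f}, spread {large_bag[1]:.4f}")
```

The two printed lines should come out nearly the same: the bag never needs to be built in memory, because only the counts of balls left matter for each draw.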

In the small toy below you can see how the "texture" of a population changes as you change the proportions of blue vs red and then shuffle the pixels. The "Population" element consists of 300 x 300 = 90000 pixels. Use the slider to change the proportions of red and blue. The "Shuffle population" button will arrange the pixels randomly. The draggable square on top of the pixels represents a sample of the population. You can drag it around to select a different sample. Since the pixels are randomly arranged, the sample that each position of the square corresponds to will itself also be random. Next to the population pixels you can see a zoomed-in view of that random sample. Alongside the view of the sample you can see the counts for red and blue, and how they compare to the actual proportions in the entire population. The size of the square can be adjusted by moving the slider with the label "sample size".
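The idea behind the toy can be sketched roughly as follows. This is a guess at the logic, not the toy's actual source: the names `make_population` and `sample_square`, the 60% blue share, and the position of the square are all made up for the example.

```python
import random

SIDE = 300  # the population is SIDE x SIDE pixels, as in the toy

def make_population(blue_share, rng):
    """Return a shuffled SIDE x SIDE grid of 'blue' / 'red' pixels."""
    n_pixels = SIDE * SIDE
    n_blue = round(blue_share * n_pixels)
    pixels = ["blue"] * n_blue + ["red"] * (n_pixels - n_blue)
    rng.shuffle(pixels)  # the "Shuffle population" step
    # Cut the flat list into rows to form the grid.
    return [pixels[row * SIDE:(row + 1) * SIDE] for row in range(SIDE)]

def sample_square(grid, top, left, size):
    """Count the colours inside a size x size square whose corner is at (top, left)."""
    counts = {"blue": 0, "red": 0}
    for row in grid[top:top + size]:
        for pixel in row[left:left + size]:
            counts[pixel] += 1
    return counts

rng = random.Random(42)
grid = make_population(blue_share=0.6, rng=rng)
counts = sample_square(grid, top=100, left=50, size=30)  # 900 sampled pixels
sampled_share = counts["blue"] / (counts["blue"] + counts["red"])
print(f"blue in sample: {sampled_share:.1%} (population: 60.0%)")
```

Because the grid is shuffled, any position of the square is as good as any other, which is why dragging the square gives you a fresh random sample each time.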

Sampling on Wikipedia

The simulator

The simulator at the top of this page works by creating a population of a given size and then assigning an opinion, that is, a supported candidate, to each individual in that population. The proportions of support assigned to the different candidates are set in the "Set actual supports" form.

It then simulates a number of polls (as set by "# polls to run"). In each poll it picks a random sample (whose size is given by "sample size") from the population and counts the support for the different candidates in that sample. The results are shown in the chart, where you can compare actual and polled support.
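The steps described above could look something like this in Python. It is a minimal sketch under stated assumptions, not the simulator's real code: the function names, the three candidates "A", "B" and "C", and their support figures are hypothetical.

```python
import random
from collections import Counter

def build_population(actual_support, population_size):
    """Assign each individual a candidate so the shares match actual_support."""
    candidates = list(actual_support)
    population = []
    for name in candidates[:-1]:
        population += [name] * round(actual_support[name] * population_size)
    # The last candidate gets whoever is left, so the sizes add up exactly.
    population += [candidates[-1]] * (population_size - len(population))
    return population

def run_polls(population, candidates, sample_size, n_polls, rng):
    """Run n_polls polls and return the polled share per candidate for each one."""
    results = []
    for _ in range(n_polls):
        sample = rng.sample(population, sample_size)  # random, without replacement
        counts = Counter(sample)
        results.append({name: counts[name] / sample_size for name in candidates})
    return results

# Hypothetical settings, standing in for the form fields on the page.
actual_support = {"A": 0.45, "B": 0.35, "C": 0.20}
rng = random.Random(2024)
population = build_population(actual_support, population_size=100_000)
polls = run_polls(population, list(actual_support), sample_size=1000,
                  n_polls=200, rng=rng)

for name, actual in actual_support.items():
    average = sum(poll[name] for poll in polls) / len(polls)
    print(f"{name}: actual {actual:.1%}, average polled {average:.1%}")
```

Changing `population_size` while keeping `sample_size` fixed is an easy way to repeat the experiment suggested on this page.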

This is not a scientific tool. It is an app to maybe convince you, or someone you know, that polls can be reasonably accurate, even if they poll only a relatively small sample from a large population. But be aware that, unlike in the real world, the sampling here is truly random over the entire population. This is important for the accuracy of the results, and in the real world it may be more difficult to achieve.

You can play around with the settings and see how they affect the results. Maybe you will be surprised by what you see. For example, it may surprise you to see that the size of the population from which you sample doesn't really have a big effect on the accuracy of the results, even when the sample size stays the same.

After you have run a simulation, the histograms with the green and red bars will show how the results from the polls are distributed around the true value. Green indicates that the poll result is within the margin of error, while red indicates that it is outside of it. You can click on a candidate's bar to see more details on the results. You can also change the margin of error and see how that affects the proportion of results within and outside of it.
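The green/red split can be sketched like this. Again, this is an illustration and not the simulator's own code. To keep it short, each simulated respondent supports the candidate with a fixed probability, which amounts to assuming a very large population; the 45% true support, the function names, and the seed are made up for the example.

```python
import random

def simulate_polled_shares(true_support, sample_size, n_polls, rng):
    """Return the polled share of one candidate in each of n_polls polls."""
    shares = []
    for _ in range(n_polls):
        # Assumption: each respondent independently supports the candidate
        # with probability true_support (a very large population).
        supporters = sum(rng.random() < true_support for _ in range(sample_size))
        shares.append(supporters / sample_size)
    return shares

def share_within_margin(polled_shares, true_support, margin):
    """Fraction of polls whose result lies within +/- margin of the true value."""
    inside = sum(abs(s - true_support) <= margin for s in polled_shares)
    return inside / len(polled_shares)

rng = random.Random(7)
shares = simulate_polled_shares(true_support=0.45, sample_size=1000,
                                n_polls=500, rng=rng)
within = {margin: share_within_margin(shares, 0.45, margin)
          for margin in (0.01, 0.02, 0.03)}
for margin, fraction in within.items():
    print(f"margin +/-{margin:.0%}: {fraction:.0%} green, {1 - fraction:.0%} red")
```

Widening the margin of error moves polls from red to green, which is the same effect you see when you change the margin of error in the simulator.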