jimhabegger wrote:Steersman, I spoke too soon. I did some more weighting experiments, and saw something that I had forgotten: The way that Pew does the weighting, If there are any sample biases that are correlated with people's answers to the questions, and independent of the demographics, then the weighting won't correct those biases in the answers, at all, and I don't think that their margins of error account for that at all.
I would be interested to see how you did any simulations of that, particularly as I don’t think you’ve given anything to justify your suggestion. It’s all fine and dandy to question arguments and data, but those questions have to be backed up with some data themselves.
But you said some things earlier about margins of error that motivated me to look into some of the math behind sampling in general, and Pew’s results and report in particular:
jimhabegger wrote:- I might simply be wrong in thinking that the chances of the sample percentages being within the reported margins of error of the population percentages are no better than random.
As I think I’ve argued or we’ve discussed, there seems to be some variation in the possible margins of error depending both on the standard deviation of the possible samples, and the actual population value in question – typically, at least for Pew’s survey, what percentage of the population say either yay or nay on any particular question.
It’s all too easy for us - myself included - to make various conclusions or assertions about a sample if we don’t understand what’s happening underneath the hood – lies, damned lies, and statistics. And all that. :-) Hence my going back to a few “first principles” to get a better handle on what various terms and claims really mean, and where they come from – even if only for my own benefit. While there are many aspects I still don’t have a good understanding of, and I expect some of my arguments or assumptions are a bit iffy or vague, I hope they’re more or less on the money. And they certainly seem consistent.
But the starting point is a really simple case of a population of 30 people, of which 30% (9) vote “yes” on any given question. And that entire population of 30 is tested or estimated by a smaller sample population of 10 which may or may not have the same percentage of people voting “yes”. But if it does match then there will be 3 among the 10 who have done so. So the questions then become, how many possible samples of 10 from a population of 30 will there be that have precisely 3 voting yes, and how many other samples will there be of 0, 1, 2, or 4 through 10 people voting “yes” and the balance (of the 10) voting "no"?
And, through the magic of the binomial coefficient (= n! / ((n-k)! k!)), we find that there are about 30 million possible combinations, in total, of the 30 taken 10 at a time. And, of those 30 million, about 10 million combinations have precisely 30% of the sample voting “yes”, with a smaller number of samples for each of the other possible counts of “yes” votes. The following table summarizes that information with the last column giving the “yeses” and “noes” in the sample population (of 10):
https://i.imgur.com/jhTcB8G.jpg
A graph summarizing that information and based on it:
https://i.imgur.com/vO6s58z.jpg
Maybe nothing too magical about the foregoing, although I think it serves to emphasize the fact that probability estimates are based on a set of possibilities that can be quite readily calculated.
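For anyone who wants to check those counts, here is a minimal Python sketch. The variable and function names are my own, and it assumes the setup described above: 9 “yes” voters in a population of 30, sampled 10 at a time without replacement. Each count is the number of ways to pick k of the 9 “yes” voters times the number of ways to pick the remaining 10-k from the 21 “no” voters.

```python
from math import comb

POPULATION = 30   # total people in the population
YES_IN_POP = 9    # 30% of the population vote "yes"
SAMPLE = 10       # people drawn per sample, without replacement

def count_samples(k):
    """Number of distinct samples of SAMPLE people with exactly k "yes" voters."""
    return comb(YES_IN_POP, k) * comb(POPULATION - YES_IN_POP, SAMPLE - k)

# Total number of ways to take 10 people from 30.
total = comb(POPULATION, SAMPLE)

# Count of samples for every possible number of "yes" voters in the sample.
counts = {k: count_samples(k) for k in range(SAMPLE + 1)}

for k, c in counts.items():
    print(f"{k:2d} yes / {SAMPLE - k:2d} no: {c:>10,d} samples ({c / total:.1%})")
print(f"total: {total:,d}")
```

The total comes to 30,045,015 and the 3-yes row to 9,767,520, which are the "about 30 million" and "about 10 million" figures above. A sample of 10 "yes" voters is impossible here because the population only has 9 of them, so that row is zero.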
But the next step is to consider a slightly larger population size (300) and sample size (100) to get a handle on what the distribution might look like (below) and what a relevant standard deviation for it is:
https://i.imgur.com/naWTMXD.jpg
Of particular note is that the counts of possible samples (some 10^80 possible combinations of people taken 100 at a time from a population of 300) are plotted in red, while a normal distribution with a mean of 30 (30% of the sample of 100) and a standard deviation of 4 is plotted in blue. Also, there are two vertical lines at 0.26 and 0.22, which correspond to one and two standard deviations (of 4) below the mean. The horizontal lines at 0.606 and 0.135 correspond to the heights of the curve, as fractions of its height at the mean, at counts of 26 (=30-4) and 22 (=30-8), the one and two standard deviation offsets.
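A rough way to check those numbers, as a sketch rather than a reconstruction of how the plot was made: the names below are my own, and I am assuming 90 “yes” voters in the population of 300. It compares the plain Sqrt[p*(1-p)*n] value with the same value scaled by the usual finite-population correction for sampling without replacement, and compares the exact sample counts at 26 and 22 (relative to the count at 30) with the normal-curve heights for a standard deviation of 4.

```python
from math import comb, exp, sqrt

N, K, n = 300, 90, 100   # population, "yes" voters in it (30%), sample size
MEAN = 30                # expected "yes" count in a sample of 100
SD_PLOTTED = 4           # standard deviation used for the normal curve above

def samples_with(k):
    """Number of distinct samples of n people containing exactly k "yes" voters."""
    return comb(K, k) * comb(N - K, n - k)

p = K / N
sd_binomial = sqrt(n * p * (1 - p))                 # ignores the finite population
sd_finite = sd_binomial * sqrt((N - n) / (N - 1))   # sampling without replacement
print(f"SD, binomial formula:          {sd_binomial:.2f}")
print(f"SD, finite-population version: {sd_finite:.2f}")

total = comb(N, n)
print(f"possible samples: roughly 10^{len(str(total)) - 1}")

peak = samples_with(MEAN)
ratios = {}
for k in (26, 22):
    z = (k - MEAN) / SD_PLOTTED
    # (exact count relative to the peak, normal-curve height relative to its peak)
    ratios[k] = (samples_with(k) / peak, exp(-z * z / 2))
    print(f"k={k}: exact {ratios[k][0]:.3f}, normal curve {ratios[k][1]:.3f}")
```

The normal-curve heights are exp(-1/2) and exp(-2), which is where the 0.606 and 0.135 lines come from. The finite-population value lands a little under 4 and the plain binomial value a little over 4.5, so a plotted standard deviation of 4 sits between the two.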
Also to note is that because the standard deviation is in counts (the number of yeses either side of the mean), the error relative to the mean (30 in the case above) is going to depend on the actual mean. For instance, with a mean of 30 and a standard deviation of 4, the error at two standard deviations is +/- 8/30 = +/- 27%. But if the mean is 80 then the error is +/- 8/80 = +/- 10%. While there is some variation of the standard deviation with the population value (it goes to zero at the extremes, 0% and 100%), the variation in the value over the midrange (0.1 to 0.9) seems of less significance than the nominal or mean value. The following shows results of the calculation for the standard deviation, S.D. = Sqrt[p*(1-p)*n], for a range of sample sizes and 3 particular percentages of a population (p, population means, or averages) making a particular binary choice:
https://i.imgur.com/bNwktOr.jpg
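The same calculation can be sketched in a few lines of Python. The sample sizes below are illustrative picks of my own, not necessarily the ones in the graph, and I am assuming the three percentages are the 0.3, 0.5 and 0.8 used in the later plots.

```python
from math import sqrt

def binomial_sd(p, n):
    """Standard deviation, in counts, of the number of "yes" answers in a sample of n."""
    return sqrt(p * (1 - p) * n)

P_VALUES = (0.3, 0.5, 0.8)              # assumed population fractions voting "yes"
SAMPLE_SIZES = (100, 400, 1000, 2500)   # illustrative sample sizes only

# table[n][p] holds the standard deviation for that sample size and fraction.
table = {n: {p: binomial_sd(p, n) for p in P_VALUES} for n in SAMPLE_SIZES}

print("    n " + "".join(f"   p={p:.1f}" for p in P_VALUES))
for n, row in table.items():
    print(f"{n:5d} " + "".join(f"{sd:8.2f}" for sd in row.values()))
```

Because the standard deviation grows with the square root of n, quadrupling the sample size only doubles it, which is why the error relative to the mean count shrinks as samples get larger.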
And, finally, some plots of the distributions for 3 different p values (0.3, 0.5, & 0.8) for a sample size of 1000 and a total population of 1 million. Note that the actual distributions, based on the binomial coefficients, closely match the plotted equation for the normal distribution, and that the standard deviations are close to the ones shown in the graph above:
https://i.imgur.com/dF1PtKZ.jpg
https://i.imgur.com/cGbhPlb.jpg
https://i.imgur.com/Nc2Zjvy.jpg
So, even though the standard deviations are fairly close (a maximum of 16 at p=0.5 to a minimum of 13 at p=0.8), the magnitude of the error, taken as two standard deviations relative to the mean count, varies from a minimum of 3.3% at p=0.8 to a maximum of 10% at p=0.3.
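A short sketch of that arithmetic, again with names of my own choosing and assuming "error" means two standard deviations either side of the mean. For comparison it also prints the same two-standard-deviation band as a share of the whole sample of 1000, in percentage points; whether that second figure is what Pew means by "points" is not something this sketch settles.

```python
from math import sqrt

n = 1000   # sample size used in the three plots above

results = {}
for p in (0.3, 0.5, 0.8):
    mean = p * n                        # expected number of "yes" answers
    sd = sqrt(p * (1 - p) * n)          # standard deviation in counts
    relative = 2 * sd / mean            # two SDs as a fraction of the mean count
    points = 2 * sd / n * 100           # two SDs as percentage points of the sample
    results[p] = (sd, relative, points)
    print(f"p={p}: SD={sd:.1f}, 2*SD/mean={relative:.1%}, 2*SD/n={points:.1f} points")
```

With unrounded standard deviations the relative figures come out slightly below the 10%, 6.4% and 3.3% quoted here, which use standard deviations rounded to 15, 16 and 13.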
Which then raises a few questions about Pew’s methodology or the interpretation to put on their results, particularly their margin of error and the units (“points”) they use. One is tempted to say that their typical values, somewhere in the vicinity of 5 “points” (whatever that means), are equivalent to the 6.4% figure for p=0.5 (two standard deviations of 16 on a mean of 500). However, they seem to have lumped their weightings into those margins of error they listed without actually specifying, that I can see, how those weightings affected the base values, which presumably have a more solid justification as indicated above.
So, still a bit of a puzzle. However, I think it reasonable to argue that the results probably have the underlying variations or margins suggested by the theoretical considerations described above, and that the effects of weighting are largely unknown. That said, I expect Pew’s nominal values are probably fairly accurate, and that various margins of error, including those from weightings, could just as easily make the expected values better or worse than those suggested.