If thou gazest long into the Pit, the Pit will also gaze into thee

Posted: **Wed Oct 18, 2017 3:47 pm**

Recently I've written to some highly qualified people in survey research, trying to find out what reasons anyone might have for thinking that the margins of error, in reports of opinion polls about religious views, are actually true. So far I've only heard from a few of them, and all I've seen from them is wishful thinking and blind faith. One of them gave me a good idea though, for search terms: demographic predictors of religious views. If there is some correlation between demographics and religious views, then it might be possible to calculate genuine confidence intervals on weighted percentages from the samples.

Until now, when I've seen people misquoting and misrepresenting opinion polls from Muslim populations, I've never thought it would do any good to correct them, because I don't trust the opinion polls themselves. I've just been ignoring them. Now I see some possible value in them.

- I see some value in the sample percentages themselves, no matter what the population percentages are. I see value in knowing how some people answered the questions.

- Even though I don't see any reason to think that the sample percentages are within the reported margins of error of the population percentages, they still might be close enough to have some value.

- Even if the percentages are all wrong for the populations, there still could be some value in correcting misquotes and misrepresentations.

- I might simply be wrong in thinking that the chances of the sample percentages being within the reported margins of error of the population percentages are no better than random.

Posted: **Wed Oct 18, 2017 7:23 pm**

I volunteered to be available for polls and market research at least 7 years ago, so I get "polled" fairly regularly, especially in run-ups to elections, and I answered 4-5 in the run-up to Brexit.

I am in very little doubt that polls are designed to get the answers they want. For example, the Brexit polls were asking about immigration but nothing on the economy and how the two relate. I am pretty sure if you read the answers I gave you would just come away with the impression that I was xenophobic, racist and a "little Englander".

Another thing about the Brexit polls was that I chatted up the pollsters. There is nothing in the rules about them answering your questions, so I asked how they thought it was going, and they either said that they thought more people were in favour of Brexit or that they were confident a lot more people were for Brexit. (One even told me that young people seemed to be for remain while anyone over 40 seemed to be for leaving).

And yet all the polls were saying that Remain would win. So you have to figure that they were compensating badly because they are bad at it or because they wanted the poll as a tool to influence the vote.

When it comes to religious polling, the only use I see is if you are polling the same questions over a period of decades and comparing it to solid data such as church attendance.

Posted: **Wed Oct 18, 2017 9:16 pm**

Vicky, I think that popular opinion polls, with their loaded and leading questions, are designed to tell stories, with little or no regard for how well they represent what people actually think. That's part of what I think makes them worse than useless: false pictures of what people think, masquerading as science.

Posted: **Thu Oct 19, 2017 12:02 am**

I've been wondering if I could safely imagine any limit to how far the sample percentages in a religious opinion poll could be from the population percentages. Just now I looked for the biggest errors I could find in election polls, which have been empirically tested, and the biggest was 20%. How far off target could the percentages be in religious opinion polls, which have never had any empirical way of being tested at all? 30%? 40%? 50%? I've searched and searched on the Internet for what possible grounds anyone could have to stand on, for putting any limits at all on how far off their sample percentages could be from population percentages in religious opinion polls, and all I've found was wishful thinking and blind faith, ignoring and denying even the most widely recognized sources of error. The three responses I've received to that question, from 25 or more researchers, have not been any better than that.

In spite of all that, I think now that there might be some value, sometimes, in correcting misquotes and misrepresentations of those polls, however far the reported percentages might be from the population percentages, as long as I make it clear that I have no confidence at all, myself, in those percentages.

Posted: **Sat Oct 21, 2017 8:23 am**

This is one of the reasons why I don't trust opinion polls to tell us anything about how many people think what.

Developing Standards for Post-Stratification Weighting in Population-Based Survey Experiments

Posted: **Sat Oct 21, 2017 6:37 pm**

jimhabegger wrote:Recently I've written to some highly qualified people in survey research, trying to find out what reasons anyone might have for thinking that the margins of error, in reports of opinion polls about religious views, are actually true. So far I've only heard from a few of them, and all I've seen from them is wishful thinking and blind faith. One of them gave me a good idea though, for search terms: demographic predictors of religious views. If there is some correlation between demographics and religious views, then it might be possible to calculate genuine confidence intervals on weighted percentages from the samples.

Until now, when I've seen people misquoting and misrepresenting opinion polls from Muslim populations, I've never thought it would do any good to correct them, because I don't trust the opinion polls themselves. I've just been ignoring them. Now I see some possible value in them.

- I see some value in the sample percentages themselves, no matter what the population percentages are. I see value in knowing how some people answered the questions.

- Even though I don't see any reason to think that the sample percentages are within the reported margins of error of the population percentages, they still might be close enough to have some value.

- Even if the percentages are all wrong for the populations, there still could be some value in correcting misquotes and misrepresentations.

- I might simply be wrong in thinking that the chances of the sample percentages being within the reported margins of error of the population percentages are no better than random.

As I've said several times, I don't think you really understand how sampling works, and I don't think you've made much effort to rectify that.

While statistics and sampling are complex topics, and I sure don't have a complete grasp of all of the nuances and details, the essence seems quite credible and straightforward. More specifically, the premise - and it is a bit of an assumption, although hardly "wishful thinking and blind faith" - is apparently that if a population's response to a given question follows a normal distribution, and if the sampling is truly random, then the sample is going to be a "reasonably accurate" representation of the entire population - 19 times out of 20 within a 5% margin of error is, I think, the standard way of stating that fact.
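As a rough illustration of the "19 times out of 20" statement, here is a minimal sketch of the textbook margin of error for a proportion. It assumes simple random sampling and the normal approximation, with no weighting at all; the function name `margin_of_error` and the sample sizes are made up for the example and do not come from any particular poll.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the approximate 95% interval for a sample proportion.

    Assumes a simple random sample of size n and the normal approximation;
    z=1.96 is the usual two-sided 95% ("19 times out of 20") multiplier.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes, worst case p = 0.5
for n in (100, 385, 1000):
    print(n, round(100 * margin_of_error(0.5, n), 1), "points")
```

Under those assumptions a sample of about 385 is what gives the familiar 5-point margin. Whether the assumptions hold for a real survey is a separate question.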

It is maybe a reasonable question to ask whether those assumptions are valid, or if any weightings were justified as your last comment suggests. But absent some evidence that they are not then one has more than a little reason to think the survey of a population provides a "reasonably" accurate synopsis of the values and positions of the entire population itself.

Kind of looks like you're engaged in some "motivated reasoning", in a rather desperate attempt at trying to whitewash Islam and Muslims.

Posted: **Sun Oct 22, 2017 6:26 am**

Steersman, I sent you a PM, and I posted a message to you in the thread about reducing and counteracting anti-Muslim prejudices.

Posted: **Wed Oct 25, 2017 7:28 am**

Steersman, I finally found a way to do some weighting experiments, without spending dozens of hours on it. One demographic variable to use for weighting, and one question variable, to see the effects of the weighting. As often as not, the weighting magnified some or all of the errors, but never by more than 40% of the error. For example, if the error in the unweighted question percentage was 5%, the error in the weighted question percentage wouldn't be more than 7%.

I'm satisfied now with considering the population percentages as being within twice Pew's reported margins of error, which in Pew's World's Muslims survey would never be outside of +/- 13 points.

Posted: **Thu Oct 26, 2017 12:55 am**

Steersman, I spoke too soon. I did some more weighting experiments, and saw something that I had forgotten: the way that Pew does the weighting, if there are any sample biases that are correlated with people's answers to the questions, and independent of the demographics, then the weighting won't correct those biases in the answers at all, and I don't think that their margins of error account for that at all.
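The point about answer-correlated bias can be sketched with a small expected-value calculation. This is not the experiment described above and not Pew's iterative procedure; it is a hypothetical two-group example with simple post-stratification on one demographic variable. All the shares, rates and response propensities are invented for illustration.

```python
# Hypothetical population: two demographic groups with different "yes" rates.
pop_share = {"A": 0.4, "B": 0.6}        # true population shares
true_yes = {"A": 0.30, "B": 0.50}       # true "yes" rate within each group
group_response = {"A": 2.0, "B": 1.0}   # group A is twice as likely to respond

def expected_estimates(yes_response_boost):
    """Expected unweighted and post-stratified "yes" share in the sample.

    yes_response_boost: how much more likely "yes" people are to respond
    than "no" people, in every group alike (independent of demographics).
    """
    cells = {}
    for g in pop_share:
        base = pop_share[g] * group_response[g]
        cells[(g, "yes")] = base * true_yes[g] * yes_response_boost
        cells[(g, "no")] = base * (1 - true_yes[g])
    total = sum(cells.values())
    unweighted = sum(v for (g, a), v in cells.items() if a == "yes") / total
    # Post-stratification: reweight each group back to its population share.
    weighted = 0.0
    for g in pop_share:
        group_total = cells[(g, "yes")] + cells[(g, "no")]
        weighted += pop_share[g] * cells[(g, "yes")] / group_total
    return unweighted, weighted

truth = sum(pop_share[g] * true_yes[g] for g in pop_share)
for boost in (1.0, 1.5):
    u, w = expected_estimates(boost)
    print(f"boost={boost}: truth={truth:.3f} unweighted={u:.3f} weighted={w:.3f}")
```

With no answer-related bias (boost 1.0) the weighting recovers the true 42% exactly. With the boost at 1.5 the weighted figure stays well above the truth, and in this made-up case it lands further from the truth than the unweighted one.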

Posted: **Sun Oct 29, 2017 1:30 am**

jimhabegger wrote:Steersman, I spoke too soon. I did some more weighting experiments, and saw something that I had forgotten: the way that Pew does the weighting, if there are any sample biases that are correlated with people's answers to the questions, and independent of the demographics, then the weighting won't correct those biases in the answers at all, and I don't think that their margins of error account for that at all.

Would be interested to see how you did any simulations of that, particularly as I don’t think you’ve given anything to justify your suggestion. All fine and dandy to question arguments and data, but the questions have to be backed up with some data themselves.

But you said some things earlier about margins of error that motivated me to look into some of the math behind sampling in general, and Pew’s results and report in particular:

jimhabegger wrote:- I might simply be wrong in thinking that the chances of the sample percentages being within the reported margins of error of the population percentages are no better than random.

As I think I’ve argued or we’ve discussed, there seems to be some variation in the possible margins of error depending both on the standard deviation of the possible samples, and the actual population value in question – typically, at least for Pew’s survey, what percentage of the population say either yay or nay on any particular question.

It’s all too easy for us - myself included - to make various conclusions or assertions about a sample if we don’t understand what’s happening underneath the hood – lies, damned lies, and statistics. And all that. :-) Hence my going back to a few “first principles” to get a better handle on what various terms and claims really mean, and where they come from – even if only for my own benefit. While there are many aspects I still don’t have a good understanding of, and I expect some of my arguments or assumptions are a bit iffy or vague, I hope they’re more or less on the money. And they certainly seem consistent.

But the starting point is a really simple case of a population of 30 people, of which 30% (9) vote “yes” on any given question. And that entire population of 30 is tested or estimated by a smaller sample population of 10 which may or may not have the same percentage of people voting “yes”. But if it does match then there will be 3 among the 10 who have done so. So the questions then become, how many possible samples of 10 from a population of 30 will there be that have precisely 3 voting yes, and how many other samples will there be of 0, 1, 2, or 4 through 10 people voting “yes” and the balance (of the 10) voting "no"?

And, through the magic of the binomial coefficient (C(n,k) = n! / ((n-k)! k!)), we find that there are about 30 million possible combinations, in total, of the 30 taken 10 at a time. And that, of those 30 million, there are about 10 million combinations that have precisely 30% of that sample population voting “yes”, with a smaller number of samples at each of the other possible counts of “yes” votes. The following table summarizes that information with the last column giving the “yeses” and “noes” in the sample population (of 10):

https://i.imgur.com/jhTcB8G.jpg
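The counts in that table can be reproduced with a few lines. This is a sketch using Python's standard `math.comb`; the variable names are my own. A sample with k "yes" voters is built by choosing k of the 9 "yes" voters and 10-k of the 21 "no" voters.

```python
from math import comb

N, K, n = 30, 9, 10   # population size, "yes" voters in population, sample size

total = comb(N, n)    # all possible samples of 10 from 30

# Number of samples containing exactly k "yes" voters
counts = {k: comb(K, k) * comb(N - K, n - k) for k in range(0, min(K, n) + 1)}

# Every sample falls into exactly one of these bins
assert sum(counts.values()) == total

print("total samples:", total)
for k, c in counts.items():
    print(f"{k} yes / {n - k} no: {c:>9} samples ({c / total:.1%})")
```

Note that with only 9 "yes" voters in the population, a sample of 10 cannot contain 10 of them, so the possible counts run from 0 to 9.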

A graph summarizing that information and based on it:

https://i.imgur.com/vO6s58z.jpg

Maybe nothing too magical about the foregoing, although I think it serves to emphasize the fact that probability estimates are based on a set of possibilities that can be quite readily calculated.

But the next step is to consider a slightly larger population size (300) and sample size (100) to get a handle on what the distribution might look like – below – and what the relevant standard deviation of it is:

https://i.imgur.com/naWTMXD.jpg

Of particular note is that the sample populations (population samples) – on the order of 10^81 possible combinations of people taken 100 at a time from a population of 300 – are plotted in red while a normal distribution, with a mean of 30 – 30% of the sample population of 100 – and a standard deviation of 4, is plotted in blue. Also, there are two vertical lines at 0.26 and 0.22 which correspond to minus one and minus two standard deviations (of 4 each) from the mean. The horizontal lines at 0.606 and 0.135 correspond to the number of sample populations – as fractions of the number at the mean – which have counts of 26 (=30-4) and 22 (=30-8) at those standard deviation offsets.
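For anyone who wants to check those figures, here is a sketch that computes the exact distribution for a population of 300 with 90 "yes" voters and a sample of 100, then compares it with the normal curve described above. The variable names are my own. The exact counts come from sampling without replacement, so their standard deviation is a little below the rounded value of 4.

```python
from math import comb, exp, sqrt

N, K, n = 300, 90, 100   # population, "yes" voters (30%), sample size

total = comb(N, n)
print("possible samples: about 10^%d" % (len(str(total)) - 1))

# Exact probability of k "yes" voters in the sample (sampling without replacement)
pmf = {k: comb(K, k) * comb(N - K, n - k) / total for k in range(0, min(K, n) + 1)}

mean = sum(k * p for k, p in pmf.items())
sd = sqrt(sum((k - mean) ** 2 * p for k, p in pmf.items()))
print(f"exact mean = {mean:.2f}, exact SD = {sd:.2f}")

# Heights relative to the peak at 30, exact versus a normal curve with SD 4
for k in (26, 22):
    exact_ratio = pmf[k] / pmf[30]
    normal_ratio = exp(-0.5 * ((k - 30) / 4) ** 2)
    print(f"k={k}: exact {exact_ratio:.3f}, normal (SD=4) {normal_ratio:.3f}")
```

The normal-curve heights at one and two standard deviations are exp(-0.5) and exp(-2), which is where the 0.606 and 0.135 lines come from.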

Also to note is that because the standard deviation is in counts – the number of yeses either side of the mean – the error relative to the mean – 30 in the case above – is going to depend on the actual mean. For instance, with a mean of 30 and a standard deviation of 4, the error at two standard deviations is +/- 8/30 = +/- 27%. But if the mean is 80, with the same standard deviation, then the error is +/- 8/80 = +/- 10%. While there is some variation of the standard deviation with the population value – it goes to zero at the extremes (0% & 100%) – the variation in the value over the midrange (0.1 to 0.9) seems of less significance than the nominal or mean value. The following shows results of the calculation for the standard deviation [S.D. = Sqrt(n*p*(1-p))] for a range of sample sizes and 3 particular percentages of a population (p, population means, or averages) making a particular binary choice:

https://i.imgur.com/bNwktOr.jpg
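A sketch of that standard deviation calculation follows. The sample sizes are my own picks, and I am assuming the three percentages are the 0.3, 0.5 and 0.8 used later in the post, so the numbers may not line up row for row with the image. The helper name `binomial_sd` is hypothetical.

```python
from math import sqrt

def binomial_sd(n, p):
    """Standard deviation, in counts, of the number of "yes" answers
    in a sample of size n when a fraction p of the population says "yes"."""
    return sqrt(n * p * (1 - p))

p_values = (0.3, 0.5, 0.8)          # assumed percentages
sample_sizes = (100, 500, 1000, 2000)  # hypothetical sample sizes

print("n     " + "  ".join(f"p={p}" for p in p_values))
for n in sample_sizes:
    print(f"{n:<5} " + "  ".join(f"{binomial_sd(n, p):5.1f}" for p in p_values))
```

The standard deviation grows with the square root of the sample size, which is why quadrupling the sample only halves the error as a fraction of the sample.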

And, finally, some plots of the distributions for 3 different p values (0.3, 0.5, & 0.8) for a sample size of 1000 and a total population of 1 million. Note the actual distributions, based on the binomial coefficients, closely match the plotted equation for the normal distribution, and that standard deviations are close to the ones shown in the graph above:

https://i.imgur.com/dF1PtKZ.jpg

https://i.imgur.com/cGbhPlb.jpg

https://i.imgur.com/Nc2Zjvy.jpg

So, even though the standard deviations are fairly close – from a maximum of 16 at p=0.5 to a minimum of 13 at p=0.8 – the magnitude of the error (two standard deviations) relative to those p values varies from a minimum of 3.3% at p=0.8 to a maximum of 10% at p=0.3.
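The same two-standard-deviation error can be expressed two ways: relative to the expected count (the percentages quoted above) or as a share of the whole sample, which is what a margin in percentage points would normally mean. The sketch below shows both for a sample of 1000. It uses unrounded standard deviations, so the relative figures come out slightly different from the ones above, which are based on rounded values of 16 and 13. The function name is my own.

```python
from math import sqrt

def two_sd_error(n, p):
    """Return the two-standard-deviation error for a sample of size n as
    (percent relative to the expected count n*p, percentage points of the sample)."""
    two_sd = 2 * sqrt(n * p * (1 - p))
    relative_pct = 100 * two_sd / (n * p)
    points = 100 * two_sd / n
    return relative_pct, points

n = 1000
for p in (0.3, 0.5, 0.8):
    rel, pts = two_sd_error(n, p)
    print(f"p={p}: +/- {rel:.1f}% of the mean, +/- {pts:.1f} percentage points")
```

Measured in percentage points the error is largest at p=0.5 and shrinks toward either extreme, while relative to the mean it keeps growing as p gets smaller.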

Which then raises a few questions about Pew’s methodology or the interpretation to put on their results, particularly their margin of error and the units (“points”) they use. One is tempted to say that their typical values – somewhere in the vicinity of 5 “points” (whatever that means) – are equivalent to the 6.4% figure for p=0.5 above. However, they seem to have lumped their weightings into the margins of error they listed without actually specifying, that I can see, how those weightings affected the base values, which presumably have a more solid justification as indicated above.

So, still a bit of a puzzle. However, I think it reasonable to argue that the results probably have the underlying variations or margins as suggested by the theoretical considerations described above, and that the effects of weighting are largely unknown. However, that said, I expect Pew’s nominal values are probably fairly accurate, and that various margins of error, including those from weightings, could just as easily make the expected values better or worse than those suggested.

Posted: **Sun Oct 29, 2017 4:25 am**

Steersman, well done!

"... the effects of weighting are largely unknown."

Exactly.

Actually, I would have more confidence in Pew's percentages if they did not do this, which I think is purely for cosmetic purposes, and which I think is just as likely to move the opinion percentages away from population percentages, as to move them closer:

"Additionally, where appropriate, data were weighted through an iterative procedure to more closely align the samples with official population figures for characteristics such as gender, age, education and ethnicity."

It's that weighting which leaves me at a loss about what kind of confidence I can have in projecting those percentages onto the populations.

Anyway, I've decided that if I ever get into any more discussions about Pew polls, I'll just use their percentages, with a disclaimer that I personally have no idea if they apply to the populations or not.

Posted: **Sun Oct 29, 2017 8:17 pm**

jimhabegger wrote:Steersman, well done!

Thanks. Took a little more time and effort than I thought.

jimhabegger wrote:"... the effects of weighting are largely unknown."

Exactly.

Actually, I would have more confidence in Pew's percentages if they did not do this, which I think is purely for cosmetic purposes, and which I think is just as likely to move the opinion percentages away from population percentages, as to move them closer:

I rather doubt they would be doing that just for "cosmetic purposes" - one might suggest your biases are showing a bit if not hanging out in the cold.

jimhabegger wrote:"Additionally, where appropriate, data were weighted through an iterative procedure to more closely align the samples with official population figures for characteristics such as gender, age, education and ethnicity."

It's that weighting which leaves me at a loss about what kind of confidence I can have in projecting those percentages onto the populations.

Maybe they're a little vague about the reasons for doing that, and the methods they employ, but I kind of get the impression from the various articles I've read - mostly Wikipedia - that there is plenty of justification for the principle at least. For instance, if a population is made up of, say, 30% Shia and 70% Sunni then it might be wise to ensure that one's sampling follows the same profile, or that the results should be adjusted to compensate if it doesn't. Again, it looks like your biases are showing.
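To make the 30% Shia and 70% Sunni example concrete, here is a minimal sketch of that kind of adjustment. It assumes simple post-stratification on a single variable, which is only one possible scheme and is not claimed to be what Pew does. The sample counts and "yes" counts are invented.

```python
# Known (or assumed) population shares
population_share = {"Shia": 0.30, "Sunni": 0.70}

# Hypothetical sample that over-represents Shia respondents
sample_counts = {"Shia": 500, "Sunni": 500}
sample_yes = {"Shia": 200, "Sunni": 300}   # hypothetical "yes" answers per group

n = sum(sample_counts.values())

# Weight = population share / sample share, so each group counts in proportion
# to its size in the population rather than its size in the sample
weights = {g: population_share[g] / (sample_counts[g] / n) for g in population_share}

unweighted = sum(sample_yes.values()) / n
weighted = (sum(weights[g] * sample_yes[g] for g in weights)
            / sum(weights[g] * sample_counts[g] for g in weights))

print("weights:", {g: round(w, 2) for g, w in weights.items()})
print(f"unweighted yes: {unweighted:.1%}, weighted yes: {weighted:.1%}")
```

In this made-up case the Shia answers are scaled down to 0.6 each and the Sunni answers up to 1.4 each, which moves the estimate from 50% to 54%.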

jimhabegger wrote:Anyway, I've decided that if I ever get into any more discussions about Pew polls, I'll just use their percentages, with a disclaimer that I personally have no idea if they apply to the populations or not.

Looks rather disingenuous at best if not intellectually dishonest to do that. You might instead try referring to the expected standard deviations given the sample sizes, and to how that theoretically affects the margins of error. And suggest that weighting is at least designed to minimize increased error margins. While the standard deviations are apparently not the whole story - seems the Pew report has different values for the margins of error for different countries even though sample sizes are close - they certainly seem to be a major part of it.

Possible uses for opinion polls
