One of the biggest challenges in statistics is collecting a representative sample: finding a subset of the population that will do a good job of approximating the whole group. When a dataset contains a lot of sampling bias and is not reflective of the general population, it is essentially worthless as a guide. That cannot be fixed by using a larger sample size, nor can it be dealt with via fancy mathematics.
The classic example of sampling bias is the ‘Dewey Defeats Truman’ headline, from The Chicago Tribune in 1948. The newspaper got their prediction wrong because they sampled people with telephones, at a time when telephones were comparatively rare. Most of the people who had them were rich, and rich people were more supportive of Dewey. As a consequence, telephone polling provided bad information about the likely voting behaviour of the whole population.
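To see why a bigger sample does not rescue a biased one, here is a minimal simulation sketch. All of the numbers in it are invented for illustration and are not historical figures: it assumes 30% of voters own a telephone, that telephone owners back a hypothetical candidate A at 60%, and that everyone else backs A at 40%. The function name `simulate_poll` and its parameters are likewise made up for this example.

```python
import random

# Assumed, illustrative figures only (not historical data).
PHONE_SHARE = 0.30      # fraction of voters who own a telephone
SUPPORT_PHONE = 0.60    # support for candidate A among phone owners
SUPPORT_NO_PHONE = 0.40 # support for candidate A among everyone else
TRUE_SUPPORT = PHONE_SHARE * SUPPORT_PHONE + (1 - PHONE_SHARE) * SUPPORT_NO_PHONE


def simulate_poll(sample_size, rng, phone_only):
    """Return the share of polled voters who back candidate A."""
    votes_for_a = 0
    polled = 0
    while polled < sample_size:
        has_phone = rng.random() < PHONE_SHARE
        if phone_only and not has_phone:
            continue  # a telephone poll never reaches this voter
        support = SUPPORT_PHONE if has_phone else SUPPORT_NO_PHONE
        votes_for_a += rng.random() < support
        polled += 1
    return votes_for_a / sample_size


rng = random.Random(1948)  # fixed seed so the run is repeatable
for n in (100, 10_000, 100_000):
    biased = simulate_poll(n, rng, phone_only=True)
    fair = simulate_poll(n, rng, phone_only=False)
    print(f"n={n:>6}  phone-only={biased:.3f}  random={fair:.3f}  truth={TRUE_SUPPORT:.2f}")
```

Under these assumptions, the phone-only estimate gets more precise as the sample grows, but it settles on the phone owners' level of support rather than the population's. The random sample is the only one that converges on the true figure.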
This clearly relates to the decision of Canada’s federal government to make the 2011 long-form census optional. With a mandatory census, you more closely approximate an unbiased sample (it isn’t perfect, because some people will refuse to fill in even a mandatory form). With a voluntary census, you are always vulnerable to the possibility that the sort of people who will make the effort to complete it will differ from those who will not. In such a situation, the data in the census could be a poor reflection of the situation in the population as a whole.
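The difference between a mandatory and a voluntary form can be sketched the same way. This is a toy model under loudly stated assumptions, not a description of the actual census: the population is a made-up set of household incomes, the mandatory form is assumed to get a 95% response rate regardless of income, and the voluntary form is assumed to get a response rate that rises with income. The direction and size of that relationship are pure guesses chosen to make the effect visible, and `survey_mean` is a hypothetical helper.

```python
import random


def survey_mean(population, response_prob, rng):
    """Average the values of the people who actually return the form."""
    responses = [x for x in population if rng.random() < response_prob(x)]
    return sum(responses) / len(responses)


rng = random.Random(2011)  # fixed seed so the run is repeatable

# Hypothetical population: invented household incomes, in thousands of dollars.
population = [rng.lognormvariate(4.0, 0.6) for _ in range(100_000)]
true_mean = sum(population) / len(population)

# Mandatory form: assume 95% respond, with refusals unrelated to income.
mandatory = survey_mean(population, lambda x: 0.95, rng)

# Voluntary form: assume willingness to respond rises with income (illustrative only).
voluntary = survey_mean(population, lambda x: min(0.9, 0.2 + x / 200), rng)

print(f"true mean:       {true_mean:.1f}")
print(f"mandatory form:  {mandatory:.1f}")
print(f"voluntary form:  {voluntary:.1f}")
```

In this sketch the mandatory form lands very close to the true mean, while the voluntary form overstates it, even though tens of thousands of people respond in both cases. If the people who opt in differed in some other way, the error would point in a different direction, which is exactly why it is hard to correct after the fact.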
That is why it is foolish for the Fraser Institute to advocate the use of voluntary polling or market research, in place of a census. The quality of data from such sources can never be as good, because sampling bias will always make it suspect.
Zoom also has a post on scrapping the mandatory long-form census.