When the sample you selected says nothing about the population at large.
What is it?
A sample is said to be biased when it was collected in such a way that some members of the intended population were less likely to be selected than others: not all individuals had the same chance of being included. The results are then unreliable, because the sample is not representative of the target population.
Sampling bias is a big problem for polling organisations and market research institutes. In 1936, before polling became a national sport in the US, the magazine Literary Digest mailed some 10 million mock ballots to addresses drawn largely from telephone directories and automobile registration lists, asking people who they intended to vote for in the presidential election. More than 2 million people replied. The forecast made by the magazine turned out to be completely wrong (even though 2 million is a huge sample!) because the people who owned a phone or a car in 1936 were not representative of the population. They were richer and more urban than the rest of the population. They were a biased sample.
Around the same time, George Gallup correctly predicted the winner of the US presidential election based on a sample of only 50,000 people. The inventor of the Gallup poll had come up with a successful statistical method of survey sampling for measuring public opinion. His samples were much smaller but far more representative.
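The contrast between a huge biased sample and a small representative one can be illustrated with a short simulation. This is only a sketch under invented assumptions: the population size, the 25% share of wealthier voters, the support rates and the factor of 10 in sampling probability are all hypothetical numbers chosen for illustration, not historical data.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 25% wealthier voters, who support the
# candidate less often (35%) than everyone else (65%).
population = []
for _ in range(100_000):
    wealthy = rng.random() < 0.25
    supports = rng.random() < (0.35 if wealthy else 0.65)
    population.append((wealthy, supports))


def support_share(sample):
    """Fraction of people in the sample who support the candidate."""
    return sum(supports for _, supports in sample) / len(sample)


true_share = support_share(population)

# Small simple random sample: every voter is equally likely to be picked.
small_random = rng.sample(population, 1_000)

# Large biased sample: wealthier voters are 10 times as likely to be
# reached (drawn with replacement, which is fine for illustration).
weights = [10 if wealthy else 1 for wealthy, _ in population]
large_biased = rng.choices(population, weights=weights, k=20_000)

print(f"true support:        {true_share:.3f}")
print(f"small random sample: {support_share(small_random):.3f}")
print(f"large biased sample: {support_share(large_biased):.3f}")
```

Under these assumptions the sample of 1,000 lands close to the true figure, while the sample of 20,000 stays far from it however large it grows: adding more respondents reduces random error but does nothing about bias.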
Today, statisticians and sociologists are very familiar with sampling bias. Yet surveys and studies remain exposed to it. Although almost everyone now has a phone, phone surveys remain unreliable because the people who answer them tend to be older, less active and more available than the population at large.
In fact, any poll that relies on the goodwill of respondents is exposed to this bias. Agreeing to answer a series of questions is in itself a filter: those who respond may be quite different from those who do not. This is known as self-selection bias, or non-response bias: the people who respond would answer differently from the people who do not.
What does it mean for human resources?
HR departments often carry out surveys whose reliability depends on the quality of the selected samples. Only reliable results can empower HR people to draw relevant conclusions and make decisions that may affect every employee, whether they concern work spaces, engagement or training.
Unfortunately, the samples are often biased. HR people give too much weight to groups that are more in the limelight, for example those that get more media coverage. The media tend to prefer what’s sensational to what’s representative. One could argue that the level of attention given to millennials illustrates the phenomenon.
Last but not least, HR departments have to deal with the question of union representativeness, since they are responsible for social dialogue. Whether or not unions are actually representative of all employees is a critical question. As fewer and fewer people belong to a union, the whole system of collective bargaining is in crisis today.
How can it be overcome?
To correct for sampling bias, HR people must learn to think like statisticians and “correct” their samples to make them more representative. Responses from overrepresented profiles must sometimes be given less weight or taken out, and responses from underrepresented profiles counted multiple times. But if some groups (women, for example) are completely absent, no correction can be made: there are no responses to which a weight could be applied.
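One common way to make this kind of correction is post-stratification weighting: each response is weighted by the ratio between its group's share of the population and its share of the sample. The sketch below is a minimal illustration, not a full survey-weighting procedure. The function names, the group labels ("office", "field") and the satisfaction scores are hypothetical, and it assumes the true population shares are known from HR records.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per group: population share / sample share.

    Raises ValueError if a population group has no respondents, since
    an absent group cannot be corrected by reweighting.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    missing = [g for g in population_shares if counts[g] == 0]
    if missing:
        raise ValueError(f"no respondents in groups: {missing}")
    unknown = set(counts) - set(population_shares)
    if unknown:
        raise ValueError(f"no population share for groups: {sorted(unknown)}")
    return {g: population_shares[g] * n / counts[g] for g in population_shares}


def weighted_mean(values, groups, weights):
    """Mean of values, each weighted by the weight of its group."""
    total_weight = sum(weights[g] for g in groups)
    return sum(v * weights[g] for v, g in zip(values, groups)) / total_weight


# Hypothetical survey: the workforce is half office staff and half
# field staff, but 8 of the 10 respondents work in the office.
groups = ["office"] * 8 + ["field"] * 2
scores = [4] * 8 + [2] * 2  # satisfaction on a 1-5 scale
shares = {"office": 0.5, "field": 0.5}

weights = poststratification_weights(groups, shares)
raw_mean = sum(scores) / len(scores)
corrected_mean = weighted_mean(scores, groups, weights)
print(weights, raw_mean, corrected_mean)
```

In this made-up example, office responses get a weight below 1 and field responses a weight above 1, so the corrected average falls below the raw one. The weights only rebalance groups that were reached; they cannot fix differences within a group between those who answered and those who did not.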
The more we rely on big data and quantitative analyses to make corporate decisions, the more it becomes vital to be familiar with the dangers of sampling bias and the techniques used by statisticians to remedy the bias.