A couple of things happened recently that set me thinking about sampling. The first was a brilliant webinar hosted by the British Academy of Management (BAM). The speaker was a quantitative researcher and was giving tips on how to present quantitative data analysis successfully. I lean naturally toward qualitative work so I attended the webinar to get a better understanding of research methods outside my comfort zone. The points the speaker made about sampling were fascinating and made me think deeply about the importance (and difficulty) of getting it right if you want to be able to publish the results. The second thing was a tutorial with one of my PhD students who is preparing for the primary research phase; we spent nearly a whole hour just discussing sampling and how to get it right. Sampling is not easy: it is complex and layered, and it involves careful decision-making to ensure your choices are justifiable.
The first question to ask is, why are you sampling at all? Why can’t you include the whole population, i.e. everyone concerned in your study? The answer is probably that the total population is too large: including everyone would take so much time, money and other resources that your project would be logistically impossible. The sample you use therefore needs to be representative of the total population. As some populations are extremely complex, this is a huge challenge. You have to start by understanding the population. Can you describe it accurately? If not, you may need to do some research about the population before you start thinking about sampling, so you can make sure your sample is truly representative.
For example, how would you draw a representative sample of students at a university? The population includes students from age 18 upwards studying at access, HND, undergraduate, postgraduate taught and postgraduate research levels, from a large number of different countries, ethnicities, genders, races, cultures, with different abilities and health issues, with a range of different home circumstances and studying a host of different subjects. How do you draw a sample that represents such a complex population? There’s no easy answer, unfortunately.
Sampling methods
There are several recognised ways of drawing a sample. The one preferred by many researchers is random sampling. Selecting people at random means that everyone in the population has an equal chance of being included in the sample. You could achieve this by putting the names of all students at a university into a hat (it would have to be a huge hat) and then drawing out the names. Or you could put the names in a long list in Word or Excel and then select every 10th name, starting from a randomly chosen position among the first ten. Strictly speaking this second approach is systematic sampling rather than simple random sampling, and it only behaves like a random sample if the order of the list has nothing to do with what you are studying. An alphabetical list of names might present problems of clustering of similar surnames, so it would be safer to use student numbers instead. This would have the added bonus of keeping names out of the selection process. The benefit of a random sample is that it reduces potential bias in the selection process. This sounds great but it’s quite hard to achieve in some cases. And there is a further problem with simple random sampling: would the sample be truly representative of the complex population described?
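If you are comfortable with a little code, the two approaches described above (names out of a hat, and every 10th entry in a list) can be sketched in a few lines of Python. This is only an illustration: the student numbers are made up, and the sample size of 500 and the seed of 42 are arbitrary choices of mine.

```python
import random

# Hypothetical sampling frame: 5,000 made-up student numbers.
student_numbers = [f"S{n:05d}" for n in range(1, 5001)]

# A fixed seed (arbitrary) makes the draw repeatable, so it can be documented.
rng = random.Random(42)

# "Names out of a hat": a simple random sample of 500, drawn without
# replacement, so every student has the same chance of being picked.
simple_sample = rng.sample(student_numbers, k=500)

# "Every 10th name": pick a random starting point among the first ten
# entries, then take every 10th student number from there.
step = 10
start = rng.randrange(step)
systematic_sample = student_numbers[start::step]

print(len(simple_sample), len(systematic_sample))
```

Both draws give 500 students here, but only the first is a simple random sample in the strict sense. Recording the seed alongside your method means someone else could reproduce exactly the same selection.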
To deal with the issue of representation of the population you could use stratified random sampling. This method accepts that there are different sub-populations (strata) and seeks to draw a random sample from each stratum. For example, if your research is about the experience of students studying at different levels, you could take a random sample of undergraduates, a random sample of postgraduates and so on until all the levels are represented. You would have to explain in your limitations that all other factors (age, gender, disability, ethnicity, etc.) were considered equal or not important to the research.
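As a rough sketch of how a stratified random sample by level of study might be drawn, here is a short Python example. The function name `stratified_sample`, the student counts per level and the 10% sampling fraction are all assumptions of mine for illustration, and I have assumed that each stratum is sampled in the same proportion, which is only one of several possible ways to allocate the sample.

```python
import random
from collections import defaultdict

def stratified_sample(frame, fraction, seed=None):
    """Draw the same fraction, at random, from each stratum.

    frame: list of (student_id, level) pairs; level defines the stratum.
    Returns a dict mapping each level to its list of sampled student ids.
    """
    rng = random.Random(seed)

    # Group the sampling frame into strata.
    strata = defaultdict(list)
    for student_id, level in frame:
        strata[level].append(student_id)

    # Take a simple random sample within each stratum.
    sample = {}
    for level, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[level] = rng.sample(members, k)
    return sample

# Made-up population: counts per level are purely illustrative.
levels = {"undergraduate": 3000, "postgraduate taught": 800, "postgraduate research": 200}
frame = []
for level, count in levels.items():
    for _ in range(count):
        frame.append((f"S{len(frame) + 1:05d}", level))

sample = stratified_sample(frame, fraction=0.1, seed=1)
for level, ids in sample.items():
    print(level, len(ids))
```

With these made-up numbers the sample contains 300 undergraduates, 80 taught postgraduates and 20 research postgraduates, so each level appears in the sample in the same proportion as in the population.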
Quota sampling is used in market research in order to achieve a certain number of responses in each of the subgroups of the population. This is similar to stratified random sampling except that the sample may not be random. Have you ever been stopped on the street and asked to fill in a survey? Why have they stopped you? They may be looking to fill a quota, so the first 25 people they find that match the quota descriptor are approached to complete the survey. This cannot be guaranteed to be random or representative of the population but can be a reasonable approximation.
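The street-survey logic can be sketched in Python to show why a quota sample is not random: people are accepted in the order they happen to walk past, until each quota is full. The function name `quota_sample`, the two age groups and the stream of passers-by are all invented for illustration.

```python
def quota_sample(passers_by, quotas):
    """Accept people in arrival order until each subgroup's quota is full.

    passers_by: iterable of (person_id, group), in the order they are met.
    quotas: dict mapping group -> number of responses wanted.
    """
    accepted = {group: [] for group in quotas}
    for person_id, group in passers_by:
        # Approach this person only if their group still has room.
        if group in accepted and len(accepted[group]) < quotas[group]:
            accepted[group].append(person_id)
        # Stop as soon as every quota has been filled.
        if all(len(accepted[g]) == quotas[g] for g in quotas):
            break
    return accepted

# Invented stream of 200 passers-by: every third person is under 30.
stream = [(i, "under 30" if i % 3 == 0 else "30 and over") for i in range(200)]
result = quota_sample(stream, {"under 30": 25, "30 and over": 25})
print({group: len(people) for group, people in result.items()})
```

Nothing in this procedure involves chance: whoever arrives first gets in, which is exactly why the result cannot be guaranteed to be random or representative.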
Purposive sampling is where the researcher uses their judgement to select a sample. This could mean, for example, deliberately selecting only female students in their second year of study on Humanities courses when researching something specific about the effect of the curriculum on women. Or it could be inviting followers of specific social media influencers to take part in a survey about the impact of influencers on consumer decision-making. The benefits of this sort of sample are that you are likely to attract people who have good knowledge about your research topic and that they are likely to respond if they are already interested. The drawback is that the researcher’s judgement may not be accurate or objective, in which case it introduces bias into the sample.
Convenience sampling is quite often seen in student dissertations – this is literally selecting the people most convenient to the researcher. The only benefit of this type of sample is its convenience – there is nothing else to recommend it and it is very unlikely to be representative of a larger population.
Finally, snowball sampling is a method used to widen the potential reach of the initial sample by asking respondents to send on the invite to people they feel may be interested in participating. If using this method you need to include some qualifying questions at the start of the survey to ensure that all potential respondents meet the qualifying criteria for inclusion.
Who and how many to include?
The decision about who to include and who to exclude from the sample must be clearly tied to the research objectives and must be clearly explained in the methodology and ideally in the rubric at the start of the survey. Qualifying questions, as mentioned earlier, should also be used to ensure participants fit the criteria.
The last question is how many to include in your sample: what proportion of the population should you seek to cover? Generally, the larger your sample, the better. In a survey collecting quantitative data you should aim for a sample large enough that your results are unlikely to be due to chance alone and can be generalised to the population with a stated level of confidence. This depends not only on the size of the total population but also on the confidence level you want and the margin of error you are happy to accept. For example, if your total population is 1000 people, at 95% confidence you could use a sample of only 278, but this would give you a +/-5% margin of error. If you increase the sample size to 906 the margin of error is reduced to only +/-1%. Many researchers are happy with 95% confidence and a 5% margin of error. Something that may seem peculiar happens when you reach populations of 250,000 and over: a sample of 384 is enough for any population of this size to achieve 95% confidence with a 5% margin of error. This is because the required sample size levels off as the population grows, so beyond a certain point the size of the population makes almost no difference. For a very useful table of sample sizes and an explanation of the calculation used to determine them, see The Research Advisors.
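To show where figures like these come from, here is a minimal Python sketch using a widely taught formula for estimating a proportion (often called Cochran's formula) with a finite population correction. I am assuming this formula for illustration; the exact calculation used by The Research Advisors may differ in detail. The function name `sample_size` and the default of 0.5 for the expected proportion (the most cautious assumption) are my own choices, and I round to the nearest whole person, whereas some textbooks always round up.

```python
from statistics import NormalDist

def sample_size(population, margin_of_error=0.05, confidence=0.95, proportion=0.5):
    """Approximate sample size for estimating a proportion.

    Uses Cochran's formula with a finite population correction.
    proportion=0.5 is the most cautious assumption (largest sample).
    """
    # z-score for a two-sided confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Sample size needed if the population were effectively unlimited.
    n0 = z**2 * proportion * (1 - proportion) / margin_of_error**2

    # Adjust downwards for a finite population.
    n = n0 / (1 + (n0 - 1) / population)
    return round(n)

print(sample_size(1000))                        # 5% margin of error
print(sample_size(1000, margin_of_error=0.01))  # 1% margin of error
print(sample_size(250_000))
print(sample_size(1_000_000))
```

Run as written, this gives 278, 906, 384 and 384, matching the figures quoted above. The last two lines show the levelling off: quadrupling the population from 250,000 to 1,000,000 does not change the sample needed.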
Careful consideration of your sampling method, inclusion/exclusion criteria and sample size will help you produce more robust and reliable results. Errors can still be made with sampling, and there’s a whole other set of issues around sampling for qualitative studies, but those are best left for another post!
Reference
The Research Advisors (2006) Sample size table [online], available at https://www.research-advisors.com/tools/SampleSize.htm, accessed 1 August 2021.