All examples based on: What is the average number of hours primary school students in SG study every week?
Population
- The group that we want to draw conclusions for
- e.g. Primary school students
Population parameter
- Something you want to find out about the population
- If the sample is not biased, the sample statistic can be used to estimate the population parameter
- e.g. average number of hours studied per week
Sample
- Subset of the population selected in the study
Estimate
- Without information from every member of the population, we can only try to get a good estimate of the population parameter
- An estimate is an inference about the population parameter
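A minimal sketch of the parameter/estimate distinction, assuming a synthetic population of weekly study hours (the numbers are made up for illustration, not real survey data). The names `population`, `parameter`, `sample` and `estimate` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic population: weekly study hours for 10,000 students (made-up values).
population = [random.uniform(5, 25) for _ in range(10_000)]

# Population parameter: in a real study this is unknown.
parameter = statistics.mean(population)

# Sample: 200 students drawn at random without replacement.
sample = random.sample(population, 200)

# Sample statistic, used as the estimate of the population parameter.
estimate = statistics.mean(sample)

print(f"parameter = {parameter:.2f}, estimate = {estimate:.2f}")
```

The estimate will usually be close to, but not exactly equal to, the parameter; the gap is random error.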
Sampling Frame
- The list from which the sample was obtained
- To be able to generalise findings from sample to population: sampling frame ≥ population, i.e. the frame must cover the whole population (Generalisability)
Sampling Method
Probability Sampling
- All units in the sampling frame have a known, non-zero chance of being selected
- Probability of selecting each unit need not be the same
- Preferred, to reduce selection bias
Types of probability sampling:
- Applicable for any population
- Simple random sampling
- Draw names at random without replacement
- Every sample of size n has an equal chance of being selected
- +ve: Good representation of population
- -ve: Time consuming
- Systematic sampling
- Select every k-th unit from the sampling frame, with a random start point
- +ve: Simpler than simple random
- -ve: May under-represent parts of the population if the list follows a periodic pattern
- Applicable when population comes as groups
- Stratified random sampling
- Divide population into groups (strata) with similar characteristics (males & females), take a random sample from each subgroup (stratum), combine these to form final sample
- +ve: Every stratum is well represented in the sample
- -ve: Criteria required to classify the population into strata
- Cluster sampling
- Divide population into natural subgroups (cities in a country), take a random sample of clusters, include whole cluster in final sample
- +ve: Less time consuming and costly
- -ve: Requires internally heterogeneous and externally homogeneous clusters
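The four probability sampling methods above can be sketched as follows. This is an illustrative sketch only: the function names, the `frame` layout of (student_id, gender, school) tuples and the sample sizes are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every size-n subset is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Take every k-th unit, starting from a random position among the first k."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Group units into strata, then draw a random sample from each stratum."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(frame, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and include every unit in the chosen ones."""
    clusters = {}
    for unit in frame:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

# Hypothetical sampling frame: 1000 students in 10 schools of 100 students each.
rng = random.Random(42)
frame = [(i, "F" if i % 2 else "M", f"school_{i // 100}") for i in range(1000)]

srs = simple_random_sample(frame, 50, rng)
systematic = systematic_sample(frame, 20, rng)              # every 20th student
stratified = stratified_sample(frame, lambda u: u[1], 25, rng)  # strata = gender
clustered = cluster_sample(frame, lambda u: u[2], 2, rng)       # clusters = schools

print(len(srs), len(systematic), len(stratified), len(clustered))
```

Note that the stratified sample takes some units from every stratum, while the cluster sample takes all units from only some clusters.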
Non-Probability Sampling
- A human decides who to sample
- Human choices are not random, so the chance of each unit being selected is unknown
Sample Size
- The larger the better: a larger sample reduces random error
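A rough simulation of why a larger sample reduces random error, assuming a synthetic population (made-up values). The helper `spread_of_sample_means` is hypothetical: it repeats the sampling many times and measures how much the sample mean varies from one sample to the next.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Synthetic population of weekly study hours (made-up values).
population = [random.gauss(15, 4) for _ in range(20_000)]

def spread_of_sample_means(n, repeats=300):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = spread_of_sample_means(25)    # small samples: means vary a lot
large = spread_of_sample_means(400)   # large samples: means vary much less

print(f"n=25: spread {small:.2f}, n=400: spread {large:.2f}")
```

The spread for n = 400 should come out clearly smaller than for n = 25. A larger sample only shrinks random error; it does not correct a biased selection process.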
Bias
- Selection bias
- When the sample does not equally represent the population due to biased selection of units into the sample
- Caused by using non-probability sampling (human selection causes bias)
- e.g. surveying only students from two classes in a primary school
- Non-response bias
- Participants’ non-disclosure
- Participants’ non-participation
- Applicable for all types of sampling
- Ensure the response rate is not too low, to reduce non-response bias
- Bias means that conclusion is not generalisable (Generalisability)
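A small simulation of selection bias, loosely modelled on the "two classes" example above. All data are synthetic, and the assumption that two classes study about 8 hours more per week is invented purely to make the bias visible.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Synthetic population: 40 classes of 30 students each.
# Assumption for illustration: classes 0 and 1 study about 8 hours more per week.
population = []
for class_id in range(40):
    extra = 8 if class_id < 2 else 0
    for _ in range(30):
        population.append((class_id, random.gauss(12 + extra, 3)))

true_mean = statistics.mean(hours for _, hours in population)

# Biased selection: survey only the two convenient classes.
biased = [hours for class_id, hours in population if class_id < 2]

# Probability sampling: simple random sample of the same size from all classes.
unbiased = [hours for _, hours in random.sample(population, len(biased))]

biased_mean = statistics.mean(biased)
unbiased_mean = statistics.mean(unbiased)

print(f"true {true_mean:.1f}, biased {biased_mean:.1f}, random {unbiased_mean:.1f}")
```

Both samples have the same size, so the difference comes from how the units were selected, not from how many were selected.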
Generalisability
- Whether conclusions from the sample can be generalised to the population
- NOT generalisable when:
- Sampling frame does not cover entire population
- There is bias in the sample
- To be generalisable:
- Have a good sampling frame that covers the whole population (sampling frame ≥ population)
- Use probability sampling to minimise selection bias
- Minimise non-response rate
- Have a large sample size to reduce variability and random error in sample
Census
- An attempt to reach out to the entire population
- Disadvantages:
- High cost
- Takes a long time to complete
- Can suffer from non-response bias