All examples based on: What is the average number of hours primary school students in SG study every week?

Population

  • The group that we want to draw conclusions for
  • e.g. Primary school students

Population parameter

  • Something you want to find out about the population
  • If the sample is not biased, the sample statistic can be used to estimate the population parameter
  • e.g. average number of hours studied per week

Sample

  • Subset of the population selected in the study

Estimate

  • Without information from every member of the population, we can only try to get a good estimate of the population parameter
  • An estimate is an inference about the population parameter
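
A minimal sketch of how population, parameter, sample and estimate relate, using the study-hours example. The population here is simulated with made-up numbers (mean 12 hours, spread 3 hours), so the "true" parameter is known only because the data is invented; in a real study it would be unknown.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: weekly study hours for 10,000 students (made-up numbers).
population = [random.gauss(12, 3) for _ in range(10_000)]

# Population parameter: normally unknown; computable here only because the data is simulated.
population_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
sample = random.sample(population, 200)

# Sample statistic, used as the estimate of the population parameter.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (estimate):      {sample_mean:.2f}")
```

The two printed values should be close but not identical; the gap between them is the random error of this particular sample.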

Sampling Frame

  • The list from which the sample was obtained
  • To be able to generalise findings from sample to population: sampling frame ≥ population (Generalisability)

Sampling Method

Probability Sampling

  • All units in the sampling frame have a known, non-zero chance of being selected
  • Probability of selecting each unit need not be the same
  • Preferred, to reduce selection bias

Types of probability sampling:

  • Applicable for any population
    • Simple random sampling
      • Draw names at random without replacement
      • Every sample of size n has an equal chance of being selected
      • +ve: Good representation of population
      • -ve: Time-consuming
    • Systematic sampling
      • Include every k-th unit in the population, with a random start point
      • +ve: Simpler than simple random
      • -ve: May under-represent parts of the population (e.g. if the list follows a repeating pattern)
  • Applicable when population comes as groups
    • Stratified random sampling
      • Divide population into groups (strata) with similar characteristics (males & females), take a random sample from each subgroup (stratum), combine these to form final sample
      • +ve: Good representation of population by stratum
      • -ve: Criteria required for classifying the population into strata
    • Cluster sampling
      • Divide population into natural subgroups (cities in a country), take a random sample of clusters, include whole cluster in final sample
      • +ve: Less time-consuming and less costly
      • -ve: Requires internally heterogeneous and externally homogeneous clusters
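
The four probability sampling methods above can be sketched roughly as follows. The sampling frame, the field names (`school`, `gender`) and the function names are all hypothetical, chosen only for illustration; the stratified version assumes the same sampling fraction in every stratum, which is one common choice rather than the only one.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 1,000 students, 20 schools of 50 students each.
frame = [
    {"id": i, "school": f"school_{i % 20}", "gender": random.choice(["M", "F"])}
    for i in range(1000)
]

def simple_random(frame, n):
    # Every sample of size n is equally likely; drawn without replacement.
    return random.sample(frame, n)

def systematic(frame, k):
    # Random start point within the first k units, then every k-th unit.
    start = random.randrange(k)
    return frame[start::k]

def stratified(frame, key, fraction):
    # Split the frame into strata, then take a simple random sample in each.
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for units in strata.values():
        sample.extend(random.sample(units, round(len(units) * fraction)))
    return sample

def cluster(frame, key, n_clusters):
    # Randomly choose whole clusters and keep every unit inside them.
    names = sorted({unit[key] for unit in frame})
    chosen = set(random.sample(names, n_clusters))
    return [unit for unit in frame if unit[key] in chosen]

print(len(simple_random(frame, 50)))         # 50 students
print(len(systematic(frame, 10)))            # every 10th student
print(len(stratified(frame, "gender", 0.1)))  # about 10% of each gender
print(len(cluster(frame, "school", 3)))      # all students in 3 schools
```

Note that stratified sampling takes some units from every stratum, whereas cluster sampling takes every unit from some clusters.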

Non-Probability Sampling

  • A human decides who to sample
  • The human brain is not random

Sample Size

  • The larger the better, to reduce random error
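
A small simulation can illustrate why a larger sample reduces random error. The population below is invented (made-up mean and spread), and `spread_of_sample_means` is a hypothetical helper: it draws many simple random samples of the same size and measures how much the sample means differ from one sample to the next.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of weekly study hours (made-up numbers).
population = [random.gauss(12, 3) for _ in range(20_000)]

def spread_of_sample_means(n, repeats=500):
    # Draw `repeats` simple random samples of size n and return the
    # standard deviation of their means (smaller = less random error).
    means = [statistics.mean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

for n in (10, 100, 1000):
    print(n, round(spread_of_sample_means(n), 3))
```

The printed spread should shrink as n grows. A larger sample only reduces random error; it does not remove bias.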

Bias

  • Selection bias
    • When the sample does not equally represent the population due to biased selection of units into the sample
    • Caused by using non-probability sampling (human selection causes bias)
    • e.g. surveying only students from two classes in a primary school
  • Non-response bias
    • Participants’ non-disclosure
    • Participants’ non-participation
    • Applicable for all types of sampling
    • Ensure the response rate is not too low, to reduce non-response bias
  • Bias means that the conclusion is not generalisable (Generalisability)
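
The selection bias example (surveying only two classes) can be sketched as a simulation. Everything here is assumed for illustration: the school layout (6 levels, 5 classes per level, 30 students per class) and the assumption that older students study more hours are invented, not taken from real data.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical school; assumed for illustration that hours rise with level.
students = []
for level in range(1, 7):
    for class_no in range(5):
        for _ in range(30):
            hours = random.gauss(6 + 2 * level, 2)
            students.append({"level": level, "class": class_no, "hours": hours})

true_mean = statistics.mean(s["hours"] for s in students)

# Biased selection: a person picks two convenient Primary 6 classes.
convenient = [s for s in students if s["level"] == 6 and s["class"] in (0, 1)]
biased_mean = statistics.mean(s["hours"] for s in convenient)

# Probability sampling: same sample size, drawn at random from the whole school.
random_sample = random.sample(students, len(convenient))
random_mean = statistics.mean(s["hours"] for s in random_sample)

print(f"whole school:       {true_mean:.2f}")
print(f"two chosen classes: {biased_mean:.2f}")
print(f"random sample:      {random_mean:.2f}")
```

Under these assumptions, the two-class estimate lands well above the school-wide average, while the random sample of the same size stays much closer to it.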

Generalisability

  • Whether conclusions from the sample can be generalised to the population
  • NOT generalisable when:
    • Sampling frame does not cover entire population
    • There is bias in the sample
  • To be generalisable:
    • Have a good sampling frame that covers the entire population (sampling frame ≥ population)
    • Use probability sampling to minimise selection bias
    • Minimise non-response rate
    • Have a large sample size to reduce variability and random error in sample

Census

  • An attempt to reach out to the entire population
  • Disadvantages:
    • High cost
    • Takes a long time to complete
    • Can suffer from non-response bias