Rates
| | HD | No HD | Row Total |
|---|---|---|---|
| Smoker | 38 | 14962 | 15000 |
| Non-Smoker | 44 | 84956 | 85000 |
| Column Total | 82 | 99918 | 100000 |
- Marginal Rate
  - % of the population that satisfies the condition
  - e.g. Rate(Smoker)
- Conditional Rate
  - A rate based on the part of the population that satisfies a certain condition
  - % of the people satisfying the “given” condition who also satisfy the condition of interest
  - e.g. Rate(HD | Smoker) → rate of HD given Smoker
- Joint Rate
  - % of the population that satisfies both conditions
  - e.g. Rate(HD & Smoker)
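The three kinds of rate can be computed directly from the table. The following is a minimal sketch using the counts above; the variable names are illustrative only.

```python
# Counts taken from the smoking / heart disease (HD) table above.
counts = {
    ("Smoker", "HD"): 38,
    ("Smoker", "No HD"): 14962,
    ("Non-Smoker", "HD"): 44,
    ("Non-Smoker", "No HD"): 84956,
}

total = sum(counts.values())  # everyone in the table
smokers = counts[("Smoker", "HD")] + counts[("Smoker", "No HD")]  # Smoker row total

# Marginal rate: share of the whole population that smokes.
rate_smoker = smokers / total

# Conditional rate: share of smokers (the "given" group) that has HD.
rate_hd_given_smoker = counts[("Smoker", "HD")] / smokers

# Joint rate: share of the whole population that smokes AND has HD.
rate_hd_and_smoker = counts[("Smoker", "HD")] / total

print(f"rate(Smoker)      = {rate_smoker:.2%}")
print(f"rate(HD | Smoker) = {rate_hd_given_smoker:.4%}")
print(f"rate(HD & Smoker) = {rate_hd_and_smoker:.4%}")
```

Note that the conditional and joint rates share a numerator (38) and differ only in the denominator: the smokers for the conditional rate, the whole population for the joint rate.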
Symmetry Rule
If rate(A | B) [op] rate(A | not B), then rate(B | A) [op] rate(B | not A), where [op] is the same one of >, < or = in both comparisons.
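As a quick check of the symmetry rule, the sketch below compares both pairs of conditional rates for the smoking table, taking A = HD and B = Smoker. The helper `compare` is a hypothetical name introduced only for this illustration.

```python
# Cell counts from the smoking / HD table.
hd_smoker, nohd_smoker = 38, 14962
hd_nonsmoker, nohd_nonsmoker = 44, 84956

def compare(x, y):
    """Return the relation between x and y as '>', '<' or '='."""
    return ">" if x > y else "<" if x < y else "="

# Row-wise comparison: rate(HD | Smoker) vs rate(HD | Non-Smoker).
rate_hd_given_smoker = hd_smoker / (hd_smoker + nohd_smoker)
rate_hd_given_nonsmoker = hd_nonsmoker / (hd_nonsmoker + nohd_nonsmoker)

# Column-wise comparison: rate(Smoker | HD) vs rate(Smoker | No HD).
rate_smoker_given_hd = hd_smoker / (hd_smoker + hd_nonsmoker)
rate_smoker_given_nohd = nohd_smoker / (nohd_smoker + nohd_nonsmoker)

op_rows = compare(rate_hd_given_smoker, rate_hd_given_nonsmoker)
op_cols = compare(rate_smoker_given_hd, rate_smoker_given_nohd)

print(f"rate(HD | Smoker) {op_rows} rate(HD | Non-Smoker)")
print(f"rate(Smoker | HD) {op_cols} rate(Smoker | No HD)")
```

Both comparisons come out with the same operator, as the rule predicts.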
Basic Rule of Rates
- Given subgroups B and C that together make up the whole population without overlapping, the overall rate(A) always lies between rate(A | B) and rate(A | C)
- From the first rule, when C = “not B”, rate(A) always lies between rate(A | B) and rate(A | not B)
- The closer rate(B) gets to 100%, the closer rate(A) gets to rate(A | B)
- rate(A) is exactly halfway between rate(A | B) and rate(A | not B) if rate(B) = 50%
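These rules follow from the fact that the overall rate is a weighted average of the two conditional rates: rate(A) = rate(B) × rate(A | B) + rate(not B) × rate(A | not B). The sketch below illustrates this; the conditional rates of 80% and 20% are hypothetical values chosen only for the demonstration, and `overall_rate` is an illustrative helper name.

```python
def overall_rate(rate_b, rate_a_given_b, rate_a_given_not_b):
    """rate(A) as a weighted average of the two conditional rates."""
    return rate_b * rate_a_given_b + (1 - rate_b) * rate_a_given_not_b

# Hypothetical conditional rates, chosen only for illustration.
high, low = 0.80, 0.20

for rate_b in (0.10, 0.50, 0.90, 0.99):
    overall = overall_rate(rate_b, high, low)
    # The overall rate never leaves the interval between the two conditional rates.
    assert low <= overall <= high
    print(f"rate(B) = {rate_b:.0%} -> rate(A) = {overall:.1%}")

# Check against the smoking table, with A = HD and B = Smoker.
hd_overall = overall_rate(15000 / 100000, 38 / 15000, 44 / 85000)
print(f"rate(HD) = {hd_overall:.4%}")
```

As rate(B) grows towards 100%, the printed rate(A) moves towards rate(A | B), and at rate(B) = 50% it sits at the midpoint of the two conditional rates.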
Association
| | B | Not B | Row Total |
|---|---|---|---|
| A | | | |
| Not A | | | |
| Col. Total | | | |
- Categorical variables A and B are associated with each other if rate(A | B) ≠ rate(A | not B)
- A and B are positively associated if rate(A | B) > rate(A | not B)
- A and B are negatively associated if rate(A | B) < rate(A | not B)
- Comparing rate(A | B) vs rate(A | not B) is exactly the same as comparing rate(B | A) vs rate(B | not A)
- Association does not establish causation
- Whether the association can be generalised from sample to population is based on Generalisability
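The comparison of rate(A | B) with rate(A | not B) can be sketched as a small function over the four cells of a 2×2 table. The function name `association` and its argument order (reading the table row by row) are assumptions made for this example; `Fraction` is used so that equal rates compare as exactly equal.

```python
from fractions import Fraction

def association(a_and_b, a_and_not_b, not_a_and_b, not_a_and_not_b):
    """Classify the association between A and B from the four cell counts."""
    # rate(A | B): share of the B column that is also A.
    rate_a_given_b = Fraction(a_and_b, a_and_b + not_a_and_b)
    # rate(A | not B): share of the "Not B" column that is also A.
    rate_a_given_not_b = Fraction(a_and_not_b, a_and_not_b + not_a_and_not_b)
    if rate_a_given_b > rate_a_given_not_b:
        return "positive"
    if rate_a_given_b < rate_a_given_not_b:
        return "negative"
    return "none"

# Smoking table with A = HD and B = Smoker.
hd_vs_smoking = association(38, 44, 14962, 84956)
print(f"HD and smoking: {hd_vs_smoking} association")
```

The result describes the sample only; it says nothing about causation or about whether the association generalises to a wider population.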
Confounder
A confounder is a third variable that is associated with both the dependent and independent variables. It makes any association observed between the dependent and independent variables unreliable.
Find Confounders
| | Male, HD | Male, No HD | Female, HD | Female, No HD | Row Total |
|---|---|---|---|---|---|
| Smoker | 25 | 9582 | 13 | 5380 | 15000 |
| Non Smoker | 30 | 34954 | 14 | 50002 | 85000 |
| Column Total | 55 | 44536 | 27 | 55382 | 100000 |
To determine whether sex is a confounding variable, check for an association between sex and smoking, and between sex and HD. For example, since rate(Smoker | Male) ≠ rate(Smoker | Female) and rate(HD | Male) ≠ rate(HD | Female), sex is a confounder.
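The two association checks can be worked through with the counts in the table. This is a minimal sketch; the `rate` helper and the row layout are illustrative choices, not part of the original notes.

```python
# (smoking status, sex, HD status, count), taken from the table above.
rows = [
    ("Smoker", "Male", "HD", 25),
    ("Smoker", "Male", "No HD", 9582),
    ("Smoker", "Female", "HD", 13),
    ("Smoker", "Female", "No HD", 5380),
    ("Non-Smoker", "Male", "HD", 30),
    ("Non-Smoker", "Male", "No HD", 34954),
    ("Non-Smoker", "Female", "HD", 14),
    ("Non-Smoker", "Female", "No HD", 50002),
]

def rate(event, given):
    """rate(event | given); both arguments are predicates on (smoking, sex, hd)."""
    given_total = sum(n for *keys, n in rows if given(*keys))
    both = sum(n for *keys, n in rows if given(*keys) and event(*keys))
    return both / given_total

is_smoker = lambda smoking, sex, hd: smoking == "Smoker"
has_hd = lambda smoking, sex, hd: hd == "HD"
is_male = lambda smoking, sex, hd: sex == "Male"
is_female = lambda smoking, sex, hd: sex == "Female"

# Sex vs smoking.
rate_smoker_male = rate(is_smoker, is_male)
rate_smoker_female = rate(is_smoker, is_female)

# Sex vs HD.
rate_hd_male = rate(has_hd, is_male)
rate_hd_female = rate(has_hd, is_female)

sex_is_confounder = (rate_smoker_male != rate_smoker_female
                     and rate_hd_male != rate_hd_female)
print(f"rate(Smoker | Male) = {rate_smoker_male:.2%}, "
      f"rate(Smoker | Female) = {rate_smoker_female:.2%}")
print(f"rate(HD | Male) = {rate_hd_male:.4%}, "
      f"rate(HD | Female) = {rate_hd_female:.4%}")
print("Sex is a confounder:", sex_is_confounder)
```

In this data, males both smoke more and have HD more often than females, so sex is associated with both variables.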
Control for Confounders
- Split the data into subgroups by the confounder (e.g. by sex)
- If the association is still observed within every subgroup, then the association holds after controlling for the confounder
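To illustrate, the sketch below splits the smoking and HD counts by sex and compares rate(HD | Smoker) with rate(HD | Non-Smoker) inside each group. The dictionary layout and the names `hd_rate` and `within_sex` are assumptions made for this example.

```python
# (HD, No HD) counts for each smoking status, split by sex.
by_sex = {
    "Male": {"Smoker": (25, 9582), "Non-Smoker": (30, 34954)},
    "Female": {"Smoker": (13, 5380), "Non-Smoker": (14, 50002)},
}

def hd_rate(hd, no_hd):
    """Rate of HD within one group."""
    return hd / (hd + no_hd)

within_sex = {}
for sex, groups in by_sex.items():
    smoker_rate = hd_rate(*groups["Smoker"])
    non_smoker_rate = hd_rate(*groups["Non-Smoker"])
    within_sex[sex] = (smoker_rate, non_smoker_rate)
    print(f"{sex}: rate(HD | Smoker) = {smoker_rate:.4%}, "
          f"rate(HD | Non-Smoker) = {non_smoker_rate:.4%}")

# The association survives the split if the rates differ in every subgroup.
still_associated = all(s != n for s, n in within_sex.values())
print("Association within every subgroup:", still_associated)
```

Here smokers have the higher HD rate among both males and females, so the positive association between smoking and HD remains after controlling for sex.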
Simpson’s Paradox
- When a trend appears in more than half of the groups of data, but disappears or reverses when the groups are combined
- The relationship between rates in the subgroups disappears or reverses when the groups are combined
- Simpson’s paradox ⇒ there is a confounding variable (but confounder ⇏ Simpson’s paradox)
| Major | No. of Applications (Male) | No. of Successful Applications (Male) | Rate (Male) | No. of Applications (Female) | No. of Successful Applications (Female) | Rate (Female) |
|---|---|---|---|---|---|---|
| A | 2000 | 800 | 40% | 10 | 2 | 20% |
| B | 10 | 8 | 80% | 1000 | 600 | 60% |
| C | 10 | 8 | 80% | 1000 | 600 | 60% |
| Overall | 2020 | 816 | 40.4% | 2010 | 1202 | 59.8% |
Paradox: in every major, the success rate of male applications is higher than that of female applications, yet overall the success rate of male applications is lower.
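The reversal can be reproduced from the raw counts in the table. The following is a minimal sketch; the data structure and helper names are illustrative only.

```python
# (applications, successful applications) per major and sex, from the table.
applications = {
    "A": {"Male": (2000, 800), "Female": (10, 2)},
    "B": {"Male": (10, 8), "Female": (1000, 600)},
    "C": {"Male": (10, 8), "Female": (1000, 600)},
}

def success_rate(applied, successful):
    return successful / applied

# Majors in which the male success rate beats the female success rate.
male_higher_in = [
    major
    for major, sexes in applications.items()
    if success_rate(*sexes["Male"]) > success_rate(*sexes["Female"])
]

def overall(sex):
    """Success rate for one sex after pooling all majors."""
    applied = sum(sexes[sex][0] for sexes in applications.values())
    successful = sum(sexes[sex][1] for sexes in applications.values())
    return success_rate(applied, successful)

overall_male = overall("Male")
overall_female = overall("Female")

print("Male rate is higher in majors:", male_higher_in)
print(f"Overall male rate:   {overall_male:.1%}")
print(f"Overall female rate: {overall_female:.1%}")
```

The reversal arises because almost all male applications go to Major A, which has the lowest success rates, while almost all female applications go to Majors B and C. The pooled male rate is therefore dominated by Major A and the pooled female rate by B and C.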