For mutually exclusive A1,A2,⋯,An (Ai∩Aj=∅ for any i≠j), P(A1∪A2∪⋯∪An)=P(A1)+P(A2)+⋯+P(An)
Use induction on axiom 3
For event A, P(A’)=1−P(A)
S=A∪A’ and A∩A’=∅, so from axioms 2 and 3,
1=P(S)=P(A∪A’)=P(A)+P(A’)
For any events A and B, P(A)=P(A∩B)+P(A∩B’)
Based on A=(A∩B)∪(A∩B′) and (A∩B)∩(A∩B′)=∅ and axiom 3
For any events A and B, P(A∪B)=P(A)+P(B)−P(A∩B)
From A∪B=B∪(A∩B′) and B∩(A∩B′)=∅, and the previous property
P(A∪B)=P(B)+P(A∩B′)=P(B)+P(A)−P(A∩B)
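The inclusion–exclusion identity above can be checked numerically on a small equally likely sample space. A minimal sketch, using a single fair die roll with two arbitrary example events:

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (equally likely outcomes).
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """P(E) = |E| / |S| for equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {1, 2, 3}   # roll at most 3
B = {2, 4, 6}   # roll an even number

# Inclusion-exclusion: P(A∪B) = P(A) + P(B) − P(A∩B)
print(P(A | B))                              # 5/6
print(P(A) + P(B) - P(A & B))                # 5/6
```

Using `Fraction` keeps the arithmetic exact, so the two sides match symbolically rather than up to floating-point error.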
If A⊂B, then P(A)≤P(B)
Since A⊂B, A∪B=B
A∪B=A∪(B∩A′) and A∩(B∩A′)=∅
P(B)=P(A∪B)=P(A∪(B∩A′))=P(A)+P(B∩A′)≥P(A)
Finite sample space with equally likely outcomes
S={a1,a2,⋯,an} , where all ai are equally likely, P(a1)=P(a2)=⋯=P(an)
Then for event A⊂S, P(A) = (no. of sample points in A) / (no. of sample points in S)
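The counting formula can be applied directly by enumerating the sample space. A small sketch with two fair dice, computing the standard P(sum = 7):

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
S = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 7.
A = [(d1, d2) for (d1, d2) in S if d1 + d2 == 7]

# P(A) = (no. of sample points in A) / (no. of sample points in S)
print(Fraction(len(A), len(S)))   # 1/6
```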
Conditional Probability
Probability that event A happens, given that we have the information that “event B has occurred”, or probability of A given B: P(A∣B)=P(A∩B)/P(B)
Kinda changing the sample space to be B, so now the sample points where A happens are those in A∩B
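Conditioning as "restricting the sample space to B" translates directly into counting. A sketch with two fair dice and two example events:

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))     # two fair dice
B = {s for s in S if s[0] + s[1] >= 10}     # given: sum is at least 10
A = {s for s in S if s[0] == 6}             # first die shows 6

# P(A|B) = P(A∩B) / P(B); with equally likely outcomes this
# reduces to counting inside the restricted sample space B.
p_given = Fraction(len(A & B), len(B))
print(p_given)   # 1/2
```

B has 6 outcomes {(4,6),(5,5),(6,4),(5,6),(6,5),(6,6)}, of which 3 start with a 6, so P(A∣B)=1/2 even though the unconditional P(A)=1/6.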
Multiplication rule
P(A∩B)=P(A)P(B∣A) if P(A)≠0
P(A∩B)=P(B)P(A∣B) if P(B)≠0
Inverse probability formula: P(A∣B)=P(A)P(B∣A)/P(B)
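The multiplication rule above covers the classic "sequential draws without replacement" computation. A sketch with a standard 52-card deck:

```python
from fractions import Fraction as F

# Drawing two cards without replacement from a standard 52-card deck.
# A = first card is an ace, B = second card is an ace.
P_A = F(4, 52)            # 4 aces among 52 cards
P_B_given_A = F(3, 51)    # 3 aces left among 51 cards

# Multiplication rule: P(A∩B) = P(A) P(B|A)
print(P_A * P_B_given_A)   # 1/221
```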
Independence
Events A and B are independent, A⊥B iff P(A∩B)=P(A)P(B)
If P(A)≠0, A⊥B iff P(B∣A)=P(B)
Knowledge of A does not change the probability of B
A⊥B ⟺ P(B∣A)=P(A∩B)/P(A)=P(A)P(B)/P(A)=P(B)
A Venn diagram can show mutual exclusivity but can’t show independence (the two are totally different concepts)
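The product definition makes independence easy to verify by enumeration. A sketch with two fair dice, where events on different dice are independent but clearly not mutually exclusive:

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two fair dice

def P(E):
    return Fraction(len(E), len(S))

A = {s for s in S if s[0] % 2 == 0}   # first die is even
B = {s for s in S if s[1] == 3}       # second die shows 3

# Independent: P(A∩B) = P(A)P(B), yet A∩B ≠ ∅ (not mutually exclusive).
print(P(A & B) == P(A) * P(B))   # True
print(len(A & B) > 0)            # True
```

Here P(A)=1/2, P(B)=1/6 and P(A∩B)=3/36=1/12=(1/2)(1/6), so A⊥B even though the events overlap.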
Law of Total Probability
Partition: If A1,A2,⋯,An are mutually exclusive events and A1∪A2∪⋯∪An=S, then A1,A2,⋯,An is a partition of S
LoTP: For a partition A1,A2,⋯,An of S and any event B, P(B)=P(B∩A1)+P(B∩A2)+⋯+P(B∩An)=P(A1)P(B∣A1)+P(A2)P(B∣A2)+⋯+P(An)P(B∣An)
For any events A and B, P(B)=P(A)P(B∣A)+P(A’)P(B∣A’) (special case where the partition is A and A’)
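The two-event special case of LoTP can be sketched with hypothetical numbers (the probabilities below are made up for illustration):

```python
from fractions import Fraction as F

# Hypothetical numbers: a student is well-prepared (A) with probability 0.7;
# they pass the exam (B) with probability 0.9 if prepared, 0.4 otherwise.
P_A = F(7, 10)
P_B_given_A = F(9, 10)
P_B_given_Ac = F(4, 10)

# LoTP with the partition {A, A'}:
# P(B) = P(A)P(B|A) + P(A')P(B|A')
P_B = P_A * P_B_given_A + (1 - P_A) * P_B_given_Ac
print(P_B)   # 3/4
```

The partition {A, A′} guarantees the two terms cover every way B can happen, exactly once each.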
Bayes’ Theorem
For a partition A1,A2,⋯,An of S, any event B, and k=1,2,⋯,n: P(Ak∣B) = P(Ak)P(B∣Ak) / [P(A1)P(B∣A1)+P(A2)P(B∣A2)+⋯+P(An)P(B∣An)]
Derived from conditional probability, the multiplication rule and the law of total probability: P(Ak∣B) = P(Ak∩B)/P(B) = P(Ak)P(B∣Ak) / [P(B∩A1)+⋯+P(B∩An)] = P(Ak)P(B∣Ak) / [P(A1)P(B∣A1)+⋯+P(An)P(B∣An)]
Or derived from P(A∩B)=P(B)P(A∣B)=P(A)P(B∣A) → P(A∣B)=P(A)P(B∣A)/P(B)
Special case where n=2, partition is {A,A’}: P(A∣B) = P(A)P(B∣A) / [P(A)P(B∣A)+P(A′)P(B∣A′)]
P(A∣B) (Posterior Probability): The updated probability of A occurring given that B has occurred. This is the value you are solving for
P(A) (Prior Probability): The initial probability of A occurring before any new information (B) is taken into account
P(B∣A) (Likelihood): The probability of B occurring given that A is true. This is a measure of how likely the evidence (B) is under the hypothesis (A)
P(A′) (Complementary Prior Probability): The probability of the opposite of A occurring
P(B∣A′) (Likelihood of the Alternative): The probability of B occurring given that A is false (i.e., A′ is true). This accounts for false positives or alternative causes of the evidence.
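All four terms above come together in the standard diagnostic-test computation. A sketch with hypothetical numbers (prevalence, sensitivity, and false-positive rate are made up for illustration):

```python
from fractions import Fraction as F

# Hypothetical numbers: A = has the disease, B = test is positive.
prior = F(1, 100)      # P(A): prevalence
like = F(95, 100)      # P(B|A): sensitivity (true-positive rate)
like_alt = F(5, 100)   # P(B|A'): false-positive rate

# Bayes (n=2): P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(A')P(B|A')]
posterior = prior * like / (prior * like + (1 - prior) * like_alt)
print(posterior)            # 19/118
print(float(posterior))     # ~0.161
```

Despite the accurate-looking test, the posterior is only about 16%, because the false positives from the large healthy population (the P(A′)P(B∣A′) term) dominate the denominator.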