Chapter 4. Foundations for Inference
Chapter 5. Inference for Numerical Data
Chapter 6. Inference for Categorical Data
Statistical inference is one branch of statistics.
It is about judging how good our guesses of population parameters are. More specifically, statistical inference is about understanding the quality of parameter estimates.
Generally, there are two methods for assessing the quality of parameter estimates. They are essentially the same thing seen from different points of view.
1. Confidence Interval
- (based on the sampling distribution of the estimate, not the distribution of the sample itself)
- Build a confidence interval around the point estimate (the sample statistic) and check whether the null hypothesis value falls inside the interval
2. Hypothesis Testing
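- (based on the sampling distribution of the estimate, not the distribution of the sample itself)
- Build an interval around the null hypothesis value, using the sampling distribution under the null hypothesis, and check whether the observed estimate falls outside the interval.
- P value: the probability, assuming the null hypothesis is true, of getting an estimate at least as extreme as the one observed. The smaller the p value, the stronger the evidence against the null hypothesis.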
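To see why the two methods are the same idea from different angles, here is a minimal sketch for a single sample mean. It assumes a sample large enough for the normal approximation to hold; the function names and the numbers are made up for illustration and do not come from the chapters.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Interval centered on the point estimate (normal approximation assumed)."""
    se = sample_sd / sqrt(n)  # standard error of the sample mean
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return sample_mean - z_star * se, sample_mean + z_star * se


def two_sided_p_value(sample_mean, sample_sd, n, null_value):
    """Probability, under the null, of an estimate at least this extreme."""
    se = sample_sd / sqrt(n)
    z = (sample_mean - null_value) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: mean 52, standard deviation 10, n = 100, null value 50.
low, high = mean_confidence_interval(52.0, 10.0, 100)
p_value = two_sided_p_value(52.0, 10.0, 100, null_value=50.0)

print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"p value: {p_value:.4f}")
```

With these numbers the null value 50 lies just outside the 95% interval and the p value is just under 0.05, so both methods lead to the same decision at the 5% level.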
When applying these methods, there are two cases: numerical data and categorical data.
1. Numerical Data
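- Measuring the difference between two means
- Sampling distribution of the difference in sample means: standard error = (s1^2/n1 + s2^2/n2)^(1/2), where s1 and s2 are the standard deviations of the two samples and n1 and n2 are their sizes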
2. Categorical Data
- Measuring the difference between two proportions
- Sampling distribution of the difference in sample proportions: standard error = (p1(1-p1)/n1 + p2(1-p2)/n2)^(1/2)
- Conditions: Each sample should contain at least 10 successes and at least 10 failures.
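The two standard error formulas and the success-failure condition can be sketched as below. This is only an illustration: the function names and example numbers are assumptions, and the proportion formula is the unpooled form, as it would be used for a confidence interval.

```python
from math import sqrt


def se_diff_means(s1, n1, s2, n2):
    """Standard error of the difference between two sample means."""
    return sqrt(s1**2 / n1 + s2**2 / n2)


def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two sample proportions."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def success_failure_ok(p, n, minimum=10):
    """Check that a sample has at least `minimum` successes and failures."""
    return n * p >= minimum and n * (1 - p) >= minimum


# Hypothetical inputs, chosen so the arithmetic is easy to follow.
se_means = se_diff_means(3.0, 36, 4.0, 64)            # sqrt(0.25 + 0.25)
se_props = se_diff_proportions(0.5, 100, 0.5, 100)    # sqrt(0.0025 + 0.0025)

print(f"SE (means): {se_means:.4f}")
print(f"SE (proportions): {se_props:.4f}")
print(success_failure_ok(0.5, 100))   # 50 successes, 50 failures
print(success_failure_ok(0.05, 100))  # only about 5 successes
```

The last call shows the condition failing: with p = 0.05 and n = 100 there are only about 5 successes, so the normal approximation should not be trusted for that sample.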
#Comment: Being good at statistics means having a good understanding of the conditions under which a specific statistical method applies.
Knowing only how to apply a statistical method can be dangerous. You might end up making serious mistakes in interpreting data.