Statistics and Probability: Sampling and Hypothesis Testing
What Is Sampling?
Sampling is the process of selecting a subset of individuals or items from a larger population to make inferences about the population as a whole. In statistics, it’s often not practical or possible to study an entire population, so samples are used instead.
There are different sampling methods that can be used, including:
- Random Sampling: Every individual in the population has an equal chance of being selected.
- Systematic Sampling: Individuals are selected at regular intervals from a list.
- Stratified Sampling: The population is divided into subgroups (strata), and samples are taken from each subgroup.
- Cluster Sampling: The population is divided into clusters, a random sample of clusters is selected, and the individuals within the chosen clusters are studied.
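The four methods above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard `random` module: the function names, the toy population of the integers 1 to 100, and the way strata and clusters are formed are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    # Every individual has the same chance of being selected.
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    # Pick a random start, then take every `step`-th individual from the list.
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a separate random sample from each subgroup (stratum).
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep everyone inside them.
    chosen = rng.sample(list(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]

rng = random.Random(0)                 # fixed seed so the run is repeatable
population = list(range(1, 101))       # hypothetical population, N = 100

# Hypothetical strata: odd vs. even IDs. Hypothetical clusters: blocks of 10.
strata = {"odd": population[0::2], "even": population[1::2]}
clusters = {f"block_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)

print("random:    ", sorted(srs))
print("systematic:", systematic)
print("stratified:", {k: sorted(v) for k, v in stratified.items()})
print("cluster:   ", clustered)
```

Note that the systematic sample always has a constant gap between selected positions, while the cluster sample returns every member of the two chosen blocks.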
Key Concepts in Sampling
Sample Size and Population Size
- Sample Size (\(n\)): The number of individuals or items selected from the population. A larger \(n\) typically leads to more accurate estimates.
- Population Size (\(N\)): The total number of individuals or items in the population.
Sampling Error
Sampling error is the difference between a sample statistic and the population parameter it estimates. For the sample mean, its typical magnitude shrinks as \(n\) increases:
\[ \text{Sampling Error} = \bar{x} - \mu \]
where \(\bar{x}\) is the sample mean and \(\mu\) is the population mean.
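A small simulation can make the effect of \(n\) concrete. The sketch below assumes a normally distributed population with \(\mu = 170\) and a standard deviation of 10 (both values are invented for illustration) and averages \(|\bar{x} - \mu|\) over repeated samples of different sizes.

```python
import random
import statistics

def mean_abs_sampling_error(n, mu=170.0, sigma=10.0, trials=200, seed=0):
    """Average |x_bar - mu| over `trials` simulated samples of size n.

    mu and sigma describe a hypothetical normal population.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

errors_by_n = {n: mean_abs_sampling_error(n) for n in (10, 100, 1000)}
for n, err in errors_by_n.items():
    print(f"n = {n:4d}: average |x_bar - mu| = {err:.3f}")
```

Under these assumptions the average error should fall as \(n\) grows, although any single sample can still land closer to or farther from \(\mu\) than the average suggests.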
What Is Hypothesis Testing?
Hypothesis testing is a statistical method to make inferences about population parameters based on sample data. Key steps:
- Formulate Hypotheses: Null (\(H_0\)) and alternative (\(H_1\)).
- Select Significance Level (\(\alpha\)): Typically 0.05.
- Calculate Test Statistic (e.g., \(z\) or \(t\)).
- Make a Decision: Reject or fail to reject \(H_0\).
Null and Alternative Hypotheses
- Null Hypothesis (\(H_0\)): No effect/difference (e.g., \(H_0: \mu = \mu_0\)).
- Alternative Hypothesis (\(H_1\)): Contradicts \(H_0\) (e.g., \(H_1: \mu \neq \mu_0\)).
Example Hypothesis
Test if the average height of students is 170 cm:
- \(H_0: \mu = 170\)
- \(H_1: \mu \neq 170\)
Significance Level and p-Value
Significance Level (\(\alpha\))
Probability of Type I error (rejecting \(H_0\) when true). Common values:
\[ \alpha = 0.05, 0.01, \text{or } 0.10 \]
p-Value
Probability of observing results at least as extreme as the sample data, assuming \(H_0\) is true. Reject \(H_0\) if:
\[ p\text{-value} < \alpha \]
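For a \(z\)-statistic, the two-sided p-value and the decision rule can be computed with the standard library alone. This is a minimal sketch, assuming the test statistic follows a standard normal distribution under \(H_0\); the function names are illustrative.

```python
import math

def two_sided_p_value(z):
    # P(|Z| >= |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

def reject_null(p_value, alpha=0.05):
    # Decision rule: reject H0 when the p-value falls below alpha.
    return p_value < alpha

for z in (1.0, 1.96, 3.0):
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}: p-value = {p:.4f}, reject H0 at 0.05? {reject_null(p)}")
```

A statistic near \(\pm 1.96\) sits right at the 5% boundary, which is why that value appears as the two-tailed critical value for \(\alpha = 0.05\).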
Types of Errors
- Type I Error (\(\alpha\)): Rejecting \(H_0\) when it is true.
- Type II Error (\(\beta\)): Failing to reject \(H_0\) when false.
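Both error rates can be estimated by simulation. The sketch below repeatedly draws samples from a hypothetical normal population and runs a two-tailed \(z\)-test each time. When the true mean equals the hypothesized one, the rejection rate estimates the Type I error rate; when it differs, one minus the rejection rate estimates \(\beta\). All numeric values are assumptions chosen for illustration.

```python
import math
import random
import statistics

def rejection_rate(true_mu, mu0=60.0, sigma=8.0, n=50,
                   trials=2000, z_crit=1.96, seed=1):
    """Fraction of simulated two-tailed z-tests that reject H0: mu = mu0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true (true mean = 60): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=60.0)

# H0 is false (true mean = 62): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=62.0)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

The first estimate should land close to the chosen \(\alpha = 0.05\), while the second depends on how far the true mean is from \(\mu_0\) and on the sample size.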
Example Problem
Test if average student weight is 60 kg (\(\alpha = 0.05\)):
- Sample: \(n = 50\), \(\bar{x} = 62\), \(\sigma = 8\)
- \(H_0: \mu = 60\), \(H_1: \mu \neq 60\)
Solution:
- Calculate \(z\)-statistic:
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{62 - 60}{8/\sqrt{50}} \approx 1.77 \]
- Critical value for \(\alpha = 0.05\) (two-tailed): \(\pm 1.96\).
- Decision: Since \(|z| = 1.77 < 1.96\), fail to reject \(H_0\).
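The same calculation can be checked in Python. This sketch uses `statistics.NormalDist` from the standard library (Python 3.8+) to obtain the critical value and the p-value; the function name `z_test_two_sided` is an illustrative choice, and \(\sigma\) is treated as known.

```python
import math
from statistics import NormalDist

def z_test_two_sided(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test for a mean with known sigma.

    Returns (z, critical value, p-value, reject H0?).
    """
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # upper critical value
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit

# Values from the example: n = 50, sample mean 62, sigma = 8, H0: mu = 60.
z, z_crit, p_value, reject = z_test_two_sided(x_bar=62, mu0=60, sigma=8, n=50)
print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}, p-value = {p_value:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

The p-value comes out near 0.077, which is above 0.05, so the p-value approach agrees with the critical-value decision.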
Common Mistakes in Hypothesis Testing
- Not Defining Hypotheses Clearly: Make sure to clearly define both the null and alternative hypotheses before conducting the test.
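- Confusing p-Value with Significance Level: The p-value is the probability, computed from the data, of a test statistic at least as extreme as the one observed if the null hypothesis is true. The significance level is the threshold chosen before the test, against which the p-value is compared.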
- Misinterpreting Type I and Type II Errors: Be aware of the consequences of making a Type I or Type II error, and adjust the significance level accordingly.
Practice Questions
- A sample of 100 students has an average height of 160 cm. Test the hypothesis that the average height of students in the school is 165 cm at the 5% significance level.
- A new drug is tested on 200 patients. The null hypothesis is that the drug has no effect. The sample mean recovery time is 12 days with a standard deviation of 4 days. Perform a hypothesis test at the 1% significance level.
- In a factory, a machine is tested for accuracy. The null hypothesis is that the machine is accurate to within 0.5 mm. The sample measurement shows a mean of 0.8 mm with a standard deviation of 0.2 mm. Perform a hypothesis test at the 10% significance level.



