Module 7 · Quantitative Methods

Estimation and Inference

Sampling distributions, the Central Limit Theorem (CLT), standard error, and confidence intervals.

In this module

Sampling Distribution of the Mean
Central Limit Theorem (CLT)
Standard Error of the Mean
Confidence Interval — σ Known (z)
Confidence Interval — σ Unknown (t)
Critical Values for Common Confidence Levels

1. Sampling Distribution of the Mean Core

About: If you draw many random samples of size n from the same population, the sample means \(\bar{X}\) themselves form a distribution. Its mean is μ and its variance is σ²/n, which is smaller than the population variance σ² for any n > 1 and keeps shrinking as n grows.

The distribution of \(\bar{X}\) computed from many random samples of size n drawn from the same population.

\[ E(\bar{X}) = \mu, \qquad \text{Var}(\bar{X}) = \frac{\sigma^{2}}{n} \]
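The two identities above can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a normal population with μ = 10 and σ = 15 (both in %), and the function name `simulate_sample_means`, the seed, and the trial count are illustrative choices.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]


# Assumed population parameters (in %): mu = 10, sigma = 15; sample size n = 25.
means = simulate_sample_means(mu=10.0, sigma=15.0, n=25, trials=20_000)

mean_of_means = statistics.fmean(means)    # should land close to mu = 10
var_of_means = statistics.pvariance(means)  # should land close to sigma^2 / n = 9

print(round(mean_of_means, 2), round(var_of_means, 2))
```

The simulated values only approximate μ and σ²/n; the gap narrows as the number of trials grows.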

Practice problem

Population mean μ = 10%, σ = 15%. Sample of n = 25. Compute E(\(\bar{X}\)) and Var(\(\bar{X}\)).

Solution

E(\(\bar{X}\)) = μ = 10%

Var(\(\bar{X}\)) = σ²/n = (15%)²/25 = 225/25 = 9 (in %²)

SE = √9 = 3%

E(\(\bar{X}\)) = 10%, Var(\(\bar{X}\)) = 9 %², SE = 3%

2. Central Limit Theorem (CLT) Core

About: For a sufficiently large sample (n ≥ 30 is the usual rule of thumb), the sampling distribution of the mean is approximately normal whatever the shape of the population distribution, provided the population variance is finite. This is the foundation of nearly all inferential statistics.

For sample size \(n \ge 30\) (a rule of thumb, not a sharp threshold), the sampling distribution of \(\bar{X}\) is approximately normal regardless of the population's distribution, provided its variance is finite. Heavily skewed or fat-tailed populations may need a larger n.

\[ \bar{X} \overset{\text{approx}}{\sim} N\!\left(\mu,\ \frac{\sigma^{2}}{n}\right) \]

Why it matters: CLT lets us use normal/t tests on means even when the underlying data is non-normal — the workhorse of inferential statistics.
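A small simulation can make the CLT concrete. This sketch is an illustration added here, under stated assumptions: the population is exponential with rate 1 (so μ = 1 and σ = 1, strongly right-skewed), and the helper name `sample_means`, the seed, and the trial count are hypothetical choices. If the normal approximation is reasonable at n = 36, roughly 95% of sample means should fall within ±1.96 standard errors of μ.

```python
import math
import random
import statistics


def sample_means(n, trials, seed=1):
    """Means of `trials` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]


n = 36
means = sample_means(n, trials=10_000)

mu, sigma = 1.0, 1.0            # mean and sd of the Exponential(1) population
se = sigma / math.sqrt(n)       # standard error of the mean

# Share of sample means within mu +/- 1.96 * SE.
inside = sum(abs(m - mu) <= 1.96 * se for m in means)
coverage = inside / len(means)

print(round(statistics.fmean(means), 3))  # expected to be near 1.0
print(round(coverage, 3))                 # expected to be near 0.95
```

Rerunning with a smaller n (say 5) typically moves the coverage further from 0.95, which is why n ≥ 30 is treated as a rule of thumb.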

Practice problem

A sample of n = 36 monthly returns is drawn from a non-normal population with σ = 4%. According to the CLT, what is the approximate distribution of \(\bar{X}\)?

Solution

CLT: n = 36 ≥ 30, so \(\bar{X}\) is approximately normal even though the population is not.

SE = σ/√n = 4%/√36 = 4%/6 ≈ 0.667%

\(\bar{X}\) ~ N(μ, (0.667%)²) approximately, where μ is the population mean (not given in the problem)

3. Standard Error of the Mean Core

About: SE measures the uncertainty of the sample mean. It shrinks in proportion to 1/√n, so halving SE requires 4× the sample size. It is distinct from σ, which describes the variation of individual observations.

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \quad \text{(known } \sigma\text{)} \qquad s_{\bar{X}} = \frac{s}{\sqrt{n}} \quad \text{(unknown } \sigma\text{)} \]
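The formula is a one-liner in code. The sketch below is illustrative only; the function name `standard_error` and the input values (standard deviation 2.0, sample sizes 64 and 256) are made up to show the 1/√n effect.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n).

    `sd` is sigma when the population value is known, otherwise the sample s.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sd / math.sqrt(n)


se_small = standard_error(2.0, 64)    # 2.0 / 8  = 0.25
se_large = standard_error(2.0, 256)   # 2.0 / 16 = 0.125

print(se_small, se_large)  # quadrupling n halves the standard error
```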

Practice problem

A sample of 100 daily returns has standard deviation s = 1.5%. Compute the standard error of the mean.

Solution

\(s_{\bar{X}} = 1.5\%/\sqrt{100} = 1.5\%/10\)

= 0.15%

4. Confidence Interval — σ Known Core

About: A 95% CI is built by a procedure that captures the true μ in 95% of repeated samples. Its width depends on the confidence level (through the z-value) and on SE. Use z when σ is known, which is rare in practice.

Normal distribution: 95% of mass within ±1.96σ; 99% within ±2.58σ.

\[ \bar{X} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

Components

\(z_{\alpha/2}\): standard-normal critical value; 1.645 (90%), 1.96 (95%), 2.576 ≈ 2.58 (99%).
\(\sigma/\sqrt{n}\): standard error of the mean.
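As a sketch of how the interval is assembled, the snippet below computes the critical value with `statistics.NormalDist().inv_cdf` from the Python standard library. The function name `z_confidence_interval` and the inputs (mean 5.0, σ = 2.0, n = 100) are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Two-sided CI for the population mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # e.g. about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)            # z times the standard error
    return mean - margin, mean + margin


low, high = z_confidence_interval(mean=5.0, sigma=2.0, n=100)
print(round(low, 3), round(high, 3))  # 4.608 5.392
```

Raising `confidence` to 0.99 widens the interval, because the critical value rises from about 1.96 to about 2.576.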

Practice problem

A sample of 36 monthly returns has \(\bar{X}\) = 1.2% and known σ = 3%. Build the 95% confidence interval for the population mean.

Solution

SE = 3%/√36 = 0.5%

CI = 1.2% ± 1.96(0.5%) = 1.2% ± 0.98%

95% CI ≈ [0.22%, 2.18%]

5. Confidence Interval — σ Unknown Core

About: When σ is unknown (almost always the case in practice), use the sample standard deviation s and the t-distribution with df = n − 1. The resulting intervals are wider than z-based ones, reflecting the added uncertainty from estimating σ.

Use the Student's t distribution with \(df = n - 1\) when σ is unknown (almost always the case in practice).

\[ \bar{X} \pm t_{\alpha/2,\,n - 1} \cdot \frac{s}{\sqrt{n}} \]

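For large n (roughly \(n > 30\)), \(t \approx z\) because the t distribution converges to the standard normal as df grows. The t value is always slightly larger than z, for example 2.045 at df = 29 versus 1.96.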
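The t-based interval has the same shape as the z-based one, with s and a t critical value swapped in. The Python standard library has no t-distribution quantile function, so this sketch assumes the critical value is read from a t-table and passed in; the function name `t_confidence_interval` and the inputs (n = 16, mean 4.0, s = 2.0, t = 2.131 for df = 15 at 95%) are illustrative.

```python
import math


def t_confidence_interval(mean, s, n, t_crit):
    """Two-sided CI for the population mean when sigma is unknown.

    `t_crit` is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    se = s / math.sqrt(n)   # estimated standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin


# df = 16 - 1 = 15; the 95% two-tailed table value is 2.131.
t_low, t_high = t_confidence_interval(mean=4.0, s=2.0, n=16, t_crit=2.131)

# Same data with the z value 1.96, for comparison only.
z_low, z_high = t_confidence_interval(mean=4.0, s=2.0, n=16, t_crit=1.96)

print(round(t_low, 4), round(t_high, 4))  # t-based interval
print(round(z_low, 4), round(z_high, 4))  # narrower z-based interval
```

The comparison shows the point made in the text: at small n the t interval is visibly wider than the one a z value would give.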

Practice problem

n = 25, \(\bar{X}\) = 8%, s = 5%. Build the 95% confidence interval (t = 2.064).

Solution

df = n − 1 = 24, so t = 2.064

SE = 5%/√25 = 1.0%

CI = 8% ± 2.064(1.0%) = 8% ± 2.064%

95% CI ≈ [5.94%, 10.06%]

6. Common Critical Values Reference

About: Memorize 1.645 (90%), 1.96 (95%), and 2.576 ≈ 2.58 (99%) for two-tailed z-tests and confidence intervals. These show up throughout CFA hypothesis testing.

z critical values (two-tailed)

90% ±1.645
95% ±1.960
99% ±2.576
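These table values can be reproduced from the standard normal quantile function rather than memorized blindly. A minimal sketch using `statistics.NormalDist` from the Python standard library follows; the helper name `two_tailed_z` is an illustrative choice.

```python
from statistics import NormalDist


def two_tailed_z(confidence):
    """Critical value z_{alpha/2} for a two-tailed interval or test."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: ±{two_tailed_z(level):.3f}")
# 90%: ±1.645
# 95%: ±1.960
# 99%: ±2.576
```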