Tests and Confidence Intervals

A test or a confidence interval is a way to quantify the error in our estimate that comes from having limited data.

Tests

Tests are the backbone of statistical modeling. With a test, you can confirm or refute an entire piece of work in a second. But to be transparent, tests are still quite fuzzy for me, so I will explain what I have understood:

To choose between a hypothetical model $H_0$, called the null hypothesis, and another model $H_1$, called the alternative hypothesis, you can apply a methodological framework: a test. There are two ways to run it: with a quantile or with a p-value. Both are described below, the quantile method first.

  1. Define a statistical model:
$$\mathcal M := ( \mathcal X, \mathcal A, \{ \mathbb P_{\theta} \}_{\theta \in \Theta} )$$
  2. Define:
  • Null hypothesis $H_0$: $\theta \in \Theta_0 \subset \Theta$
  • Alternative hypothesis $H_1$: $\theta \in \Theta_1 \subset \Theta$
  3. Create a statistic $t(X)$ that helps distinguish between $H_0$ and $H_1$. Under $H_0$, the distribution of $t(X)$ is denoted by $L$ (i.e., the law of $t(X)$ under $\mathbb P_{\theta_0}$).

  4. Fix a significance level $\alpha \in (0,1)$ (e.g., $\alpha = 0.05$) and use the quantile of order $1 - \alpha$ of the distribution $L$, denoted by $q_{1-\alpha}^L$ and defined by: $\mathbb P_{\theta_0}(t(X) \leq q_{1-\alpha}^L) = 1 - \alpha$. This means that, under $H_0$, the probability that $t(X)$ exceeds $q_{1 - \alpha}^L$ is exactly $\alpha$.

(Note: this describes a one-sided test where large values of $t(X)$ are evidence against $H_0$. Adjust accordingly for two-sided or left-tailed tests.)

  5. Compute the observed value $t(x)$ of the test statistic from the data:
  • If $t(x) \geq q^L_{1-\alpha}$, we are in a rare event under $H_0$ $\Rightarrow$ reject $H_0$.
  • Otherwise, we do not reject $H_0$.
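
To get a feel for what "level $\alpha$" means in the steps above, here is a small simulation sketch. It assumes, purely for illustration, that the law $L$ of $t(X)$ under $H_0$ is a standard normal; the function name `rejection_rate` and its parameters are mine, not part of any library. It draws many statistics under $H_0$ and counts how often they fall beyond $q_{1-\alpha}^L$.

```python
import random
from statistics import NormalDist


def rejection_rate(alpha: float, n_sim: int = 20_000, seed: int = 0) -> float:
    """Fraction of statistics simulated under H0 that land in the rejection region."""
    rng = random.Random(seed)
    null_law = NormalDist(mu=0.0, sigma=1.0)  # assumed law L of t(X) under H0
    q = null_law.inv_cdf(1 - alpha)           # quantile of order 1 - alpha of L
    # One-sided rule from the steps above: reject when t(x) >= q.
    rejections = sum(rng.gauss(0.0, 1.0) >= q for _ in range(n_sim))
    return rejections / n_sim


rate = rejection_rate(alpha=0.05)
print(f"empirical rejection rate under H0: {rate:.4f}")
```

The printed rate should land close to $\alpha$: when $H_0$ is true, the test rejects it wrongly about a fraction $\alpha$ of the time, and that is exactly the error we accept when fixing the level.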

Give the statistical test of level 10% to check whether the normal distribution $\mathcal N(\theta, 2)$ is centered at 2 or not. You have a sample of two observations, $x=(3.7, 3.9)$.
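
One possible way to work this exercise with the quantile method is sketched below. It rests on assumptions that the statement does not spell out: the 2 in $\mathcal N(\theta, 2)$ is read as the variance, the two observations are i.i.d., the test is two-sided ($H_0$: $\theta = 2$ against $H_1$: $\theta \neq 2$), and the statistic is the standardized sample mean, which is standard normal under $H_0$.

```python
from math import sqrt
from statistics import NormalDist, mean

x = [3.7, 3.9]   # observations from the exercise
theta_0 = 2.0    # value of theta under H0
variance = 2.0   # assumption: the 2 in N(theta, 2) is the variance
alpha = 0.10     # level of the test

n = len(x)
std_error = sqrt(variance / n)            # standard deviation of the sample mean
t_obs = (mean(x) - theta_0) / std_error   # observed statistic, N(0, 1) under H0

# Two-sided test: alpha is split between both tails, hence the order 1 - alpha / 2.
q = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(t_obs) >= q

print(f"t(x) = {t_obs:.3f}, quantile = {q:.3f}, reject H0: {reject}")
```

Under these assumptions the standard error is 1, so $t(x) = 1.8$, which is above the quantile of order 0.95 of the standard normal (about 1.645): $H_0$ is rejected at level 10%. If the 2 were a standard deviation instead, the conclusion would change, so the reading of the notation matters.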

  1. Define a statistical model:
$$\mathcal{M} := ( \mathcal{X}, \mathcal{A}, \{ \mathbb{P}_{\theta} \}_{\theta \in \Theta} )$$
  2. Define:
  • Null hypothesis $H_0$: $\theta \in \Theta_0 \subset \Theta$
  • Alternative hypothesis $H_1$: $\theta \in \Theta_1 \subset \Theta$
  3. Construct a statistic $t(X)$, a function of the data that helps distinguish between $H_0$ and $H_1$.

Under $H_0$, the distribution (or "law") of $t(X)$ is denoted: $t(X) \sim L \quad \text{under } \mathbb{P}_{\theta_0} \in H_0$

  4. Compute the observed value $t(x)$ of the statistic.

  5. Compute the p-value. The p-value is the probability, under the null hypothesis, of observing a value of the test statistic as extreme or more extreme than the one observed:

$$\text{p-value} := \mathbb P_{\theta_0}(t(X) \geq t(x))$$

(Note: this describes a one-sided test where large values of $t(X)$ are evidence against $H_0$. Adjust accordingly for two-sided or left-tailed tests.)

  6. Choose a significance level $\alpha$, typically $\alpha = 0.05$.
  • If p-value $\leq \alpha$: Reject $H_0$
  • If p-value $> \alpha$: Do not reject $H_0$

Give the statistical test of level 10% to check whether the normal distribution $\mathcal N(\theta, 2)$ is centered at 2 or not. You have a sample of two observations, $x=(3.7, 3.9)$.
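
Here is a sketch of the same exercise with the p-value method. The assumptions are mine and not stated in the exercise: the 2 in $\mathcal N(\theta, 2)$ is the variance, the observations are i.i.d., and the test is two-sided, so the p-value counts both tails: $2\,\mathbb P_{\theta_0}(t(X) \geq |t(x)|)$ for a statistic that is standard normal under $H_0$.

```python
from math import sqrt
from statistics import NormalDist, mean

x = [3.7, 3.9]   # observations from the exercise
theta_0 = 2.0    # value of theta under H0
variance = 2.0   # assumption: the 2 in N(theta, 2) is the variance
alpha = 0.10     # level of the test

# Standardized sample mean: N(0, 1) under H0.
t_obs = (mean(x) - theta_0) / sqrt(variance / len(x))

# Two-sided p-value: probability, under H0, of a statistic at least as far
# from 0 as the observed one, in either direction.
p_value = 2 * (1 - NormalDist().cdf(abs(t_obs)))
reject = p_value <= alpha

print(f"t(x) = {t_obs:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these assumptions the p-value is about 0.072, below 0.10, so $H_0$ is rejected at level 10%. At the more usual level of 5% it would not be rejected, which shows one advantage of the p-value: it is computed once and can then be compared to any level.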

As you saw, tests really depend on the statistic (and its associated distribution) that you choose! As I said, I am not fluent in tests, but I should point out that many different statistics and distributions exist: Wald, Fisher, Student, $\chi^2$, likelihood-ratio, etc. Some of them are especially useful in a specific context. For example, when:

  • you don't know the variance, use a Student distribution.
  • $\Theta_0 \subset \Theta_1$, use a Wald or a likelihood-ratio statistic (which asymptotically follow a $\chi^2$ distribution under $H_0$).

Also, be careful to put the rejection region on the correct side (or sides) of your density. Some mathematicians have tried to automate this choice (Neyman-Pearson, etc.), but that complicates the process for little gain.
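
To make the "correct side" concrete, here is a minimal sketch of how the rejection region moves with the alternative hypothesis. It assumes a statistic that is standard normal under $H_0$; the helper `rejection_region` and the labels `"greater"`, `"less"` and `"two-sided"` are hypothetical names chosen for this illustration.

```python
from math import inf
from statistics import NormalDist


def rejection_region(alternative: str, alpha: float) -> tuple[float, float]:
    """Return (lower, upper): reject H0 when t(x) <= lower or t(x) >= upper."""
    null_law = NormalDist()  # assumed law of t(X) under H0
    if alternative == "greater":    # large values of t(X) speak against H0
        return -inf, null_law.inv_cdf(1 - alpha)
    if alternative == "less":       # small values of t(X) speak against H0
        return null_law.inv_cdf(alpha), inf
    if alternative == "two-sided":  # both tails, alpha / 2 in each
        return null_law.inv_cdf(alpha / 2), null_law.inv_cdf(1 - alpha / 2)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("greater", "less", "two-sided"):
    lower, upper = rejection_region(alt, alpha=0.10)
    print(f"{alt:>9}: reject if t <= {lower:.3f} or t >= {upper:.3f}")
```

At the same level, the one-sided threshold is less extreme than the two-sided one, because the two-sided test has to share $\alpha$ between both tails. Picking the wrong side means looking for evidence where $H_1$ would never put it.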

If your aim is to understand machine learning, you will see that we can go really far with tests.

TODO: power of a test

Confidence Intervals

How to create a CI

In more than 1D

Bonferroni

Wald