Chung and Romano (2013), "Exact and Asymptotically Robust Permutation Tests", Annals of Statistics.

Introduction

$X_1, \ldots, X_m \sim P$ IID.
$Y_1, \ldots, Y_n \sim Q$ IID. The $X$'s and the $Y$'s are independent.
Let $N = m + n$ and write

\begin{align*}Z = (Z_1, \ldots, Z_N) = (X_1, \ldots, X_m, Y_1, \ldots, Y_n)\end{align*}

$\bar \Omega = \{(P,Q) : P = Q\}$ . When $(P,Q) \in \bar \Omega$ , the joint distribution of $(Z_1, \ldots, Z_N)$ is the same as that of $(Z_{\pi(1)}, \ldots, Z_{\pi(N)})$ for any permutation $(\pi(1), \ldots, \pi(N))$ of $\{1, \ldots, N\}$ .
Let $\mathbf{G}_N$ be the set of all such permutations (so $|\mathbf{G}_N| = N!$ ).
Test statistic: $T_{m,n} = T_{m,n}(Z_1, \ldots, Z_N)$ . In principle, we want to choose $T$ so that power is maximized.
Compute $T_{m,n}(Z_{\pi(1)}, \ldots, Z_{\pi(N)})$ for every permutation $\pi \in \mathbf{G}_N$ and order the resulting values as

\begin{align*}T_{m,n}^{(1)} \le T_{m,n}^{(2)} \le \cdots \le T_{m,n}^{(N!)} \end{align*}

Let $k = N! - [\alpha N!]$ for a nominal level $0 \lt \alpha \lt 1$ , where $[\cdot]$ denotes the integer part.
Define $\phi(Z) = 1$ if $T_{m,n}(Z) \gt T_{m,n}^{(k)}(Z)$ and $\phi(Z) = 0$ if $T_{m,n}(Z) \lt T_{m,n}^{(k)}(Z)$ . For the case of ties (where the test is randomized), see the paper.
Then, for any $(P,Q) \in \bar \Omega$ ,

\begin{align*}E_{P,Q}[\phi(Z)] = \alpha \end{align*}

Also define the permutation distribution of $T_{m,n}$ as $\hat R_{m,n}^T(t) = \frac{1}{N!}\sum_{\pi \in \mathbf{G}_N}I(T_{m,n}(Z_{\pi(1)}, \ldots, Z_{\pi(N)}) \le t)$ .

Permutation test: Reject $H_0$ if $T_{m,n}(Z) \gt T_{m,n}^{(k)}(Z)$ , or equivalently if $T_{m,n}(Z)$ exceeds the $1 - \alpha$ quantile of $\hat R_{m,n}^T(\cdot)$ .
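
The procedure above can be sketched in code for very small samples. This is a minimal illustration, not the paper's implementation: the function names `t_stat` and `exact_permutation_test` are hypothetical, the statistic is assumed to be the scaled difference of means, all $N!$ permutations are enumerated by brute force, and the randomization on ties is omitted (so the test is slightly conservative when $T_{m,n}(Z) = T_{m,n}^{(k)}(Z)$).

```python
import itertools
import math


def t_stat(z, m):
    """Assumed statistic: sqrt(N) * (mean of first m entries - mean of the rest)."""
    n = len(z) - m
    return math.sqrt(len(z)) * (sum(z[:m]) / m - sum(z[m:]) / n)


def exact_permutation_test(x, y, alpha=0.05):
    """Enumerate all N! permutations; only feasible for tiny N (hypothetical helper)."""
    z = list(x) + list(y)
    m = len(x)
    # T^{(1)} <= ... <= T^{(N!)}: the statistic recomputed on every permutation.
    perm_stats = sorted(t_stat(p, m) for p in itertools.permutations(z))
    n_perms = len(perm_stats)                    # equals N!
    k = n_perms - math.floor(alpha * n_perms)    # k = N! - [alpha N!]
    critical = perm_stats[k - 1]                 # T^{(k)}, converted to a 0-based index
    observed = t_stat(z, m)
    # Ties (observed == critical) are not randomized here, unlike in the paper.
    return observed, critical, observed > critical


if __name__ == "__main__":
    obs, crit, reject = exact_permutation_test([10, 11, 12], [1, 2, 3])
    print(f"observed={obs:.3f}, critical={crit:.3f}, reject={reject}")
```

In practice $N!$ grows too quickly for enumeration, so one would normally sample permutations at random instead of listing them all.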

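For a null hypothesis $H_0: (P,Q) \in \Omega_0$ with $\Omega_0 \subset \bar \Omega$ , the permutation test always has rejection probability exactly $\alpha$ (exactness).
Problems arise when $\Omega_0$ is larger than $\bar \Omega$ , because under the null hypothesis the distribution of the permuted data can then differ from the distribution of the original data.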
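For example, take $\Omega_0 = \{(P,Q) : \mu(P) = \mu(Q)\}$ with test statistic $T_{m,n} = \sqrt{N}(\bar X_m - \bar Y_n)$ . On the part of the null where the distributions differ, $\{(P,Q) : \mu(P) = \mu(Q), P \neq Q\}$ , the permutation test is no longer valid: its rejection probability need not be $\alpha$ , even asymptotically.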
Neuhaus (1993): by proper studentization of a test statistic, the permutation test can result in asymptotically valid inference even when the underlying distributions are not the same.
The goal of this paper: we would like to retain the exactness property when $P = Q$ , and also have the asymptotic rejection probability be $\alpha$ for the more general null hypothesis specifying the parameter.

Robust studentized two-sample test

Let $\theta(\cdot)$ be a real-valued parameter. The null hypothesis of interest is $H_0: \theta(P) = \theta(Q)$ .
Assume that $\hat \theta_m = \hat \theta_m(X_1, \ldots, X_m)$ and $\hat \theta_n = \hat \theta_n(Y_1, \ldots, Y_n)$ satisfy

\begin{align*}m^{1/2}(\hat \theta_m - \theta(P)) & = \frac{1}{\sqrt{m}}\sum_{i = 1}^m f_P(X_i) + o_P(1) \\ n^{1/2}(\hat \theta_n - \theta(Q)) & = \frac{1}{\sqrt{n}}\sum_{j = 1}^n f_Q(Y_j) + o_P(1) \end{align*}

Assume also that the same kind of asymptotic linearity holds for IID samples from the mixture distribution $\bar P = p P + q Q$ , where $q = 1 - p$ .
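
As a quick illustration of the asymptotic linearity assumption, here are two standard examples (textbook facts, not taken from these notes). For the mean the expansion holds exactly with no remainder; for the median it is assumed here that $P$ has a density $P'$ that is positive at $\theta(P)$ .

\begin{align*}\theta(P) = \mu(P): \quad & f_P(x) = x - \mu(P) \\ \theta(P) = \text{median}(P): \quad & f_P(x) = \frac{1/2 - I(x \le \theta(P))}{P'(\theta(P))}\end{align*}

In both cases $E_P f_P(X) = 0$ , and $\sigma^2(P) = E_P f_P^2(X)$ is the asymptotic variance of $m^{1/2}(\hat \theta_m - \theta(P))$ .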

Theorem 2.1: Consider the null hypothesis $H_0: \theta(P) = \theta(Q)$ and the test statistic $T_{m,n} = N^{1/2}[\hat \theta_m(X_1, \ldots, X_m) - \hat \theta_n(Y_1, \ldots, Y_n)]$ . Then the permutation distribution of $T_{m,n}$ satisfies $\sup_t | \hat R_{m,n}^T(t) - \Phi(t/\tau(\bar P))| \overset{p}{\to} 0$ , where $\tau^2(\bar P) = \sigma^2(\bar P)/(p(1-p))$ and $p = \lim m/N$ .

Remark: The true asymptotic distribution of $T_{m,n}$ is normal with mean $0$ and variance $\frac{1}{p}\sigma^2(P) + \frac{1}{1-p}\sigma^2(Q)$ . In general this differs from the limit of $\hat R_{m,n}^T(t)$ .
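
To make the mismatch concrete, here is a worked example for the difference of means. It assumes $f_P(x) = x - \mu(P)$ , so that $\sigma^2(P)$ is the variance of $P$ , and it takes $p = \lim m/N$ , the convention under which the variance in the Remark is correct. Under $\mu(P) = \mu(Q)$ the mixture $\bar P = pP + (1-p)Q$ has variance $p\sigma^2(P) + (1-p)\sigma^2(Q)$ , hence

\begin{align*}\tau^2(\bar P) & = \frac{p\,\sigma^2(P) + (1-p)\,\sigma^2(Q)}{p(1-p)} = \frac{1}{1-p}\sigma^2(P) + \frac{1}{p}\sigma^2(Q) \\ \frac{1}{p}\sigma^2(P) + \frac{1}{1-p}\sigma^2(Q) - \tau^2(\bar P) & = \frac{(1 - 2p)\,(\sigma^2(P) - \sigma^2(Q))}{p(1-p)} \end{align*}

The two variances therefore agree only when $p = 1/2$ or $\sigma^2(P) = \sigma^2(Q)$ . The permutation distribution effectively swaps the weights on the two variances, which is why the unstudentized test can over- or under-reject when sample sizes and variances are both unequal.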

Theorem 2.2: Suppose that consistent estimators $\hat \sigma_m$ and $\hat \sigma_n$ of $\sigma(P)$ and $\sigma(Q)$ are available, and let $V_{m,n} = \sqrt{\frac{N}{m}\hat \sigma_m^2 + \frac{N}{n}\hat \sigma^2_n }$ . Then the permutation distribution $\hat R_{m,n}^S(\cdot)$ of $S_{m,n} = T_{m,n}/V_{m,n}$ satisfies $\sup_t | \hat R_{m,n}^S(t) - \Phi(t)| \overset{p}{\to} 0$ .
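
A studentized permutation test in the spirit of Theorem 2.2 might look like the sketch below for the difference of means. Several choices here are assumptions made for illustration and are not taken from the paper: the function names are hypothetical, $\hat \sigma_m^2$ and $\hat \sigma_n^2$ are the usual sample variances, the test is one-sided (large $S_{m,n}$ rejects), permutations are sampled at random instead of enumerating all $N!$ , and the p-value uses the common $(\text{count} + 1)/(B + 1)$ Monte Carlo convention.

```python
import math
import random
import statistics


def studentized_stat(z, m):
    """S = T / V with T = sqrt(N)(mean_x - mean_y), V^2 = (N/m) s_x^2 + (N/n) s_y^2."""
    x, y = z[:m], z[m:]
    n, big_n = len(y), len(z)
    t = math.sqrt(big_n) * (statistics.fmean(x) - statistics.fmean(y))
    # Sample variances need at least 2 points per group and must not both be zero.
    v = math.sqrt(big_n / m * statistics.variance(x) + big_n / n * statistics.variance(y))
    return t / v


def studentized_permutation_pvalue(x, y, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation p-value (hypothetical helper)."""
    rng = random.Random(seed)        # fixed seed so the sketch is reproducible
    z = list(x) + list(y)
    m = len(x)
    observed = studentized_stat(z, m)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(z)               # random permutation of the pooled sample
        # The variance estimates are recomputed on each permuted sample.
        if studentized_stat(z, m) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    x = [10.1, 11.3, 12.2, 13.4, 14.0, 15.7]
    y = [1.2, 2.9, 3.1, 4.4, 5.8, 6.5]
    s_obs, p_value = studentized_permutation_pvalue(x, y)
    print(f"S = {s_obs:.3f}, p-value = {p_value:.4f}")
```

The key design point is that the studentization happens inside the permutation loop: dividing only the observed statistic by $V_{m,n}$ would not change the permutation distribution's limiting variance.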

For the k-sample case, see Theorem 3.1.