In calculating an empirical p-value for a test statistic $T_{0}$ (in my case the KS test statistic, $D$) using a permutation test with the formula

$\tilde{p}_{N}(x)=\frac{N\tilde{G}_N(x)+1}{N+1}$ $\quad$ (3.4)

where

$\tilde{G}_N(x) = 1 - \frac{1}{N}\sum_{i=1}^{N}1_{[0,\infty)}(x-T^{(i)})+\frac{1}{N}\sum_{i=1}^{N}1_{[0]}(T^{(i)}-x)1_{[0,\infty)}(U_{i}-U_{0})$

taken from Dufour and Farhat's department paper (see (3.4) on pg. 6), I have a couple of questions. First, how do I interpret $1_{[0,\infty)}$ and $1_{[0]}$? For clarification, the $U_{i}$ are uniform random variates used to order the statistics $T^{(i)}$ in case of ties.

My second question is also basic and regards $x$. On page 5, the p-value function is defined as $G(x)=P[T\geq x|H_{0}]$. Is $x$ the originally calculated $T_{0}$? So in (3.4) we find the difference between our calculated test statistic, $T^{(i)}$ and $T_{0}$?

I have the feeling that this isn't that complicated but the notation is throwing me off. I found this post which explains a bit of my first question but I'm missing some key notation to piece it all together. I just want to understand the procedure. Thanks for any help!

Vedom
  • This was simple, and embarrassingly so. The $1_{A}$ function in this instance becomes one when $T_{0}$ is greater than or equal to the statistic calculated from the permuted sample. It essentially finds the proportion of times that $T_{0}$ is at least as large as the statistic $(T^{(i)})$ from the random permuted sample and calculates a p-value from there. This is more embarrassing than asking a friend a question only to answer it yourself right after articulating it. Thanks for the sounding board! – Vedom Aug 20 '12 at 19:57

1 Answer

This was simple, and embarrassingly so. "$1_{A}$" was actually the indicator function $1_{A}(z)$, defined by $1_{A}(z)=1$ when $z\in A$ and $1_{A}(z)=0$ when $z\notin A$. So in this instance $1_{[0,\infty)}(x-T^{(i)})$, evaluated at $x=T_{0}$, returns one when $T_{0}$ is greater than or equal to the statistic calculated from the permuted sample $(T^{(i)})$, and $1_{[0]}(T^{(i)}-x)$ returns one only when the two are exactly tied. The formula essentially finds the proportion of times that $T_{0}$ is at least as large as the statistic $(T^{(i)})$ from the random permuted sample, subtracts that proportion from one, adds back the ties for which $U_{i}\geq U_{0}$, and calculates a p-value from there using (3.4). This clarification was found on page 6 of the cited paper but was confusing with the repeated use of $x$ as a symbol representing different variables in the same line.
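For anyone who wants the procedure spelled out, here is a minimal Python sketch of (3.4) as I read it, not code from the paper. The names `indicator_nonneg`, `indicator_zero` and `mc_p_value` are made up for illustration, and I am assuming that $U_{0}$ and the $U_{i}$ are independent uniform draws on $[0,1)$ and that the permuted statistics $T^{(i)}$ have already been computed.

```python
import random


def indicator_nonneg(z):
    """1_{[0, inf)}(z): one when z >= 0, zero otherwise."""
    return 1 if z >= 0 else 0


def indicator_zero(z):
    """1_{[0]}(z): one only when z is exactly zero."""
    return 1 if z == 0 else 0


def mc_p_value(t0, t_perm, rng=None):
    """Empirical p-value (3.4) for observed t0 given permuted statistics t_perm.

    rng is a hypothetical optional random.Random instance, used only to
    draw the tie-breaking uniforms U_0 and U_1..U_N (an assumption here).
    """
    rng = rng or random.Random()
    n = len(t_perm)
    u0 = rng.random()
    u = [rng.random() for _ in range(n)]

    # First sum: number of permuted statistics with T^(i) <= t0.
    at_or_below = sum(indicator_nonneg(t0 - t) for t in t_perm)

    # Second sum: exact ties whose uniform draw is at least U_0.
    ties_kept = sum(
        indicator_zero(t - t0) * indicator_nonneg(ui - u0)
        for t, ui in zip(t_perm, u)
    )

    g = 1 - at_or_below / n + ties_kept / n  # G-tilde_N(t0)
    return (n * g + 1) / (n + 1)             # p-tilde_N(t0)


# Two of the four permuted statistics exceed t0 and there are no ties,
# so G-tilde_N = 0.5 and the p-value is (4 * 0.5 + 1) / 5.
print(mc_p_value(2.0, [1.0, 3.0, 4.0, 0.5]))
```

Without ties the uniform draws play no role and the result is just (number of $T^{(i)} > T_{0}$, plus one) divided by $N+1$. With ties the result depends on the random draws, so pass a seeded `random.Random` if you need reproducibility.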

Answering my own question here is more embarrassing than asking a friend a question only to answer it yourself right after articulating it. Thanks for the sounding board though!

Vedom