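Consider the simultaneous diagonalization of $A^TA$ and $I-P$, which exists because both matrices are symmetric and $A^TA$ is positive definite (assuming $A$ has full column rank, so that $A^TA$ is invertible). That is, there are a nonsingular $Y$ and a diagonal $D$ such that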
$$Y^TA^TAY=I$$
$$Y^T(I-P)Y=D$$
In particular, the columns of $Y$ can be computed as eigenvectors of $(A^TA)^{-1}(I-P)$, scaled so that $Y^TA^TAY=I$, since we have
$$(A^TA)^{-1}(I-P)Y=YD.$$
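As a numerical sanity check, here is a minimal sketch of one way to build such a $Y$ with NumPy. It uses a Cholesky factor of $A^TA$ followed by a symmetric eigendecomposition, which is a standard route but not necessarily the one intended above. The helper name `simultaneous_diagonalize`, the random test matrices, and the choice of $P$ as an orthogonal projector are all assumptions made for illustration.

```python
import numpy as np

def simultaneous_diagonalize(M, N):
    """Return (Y, d) with Y.T @ M @ Y = I and Y.T @ N @ Y = diag(d).

    Assumes M is symmetric positive definite and N is symmetric.
    """
    L = np.linalg.cholesky(M)        # M = L L^T
    Linv = np.linalg.inv(L)
    C = Linv @ N @ Linv.T            # symmetric, congruent to N
    d, V = np.linalg.eigh(C)         # C = V diag(d) V^T, d ascending
    Y = Linv.T @ V                   # Y^T M Y = V^T V = I
    return Y, d

rng = np.random.default_rng(0)
n, m, k = 6, 9, 2
A = rng.standard_normal((m, n))      # full column rank almost surely
B = rng.standard_normal((n, k))
# Assumption: P is the orthogonal projector onto the column space of B.
P = B @ np.linalg.solve(B.T @ B, B.T)

M = A.T @ A
N = np.eye(n) - P
Y, d = simultaneous_diagonalize(M, N)

# Residuals of the three identities; all should be at rounding level.
print(np.abs(Y.T @ M @ Y - np.eye(n)).max())
print(np.abs(Y.T @ N @ Y - np.diag(d)).max())
print(np.abs(np.linalg.solve(M, N) @ Y - Y * d).max())
```

Since $I-P$ is a projector in this sketch, $k$ of the entries of `d` are zero up to rounding, so $D$ itself need not be invertible.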
Now, let's look at our optimization problem. Substituting $A^TA=Y^{-T}Y^{-1}$ and $I-P=Y^{-T}DY^{-1}$ gives
\begin{align}
tr(X^TA^TAX(X^T(I-P)X)^{-1})&=tr(X^TY^{-T}Y^{-1}X(X^TY^{-T}DY^{-1}X)^{-1}).
\end{align}
Let $Y^{-1}X=QR$, where $R$ is nonsingular and $Q$ has orthonormal columns, so that $Q^TQ=I$. Then
\begin{align}
tr(X^TA^TAX(X^T(I-P)X)^{-1})&=tr(R^TR(R^TQ^TDQR)^{-1})\\
&=tr(R^TRR^{-1}(Q^TDQ)^{-1}R^{-T})\\
&=tr((Q^TDQ)^{-1})\\
&\geq \lambda_1 + \ldots + \lambda_q
\end{align}
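where $\lambda_1,\ldots, \lambda_q$ are the reciprocals of the $q$ largest diagonal entries of $D$ (that is, the $q$ smallest positive eigenvalues of $D^{-1}$ when $D$ is invertible). The inequality follows from eigenvalue interlacing: the $i$-th largest eigenvalue of $Q^TDQ$ is at most the $i$-th largest diagonal entry of $D$. The minimum is attained by taking the columns of $Q$ to be the standard unit vectors that select those $q$ entries.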
Hence, we have $X=YQR$.
Last but not least, note that $X$ and $XS$ attain the same value of $tr(X^TA^TAX(X^T(I-P)X)^{-1})$ whenever $S$ is nonsingular.
Hence we can drop $R$ and pick $X=YQ$, whose columns are the eigenvectors of $(A^TA)^{-1}(I-P)$ associated with its $q$ largest eigenvalues.
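The conclusion can be checked numerically. The sketch below is illustrative only: the random $A$, the projector $P$, the sizes, and the helper name `objective` are assumptions, not part of the argument. It compares the objective at $X=YQ$ with the sum of reciprocals of the $q$ largest entries of $D$, with a rescaled $XS$, and with random trial matrices.

```python
import numpy as np

def objective(X, M, N):
    # tr(X^T M X (X^T N X)^{-1}) with M = A^T A and N = I - P
    return np.trace(X.T @ M @ X @ np.linalg.inv(X.T @ N @ X))

rng = np.random.default_rng(1)
n, m, k, q = 6, 9, 2, 3
A = rng.standard_normal((m, n))      # full column rank almost surely
B = rng.standard_normal((n, k))
# Assumption: P is the orthogonal projector onto the column space of B.
P = B @ np.linalg.solve(B.T @ B, B.T)
M = A.T @ A
N = np.eye(n) - P

# Simultaneous diagonalization via Cholesky + symmetric eigendecomposition.
L = np.linalg.cholesky(M)
Linv = np.linalg.inv(L)
d, V = np.linalg.eigh(Linv @ N @ Linv.T)
Y = Linv.T @ V                       # Y^T M Y = I, Y^T N Y = diag(d)

top = np.argsort(d)[-q:]             # indices of the q largest entries of D
X_opt = Y[:, top]                    # X = Y Q with Q made of unit vectors
best = objective(X_opt, M, N)
bound = np.sum(1.0 / d[top])         # lambda_1 + ... + lambda_q

# Invariance under X -> X S for a nonsingular (here well-conditioned) S.
S = np.eye(q) + 0.1 * rng.standard_normal((q, q))
rescaled = objective(X_opt @ S, M, N)

# Random full-rank candidates should never beat the claimed minimum.
random_vals = [objective(rng.standard_normal((n, q)), M, N) for _ in range(200)]

print(best, bound, rescaled, min(random_vals))
```

The first three printed numbers should agree up to rounding, and the last should be no smaller than the first.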