The method of Lagrange multipliers yields the solution.
One studies the (real-valued) function $F(X)=\text{trace}(AX)$ under the (matrix) constraint that $C(X)=I$, where $C(X)=X^TX$. In coordinates,
$$
F(X)=\sum\limits_{ij}A_{ji}X_{ij},\quad
C_{ij}(X)=\sum\limits_{k}X_{ki}X_{kj}.
$$
The gradients of $F$ and $C$, writing $\partial_{ij}=\partial/\partial X_{ij}$ and $[P]=1$ if the condition $P$ holds and $[P]=0$ otherwise, are given by $\partial_{ij}F(X)=A_{ji}$ and
$$
\partial_{ij}C_{k\ell}(X)=X_{i\ell}[k=j]+X_{ik}[\ell=j].
$$
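(A quick numerical sanity check of these two formulas, not part of the argument: the snippet below, which assumes numpy, compares them with centered finite differences at arbitrary indices.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
h = 1e-6

F = lambda X: np.trace(A @ X)        # F(X) = trace(AX)
C = lambda X: X.T @ X                # C(X) = X^T X

i, j, k, l = 1, 2, 0, 2              # arbitrary test indices
E = np.zeros((n, n)); E[i, j] = 1.0  # unit perturbation of the (i,j) entry

# dF/dX_ij versus A_ji
print((F(X + h*E) - F(X - h*E)) / (2*h), A[j, i])

# dC_kl/dX_ij versus X_il [k=j] + X_ik [l=j]
print((C(X + h*E) - C(X - h*E))[k, l] / (2*h),
      X[i, l]*(k == j) + X[i, k]*(l == j))
```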
One wants there to exist multipliers $\lambda_{ij}$ such that, for every $i$ and $j$,
$$
\partial_{ij}F(X)=\sum\limits_{k\ell}\lambda_{k\ell}\partial_{ij}C_{k\ell}(X).
$$
This condition reads
$$
A_{ji}=\sum\limits_{k,\ell}\lambda_{k\ell}\left(X_{i\ell}[k=j]+X_{ik}[\ell=j]\right)=\sum\limits_{\ell}\lambda_{j\ell}X_{i\ell}+\sum\limits_{k}\lambda_{kj}X_{ik},
$$
or, equivalently, introducing the matrix $\Lambda=(\lambda_{ij})$,
$$
A^T=X\Lambda^T+X\Lambda.
$$
The matrix $M=\Lambda^T+\Lambda$ satisfies $M^T=M$ and $XM=A^T$. Since $X^TX=I$, one gets $M^2=M^TM=(XM)^T(XM)=AA^T$, hence $M$ must be a symmetric square root of $AA^T$.
When $A$ is invertible, $AA^T$ is symmetric positive definite and one can use the usual definition of the square root of such a matrix to find $M$. The maximizing and minimizing matrices $X$ and the maximum and minimum values of $F$ are
$$
X_0=A^TM^{-1}=\pm A^T(AA^T)^{-1/2},\quad
F(X_0)=\pm\text{trace}((AA^T)^{1/2}).
$$
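(As a sanity check of the invertible case, the following sketch, assuming numpy and scipy.linalg.sqrtm, verifies that $X_0$ is orthogonal, that $F(X_0)=\text{trace}((AA^T)^{1/2})$, and that $F(X_0)$ dominates $F$ at a random orthogonal matrix.)

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))            # generically invertible

M = sqrtm(A @ A.T).real                    # M = (AA^T)^{1/2}
X0 = A.T @ np.linalg.inv(M)                # X0 = A^T M^{-1}

print(np.allclose(X0.T @ X0, np.eye(n)))   # X0 is orthogonal: True
print(np.trace(A @ X0), np.trace(M))       # F(X0) = trace((AA^T)^{1/2})

# F(X0) should dominate F(Q) for every orthogonal Q; test a random one
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
print(np.trace(A @ Q) <= np.trace(A @ X0))
```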
When $A$ is not invertible, the formula $X_0=A^TM^{-1}$, whichever meaning one gives to the notation $M^{-1}$, cannot yield an orthogonal matrix $X_0$: if $A$ is not invertible, then neither is $A^T$, nor is $A^T$ times any matrix. On the other hand, the formula $F(X_0)=\pm\text{trace}((AA^T)^{1/2})$ might still be valid.
For example, take
$$A=\pmatrix{0&1\\0&0}\;,$$
and it seems that in that case I can't get $X_0$ in the way you've written it, even using the pseudoinverse, because the resulting matrix isn't orthogonal?
– joriki Oct 30 '12 at 17:59

$$AA^T=\pmatrix{1&0\\0&0}\;,$$
so your prescription gives
$$(AA^T)^{-1/2}=\pmatrix{1&0\\0&0}\;,$$
so we get
$$X_0=\pm\pmatrix{0&0\\1&0}\;,$$
which isn't orthogonal.
– joriki Oct 30 '12 at 18:46
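(To make both points concrete, here is a numerical sketch, assuming numpy and scipy: the pseudo-inverse recipe indeed fails to produce an orthogonal matrix, yet the value $\text{trace}((AA^T)^{1/2})=1$ is still the maximum of $F$ over the $2\times2$ orthogonal matrices, which are exhausted by rotations and reflections.)

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])                    # the singular example above

S = sqrtm(A @ A.T).real                       # (AA^T)^{1/2} = diag(1, 0)
X0 = A.T @ np.linalg.pinv(S)                  # pseudo-inverse version of A^T M^{-1}
print(X0, np.allclose(X0.T @ X0, np.eye(2)))  # [[0,0],[1,0]], not orthogonal

# Maximize F over O(2) by sweeping rotations and reflections
best = -np.inf
for t in np.linspace(0.0, 2*np.pi, 100001):
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t),  np.cos(t)]])  # rotation
    H = np.array([[np.cos(t),  np.sin(t)], [np.sin(t), -np.cos(t)]])  # reflection
    best = max(best, np.trace(A @ R), np.trace(A @ H))

print(best, np.trace(S))                      # both approximately 1
```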