EDIT 1. We assume that $\lambda=1$ (change $B$ with $\lambda B$) and $A$ is only symmetric (the non-negativity has nothing to do here). $M_{n,k}$ denotes the real $n\times k$ matrices.
Let $f:X\in M_{n,k}\rightarrow tr(X^TAX)+ tr(X^TB)$. Since $Z=\{X;X^TX=I_k\}$ is compact, then the minimum of $f$ is reached in a point $X$ s.t. its derivative $Df_X(H)$ is $0$ for every $H$ in the tangent space $T_XZ$ of $Z$.
$\textbf{Proposition}.$ The $X$ that realize the minimum of $f$ are among the finite (generically) set of the solutions of the system (S):
$X^TX=I_k,X^TB=B^TX,(I_n-XX^T)(2AX+B)=0$.
$\textbf{Proof}$.
$Df_X:H\in T_XZ\rightarrow tr(2H^TAX+ H^TB)$.
Now $H\in T_XZ$ iff $H^TX=K$, a skew-symmetric matrix. Since $rank(X)=k$, $X^+=X^T$ and $H^T=KX^T+U(I_n-XX^T)$ where $U\in M_{k,n}$ is arbitrary.
Finally, the condition is: for every skew-symmetric $K$ and for every $U\in M_{k,n}$:
$tr((KX^T+U(I-XX^T))(2AX+ B))=0$. It suffices to consider the following 2 cases.
Case 1. $U=0$. then $X^T(2AX+ B)$ is symmetric, that is $X^TB$ is symmetric (generically, we obtain $k(k-1)/2$ relations).
Case 2. $K=0$. Then $(I-XX^T)(2AX+ B)=0$. $\square$
Of course, there is no closed-form solution.
EDIT 2. ** In order to gain 70% in computing time, we can diagonalize $A$.
Inded $A=PDP^T$ where $P\in O(n)$ and $D$ is diagonal. We put $Y=P^TX$; note that $Y^TY=X^TPP^TX=I_k$. On the other hand, $f(X)=g(Y)=tr(Y^TDY+Y^TP^TB)$.
Thus we may assume that $A$ is diagonal (change $B$ with $P^TB$).
** An example. When $k=3,n=5$, the number of complex solutions $X$ of the system (S) is (in the generic case) $248$, that is (S) admits, on average, $\approx 16$ real solutions that remain to be tested.
EDIT 3. Answer to @Royi . I reduce the problem to solving the matricial algebraic system (S). I tested the method using the Grobner basis theory (under Maple); that works until $n=6$. For larger $n$, you (it's your business) must use numerical methods in order to obtain -at least locally- the solutions. About that, there is a problem:
We fix a large $n$. If $k=1$, then $(S)$ admits (generically) $2n$ complex solutions that is, in average, $\sqrt{2n}$ real solutions. Then, with a good software, we can find and test all these solutions.
Unfortunately, when $k$ increases , the number of solutions of $(S)$ increases dramatically until $k=n-1$; indeed, when $k=n-1$, the number of complex solutions is $2^{2n-2}$, that is, in average, $2^{n-1}$ real solutions, what is associated to an exponential complexity. Therefore, in a first step, we must localize a region (not too large) where the function $f$ reaches its minimum.