For positive integers, the nontrivial case is when $x \neq y$ and both are at least $2$.
Then one is asking which is the more efficient way to partition $xy$ into several (equal) numbers for the purposes of getting a large product: multiply $x$ copies of $y$, or $y$ copies of $x$. It turns out that having a larger exponent is more important. If $2 \leq x < y$, then one expects $x^y$ to exceed $y^x$, but this neat and simple pattern is rudely disrupted by the direct calculations $2^4 = 4^2$ and $2^3 < 3^2$.
As in the real-valued problem usually solved with calculus, the key is the function $F(t) = t^{1/t}$. The inequalities $x^y > y^x$ and $F(x) > F(y)$ are equivalent (raise both sides to the power $1/(xy)$), reducing the problem to a comparison of values of $F$ at different points. For positive integers larger than $1$, the maximum value is $F(3)$, and $F$ is decreasing starting at $3$, but $F(2)=F(4)$. This implies that the exceptions observed above are the only ones to the pattern that it is more efficient to increase the exponent than the base.
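As a sanity check on the claim that $(2,3)$ and $(2,4)$ are the only exceptions, here is a minimal brute-force sketch in Python. The function name `exceptions` and the search bound `60` are illustrative choices, not part of the argument, and a finite search is of course no substitute for the proof.

```python
def exceptions(limit):
    """Return all pairs (x, y) with 2 <= x < y <= limit where x**y <= y**x."""
    found = []
    for x in range(2, limit + 1):
        for y in range(x + 1, limit + 1):
            # Python integers are exact, so there is no rounding error here.
            if x**y <= y**x:
                found.append((x, y))
    return found


print(exceptions(60))  # prints [(2, 3), (2, 4)]
```

Comparing the integer powers directly, rather than floating-point values of $t^{1/t}$, avoids rounding trouble with the tie $F(2) = F(4)$.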
For an induction proof without calculus, it looks like the thing to prove is $F(n) > F(n+1)$ for $n \geq 3$, which is the same as $n^{n+1} > (n+1)^n$, or $n > (1 + \frac{1}{n})^n$. It is known that $(1+\frac{1}{n})^n$ increases toward $e$, which is less than $3$, and that settles the problem. Without relying on that, one can bound $n-1$ of the factors in $(1+1/n)^n$ from above by the terms $(n + 1-i)/(n-i)$, for $i=0$ to $n-2$, each of which is at least $1 + 1/n$ and whose product telescopes. This leaves an inequality similar to $n > n/2$ that can be checked easily and implies the one on $F(n)$.
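Spelled out, the telescoping step can be sketched as follows (a worked version of the outline above, using only the quantities already named). Since $1 + \frac{1}{n} \leq 1 + \frac{1}{n-i} = \frac{n+1-i}{n-i}$ for $0 \leq i \leq n-2$,

$$
\left(1+\frac{1}{n}\right)^n \;\leq\; \left(1+\frac{1}{n}\right)\prod_{i=0}^{n-2}\frac{n+1-i}{n-i} \;=\; \frac{n+1}{n}\cdot\frac{n+1}{2} \;=\; \frac{(n+1)^2}{2n}.
$$

So it suffices to have $n > \frac{(n+1)^2}{2n}$, which rearranges to $(n-1)^2 > 2$ and holds for every $n \geq 3$.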