The computational basis is "natural" in the sense that it provides a practical representation of measurement outcomes. Other bases are also "natural" for other tasks, and bases cannot be interchanged aribtrarily if such a change impacts the underlying tensor structure of the system.
For example the Bell states form a basis in four dimensional Hilbert space representing two maximally entangled qubits. Bell states are not bilinear combinations of computational basis states, so we cannot simply choose to represent separable states over the Bell basis, or entangled states over the computational basis.
What seems to be missing from your assumptions is a recognition that the tensor product structure over the computational basis is of fundamental importance because it comports with measurement outcomes - our only connection to the quantum realm. In other words, the computational basis for an $n$-qubit system, $\vert \psi_0 \rangle$, is a tensor product space of the form
$$\vert \psi_0 \rangle = \vert b_0 \rangle \otimes \vert b_1 \rangle \otimes ... \otimes \vert b_{n-1} \rangle \in (\mathbb{C}^2)^{\otimes n},$$
where $\vert b_i \rangle \in \mathbb{C}^2$.
A transformation that preserves the tensor structure of $\vert \psi_0 \rangle$, i.e. does not induce entanglement, takes the form
$$U = U_0 \otimes U_1 \otimes , ... , \otimes U_{n-1} \in U(2)^{\otimes n} \subset U(2^n).$$
Hamiltonians of this form are regularly employed in both theory and practice, where any given $U_i$ can act trivially or non-trivially.
However, generally in practice only a discrete set of single qubit transformations are available. By the Solovay-Kitaev theorem $U_i$ can be approximated to precision $\epsilon$ using $\Theta(\log^c(\tfrac{1}{\epsilon}))$ gates from a fixed discrete set, where $c$ is a constant between 1 and 2 (as far as I know its exact value is still an open problem).
When multilevel gates are applied to the system to induce entanglement, we still need to be able to track the relationship back to the tensor structure of $\vert \psi_0 \rangle$. For sufficiently large $n$, applying an arbitrary $SU(2^n)$ transformation to $\vert \psi_0 \rangle$ amounts to complete decoherence of the system because any meaningful description of the wavefunction is lost (i.e., breaking $SU(2^n)$ down into a Tensor Network representation, discussed below, becomes intractable.)
Such transformations need to be applied in very rigorous and non-trivial ways. As Carlton Caves said "Hilbert space is a big place," and it's easy to get lost. Notice, for example, that the space of linear transformations spanned by $U(2^n)$ is exponentially larger than that of $U(2)^{\otimes n}$, e.g. $U(2^n)$ has real dimension $2^{2n}$, whereas $U(2)^{\otimes n}$ has real dimension $4n$.
In practice, low energy systems can only access a relatively small portion of the full Hilbert space because physically realistic states are heavily constrained by locality. Tensor networks are a good example of a class of approaches to managing this challenge by limiting the description of a system to physically realistic regions of Hilbert space (see, e.g. section 3.3 here).