This post has some answers that give intuition for the definition of the transpose. My rudimentary (perhaps inaccurate) understanding is that for a linear transformation $T: V \to W$, we're interested (why?) in a way to represent functionals on the transformed points, $f \in W^*$, as functionals on the original points, $T^\top(f) \in V^*$. Friedberg, Insel, and Spence write:
For a matrix of the form $A = [T]_{\beta\to\gamma}$, the question arises as to whether or not there exists a linear transformation $U$ associated with $T$ in some natural way such that $U$ may be represented in some basis as $A^\top$.
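To make my current understanding concrete, here is the definition as I read it. This is a sketch under my own assumptions: $V$ and $W$ are finite-dimensional, $\beta^*$ and $\gamma^*$ denote the dual bases of $\beta$ and $\gamma$, and the coordinate notation $[\,\cdot\,]_\beta$, $[\,\cdot\,]_{\gamma^*}$ is mine and may not match the text's exactly.

$$T^\top : W^* \to V^*, \qquad T^\top(f) = f \circ T, \qquad \text{i.e.} \quad \big(T^\top(f)\big)(v) = f\big(T(v)\big) \ \text{ for all } v \in V.$$

In coordinates, with $A = [T]_{\beta\to\gamma}$,

$$f\big(T(v)\big) = [f]_{\gamma^*}^\top \, A \, [v]_\beta = \big(A^\top [f]_{\gamma^*}\big)^\top [v]_\beta, \qquad \text{so} \qquad [T^\top]_{\gamma^*\to\beta^*} = A^\top.$$

So I can see that $A^\top$ is what falls out once $T^\top$ is defined this way; what I don't see is why one would go looking for it in the first place.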
From where does "this question arise"? The existence of a transformation $T: V \to W$ does not obviously imply the existence of a "dual" transformation from $W^*$ to $V^*$. Why do we want to "[represent] $U$ in some basis as $A^\top$" at all, and why do we care whether such a transformation exists? I would prefer an elementary explanation that doesn't invoke adjointness as a motivator, since adjoints are not covered until later in this text. This question of "why do we care about this" has been confusing me for a few days now. Thank you!