Often I feel that to truly understand something you must be able to explain it, or teach it, which is the reason I have put thought into devising an answer to my own question. I have considered all the answers (and will select one as best soon), and have come up with my own explanation of where the matrix comes from, how we get it, and how it acts. This explanation follows.
EDIT: NOTE: I realize after reading this that in this answer I have defined the function $P_B$ slightly differently and the result is that $[T]_B^{B'}$ appears slightly different aesthetically than how it did in the original question.
EDIT: I have undeleted this answer in case it is helpful to someone else, but I will accept one of the other answers. Thank you all for your help.
Let $T: V\rightarrow W$ be a linear transformation, and suppose that
$V$ has basis $B = \{v_1, ..., v_n\}$
$W$ has basis $B' = \{w_1, ..., w_m\}$
We have an isomorphism (I):
$$P_B : V\rightarrow F^n$$
$$v = \sum\limits_{i=1}^na_iv_i\mapsto (a_1, a_2, ..., a_n)$$
Proof. The map is well defined because $B$ is a basis, so every $v\in V$ can be written as $\sum\limits_{i=1}^na_iv_i$ in exactly one way. First we show that $P_B$ is a homomorphism. Let $v = \sum\limits_{i=1}^{n}a_iv_i$ and $w = \sum\limits_{i=1}^nb_iv_i$. Then $v + w = \sum\limits_{i=1}^n(a_i + b_i)v_i$, so $P_B(v + w) = (a_1 + b_1, ..., a_n + b_n)$, and $P_B(v) + P_B(w) = (a_1, a_2, ..., a_n) + (b_1, b_2, ..., b_n) = (a_1 + b_1, ..., a_n + b_n)$. Thus $P_B(v + w) = P_B(v) + P_B(w)$. Likewise, for $c\in F$ we have $cv = \sum\limits_{i=1}^nca_iv_i$, so $P_B(cv) = (ca_1, ..., ca_n) = c(a_1, ..., a_n) = cP_B(v)$.

Now that we know $P_B$ is a homomorphism, we must show that it is a bijection. First we show that it is one-to-one. Suppose $P_B(w) = P_B(v)$, where $v$ and $w$ are as above. Then $P_B(v) - P_B(w) = P_B(v-w) = 0$. Thus $(a_1 - b_1, ..., a_n - b_n) = (0, 0, ..., 0)$, which implies that $a_1 = b_1, ..., a_n = b_n$, and thus $v = w$. Next we show that the map is onto. Let $x = (r_1, ..., r_n)\in F^{n}$. Since $V$ is closed under linear combinations, $\sum\limits_{i=1}^nr_iv_i$ is a vector in $V$, and $P_B\left(\sum\limits_{i=1}^nr_iv_i\right) = (r_1, ..., r_n) = x$. Thus the map is onto. $\Box$
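As a small illustration of the coordinate map (the space and basis here are my own choice for the example, not something taken from the question), let $V$ be the polynomials of degree at most $2$ over $\mathbb{R}$ with basis $B = \{1, 1 + x, 1 + x + x^2\}$. To find $P_B(2 + 3x + x^2)$ we solve for the coefficients:
$$2 + 3x + x^2 = a_1\cdot 1 + a_2(1 + x) + a_3(1 + x + x^2) = (a_1 + a_2 + a_3) + (a_2 + a_3)x + a_3x^2$$
Comparing coefficients gives $a_3 = 1$, $a_2 = 2$, $a_1 = -1$, so
$$P_B(2 + 3x + x^2) = (-1, 2, 1)$$
The same polynomial would have coordinates $(2, 3, 1)$ with respect to the basis $\{1, x, x^2\}$, which shows how much $P_B$ depends on the choice of $B$.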
Similarly we have the isomorphism (II):
$$P_{B'} : W\rightarrow F^m$$
$$w = \sum\limits_{i=1}^mb_iw_i\mapsto (b_1, b_2, ..., b_m) $$
Thus, it seems that once we choose bases, $B$ and $B'$, every vector $v\in V$ has a unique representation in $F^n$ and every vector $w\in W$ has a unique representation in $F^m$.
So we wonder, how can we represent $T: V\rightarrow W$ in terms of a map from $F^n \rightarrow F^m$?
Said another way, can we find a mapping which will satisfy the following diagram:
$$F^n \rightarrow V \xrightarrow{T} W\rightarrow F^m?$$
Using isomorphisms (I) and (II) we can piece together a mapping from $F^n\rightarrow F^m$ which is equivalent to $T: V\rightarrow W$:
$$P_B^{-1} : F^n\rightarrow V$$
$$T: V\rightarrow W$$
$$P_{B'} : W \rightarrow F^m$$
Hence, we have found the equivalent transformation from $F^n\rightarrow F^m$ that we were looking for:
$$P_{B'}\circ T\circ P_B^{-1} : F^n \rightarrow F^m$$
Finally we have the isomorphism (III):
$$\{\text{Linear Transformations } F^n\rightarrow F^m\}\rightarrow M_{m\times n}(F)$$
$$T\mapsto [T] =\left(\begin{smallmatrix} T\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & T\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... & T\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$$
Proof. First we show that this is a homomorphism.
$$[c_1f_1 + c_2f_2] $$
$$= \left(\begin{smallmatrix} c_1f_1\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) + c_2f_2\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & c_1f_1\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) + c_2f_2\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right)& ... &c_1f_1\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right) + c_2f_2\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$$
$$= \left(\begin{smallmatrix} c_1f_1\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & c_1f_1\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... & c_1f_1\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right) + \left(\begin{smallmatrix} c_2f_2\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & c_2f_2\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... &c_2f_2\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$$
$$= c_1\left(\begin{smallmatrix} f_1\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & f_1\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... &f_1\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right) + c_2\left(\begin{smallmatrix} f_2\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & f_2\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... &f_2\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$$
$$= c_1[f_1] + c_2[f_2]$$
Now we show that this is onto. Consider an arbitrary matrix $A\in M_{m\times n}(F)$. Since a linear transformation is completely determined by its action on the basis vectors, we can define the linear transformation that sends the $j$-th standard basis vector to the $j$-th column of $A$. Its matrix is $A$, and thus the function is onto. Next we show that the map is one-to-one. Suppose $[f] = [g]$. Then $\left(\begin{smallmatrix} f\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & f\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... &f\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right) = \left(\begin{smallmatrix} g\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & g\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... &g\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$, so $f$ and $g$ agree on every standard basis vector. Since, again, a linear transformation is completely determined by how it acts on the basis vectors, $f = g$. $\Box$
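To make isomorphism (III) concrete, here is a minimal Python sketch of the rule "the $j$-th column of $[f]$ is $f$ applied to the $j$-th standard basis vector". The helper names `standard_basis`, `matrix_of`, `apply`, and the sample map `f` are all my own illustrative choices; the field is taken to be $\mathbb{Q}$ via `fractions.Fraction`.

```python
from fractions import Fraction


def standard_basis(n):
    """Return e_1, ..., e_n of F^n as lists of Fractions."""
    return [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]


def matrix_of(f, n):
    """Matrix [f] of a linear map f: F^n -> F^m, stored as a list of rows.

    Column j is f(e_j), exactly as in the definition of (III).
    """
    columns = [f(e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(matrix, x):
    """Ordinary matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


# Hypothetical example map f: Q^3 -> Q^2, f(x, y, z) = (x + 2y, 3z - y).
def f(v):
    x, y, z = v
    return [x + 2 * y, 3 * z - y]


A = matrix_of(f, 3)
v = [Fraction(1), Fraction(2), Fraction(3)]

# Multiplying by [f] reproduces f itself.
print(apply(A, v) == f(v))
```

The final comparison is the content of the one-to-one part of the proof seen from the other side: once the columns $f(e_j)$ are known, the matrix product recovers $f$ on every vector.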
Therefore, we know that we can write a linear transformation $F^n \rightarrow F^m$ as an $m\times n$ matrix.
This tells us that we can now write our transformation $T:V\rightarrow W$ as an $m\times n$ matrix. Or, more precisely, we can write $P_{B'}\circ T\circ P_B^{-1} : F^n \rightarrow F^m$ equivalently as:
$[T]_{B}^{B'} = [P_{B'}\circ T\circ P_B^{-1}] = \left(\begin{smallmatrix} P_{B'}\circ T\circ P_B^{-1}\left(\begin{smallmatrix} 1\\ 0 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & P_{B'}\circ T\circ P_B^{-1}\left(\begin{smallmatrix} 0\\ 1 \\ . \\ . \\ . \\ 0\end{smallmatrix}\right) & ... & P_{B'}\circ T\circ P_B^{-1}\left(\begin{smallmatrix} 0\\ 0 \\ . \\ . \\ . \\ 1\end{smallmatrix}\right)\end{smallmatrix}\right)$
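The column-by-column recipe above can be sketched in Python for one specific case. Everything in this example is an assumption chosen for illustration: $T$ is differentiation from polynomials of degree at most $2$ to polynomials of degree at most $1$, with $B = \{1, 1 + x, 1 + x + x^2\}$ and $B' = \{1, 1 + x\}$. Polynomials are stored as coefficient lists `[c0, c1, ...]`, and the function names are hypothetical.

```python
def p_b_inverse(a):
    """P_B^{-1}: coordinates w.r.t. B = {1, 1+x, 1+x+x^2} -> polynomial [c0, c1, c2]."""
    a1, a2, a3 = a
    return [a1 + a2 + a3, a2 + a3, a3]


def T(p):
    """Differentiation: c0 + c1 x + c2 x^2 -> c1 + 2 c2 x."""
    c0, c1, c2 = p
    return [c1, 2 * c2]


def p_b_prime(q):
    """P_{B'}: polynomial c0 + c1 x -> coordinates w.r.t. B' = {1, 1+x}."""
    c0, c1 = q
    # c0 + c1 x = (c0 - c1) * 1 + c1 * (1 + x)
    return [c0 - c1, c1]


def composite(a):
    """The map P_{B'} o T o P_B^{-1} : F^3 -> F^2."""
    return p_b_prime(T(p_b_inverse(a)))


def matrix_of(f, n):
    """Matrix whose j-th column is f(e_j), stored as a list of rows."""
    columns = [f([int(i == j) for i in range(n)]) for j in range(n)]
    return [[col[i] for col in columns] for i in range(len(columns[0]))]


M = matrix_of(composite, 3)
for row in M:
    print(row)
```

As a check by hand: $2 + 3x + x^2$ has $B$-coordinates $(-1, 2, 1)$, and multiplying the resulting matrix by that column gives $(1, 2)$, which are the $B'$-coordinates of the derivative $3 + 2x = 1\cdot 1 + 2(1 + x)$.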
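The assignment $T\mapsto P_{B'}\circ T\circ P_B^{-1}$ is itself a linear bijection from the linear transformations $V\rightarrow W$ onto the linear transformations $F^n\rightarrow F^m$, with inverse $S\mapsto P_{B'}^{-1}\circ S\circ P_B$. Composing it with (III), once we choose bases we obtain an isomorphism (IV):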
$$H: \operatorname{Hom}(V,W) = \{\text{Linear transformations } V\rightarrow W\}\rightarrow M_{m\times n}(F)$$
$$T\mapsto [T]_{B}^{B'}$$
And hence, after a choice of bases, we can write any linear transformation $V\rightarrow W$ as a matrix.