There is not much to add to the article by Gordon, Jarvis, and Shaw that Alexander Gruber linked to, it explains things well and in much detail. I would just like to show how you can get the list of conjugacy class sizes for a large part quite easily, getting much mileage out of a few general ideas. I'll give references in brackets to relevant results in that paper. One point I won't go into at all is the isomorphism $GL(4,\Bbb F_2)\cong A_8$ that Derek Holt mentioned, and which provides an entirely independent approach.
The main point will be classifying conjugacy classes of matrices $A\in GL(4,\Bbb F_2)$ first by the characteristic polynomial $\chi_A$, and if necessary by their minimal polynomial $\mu_A$ and other invariant factors. I will order the invariant factors opposite to what is usual (and what the paper does), namely with each next polynomial a divisor (rather than multiple) of the previous one (this has the advantage that the multiplicity of each irreducible factor is weakly decreasing along the list, and forms a partition of its multiplicity in $\chi_A$). Thus the first invariant factor is $\mu_A$, and the product of all invariant factors is $\chi_A$ [theorem 1.1]. Quite often (for instance when $\chi_A$ is square-free) there is just one invariant factor, and the minimal and characteristic polynomials coincide; I'll call this the regular case (as in regular element of a Lie group), although the paper calls uses the term "cyclic" [theorem 1.3(iii)] which is quite justified.
Since $\chi_A$ for $A\in GL(4,\Bbb F_2)$ can be any monic polynomial of degree $4$ over $\Bbb F_2$ with non-zero constant term, it is useful to list the irreducible polynomials up to that degree, excluding $X$. Then there remains one irreducible polynomial each of degrees $1,2$, which I shall call $P_1=X+1$ and $P_2=X^2+X+1$, two irreducible polynomials of degree $3$ namely $P_{3,1}=X^3+X+1$ and $P_{3,2}=X^3+X^2+1$, and three irreducible polynomials of degree $4$ namely $P_{4,1}=X^4+X+1$, $P_{4,2}=X^4+X^3+1$ and $P_{4,3}=X^4+X^3+X^2+X+1$ (in fact only the number of irreducibles in each degree really matters).
Computing the centraliser of any matrix $A$ will be the main task. Each matrix that commutes with $A$ must stabilise any kernel or image of any polynomial in $A$, and then of course also intersections and sums of those. In particular, the canonical decomposition of the space into a direct sum of spaces annihilated by mutually coprime factors of the minimal polynomial (each a power of an irreducible polynomial) [theorem 1.2] must be respected by every matrix commuting with $A$, and this leads to a product decomposition of the centraliser of $A$.
One very useful fact is that in the regular case, the only matrices that commute with $A$ are those in the algebra $F[A]$ of polynomials in $A$ [theorem 1.3(iii)]. This is because in this case there exists a "cyclic vector" $v$, namely one from which any vector can be obtained by applying a polynomial in $A$; then if $B$ commutes with $A$, one easily shows for $P\in F[A]$ with $P\cdot v=B\cdot v$ that $P=B$.
Another relatively easy case is when $A$ is semi-simple, i.e., diagonalisable over an extension field of $F$ (note that the only invertible matrices that are diagonalisable over $F=\Bbb F_2$ itself are the identity matrices!). Here the minimal polynomial is square-free, and the minimal polynomial of each component of the primary decomposition is an irreducible polynomial $P$; that component is a module over $F[X]/(P)$ which is a field, and the corresponding component of the centraliser is the appropriate general linear group over that field.
Now we can start our classification, after remarking that $\#GL(4,\Bbb F_2)=15\times14\times12\times7=20160$.
$\chi_A$ is irreducible (three cases $\chi_A=P_{4,i}$ for $i=1,2,3$); then we are both in the regular and the semi-simple case. We have $F[A]\cong\Bbb F_{16}$ and the centraliser of $A$ in $GL(4,\Bbb F_2)$ is $C(A)=F[A]\setminus\{0\}\cong GL(1,\Bbb F_{16})=\Bbb F_{16}^\times$ which has $15$ elements. Each conjugacy class has size $20160/15=1344$, for a total of $3\times1344=4032$.
$\chi_A=P_{3,i}P_1$ for $i=1,2$ (two cases). This is also a regular semi-simple case, with $F[A]\cong\Bbb F_8\times\Bbb F_2$ and $C(A)\cong\Bbb F_8^\times\times\Bbb F_2^\times$ which has $7\times1=7$ elements, and each conjugacy class has size $20160/7=2880$, for a total of $2\times2880=5760$.
$\chi_A=\mu_A=P_2^2$, a single, regular case. One has $F[A]\cong F[X]/(P_2^2)$, and of the $16$ elements of this ring all but the $4$ ones divisible by $P_2$ are invertible, so $\#C(A)=12$, the conjugacy class has size $20160/12=1680$.
$\chi_A=P_2^2$ and $\mu_A=P_2$, a single, semi-simple case. One has $C(A)\cong GL(2,\Bbb F_4)$ which has $15\times12=180$ elements; the conjugacy class has size $20160/180=112$ elements.
$\chi_A=\mu_A=P_2P_1^2$, a single, regular case. One has $F[A]\cong \Bbb F_4\times F[X]/P_1^2$, and $C(A)=F[A]^\times$ has $3\times(4-2)=6$ elements, so the conjugacy class has $20160/6=3360$ elements.
$\chi_A=P_2P_1^2$ and $\mu_A=P_2P_1$, a single, semi-simple case. One has $C(a)\cong \Bbb F_4^\times \times GL(2,\Bbb F_2)$ which has $3\times(3\times2)=18$ elements, and the conjugacy class has size $20160/18=1120$.
$\chi_A=\mu_A=P_1^4$, a single, regular case. One has $F[A]\cong F[X]/(P_1^4)$ and $C(A)\cong F[A]^\times$ has $16-8=8$ elements, so the conjugacy class has size $20160/8=2520$.
$\chi_A=P_1^4$ and $\mu_A=P_1^3$, with a second invariant factor $P_1$. This is a single case that is neither regular nor semi-simple, so we need to work a bit harder. Here $A$ is a unipotent matrix with Jordan blocks of size $3,1$; its centraliser is the same as that of
$$
A-I=\begin{pmatrix}0&1&0&0\\0&0&1&0&\\0&0&0&0\\0&0&0&0\end{pmatrix}
$$
which can be explicitly computed to have $1^2\times2^4=16$ elements. The conjugacy class has $20160/16=1260$ elements.
$\chi_A=P_1^4$ and $\mu_A=P_1^2$, with a second invariant factor $P_1^2$. Again a case that is neither regular nor semi-simple; $A$ is a unipotent matrix with Jordan blocks of size $2,2$. Its centraliser is the same as that of
$$
A-I=\begin{pmatrix}0&0&1&0\\0&0&0&1&\\0&0&0&0\\0&0&0&0\end{pmatrix}
$$
which can be explicitly computed to to be a semidirect product of the vector space $\Bbb F_2^4$ with $GL(2,\Bbb F_2)$ which has $2^4\times(3\times2)=96$ elements. The conjugacy class has $20160/96=210$ elements.
$\chi_A=P_1^4$ and $\mu_A=P_1^2$, with two more invariant factors $P_1$. Here $A$ is a unipotent matrix with Jordan blocks of size $2,1,1$; its centraliser is the same as that of
$$
A-I=\begin{pmatrix}0&0&0&1\\0&0&0&0&\\0&0&0&0\\0&0&0&0\end{pmatrix}
$$
which can be explicitly computed have $2^5\times(3\times2)=192$ elements. The conjugacy class has $20160/192=105$ elements. A sigh of relief, we needed at least one non-identity class with an odd number of elements!
$\chi_A=P_1^4$ and $\mu_A=P_1$. Another semi-simple case, indeed the identity matrix. A singleton class.
A final check: $3\times1334+2\times2880+1680+112+3360+1120+2520+1260+210+105+1=20160$, Oof!
I did not mention the order of the elements in each class, but these depend only on the minimal polynomial, as described in this answer.