If you'd like to work it out yourself, the outline in Reid's Undergraduate algebraic geometry, exercises 2.9–2.10 is good. I'll give a sketch following another book, Dolgachev's Lectures on invariant theory, specifically §10.3.
Let $C$ be an irreducible plane cubic curve, and suppose first that it is nonsingular. Let $P$ be an inflection point on $C$, which you can find by intersecting $C$ with its Hessian curve, and choose a system of coordinates such that $P = [0:0:1]$ and the tangent line at $P$ is given by $T_0 = 0$. The cubic then has equation
$$T_2^2T_0 + T_2L_2(T_0,T_1) + L_3(T_0,T_1) = 0,$$
where $L_2$ is a quadratic form and $L_3$ is a cubic form. Since the line $T_0 = 0$ meets $C$ only at $P$ (with multiplicity three, because $P$ is an inflection point), the $T_1^2$ coefficient in $L_2$ is zero, so in affine coordinates $x = T_1/T_0$ and $y = T_2/T_0$, the equation becomes
$$y^2 + axy + by + dx^3 + ex^2 + fx + g = 0.$$
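To spell out why the $T_1^2$ coefficient vanishes, here is a short check. The names $\gamma$ and $\delta$ are introduced only for this illustration: $\gamma$ is the $T_1^2$ coefficient of $L_2$ and $\delta$ is the $T_1^3$ coefficient of $L_3$. Restricting the cubic to the line $T_0 = 0$ kills every term divisible by $T_0$, leaving

$$\bigl(T_2^2T_0 + T_2L_2 + L_3\bigr)\big|_{T_0 = 0} = \gamma\,T_1^2T_2 + \delta\,T_1^3 = T_1^2\,(\gamma\,T_2 + \delta\,T_1).$$

For the tangent line to meet $C$ at $P = [0:0:1]$ with multiplicity three, this binary cubic must be a multiple of $T_1^3$, which forces $\gamma = 0$. In the affine equation above, $\delta$ is the coefficient called $d$.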
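Since $C$ is irreducible, $d \ne 0$ (otherwise $T_0$ would divide the equation), so after rescaling $x$ we can assume that $d=1$. Now completing the square, that is, taking $y + ax/2 + b/2$ as the new $y$-coordinate, we can assume $a = b = 0$; this changes $e$, $f$ and $g$. Similarly, taking $x + e/3$ as the new $x$-coordinate, we can assume that $e=0$. Renaming the two remaining coefficients $a$ and $b$, the equation is therefore of the form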
$$y^2 + x^3 + ax + b = 0.$$
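The two changes of variables can be written out explicitly. The following is a sketch, assuming $d = 1$ and that $2$ and $3$ are invertible (as over the complex numbers). The primed symbols and $Y$, $X$ are names chosen here for illustration. First complete the square in $y$ with $Y = y + (ax+b)/2$:

$$y^2 + (ax+b)\,y = Y^2 - \tfrac14(ax+b)^2, \qquad e' = e - \tfrac{a^2}{4},\quad f' = f - \tfrac{ab}{2},\quad g' = g - \tfrac{b^2}{4},$$

so the equation becomes $Y^2 + x^3 + e'x^2 + f'x + g' = 0$. Then set $X = x + e'/3$, that is, substitute $x = X - e'/3$:

$$x^3 + e'x^2 + f'x + g' = X^3 + \Bigl(f' - \tfrac{e'^2}{3}\Bigr)X + \Bigl(g' - \tfrac{e'f'}{3} + \tfrac{2e'^3}{27}\Bigr).$$

The two bracketed quantities play the role of the coefficients called $a$ and $b$ in the final form above.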
On the other hand, if $C$ is singular, and we choose $[0:0:1]$ to be the singular point, the cubic has equation
$$T_2L_2(T_0,T_1) + L_3(T_0,T_1) = 0.$$
Here $L_2 \ne 0$, since otherwise $C$ would be a union of lines. By a linear change of the coordinates $T_0, T_1$, we get that either $L_2 = T_0^2$ or $L_2 = T_0T_1$. An argument similar to the one in the nonsingular case shows that these two cases can be reduced to cuspidal cubics of the form
$$y^2 + x^3 = 0,$$
and nodal cubics of the form
$$y^2 + x^2(x+1) = 0,$$
respectively. See Dolgachev for details. These give the three cases you stated.
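As a sanity check on the two singular normal forms (a direct computation, not taken from Dolgachev), one can locate the singular points with partial derivatives and read off their type from the lowest-degree terms. For the cuspidal form $F = y^2 + x^3$:

$$F_x = 3x^2,\qquad F_y = 2y,\qquad F_x = F_y = 0 \iff (x,y) = (0,0),$$

and the lowest-degree part of $F$ at the origin is $y^2$, a double tangent line, so the origin is a cusp. For the nodal form $G = y^2 + x^2(x+1)$:

$$G_x = 2x + 3x^2,\qquad G_y = 2y,\qquad G\bigl(-\tfrac23, 0\bigr) = \tfrac{4}{27} \ne 0,$$

so of the two critical points $(0,0)$ and $(-\tfrac23, 0)$, only the origin lies on the curve. There the lowest-degree part is $y^2 + x^2 = (y + ix)(y - ix)$, two distinct tangent lines over the complex numbers, so the origin is a node.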
As for Newton's classification, here is a reference first: in Brieskorn/Knörrer's book Plane algebraic curves, in §2.5, starting at p. 87, you can find a leisurely explanation of how Newton went about classifying plane cubics. Apparently Newton found 72 different kinds of plane cubics, but overlooked six of them! You can even see some pages from Newton's work on pp. 93–98. The main differences between his classification and ours are that (1) Newton works over the reals, and (2) he does not work projectively. A good reason to work projectively and over the complex numbers is that classification problems like these become more manageable!