Basically you want to "add up small patches of surface" whereas your intuitive approach ends up just adding up lengths - it neglects the "local stretching of the curve".
I'll put a bunch of explanations up front because I think they might be helpful but maybe the very end is enough already.
How to get the volume formula
Let's go back to square one and consider the volume case: assume we have a positive function $f$ then we can parametrize the solid of revolution obtained from this $f$ as
$$
\psi(x, r, \phi) = \pmatrix{x \\ r f(x) \cos(\phi) \\ r f(x)\sin(\phi) }
$$
where $x$ ranges over some interval $(x_1, x_2)$, $r$ from $0$ to $1$ and $\phi$ over the angles you want which is also an interval; and for simplicity I'll use $(0, 2\pi)$ here.
We get the volume of our solid by integrating over it - but we somehow want to use this map $\psi$ to do this integration because it's a quite natural expression of this solid that is likely to simplify the calculation.
We can think about what's happening with this map $\psi$ like this: we have a rectangular prism in a space with axes $x, r, \phi$ and $\psi$ molds this prism into our solid of revolution by somehow shaping these weird axes into the regular cartesian ones. If we knew how much $\psi$ "stretched" space while doing so, then we could use this information to relate the volume of this cube to the volume of the solid of revolution in some way. Since the stretching could really differ from one point to the next we gotta consider the local stretching of $\psi$ at each point in $x,r,\phi$ space.
It turns out that this local stretching factor is precisely the (absolute value of the) jacobian determinant of $\psi$ (there's some great geometric explanations for this). Denoting our solid of revolution by $S$ and the prism by $C$ we have $S=\psi(C)$ and find the formula
$$
\int_S ~dS = \int_{\psi(C)} ~d(\psi(C)) = \int_C | \det J_\psi| ~dC = \int_{x_1}^{x_2} \int_0^1 \int_0^{2\pi} |\det J_\psi(x,r,\phi)| ~d\phi ~dr ~dx.
$$
If we actually calculate the jacobian we find that
$$
J_\psi(x, r, \phi) = \pmatrix{1 & 0 & 0 \\ rf'(x) \cos(\phi) & f(x) \cos(\phi) & -rf(x)\sin(\phi) \\ rf'(x) \sin(\phi)) & f(x) \sin(\phi) & rf(x)\cos(\phi)}
$$
and using a Laplace expansion the determinant of this is easily found to equal $rf(x)^2$. Plugging this into the integral above then yields the formula you already know:
$$
\int_S ~dS = \int_{x_1}^{x_2}\int_0^1 \int_0^{2\pi} rf(x)^2 ~d\phi ~ dr~dx = \int_{x_1}^{x_2}f(x)^2~dx \int_0^1 r ~dr \int_0^{2\pi} ~d\phi = \pi \int_{x_1}^{x_2}f(x)^2~dx
$$
How to get the surface area formula
Okay let's repeat this for the area: we again set up a parametrisation of the surface of revolution. This time we only have to parameters: we eliminate the parameter $r$ from the previous one by simply setting it to $1$. We have
$$
\psi(x, \phi) = \pmatrix{x \\f(x) \cos(\phi) \\f(x) \sin(\phi)}.
$$
We already know how to do these calculations now: we simply calculate the jacobian determinant and integrate it over $x$ and $\phi$... the problem is that $J_\psi$ won't be square and doesn't have a determinant in the usual sense: $\psi$ doesn't just stretch space to go from the one with axes $x, \phi$ to the usual cartesian space; it has to add a whole new dimension.
Phrased differently: it takes a 2-dimensional rectangle, stretches it in some way and places this stretched piece of surface into cartesian space. Using this phrasing we can work out the appropriate scaling factor using a kind of QR-decomposition we get from an SVD. Omitting the details we find that the appropriate factor is the so-called Gramian determinant $\sqrt{\det (J_\psi^T J_\psi)}$. You can informally think of this kinda like this: we go from 2D to 3D space using $J_\psi$ and apply the stretching factor once while doing so. Then we go from 3D back to 2D using $J_\psi^T$ which has the same stretching factor. We then take the square root to get the value that a single stretching has.
The jacobian in this case is
$$
J_\psi(x, \phi) = \pmatrix{1 & 0 \\f'(x) \cos(\phi) & -f(x)\sin(\phi) \\f'(x) \sin(\phi) & f(x) \cos(\phi)}
$$
and after a bit of a calculation we find that the gramian determinant of this is
$$
|f(x)|\sqrt{1 + f'(x)^2}
$$
which is probably the expression with the square root of the derivative you've already found. Since we assumed $f$ to be positive we can drop the absolute value from this expression. Plugging this into our integral we then find
$$
\int_S ~dS = \int_{x_1}^{x_2} \int_0^{2\pi} f(x) \sqrt{1+f'(x)^2} ~d\phi ~dx = \int_{x_1}^{x_2} 2\pi f(x) \sqrt{1+f'(x)^2} ~dx.
$$
Analyzing the area formula
The $2\pi f(x)$ in this integral is precisely the circumference of the circle that you already expected to play some part. The extra factor here is $\sqrt{1+f'(x)^2}$ - what's up with that? This term is exactly the "length element" of the graph of $f$ (you can find this using the same method of parametrising and integrating the gramian determinant; just that you'll only have one parameter this time).
This element tells us something about the arc length of the graph of $f$. So at any point we're really considering the circumference you found, multiplied by the length of a small line segment along the bounding curve. So we've intuitively chopped up the surface into a bunch of cylindrical shells and sum up their respective surface areas to get the whole thing.