You can solve this problem easily with the proximal gradient method, or with an accelerated variant such as FISTA. These are powerful tools that are easy to learn and use. The proximal gradient method solves optimization problems of the form
$$
\tag{$\spadesuit$}
\text{minimize} \quad f(x) + g(x)
$$
where $f$ and $g$ are convex, $f$ is differentiable (with a Lipschitz continuous gradient), and $g$ is "simple" in the sense that its proximal operator can be evaluated efficiently. The proximal operator of $g$ is defined as
$$
\text{prox}_{tg}(\hat x) = \operatorname*{arg\,min}_x \; g(x) + \frac{1}{2t} \| x - \hat x \|_2^2.
$$
The proximal operator of $g$ reduces the value of $g$ without straying too far from $\hat x$. The parameter $t > 0$ can be thought of as a "step size" that controls how far away from $\hat x$ we are allowed to move. Note that $g$ is not required to be differentiable. (But $g$ is required to be lower-semicontinuous, which is a mild condition that is usually satisfied in practice.)
The proximal gradient method iteration is
$$
x^{k+1} = \text{prox}_{tg}(x^k - t \nabla f(x^k)).
$$
Starting from the current iterate $x^k$, we take a gradient descent step to reduce the value of $f$, then apply the proximal operator of $g$ to reduce the value of $g$.
The step size $t > 0$ is required to satisfy $t \leq 1/L$, where $L$ is a Lipschitz constant for $\nabla f$. (There is also a line search version of the proximal gradient method that might converge faster. In the line search version the step size $t$ is chosen adaptively at each iteration.)
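As a sketch, the iteration takes only a few lines of Python. Here `grad_f` and `prox_g` are hypothetical callables supplied by the user (the prox takes the step size as a second argument); the function name and iteration count are my own choices, not part of any standard library.

```python
import numpy as np

def proximal_gradient(grad_f, prox_g, x0, t, n_iter=500):
    """Run the iteration x_{k+1} = prox_{tg}(x_k - t * grad_f(x_k)).

    grad_f : callable returning the gradient of f at a point
    prox_g : callable (v, t) -> prox_{tg}(v)
    t      : fixed step size, should satisfy t <= 1/L
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = prox_g(x - t * grad_f(x), t)  # gradient step on f, then prox step on g
    return x
```

For example, with $g = 0$ the prox is the identity and the method reduces to plain gradient descent.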
Your optimization problem has the form $(\spadesuit)$ where
$$
f(x) = \| Ax - b \|_2^2
$$
and
$$
g(x) = \lambda \|x\|_2.
$$
The gradient of $f$ can be evaluated using calculus:
$$
\nabla f(x) = 2 A^T(Ax - b).
$$
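This formula is easy to sanity-check against finite differences; here is a small NumPy sketch (the dimensions and random data are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

def f(x):
    return np.sum((A @ x - b) ** 2)  # f(x) = ||Ax - b||_2^2

grad = 2 * A.T @ (A @ x - b)         # the formula above

# central finite differences agree with the analytic gradient
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(3)])
```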
There is a nice formula for the proximal operator of the $\ell_2$-norm:
$$
\text{prox}_{tg}(x) = \text{prox}_{t \lambda \| \cdot \|_2}(x) = x - P_B(x),
$$
where $B$ is the $2$-norm ball of radius $t \lambda$ centered at the origin and $P_B(x)$ is the projection of $x$ onto $B$. (A formula for the proximal operator of a norm can be found for example in Vandenberghe's UCLA 236c notes. See chapter 8 "The proximal mapping", slide 8-3.)
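Since projecting onto the $2$-norm ball just rescales vectors outside it, the formula $x - P_B(x)$ reduces to a simple shrinkage. A sketch (the function name is my own):

```python
import numpy as np

def prox_l2_norm(x, t, lam):
    """prox of g(x) = lam * ||x||_2 with step t, i.e. x - P_B(x)
    where B is the 2-norm ball of radius t*lam centered at the origin."""
    nx = np.linalg.norm(x)
    if nx <= t * lam:
        return np.zeros_like(x)      # x lies inside B, so P_B(x) = x
    return (1 - t * lam / nx) * x    # shrink x toward the origin by t*lam
```

Note that the prox maps any $x$ with $\|x\|_2 \leq t\lambda$ exactly to zero, which is what makes the $\ell_2$-norm penalty promote solutions $x = 0$.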
With these formulas, you can implement the proximal gradient method for your problem in just a few lines of code. The accelerated variant, FISTA, is equally simple to implement and typically converges faster.
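Putting the pieces together, here is a minimal NumPy sketch of FISTA for this problem (the function name and iteration count are my own choices; the step size uses $L = 2\|A\|_2^2$, a Lipschitz constant of $\nabla f$):

```python
import numpy as np

def fista(A, b, lam, n_iter=500):
    """FISTA for: minimize ||Ax - b||_2^2 + lam * ||x||_2."""
    L = 2 * np.linalg.norm(A, 2) ** 2          # Lipschitz constant of grad f
    t = 1.0 / L                                 # fixed step size t <= 1/L
    x = np.zeros(A.shape[1])
    y = x.copy()                                # extrapolated point
    s = 1.0                                     # momentum parameter
    for _ in range(n_iter):
        v = y - t * 2 * A.T @ (A @ y - b)       # gradient step on f
        nv = np.linalg.norm(v)                  # prox step on g = lam*||.||_2
        x_new = np.zeros_like(v) if nv <= t * lam else (1 - t * lam / nv) * v
        s_new = (1 + np.sqrt(1 + 4 * s ** 2)) / 2
        y = x_new + ((s - 1) / s_new) * (x_new - x)   # momentum extrapolation
        x, s = x_new, s_new
    return x
```

With $\lambda = 0$ the prox step is the identity and the method solves the plain least-squares problem, which gives a quick way to check the implementation.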