Firstly let's state in all that follows we will define binomial coefficients for $a,b\in\mathbb{N}\cup\{0\}$ as
$$\binom{a}{b}=\begin{cases} 0 & a\lt b\\ \frac{a!}{b!(a-b)!} & \text{else}\end{cases}$$
and define the finite difference operator $\Delta$ such that for some function of $i$ $f(i)$ we have
$$\Delta(f(i))=f(i+1)-f(i)$$
and
$$\Delta^k(f(i)) = \underbrace{\Delta(\Delta(\Delta\cdots (\Delta}_{\text{$k$ times}}(f(i)))\cdots ))$$
Seems pretty straightforward right? It's a kind of finite differentiation and will come in handy. Anyway, on with the question...
Okay, let's take your example of $i^2$, we will list some values and finite differences in a table
$$\begin{array}{c|c|c|c|c|c|c|c}
i & 0 & 1 & 2 & 3 & 4 & 5 & \cdots\\\hline
i^2 & 0 & 1 & 4 & 9 & 16 & 25 & \cdots\\\hline
\Delta(i^2) & 1 & 3 & 5 & 7 & 9 & & \cdots\\\hline
\Delta^2(i^2) & 2 & 2 & 2 & 2 & & & \cdots\\\hline
\Delta^3(i^2) & 0 & 0 & 0 & & & & \cdots\\\hline
\end{array}$$
Now, we see that the finite differences eventually become $0$ as we go down the table just as repeated differentiation of powers of $i$ eventually becomes constant and then $0$. This always happens for finite differences of powers of $i$ because when we take
$$\Delta i^n= (i+1)^n - i^n = \left(i^n + \sum_{k=0}^{n-1}\binom{n}{k}i^k\right) - i^n = \sum_{k=0}^{n-1}\binom{n}{k}i^k$$
thus reducing the highest power of $i$ by $1$, repeating this $n$ times will leave us with $0$.
Now I'm going to write these coefficients in a polynomial in $x$ and $y$ and this will seem unmotivated except that it is something that crops up frequently in combinatorics, here goes
$$\begin{split}
y^0(2&+&1x&+&0x^2)\\
y^1(2&+&3x&+&1x^2)\\
y^2(2&+&5x&+&4x^2)\\
y^3(2&+&7x&+&9x^2)\\
y^4(0&+&9x&+&16x^2)+\cdots\\\end{split}$$
Now, due to the nature of finite differences every successive row in this polynomial can be achieved from last by multiplying by $y(1+x)$, we see this has the effect of adding the term above and above and left one term together, it is the same reasoning that relates Pascal's identity in the famous triangle to the polynomial $\smash{\sum_{k=0}^{\infty}y^k(1+x)^k}$. The main difference in this case is the $y^0$ polynomial in the top row (row $0$). In Pascal's triangle row $0$ is $y^0(1)$ in this case it is $y^0(2+1x+0x^2)$ or just $y^0(2+1x)$.
So to obtain row $1$ we multiply
$$y^0(2+1x)(y(1+x))=y^1(2+3x+1x^2)$$
to obtain row $2$ we multiply
$$y^0(2+1x)(y(1+x))^2=y^2(2+5x+4x^2+1x^3)$$
row $3$
$$y^0(2+1x)(y(1+x))^3=y^3(2+7x+9x^2+5x^3+1x^4)$$
and so forth, if we add these we get
$$g(x,y)=(2+1x)\sum_{k=0}^{\infty}y^k(1+x)^k$$
Notice that we will gain extra (redundant) terms in $g(x,y)$ that are not in our table, e.g. the first of these redundant terms will be $1y^2x^3$ in row $2$. These terms are not needed here as we shall see.
Anyway, we note that the general term $i^2$ is the coefficient of $y^ix^2$ in $g(x,y)$ and we can state this succinctly with $[y^ix^2]g(x,y)$.
Okay, let's express this by expanding the binomial as follows
$$\begin{align}[y^ix^2]g(x,y) &= [y^ix^2](2+1x)\sum_{k=0}^{\infty}y^k(1+x)^k\\
&=[y^ix^2](2+1x)\sum_{k=0}^{\infty}y^k\sum_{j=0}^{k}\binom{k}{j}x^j\\
&=[y^ix^2]\sum_{k=0}^{\infty}y^k\sum_{j=0}^{k}\left(2\binom{k}{j}x^j+1\binom{k}{j}x^{j+1}\right)\\
&=[y^ix^2]\sum_{k=0}^{\infty}y^k\sum_{j=0}^{k}\left(2\binom{k}{j}+1\binom{k}{j-1}\right)x^{j}\\
&=2\binom{i}{2}+1\binom{i}{1}\end{align}$$
as given in the question.
In general
Notice that in general the row $0$ polynomial of $i^n$ for $n\in\mathbb{N}$ will be of the form
$$y^0(\Delta^n(0^n)+\Delta^{n-1}(0^n)x+\Delta^{n-2}(0^n)x^2+\cdots+\Delta(0^n)x^n)$$
and we will therefore be able to express
$$i^n= \Delta^n(0^n)\binom{i}{n}+\Delta^{n-1}(0^n)\binom{i}{n-1}+\Delta^{n-2}(0^n)\binom{i}{n-2}+\cdots +\Delta(0^n)\binom{i}{1}$$
So all you need to do is find the top row (row $0$) polynomial for which you can use finite differences as we did in the example.
Another way is to notice that after repeated operations of $\Delta$ on some $f(0)$ we have
$$\Delta^k(f(0))=\sum_{j=0}^{k}(-1)^{k-j}\binom{k}{j}f(j)$$
for the case of $f(i)=i^n$ we have
$$\Delta^k(f(0))=\sum_{j=0}^{k}(-1)^{k-j}\binom{k}{j}j^n$$
which are recognisable as being related to Stirling numbers of the second kind
$$S(n,k)= \frac{1}{k!}\sum_{j=0}^{k}(-1)^j\binom{k}{j}j^n$$
therefore
$$\Delta^k(0^n)=k!S(n,k)$$
and our formula for $i^n$ is
$$i^n=n!S(n,n)\binom{i}{n}+(n-1)!S(n,n-1)\binom{i}{n-1}+(n-2)!S(n,n-2)\binom{i}{n-2}+\cdots +1!S(n,1)\binom{i}{1}$$
This is easily summed using Pascals hockey stick rule on each term of the right hand side
$$\sum_{r=m}^{i}\binom{i}{m}=\binom{i+1}{m+1}$$
giving
$$\sum_{i=0}^{N}i^n=n!S(n,n)\binom{N+1}{n+1}+(n-1)!S(n,n-1)\binom{N+1}{n}+(n-2)!S(n,n-2)\binom{N}{n-1}+\cdots +1!S(n,1)\binom{N}{2}$$
in our example case
$$\begin{align}\sum_{i=0}^{N}i^2=2!S(2,2)\binom{N+1}{3}+1!S(2,1)\binom{N+1}{2}&=2\frac{(N+1)N(N-1)}{6}+\frac{(N+1)N}{2}\\&=\frac{(2N+1)(N+1)N}{6}\end{align}$$
Which is a well recognised result.
The power in this method should be self-evident.
In fact, although I said earlier that higher terms produced were redundant, if you look at the coefficients $[y^ix^3]g(x,y)$ for our example this is the desired summation itself. In general this also applies so that the desired summation is the coefficient of $y^ix^{n+1}$ in our 2 variable polynomial.