I'll give a brief overview of the main ideas about the relation between linear and affine (in)dependence.
Let $\mathbf p_i\in\mathbb{R}^d$ be points in a real space.
Reminder of linear (in)dependence
As a brief reminder about linear (in)dependence: the points are linearly dependent iff there are coefficients $\alpha_i$, not all zero, such that $\sum_i \alpha_i \mathbf p_i=0$. The points are linearly independent if this is not the case, that is, if the only way to have $\sum_i \alpha_i \mathbf p_i=0$ is to take $\alpha_i=0$ for all $i$.
Given some $\mathbf x\in\mathbb{R}^d$, we say that $\mathbf x$ is a linear combination of the points $(\mathbf p_i)_i$, or is in their linear span, if we can write $\mathbf x=\sum_i \alpha_i \mathbf p_i$ for some set of coefficients $\alpha_i\in\mathbb{R}$. Note in particular that no further constraints are imposed on the coefficients.
Geometrically, these notions relate to linear subspaces, which I'll loosely call hyperplanes (passing through the origin): $(\mathbf p_i)_{i=1}^m$ are linearly dependent if and only if there's a hyperplane of dimension smaller than $m$ that contains all the points. In other words, you don't need all the points to span the hyperplane that all of them together span. For example, $(1,0),(0,1),(1,1)$ are linearly dependent, because the first two points are sufficient to span $\mathbb{R}^2$.
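To make this example concrete, here is one explicit choice of not-all-zero coefficients witnessing the dependence (the coefficients are listed in the same order as the points):
$$1\cdot(1,0)+1\cdot(0,1)+(-1)\cdot(1,1)=(0,0).$$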
Convex hull and affine span
Now let's say we want to know whether some $\mathbf x$ is "inside" the set of points $(\mathbf p_i)_{i=1}^m\subset\mathbb{R}^d$. That means to have $\mathbf x$ be a convex combination of the points, that is, to be able to write $\mathbf x=\sum_i \alpha_i \mathbf p_i$ with coefficients $\alpha_i\in[0,1]$ such that $\sum_i\alpha_i=1$.
Take note in particular of the constraints imposed on $\alpha_i$ in this definition.
We will refer to the set of all points that are convex combinations of $(\mathbf p_i)_{i=1}^m$ as the convex hull of the points, and denote it with $\operatorname{conv}(\{\mathbf p_1,...,\mathbf p_m\})$.
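As a small illustration, with points chosen here purely for the sake of example, take $\mathbf p_1=(1,0)$, $\mathbf p_2=(0,1)$, $\mathbf p_3=(1,1)$. Then
$$\left(\tfrac23,\tfrac23\right)=\tfrac13(1,0)+\tfrac13(0,1)+\tfrac13(1,1),$$
so $\left(\tfrac23,\tfrac23\right)$ is in the convex hull. The origin is not: a combination $\sum_i\alpha_i\mathbf p_i=(\alpha_1+\alpha_3,\alpha_2+\alpha_3)$ with $\alpha_i\ge 0$ can only equal $(0,0)$ if all the $\alpha_i$ vanish, which is incompatible with $\sum_i\alpha_i=1$.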
Yet another question one might be interested to ask is: is $\mathbf x$ in the affine hyperplane generated by the points $(\mathbf p_i)_{i=1}^m$? Note that an affine hyperplane, unlike a hyperplane, need not pass through the origin (and thus, somewhat confusingly, an affine hyperplane is not in general a hyperplane).
Let us refer to the set of such points as the affine span, and denote it with $\operatorname{aff}(\{\mathbf p_i\}_{i=1}^m)$.
An easy way to characterise $\operatorname{aff}(\{\mathbf p_i\}_{i=1}^m)$ is to exploit what we know about a point being in the (non-affine) hyperplane generated by a set of points.
Observe that $\mathbf x$ being in the affine hyperplane generated by $(\mathbf p_i)_{i=1}^m$ is the same as $\mathbf x-\mathbf p_1$ being in the (non-affine) hyperplane generated by the points $(\mathbf p_i-\mathbf p_1)_{i=2}^m$.
Therefore, $\mathbf x$ is in the affine hyperplane generated by $(\mathbf p_i)_{i=1}^m$ iff there are coefficients $\alpha_i\in\mathbb{R}$, $i=2,\dots,m$, such that
$$\mathbf x - \mathbf p_1 = \sum_{i=2}^m \alpha_i (\mathbf p_i-\mathbf p_1)
\iff \mathbf x = \left(1-\sum_{i=2}^m \alpha_i\right)\mathbf p_1 + \sum_{i=2}^m \alpha_i \mathbf p_i.$$
In other words, we found that $\mathbf x\in \operatorname{aff}(\{\mathbf p_i\}_{i=1}^m)$ implies the existence of coefficients $\beta_i\in\mathbb{R}$ such that $\sum_{i=1}^m\beta_i=1$ and $\mathbf x=\sum_{i=1}^m \beta_i \mathbf p_i$. This is in fact an if and only if condition, and thus
$$\operatorname{aff}(\{\mathbf p_1, ...,\mathbf p_m\})=\left\{
\mathbf x\in\mathbb{R}^d: \,\, \mathbf x=\sum_{i=1}^m \beta_i \mathbf p_i,\,\,\,\beta_i\in\mathbb{R},\,\,\, \sum_i \beta_i=1
\right\}.$$
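For instance, taking the two points $\mathbf p_1=(1,0)$ and $\mathbf p_2=(0,1)$ as an example, the condition $\beta_1+\beta_2=1$ gives
$$\operatorname{aff}(\{\mathbf p_1,\mathbf p_2\})=\{(\beta_1,\,1-\beta_1):\ \beta_1\in\mathbb{R}\},$$
which is the line $x+y=1$. The point $(2,-1)=2\,\mathbf p_1+(-1)\,\mathbf p_2$ lies on this line, hence in the affine span, but it is not in the convex hull, because one of its coefficients is negative (and the coefficients are uniquely determined here).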
Affine (in)dependence
We say that a set of points $\{\mathbf p_i\}_{i=1}^m$ is linearly dependent if its linear span is generated by a strict subset of the points. Equivalently, if its linear span has dimension strictly smaller than $m$. The points are linearly independent if this is not the case.
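We define affine (in)dependence in exactly the same way, replacing "linear" with "affine" in the first formulation: the points are affinely dependent if their affine span is generated by a strict subset of them. In terms of dimension, this means that the affine span has dimension strictly smaller than $m-1$.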
Thus $\{\mathbf p_i\}_{i=1}^m$ is affinely dependent iff one of the points is in the affine span of the rest. Without loss of generality, say this is the first point. That would imply the existence of coefficients $\beta_i\in\mathbb{R}$, $i=2,\dots,m$, such that $\sum_{i=2}^m \beta_i=1$ and
$$\mathbf p_1 = \sum_{i=2}^m \beta_i \mathbf p_i
\iff
\mathbf p_1 + \sum_{i=2}^m (-\beta_i)\mathbf p_i = 0.$$
But this in turn implies, defining $\gamma_1=1$ and $\gamma_i=-\beta_i$ for $i\ge 2$, that there are coefficients $\gamma_i\in\mathbb{R}$, not all zero (since $\gamma_1=1$), such that $\sum_i \gamma_i=1-\sum_{i=2}^m\beta_i=0$ and $\sum_{i=1}^m \gamma_i \mathbf p_i=0$.
This statement, again, goes both ways: if there are $\gamma_i\in\mathbb{R}$, not all zero, such that $\sum_i\gamma_i=0$ and $\sum_i\gamma_i \mathbf p_i=0$, then one of the points is in the affine span of the rest (pick any $j$ with $\gamma_j\neq 0$ and solve for $\mathbf p_j$; the resulting coefficients $-\gamma_i/\gamma_j$, $i\neq j$, sum to $1$), hence the set is affinely dependent.
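As an example of this criterion, with points picked here only for illustration, the three collinear points $(1,0)$, $(0,1)$, $\left(\tfrac12,\tfrac12\right)$ are affinely dependent: the coefficients $\gamma=(1,1,-2)$ are not all zero, sum to zero, and satisfy
$$1\cdot(1,0)+1\cdot(0,1)+(-2)\cdot\left(\tfrac12,\tfrac12\right)=(0,0).$$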
An equivalent way to talk about affine (in)dependence is to observe that $\{\mathbf p_i\}_{i=1}^m$ being affinely (in)dependent is equivalent to the set of "enlarged points" $\{(1,\mathbf p_i)\}_{i=1}^m$ being linearly (in)dependent.
This is discussed in more detail e.g. in the question "Prove that $\{\vec x_i\}\subset\mathbb R^d$ is affinely independent iff $\{(1,\vec x_i)\}$ is linearly independent".
Yet another approach, which I partially used above, is to observe that affine (in)dependence of $\{\mathbf p_i\}_{i=1}^m$ is equivalent to linear (in)dependence of $\{\mathbf p_i-\mathbf p_1\}_{i=2}^m$. This is discussed e.g. in the question "Prove that $v_0, v_1,...,v_k$ are affinely independent if and only if $v_1 - v_0,...,v_k - v_0$ are linearly independent".
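Both criteria are easy to try out numerically. The following is only a minimal sketch: the function names (`rank`, `linearly_independent`, `affinely_independent_lifted`, `affinely_independent_differences`) are my own choices for illustration, and the rank is computed by plain Gaussian elimination over exact fractions to avoid rounding issues.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # look for a row at or below position r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # clear the entries below the pivot
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def linearly_independent(vectors):
    # m vectors are linearly independent iff they span an m-dimensional space
    return rank(vectors) == len(vectors)


def affinely_independent_lifted(points):
    # criterion 1: the "enlarged points" (1, p_i) are linearly independent
    return linearly_independent([(1, *p) for p in points])


def affinely_independent_differences(points):
    # criterion 2: the differences p_i - p_1, for i >= 2, are linearly independent
    first = points[0]
    diffs = [[a - b for a, b in zip(p, first)] for p in points[1:]]
    return linearly_independent(diffs)


pts = [(1, 0), (0, 1), (1, 1)]
print(linearly_independent(pts))              # False
print(affinely_independent_lifted(pts))       # True
print(affinely_independent_differences(pts))  # True
```

The two affine tests should always agree; having both around is just a convenient way to see the equivalence in action on small examples.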
Toy examples
Consider in $\mathbb{R}^2$ the points $\mathbf p_1=(1,0)$ and $\mathbf p_2=(0,1)$. These are both linearly and affinely independent.
On the other hand, add the point $\mathbf p_3=(1,1)$. Now
$\{\mathbf p_1,\mathbf p_2,\mathbf p_3\}$ is linearly dependent, but still affinely independent.
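One way to check the affine independence is through the differences with respect to $\mathbf p_1$: we have $\mathbf p_2-\mathbf p_1=(-1,1)$ and $\mathbf p_3-\mathbf p_1=(0,1)$, and placing these as the columns of a matrix gives
$$\det\begin{pmatrix}-1 & 0\\ 1 & 1\end{pmatrix}=-1\neq 0,$$
so the two differences are linearly independent.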
The idea is that while $\{\mathbf p_1,\mathbf p_2\}$ already linearly span the whole of $\mathbb{R}^2$, they affinely span only a line in it. You need all three points to get $\operatorname{aff}(\{\mathbf p_1,\mathbf p_2,\mathbf p_3\})=\mathbb{R}^2$.
This is a general feature: to affinely span $\mathbb{R}^d$ you need (at least) $d+1$ points.