I am not entirely sure I understand your question, but it appears to me you are trying to understand the relation between a monad and a monoid of endofunctors, specifically in your last point:
> For any category $C$, the category $[C,C]$ of its endofunctors has a monoidal structure induced by the composition and the identity functor $I_C$. A monoid object in $[C,C]$ is a monad on $C$.
Maybe I can illustrate this relation in more detail.
Let's first ask what a monoid object in a category $\mathcal C$ is. We would like to say it is a triple $(M,\cdot,1)$ with $\cdot:M\times M\to M$ and $1:*\to M$ satisfying the usual identity and associativity, where $*$ is the terminal object. This definition requires two things, namely that $\mathcal C$ has products and a terminal object.
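To make this triple concrete, here is a minimal Python sketch, assuming we work in the category of sets, where the product is the cartesian product and the terminal object is a one-element set. The names `mul`, `unit`, and `STAR` are illustrative choices of mine, and the monoid is strings under concatenation. Checking the laws on a few samples is of course only a sanity check, not a proof.

```python
from itertools import product

STAR = ()  # the single element of the terminal object * (a one-element set)

def mul(pair):
    """The multiplication M x M -> M: string concatenation."""
    a, b = pair
    return a + b

def unit(star):
    """The unit * -> M: picks out the empty string."""
    return ""

samples = ["", "a", "bc"]

# Associativity: mul . (mul x id) == mul . (id x mul) on M x M x M
assoc_ok = all(
    mul((mul((a, b)), c)) == mul((a, mul((b, c))))
    for a, b, c in product(samples, repeat=3)
)

# Identity: mul . (id x unit) == id == mul . (unit x id)
unit_ok = all(
    mul((a, unit(STAR))) == a == mul((unit(STAR), a))
    for a in samples
)
```

Both `assoc_ok` and `unit_ok` should come out true for these samples.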
We would like to define such an object in our category of endofunctors $\mathcal D=[\mathcal C,\mathcal C]$. However, $\mathcal D$ does not in general have products and a terminal object. What it does have is a weaker structure, that of a (strict) monoidal category. Roughly, this means we have a tensor product $S\otimes T \equiv S\circ T$ for any two objects (note that this is not commutative), and a unit object ${\bf 1} \equiv \text{Id}_{\mathcal C}$ such that
$$(S\otimes T)\otimes U = S\otimes (T\otimes U)$$
and
$$S\otimes {\bf 1} = S = {\bf 1}\otimes S.$$
(Note: these are equalities because the monoidal category is strict. In a general monoidal category we would only have isomorphisms.)
We can still define a monoid-like object in such a category. It is given by a triple $(M,\mu,\eta)$ with $\mu:M\otimes M\to M$, and $\eta:{\bf 1}\to M$ with the usual associativity and identity conditions
$$\mu\circ (\mu\otimes\text{Id}_M) = \mu\circ(\text{Id}_M\otimes \mu)$$
and
$$\mu\circ (\text{Id}_M\otimes\eta) = \text{Id}_{M\otimes {\bf 1}} = \text{Id}_M=\text{Id}_{{\bf 1}\otimes M} =\mu\circ (\eta\otimes\text{Id}_M).$$
Now if we consider such a monoid-like object in the endofunctor category $\mathcal D$ with the given monoidal structure, then this is a triple $(M,\mu,\eta)$ with $\mu:M^2\to M$, $\eta:{\text{Id}_{\mathcal C}}\to M$ satisfying
$$\mu\circ (\mu M) = \mu\circ (M\mu)$$
and
$$\mu\circ (M\eta) = \text{Id}_M = \mu\circ (\eta M).$$
These are precisely the conditions for $(M,\mu,\eta)$ to be a monad on $\mathcal C$ with underlying endofunctor $M$, where $\mu M$, $M\mu$, $\eta M$, and $M \eta$ are the whiskering operations.
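As a concrete instance, here is a small Python sketch of the list monad, assuming we model the endofunctor $M$ by "lists of", $\eta$ by wrapping a value in a singleton list, and $\mu$ by flattening one level of nesting. The function names (`fmap`, `eta`, `mu`, `mu_M`, and so on) are my own labels for illustration. The whiskered maps have components $(\mu M)_X = \mu_{M(X)}$ and $(M\mu)_X = M(\mu_X)$, and similarly for $\eta$; the code only tests the laws on sample data.

```python
def fmap(f):
    """Action of the list functor M on a function f: X -> Y."""
    return lambda xs: [f(x) for x in xs]

def eta(x):
    """Component of eta: Id -> M at any X: wrap a value in a singleton list."""
    return [x]

def mu(xss):
    """Component of mu: M^2 -> M at any X: flatten one level of nesting."""
    return [x for xs in xss for x in xs]

# (mu M)_X = mu_{M(X)}: flatten the outer two layers of a triply nested list.
mu_M = mu
# (M mu)_X = M(mu_X): flatten the inner two layers.
M_mu = fmap(mu)
# (eta M)_X = eta_{M(X)}: wrap the whole list.
eta_M = eta
# (M eta)_X = M(eta_X): wrap each element separately.
M_eta = fmap(eta)

xsss = [[[1, 2], [3]], [[], [4, 5]]]  # an element of M^3(X)
xs = [1, 2, 3]                        # an element of M(X)

# Associativity: mu . (mu M) == mu . (M mu)
assoc_ok = mu(mu_M(xsss)) == mu(M_mu(xsss))
# Identity: mu . (M eta) == Id_M == mu . (eta M)
unit_ok = mu(M_eta(xs)) == xs == mu(eta_M(xs))
```

Note that `mu_M` and `M_mu` flatten different layers of the nested list, yet the two composites agree, which is exactly what the associativity condition says.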
Now you may be wondering how I got from $\mu\otimes\text{Id}_M$ to $\mu M$ and from $\text{Id}_M\otimes\mu$ to $M\mu$ (and likewise for $\eta$). The trouble is that we need to define the tensor $\delta\otimes \epsilon$ of two maps $\delta:S\to U$ and $\epsilon:T\to V$ inside of $\mathcal D$, which should be a map $S\otimes T\to U\otimes V$. Spelling this out in full takes some care, but to give an idea, it can be defined pointwise by
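$$(\delta\otimes \epsilon)_X \equiv U(\epsilon_X)\circ\delta_{T(X)} = \delta_{V(X)}\circ S(\epsilon_X),$$
where the two expressions agree by naturality of $\delta$. One can check that when $\delta$ or $\epsilon$ is an identity natural transformation, this reduces to a whiskering operation: $(\delta\otimes\text{Id}_T)_X = \delta_{T(X)}$ and $(\text{Id}_S\otimes\epsilon)_X = S(\epsilon_X)$.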
If I recall correctly, this is what is usually called the horizontal composition of natural transformations. Checking that the two expressions above agree is a good exercise, and it is the reason there are two equivalent ways of writing the definition.
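The pointwise formula can be tried out in a short Python sketch. As a simplifying assumption, all four functors $S$, $T$, $U$, $V$ are taken to be the list functor, with $\delta$ being list reversal and $\epsilon$ a "safe head" that keeps at most the first element. These choices, and the helper names `hcomp_left` and `hcomp_right`, are purely illustrative.

```python
def fmap(f):
    """Action of the list functor on a function f."""
    return lambda xs: [f(x) for x in xs]

def delta(xs):
    """A natural transformation List -> List: reverse the list."""
    return list(reversed(xs))

def epsilon(xs):
    """A natural transformation List -> List: keep at most the first element."""
    return xs[:1]

def identity(xs):
    """The identity natural transformation on the list functor."""
    return xs

def hcomp_left(d, e):
    """X-component U(e_X) . d_{T(X)}: apply d to the outer layer, then e inside."""
    return lambda xss: fmap(e)(d(xss))

def hcomp_right(d, e):
    """X-component d_{V(X)} . S(e_X): apply e inside, then d to the outer layer."""
    return lambda xss: d(fmap(e)(xss))

xss = [[1, 2], [], [3, 4, 5]]  # an element of S(T(X))

left = hcomp_left(delta, epsilon)(xss)
right = hcomp_right(delta, epsilon)(xss)
both_agree = left == right

# Whiskering as special cases of the tensor of maps:
# tensoring delta with an identity gives the component delta_{M(X)},
whisker_delta_ok = hcomp_left(delta, identity)(xss) == delta(xss)
# and tensoring an identity with epsilon gives M(epsilon_X).
whisker_epsilon_ok = hcomp_left(identity, epsilon)(xss) == fmap(epsilon)(xss)
```

On this sample the two formulas give the same result, and the identity cases reduce to the two whiskerings, in line with the claim above. This is only a check on one input, not a proof of naturality.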