While the current answer gives the necessary indications, I'll try to give a concise and self-contained answer. For diagonalisability, the following criterion is often useful.
Proposition. If a linear operator $T$ on a vector space over a field $K$ satisfies a polynomial equation $P[T]=0$ where $P$ factors over $K$ into distinct monic factors of degree$~1$, then $T$ is diagonalisable.
Proof. Write $P=(X-\lambda_1)\cdots(X-\lambda_k)$ where $\lambda_1,\ldots,\lambda_k$ are its distinct roots (in$~K$). The vector space becomes a module over the ring $K[X]/(P)$ by having $X$ act as $T$. By the Chinese remainder theorem for $K[X]$, one has $K[X]/(P)\cong(K[X]/(X-\lambda_1))\times\cdots\times(K[X]/(X-\lambda_k))$; this is a product of $k$ fields isomorphic to$~K$, with $X$ mapped to $\lambda_i$ in factor$~i$. Any module over such a product decomposes canonically as a direct sum, where the unit of factor$~i$ acts as projection onto summand$~i$. For our vector space, $T$ acts on summand$~i$ as multiplication by $\lambda_i$, so summand$~i$ (if nonzero) is the eigenspace of $T$ for $\lambda_i$, whence the result. QED.
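To make the decomposition in the proof concrete, here is a sketch of the projections written out by Lagrange interpolation. The names $e_i$ and $\pi_i$ are introduced here for illustration and do not appear in the proof above. Put

$$
e_i=\prod_{j\neq i}\frac{X-\lambda_j}{\lambda_i-\lambda_j}\in K[X],\qquad \pi_i=e_i[T]\qquad(i=1,\ldots,k).
$$

Then $e_1+\cdots+e_k=1$ (both sides have degree less than $k$ and agree at the $k$ points $\lambda_1,\ldots,\lambda_k$), while $P$ divides $e_ie_j$ for $i\neq j$ and also divides $(X-\lambda_i)e_i$. Substituting $T$ and using $P[T]=0$ gives

$$
\pi_1+\cdots+\pi_k=\mathrm{id},\qquad \pi_i\pi_j=0\ \ (i\neq j),\qquad (T-\lambda_i)\pi_i=0 .
$$

So every vector $v$ is the sum of the vectors $\pi_i v$, each of which lies in $\ker(T-\lambda_i)$; this is the eigenspace decomposition that the Chinese remainder theorem provides abstractly.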
Now if $M$ is diagonalisable and $\lambda_1,\ldots,\lambda_k$ are its distinct eigenvalues, then for $P=(X-\lambda_1)\cdots(X-\lambda_k)$ one has $P[M]=0$, since $P[M]$ kills every eigenvector of$~M$ and these span the whole space $V$. Substituting into $P$ the restriction of $M$ to the invariant subspace $W$ then also gives zero, and so does substituting the operator $M_{/W}$ induced by$~M$ on the quotient space $V/W$. By the proposition these linear operators are therefore diagonalisable.
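For completeness, the two vanishing claims can be checked directly; this is only a spelled-out version of the step above and uses nothing beyond the $M$-invariance of $W$. For $w\in W$ and $v\in V$ one has

$$
\begin{aligned}
P[M|_W]\,w &= P[M]\,w = 0,\\
P[M_{/W}]\,(v+W) &= P[M]\,v+W = 0+W .
\end{aligned}
$$

Both lines rely on the fact that restricting to $W$, respectively passing to $V/W$, is compatible with sums, scalar multiples and compositions of operators that stabilise $W$, and hence with substitution into a polynomial.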
For the second problem, note that (by a simple calculation) any linear operator commuting with $M$ stabilises each kernel (and also each image) of a polynomial in$~M$, and in particular each eigenspace of$~M$ (which is the kernel of $(X-\lambda_i)[M]$). So $U$ stabilises each of the eigenspaces of $M$, and as these are all $1$-dimensional, any eigenvector for$~M$ is also an eigenvector for$~U$. In particular any basis of diagonalisation for$~M$ is also one for$~U$.
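The "simple calculation" can be sketched as follows, where $Q$ denotes an arbitrary polynomial in $K[X]$ (a name introduced here). From $UM=MU$ one gets $UM^n=M^nU$ for all $n$ by induction, hence $U\,Q[M]=Q[M]\,U$ by linearity. Therefore

$$
\begin{aligned}
Q[M]\,v=0 &\implies Q[M]\,(Uv)=U\,(Q[M]\,v)=0,\\
v=Q[M]\,w &\implies Uv=U\,(Q[M]\,w)=Q[M]\,(Uw),
\end{aligned}
$$

so $U$ maps $\ker Q[M]$ into itself and maps the image of $Q[M]$ into itself. Taking $Q=X-\lambda_i$ gives the statement about eigenspaces: if $Mv=\lambda_iv$ with $v\neq0$ and the eigenspace is a line, then $Uv$ lies on that line, so $Uv=\mu v$ for some $\mu\in K$.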