Let $A$ be an $n \times n$ matrix with characteristic polynomial $p(\lambda)=\det(A-\lambda I)=p_{0}+p_{1}\lambda+\ldots+p_{n-1}\lambda^{n-1}+p_{n}\lambda^{n}$.
Let $Q(\lambda)$ be the adjugate matrix of the square matrix $A-\lambda I$, which may be considered as a polynomial in $\lambda$ with matrix coefficients (see this for 2 by 2 and 3 by 3 cases):
$Q(\lambda)=Q_{0}+\lambda Q_{1}+\ldots+\lambda^{q-1}Q_{q-1}+\lambda^{q}Q_{q}$, where $Q_{0},\ldots,Q_{q}$ are constant matrices.
On one hand, by virtue of Cramer's Rule (Lay P179 Thm 8), $(\operatorname{adj}A)A=(\det A)I$.
So, with $A-\lambda I$ in place of $A$: $(\operatorname{adj}(A-\lambda I))(A-\lambda I)=\det(A-\lambda I)\,I$
$\implies
Q(\lambda)(A-\lambda I)=p(\lambda)I=p_{0}I+p_{1}\lambda I+\ldots+p_{n-1}\lambda^{n-1}I+p_{n}\lambda^{n}I.$
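As a sanity check on this identity, here is a minimal Python sketch for the 2 by 2 case. The matrix `A` and the helper names (`matmul`, `shifted`, `adjugate`, `det`, `identity_holds`) are illustrative assumptions of mine, not part of the proof; it only confirms $Q(\lambda)(A-\lambda I)=p(\lambda)I$ at a few integer values of $\lambda$.

```python
# Assumed example: a fixed 2x2 integer matrix; nothing here is specific to the proof's source.

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shifted(A, lam):
    """Return A - lam*I for a 2x2 matrix A."""
    return [[A[0][0] - lam, A[0][1]], [A[1][0], A[1][1] - lam]]

def adjugate(M):
    """Adjugate of a 2x2 matrix: swap the diagonal, negate the off-diagonal."""
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 2], [3, 4]]

def identity_holds(lam):
    """Check adj(A - lam*I) (A - lam*I) == det(A - lam*I) I for one value of lam."""
    M = shifted(A, lam)
    p = det(M)
    return matmul(adjugate(M), M) == [[p, 0], [0, p]]

results = [identity_holds(lam) for lam in range(-3, 4)]
print(results)
```

Integer arithmetic keeps the comparison exact, so no floating-point tolerance is needed.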
$1.$ What's the proof strategy? How would one determine/divine/previse the key steps, such as considering the adjugate of $A-\lambda I$, Cramer's Rule, and writing $Q(\lambda)(A-\lambda I)$ in two ways?
On the other hand, $Q(\lambda)(A-\lambda I)=Q_{0}A+\lambda(Q_{1}A-Q_{0})+\ldots+\lambda^{q}(Q_{q}A-Q_{q-1})-\lambda^{q+1}Q_{q}.$ Equate the coefficients of like powers of $\lambda$ in the two expressions for $Q(\lambda)(A-\lambda I)$. Thus $q=n-1$, and the resulting equalities are as follows.
$2.$ Why $q = n - 1$?
To save space, I also show, in orange, the power of $A$ by which each equality is multiplied on the right:
$Q_{0}A=p_{0}I,$
$\color{orangered}{(} Q_{1}A-Q_{0}=p_{1}I \color{orangered}{)A} ,$
$\ldots,$
$\color{orangered}{(} Q_{n-1}A-Q_{n-2}=p_{n-1}I \color{orangered}{)A^{n-1}} ,$
$\color{orangered}{(} -Q_{n-1}=p_{n}I \color{orangered}{)A^{n}}.$
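To make the coefficient equalities concrete, here is a small Python sketch for $n=2$ (so $q=1$). It assumes the example matrix `A` below and uses the 2 by 2 facts $Q_{0}=\operatorname{adj}A$, $Q_{1}=-I$, $p_{0}=\det A$, $p_{1}=-\operatorname{tr}A$, $p_{2}=1$. The helper names are mine and purely illustrative.

```python
# Assumed 2x2 example; checks the unmultiplied equalities only.

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, M):
    """Scalar multiple c*M."""
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]

# Q(lam) = adj(A - lam*I) = Q0 + lam*Q1 in the 2x2 case
Q0 = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]  # adj A
Q1 = [[-1, 0], [0, -1]]                          # -I

# p(lam) = det(A - lam*I) = p0 + p1*lam + p2*lam^2
p0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # det A
p1 = -(A[0][0] + A[1][1])                   # -trace A
p2 = 1

checks = [
    matmul(Q0, A) == scale(p0, I2),           # Q0 A       = p0 I
    sub(matmul(Q1, A), Q0) == scale(p1, I2),  # Q1 A - Q0  = p1 I
    scale(-1, Q1) == scale(p2, I2),           # -Q1        = p2 I
]
print(checks)
```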
$3.$ What's the proof strategy, regarding these multiplications by $A^i$ for all $1 \le i \le n$ in orange?
Add all the equalities together: $\text{RHS}=p(A)=p_{0}I+p_{1}A+\ldots+p_{n-1}A^{n-1}+p_{n}A^{n}=\text{LHS}=O$, the zero matrix.
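The summation step can also be checked numerically. The sketch below, again for an assumed 2 by 2 example with illustrative helper names, multiplies the $i$-th equality by $A^{i}$ on the right (my assumption, chosen so that neighbouring left-hand terms cancel in pairs) and then adds both sides.

```python
# Assumed 2x2 example; each equality is right-multiplied by a power of A, then summed.

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(*Ms):
    """Entrywise sum of any number of 2x2 matrices."""
    return [[sum(M[i][j] for M in Ms) for j in range(2)] for i in range(2)]

def scale(c, M):
    """Scalar multiple c*M."""
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]
A2 = matmul(A, A)

Q0 = [[A[1][1], -A[0][1]], [-A[1][0], A[0][0]]]  # adj A
Q1 = [[-1, 0], [0, -1]]                          # -I
p0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]       # det A
p1 = -(A[0][0] + A[1][1])                        # -trace A
p2 = 1

# Left-hand sides after right-multiplying by I, A, A^2 respectively
lhs_terms = [
    matmul(Q0, A),
    matmul(add(matmul(Q1, A), scale(-1, Q0)), A),
    matmul(scale(-1, Q1), A2),
]
# Right-hand sides after the same multiplications: p0 I, p1 A, p2 A^2
rhs_terms = [scale(p0, I2), scale(p1, A), scale(p2, A2)]

lhs_sum = add(*lhs_terms)  # Q0 A + (Q1 A^2 - Q0 A) - Q1 A^2
p_of_A = add(*rhs_terms)   # p0 I + p1 A + p2 A^2
print(lhs_sum, p_of_A)
```

Both sums come out as the zero matrix for this example, matching $p(A)=O$.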
Addendum: I picked this as it looks like the easiest. Please advise if others are even easier.