Below is a conceptual derivation that views the result as an instance of CM, the cross-multiplication criterion for fraction equivalence. If $a$ and $b$ are invertible then we can use the familiar properties of (modular) fractions, where we define $\,a/b := ab^{-1}.\,$ Solving both congruences for $r$ we obtain
$$\begin{align} \color{#c00}b\,r\equiv \color{#c00}a\\ \color{#0a0}a\,r\equiv \color{#0a0}b\end{align}\ \Longrightarrow\ \color{#c00}{\dfrac{a}b}\equiv r \equiv \color{#0a0}{\dfrac{b}a}\,\overset{\rm CM}\Longrightarrow\, \color{#c00}a\color{#0a0}a\equiv \color{#c00}b\color{#0a0}b\qquad\qquad $$
so $\,a^2\equiv b^2\,$ by CM. We can eliminate fractions to handle the case when $a$ or $b$ is not invertible. The proof of CM amounts to scaling the fractions to put them over the common denominator $ab$. We can do the same with the equations defining the fractions, where the denominator is just the coefficient of $\,r,\,$ e.g. $\, b\,r\equiv a \!\iff\! r\equiv a/b\,$ has denominator $\,b.\,$ So we scale both equations so that they have the same coefficient of $\,r,\,$ which will necessarily be a common multiple of $a$ and $b$. We choose the simple multiple $\,ab,\,$ yielding
$$\begin{align}
&a\times [b\,r\equiv a]\to \color{#c00}{ab\,r}\equiv a^2\\
&\:\!b\times [a\,r\equiv b]\to \color{#c00}{ba\,r}\equiv b^2
\end{align}\ \Rightarrow\ a^2\equiv b^2$$
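As a sanity check on the scaled argument, here is a minimal brute-force sketch in Python. It is only an illustration on small moduli, not a proof, and the function names `squares_agree` and `noninvertible_solutions` are hypothetical helpers introduced here.

```python
from math import gcd

def squares_agree(n):
    """True if b*r ≡ a and a*r ≡ b (mod n) always force a*a ≡ b*b (mod n)."""
    for a in range(n):
        for b in range(n):
            for r in range(n):
                both_hold = (b * r - a) % n == 0 and (a * r - b) % n == 0
                if both_hold and (a * a - b * b) % n != 0:
                    return False
    return True

def noninvertible_solutions(n):
    """Solutions (a, b, r) with a != b where neither a nor b is invertible mod n."""
    return [
        (a, b, r)
        for a in range(n) for b in range(n) for r in range(n)
        if a != b
        and gcd(a, n) > 1 and gcd(b, n) > 1
        and (b * r - a) % n == 0 and (a * r - b) % n == 0
    ]

if __name__ == "__main__":
    # Check every modulus from 2 to 29.
    print(all(squares_agree(n) for n in range(2, 30)))
    # A case the fraction argument cannot reach: 4 and 8 are not invertible mod 12.
    print((4, 8, 2) in noninvertible_solutions(12))
```

For instance, modulo $12$ the triple $(a,b,r)=(4,8,2)$ satisfies $8\cdot 2\equiv 4$ and $4\cdot 2\equiv 8$, and indeed $4^2=16\equiv 4\equiv 64=8^2$, even though neither $a$ nor $b$ is invertible.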
This proof amounts to scaling the proof of the CM criterion by $\,ab,\,$ so we work only with (modular) integers (vs. fractions). Of course we can do the same with the general CM rule for $\frac{a}b\equiv \frac{c}d,$ scaling the first equation by $d$ and the second by $b$ so that both have the coefficient $bd$ of $\,r,\,$ to obtain
$$ \begin{align} &b\,r\equiv a\\ &d\,r\equiv c\end{align}\ \Longrightarrow\ ad\equiv bc\qquad\qquad $$
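The general rule can be spot-checked the same way. The sketch below exhaustively searches small moduli for a counterexample to $ad\equiv bc$; the name `cm_counterexamples` is a hypothetical helper, and an empty result for a few moduli is evidence, not a proof.

```python
from itertools import product

def cm_counterexamples(n):
    """All (a, b, c, d, r) mod n with b*r ≡ a and d*r ≡ c but a*d ≢ b*c."""
    return [
        (a, b, c, d, r)
        for a, b, c, d, r in product(range(n), repeat=5)
        if (b * r - a) % n == 0
        and (d * r - c) % n == 0
        and (a * d - b * c) % n != 0
    ]

if __name__ == "__main__":
    # Exhaustive over all residues, so non-invertible b and d are included.
    for n in range(2, 9):
        print(n, cm_counterexamples(n))
```

The search ranges over all residues, so it covers non-invertible denominators $b,d$ as well, matching the fraction-free form of the argument.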
Remark $ $ More generally, from an equational point of view, we can view the above derivation as cross-multiplying the equations to eliminate $\,r.\,$ From this standpoint it is a special case of general elimination algorithms such as the Gröbner basis algorithm (which is a (multivariate) generalization of both the division (with remainder) algorithm and Gaussian elimination). Even more generally, we can view it as a special case of the various overlapping (unification) methods for generating consequences of equations used in term rewriting systems, e.g. the Knuth-Bendix algorithm. For example, see here where I show how to derive a proof of uniqueness of inverses this way.