0

Here is a problem on Optimization on Least Mean Square with answer of my class of Machine learning

Consider the least mean squares problem: $$ \min_{x\in\mathbb{R}^n}\|Ax - b\|_2^2 $$

Suppose $A \in \mathbb{R}^{m×n}$ is a full rank matrix and $m \ge n$. Find the closed-form solution of the least mean squares problem.

Hint: If $A \in \mathbb{R}^{m×n}$ is a full rank matrix and $m \ge n$,then $A^\top A$ is a positive definite matrix.

Solution: Let us first expand the objective function: \begin{align*} \min_{x\in\mathbb{R}^n}\|Ax - b\|_2^2&=(Ax - b)^\top(Ax - b)\\&=x^\top A^\top Ax - x^\top A^\top b - b^\top Ax + b^\top b\\ &= x^\top A^\top Ax - 2x^\top A^\top b + b^\top b \end{align*} This is a convex function of $x$ and so to find the minimum we take the derivative and set it equal to zero: $$ \nabla_x( x^\top A^\top Ax - 2x^\top A^\top b + b^\top b) = 2 \top A^\top Ax - 2 A^\top b = 0 $$ We know that $A^\top A$ is positive definite and invertible. Solving the last equation for $x$ we have $x = (A^\top A)^{-1}A^\top b$.

My questions

  • Given a matrix $A \in \mathbb{R}^{m×n}, m \ge n$, it is full column rank (and not simply full rank), right?
  • In the expression $\nabla_x x^\top A^\top Ax$, how do we know if the result is $2x^\top A^\top A$ or $2A^\top Ax$?
ecjb
  • 995

1 Answers1

2

Here's a reference for the term full rank:

For a non-square matrix with $m$ rows and $n$ columns, it will always be the case that either the rows or columns (whichever is larger in number) are linearly dependent. Hence when we say that a non-square matrix is full rank, we mean that the row and column rank are as high as possible, given the shape of the matrix. So if there are more rows than columns ( $ m> n$), then the matrix is full rank if the matrix is full column rank.

In your case, it is full column rank when they describe it as full rank.

You can use either row convention or column convention as long as it is consistent.

If you write $2x^TA^TA$, you have to use $2x^TA^TA-b^TA=0^T$.

If you write $2A^TAx$, you have to use $2A^TAx-A^Tb=0$.

Siong Thye Goh
  • 149,520
  • 20
  • 88
  • 149