
I need to implement an algorithm that computes the symmetric matrix obtained from the product $A A^t$, where $A^t$ is the transpose of $A$.

I did my analysis from two perspectives:

  1. The first thing I notice is that no extra memory is needed for $A^t$, because it holds the same data as $A$; what changes is the "orientation". Therefore I do not need to store $A^t$ if I am "creative" and "intelligent" about iterating over the rows/columns of $A$ properly.

  2. Second, it is well known that symmetric matrices have the property that the corresponding elements of the upper and lower triangles (excluding the diagonal) are equal. So it suffices to compute the upper triangle and then "copy" each value into the "transposed" position in the lower triangle (see the formula right after this list).
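
Both observations can be read off from the entrywise formula (the standard definition, written in my own notation):

$$S(i,j) = \sum_{k=1}^{n} A(i,k)\,A(j,k) = S(j,i)$$

Each entry is just the inner product of rows $i$ and $j$ of $A$, so no transpose ever needs to be stored, and swapping $i$ and $j$ leaves the sum unchanged, which gives the symmetry.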

From what little I remember of computational complexity theory, traditional multiplication of an $m\times n$ matrix $A$ by an $n\times p$ matrix $B$ (giving an $m\times p$ result) has complexity $O(mnp)$. If $A$ and $B$ are square, everything reduces to $O(m^3)$.
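
For reference, the count behind that bound: the product has $m p$ entries, and each one is an inner product of length $n$, so the classic algorithm performs $m n p$ multiplications (and about as many additions), which is where $O(mnp)$ comes from.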

For my scenario, where $A$ is $m\times n$, and recognizing that the symmetric matrix $S = A A^t$ is square, the expression reduces to $O(m^2 n)$; since $S$ is symmetric I can also save half the calculations, roughly $(m^2 n)/2$ operations (still $O(m^2 n)$, since the constant disappears inside the $O$). Up to there I think I am more or less on track.
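
As a back-of-the-envelope check on that factor of one half (my own count, assuming one multiplication per inner-product step): there are $m(m+1)/2$ entries on or above the diagonal, each requiring an inner product of length $n$, so

$$T(m,n) \approx n \cdot \frac{m(m+1)}{2} = \frac{m^2 n}{2} + \frac{m n}{2},$$

which is indeed about half of $m^2 n$ multiplications.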

The point is that I implemented two algorithms but cannot remember how to proceed to analyze $T()$ and thus see which one could be more efficient. I am hoping someone can guide me on how to estimate $T()$:

Algorithm 1:

for i = 1..m do:
  for j = i..m do:
    if i = j
      then for k = 1..n do: 
             S(i,j) = S(i,j) + A(i,k) * A(i,k)
      else begin
             for k = 1..n do: 
               S(i,j) = S(i,j) + A(i,k) * A(j,k)
             S(j,i) = S(i,j)
           end
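
In case it is useful for testing, here is a direct Python transcription of Algorithm 1 that can actually be run (just a sketch: the function name is mine, and it assumes $S$ starts zeroed and $A$ is given as a list of $m$ rows of length $n$):

    def symmetric_product_v1(A):
        # Computes S = A * A^t without storing A^t: entry (i,j) is the
        # inner product of rows i and j of A.
        m, n = len(A), len(A[0])
        S = [[0.0] * m for _ in range(m)]
        for i in range(m):
            for j in range(i, m):
                if i == j:
                    for k in range(n):  # diagonal: row i with itself
                        S[i][j] += A[i][k] * A[i][k]
                else:
                    for k in range(n):
                        S[i][j] += A[i][k] * A[j][k]
                    S[j][i] = S[i][j]  # mirror into the lower triangle
        return S

For example, `symmetric_product_v1([[1, 2], [3, 4]])` returns `[[5.0, 11.0], [11.0, 25.0]]`.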

Algorithm 2:

for i = 1..m do:
begin
  for k = 1..n do:
    S(i,i) = S(i,i) + A(i,k) * A(i,k)
  for j = i+1..m do:
  begin
    for k = 1..n do:
      S(i,j) = S(i,j) + A(i,k) * A(j,k)
    S(j,i) = S(i,j)
  end
end
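
And the matching transcription of Algorithm 2, under the same assumptions and caveats as above (note that the inner $k$ loop must run over the full $1..n$):

    def symmetric_product_v2(A):
        # Same computation as v1, but the diagonal gets its own loop,
        # so the i = j conditional disappears from the inner loops.
        m, n = len(A), len(A[0])
        S = [[0.0] * m for _ in range(m)]
        for i in range(m):
            for k in range(n):  # diagonal entry S(i,i)
                S[i][i] += A[i][k] * A[i][k]
            for j in range(i + 1, m):  # strictly upper triangle
                for k in range(n):
                    S[i][j] += A[i][k] * A[j][k]
                S[j][i] = S[i][j]  # mirror into the lower triangle
        return S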

By the theory, both obviously stay within $O(m^2 n)$. It is when I try to count the iteration cycles for the variable $j$ that I get confused.
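
What I have managed so far, in case it helps to see where I am stuck: for a fixed $i$, the $j$ loop of Algorithm 1 runs from $i$ to $m$, that is, $m - i + 1$ times, so the number of $(i,j)$ pairs visited should be

$$\sum_{i=1}^{m} \sum_{j=i}^{m} 1 = \sum_{i=1}^{m} (m - i + 1) = \frac{m(m+1)}{2},$$

with each pair contributing an inner $k$ loop of exactly $n$ iterations. What I cannot manage is assembling these pieces into a complete $T()$ for each algorithm, counting the comparisons and assignments as well.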

Delphius
  • You can do theoretically better using fast matrix multiplication, and in fact, for the square case $n = m$, your problem is probably equivalent to matrix multiplication. But in practice, the blocked versions of your algorithms are probably the best. If you want to know which of them is better, implement them and do an experiment. – Yuval Filmus Apr 22 '15 at 20:04
  • I know there are more efficient algorithms. If memory serves, Strassen's algorithm is slightly below $O(n^3)$, something like $O(n^{2.81})$. But for now I must limit myself to my own proposals, and I need to give a more "theoretical" justification, not just experiments. – Delphius Apr 22 '15 at 20:15
  • Try counting exactly the number of arithmetical operations performed in each algorithm (or perhaps just the number of multiplications). – Yuval Filmus Apr 22 '15 at 20:16
  • @Raphael I'm not sure. Delphius is interested in the concrete complexity of their algorithms, since both algorithms have the same asymptotic complexity. – Yuval Filmus Apr 23 '15 at 00:12
  • Exactly. I'm studying these two algorithms and trying to determine which would be more appropriate ... the value of $T()$ could help, given that asymptotically they are the same. I'm thinking a few things over. I appreciate any input. – Delphius Apr 23 '15 at 00:51
  • @YuvalFilmus I don't know what you mean by "concrete complexity". The actual runtime function, not only a $\Theta$-bound? Well, the reference question provides means to do that. In fact, it barely talks about asymptotic bounds. – Raphael Apr 23 '15 at 10:40
  • I read the post/article that Raphael linked. The truth is that this topic was never properly taught in my degree. It clarified some things, but others confuse me. How would you solve it? No matter how I try it, I get that both are $(m^2 n)/2$. The difference is that the second eliminates the conditional and reorders the cycles for the diagonal and the triangle. I have not been able to translate that into "numbers". – Delphius Apr 23 '15 at 23:01
  • Well, what kind of result do you expect? (By the way, you'll pay heavily for not storing $A^T$ explicitly if you consider the memory hierarchy: if $A$ is stored row-wise, traversing it column-wise for "reading" $A^T$ will create (asymptotically dominatingly) many cache misses.) – Raphael Aug 07 '15 at 09:46
  • The answer depends a lot on what you intend to use $A A^T$ for, after you know it. For example, if you use $A A^T$ only to do matrix-vector multiplications of the form $A A^T v$, then you do not even need to compute $A A^T$: just compute $u = A^T v$ and $z = A u = A A^T v$. – Vincenzo Nov 11 '18 at 21:19
