Matrix Multiplication Algorithms

Since we are working with fairly large matrices in the shadowbox assignment, I started to think about how to do matrix multiplication efficiently on large matrices just out of curiosity. I have heard in past classes that the fastest known algorithm to do this is O(n^2.376) but nothing else was said so I decided to investigate and make a blog post about it in case anyone else cares (probably not) =).

Lower Bound (2*n^2)
Let A and B be nxn matrices. No matter how clever our algorithm is, we will always have to examine every entry of A and B. Since each matrix has n^2 entries, a lower bound of any matrix multiplication algorithm is 2*n^2.

Naive Matrix Multiplication (n^3)
Let A and B be nxn matrices. The naive matrix multiplication algorithm just follows the exact definition of matrix multiplication. AB = [AB(1) … A(Bn)] where B(i) is the ith column of B. To compute AB(i), we use n^2 multiplications since every entry in A will be multiplied by an entry in B(i) exactly once. We need to do this for each of AB(1) … AB(n), thus it will take n^3 time.

Strassen’s algorithm, (4.695*n^2.807)
The clever idea that Strassen’s algorithm employs is that there is a way to compute the product of two 2×2 matrices with 7 multiplications instead of the standard 2^3 = 8 multiplications of naive matrix multiplication. Let A and B be 2×2 matrices. Then we begin by computing:

M1 = (A[1,1]+A[2,2])(B[1,1]+B[2,2])
M2 = (A[2,1]+A[2,2])B[1,1]
M3 = A[1,1](B([1,2]-(B[2,2])
M4 = A[2,2](B[2,1]-B[1,1])
M5 = (A[1,1]+A[1,2])B[2,2]
M6 = (A[2,1]-A[1,1])(B[1,1]+B[1,2])
M7 = (A[1,2]-A[2,2])(B[2,1]+B[2,2])

If C=AB, then we have:

C[1,1] = M1+M4-M5+M7
C[1,2] = M3+M5
C[2,1] = M2+M4
C[2,2] = M1-M2+M3+M6

Since each of M1 … M7 requires exactly one multiplication, we have computed AB using 7 multiplications instead of 8. So at this point, you may wonder how this convoluted way to compute the product of two 2×2 matrices is applicable to general matrix multiplication. The idea is that given two nxn matrices A and B, if they are not (2^m)x(2^m) in dimension, then we just fill rows and columns with 0’s until they are. Then we break A and B into block matrices of dimension (2^(m-1))x(2^(m-1)). ie. The upper left block of A contains entries A[1,1] and A[2^(m-1),2^(m-1)] and every entry that is within the square block defined by those two entries. With these blocks, we can apply the clever idea of multiplying two 2×2 matrices except that A[1,1] now refers to the upper left block of A instead of just one entry. We apply the block division property recursively until the blocks become single entries and then carry out the multiplications. This results in a running time of about 4.695*n^2.807.

What happened to O(n^2.376)?
If you had the patience to read this entire blog post, you would recall that at the beginning, I said the fastest known algorithm was O(n^2.376). Well that is not entirely true. It is true that there is another algorithm, the Coppersmith–Winograd algorithm, which is O(n^2.376). You can read about it at

http://en.wikipedia.org/wiki/Coppersmith-Winograd_algorithm.

While it has better asymptotic complexity than Strassen’s algorithm, the big O notation hides a large constant factor (ie. the c2 in c1*n^2.376+c2) which makes it impractical in practice. Thus it is not technically the fastest, it just has the best asymptotic complexity so people refer to it as being the fastest.

Posted in Topics: Uncategorized

Jump down to leave a comment.

Leave a Comment

You must be logged in to post a comment.



* You can follow any responses to this entry through the RSS 2.0 feed.