Since we are working with fairly large matrices in the shadowbox assignment, I started to think about how to do matrix multiplication efficiently on large matrices just out of curiosity. I have heard in past classes that the fastest known algorithm to do this is O(n^2.376) but nothing else was said so I decided to investigate and make a blog post about it in case anyone else cares (probably not) =).
Lower Bound (2*n^2)
Let A and B be nxn matrices. No matter how clever our algorithm is, we will always have to examine every entry of A and B. Since each matrix has n^2 entries, a lower bound of any matrix multiplication algorithm is 2*n^2.
Naive Matrix Multiplication (n^3)
Let A and B be nxn matrices. The naive matrix multiplication algorithm just follows the exact definition of matrix multiplication. AB = [AB(1) … A(Bn)] where B(i) is the ith column of B. To compute AB(i), we use n^2 multiplications since every entry in A will be multiplied by an entry in B(i) exactly once. We need to do this for each of AB(1) … AB(n), thus it will take n^3 time.
Strassen’s algorithm, (4.695*n^2.807)
The clever idea that Strassen’s algorithm employs is that there is a way to compute the product of two 2×2 matrices with 7 multiplications instead of the standard 2^3 = 8 multiplications of naive matrix multiplication. Let A and B be 2×2 matrices. Then we begin by computing:
M1 = (A[1,1]+A[2,2])(B[1,1]+B[2,2])
M2 = (A[2,1]+A[2,2])B[1,1]
M3 = A[1,1](B([1,2]-(B[2,2])
M4 = A[2,2](B[2,1]-B[1,1])
M5 = (A[1,1]+A[1,2])B[2,2]
M6 = (A[2,1]-A[1,1])(B[1,1]+B[1,2])
M7 = (A[1,2]-A[2,2])(B[2,1]+B[2,2])
If C=AB, then we have:
C[1,1] = M1+M4-M5+M7
C[1,2] = M3+M5
C[2,1] = M2+M4
C[2,2] = M1-M2+M3+M6
Since each of M1 … M7 requires exactly one multiplication, we have computed AB using 7 multiplications instead of 8. So at this point, you may wonder how this convoluted way to compute the product of two 2×2 matrices is applicable to general matrix multiplication. The idea is that given two nxn matrices A and B, if they are not (2^m)x(2^m) in dimension, then we just fill rows and columns with 0’s until they are. Then we break A and B into block matrices of dimension (2^(m-1))x(2^(m-1)). ie. The upper left block of A contains entries A[1,1] and A[2^(m-1),2^(m-1)] and every entry that is within the square block defined by those two entries. With these blocks, we can apply the clever idea of multiplying two 2×2 matrices except that A[1,1] now refers to the upper left block of A instead of just one entry. We apply the block division property recursively until the blocks become single entries and then carry out the multiplications. This results in a running time of about 4.695*n^2.807.
What happened to O(n^2.376)?
If you had the patience to read this entire blog post, you would recall that at the beginning, I said the fastest known algorithm was O(n^2.376). Well that is not entirely true. It is true that there is another algorithm, the Coppersmith–Winograd algorithm, which is O(n^2.376). You can read about it at
http://en.wikipedia.org/wiki/Coppersmith-Winograd_algorithm.
While it has better asymptotic complexity than Strassen’s algorithm, the big O notation hides a large constant factor (ie. the c2 in c1*n^2.376+c2) which makes it impractical in practice. Thus it is not technically the fastest, it just has the best asymptotic complexity so people refer to it as being the fastest.






Leave a Comment
You must be logged in to post a comment.
* You can follow any responses to this entry through the RSS 2.0 feed.