by henrymorco | Aug 7, 2015 | SuanShu

Introduction Matrix multiplication occupies a central role in scientific computing with an extremely wide range of applications. Many numerical procedures in linear algebra (e.g. solving linear systems, matrix inversion, factorizations, determinants) can essentially be reduced to matrix multiplication [5, 3]. Hence, there is great interest in investigating fast matrix multiplication algorithms, to accelerate matrix multiplication (and other numerical procedures in turn). SuanShu was already the fastest in matrix multiplication and hence linear algebra per our benchmark. SuanShu v3.0.0 benchmark Starting version 3.3.0, SuanShu has implemented an advanced algorithm for even faster matrix multiplication. It makes some operations 100x times faster those of our competitors! The new benchmark can be found here: SuanShu v3.3.0 benchmark In this article, we briefly describe our implementation of a matrix multiplication algorithm that dramatically accelerates dense matrix-matrix multiplication compared to the classical IJK algorithm. Parallel IJK We first describe the method against which our new algorithm is compared against, IJK. Here is the algorithm performing multiplication for is , is , and is : for (i = 1; i < = m; i ++){ for (j = 1; j <= p; j ++){ for (k = 1; k <= n; k ++){ C[i,k] += A[i,j] * B[j,k]; } } } 1234567 for (i = 1; i < = m; i ++){ for (j = 1; j <= p; j ++){ for (k = 1; k <= n; k ++){ C[i,k] += A[i,j] * B[j,k]; } }} In Suanshu, this is implemented in parallel; the outermost loop is passed to a ParallelExecutor . As there are often more rows than threads available, the time complexity of this parallelized IJK is still roughly the same as IJK: ,...
## Recent Comments