
Language Benchmark

Version 8 Jan. 2004. This is a comparison of the performance of different programming languages. The benchmark problem is to calculate the sum of all entries of the square of an n x n matrix M, where M(i,j) = i + j + 0.5. The implementation in R is the shortest:

matrixmult <- function(n) {
  M <- array(0., dim=c(n,n));
  for (i in 1:n) {
    for (j in 1:n) {
      M[i,j] <- i+j+0.5;
    }
  }
  r <- 0;
  for (i in 1:n) {
    for (j in 1:n) {
      for (k in 1:n) {
        r <- r + M[i,k]*M[k,j];
      }
    }
  }
  r;
}
The matrix multiplication part of the algorithm needs n^3 "operations" (i.e. additions and multiplications) and is the same in all languages, except perhaps in the fast versions of the R and Maple implementations, where the builtin matrix multiplication is used, which may be implemented by a faster algorithm than O(n^3). The table shows the processor cycles needed per "operation", i.e. CPU frequency · total time / n^3, for different matrix sizes n.
The "fast" variants are meant to be very fast implementations of the matrix multiplication algorithm in the respective programming language. The fast implementations in C and C++ ensure that memory is accessed consecutively. The fast implementations in R and Maple use the builtin matrix multiplication of these languages. Already for small matrix sizes n the naive C and C++ implementations perform worse than the fast ones, and for large problems these naive implementations fall back almost to Java performance. The difference between C compilers (GNU gcc and Intel C++ Compiler Version 5.0.1 Build 010730D0) for the fast versions is negligible, but g++ 3.0.2 and Intel C++ seem to have problems with the fast C++ version.

Java is the fastest interpreter, even three times faster than the Glasgow Haskell Compiler ghc (which compiles to C and then uses gcc as a backend). All other interpreters seem to have more or less serious performance problems. Haskell has the potential to optimize the program in a way that no memory access at all is needed, and therefore to outperform all other languages, but the ghc version seems to use even more memory than the n^2 doubles require. The Bash implementation shows that Bash has a serious performance problem with large array sizes, because it scales worse than O(n^3).

Code sizes are given in characters without leading spaces. The implementation in Haskell is the longest; the one in R is the shortest (among those that don't use the highlevel builtin feature of matrix multiplication).

The benchmarks were performed on a dual Athlon MP 1.2 GHz machine with DDR-RAM (FSB 266 MHz). Full source (17 KB) of the benchmark. If you know other programming languages in which you can provide the source for the algorithm above, I'd be glad to hear from you ( mail@Franosch.org ) so that I can include these languages in the comparison. Martin Nilsson has contributed a Pike version of the benchmark.

Last modified 2008-04-17 by webmaster@Franosch.org.