CPU Time for matrix matrix multiplication

Question

I am trying to decide wether several similar but independent problems should be dealt with simultaneously or sequentially (possibly in parallel on different computers). In order to decide, I need to compare the cpu times of the following operations :

time_1 is the time for computing X(with shape (n,p)) @ b (with shape (p,1)).
time_k is the time for computing X(with shape (n,p)) @ B (with shape (p,k)).

where X, b and B are random matrices. The difference between the two operations is the width of the second matrix.

Naively, we expect that time_k = k x time_1. With faster matrix multiplication algorithms (Strassen algorithm, Coppersmith–Winograd algorithm), time_k could be smaller than k x time_1 but the complexity of these algorithms remains much larger than what I observed in practice. Therefore my question is : How to explain the large difference in terms of cpu times for these two computations ?

The code I used is the following :

import time
import numpy as np
import matplotlib.pyplot as plt

p     = 100
width = np.concatenate([np.arange(1, 20), np.arange(20, 100, 10), np.arange(100, 4000, 100)]).astype(int)

mean_time = []
for nk, kk in enumerate(width):
    timings = []
    nb_tests = 10000 if kk <= 300 else 100
    for ni, ii in enumerate(range(nb_tests)):
        print('\r[', nk, '/', len(width), ', ',  ni, '/', nb_tests, ']', end = '')
        x     = np.random.randn(p).reshape((1, -1))
        coef  = np.random.randn(p, kk)
        d     = np.zeros((1, kk))
        start = time.time()
        d[:]  = x @ coef
        end   = time.time()
        timings.append(end - start)

    mean_time.append(np.mean(timings))

mean_time = np.array(mean_time)


fig, ax = plt.subplots(figsize =(14,8))
plt.plot(width, mean_time, label =  'mean(time\_k)')
plt.plot(width, width*mean_time[0], label = 'k*mean(time\_1)')
plt.legend()
plt.xlabel('k')
plt.ylabel('time (sec)')
plt.show()

Arthur Hv Arthur Hv · Accepted Answer · 2019-03-19T13:02:18

You aren't only timing multiplication operation. time.time() takes time to complete.

>>> print(time.time() - time.time())
-9.53674316406e-07

When multiplied by the number of tries (10000) then the number of instances it becomes significant overhead, for n=100 you are in fact comparing what is 1.000.000 calls to time.time() to 100 regular numpy array multiplications.

For quick benchmarking, Python provides a dedicated module that doesn't have this problem : see timeit

CPU Time for matrix matrix multiplication

2 Answers