0
votes

I am trying to do a simple matrix multiplication using gpuArray in MATLAB. I am using an NVIDIA GeForce GTX 960M GPU with 4 GB of dedicated memory. The code is given below.

function gpuExample(A, B)
     tic
     C = A*B;    % matrix product on Client
     tC = toc;
     % copy A and B from Client to GPU
     a = gpuArray(A); b = gpuArray(B);
     tic
     c = a*b;    % matrix product on GPU
     wait(gpuDevice)   % GPU calls are asynchronous; wait so toc measures the full product
     tgpu = toc;
     tic
     CC = gather(c);   % copy data from GPU to Client
     tg = toc;

     disp(['Matrix multiply time on Client is ' num2str(tC)])
     disp(['Matrix multiply time on GPU is ' num2str(tgpu)])
     disp(['Time for gathering data from GPU back to Client is ' ...
            num2str(tg)])

     % Verify that GPU and Client computations agree
     tol = 1e-5;
     if any(abs(CC(:)-C(:)) > tol)   % (:) flattens so any() tests every element
         disp('Matrix product on Client and GPU disagree')
     else
         disp('Matrix product on Client and GPU agree')
     end
end

N=4000;
A=rand(N); 
B=rand(N);
gpuExample(A,B)

The code works fine for smaller matrices, but when both matrices are 4000x4000 the GPU crashes, and MATLAB crashes with it.

The output of gpuDevice is as follows:

gpuDevice

ans =

CUDADevice with properties:

                  Name: 'GeForce GTX 960M'
                 Index: 1
     ComputeCapability: '5.0'
        SupportsDouble: 1
         DriverVersion: 7.5000
        ToolkitVersion: 7.5000
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 4.2950e+09
   MultiprocessorCount: 5
          ClockRateKHz: 1176000
           ComputeMode: 'Default'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

Here is the crash report:

Warning: An unexpected error occurred during CUDA execution. The CUDA error was: CUDA_ERROR_LAUNCH_FAILED

As far as I can tell, this GPU should be able to multiply two 4000x4000 matrices. Why is it crashing?

2
And with smaller values of N it does not crash? - mpaskov
Yes, it works for smaller N. I tested up to N=3500 and it works fine. - Md Monjur Ul Hasan
Were you working in double precision? Would single precision be an option to try to increase the problem size you can work with on your GPU? - Alex Taylor
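The single-precision suggestion in the last comment can be tried with a minimal sketch like the one below. It halves the memory per matrix (4 bytes per element instead of 8), but whether that is enough to avoid the crash on this particular GPU is an assumption, not something confirmed in this thread.

```matlab
N = 4000;
A = rand(N, 'single');      % 4000*4000*4 bytes = 64 MB instead of 128 MB
B = rand(N, 'single');

a = gpuArray(A);            % gpuArray keeps the single class of its input
b = gpuArray(B);
c = a * b;                  % single-precision product on the GPU
C = gather(c);              % copy the result back to the host

disp(class(C))              % prints: single
```

Single precision gives roughly 7 significant decimal digits, so the 1e-5 tolerance used in the question's check would have to be loosened for entries of this size.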

2 Answers

0
votes

This isn't a proper answer, but Stack Overflow won't let me ask a question until I have a higher reputation.

I'm surprised not to see the property 'AvailableMemory' listed in the output from gpuDevice. What happens when you type

gpu = gpuDevice;
gpu.FreeMemory
gpu.AvailableMemory

These mobile chips sometimes behave oddly, and it's not uncommon for them not to report allocation failures. This is because of the way they share video memory with compute kernels. So to answer your question, this is almost certainly because your chip doesn't have the 500MB or so free that the computation needs: three 4000x4000 double matrices take 128 MB each, 384 MB in total, before any working space.
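As a rough check of that estimate, the sketch below works out the memory needed for the two inputs and the result and compares it with what the device reports. It counts only the three matrices; any extra workspace the multiplication uses is not included, and the property name AvailableMemory is taken from the commands above (it may be missing on older releases, where FreeMemory is the one to read).

```matlab
N = 4000;
bytesPerElement = 8;                          % double precision
bytesPerMatrix  = N * N * bytesPerElement;    % 128,000,000 bytes
neededBytes     = 3 * bytesPerMatrix;         % a, b and c all live on the GPU

fprintf('One matrix: %.0f MB, three matrices: %.0f MB\n', ...
        bytesPerMatrix / 1e6, neededBytes / 1e6);

% Compare with what the device reports (needs Parallel Computing Toolbox).
gpu = gpuDevice;
fprintf('Available on GPU: %.0f MB\n', gpu.AvailableMemory / 1e6);
if neededBytes > gpu.AvailableMemory
    disp('The three matrices alone do not fit in the available GPU memory')
end
```

The first line printed is "One matrix: 128 MB, three matrices: 384 MB"; the second depends on the machine.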

0
votes

I had the same problem and I have the same GPU.

gpu = gpuDevice;
gpu.FreeMemory
gpu.AvailableMemory

Running these commands gave me the same problem, and I had to restart MATLAB.

The solution I found is to work with smaller matrices, trading extra computation for lower memory use.
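One way to work with smaller matrices while still getting the full product is to multiply in column blocks, sketched below. This is an illustration, not necessarily what was done here: the block width blockCols is a hypothetical value to tune, and it is an assumption that smaller per-call products avoid the crash on this GPU.

```matlab
N = 4000;
A = rand(N);
B = rand(N);
blockCols = 500;                  % hypothetical block width; tune for your GPU

a = gpuArray(A);                  % A stays on the GPU for the whole loop
C = zeros(N);                     % the result is assembled on the host
for j = 1:blockCols:N
    cols = j:min(j + blockCols - 1, N);
    bBlock = gpuArray(B(:, cols));        % send one block of columns of B
    C(:, cols) = gather(a * bBlock);      % N-by-blockCols product, copied back
end
```

With these sizes the GPU holds A (128 MB) plus one block of B and one block of the result (16 MB each) at a time, at the cost of eight separate transfers and multiplications instead of one.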