
Edit: The main issue turned out to be that the combination of CUDA Toolkit and NVidia driver versions shipped by the package installers didn't fit my hardware setup. Installing CUDA from the *.run file solved it.

I'm trying to install libgpuarray with pygpu for use with Theano on Xubuntu 16.04, as described here: http://deeplearning.net/software/libgpuarray/installation.html

I have a Lenovo W520 with a Quadro 1000M GPU, which has "Compute Capability 2.1" and is compatible with the CUDA Toolkit up to version 8 according to Wikipedia.

I've installed CUDA Toolkit 8.0.61-1 with the Debian (.deb) installer. The nbody simulation (step 4 in the link) runs fine. apt-show-versions cuda says cuda:amd64/unknown 8.0.61-1 upgradeable to 9.0.176-1.

nvidia-smi reports that driver version 384.90 is installed.

DEVICE=cuda0 python -c "import pygpu;pygpu.test()" in bash gives "GpuArrayException: GPU is too old for CUDA version".

I previously had CUDA Toolkit 9.0 installed before I realised it isn't compatible, then apt-get remove'd it before installing 8.0.

  • Is the previous installation 9.0 messing with something? How can I find out?
  • Or is this a bug in pygpu?
  • Any other suggestions?
The CUDA driver version contained in the 384.90 driver may be unhappy with your Fermi GPU. I suggest installing the latest 375.xx driver instead, and rerunning your test. The 384.90 driver officially supports Fermi GPUs, but CUDA 9 is incompatible with Fermi GPUs. Therefore if you want to run CUDA on a Fermi GPU, my recommendation is to use CUDA 8 as well as the driver branch associated with CUDA 8, i.e. 375.xx, as that is also compatible with Fermi. – Robert Crovella
I'd like to try it. apt-get install nvidia-375 installs 384 anyway. Then I downloaded the driver from nvidia.com/Download/Find.aspx, rebooted without the X server, and installed it despite the error "The distribution-provided pre-install script failed". After that, nvidia-smi didn't find a driver. So I installed CUDA Toolkit 8 again and ended up with driver version 384, with the same problem :-/ – ascripter
You probably won't be able to use a package manager method (apt-get) very easily to downgrade a driver underneath a CUDA toolkit. For work like this I usually recommend the runfile installer method. It appears you tried that, but if you've previously had a package manager driver install, you can't just use the runfile installer at that point. You have to clean out the old install first. Instructions for this are covered in the Linux install guide. If you can get things cleaned up, don't use the package manager method at all; use the CUDA 8 runfile installer. – Robert Crovella

1 Answer


Robert pointed me to the solution, which I'll describe in detail here. If you've got an older GPU and have problems installing drivers from a distribution package, here's how to install from the runfile.

1. Clean up previously installed clutter

    sudo apt-get remove --purge nvidia*
    sudo apt-get remove --purge cuda*
    sudo apt autoremove

This removes any previously installed packages associated with nvidia / cuda. According to this thread (askubuntu), ubuntu-desktop has a dependency on nvidia-common, so re-install it with sudo apt-get install ubuntu-desktop. This is not the case for Xubuntu 16.04.
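To double-check that the cleanup worked, you can query the package list again (plain dpkg/grep, nothing specific to CUDA):

    # Should print nothing (or only unrelated packages) after the purge
    dpkg -l | grep -Ei 'nvidia|cuda'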

2. Download the runfile driver

It can be found here: https://developer.nvidia.com/cuda-80-ga2-download-archive. Be sure to choose the installer type "runfile".

In my case this is cuda_8.0.61_375.26_linux.run, CUDA 8 with the compatible driver version 375.26, as Robert pointed out.
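Before running it, it doesn't hurt to verify the download (assuming the archive page lists an MD5 checksum for the file; compare it against this output):

    # Compare against the checksum listed on the download page
    md5sum cuda_8.0.61_375.26_linux.run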

3. Pre-installation instructions (for (x)ubuntu)

The NVidia guide's section 2. Pre-installation Actions tells you to make sure that you have (see the combined check after the list):

  • a CUDA-capable GPU: lspci | grep -i nvidia
  • a compatible distro: uname -m && cat /etc/*release
  • gcc installed: gcc --version
  • compatible kernel headers: sudo apt-get install linux-headers-$(uname -r)
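For convenience, here are the four checks from the list above as one copy-paste block:

    lspci | grep -i nvidia                           # CUDA-capable GPU present?
    uname -m && cat /etc/*release                    # architecture and distribution
    gcc --version                                    # compiler available?
    sudo apt-get install linux-headers-$(uname -r)   # matching kernel headers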

The NVidia guide's section 4. Runfile Installation comes up a bit short IMO, especially for Linux noobs like me. So here are the details:

4. Disable nouveau drivers

Edit or create /etc/modprobe.d/blacklist-nouveau.conf with the following content (a terminal one-liner for this is sketched below):

    blacklist nouveau
    options nouveau modeset=0
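If you prefer doing this straight from the terminal, the file can also be written in one line (a sketch; any editor works just as well):

    # Write the blacklist file without opening an editor
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf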

Then run sudo update-initramfs -u to rebuild the initramfs.

This did it for me. The solution described here (askubuntu) is more comprehensive in case you encounter problems.

5. Reboot into the command line

There are probably different methods for this. I did it by modifying grub: edit /etc/default/grub and add / change these two keys (after backing up your existing grub file):

    GRUB_CMDLINE_LINUX="nomodeset"
    GRUB_CMDLINE_LINUX_DEFAULT="quiet 3"

sudo update-grub

and then reboot. If anything should fail, booting in recovery mode will still work; you can then undo your changes to grub.
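Put together, the sequence I used looks roughly like this (nano and the grub.bak file name are just my own choices, not from any guide):

    sudo cp /etc/default/grub /etc/default/grub.bak   # back up the current config
    sudo nano /etc/default/grub                       # set the two keys shown above
    sudo update-grub                                  # regenerate the grub configuration
    sudo reboot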

6. Installing NVidia drivers + CUDA Toolkit

You should boot to the console now. First check that nouveau is really disabled via

lspci -nnk | grep -iA2 vga.

The output contains a line like kernel driver in use: *****, which shouldn't read nouveau.

Now cd to the path where you initially downloaded the runfile and run:

sudo sh cuda_8.0.61_375.26_linux.run

Afterwards, restore your previous grub setup and reboot. You should have a working NVidia installation, and the pygpu test shouldn't fail anymore (at least not because of the wrong CUDA version).
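Restoring grub just reverses step 5, assuming you kept the backup copy (grub.bak from the sketch above):

    sudo cp /etc/default/grub.bak /etc/default/grub   # put the original config back
    sudo update-grub
    sudo reboot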

7. Post-installation Actions
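The NVidia guide's post-installation section essentially amounts to making the toolkit visible in your environment. A minimal sketch, assuming the default install prefix /usr/local/cuda-8.0 (add these lines to ~/.bashrc):

    export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

In a new shell, nvcc --version should then report release 8.0.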

That should be the essentials. Let me know if it helped someone and I didn't write it down only for myself ^_^