78
votes

Having read the k-means implementation in TensorFlow : http://learningtensorflow.com/lesson6/ and the one in scikit-learn : http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html I'm struggling to decide which implementation to use.

scikit-learn is installed as part of the TensorFlow Docker container, so I can use either implementation.

Reason to use scikit-learn :

scikit-learn contains less boilerplate than the TensorFlow implementation.
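To illustrate the boilerplate point, here is a minimal sketch of k-means with scikit-learn. The toy data, `n_clusters=2`, and `random_state=0` are assumptions chosen for illustration and are not part of the original question.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (hypothetical): two well-separated groups of 2-D points.
X = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [10.0, 10.0], [10.1, 10.2], [10.2, 10.1],
])

# Fit k-means; n_init is set explicitly because its default differs across versions.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

labels = model.labels_              # cluster index assigned to each row of X
centers = model.cluster_centers_    # one centroid per cluster, shape (2, 2)
```

The whole fit is a single call, and it runs on the CPU.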

Reason to use tensorflow :

If running on an Nvidia GPU, the algorithm will be run in parallel. I'm not sure if scikit-learn will utilise all available GPUs?

Reading https://www.quora.com/What-are-the-main-differences-between-TensorFlow-and-SciKit-Learn

TensorFlow is more low-level; basically, the Lego bricks that help you to implement machine learning algorithms whereas scikit-learn offers you off-the-shelf algorithms, e.g., algorithms for classification such as SVMs, Random Forests, Logistic Regression, and many, many more. TensorFlow really shines if you want to implement deep learning algorithms, since it allows you to take advantage of GPUs for more efficient training.

This statement reinforces my assertion that "scikit-learn contains less boilerplate than the TensorFlow implementation" but also suggests scikit-learn will not utilise all available GPUs?

2
You should clarify the question (title) for better reference. - Ivan De Paz Centeno
@IvanDePazCenteno please see title update - blue-sky
The classic scikit-learn lib is CPU-only, as indicated in the FAQs (edit: did not see this ref in the answer, sorry). (Also, every bit of sklearn code I checked is not ready for GPU.) - sascha

2 Answers

107
votes

TensorFlow only uses the GPU if it is built against CUDA and cuDNN. By default it does not use the GPU, especially if it is running inside Docker, unless you use nvidia-docker and an image with built-in GPU support.
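One way to check whether a given install actually sees a GPU is sketched below. It assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` is available; the helper name `visible_gpus` is hypothetical, and older releases expose this differently.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if none (or no TensorFlow)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    # Assumes TensorFlow >= 2.1; an empty list means a CPU-only build or no visible device.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print("GPUs visible to TensorFlow:", gpus)
```

An empty list inside a container usually means either the image is a CPU-only build or the GPU was not exposed to the container.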

Scikit-learn is not intended to be used as a deep-learning framework and it does not provide any GPU support.

Why is there no support for deep or reinforcement learning / Will there be support for deep or reinforcement learning in scikit-learn?

Deep learning and reinforcement learning both require a rich vocabulary to define an architecture, with deep learning additionally requiring GPUs for efficient computing. However, neither of these fit within the design constraints of scikit-learn; as a result, deep learning and reinforcement learning are currently out of scope for what scikit-learn seeks to achieve.

Extracted from http://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-support-for-deep-or-reinforcement-learning-in-scikit-learn

Will you add GPU support in scikit-learn?

No, or at least not in the near future. The main reason is that GPU support will introduce many software dependencies and introduce platform specific issues. scikit-learn is designed to be easy to install on a wide variety of platforms. Outside of neural networks, GPUs don’t play a large role in machine learning today, and much larger gains in speed can often be achieved by a careful choice of algorithms.

Extracted from http://scikit-learn.org/stable/faq.html#will-you-add-gpu-support

6
votes

I'm experimenting with a drop-in solution (h2o4gpu) to take advantage of GPU acceleration in particular for Kmeans:

try this:

from h2o4gpu.solvers import KMeans
#from sklearn.cluster import KMeans

As of now, version 0.3.2 still doesn't have .inertia_, but I think it's on their TODO list.
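If the same script has to run on machines with and without h2o4gpu, the import swap above can be made automatic. This is only a sketch: the helper names `pick_backend` and `load_kmeans` are hypothetical, and the preference order (GPU library first, scikit-learn as fallback) is an assumption.

```python
import importlib
import importlib.util

# (top-level package, submodule exposing KMeans), in order of preference.
# The module paths are the two imports shown above; the ordering is an assumption.
CANDIDATES = [
    ("h2o4gpu", "h2o4gpu.solvers"),
    ("sklearn", "sklearn.cluster"),
]


def pick_backend(candidates=CANDIDATES):
    """Return the submodule path of the first installed candidate, or None."""
    for package, submodule in candidates:
        # find_spec on a top-level name returns None when the package is absent.
        if importlib.util.find_spec(package) is not None:
            return submodule
    return None


def load_kmeans(candidates=CANDIDATES):
    """Import and return the KMeans class from the first available backend."""
    submodule = pick_backend(candidates)
    if submodule is None:
        raise ImportError("no KMeans backend is installed")
    return importlib.import_module(submodule).KMeans
```

Calling `KMeans = load_kmeans()` then gives whichever class is available. Since the two classes are not guaranteed to expose identical attributes (see the note on `.inertia_`), code using the result should stick to the shared subset.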

EDIT: Haven't tested yet, but scikit-cuda seems to be getting traction.

EDIT: RAPIDS is really the way to go here.