
If the answer is yes, what would be a simple example to test this capability?

I have tried to use the multiprocessing capabilities of SFrame and implicit, but CPU utilization always stays below 10% on an n1-highmem-32 (32 vCPUs, 208 GB memory) instance.

import os
# Thread-count environment variables must be set before the libraries are imported
os.environ['OMP_NUM_THREADS'] = "25"

import sframe
sframe.set_runtime_config('GRAPHLAB_DEFAULT_NUM_PYLAMBDA_WORKERS', 25)

import implicit
# Factorize the training matrix into 2 latent factors
item_factors, user_factors = implicit.alternating_least_squares(train, 2)
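One way to check the "below 10% utilization" observation from inside a notebook is to compare process CPU time against wall-clock time around the call in question: a ratio near 1.0 means one core is busy, a ratio near N means N cores are. This is a minimal standard-library sketch; `cpu_utilization` and `single_core_work` are hypothetical helper names, and in practice you would pass the ALS call instead of the stand-in workload.

```python
import time

def cpu_utilization(fn, *args):
    # Ratio of process CPU time to wall time across the call:
    # ~1.0 means one core busy, ~N means N cores busy.
    wall0, cpu0 = time.perf_counter(), time.process_time()
    result = fn(*args)
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    return result, cpu / wall

def single_core_work(n):
    # Pure-Python loop: held to one core by the GIL, so the
    # ratio should come out close to 1.0.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    _, ratio = cpu_utilization(single_core_work, 5_000_000)
    print(f"cores busy: ~{ratio:.1f} (expect ~1.0 for single-threaded work)")
```

Running the same wrapper around the `implicit.alternating_least_squares(...)` call would show directly whether it is using more than one core.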

1 Answer


Sorry about the delay in answering. The Jupyter Python kernel itself is single-threaded. I am not certain about the specifics of the sframe library, but this is not an area where Datalab does anything special either way: we use the standard Python kernel in Jupyter. Perhaps you could also tag your question with sframe?
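Even though the kernel is single-threaded, code launched from it can still occupy multiple cores, for example via the standard `multiprocessing` module. The sketch below is a simple way to test that capability from a notebook, in the spirit of the original question; `burn` is a hypothetical CPU-bound stand-in, not part of any library discussed here.

```python
import multiprocessing as mp
import time

def burn(n):
    # CPU-bound busy loop: each worker keeps one core busy while it runs
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    n_workers = mp.cpu_count()
    start = time.time()
    # One worker process per CPU; watching utilization while this runs
    # should show all cores busy if the VM's CPUs are really available.
    with mp.Pool(n_workers) as pool:
        pool.map(burn, [2_000_000] * n_workers)
    print(f"{n_workers} workers done in {time.time() - start:.2f}s")
```

If utilization stays low even with this, the limitation is the environment rather than any particular library.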

We have seen some customers use an n-CPU VM for a team, so that separate kernels can run on separate CPUs. In general, though, high-memory options are a better bet than multi-CPU VMs for a single user.

Separately, we have released a beta refresh that will let you run Datalab locally with an option to run the kernel in GCE. If you are interested, please take a look at: https://cloud.google.com/datalab/docs/quickstarts/

Thanks,
Dinesh Kulkarni
Product Manager, Datalab & Cloud ML