1 vote

NVIDIA GPUs have at most 16 GB of memory, which limits large-model training. Model parallelism may require modifying the deep learning framework. Is it feasible to train TensorFlow models on Intel multi-core CPUs? Could you give some advice about the hardware configuration and the expected performance?

1

Probably an Intel many-core processor is better suited for this. You don't need a full-fledged CPU just for doing linear algebra; better to have a lot of basic cores. – Margaret Bloom

1 Answer

1 vote

You can try the Intel AI DevCloud, a cloud-hosted hardware and software platform available to developers, researchers, and startups to learn and get started on their Artificial Intelligence projects. It uses Intel® Xeon® Scalable Processors; each processor has 24 cores with 2-way hyper-threading and access to 96 GB of on-platform RAM.

Refer to the link below for more details.

https://ai.intel.com/devcloud/

You can access this platform for 30 days by registering at the following link.

https://software.intel.com/en-us/ai-academy/devcloud

You will receive a welcome mail containing your user name and password. Open the hyperlink in the welcome mail for details on how to connect to and use the DevCloud. To get the best performance on the DevCloud, set the parallelism threads and OpenMP settings (either inside the code or in the terminal) as below:

In the terminal (replace NUM_PARALLEL_EXEC_UNITS with the number of physical cores you want to use, e.g. 24):

export OMP_NUM_THREADS="NUM_PARALLEL_EXEC_UNITS"
export KMP_BLOCKTIME="0"
export KMP_SETTINGS="1"
export KMP_AFFINITY="granularity=fine,verbose,compact,1,0"
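If you prefer not to hard-code the core count, a small sketch can fill the placeholder in from nproc (note that nproc reports logical CPUs, so with 2-way hyper-threading halving it approximates the physical core count):

```shell
# Derive a thread count from the machine itself; nproc counts logical CPUs.
cores=$(nproc)
NUM_PARALLEL_EXEC_UNITS=$(( cores > 1 ? cores / 2 : 1 ))

export OMP_NUM_THREADS="$NUM_PARALLEL_EXEC_UNITS"
export KMP_BLOCKTIME="0"
export KMP_SETTINGS="1"
export KMP_AFFINITY="granularity=fine,verbose,compact,1,0"
```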

Inside code (again substituting your core count for NUM_PARALLEL_EXEC_UNITS):

import os

os.environ["OMP_NUM_THREADS"] = "NUM_PARALLEL_EXEC_UNITS"
os.environ["KMP_BLOCKTIME"] = "0"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"
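If you would rather detect the thread count programmatically than hard-code it, a minimal sketch using only the standard library could look like this (assuming the logical-core count from multiprocessing, halved for 2-way hyper-threading, is an acceptable stand-in for the physical core count):

```python
import multiprocessing
import os

# multiprocessing.cpu_count() reports logical cores; with 2-way
# hyper-threading, halving it approximates the physical core count.
num_parallel_exec_units = max(1, multiprocessing.cpu_count() // 2)

os.environ["OMP_NUM_THREADS"] = str(num_parallel_exec_units)
os.environ["KMP_BLOCKTIME"] = "0"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"
```

Set these variables before importing TensorFlow, since the MKL-backed runtime reads them when it is loaded.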

For more details regarding optimization, please refer to:

https://communities.intel.com/docs/DOC-112392

Hope this helps.