I've run a image processing script using tensorflow API. It turns out that the processing time decreased quickly when I set the for-loop outside the session running procedure. Could anyone tell me why? Is there any side-effects?
The original code:
with tf.Session() as sess:
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
for i in range(len(file_list)):
start = time.time()
image_crop, bboxs_crop = sess.run(crop_image(file_list[i], bboxs_list[i], sess))
print( 'Done image %d th in %d ms \n'% (i, ((time.time() - start)*1000)))
# image_crop, bboxs_crop, image_debug = sess.run(crop_image(file_list[i], bboxs_list[i], sess))
labels, bboxs = filter_bbox(labels_list[i], bboxs_crop)
# Image._show(Image.fromarray(np.asarray(image_crop)))
# Image._show(Image.fromarray(np.asarray(image_debug)))
save_image(image_crop, ntpath.basename(file_list[i]))
#save_desc_file(file_list[i], labels_list[i], bboxs_crop)
save_desc_file(file_list[i], labels, bboxs)
coord.request_stop()
coord.join(threads)
The code modified:
for i in range(len(file_list)):
with tf.Graph().as_default(), tf.Session() as sess:
start = time.time()
image_crop, bboxs_crop = sess.run(crop_image(file_list[i], bboxs_list[i], sess))
print( 'Done image %d th in %d ms \n'% (i, ((time.time() - start)*1000)))
labels, bboxs = filter_bbox(labels_list[i], bboxs_crop)
save_image(image_crop, ntpath.basename(file_list[i]))
save_desc_file(file_list[i], labels, bboxs)
The time cost in the original code would keep increasing from 200ms to even 20000ms. While after modified, the the logs messages indicate that there are more than one graph and tensorflow devices were created during running, why is that?
python random_crop_images_hongyuan.py I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: name: GeForce GT 730M major: 3 minor: 5 memoryClockRate (GHz) 0.758 pciBusID 0000:01:00.0 Total memory: 982.88MiB Free memory: 592.44MiB I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730M, pci bus id: 0000:01:00.0) Done image 3000 th in 317 ms
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730M, pci bus id: 0000:01:00.0) Done image 3001 th in 325 ms
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730M, pci bus id: 0000:01:00.0) Done image 3002 th in 312 ms
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730M, pci bus id: 0000:01:00.0) Done image 3003 th in 147 ms
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 730M, pci bus id: 0000:01:00.0) Done image 3004 th in 447 ms