I plan to build an embedding visualization in TensorBoard Projector from prepared vector data (not vectors trained by TensorFlow), working from a notebook in Google Cloud DataLab rather than uploading TSV files in a web browser.

I've tried the code provided in this tutorial:

import os
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

LOG_DIR='test_log'

# Create randomly initialized embedding weights which will be trained.
N = 10000 # Number of items (vocab size).
D = 200 # Dimensionality of the embedding.
embedding_var = tf.Variable(tf.random_normal([N,D]), name='word_embedding')

# Format: tensorflow/tensorboard/plugins/projector/projector_config.proto
config = projector.ProjectorConfig()

# You can add multiple embeddings. Here we add only one.
embedding = config.embeddings.add()
embedding.tensor_name = embedding_var.name
# Link this tensor to its metadata file (e.g. labels).
# embedding.metadata_path = os.path.join(LOG_DIR, 'metadata.tsv')

# Use the same LOG_DIR where you stored your checkpoint.
summary_writer = tf.summary.FileWriter(LOG_DIR)

# The next line writes a projector_config.pbtxt in the LOG_DIR. TensorBoard will
# read this file during startup.
projector.visualize_embeddings(summary_writer, config)

Here LOG_DIR is an empty folder in the same directory as the notebook file.

Since metadata is not required for embedding visualization, I didn't set embedding.metadata_path.

Then I ran the following code:

from google.datalab.ml import TensorBoard as tb
tb.start('test_log')

A new TensorBoard page opens, but it says:

No checkpoint was found.

when I switch to Projector view.

But as the code above shows, the data is created randomly, so there should not be any checkpoint file.

Furthermore, at the next stage I need to make an embedding visualization with my own vector data, which is not trained by TensorFlow and has no checkpoint file.

When using Projector in a web browser, only a TSV file of vector data is required; no checkpoint file is needed.
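
For reference, the TSV file that the browser-based Projector loads holds one vector per line, with the components separated by tabs and no header row. A minimal sketch of producing such a file with the Python standard library follows; the helper name `write_vectors_tsv`, the file name `vectors.tsv`, and the sample values are illustrative assumptions, not part of any TensorBoard API.

```python
import os
import tempfile


def write_vectors_tsv(vectors, path):
    """Write one vector per line, components separated by tabs, no header."""
    with open(path, "w") as f:
        for vec in vectors:
            f.write("\t".join(repr(float(x)) for x in vec) + "\n")


# Placeholder data: three 4-dimensional vectors.
vectors = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# A temporary directory stands in for wherever the file should live.
out_dir = tempfile.mkdtemp()
tsv_path = os.path.join(out_dir, "vectors.tsv")
write_vectors_tsv(vectors, tsv_path)
```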

So the question is: what is the correct way to make an embedding visualization in TensorBoard Projector from Google Cloud DataLab with only a dataset of vectors?

Thanks.

Does this work in Jupyter? (You can access Jupyter on GCP via AI Platform | Notebooks.) - Lak
@Lak I've just tried in JupyterLab in AI Hub, after running %tensorboard --logdir "test_log" I got "Launching TensorBoard..." then just "<IPython.lib.display.IFrame at 0x7f07e5887320>". It seems that TensorBoard is launched but I have no idea how to reveal it. - Gong Weigang
Make sure that the logdir is non-empty, and that you have loaded the tensorboard extension. See: tensorflow.org/tensorboard/r2/tensorboard_in_notebooks - Lak
@Lak I'm sure the logdir is non-empty. I tried the tutorial code at your link. When I use %tensorboard --logdir logs the result is the same as above. Then I use notebook.display(port=6006, height=1000), and it shows a big white block with nothing in it. It seems that TensorBoard is running (checked with notebook.list()) but it cannot be loaded or shown. - Gong Weigang

1 Answer

Looking at that tutorial, you also need to run the code that periodically saves checkpoints. The Projector reads the embedding values from those checkpoints, so they are the basis for what shows up in TensorBoard.

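# `session` is a tf.Session in which the variables have been initialized;
# `step` is the current training step (both come from your training loop).
saver = tf.train.Saver()
saver.save(session, os.path.join(LOG_DIR, "model.ckpt"), step)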
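
If there is no model to checkpoint at all, a possible alternative is to point the Projector at a TSV file directly. The `EmbeddingInfo` message in projector_config.proto has a `tensor_path` field for a TSV file of tensor values; when it is omitted, the tensor is read from the checkpoint. The sketch below writes a projector_config.pbtxt by hand under that assumption. Whether the TensorBoard version bundled with DataLab honors `tensor_path` is untested here, the path is assumed to be resolved relative to the log directory, and the helper name `write_projector_config`, the tensor name `my_embedding`, and the file name `vectors.tsv` are all hypothetical.

```python
import os
import tempfile


def write_projector_config(log_dir, tensor_name, tensor_path, metadata_path=None):
    """Write a projector_config.pbtxt that references a TSV file of vectors."""
    lines = [
        "embeddings {",
        '  tensor_name: "%s"' % tensor_name,
        '  tensor_path: "%s"' % tensor_path,
    ]
    if metadata_path:
        # Optional labels file, one row per vector.
        lines.append('  metadata_path: "%s"' % metadata_path)
    lines.append("}")

    config_path = os.path.join(log_dir, "projector_config.pbtxt")
    with open(config_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return config_path


# A temporary directory stands in for LOG_DIR; vectors.tsv is assumed
# to be placed in the same directory.
log_dir = tempfile.mkdtemp()
config_path = write_projector_config(log_dir, "my_embedding", "vectors.tsv")
```

TensorBoard would then be started on that directory as in the question, for example with tb.start on the same path.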