19
votes

I am new to Bokeh and trying to figure out what ColumnDataSource does. It appears in many places, but I am uncertain of its purpose and how it works. Can someone explain? Apologies if this is a silly question...

2
If you are familiar with R or Pandas DataFrame objects, the ColumnDataSource is basically a simpler version of that. It is a collection of arrays of data (columns) that can be referred to by name. The actual internal structure is just that: a dictionary that maps strings to lists/arrays. It is the primary way that data is moved from Python to the BokehJS browser library. - bigreddot
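To make the "dictionary that maps strings to lists/arrays" description concrete, here is a minimal sketch. The column names `fruit` and `count` are made up for illustration; `data` and `column_names` are the attributes I would expect to inspect on a ColumnDataSource.

```python
from bokeh.models import ColumnDataSource

# Hypothetical columns: each key is a column name, each value is a
# sequence, and all sequences should have the same length.
source = ColumnDataSource(data={"fruit": ["apple", "pear"], "count": [3, 5]})

# The underlying storage behaves like a plain dict of columns.
names = source.column_names      # the column names
counts = source.data["count"]    # one column, looked up by name

print(names)
print(list(counts))
```

Glyphs and tooltips refer to these columns by name, which is why the names matter more than the variable holding the data.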

2 Answers

19
votes

ColumnDataSource is the object that stores the data for a Bokeh plot. You can skip creating one yourself and pass your data (Python lists, dictionaries, pandas DataFrames, etc.) directly, but certain features, such as a popup that shows data values when the user hovers over a glyph, need the extra columns to live in a ColumnDataSource; otherwise the popup cannot look them up. Streaming data is another use.
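As a rough illustration of the streaming use case, the sketch below appends rows to an existing source with `stream`. The column names and values are invented, and it runs outside a Bokeh server, so it only shows how the stored data changes, not a live plot update.

```python
from bokeh.models import ColumnDataSource

# Hypothetical starting data with two equal-length columns.
source = ColumnDataSource(data={"x": [1, 2, 3], "y": [10, 20, 30]})

# Append two new rows. The new data must supply every existing column.
# rollover=4 asks the source to keep only the newest 4 rows.
source.stream({"x": [4, 5], "y": [40, 50]}, rollover=4)

print(list(source.data["x"]))
print(list(source.data["y"]))
```

In a Bokeh server app or a notebook with a live handle, the same call is what would push the new rows to the browser.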

You can create a ColumnDataSource from dictionaries and pandas dataframes and then use the ColumnDataSource to create the glyphs.
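A small sketch of both routes, assuming a toy dataset with columns named `x` and `y` (not from the question). It builds one source from a dict and one from a DataFrame, then hands a source to a glyph method via `source=`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Toy data, invented for this example.
raw = {"x": [1, 2, 3], "y": [4, 5, 6]}

# Route 1: straight from a dictionary of columns.
source_from_dict = ColumnDataSource(data=raw)

# Route 2: from a pandas DataFrame. The DataFrame index is added
# as an extra column alongside "x" and "y".
source_from_df = ColumnDataSource(pd.DataFrame(raw))

# Glyph methods take column names plus the source that holds them.
plot = figure()
renderer = plot.scatter("x", "y", source=source_from_df, size=10)
```

Because the glyph only stores column names, several glyphs can share one source and stay in sync.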

6
votes

This should work:

import pandas as pd
import bokeh.plotting as bp
from bokeh.models import HoverTool, DatetimeTickFormatter

# Create the base data
data_dict = {"Dates":["2017-03-01",
                  "2017-03-02",
                  "2017-03-03",
                  "2017-03-04",
                  "2017-03-05",
                  "2017-03-06"],
             "Prices":[1, 2, 1, 2, 1, 2]}

# Turn it into a dataframe
data = pd.DataFrame(data_dict, columns = ['Dates', 'Prices'])

# Convert the date column to datetime, and create a ToolTipDates column
data['Dates'] = pd.to_datetime(data['Dates'])
data['ToolTipDates'] = data['Dates'].dt.strftime("%b %d")  # Pre-formatted strings save work with the tooltip later

# Create a ColumnDataSource object
mySource = bp.ColumnDataSource(data)

# Create your plot as a bokeh.figure object
myPlot = bp.figure(height = 600,
               width = 800,
               x_axis_type = 'datetime',
               title = 'ColumnDataSource',
               y_range=(0,3))

# Format your x-axis as datetime.
myPlot.xaxis[0].formatter = DatetimeTickFormatter(days='%b %d')

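# Draw the glyphs on your plot object, identifying the source as your ColumnDataSource object.
# scatter() draws circle markers by default; circle() with size is deprecated in recent Bokeh releases.
myPlot.scatter("Dates",
               "Prices",
               source=mySource,
               color='red',
               size=25)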

# Add your tooltips; "@name" looks up the column called name in the ColumnDataSource
myPlot.add_tools(HoverTool(tooltips=[("Dates", "@ToolTipDates"),
                                     ("Prices", "@Prices")]))

# Create an output file
bp.output_file('columnDataSource.html', title = 'ColumnDataSource')
bp.show(myPlot) # et voilà.