I am new to Bokeh and trying to figure out what ColumnDataSource does. It appears in many places, but I am unsure of its purpose and how it works. Can someone explain? Apologies if this is a silly question...
2 Answers
19 votes
ColumnDataSource is the object that stores the data for a Bokeh plot. You do not have to create one yourself: you can pass Python lists, dictionaries, pandas DataFrames, etc. directly to the glyph methods, and Bokeh will build a ColumnDataSource for you behind the scenes. For certain features, though, such as a hover tooltip that shows extra data fields when the mouse is over a glyph, the data has to live in a ColumnDataSource; otherwise the tooltip cannot look up those columns. Streaming data is another use.
You can create a ColumnDataSource from a dictionary or a pandas DataFrame and then pass it to the glyph methods through the source argument.
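As a minimal sketch of the two construction routes mentioned above (the column names x, y and label and the sample values are made up for illustration, and this assumes a reasonably recent Bokeh plus pandas are installed):

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical sample data: every column must have the same length.
raw = {"x": [1, 2, 3], "y": [4, 5, 6], "label": ["a", "b", "c"]}

# Route 1: build the source from a plain dictionary of columns.
source_from_dict = ColumnDataSource(data=raw)

# Route 2: build the source from a pandas DataFrame.
# Bokeh also adds the DataFrame index as an extra column.
source_from_df = ColumnDataSource(pd.DataFrame(raw))

# Glyph methods refer to columns by name and read them from the source.
plot = figure(title="ColumnDataSource sketch")
plot.scatter("x", "y", source=source_from_dict, size=10)
```

Either source can be handed to a glyph method. The "label" column is not drawn, but it stays available by name for things like hover tooltips.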
6 votes
This should work:
import pandas as pd
import bokeh.plotting as bp
from bokeh.models import HoverTool, DatetimeTickFormatter
# Create the base data
data_dict = {"Dates": ["2017-03-01",
                       "2017-03-02",
                       "2017-03-03",
                       "2017-03-04",
                       "2017-03-05",
                       "2017-03-06"],
             "Prices": [1, 2, 1, 2, 1, 2]}
# Turn it into a dataframe
data = pd.DataFrame(data_dict, columns = ['Dates', 'Prices'])
# Convert the date column to datetime, and create a ToolTipDates column
data['Dates'] = pd.to_datetime(data['Dates'])
data['ToolTipDates'] = data.Dates.map(lambda x: x.strftime("%b %d")) # Saves work with the tooltip later
# Create a ColumnDataSource object
mySource = bp.ColumnDataSource(data)
# Create your plot as a bokeh.figure object
myPlot = bp.figure(height=600,
                   width=800,
                   x_axis_type='datetime',
                   title='ColumnDataSource',
                   y_range=(0, 3))
# Format your x-axis as datetime.
myPlot.xaxis[0].formatter = DatetimeTickFormatter(days='%b %d')
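# Draw the glyphs on your plot object, identifying the source as your ColumnDataSource object.
# scatter() draws circle markers by default; newer Bokeh releases deprecate circle() with a size argument.
myPlot.scatter("Dates",
               "Prices",
               source=mySource,
               color='red',
               size=25)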
# Add your tooltips
myPlot.add_tools(HoverTool(tooltips=[("Dates", "@ToolTipDates"),
                                     ("Prices", "@Prices")]))
# Create an output file
bp.output_file('columnDataSource.html', title = 'ColumnDataSource')
bp.show(myPlot) # et voilà.
If you are familiar with pandas DataFrame objects, the ColumnDataSource is basically a simpler version of that. It is a collection of arrays of data (columns) that can be referred to by name. The actual internal structure is just that: a dictionary that maps strings to lists/arrays. It is the primary way that data is moved from Python to the BokehJS browser library. – bigreddot
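To make the "dictionary of named columns" description concrete, here is a small sketch that inspects a source's data attribute. The column names t and value and the numbers are hypothetical, chosen only for illustration:

```python
from bokeh.models import ColumnDataSource

# Hypothetical columns; each key is a column name, each value is a sequence.
source = ColumnDataSource(data={"t": [0, 1, 2], "value": [10.0, 11.5, 9.8]})

# The .data attribute is a dict-like mapping from column name to column values.
column_names = sorted(source.data.keys())
value_column = list(source.data["value"])

# Replacing the whole mapping swaps in new data;
# all columns should keep the same length as each other.
source.data = {"t": [0, 1, 2, 3], "value": [10.0, 11.5, 9.8, 12.1]}
row_count = len(source.data["t"])
```

Glyphs and tooltips then refer to these columns by their string names, for example "@value" in a hover tooltip.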