
My goal here is to create x,y,z scatterplots in HoloViews, where the plots are produced with Datashader, with the points aggregated by minimizing over 'z', and with the points colored according to 'z'. Ultimately this is for doing things like producing profile likelihood plots.
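To make "aggregated by minimizing over 'z'" concrete, here is a minimal pure-Python sketch of the idea: bin the points into a pixel grid and keep the smallest z that lands in each cell. The function name `min_aggregate` and its arguments are made up for illustration; this is not how Datashader is implemented, only the behaviour I am expecting from `ds.min('z')`.

```python
import math

def min_aggregate(xs, ys, zs, x_range, y_range, width, height):
    """Bin points into a width x height grid, keeping the minimum z per cell.

    Cells that receive no points stay NaN.
    """
    grid = [[math.nan] * width for _ in range(height)]
    x0, x1 = x_range
    y0, y1 = y_range
    for x, y, z in zip(xs, ys, zs):
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # point lies outside the canvas
        # Map to a cell index; clamp so the upper edge falls in the last cell.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        if math.isnan(grid[row][col]) or z < grid[row][col]:
            grid[row][col] = z
    return grid

# Two points share the lower-left cell; the smaller z (1.5) wins.
grid = min_aggregate(
    xs=[0.1, 0.2, 0.9], ys=[0.1, 0.2, 0.9], zs=[3.0, 1.5, 4.0],
    x_range=(0.0, 1.0), y_range=(0.0, 1.0), width=2, height=2)
```

For a profile likelihood plot this is the point of the exercise: each pixel shows the best (lowest) value of z among all samples that project onto it.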

I made good progress generating plots with HoloViews + Datashader, even linking the plots in cool ways (see e.g. How to do linked data selections in HoloViews with Datashader + Bokeh backend). However, I can't figure out how to control the point colors and the aggregation method.

Below is some code (run in a Jupyter notebook) that does (almost) what I want in "plain vanilla" Datashader + Bokeh. How can I achieve the same thing via HoloViews, so that I can take advantage of the nice features in that package?

Note in particular that I want colors assigned to specific z values; I do not want them to be automatically normalised or anything similar. I tried to achieve this in the code below by setting the 'span' argument of the 'shade' function, but it doesn't quite work: when I zoom in on the plot, new green areas appear, which indicates that the absolute normalisation of the colors is not staying constant. Anyway, it should be close enough to illustrate what I am after.

import numpy as np
import pandas as pd
from bokeh.plotting import figure, output_notebook
import datashader as ds
from datashader.bokeh_ext import InteractiveImage
from datashader import transfer_functions as tf

output_notebook(hide_banner=True)

import matplotlib.colors as colors

#Define colormap
mn=0
mx=5
s0=0./(mx-mn)
s1=1./(mx-mn)
s2=2./(mx-mn)
s3=3./(mx-mn)
s4=4./(mx-mn)
s5=5./(mx-mn)

cdict = {
'red'  :  ((s0, 0., 0.), (s1, 1., 1.), (s2, 1., 1.), (s3, 1., 1.), (s4, .5, .5), (s5, .2, .2)),
'green':  ((s0, 1., 1.), (s1, 1., 1.), (s2, .5, .5), (s3, 0., 0.), (s4, 0., 0.), (s5, 0., 0.)),
'blue' :  ((s0, 0., 0.), (s1, 0., 0.), (s2, 0., 0.), (s3, 0., 0.), (s4, 0., 0.), (s5, 0., 0.))
}

chi2cmap = colors.LinearSegmentedColormap('chi2_colormap', cdict, 1024)
chi2cmap.set_bad('w',1.)


# Create some data to plot
x = np.arange(0,10,1e-2)
y = np.arange(0,10,1e-2)
X,Y = np.meshgrid(x,y)
x = X.flatten()
y = Y.flatten()
z = 5 * np.sin(x) * np.cos(y)

#------ Create pandas dataframe object from the data ------
print("Creating Pandas dataframe object")
df = pd.DataFrame.from_dict({"x": x, "y": y, "z": z})

# Create callback function for bokeh
def create_image(x_range, y_range, w, h):
    cvs = ds.Canvas(x_range=x_range, y_range=y_range, plot_width=200, plot_height=200)
    agg = cvs.points(df, 'x', 'y', ds.min('z'))
    img = tf.shade(agg, cmap=chi2cmap, how='linear', span=[mn,mx])
    #return tf.dynspread(img, threshold=0.9, max_px=10)
    return img

# Export image
#ds.utils.export_image(img, "test", fmt=".png", export_path=".", background="white")

# Interactive image via bokeh
p = figure(tools='pan,wheel_zoom,box_zoom,reset', background_fill_color="white",
           plot_width=500, plot_height=500, x_range=(np.min(x),np.max(x)), y_range=(np.min(y),np.max(y)))

p.axis.visible = False
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
InteractiveImage(p, create_image)

with output

enter image description here

1 Answer


OK, I seem to have succeeded with this, so here is what I came up with. The key thing is to create a new class derived from holoviews.operation.datashader.datashade, and to override the aggregator and cmap class attributes there:

class chi2_datashade(hvds.datashade):
    """Custom datashade class to do our projection and colormap"""
    aggregator = ds.min('z')
    cmap = chi2cmap
    normalization = 'linear'
    span = [mn,mx] # this requires https://github.com/ioam/holoviews/pull/1508 to work, which I just hacked in to holoviews for now

and then just use it as you would the original datashade class:

data = hv.Points(df)
chi2_datashade(data)

There was an issue with the span attribute: it didn't exist in datashade, so it wasn't passed through to the underlying datashader options. This will be fixed in an upcoming version, and it is easy to change in the source if you want to do it yourself (see https://github.com/ioam/holoviews/pull/1508).

There is in fact another issue, this time in datashader: it internally offsets the 'z' data by its minimum value, which changes the meaning of the 'span' parameter. I raised this issue with them, but it is also a fairly simple fix in the source if you want to do it yourself (see https://github.com/bokeh/datashader/issues/368).
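To illustrate why an internal offset changes what 'span' means, here is a small pure-Python sketch. The functions `normalize` and `normalize_after_offset` are hypothetical and only mimic the effect described above; they are not datashader's actual code, and the clipping to [0, 1] is my own assumption.

```python
def normalize(z, span):
    """Map z linearly onto [0, 1] over a fixed span, clipping values outside it."""
    lo, hi = span
    t = (z - lo) / (hi - lo)
    return max(0.0, min(1.0, t))

def normalize_after_offset(z, span, data_min):
    """Same mapping, but applied after subtracting the data minimum first."""
    return normalize(z - data_min, span)

span = (0.0, 5.0)
# Assumed minimum: z = 5*sin(x)*cos(y) in the example ranges over roughly [-5, 5].
data_min = -5.0

fixed = normalize(2.5, span)                            # z is mapped as intended
shifted = normalize_after_offset(2.5, span, data_min)   # z is treated as 7.5
```

With the fixed mapping, z = 2.5 sits in the middle of the colormap (0.5); after the offset it is treated as 7.5 and saturates at the top (1.0). So the same 'span' ends up assigning colors to different z values than the ones requested.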

Here's the complete example:

import numpy as np
import pandas as pd
import datashader as ds
import holoviews as hv
import holoviews.operation.datashader as hvds
import matplotlib.colors as colors
hv.notebook_extension('bokeh')

#Define colormap
mn=0
mx=5
s0=0./(mx-mn)
s1=1./(mx-mn)
s2=2./(mx-mn)
s3=3./(mx-mn)
s4=4./(mx-mn)
s5=5./(mx-mn)

cdict = {
'red'  :  ((s0, 0., 0.), (s1, 1., 1.), (s2, 1., 1.), (s3, 1., 1.), (s4, .5, .5), (s5, .2, .2)),
'green':  ((s0, 1., 1.), (s1, 1., 1.), (s2, .5, .5), (s3, 0., 0.), (s4, 0., 0.), (s5, 0., 0.)),
'blue' :  ((s0, 0., 0.), (s1, 0., 0.), (s2, 0., 0.), (s3, 0., 0.), (s4, 0., 0.), (s5, 0., 0.))
}

chi2cmap = colors.LinearSegmentedColormap('chi2_colormap', cdict, 1024)
chi2cmap.set_bad('w',1.)


# Create some data to plot
x = np.arange(0,10,1e-2)
y = np.arange(0,10,1e-2)
X,Y = np.meshgrid(x,y)
x = X.flatten()
y = Y.flatten()
z = 5 * np.sin(x) * np.cos(y)

#------ Create pandas dataframe object from the data ------
print("Creating Pandas dataframe object")
df = pd.DataFrame.from_dict({"x": x, "y": y, "z": z})

class chi2_datashade(hvds.datashade):
    """Custom datashade class to do our projection and colormap"""
    aggregator = ds.min('z')
    cmap = chi2cmap
    normalization = 'linear'
    span = [mn,mx] # this requires https://github.com/ioam/holoviews/pull/1508 to work, which I just hacked in to holoviews for now


data = hv.Points(df)
chi2_datashade(data)

which produces this image:

enter image description here

This is a bit different to the OP image, but it turns out that this is just due to the datashader bug I mentioned. Fixing that bug and re-running the OP code I get this output:

enter image description here

which matches well. It looks like HoloViews just cuts off the data outside the chosen 'span', or something similar, which is fine for my current needs.