Objectives of this task:
1) Plot a hierarchical sunburst (year -> product category -> product subcategory)
2) Show percentage labels with 1 or 2 d.p.
3) Use a continuous colour scale based on the total amount of sales

I initially used Plotly Express to create the sunburst, but I realised that the percentages shown in the chart do not sum to 100% (33 + 33 + 30 + 5 = 101%).
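One plausible explanation is plain rounding: labels rounded to whole percentages can add up to 101 even when the underlying shares add up to exactly 100. A minimal sketch, assuming hypothetical shares (the real values behind the chart are not given here):

```python
# Hypothetical shares in percent, chosen only to reproduce the 33/33/30/5 labels.
shares = [32.6, 32.6, 30.2, 4.6]

# Rounding each label to a whole number inflates the visible total.
rounded = [round(s) for s in shares]
print(rounded, sum(rounded))  # [33, 33, 30, 5] 101

# Formatting with one decimal place keeps the labels consistent with 100%.
one_dp = [f"{s:.1f}%" for s in shares]
print(one_dp)
```

This is why showing 1 or 2 decimal places in the labels removes the apparent discrepancy.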

Then I tried plotting the sunburst with Plotly graph objects (go). I first define a function that builds a dataframe, then plot the sunburst from the newly created df. The function works fine, but I do not know why the figure does not show up. I am stuck.

Function code:

import pandas as pd

levels = ['prod_subcat', 'prod_cat', 'year'] # levels used for the hierarchical chart
#color_columns = 'total_amt'
value_column = 'total_amt'

def build_hierarchical_dataframe(valid_trans, levels, value_column, color_column = None):
    """
    Build a hierarchy of levels for Sunburst or Treemap charts.

    Levels are given starting from the bottom to the top of the hierarchy,
    ie the last level corresponds to the root.
    """
    df_all_trees = pd.DataFrame(columns=['id', 'parent', 'value'])
    for i, level in enumerate(levels):
        df_tree = pd.DataFrame(columns=['id', 'parent', 'value'])
        dfg = valid_trans.groupby(levels[i:]).sum()
        dfg = dfg.reset_index()
        df_tree['id'] = dfg[level].copy()
        if i < len(levels) - 1:
            df_tree['parent'] = dfg[levels[i+1]].copy()
        else:
            df_tree['parent'] = 'total'
        df_tree['value'] = dfg[value_column]
        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
        df_all_trees = pd.concat([df_all_trees, df_tree], ignore_index=True)
    total = pd.Series(dict(id='total', parent='',
                              value=valid_trans[value_column].sum()))
    df_all_trees = pd.concat([df_all_trees, total.to_frame().T], ignore_index=True)
    return df_all_trees

Dataframe for plotting the sunburst: (screenshot of the DataFrame)

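Code for plotting the sunburst with graph objects:

import plotly.graph_objects as go

# build the hierarchy with the function above
df_all_trees = build_hierarchical_dataframe(valid_trans, levels, value_column)

fig = go.Figure()
fig.add_trace(go.Sunburst(
    labels=df_all_trees['id'],
    parents=df_all_trees['parent'],
    values=df_all_trees['value'],
    branchvalues='total',
    marker=dict(
        colorscale='RdBu'),
    hovertemplate='<b>%{label} </b><br> Percent: %{value:.2f}',
    maxdepth=2
    ))

fig.show()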

Result with graph objects: the figure is missing (nothing is rendered).

Code to create a subset of the dataframe for this task:

c_names = ['year','prod_cat','prod_subcat','total_amt']
var = {
    'year': [2011,2011,2011,2011,2011,2011,2012,2012,2012,2012,2012,2012,2012,2012,2012,2012,2013,2013,2013,2013,2013,2013,2014,2014], 
    'prod_cat': ['Bags','Books','Books','Clothing','Clothing','Home and kitchen','Books','Books','Clothing','Clothing','Electronics','Electronics','Footwear','Footwear','Home and kitchen','Home and kitchen','Books','Books','Clothing','Electronics','Home and kitchen','Home and kitchen','Bags','Bags'], 
    'prod_subcat': ['Mens','Academic','Fiction','Mens','Women','Furnishing','Non-Fiction','Non-Fiction','Kids','Women','Audio and video','Computers','Mens','Women','Furnishing','Kitchen','Academic','Non-Fiction','Women','Mobiles','Bath','Furnishing','Mens','Women'], 
    'total_amt': [3443.18,5922.8,1049.75,1602.25,6497.4,3287.375,6342.7,2243.15,4760.34,2124.915,5878.6,1264.12,433.16,287.3,1221.025,3867.5,2897.31,2400.06,285.09,5707.325,5585.775,2103.92,3391.245,281.775]
}

valid_trans = pd.DataFrame(data = var, columns = c_names)

1 Answer

To get percentages with 2 d.p., it's a simple case of updating the trace's texttemplate. You can use Plotly Express or graph objects. If you use graph objects, letting Plotly Express structure the inputs (ids, labels, parents, values) and passing them to go makes the code far simpler.

import plotly.express as px
# append 'prod_subcat' to path for the third level
pxfig = px.sunburst(valid_trans, path=['year', 'prod_cat'],
                    values='total_amt')

Percentages to 2 d.p.:

pxfig.update_layout(margin=dict(t=0, l=0, r=0, b=0)).update_traces(texttemplate="%{label}<br>%{percentEntry:.2%}")
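The part after the colon in `%{percentEntry:.2%}` is a d3-format specifier, and d3-format is modelled on Python's format mini-language, so Python's built-in `format` gives a rough preview of what `.2%` and `.1%` produce. A small sketch with an illustrative share of one third (not a value from the data):

```python
# Illustrative fraction only; Plotly supplies percentEntry as a fraction like this.
share = 1 / 3

two_dp = format(share, ".2%")  # multiply by 100, keep 2 decimals, append "%"
one_dp = format(share, ".1%")  # same, with 1 decimal

print(two_dp, one_dp)  # 33.33% 33.3%
```

So switching the template to `%{percentEntry:.1%}` should give 1 d.p. labels instead of 2.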


graph_objects

  • use structuring from plotly express

import plotly.graph_objects as go

# reuse the ids/labels/parents/values that plotly express already built
fig = go.Figure(go.Sunburst(
    ids=pxfig.data[0]["ids"],
    labels=pxfig.data[0]["labels"],
    parents=pxfig.data[0]["parents"],
    values=pxfig.data[0]["values"],
    branchvalues="total",
    texttemplate="%{label}<br>%{percentEntry:.2%}"
))
fig.update_layout(margin=dict(t=0, l=0, r=0, b=0))
fig.show()
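A likely reason the hand-built version stays blank is that it passes only labels, and labels such as 'Mens' repeat under several parents, so the hierarchy is ambiguous. Passing `ids` from Plotly Express avoids that because each node gets an id derived from its full path. If you would rather build the inputs yourself, here is a rough pure-Python sketch of the same idea. The function name `build_sunburst_nodes` and the "/"-joined id format are my own assumptions for illustration, and the three rows are taken from the question's data:

```python
from collections import defaultdict

def build_sunburst_nodes(rows, path, value_key):
    """Aggregate rows into sunburst nodes with unique, path-based ids.

    `path` lists the columns from the top level down to the leaf level.
    (Hypothetical helper, not part of Plotly.)
    """
    totals = defaultdict(float)
    for row in rows:
        # Every prefix of the path is one node; add the row's value to each.
        for depth in range(1, len(path) + 1):
            key = tuple(str(row[col]) for col in path[:depth])
            totals[key] += row[value_key]

    nodes = {"ids": [], "labels": [], "parents": [], "values": []}
    for key, value in totals.items():
        nodes["ids"].append("/".join(key))           # e.g. "2011/Bags/Mens"
        nodes["labels"].append(key[-1])              # e.g. "Mens"
        nodes["parents"].append("/".join(key[:-1]))  # "" for top-level nodes
        nodes["values"].append(value)
    return nodes

# Three rows from the question's data that share the label 'Mens'.
rows = [
    {"year": 2011, "prod_cat": "Bags", "prod_subcat": "Mens", "total_amt": 3443.18},
    {"year": 2011, "prod_cat": "Clothing", "prod_subcat": "Mens", "total_amt": 1602.25},
    {"year": 2014, "prod_cat": "Bags", "prod_subcat": "Mens", "total_amt": 3391.245},
]
nodes = build_sunburst_nodes(rows, ["year", "prod_cat", "prod_subcat"], "total_amt")
```

The four lists can then be passed as `ids`, `labels`, `parents` and `values` to `go.Sunburst` with `branchvalues="total"`, since every parent's value is the sum of its children.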