
I'm having trouble sorting the bars of a multicategory bar chart.

Here is a minimal example:

import pandas as pd
import plotly.graph_objects as go

data = [
    [0, "Born", 4, "Rhino"],  # commenting out this line also reverses the subcategory sorting
    [0, "Died", -1, "Rhino"],
    [1, "Born", 4, "Lion"],
    [1, "Died", -1, "Lion"],
    [2, "Born", 12, "Rhino"],
    [2, "Died", -5, "Lion"],
]
z_data = list(zip(*data))

df = pd.DataFrame({
    "tick": z_data[0],
    "category": z_data[1],
    "value": z_data[2],
    "type": z_data[3],
})
df = df.sort_values(by=['tick', 'category', 'value', 'type'])
print(df)
fig = go.Figure()
for t in df.type.unique():
    plot_df = df[df.type == t]
    fig.add_trace(go.Bar(
        x=[plot_df.tick, plot_df.category],
        y=abs(plot_df.value),
        name=t,
    ))
fig.update_layout({
    'barmode': 'stack',
    'xaxis': {
        'title_text': "Tick",
        'tickangle': -90,
    },
    'yaxis': {
        'title_text': "Value",
    },
})
fig.write_html(str("./diagram.html"))

[Two screenshots: the chart with the first data row left in ("uncommented") and with it commented out ("commented").]

As you can see, tick 2 is drawn before tick 1. This happens because 'Rhino' is the first entry in the type list, so its trace creates ticks 0 and 2; the Lion bars, with tick 1, are added afterwards. How can I sort the bars properly?
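The ordering described above can be reproduced without drawing anything. The following is only a simplified model, assuming the axis lists categories in the order they first appear as traces are added one per type; it is not Plotly's actual implementation, and the names `rows`, `trace_order`, and `axis_order` are made up for illustration.

```python
# Simplified model (an assumption, not Plotly's code): categories are placed
# on the axis in the order they first appear while traces are added.
rows = [
    (0, "Born", "Rhino"),
    (0, "Died", "Rhino"),
    (1, "Born", "Lion"),
    (1, "Died", "Lion"),
    (2, "Born", "Rhino"),
    (2, "Died", "Lion"),
]

# One trace per type, in order of first occurrence (like df.type.unique()).
trace_order = []
for _, _, kind in rows:
    if kind not in trace_order:
        trace_order.append(kind)

# Collect (tick, category) pairs trace by trace, keeping first appearances only.
axis_order = []
for kind in trace_order:
    for tick, category, row_kind in rows:
        if row_kind == kind and (tick, category) not in axis_order:
            axis_order.append((tick, category))

# Outer ticks in order of first appearance.
tick_order = list(dict.fromkeys(tick for tick, _ in axis_order))
print(tick_order)  # [0, 2, 1]
```

Under this model the Rhino trace contributes ticks 0 and 2 before the Lion trace contributes tick 1, which matches the chart.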

PS: 'barmode': 'stack' is on purpose, even though this test example does not need it.


1 Answer


I was able to fix the tick order, but not the Born/Died order. I plot row by row, one trace per row, so I need to play with showlegend to avoid duplicate legend entries.

Data

import pandas as pd
import plotly.graph_objects as go
data = [
    [0, "Born", 4, "Rhino"],  # commenting out this line also reverses the subcategory sorting
    [0, "Died", -1, "Rhino"],
    [1, "Born", 4, "Lion"],
    [1, "Died", -1, "Lion"],
    [2, "Born", 12, "Rhino"],
    [2, "Died", -5, "Lion"],
]
# you don't really need to zip here
df = pd.DataFrame(data, columns=["tick", "category", "value", "type"])
df["value"] = df["value"].abs()

Set color

With only two types, a hand-written mapping is enough. If you have more types, see the EDIT at the end of this answer and check the docs on color sequences.

color_diz = {"Rhino": "blue", "Lion": "red"}
df["color"] = df["type"].map(color_diz)

Show Legend

Here I want to show a legend entry only for the first occurrence of each type.

grp = df.groupby("type")\
        .apply(lambda x: x.index.min())\
        .reset_index(name="idx")

df = pd.merge(df, grp, on=["type"], how="left")

df["showlegend"] = df.index == df["idx"]
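As a possible shortcut, the same first-occurrence flag can be computed without the groupby and merge. This is a minimal sketch, assuming the same `data` as above and that the helper column `idx` is not needed elsewhere; it relies on pandas' `Series.duplicated`, which marks every repeat of a value after its first appearance.

```python
import pandas as pd

data = [
    [0, "Born", 4, "Rhino"],
    [0, "Died", -1, "Rhino"],
    [1, "Born", 4, "Lion"],
    [1, "Died", -1, "Lion"],
    [2, "Born", 12, "Rhino"],
    [2, "Died", -5, "Lion"],
]
df = pd.DataFrame(data, columns=["tick", "category", "value", "type"])

# duplicated() is False for the first row of each type and True for repeats,
# so negating it flags exactly one row per type.
df["showlegend"] = ~df["type"].duplicated()
print(df[["type", "showlegend"]])
```

This version does not create the intermediate `idx` column, so the printed frame would have one column fewer.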

Data to plot

print(df)
   tick category  value   type color  idx  showlegend
0     0     Born      4  Rhino  blue    0        True
1     0     Died      1  Rhino  blue    0       False
2     1     Born      4   Lion   red    2        True
3     1     Died      1   Lion   red    2       False
4     2     Born     12  Rhino  blue    0       False
5     2     Died      5   Lion   red    2       False

Plot

fig = go.Figure()
for i, row in df.iterrows():
    fig.add_trace(
        go.Bar(x=[[row["tick"]], [row["category"]]],
               y=[row["value"]],
               name=row["type"],
               marker_color=row["color"],
               showlegend=row["showlegend"],
               legendgroup=row["type"] # Fix legend
               ))
    
fig.update_layout({
    'barmode': 'stack',
    'xaxis': {
        'title_text': "Tick",
        'tickangle': -90,
    },
    'yaxis': {
        'title_text': "Value",
    },
})
fig.show()


EDIT

If you have more types, you could use the following trick.

First I generate some random types:

import string
import numpy as np
import pandas as pd
import plotly.express as px

df = pd.DataFrame({"type":np.random.choice(list(string.ascii_lowercase), 100)})

Then I pick a color sequence from the docs and put it in a dictionary keyed by position:

color_dict = {k:v for k,v in enumerate(px.colors.qualitative.Plotly)}

Then I put the unique types in a dataframe

df_col = pd.DataFrame({"type": df["type"].unique()})

and assign each one a color according to its index, cycling through the palette when there are more types than colors:

df_col["color"] = (df_col.index%len(color_dict)).map(color_dict)

Finally I merge the result back into the original df:

df = pd.merge(df, df_col, on=["type"], how="left")
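The same cycling assignment could also be written as a single type-to-color dictionary. This is only a sketch: `palette` is a short stand-in list (in the answer it would be `px.colors.qualitative.Plotly`), and the sample `type` values are invented for illustration.

```python
from itertools import cycle

import pandas as pd

# Stand-in palette (assumption); replace with px.colors.qualitative.Plotly.
palette = ["#636EFA", "#EF553B", "#00CC96"]

# Invented sample data with more distinct types than palette entries.
df = pd.DataFrame({"type": list("abcadbe")})

# Pair each unique type, in order of first appearance, with the next palette
# color; cycle() restarts the palette once it is exhausted.
color_by_type = dict(zip(df["type"].unique(), cycle(palette)))
df["color"] = df["type"].map(color_by_type)
print(df)
```

Mapping through a dictionary avoids the helper dataframe and the merge, at the cost of keeping the mapping in a separate variable.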