I am not aware of any plotting package that lets you create this plot in a straightforward way based on how your sample table is structured. One option could be to compute a start and an end variable and then create the plot like in the answers to this question, for example using the Altair Gantt chart like in this answer.
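As a rough sketch of that first option, the start and end of each period could be derived along these lines. The helper name state_intervals and the one-second sampling step are assumptions made for illustration; the function expects a Series of states indexed by timestamp, similar to the currState column of the sample data created further down.

```python
import pandas as pd

def state_intervals(states, step=pd.Timedelta(seconds=1)):
    """Collapse a time-indexed series of states into one row per run.

    Hypothetical helper: `step` is the assumed sampling interval, added to
    the last timestamp of each run so that consecutive intervals touch.
    """
    # Give each run of identical consecutive states its own id
    run_id = (states != states.shift()).cumsum()
    timestamps = states.index.to_series()
    intervals = pd.DataFrame({
        'state': states.groupby(run_id).first(),
        'start': timestamps.groupby(run_id).first(),
        'end': timestamps.groupby(run_id).last() + step,
    })
    return intervals.reset_index(drop=True)

# Toy data: six seconds of observations forming three runs
index = pd.Timestamp('2021-01-04 08:00:00') + pd.to_timedelta([0, 1, 2, 3, 4, 5], unit='s')
toy_states = pd.Series([1, 1, 2, 3, 3, 3], index=index)
intervals = state_intervals(toy_states)
print(intervals)
```

A table in this shape (one row per period with a start and an end) is what Gantt-style plotting functions typically expect.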
Here, I offer two solutions using matplotlib. By taking a look at the matplotlib gallery, I stumbled on the broken_barh plotting function, which provides a way to create a plot like the one you want. There are two main hurdles to overcome when using it:
- Deciding what unit to use for the x-axis and computing the xranges argument accordingly;
- Creating and formatting the x ticks and tick labels.
Let me first create a sample dataset that resembles yours. Note that you will need to adjust the color_dict to your codes:
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import matplotlib.pyplot as plt # v 3.3.2
import matplotlib.dates as mdates
gre = 1
yel_to_red = 2
red = 3
yel_to_gre = 4
color_dict = {1: 'green', 2: 'yellow', 3: 'red', 4: 'yellow'}
sec_g = 45
sec_yr = 3
sec_r = 90
sec_yg = 1
light_cycle = [gre, yel_to_red, red, yel_to_gre]
sec_cycle = [sec_g, sec_yr, sec_r, sec_yg]
ncycles = 3
sec_total = ncycles*sum(sec_cycle)
IntersectionId = 12345
# One state value per second, repeated over ncycles light cycles
currState = np.repeat(ncycles*light_cycle, repeats=ncycles*sec_cycle)
time_sec = pd.date_range(start='2021-01-04 08:00:00', freq='S', periods=sec_total)
df = pd.DataFrame(dict(IntersectionId = np.repeat(IntersectionId, repeats=sec_total),
                       currState = currState),
                  index = time_sec)
The broken_barh function takes the data as tuples: for each colored rectangle that makes up the horizontal bar, you provide the x coordinate of its left edge and its length along the x-axis, while the y coordinate of the bottom edge and the height are given once for the whole bar, like so:
xranges=[(x1_start, x1_length), (x2_start, x2_length), ... ], yranges=(y_all_start, y_all_width)
Note that yranges applies to all rectangles. The unit that is chosen for the x-axis determines how the data must be processed and how the x ticks and tick labels can be created. Here are two alternatives.
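Before looking at the two alternatives, the (start, length) convention can be made concrete with a small sketch that converts (start, end) pairs into the form expected by xranges. The helper name to_xranges and the numbers are made up for illustration; they mimic one light cycle of the sample data, in seconds.

```python
def to_xranges(intervals):
    """Convert (start, end) pairs into (start, length) pairs."""
    return [(start, end - start) for start, end in intervals]

# One light cycle in seconds: green 0-45, yellow 45-48, red 48-138, yellow 138-139
cycle = [(0, 45), (45, 48), (48, 138), (138, 139)]
xranges = to_xranges(cycle)
print(xranges)  # [(0, 45), (45, 3), (48, 90), (138, 1)]
```

A list like this can be passed as the first argument of ax.broken_barh, together with a single (y_start, height) tuple.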
Matplotlib broken_barh with matplotlib date number as x-axis scale
In this approach, the timestamps of the rows where the light changes are extracted and then converted to matplotlib date numbers. This makes it possible to use a matplotlib date tick locator and formatter. This approach of using the matplotlib date for the x-axis values to simplify tick formatting was inspired by this answer by ImportanceOfBeingErnest.
For both this solution and the next one, the code for getting the indices of light changes and computing the lengths of the periods is based on this answer by Jaime, thanks to the general idea provided by this Gist by alimanfoo.
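To see what this change-detection step produces, here is a minimal sketch on a toy array. The values are invented for illustration; the same comparison of each element with its predecessor is applied to the currState column in both solutions.

```python
import numpy as np

states = np.array([1, 1, 1, 2, 3, 3, 4, 1, 1])
# A run starts at position 0 and wherever a value differs from its predecessor
is_start = np.concatenate(([True], states[:-1] != states[1:]))
starts, = np.where(is_start)
# Each run ends where the next one starts; the last run ends at the array's end
lengths = np.diff(starts, append=states.size)
print(starts)   # [0 3 4 6 7]
print(lengths)  # [3 1 2 1 2]
```

The trailing comma in `starts, = np.where(...)` unpacks the one-element tuple that np.where returns for a 1-D input.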
# Find the row positions where the light state changes
states = np.array(df['currState'])
starts_indices, = np.where(np.concatenate(([True], states[:-1] != states[1:])))
# Add the last row position so that the final period has an end point
starts_end_indices = np.append(starts_indices, states.size-1)
# Convert the matching timestamps to matplotlib date numbers (unit: days)
starts_end_pydt = df.index[starts_end_indices].to_pydatetime()
starts_end_x = mdates.date2num(starts_end_pydt)
lengths = np.diff(starts_end_x)
# Extend the last period by one second so that it covers the final row
pydt_second = (max(starts_end_x) - min(starts_end_x))/starts_end_indices[-1]
lengths[-1] = lengths[-1] + pydt_second
xranges = [(start, length) for start, length in zip(starts_end_x, lengths)]
yranges = (0.75, 0.5)
colors = df['currState'].iloc[starts_end_indices[:-1]].map(color_dict)
# Draw the bar
fig, ax = plt.subplots(figsize=(10,2))
ax.broken_barh(xranges, yranges, facecolors=colors, zorder=2)
# Format the x-axis with a date tick locator and formatter
loc = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(loc)
formatter = mdates.AutoDateFormatter(loc)
formatter.scaled[1/(24.*60.)] = '%H:%M:%S' # adjust this according to time range
ax.xaxis.set_major_formatter(formatter)
# Show a single y tick labeled with the intersection id
ax.set_ylim(0, 2)
ax.set_yticks([1])
ax.set_yticklabels([df['IntersectionId'].iloc[0]])
plt.grid(axis='x', alpha=0.5, zorder=1)
plt.show()
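The one-second extension of the last bar in the code above relies on matplotlib date numbers being floats counted in days. The following sketch, with an arbitrarily chosen timestamp, illustrates this by measuring the difference between date numbers that are one day and one second apart.

```python
from datetime import datetime, timedelta
import matplotlib.dates as mdates

t0 = datetime(2021, 1, 4, 8, 0, 0)  # arbitrary timestamp chosen for illustration
# Date numbers are floats counted in days, so differences are fractions of a day
one_day = mdates.date2num(t0 + timedelta(days=1)) - mdates.date2num(t0)
one_second = mdates.date2num(t0 + timedelta(seconds=1)) - mdates.date2num(t0)
print(one_day)
print(one_second * 86400)  # close to 1, up to floating-point rounding
```

This is why the bar lengths in this approach are tiny fractions rather than whole numbers of seconds.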

Matplotlib broken_barh with seconds as x-axis scale
This approach takes advantage of the fact that the indices of the table can be used to compute the lights' durations in seconds. The downside is that this time the x ticks and tick labels must be created from scratch. The code is written so that labels automatically have a nice format depending on the total duration covered by the dataset. The only thing that needs adjusting is the number of ticks, as this depends on how wide the figure is.
The code used to automatically select an appropriate time step between ticks is based on this answer by kennytm. The datetime string format codes are listed here.
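As a small illustration of that selection step, the sketch below picks, from a list of round values, the step closest to the time range divided by the desired number of ticks. The function name pick_time_step is hypothetical; the candidate list matches the one used in the code of this section.

```python
def pick_time_step(trange, approx_nticks,
                   candidates=(15, 30, 60, 120, 180, 240, 300, 600,
                               900, 1800, 3600, 7200, 14400)):
    """Return the candidate step (in seconds) closest to trange // approx_nticks."""
    target = trange // approx_nticks
    return min(candidates, key=lambda step: abs(step - target))

# The sample data covers 417 rows, so the range between first and last row is 416 s
print(pick_time_step(416, 6))   # 60
print(pick_time_step(3600, 6))  # 600
```

For ranges much longer than the largest candidate allows, the largest step (14400 s) is returned, so the candidate list may need extending for multi-day data.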
# Find the row positions where the light state changes and the run lengths in rows (= seconds)
states = np.array(df['currState'])
starts_indices, = np.where(np.concatenate(([True], states[:-1] != states[1:])))
lengths = np.diff(starts_indices, append=states.size)
xranges = [(start, length) for start, length in zip(starts_indices, lengths)]
yranges = (0.75, 0.5)
colors = df['currState'].iloc[starts_indices].map(color_dict)
# Draw the bar
fig, ax = plt.subplots(figsize=(10,2))
ax.broken_barh(xranges, yranges, facecolors=colors, zorder=2)
# Pick a round time step between ticks based on the time range of the data
time = pd.DatetimeIndex(df.index).asi8 // 10**9 # time is in seconds
tmin = min(time)
tmax = max(time)
trange = tmax-tmin
approx_nticks = 6 # low number selected because figure width is only 10 inches
round_time_steps = [15, 30, 60, 120, 180, 240, 300, 600, 900, 1800, 3600, 7200, 14400]
time_step = min(round_time_steps, key=lambda x: abs(x - trange//approx_nticks))
# Create the x ticks (seconds since the first row) and their labels
timestamps = np.append(np.arange(tmin, tmax, time_step), tmax+1)
xticks = timestamps-tmin
ax.set_xticks(xticks)
fmt_time = '%H:%M:%S' if time_step <= 60 else '%H:%M'
xticklabels = [pd.to_datetime(ts, unit='s').strftime(fmt_time) for ts in timestamps]
ax.set_xticklabels(xticklabels)
# Show a single y tick labeled with the intersection id
ax.set_ylim(0, 2)
ax.set_yticks([1])
ax.set_yticklabels([df['IntersectionId'].iloc[0]])
plt.grid(axis='x', alpha=0.5, zorder=1)
plt.show()

Further documentation: to_datetime, to_pydatetime, strftime