Addressing strange plotting results using pandas and dates

Question

When I plot a time series with pandas using dates, both the curve and the dates along the x-axis are wrong: the data are plotted against dates that do not even appear in the dataframe.

This is for plotting multiple sensors with independent clocks and different sampling frequencies. I want to plot all sensors in the same figure for comparison.

I have tried sorting the dataframe in ascending order and setting the datetime column as the dataframe index, without effect. When I plot the data against the Timestamp column instead, the plot for each sensor looks fine.

Excerpt from a typical CSV file:

    Timestamp Date Clock DC3 HR DC4
    13 18.02.2019 08:24:00  19,12   61  3
    14 18.02.2019 08:26:00  19,12   38  0
    15 18.02.2019 08:28:00  19,12   52  0
    16 18.02.2019 08:30:00  19,12   230 2
    17 18.02.2019 08:32:00  19,12   32  3

The following code produces the problem for me:

    import pandas as pd
    from scipy.signal import savgol_filter

    columns = ['Timestamp', 'Date', 'Clock', 'DC3', 'HR', 'DC4']

    data = pd.read_csv('Exampledata.DAT',
                       sep=r'\s|\t',
                       header=19,
                       names=columns,
                       parse_dates=[['Date', 'Clock']],
                       engine='python')

    data['HR'] = savgol_filter(data['HR'], 201, 3)  # Smoothing

    ax = data.plot(x='Date_Clock', y='HR', label='Test')

The expected result should look like this, only with dates along the x-axis:

Imgur

The actual result is: Imgur

An example of a complete data file can be downloaded here: https://filesender.uninett.no/?s=download&token=ae8c71b5-2dcc-4fa9-977d-0fa315fedf45

How can this issue be addressed?

esvendsen · Accepted Answer · 2019-06-14T06:25:01

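This issue is resolved by not using parse_dates when loading the file and instead building the datetime column with an explicit format. Without a format, ambiguous dates such as 01.03.2019 are most likely read month-first, which would explain the dates that are not in the file. The datetime column can be created like this: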

    import pandas as pd
    from scipy.signal import savgol_filter

    columns = ['Timestamp', 'Date', 'Clock', 'DC3', 'HR', 'DC4']

    # Load Date and Clock as plain strings (no parse_dates)
    data = pd.read_csv('Exampledata.DAT',
                       sep=r'\s|\t',
                       header=19,
                       names=columns,
                       engine='python')

    # Concatenate the two strings and parse them as day.month.year
    data['Timestamp'] = pd.to_datetime(data['Date'] + data['Clock'],
                                       format='%d.%m.%Y%H:%M:%S')

    data['HR'] = savgol_filter(data['HR'], 201, 3)  # Smoothing

    ax = data.plot(x='Timestamp', y='HR', label='Test')

This creates the following plot:

Imgur

Which is the plot I want.
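
To illustrate why the explicit format matters, here is a minimal sketch. The two sample dates are hypothetical values in the same dd.mm.yyyy layout as the excerpt in the question, not rows from the actual file. It compares parsing with the format used above against parsing an ambiguous string with no format, and also shows dayfirst=True, which may be a workable alternative.

```python
import pandas as pd

# Hypothetical sample values in the same dd.mm.yyyy layout as the excerpt.
dates = pd.Series(['18.02.2019', '01.03.2019'])
clocks = pd.Series(['08:24:00', '08:26:00'])

# Explicit format, as in the answer: every row is read as day.month.year.
explicit = pd.to_datetime(dates + clocks, format='%d.%m.%Y%H:%M:%S')

# No format given: an ambiguous string is read month-first by default.
guessed = pd.to_datetime('01.03.2019 08:26:00')

# dayfirst=True is another way to state the day.month order.
day_first = pd.to_datetime('01.03.2019 08:26:00', dayfirst=True)

print(explicit.dt.strftime('%Y-%m-%d').tolist())  # ['2019-02-18', '2019-03-01']
print(guessed.strftime('%Y-%m-%d'))               # 2019-01-03
print(day_first.strftime('%Y-%m-%d'))             # 2019-03-01
```

A date like 18.02.2019 can only be read one way, since 18 is not a valid month, while 01.03.2019 is ambiguous. A file that mixes both kinds can therefore end up with some rows shifted to the wrong month unless the order is stated explicitly.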
