2
votes

I'd like to reproduce this image from this link. But I got this weird result.

This should be fairly straightforward. I'd like to plot a time series from a data frame. It's not an xts object, just a simple data frame, and the date column is already recognized as Date. I'm not sure why ggplot is drawing these erratic zigzag lines instead of a single smooth line with geom_line.

Can anyone let me know why? Thanks in advance!

tutorial original source

This is the desired output: [image: desired output]

This is my output: [image: my output]

My reproducible code:

library(ggplot2)

# `data` is the data frame whose dput() output is shown below
colnames(data) <- c("date", "value")
data$date <- as.Date(data$date, "%m/%d/%Y")

ggplot(data, aes(x = date, y = value)) + geom_line()

The dput() output of the data is here:

structure(list(date = structure(c(13514, 13545, 13573, 13604, 
13634, 13665, 13695, 13726, 13757, 13787, 13818, 13848, 13879, 
13910, 13939, 13970, 14000, 14031, 14061, 14092, 14123, 14153, 
14184, 14214, 14245, 14276, 14304, 14335, 14365, 14396, 14426, 
14457, 14488, 14518, 14549, 14579, 14610, 14641, 14669, 14700, 
14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14975, 
15006, 15034, 15065, 15095, 15126, 15156, 15187, 15218, 15248, 
15279, 15309, 15340, 15371, 15400, 15431, 15461, 15492, 15522, 
15553, 15584, 15614, 15645, 15675, 15706, 15737, 15765, 15796, 
15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040, 16071, 
16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314, 16344, 
16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587, 16617, 
16648, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892, 
16922, 16953, 16983, 17014, 17045, 17075, 17106, 17136, 17167, 
17198, 17226, 17257, 17287, 17318, 17348, 17379, 17410, 17440, 
17471, 17501, 17532, 17563, 17591, 13514, 13545, 13573, 13604, 
13634, 13665, 13695, 13726, 13757, 13787, 13818, 13848, 13879, 
13910, 13939, 13970, 14000, 14031, 14061, 14092, 14123, 14153, 
14184, 14214, 14245, 14276, 14304, 14335, 14365, 14396, 14426, 
14457, 14488, 14518, 14549, 14579, 14610, 14641, 14669, 14700, 
14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14975, 
15006, 15034, 15065, 15095, 15126, 15156, 15187, 15218, 15248, 
15279, 15309, 15340, 15371, 15400, 15431, 15461, 15492, 15522, 
15553, 15584, 15614, 15645, 15675, 15706, 15737, 15765, 15796, 
15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040, 16071, 
16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314, 16344, 
16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587, 16617, 
16648, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892, 
16922, 16953, 16983, 17014, 17045, 17075, 17106, 17136, 17167, 
17198, 17226, 17257, 17287, 17318, 17348, 17379, 17410, 17440, 
17471, 17501, 17532, 17563, 17591, 13514, 13545, 13573, 13604, 
13634, 13665, 13695, 13726, 13757, 13787, 13818, 13848, 13879, 
13910, 13939, 13970, 14000, 14031, 14061, 14092, 14123, 14153, 
14184, 14214, 14245, 14276, 14304, 14335, 14365, 14396, 14426, 
14457, 14488, 14518, 14549, 14579, 14610, 14641, 14669, 14700, 
14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14975, 
15006, 15034, 15065, 15095, 15126, 15156, 15187, 15218, 15248, 
15279, 15309, 15340, 15371, 15400, 15431, 15461, 15492, 15522, 
15553, 15584, 15614, 15645, 15675, 15706, 15737, 15765, 15796, 
15826, 15857, 15887, 15918, 15949, 15979, 16010, 16040, 16071, 
16102, 16130, 16161, 16191, 16222, 16252, 16283, 16314, 16344, 
16375, 16405, 16436, 16467, 16495, 16526, 16556, 16587, 16617, 
16648, 16679, 16709, 16740, 16770, 16801, 16832, 16861, 16892, 
16922, 16953, 16983, 17014, 17045, 17075, 17106, 17136, 17167, 
17198, 17226, 17257, 17287, 17318, 17348, 17379, 17410, 17440, 
17471, 17501, 17532, 17563, 17591), class = "Date"), value = c(4.76, 
4.72, 4.56, 4.69, 4.75, 5.1, 5, 4.67, 4.52, 4.53, 4.15, 4.1, 
3.74, 3.74, 3.51, 3.68, 3.88, 4.1, 4.01, 3.89, 3.69, 3.81, 3.53, 
2.42, 2.52, 2.87, 2.82, 2.93, 3.29, 3.72, 3.56, 3.59, 3.4, 3.39, 
3.4, 3.59, 3.73, 3.69, 3.73, 3.85, 3.42, 3.2, 3.01, 2.7, 2.65, 
2.54, 2.76, 3.29, 3.39, 3.58, 3.41, 3.46, 3.17, 3, 3, 2.3, 1.98, 
2.15, 2.01, 1.98, 1.97, 1.97, 2.17, 2.05, 1.8, 1.62, 1.53, 1.68, 
1.72, 1.75, 1.65, 1.72, 1.91, 1.98, 1.96, 1.76, 1.93, 2.3, 2.58, 
2.74, 2.81, 2.62, 2.72, 2.9, 2.86, 2.71, 2.72, 2.71, 2.56, 2.6, 
2.54, 2.42, 2.53, 2.3, 2.33, 2.21, 1.88, 1.98, 2.04, 1.94, 2.2, 
2.36, 2.32, 2.17, 2.17, 2.07, 2.26, 2.24, 2.09, 1.78, 1.89, 1.81, 
1.81, 1.64, 1.5, 1.56, 1.63, 1.76, 2.14, 2.49, 2.43, 2.42, 2.48, 
2.3, 2.3, 2.19, 2.32, 2.21, 2.2, 2.36, 2.35, 2.4, 2.58, 2.86, 
2.84, 5.32, 5.31, 5.3, 5.31, 5.31, 5.33, 5.32, 5.49, 5.46, 5.08, 
4.97, 5.02, 3.84, 3.06, 2.79, 2.85, 2.66, 2.76, 2.79, 2.79, 3.59, 
4.32, 2.36, 1.77, 1.02, 1.16, 1.07, 0.89, 0.57, 0.39, 0.35, 0.3, 
0.25, 0.24, 0.21, 0.22, 0.2, 0.19, 0.23, 0.3, 0.45, 0.52, 0.41, 
0.32, 0.28, 0.27, 0.27, 0.3, 0.29, 0.28, 0.28, 0.23, 0.21, 0.22, 
0.24, 0.29, 0.33, 0.37, 0.41, 0.49, 0.4, 0.3, 0.29, 0.29, 0.29, 
0.32, 0.3, 0.26, 0.24, 0.23, 0.23, 0.24, 0.23, 0.22, 0.21, 0.2, 
0.2, 0.19, 0.14, 0.12, 0.11, 0.12, 0.12, 0.14, 0.12, 0.13, 0.12, 
0.12, 0.11, 0.11, 0.13, 0.13, 0.12, 0.12, 0.13, 0.15, 0.16, 0.15, 
0.14, 0.13, 0.15, 0.18, 0.19, 0.26, 0.27, 0.25, 0.3, 0.54, 0.57, 
0.54, 0.55, 0.55, 0.57, 0.55, 0.62, 0.73, 0.75, 0.72, 0.71, 0.87, 
0.9, 0.87, 0.98, 1.03, 1.05, 1.16, 1.22, 1.25, 1.25, 1.26, 1.32, 
1.54, 1.63, 1.78, 2.08, 5.25, 5.26, 5.26, 5.25, 5.25, 5.25, 5.26, 
5.02, 4.94, 4.76, 4.49, 4.24, 3.94, 2.98, 2.61, 2.28, 1.98, 2, 
2.01, 2, 1.81, 0.97, 0.39, 0.16, 0.15, 0.22, 0.18, 0.15, 0.18, 
0.21, 0.16, 0.16, 0.15, 0.12, 0.12, 0.12, 0.11, 0.13, 0.16, 0.2, 
0.2, 0.18, 0.18, 0.19, 0.19, 0.19, 0.19, 0.18, 0.17, 0.16, 0.14, 
0.1, 0.09, 0.09, 0.07, 0.1, 0.08, 0.07, 0.08, 0.07, 0.08, 0.1, 
0.13, 0.14, 0.16, 0.16, 0.16, 0.13, 0.14, 0.16, 0.16, 0.16, 0.14, 
0.15, 0.14, 0.15, 0.11, 0.09, 0.09, 0.08, 0.08, 0.09, 0.08, 0.09, 
0.07, 0.07, 0.08, 0.09, 0.09, 0.1, 0.09, 0.09, 0.09, 0.09, 0.09, 
0.12, 0.11, 0.11, 0.11, 0.12, 0.12, 0.13, 0.13, 0.14, 0.14, 0.12, 
0.12, 0.24, 0.34, 0.38, 0.36, 0.37, 0.37, 0.38, 0.39, 0.4, 0.4, 
0.4, 0.41, 0.54, 0.65, 0.66, 0.79, 0.9, 0.91, 1.04, 1.15, 1.16, 
1.15, 1.15, 1.16, 1.3, 1.41, 1.42, 1.51)), row.names = c(NA, 
-405L), class = "data.frame")
2
Check your data. What happens when you plot with geom_point instead? Why are you expecting it to ignore lots of data? - heds1
@heds1 Jesus, I am an idiot. Why does this data contain every date three times? I never looked at the whole dataset when I downloaded it; I only checked the head. - purpleblau

2 Answers

2
votes

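The lines are not erratic. Your data just contains multiple observations for every date (each date appears three times in the dput output), so geom_line has to connect several values at the same x position. The desired plot is aggregated by some measure, and it seems to be the maximum value for each date.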

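# Collapse to one row per date, keeping the maximum value
data <- aggregate(value ~ date, data = data, FUN = max)
# linewidth replaces size for lines in ggplot2 >= 3.4.0; use size = 1 on older versions
ggplot(data, aes(x = date, y = value)) + geom_line(color = "blue", linewidth = 1)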

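To see the duplication for yourself before aggregating, something like the following should work. This is only a minimal sketch on a made-up toy data frame (the name `toy` and its values are my own placeholders, not the real dataset): it counts rows per date and then collapses to one row per date with the maximum, as above.

```r
# Toy data standing in for the question's data frame (made-up values):
# three series stacked on top of each other, sharing the same monthly dates.
dates <- seq(as.Date("2007-01-01"), by = "month", length.out = 4)
toy <- data.frame(
  date  = rep(dates, times = 3),
  value = c(4.76, 4.72, 4.56, 4.69,   # series 1
            5.32, 5.31, 5.30, 5.31,   # series 2
            5.25, 5.26, 5.26, 5.25)   # series 3
)

# How many rows share each date? Anything above 1 means geom_line() has to
# connect several y values at the same x position.
rows_per_date <- table(toy$date)
print(rows_per_date)

# Collapse to one row per date, here using the maximum as in the answer.
toy_max <- aggregate(value ~ date, data = toy, FUN = max)
print(toy_max)
```

If every count in `rows_per_date` is greater than 1, the zigzag in the plot is just those extra rows being joined up. Whether the maximum is the right summary depends on what the repeated rows actually represent.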

1
votes

@purpleblau When I tried it, I got the same output as you! It turned out that $Subject contains three different interest rates. Applying filter(Subject == 'Long-term interest rates, Per cent per annum') solved the problem for me. Hope it helps.
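A rough sketch of that filtering step, in case it helps. Note the assumptions: the data frame `rates` and its values are made up, and only the long-term label is taken from above (the second label is a placeholder). I use base R's subset() here so the example runs without dplyr; it does the same job as filter() for this case.

```r
# Hypothetical long-format data with a Subject column. Only the long-term
# label comes from the answer; the other label and all values are made up.
dates <- seq(as.Date("2007-01-01"), by = "month", length.out = 3)
rates <- data.frame(
  date    = rep(dates, times = 2),
  Subject = rep(c("Long-term interest rates, Per cent per annum",
                  "Some other rate (hypothetical label)"), each = 3),
  value   = c(4.76, 4.72, 4.56, 5.32, 5.31, 5.30)
)

# Keep a single series, so that each date appears exactly once.
long_term <- subset(rates,
                    Subject == "Long-term interest rates, Per cent per annum")

# Plot only if ggplot2 is installed; one row per date gives a clean line.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long_term, ggplot2::aes(x = date, y = value)) +
    ggplot2::geom_line()
  print(p)
}
```

If you want all series in one plot instead of just one, mapping colour = Subject inside aes() on the unfiltered data should draw a separate line per series.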