Hi, I am new to R and was trying to convert a data frame into a time series object. After applying group_by on a column, the class of the result changes to "tbl_df" "tbl" "data.frame". I am also trying to create a subset data frame from an existing data frame, but it returns NULL. Finally, after converting the data frame into a time series object, the result is a ts matrix. Can you please let me know why these issues are happening?

I have tried all the basic operations, but I am missing the background needed to interpret the code I am using. Kindly help.

data <- read.csv("Time_Series_Data_Peak2.csv")
head(data)
class(data)
#Groupby
library(dplyr)
Dates_class = data %>% 
  group_by(Date) %>% 
  summarise(Dates_class= sum(Calls_Handled))
View(Dates_class)
head(Dates_class)
plot(Dates_class$Date,Dates_class$Dates_class)
lines(Dates_class$Date,Dates_class$Dates_class)
class(Dates_class)
Dates_class1 <- ts(Dates_class,start=c(2019,3),end=c(2019,5),frequency=1)

I want the data to be ready for checking stationarity.

Edit:

Sample data from the comment:

structure(list(Date = c("20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003"), Date2 = structure(c(17956, 17956, 17956, 17956, 17956, 17956), class = "Date"), Calls_Handled = c(30L, 43L, 36L, 28L, 32L, 23L)), row.names = c(NA, 6L), class = "data.frame") 
Add the result of dput(head(data)) to the question to make your example reproducible. - Robert
structure(list(Date = c("20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003"), Date2 = structure(c(17956, 17956, 17956, 17956, 17956, 17956), class = "Date"), Calls_Handled = c(30L, 43L, 36L, 28L, 32L, 23L)), row.names = c(NA, 6L), class = "data.frame") - YASH NAGRAJ
The Date and Date2 columns are different; is that what you expected? If not, there is a problem with how the file was read or how it was created. - Robert
No, they are basically the same. The Date column was causing problems, so I created a new date column (Date2) and it worked. Now I am trying to run it through the ADF test, but it always returns the error (Error in res.sum$coefficients[2, 1] : subscript out of bounds). Does it have anything to do with the dimensions of the data? - YASH NAGRAJ

1 Answer

Since there is apparently a problem with the Date column, I use Date2 instead. After summarise you get a data frame (a tibble), but the test needs a time series object, so the summed values are converted with timeSeries::as.timeSeries before calling tseries::adf.test. In the code I also renamed the summary variable to CallsH to make it clearer.

Dates_class = data %>% 
  group_by(Date2) %>% 
  summarise(CallsH= sum(Calls_Handled))
#View(Dates_class)
head(Dates_class)
plot(Dates_class$Date2,Dates_class$CallsH,type="l",col=3)
class(Dates_class)
dfts=timeSeries::as.timeSeries(Dates_class$CallsH,Dates_class$Date2)
tseries::adf.test(dfts, k = 10)
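
The difference between handing ts() the whole summarised table and handing it only the value column can be sketched in base R. This is a minimal sketch under assumptions: the data are simulated stand-ins for the CSV, aggregate() is swapped in for group_by() plus summarise() so that no packages are needed, and the names calls, daily, whole, calls_ts and short are hypothetical.

```r
# Simulated stand-in for the CSV in the question (an assumption, not real data).
set.seed(1)
days <- seq(as.Date("2019-03-01"), as.Date("2019-05-31"), by = "day")
calls <- data.frame(
  Date2 = rep(days, each = 6),
  Calls_Handled = sample(20:45, 6 * length(days), replace = TRUE)
)

# One row per day: a base R stand-in for group_by(Date2) + summarise(sum(...)).
daily <- aggregate(Calls_Handled ~ Date2, data = calls, FUN = sum)

# Passing the whole data frame turns every column into a series,
# so the result is a multi-column ("mts") matrix.
whole <- ts(daily)
class(whole)

# Passing only the numeric column gives a plain univariate series.
calls_ts <- ts(daily$Calls_Handled)
class(calls_ts)

# With frequency = 1, start = c(2019, 3) and end = c(2019, 5) span only
# three time points, so all but the first three values are dropped.
short <- ts(daily$Calls_Handled, start = c(2019, 3), end = c(2019, 5),
            frequency = 1)
length(short)
```

A tibble returned by summarise() still inherits from data.frame, so the "tbl_df" class is harmless by itself; as.data.frame() converts it back if a plain data frame is wanted.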

Edit: (the complete simulation)

data=structure(list(Date = c("20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003", "20/01/0003"), Date2 = structure(c(17956, 17956, 17956, 17956, 17956, 17956), class = "Date"), Calls_Handled = c(30L, 43L, 36L, 28L, 32L, 23L)), row.names = c(NA, 6L), class = "data.frame") 
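# Simulate six calls per day from 2019-03-02 to 2019-05-31 (values are random)
dts=seq.Date(from=as.Date("2019/03/02"),to=as.Date("2019/05/31"),by = "day")
ls=length(dts)
ch=runif(6*ls,15,34)
# dts (length ls) is recycled against ch (length 6*ls), giving six rows per day
data=rbind(data,data.frame(Date=dts,Date2=dts,Calls_Handled=ch))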
#data <- read.csv("Time_Series_Data_Peak2.csv")
head(data)
class(data)
#Groupby
library(dplyr)
Dates_class = data %>% 
  group_by(Date2) %>% 
  summarise(CallsH= sum(Calls_Handled))
#View(Dates_class)
str(Dates_class)
summary(Dates_class)
plot(Dates_class$Date2,Dates_class$CallsH,type="l",col=3)
#lines(Dates_class$Date2,Dates_class$Dates_class)
class(Dates_class)
dfts=timeSeries::as.timeSeries(Dates_class$CallsH,Dates_class$Date2)
tseries::adf.test(dfts, k = 10)
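
Before running the test on the real data, it may help to check how many daily observations remain after grouping. The sketch below uses only the six sample rows from the question and base R. The helper default_adf_lag is a hypothetical name; its formula is the default lag order documented for tseries::adf.test. Treating a too-short series as the cause of the "subscript out of bounds" error is an assumption, though a plausible one.

```r
# Hypothetical helper: the default lag order documented for tseries::adf.test.
default_adf_lag <- function(n) trunc((n - 1)^(1 / 3))

# The six sample rows from the question all share one date.
sample_rows <- data.frame(
  Date2 = as.Date(rep(17956, 6), origin = "1970-01-01"),
  Calls_Handled = c(30L, 43L, 36L, 28L, 32L, 23L)
)

# Grouping by date therefore leaves a single daily total.
daily <- aggregate(Calls_Handled ~ Date2, data = sample_rows, FUN = sum)
n_obs <- nrow(daily)
n_obs

# A regression on k lagged differences needs, at the very least,
# more observations than lags.
k <- 10
n_obs > k

# For a series of 92 daily totals the documented default lag is smaller than 10.
default_adf_lag(92)
```

If n_obs is not comfortably larger than k on the real data, either a smaller k or more days of data would be needed before the test can be fitted.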