0
votes

I have a data frame with a column of dates stored as integers, e.g. 192606, 192607, etc. The values contain only the year and the month. I would like to convert them from integers to a date format so that I can plot them as a time series (ggplot).

I tried using lubridate, but I get an error message.

sss[,1]<-ymd(sss[,1])

EDIT:

Data can be found here: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

I am working with 25 Portfolios sorted on size and book-to-market

EDIT 2:

Here is the output of my data frame. I appreciate your fast help! Thanks

str(sss)
 num [1:1122, 1:5] 192607 192608 192609 192610 192611 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "Time" "Intercept" "Mkt" "smb" ...

Do those numbers come from an Excel worksheet? MATLAB? What is their origin? - Rui Barradas
The numbers come from a CSV file, yes, and are imported into RStudio and converted to a data frame. - TheRipper7000
What is the equivalent date of 192606? - Ronak Shah
It is the year 1926 and the month of June. - TheRipper7000

2 Answers

0
votes

See if this helps.
The problem seems to be that those numbers should be treated not as numbers but as character strings encoding dates in the format "YYYYMM". So, to coerce them to R class "Date", first paste a day ("01") onto each value, then coerce with as.Date.

sss <- matrix(
  c(192607, 192608, 192609, 192610, 192611, 192612, 192701, 192702, 
    192703, 192704, 192705, 192706, 192707, 192708, 192709, 192710, 
    192711, 192712, 192801, 192802, 192803, 192804, 192805, 192806, 
    192807), 
  ncol = 1)

d <- as.Date(paste0(sss[, 1], "01"), format = "%Y%m%d")
head(d)
#[1] "1926-07-01" "1926-08-01" "1926-09-01" "1926-10-01" "1926-11-01"
#[6] "1926-12-01"
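
To connect this to the plotting goal in the question, here is a minimal sketch of the same conversion applied to a small data frame and passed to ggplot. The column names "Time" and "Mkt" are taken from the str() output in the question; the Mkt values are made up for illustration, and the plot is only drawn if ggplot2 is installed.

```r
# Toy stand-in for the asker's matrix; the Mkt values are made up for illustration.
sss <- matrix(
  c(192607, 192608, 192609, 192610,
    0.5, 1.2, -0.3, 0.8),
  ncol = 2,
  dimnames = list(NULL, c("Time", "Mkt"))
)

# A matrix can hold only one type, so move to a data frame before adding dates.
df <- as.data.frame(sss)

# Append day "01" to each YYYYMM value and parse the result as a Date.
df$Time <- as.Date(paste0(df$Time, "01"), format = "%Y%m%d")

# Draw the line plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = Time, y = Mkt)) +
    ggplot2::geom_line()
  print(p)
}
```

Because Time is now a real Date column, ggplot should place the points on a continuous date axis rather than treating 192612 and 192701 as 89 units apart.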
0
votes

ymd() expects each value to contain a year, a month and a day, in that order. Your column holds only a year and a month (YYYYMM), so it breaks this rule and ymd() cannot parse it. Passing the raw numbers to as.Date() from base or as_date() from lubridate does not help either: a bare number is interpreted as a count of days since 1970-01-01, so the result is a valid but wrong date.

as.Date(192606)
"2497-05-03"
lubridate::as_date(192606)
"2497-05-03"

So the following runs on your data, but the resulting dates are wrong:

sss[,1] <- as_date(sss[,1])

The same applies if the data is not numeric and is converted first:

sss[,1] <- as_date(as.numeric(sss[,1]))

Since the format is YearMonth, a year-month class is a better fit. With zoo we can use:

library(zoo)
sss[,1] <- as.yearmon(as.character(sss[,1, drop = TRUE]), "%Y%m")

head(sss[,1])
# A tibble: 6 x 1
  Date     
  <yearmon>
1 Jul 1926 
2 Aug 1926 
3 Sep 1926 
4 Oct 1926 
5 Nov 1926 
6 Dec 1926 

If sss is a matrix, wrap as.character() around the as.yearmon() call. This is because all the data in a matrix must be of the same type. A yearmon value assigned into a numeric matrix loses its class and is stored as its underlying number, a fractional year such as 1926.5 for July 1926. This is why you get the values you do. Converting to character first keeps the readable labels, although it also turns every other column of the matrix into character.

sss[,1] <- as.character(as.yearmon(as.character(sss[,1]), "%Y%m"))

Although this works, you should probably stick with data frames when dealing with data like this.
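
A small sketch of why the data frame is the safer container. It uses base R's Date class instead of yearmon so that it runs without zoo, but the effect should be analogous; the matrix below is a made-up two-row stand-in for the asker's data.

```r
# Toy matrix shaped like the asker's data; the Mkt values are made up.
m <- matrix(c(192607, 192608, 0.5, 1.2), ncol = 2,
            dimnames = list(NULL, c("Time", "Mkt")))

# Parse YYYYMM as the first day of each month.
d <- as.Date(paste0(m[, "Time"], "01"), format = "%Y%m%d")

# Matrix: the Date class is dropped on assignment; only day counts are stored.
m_dates <- m
m_dates[, "Time"] <- d
is.numeric(m_dates)

# Data frame: each column keeps its own class.
df <- as.data.frame(m)
df$Time <- d
class(df$Time)
class(df$Mkt)
```

In the matrix the Time column ends up as plain numbers, while in the data frame it stays a Date next to the numeric Mkt column.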