1
votes

I am currently working with futures dataset across asset classes - which involves xts objects with numeric and character inputs. I am applying merge() to align the dataset to consistent dates, however, merge() on a xts object with character inputs, as is the case with the below example, gives NAs. Is there a work around?

Below is the dput output of a sample xts object (underlying future contracts):

uContracts <- structure(c("SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA"), class = c("xts", "zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC", index = structure(c(946598400, 
946857600, 946944000, 947030400, 947116800, 947203200, 947462400, 
947548800, 947635200, 947721600, 947808000, 948153600, 948240000, 
948326400, 948412800, 948672000, 948758400, 948844800, 948931200, 
949017600), tzone = "UTC", tclass = "Date"), .Dim = c(20L, 3L
), .Dimnames = list(NULL, c("SP00-USA", "SP.1-USA", "SP.2-USA"
)))

Dput out of sample dates:

tW <- structure(c(10956, 10959, 10960, 10961, 10962, 10963, 10966, 
10967, 10968, 10969, 10970, 10973, 10974, 10975, 10976, 10977, 
10980, 10981, 10982, 10983), class = "Date")

I would like to format uContracts as per dates in tW, wherein any dates in tW that are not in uContracts, fill underlying contract name from the last available date. I am currently doing that as below:

adjContracts <- merge(uContracts, tW, fill = na.locf)

The above command works for numeric data (such as prices), but falters for character data. My current output is:

NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), .Dim = c(21L, 3L), .Dimnames = list(NULL, 
    c("SP00.USA", "SP.1.USA", "SP.2.USA")), index = structure(c(946598400, 
946857600, 946944000, 947030400, 947116800, 947203200, 947462400, 
947548800, 947635200, 947721600, 947808000, 948067200, 948153600, 
948240000, 948326400, 948412800, 948672000, 948758400, 948844800, 
948931200, 949017600), tzone = "UTC", tclass = "Date"), class = c("xts", 
"zoo"), .indexCLASS = "Date", .indexTZ = "UTC", tclass = "Date", tzone = "UTC")

From what I gather, xts is converting the underlying dataset to numeric (from character). I believe, Darren Cook (PS - I hope you are okay with the shoutout) in merge.xts not merging all data using R touched upon this issue, but I am not sure how to apply it here.

Any help would be appreciated.

Cheers, S

update: below is the expected output (note the difference in the index of the xts object):

structure(c("SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", "SPH00-USA", 
"SPH00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", "SPM00-USA", 
"SPM00-USA"), .Dim = c(20L, 3L), .Dimnames = list(NULL, c("SP00-USA", 
"SP.1-USA", "SP.2-USA")), index = structure(c(946598400, 946857600, 
946944000, 947030400, 947116800, 947203200, 947462400, 947548800, 
947635200, 947721600, 947808000, 948067200, 948153600, 948240000, 
948326400, 948412800, 948672000, 948758400, 948844800, 948931200
), tzone = "UTC", tclass = "Date"), class = c("xts", "zoo"), .indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC")
2
Can you show what will be your expected output ?Ronak Shah
I have now included that in my question, thanks!Shriv

2 Answers

0
votes

Maybe there is a way to solve this without converting the data but it is easier to do manipulation if you convert it into a dataframe. Using functions from tidyverse we can do the following

library(tidyverse)
library(xts)

as.data.frame(uContracts) %>%
   rownames_to_column('Date') %>%
   mutate(Date = as.Date(Date)) %>%
   right_join(tibble(Date = tW)) %>%
   fill(everything()) %>%
   column_to_rownames('Date') %>%
   as.xts()

#             SP00-USA    SP.1-USA    SP.2-USA   
#1999-12-31 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-03 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-04 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-05 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-06 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-07 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-10 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-11 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-12 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-13 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-14 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-17 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-18 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-19 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-20 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-21 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-24 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-25 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-26 "SPH00-USA" "SPH00-USA" "SPM00-USA"
#2000-01-27 "SPH00-USA" "SPH00-USA" "SPM00-USA"
0
votes

There are several problems with the code in the question:

  • tW was not converted to xts and if that is not done it assumes that the object represents data whereas in fact it represents the index
  • from the expected output it seems you want only the dates in tW to be in the output so the all= argument needs to be specified appropriately.

To do this note that zero width xts objects are supported so convert tW to an xts object as shown below and then merge uContracts with it. From the expected output shown in the question it appears that only the dates in tW should be in the result. In that case, use the all= argument shown below. (If instead all dates from both objects were wanted in the output then omit the all= argument since the default is all = TRUE which retains the dates of both objects.)

merge(uContracts, xts(, tW), all = c(FALSE, TRUE), fill = na.locf)