I have a vector within an R data frame which contains abbreviations for the months of the year in the form (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC), and I want to replace them with their numeric equivalents (1 to 12).

I came up with the following ideas, all of which give a vector filled with NA (not available) values.

replace(df$month, df$month == 'JAN', '01')

df$month <- if(df$month == "JAN") '01'

df$month <- match(df$month,month.abb) 

The first two only produce NA values where JAN was; the third one turns all months into NA values.

Any ideas why this isn't working, and how to get it to work?

1
You could just convert to a factor: df$month <- factor(df$month, levels = c("JAN", "FEB", "MAR", ...), labels = c(1, 2, 3, ...)) - Peter Diakumis
I think what you have should work if df$month is a string and not a factor already. - jraab
If x is your input vector then try match(x, toupper(month.abb)) - G. Grothendieck
@PeterDee your solution works and I get the sense of it - ca_san
At the R console try: ?toupper - G. Grothendieck

1 Answer


I'd be inclined to do this with merge.

MonthRef <- data.frame(month_number = 1:12,
                       month_abb = toupper(month.abb))
#* Make a data frame of random months
Months <- data.frame(month = sample(MonthRef$month_abb, 20, replace=TRUE))

merge(Months, MonthRef, by.x="month", by.y="month_abb")

It's a bit more typing, but it has the advantage that it will be very clear to me what I did when I come back to it in six months.
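For comparison, here is a minimal sketch of the match() route suggested in the comments. It also shows why the third attempt in the question returned only NA: month.abb holds "Jan", "Feb", ... in title case, and match() is case-sensitive, so "JAN" never matches until the lookup vector is upper-cased. The data frame below is made up for illustration; only the column name `month` comes from the question, and `month_number` and `month_padded` are hypothetical names.

```r
# Hypothetical example data; `df` and its `month` column mirror the question.
df <- data.frame(month = c("MAR", "JAN", "DEC", "JAN"),
                 stringsAsFactors = FALSE)

# month.abb is "Jan", "Feb", ...; upper-case it so it lines up with "JAN", "FEB", ...
month_number <- match(df$month, toupper(month.abb))
month_number
#> [1]  3  1 12  1

# Zero-padded strings, in case "01".."12" is wanted rather than integers.
month_padded <- sprintf("%02d", month_number)
month_padded
#> [1] "03" "01" "12" "01"
```

One difference worth knowing: match() keeps the rows in their original order, whereas merge() sorts the result on the key column by default (sort = TRUE), so the merged data frame will generally not be in the same row order as Months.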