I'm struggling with something very basic: sorting a data frame by a month-year time format (“%B-%y” in this case). My goal is to calculate various monthly statistics, starting with the sum.
The relevant part of the data frame looks like this (this step works as intended; I'm including it here to show where the problem could originate):
> tmp09
   Instrument AccountValue   monthYear   ExitTime
1         JPM         6997    april-07 2007-04-10
2         JPM         7261      mei-07 2007-05-29
3         JPM         7545     juli-07 2007-07-18
4         JPM         7614     juli-07 2007-07-19
5         JPM         7897 augustus-07 2007-08-22
10        JPM         7423 november-07 2007-11-02
11        KFT         6992      mei-07 2007-05-14
12        KFT         6944      mei-07 2007-05-21
13        KFT         7069     juli-07 2007-07-09
14        KFT         6919     juli-07 2007-07-16
# Order on the exit time, which corresponds with 'monthYear'
> tmp09.sorted <- tmp09[order(tmp09$ExitTime),]
> tmp09.sorted
   Instrument AccountValue   monthYear   ExitTime
1         JPM         6997    april-07 2007-04-10
11        KFT         6992      mei-07 2007-05-14
12        KFT         6944      mei-07 2007-05-21
2         JPM         7261      mei-07 2007-05-29
13        KFT         7069     juli-07 2007-07-09
14        KFT         6919     juli-07 2007-07-16
3         JPM         7545     juli-07 2007-07-18
4         JPM         7614     juli-07 2007-07-19
5         JPM         7897 augustus-07 2007-08-22
10        JPM         7423 november-07 2007-11-02
So far, so good: sorting on ExitTime works. The trouble starts when I calculate the totals per month and then try to sort that output:
# Calculate the total results per month
> Tmp09Totals <- tapply(tmp09.sorted$AccountValue, tmp09.sorted$monthYear, sum)
> Tmp09Totals <- data.frame(Tmp09Totals)
> Tmp09Totals
            Tmp09Totals
april-07           6997
augustus-07        7897
juli-07           29147
mei-07            21197
november-07        7423
The totals come out in alphabetical order of the month names instead of calendar order. How can I sort this output chronologically?
I've already tried (besides various attempts to convert monthYear to another date format): order, sort, sort.list, sort_df, reshape, and calculating the sum with tapply, lapply, sapply, and aggregate. Even rewriting the row names (numbering them from 1 to length(tmp09.sorted2$AccountValue)) didn't work. I also tried to give each month-year a different ID, based on what I learned in another question, but R had difficulty distinguishing between the various month-year values.
The correct order of this output would be april-07, mei-07, juli-07, augustus-07, november-07:
apr-07 6997
mei-07 21197
jul-07 29147
aug-07 7897
nov-07 7423
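One possible way to get that order, as a minimal sketch rather than a definitive answer: `tapply` returns its results in the order of the grouping factor's levels, and for character data those levels default to alphabetical order. If monthYear is turned into a factor whose levels follow their first appearance in the ExitTime-sorted data, the totals should come out chronologically. The data frame below is rebuilt from the rows shown above; storing ExitTime as a `Date` and monthYear as plain character strings are assumptions about the real data.

```r
# Rebuild the example rows shown in the question (assumed column types).
tmp09 <- data.frame(
  Instrument   = c("JPM", "JPM", "JPM", "JPM", "JPM", "JPM",
                   "KFT", "KFT", "KFT", "KFT"),
  AccountValue = c(6997, 7261, 7545, 7614, 7897, 7423,
                   6992, 6944, 7069, 6919),
  monthYear    = c("april-07", "mei-07", "juli-07", "juli-07",
                   "augustus-07", "november-07",
                   "mei-07", "mei-07", "juli-07", "juli-07"),
  ExitTime     = as.Date(c("2007-04-10", "2007-05-29", "2007-07-18",
                           "2007-07-19", "2007-08-22", "2007-11-02",
                           "2007-05-14", "2007-05-21", "2007-07-09",
                           "2007-07-16")),
  stringsAsFactors = FALSE
)

# Sort chronologically on the real date column.
tmp09.sorted <- tmp09[order(tmp09$ExitTime), ]

# Make monthYear a factor whose levels follow first appearance in the
# sorted data, i.e. calendar order instead of alphabetical order.
tmp09.sorted$monthYear <- factor(tmp09.sorted$monthYear,
                                 levels = unique(tmp09.sorted$monthYear))

# tapply reports groups in level order, so the totals are chronological.
Tmp09Totals <- tapply(tmp09.sorted$AccountValue, tmp09.sorted$monthYear, sum)
Tmp09Totals <- data.frame(Tmp09Totals)
print(Tmp09Totals)
```

This sidesteps parsing the Dutch month names altogether, since the ordering comes from ExitTime and not from the monthYear labels; it does rely on ExitTime and monthYear always agreeing on the month.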