Say that in my data (e.g. iris) I want to group a single variable, Sepal.Length, by Species and add rows around each group: a row at the top holding the group name ("setosa"), then the observations, then a row saying "END" once the setosa observations have ended, followed by two blank rows. After that the next group name, "versicolor", starts with its observations and its own "END" row, and so on. My real data has over 200 groups, and the observations are characters.

So far, I have tried this with dplyr:

iris %>%
  group_by(Species) %>%
  select(Sepal.Length) %>%
  add_row(.before=0,.after=0)

Needless to say, my add_row is not working. I have also tried bind_rows and mutate. Any suggestions would be greatly appreciated. I want my output to look like the following, which I will export as a txt file:

    setosa
    4.1
    5.1
    .
    .
    END
    <empty row1>
    <empty row2>
    versicolor
    5.1
    6.1
    .
    .
    END
    <empty row1>
    <empty row2>
1 Answer

You can do this using split to get a list of data frames, then imap_dfr, a handy function from purrr. imap_dfr maps over the list, passing each data frame and the name of its list entry to the function as arguments, and returns the results row-bound into a single data frame.
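To make the two arguments concrete, here is a minimal sketch on a toy list. The list pieces and the column names group and n_rows are made up for illustration and are not part of the question's data.

```r
library(purrr)
library(tibble)

# A hypothetical named list of two small data frames
pieces <- list(a = tibble(x = 1:2), b = tibble(x = 3L))

# imap_dfr calls the function once per element with (element, name)
# and row-binds the returned tibbles into one data frame
out <- imap_dfr(pieces, function(df, nm) {
  tibble(group = nm, n_rows = nrow(df))
})

out
```

The result should have one row per list entry, with the entry's name in group and its row count in n_rows.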

Try this:

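library(dplyr)  # also provides %>%, tibble() and bind_rows()
library(purrr)  # imap_dfr()

iris %>%
    select(Species, Sepal.Length) %>%
    split(.$Species) %>%
    imap_dfr(function(df, heading) {
        bind_rows(
            tibble(newcol = heading),                             # group name
            df %>% mutate(newcol = as.character(Sepal.Length)),   # observations
            tibble(newcol = "END"),                               # end marker
            tibble(newcol = c("", ""))                            # two blank rows
        )
    }) %>%
    select(newcol)  # keep only the output column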

Inside the mapping function, I made some dummy tibbles to hold the heading, the "END" line, and the two blank lines. Everything you want to keep goes in a new column with the uncreative name newcol, which is the only column your desired output seems to need.
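Since the goal is a txt file, one possible alternative is to skip the data frame and build the lines directly in base R. This is only a sketch: the names groups, lines and out_file are my own, and the temporary file stands in for whatever output path you actually want.

```r
# Split the values (as character) into one vector per species
groups <- split(as.character(iris$Sepal.Length), iris$Species)

# For each group: name, observations, "END", then two blank lines
lines <- unlist(
  lapply(names(groups), function(g) c(g, groups[[g]], "END", "", "")),
  use.names = FALSE
)

# Hypothetical output path; replace with your own file name
out_file <- tempfile(fileext = ".txt")
writeLines(lines, out_file)
```

If you prefer to keep the tibble from the pipeline above, passing its newcol column to writeLines should give the same kind of file.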