I'd like to apply a couple functions to a column but I want to apply some logic as to when I do this, in this case when another column has some NA's. To illustrate I'll add some NA to the iris
dataset and turn it into a data.table:
library(data.table)
irisdt <- iris
## Prep some example data
irisdt[irisdt$Sepal.Length < 5,]$Sepal.Length <- NA
irisdt[irisdt$Sepal.Width < 3,]$Sepal.Width <- NA
## Turn this into a data.table
irisdt <- as.data.table(iris)
If I wanted to apply max
to multiple columns I'd go like this:
## Apply a function to individual columns
irisdt[, lapply(.SD, max), .SDcols = c("Petal.Length", "Petal.Width")]
#> Petal.Length Petal.Width
#> 1: 6.9 2.5
In this case however I'd like to take out any row that isn't an NA in Sepal.Length
and then return max and min along with the name of the column I subset for NA's. Below is an ugly way of implementing this but hopefully illustrates what I am after:
## Here is what the table would look like
desired_table <- rbind(
irisdt[!is.na(Sepal.Length), .(max = max(Petal.Length), min = min(Petal.Length), var = "Sepal.Length")],
irisdt[!is.na(Sepal.Width), .(max = max(Petal.Length), min = min(Petal.Length), var = "Sepal.Width")]
)
desired_table
#> max min var
#> 1: 6.9 1.2 Sepal.Length
#> 2: 6.7 1.0 Sepal.Width
Created on 2020-01-14 by the reprex package (v0.3.0)
Any thoughts on how I might accomplish this?