
I have an mlr3 task built from a dataset like this:

Dataset "all"

all <- data.frame(v1 = c("a", "b"),
              v2 = c(1, 2),
              data = c("test", "train"))

library(mlr3)
task <- TaskClassif$new("loan", all, target = "v1")

How can I filter the task to the rows where the column "data" has the value "train"?

I tried task$filter(data == "train") and a lot of other combinations, but none of them work.

dput(task)
<environment>

str(task)
Classes 'TaskClassif', 'TaskSupervised', 'Task', 'R6' <TaskClassif:loan>
Can you show the dput of the example? - akrun
@akrun, do you mean this (see in question)? - Jānis
I guess it is in some environment. Can you show the code to create this data so that it can be reproduced (along with the packages used)? - akrun
@akrun, how about now? - Jānis
Please check the solution below. - akrun

1 Answer


There is an as.data.table method for Task objects:

methods(class = 'Task')
#[1] as_task       as_tasks      as.data.table

So we can convert the task to a data.table and use data.table's subsetting syntax:

library(data.table)
as.data.table(task)[data == 'train']
#   v1  data v2
#1:  b train  2

Or extract the data directly with task$data(), which returns a data.table by default:

task$data()[data == 'test']
#    v1 data v2
#1:  a test  1

Or create a new task from the filtered rows:

tasktrain <- TaskClassif$new("loantrain",
                             task$data()[data == 'train'],
                             target = "v1")
tasktrain$data()
#   v1  data v2
#1:  b train  2
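
As for task$filter() itself: it takes a vector of row ids rather than an expression on a column, which would explain why task$filter(data == "train") fails. A minimal sketch of that route, assuming the same toy data as in the question (the target is turned into a factor explicitly here, and the names split_col, train_ids and task_train are only illustrative):

```r
library(mlr3)

# Same toy data as in the question; the target is made a factor explicitly
# (an assumption, so the example does not depend on stringsAsFactors defaults).
all <- data.frame(v1 = factor(c("a", "b")),
                  v2 = c(1, 2),
                  data = c("test", "train"),
                  stringsAsFactors = FALSE)
task <- TaskClassif$new("loan", all, target = "v1")

# Look up the row ids whose "data" value is "train" ...
split_col <- task$data(cols = "data")$data
train_ids <- task$row_ids[split_col == "train"]

# ... and restrict the task to them. filter() modifies the task in place,
# so work on a clone if the unfiltered task is still needed.
task_train <- task$clone()$filter(train_ids)
task_train$data()
```

Note that "data" is still a feature of the filtered task, so it would probably need to be dropped (for example with task_train$select("v2")) before training.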