Let's say I have a 10x3 matrix m in which I want to check a column (col2 in the example below) for zeros and for two consecutive zeros. I want to remove every row that has a zero in that column, and also remove all rows before or after a pair of consecutive zeros, starting from a certain point in the matrix.

      col1 col2 col3
[1,]    2    2    2
[2,]    2    2    2
[3,]    2    2    2
[4,]    2    2    2
[5,]    2    0    2
[6,]    2    2    2
[7,]    2    0    2
[8,]    2    0    2
[9,]    2    2    2
[10,]   2    2    2

dput= structure(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 0, 
0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), .Dim = c(10L, 3L), .Dimnames = list(
NULL, c("col1", "col2", "col3")))


expected result=     col1 col2 col3
                [1,]    2    2    2
                [2,]    2    2    2

Removing rows 1,2,3,4,5,6,7, and 8.

Please provide code with data (in a machine-readable format). - HubertL
@HubertL done so. - rapuu
Please use dput to export your data. - HubertL
And please also provide the expected resulting dataset. - HubertL
Done, hope it's all done correctly. - rapuu

1 Answer

I have written code that implements the following rules:

Rule A: Remove rows with a zero in any column

Rule B: Remove all rows before the last pair of consecutive zeros in any column

1 2 3 4 5 6 7 8 9 10 # Row Number
2 2 2 2 0 2 0 0 2 2  # Column 2
* * * * * * * * 2 2  # * = Remove
B B B B C B A A - -  # Rule Why Removed

Here C means that both A and B apply. If there were further rows after row 10 containing single (non-consecutive) zeros, they would be removed as well.
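The labels in the diagram can be reproduced for a single column with a short sketch. This is only an illustration of the two rules on the values of column 2; the variable names (`is.zero`, `pair.start`, `before.pair`, `why`) are made up for this example.

```r
# Column 2 of the example matrix
x <- c(2, 2, 2, 2, 0, 2, 0, 0, 2, 2)

# Rule A: rows that hold a zero
is.zero <- x == 0

# First row of each pair of consecutive zeros (row i and row i + 1 are both zero)
pair.start <- which(is.zero[-length(x)] & is.zero[-1])

# Rule B: rows before the last consecutive pair (no rows if there is no pair)
before.pair <- if (length(pair.start) > 0) {
  seq_along(x) < max(pair.start)
} else {
  rep(FALSE, length(x))
}

# Label each row: C = both rules, B = before the pair, A = zero, - = kept
why <- ifelse(is.zero & before.pair, "C",
       ifelse(before.pair, "B",
       ifelse(is.zero, "A", "-")))
paste(why, collapse = " ")
# [1] "B B B B C B A A - -"
```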

In this example, rows 1:8 are removed. Here is my approach:

dat <- structure(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 0, 
                  0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), .Dim = c(10L, 3L), .Dimnames = list(
                    NULL, c("col1", "col2", "col3")))
dat

ToRemove <- apply(dat, 2, function(colmn) {
  row.zeros <- which(colmn == 0) # rows with zeros
  if (length(row.zeros) > 0) { # if we found any
    # positions in row.zeros where a zero is directly followed by another zero
    doubles <- which(diff(row.zeros) == 1)
    # all rows before the last consecutive pair, or nothing if there is no pair
    # (max() is only called when a pair exists, since max(integer(0)) is -Inf)
    leftof.last.doubles <- if (length(doubles) > 0) {
      seq_len(row.zeros[max(doubles)] - 1)
    } else {
      NULL
    }
    # remove rows with zeros and all rows before the last consecutive pair
    unique(c(row.zeros, leftof.last.doubles))
  }
})

ToRemove
#$col1
#NULL
#
#$col2
#[1] 5 7 8 1 2 3 4 6
#
#$col3
#NULL

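# setdiff() keeps every row when nothing is flagged;
# dat[-unlist(ToRemove), ] would return zero rows in that case
dat[setdiff(seq_len(nrow(dat)), unlist(ToRemove)), , drop = FALSE]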
#     col1 col2 col3
#[1,]    2    2    2
#[2,]    2    2    2
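
The question also mentions removing rows after the consecutive zeros instead of before them. A minimal sketch of how the per-column logic could be given a switch for that is shown here. The helper name `rows.to.drop` and its `side` argument are hypothetical, and "after" is assumed to mean every row following the first pair of consecutive zeros; adjust that if a different pair is meant.

```r
# Example data from the question: zeros in col2 at rows 5, 7 and 8
dat <- matrix(2, nrow = 10, ncol = 3,
              dimnames = list(NULL, c("col1", "col2", "col3")))
dat[c(5, 7, 8), "col2"] <- 0

# Hypothetical helper (not part of the original answer):
# returns the row numbers to drop for a single column.
rows.to.drop <- function(colmn, side = c("before", "after")) {
  side <- match.arg(side)
  row.zeros <- which(colmn == 0)               # Rule A: rows with a zero
  doubles <- which(diff(row.zeros) == 1)       # zeros directly followed by a zero
  if (length(doubles) == 0) return(row.zeros)  # no pair: Rule A only
  rows <- seq_along(colmn)
  extra <- if (side == "before") {
    rows[rows < row.zeros[max(doubles)]]       # rows before the last pair
  } else {
    rows[rows > row.zeros[min(doubles)] + 1]   # assumption: rows after the first pair
  }
  sort(unique(c(row.zeros, extra)))
}

drop.before <- rows.to.drop(dat[, "col2"], "before")  # rows 1 to 8
drop.after  <- rows.to.drop(dat[, "col2"], "after")   # rows 5, 7, 8, 9, 10

# Index by the rows to keep, so an empty drop set leaves the matrix untouched
dat[setdiff(seq_len(nrow(dat)), drop.before), , drop = FALSE]
```

With `side = "after"`, the same indexing keeps rows 1, 2, 3, 4 and 6 of the example matrix.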