
This should be simple, but I cannot figure it out:

I have a square matrix with integer values in each cell (the result of an all-vs-all distance calculation). I would like to subset the matrix based on the cell values, e.g. cell == 8, or cell <= 6, etc.

foo[1:5, 1:5]
                CASSLLAGAPEQFF CASSQVGLATGTQYF CASSSGTQYTQYF CASRITSGGKTQYF CATSDSRGKTQYF
CASSLLAGAPEQFF               0             999           999              8           999
CASSQVGLATGTQYF            999               0           999            999           999
CASSSGTQYTQYF              999             999             0            999             6
CASRITSGGKTQYF               8             999           999              0           999
CATSDSRGKTQYF              999             999             6            999             0

dput:

structure(c(0, 999, 999, 8, 999, 999, 0, 999, 999, 999, 999, 999, 0, 999, 6, 8, 999, 999, 0, 999, 999, 999, 6, 999, 0), .Dim = c(5L, 5L), .Dimnames = list(c("CASSLLAGAPEQFF", "CASSQVGLATGTQYF", "CASSSGTQYTQYF", "CASRITSGGKTQYF", "CATSDSRGKTQYF"), c("CASSLLAGAPEQFF", "CASSQVGLATGTQYF", "CASSSGTQYTQYF", "CASRITSGGKTQYF", "CATSDSRGKTQYF" )))

Expected result of cell == 8 would be a 2x2 matrix of

               CASSLLAGAPEQFF  CASRITSGGKTQYF
CASSLLAGAPEQFF 0                8
CASRITSGGKTQYF 8                0

The row and column names don't matter to the subsetting (but I want to keep the names). What is the most straightforward way to do that?

Thanks for your help!

Please use dput to show a small example of what you have. When you say "all vs all distance calculation" do you mean you've used dist and now have an object of class dist? - Roland
I've used stringdistmatrix() to create this object. - Scott Presnell
Please add dput(foo[1:5, 1:5]) to the question and also show the expected result. - Roland

1 Answer


You can replace the ==8 with any other filtering criterion.

foo[rowSums(foo==8)>0,colSums(foo==8)>0]
#               CASSLLAGAPEQFF CASRITSGGKTQYF
#CASSLLAGAPEQFF              0              8
#CASRITSGGKTQYF              8              0

rowSums(foo==8)>0 finds any row where at least one element of foo==8 is TRUE. colSums(foo==8)>0 does the same for each column.
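
Two edge cases are worth guarding against when swapping in other criteria. First, a condition such as foo <= 6 is also TRUE on the zero diagonal, so every row and column would be kept. Second, if only one row or column matches, R drops the result to a plain vector unless drop = FALSE is given. The following is a minimal sketch that wraps the same rowSums/colSums idea; subset_by_cells and its ignore_diag argument are hypothetical names made up for illustration, and the sketch assumes the self-distances on the diagonal should not count as matches.

```r
# Rebuild the 5x5 example from the question's dput output.
seqs <- c("CASSLLAGAPEQFF", "CASSQVGLATGTQYF", "CASSSGTQYTQYF",
          "CASRITSGGKTQYF", "CATSDSRGKTQYF")
foo <- structure(c(0, 999, 999, 8, 999, 999, 0, 999, 999, 999, 999, 999, 0,
                   999, 6, 8, 999, 999, 0, 999, 999, 999, 6, 999, 0),
                 .Dim = c(5L, 5L), .Dimnames = list(seqs, seqs))

# Hypothetical helper (not from any package): keep the rows and columns
# that contain at least one cell where the logical matrix `hit` is TRUE.
subset_by_cells <- function(m, hit, ignore_diag = TRUE) {
  stopifnot(is.matrix(m), is.logical(hit), identical(dim(m), dim(hit)))
  hit[is.na(hit)] <- FALSE             # treat missing comparisons as non-matches
  if (ignore_diag) diag(hit) <- FALSE  # assumption: self-distances do not count
  # drop = FALSE keeps the result a matrix (with names) even for a single match
  m[rowSums(hit) > 0, colSums(hit) > 0, drop = FALSE]
}

res8 <- subset_by_cells(foo, foo == 8)  # rows/columns involved in a distance of 8
res6 <- subset_by_cells(foo, foo <= 6)  # diagonal zeros are ignored here
```

With ignore_diag = FALSE, the foo <= 6 call would return the full 5x5 matrix, because each diagonal zero satisfies the condition.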