max.col works for the maximum position while the order method below works for any number of maximal positions: we order the index matrix to descreasing order and pick the wanted count of largest positions. Similar to max.col, we could use which.max function
> which.max(c(100,1:10))
[1] 1
> max.col(c(100,1:10))
[1] 1 1 1 1 1 1 1 1 1 1 1
and repeatedly finding the maximal position until not needed. We will next concentrate on the order method. Alternative A shows a case where the value matrix and the index matrix are equal. Alternative B shows a case where the value matrix and the index matrix are not equal.
Alternative A showing a case with index matrix equal to the value matrix.
> set.seed(123); r<-10; c<-10; m2 <- matrix(rbinom(r*c,70,0.5),r,c)
> m2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 35 37 41 40 37 34 36 31 35 33
[2,] 31 36 38 31 33 30 42 38 38 35
[3,] 39 34 31 43 28 32 37 28 35 32
[4,] 34 33 37 29 38 37 44 33 42 39
[5,] 39 41 40 45 34 39 34 33 38 30
[6,] 32 34 35 30 38 29 33 33 32 31
[7,] 40 32 34 39 34 28 39 37 35 41
[8,] 39 37 33 36 27 38 37 33 32 38
[9,] 35 40 36 39 40 25 31 32 40 38
[10,] 29 41 39 35 34 38 34 34 29 38
> t(apply(m2, 1, function(x) order(x,decreasing=T)[1:3]))
[,1] [,2] [,3]
[1,] 3 4 2
[2,] 7 3 8
[3,] 4 1 7
[4,] 7 9 10
[5,] 4 2 3
[6,] 5 3 2
[7,] 10 1 4
[8,] 1 6 10
[9,] 2 5 9
[10,] 2 3 6
that is the index matrix and its value matrix, in this case, is the same.
Alternative B showing where the index matrix not equal to the value matrix.
max.col returns the maximum position of each row of a matrix, breaking ties at random, but it does not have the partial directive
require(quanteda)
mytext <- c("Let the big dogs hunt honor", "No holds barred, it is a honor child child child.", "My child is an honor student")
myMatrix <-dfm(mytext, ignoredFeatures = stopwords("english"), stem = TRUE)
> myMatrix %*% t(myMatrix) #Tells how sentences are related to each other
# 3 x 3 sparse Matrix of class "dgCMatrix"
# text1 text2 text3
# text1 5 1 1
# text2 1 12 4
# text3 1 4 3
> t(myMatrix) %*% myMatrix #Tells how words are related to each other
# 9 x 9 sparse Matrix of class "dgCMatrix"
# let big dog hunt honor hold bar child student
# let 1 1 1 1 1 . . . .
# big 1 1 1 1 1 . . . .
# dog 1 1 1 1 1 . . . .
# hunt 1 1 1 1 1 . . . .
# honor 1 1 1 1 3 1 1 4 1
# hold . . . . 1 1 1 3 .
# bar . . . . 1 1 1 3 .
# child . . . . 4 3 3 10 1
# student . . . . 1 . . 1 1
#
> max.col(t(myMatrix) %*% myMatrix)
[1] 5 2 2 3 8 8 8 8 9
>
> max.col(myMatrix %*% t(myMatrix))
[1] 1 2 2
and
> n<-length(myMatrix);max.col(t(myMatrix) %*% myMatrix,partial=n-1)
Error in max.col(t(myMatrix) %*% myMatrix, partial = n - 1) :
unused argument (partial = n - 1)
where unfortunately partial directive is not defined in the max.col command. So we test the method A:
> mytext <- c("Let the big dogs hunt honor", "No holds barred, it is a honor child child child.", "My child is an honor student")
> myMatrix <-dfm(mytext, ignoredFeatures = stopwords("english"), stem = TRUE)
> aa<- myMatrix %*% t(myMatrix)
> aa
3 x 3 sparse Matrix of class "dgCMatrix"
text1 text2 text3
text1 6 1 1
text2 1 18 5
text3 1 5 6
> t(apply(aa, 1, function(x) order(x,decreasing=T)[1:3]))
[,1] [,2] [,3]
text1 1 2 3
text2 2 3 1
text3 3 2 1
that is correct and other way around we get
> aaa<- t(myMatrix) %*% myMatrix
> aaa
18 x 18 sparse Matrix of class "dgCMatrix"
[[ suppressing 18 column names ‘let’, ‘the’, ‘big’ ... ]]
let 1 1 1 1 1 1 . . . . . . . . . . . .
the 1 1 1 1 1 1 . . . . . . . . . . . .
big 1 1 1 1 1 1 . . . . . . . . . . . .
dog 1 1 1 1 1 1 . . . . . . . . . . . .
hunt 1 1 1 1 1 1 . . . . . . . . . . . .
honor 1 1 1 1 1 3 1 1 1 1 1 2 1 4 1 1 1 1
no . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
hold . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
bar . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
, . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
it . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
is . . . . . 2 1 1 1 1 1 2 1 4 1 1 1 1
a . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
child . . . . . 4 3 3 3 3 3 4 3 10 3 1 1 1
. . . . . . 1 1 1 1 1 1 1 1 3 1 . . .
my . . . . . 1 . . . . . 1 . 1 . 1 1 1
an . . . . . 1 . . . . . 1 . 1 . 1 1 1
student . . . . . 1 . . . . . 1 . 1 . 1 1 1
> mmm <- t(apply(aaa, 1, function(x) order(x,decreasing=T)[1:3]))
[,1] [,2] [,3]
let 1 2 3
the 1 2 3
big 1 2 3
dog 1 2 3
hunt 1 2 3
honor 14 6 12
no 14 6 7
hold 14 6 7
bar 14 6 7
, 14 6 7
it 14 6 7
is 14 6 12
a 14 6 7
child 14 6 12
. 14 6 7
my 6 12 14
an 6 12 14
student 6 12 14
and the corresponding value matrix
> t(apply(mmm, 1, function(x) colnames(aaa)[x]))
[,1] [,2] [,3]
let "let" "the" "big"
the "let" "the" "big"
big "let" "the" "big"
dog "let" "the" "big"
hunt "let" "the" "big"
honor "child" "honor" "is"
no "child" "honor" "no"
hold "child" "honor" "no"
bar "child" "honor" "no"
, "child" "honor" "no"
it "child" "honor" "no"
is "child" "honor" "is"
a "child" "honor" "no"
child "child" "honor" "is"
. "child" "honor" "no"
my "honor" "is" "child"
an "honor" "is" "child"
student "honor" "is" "child"
where the value matrix is not equal to the earlier index matrix.