Compute the Euclidean distance using word counts

Question

Consider the following two sentences.

Sentence 1: The quick brown fox jumps over the lazy dog.

Sentence 2: A quick brown dog outpaces a quick fox.

Compute the Euclidean distance between the two sentences using word counts.

Dinesh.hmn · Accepted Answer · 2017-02-18T10:59:27

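You can use the tm package to get the word counts and then compute the Euclidean distance with dist(). Note that dist(t(dtm)) compares terms with each other; dist(as.matrix(dtm)) compares the two sentences.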

> library(tm)
> s1 <- " The quick brown fox jumps over the lazy dog"
> s2 <- "A quick brown dog outpaces a quick fox"
> 
> VS <- VectorSource(c(s1,s2))
> corp <- Corpus(VS)
> dtm <- DocumentTermMatrix(corp)
> d <- dist(t(dtm), method = 'euclidean')
> d



            brown      dog      fox    jumps     lazy outpaces     over    quick
dog      0.000000                                                               
fox      0.000000 0.000000                                                      
jumps    1.000000 1.000000 1.000000                                             
lazy     1.000000 1.000000 1.000000 0.000000                                    
outpaces 1.000000 1.000000 1.000000 1.414214 1.414214                           
over     1.000000 1.000000 1.000000 0.000000 0.000000 1.414214                  
quick    1.000000 1.000000 1.000000 2.000000 2.000000 1.414214 2.000000         
the      1.414214 1.414214 1.414214 1.000000 1.000000 2.236068 1.000000 2.236068
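
For the distance between the two sentences themselves, here is a minimal base R sketch that does not need tm. The tokenize() helper and the variable names are my own, not part of tm, and the tokenisation (lowercase, strip punctuation, split on whitespace) is an assumption about what "word counts" should mean here. It reports the distance twice: once over all words, and once after dropping words shorter than three characters, which is what tm does by default and why "a" has no column in the output above.

```r
# Hypothetical helper (not from tm): lowercase a sentence,
# strip punctuation, and split it into words.
tokenize <- function(s) {
  s <- tolower(gsub("[[:punct:]]", "", s))
  words <- strsplit(trimws(s), "[[:space:]]+")[[1]]
  words[nchar(words) > 0]
}

s1 <- "The quick brown fox jumps over the lazy dog."
s2 <- "A quick brown dog outpaces a quick fox."

w1 <- tokenize(s1)
w2 <- tokenize(s2)
vocab <- sort(union(w1, w2))

# Count every vocabulary word in each sentence; factor() keeps zero counts.
counts <- rbind(
  s1 = as.vector(table(factor(w1, levels = vocab))),
  s2 = as.vector(table(factor(w2, levels = vocab)))
)
colnames(counts) <- vocab

# Distance between the two rows (sentences), counting every word.
d_all <- as.numeric(dist(counts, method = "euclidean"))

# Same distance after dropping words shorter than three characters,
# mirroring tm's default minimum word length.
d_tm <- as.numeric(dist(counts[, nchar(vocab) >= 3, drop = FALSE]))

print(d_all)  # sqrt(13), about 3.606
print(d_tm)   # 3
```

The count differences are jumps 1, lazy 1, over 1, outpaces -1, quick -1, the 2, and a -2. The squares sum to 13 with "a" and to 9 without it, which gives the two values printed.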
