I'm a newbie to R, but really like it and want to improve constantly. Now, after searching for a while, I need to ask you for help.
This is the given case:
1) I have sentences (sentence.1 and sentence.2 - all words are already lower-case) and create the sorted frequency lists of their words:
sentence.1 <- "bob buys this car, although his old car is still fine." # saves the sentence into sentence.1
sentence.2 <- "a car can cost you very much per month."
sentence.1.list <- strsplit(sentence.1, "\\W+", perl=T) #(I have these following commands thanks to Stefan Gries) we split the sentence at non-word characters
sentence.2.list <- strsplit(sentence.2, "\\W+", perl=T)
sentence.1.vector <- unlist(sentence.1.list) # then we create a vector of the list
sentence.2.vector <- unlist(sentence.2.list) # vectorizes the list
sentence.1.freq <- table(sentence.1.vector) # and finally create the frequency lists for
sentence.2.freq <- table(sentence.2.vector)
These are the results:
sentence.1.freq:
although bob buys car fine his is old still this
1 1 1 2 1 1 1 1 1 1
sentence.2.freq:
a can car cost month much per very you
1 1 1 1 1 1 1 1 1
Now, please, how could I combine these two frequency lists that I will have the following:
a although bob buys can car cost fine his is month much old per still this very you
NA 1 1 1 NA 2 NA 1 1 1 NA NA 1 NA 1 1 NA NA
1 NA NA NA 1 1 1 NA NA NA 1 1 NA 1 NA NA 1 1
Thus, this "table" should be "flexible" so that in case of entering a new sentence with the word, e.g. "and", the table would add the column with the label "and" between "a" and "although".
I thought of just adding new sentences into a new row and putting all not word that are not yet in the list column-wise (here, "and" would be to the right of "you") and sort the list again. However, I haven't managed this as already the sorting of the new sentence's words' frequencies according to the existing labels haven't been working (when there is e.g., "car" again, the new sentence's frequency of car should be written into the new sentence's row and the column of "car", but when there is e.g. "you" for the 1st time, its frequency should be written into the new sentence's row and a new column labeled "you").