I would like to do the following:
combine two vectors into a data frame. The vectors
- have different lengths
- contain sequences that are also found in the other vector
- contain sequences that are not found in the other vector
- have sequences not found in the other vector that are never longer than 3 elements
- always have the same first element
The data frame should show the sequences shared by the two vectors aligned, with NA in a column wherever that vector lacks a sequence present in the other vector.
For example:
vector 1  vector 2      vector 1  vector 2
1         1             a         a
2         2             g         g
3         3             b         b
4         1       or    h         a
1         2             a         g
2         3             g         b
5         4             c         h
          5                       c
should be combined into the following data frame:
1         1             a         a
2         2             g         g
3         3             b         b
4         NA            h         NA
1         1       or    a         a
2         2             g         g
NA        3             NA        b
NA        4             NA        h
5         5             c         c
I searched for examples using merge, combine, cbind, and plyr but was not able to find a solution. I am afraid I will need to write a function with nested for loops to solve this problem.
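For what it is worth, the alignment described above looks like a longest-common-subsequence (LCS) alignment, the same idea diff tools use. The following is only a sketch under that assumption, not a confirmed solution: the function name `align_vectors` and the column names `v1` and `v2` are made up for illustration, the inputs are assumed to contain no NA values, and when both vectors have an unmatched run at the same position the code arbitrarily emits the run from the first vector first.

```r
# Sketch: align two vectors on their longest common subsequence (LCS),
# padding with NA where one vector has elements the other lacks.
# Assumes neither input contains NA.
align_vectors <- function(x, y) {
  n <- length(x)
  m <- length(y)

  # len[i, j] = LCS length of the suffixes x[i..n] and y[j..m];
  # row n + 1 and column m + 1 represent empty suffixes (length 0).
  len <- matrix(0L, n + 1L, m + 1L)
  for (i in rev(seq_len(n))) {
    for (j in rev(seq_len(m))) {
      len[i, j] <- if (x[i] == y[j]) {
        len[i + 1L, j + 1L] + 1L
      } else {
        max(len[i + 1L, j], len[i, j + 1L])
      }
    }
  }

  # Walk forward through the table, recording which index of x and y
  # goes into each output row (NA_integer_ means "no element here").
  ix <- integer(0)
  iy <- integer(0)
  i <- 1L
  j <- 1L
  while (i <= n || j <= m) {
    if (i <= n && j <= m && x[i] == y[j]) {
      # Elements match: put them on the same row.
      ix <- c(ix, i); iy <- c(iy, j)
      i <- i + 1L; j <- j + 1L
    } else if (j > m || (i <= n && len[i + 1L, j] >= len[i, j + 1L])) {
      # Skipping x[i] loses nothing (or y is exhausted): x only.
      ix <- c(ix, i); iy <- c(iy, NA_integer_)
      i <- i + 1L
    } else {
      # Otherwise skip y[j]: y only.
      ix <- c(ix, NA_integer_); iy <- c(iy, j)
      j <- j + 1L
    }
  }

  # Indexing with NA_integer_ yields NA in the result.
  data.frame(v1 = x[ix], v2 = y[iy], stringsAsFactors = FALSE)
}

v1 <- c(1, 2, 3, 4, 1, 2, 5)
v2 <- c(1, 2, 3, 1, 2, 3, 4, 5)
align_vectors(v1, v2)
```

On the numeric and character examples from the question this should give the nine-row layout shown above. The nested loops make it O(length(v1) * length(v2)) in time and memory, which is fine for short vectors but may be too slow for very long ones.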
c("apple", "banana") and c("apple", "orange"): we know from sequence #1 that banana will come after apple; similarly, we know from sequence #2 that orange will come after apple. But what will tell us if banana or orange should come first? We need a way to sort elements across multiple sequences. Can you clarify that aspect of the problem? (In your example, you used integers for which there is an implicit solution. Maybe it's just a matter of confirming that your two vectors are integer.) – flodel
in your example)? – flodel