I have several data frames, each containing a list of gene names without a header. Each file looks roughly like this:
SCA-6_Chr1v1_00001
SCA-6_Chr1v1_00002
SCA-6_Chr1v1_00003
SCA-6_Chr1v1_00004
SCA-6_Chr1v1_00005
SCA-6_Chr1v1_00006
SCA-6_Chr1v1_00009
SCA-6_Chr1v1_00010
SCA-6_Chr1v1_00014
SCA-6_Chr1v1_00015
SCA-6_Chr1v1_00017
Each of these data frames is written to a separate .txt file, and I have read them all into one list like so:
temp = list.files(pattern = "*.txt")
myfiles = lapply(temp, FUN=read.table, header=FALSE)
With the myfiles list, I want to find all of the values unique to each file and return them in a list (I assume I can do this with an lapply function). I have tried running the following code, but it does not drop the shared values:
unique.genes = lapply(1:length(myfiles), function(n) setdiff(myfiles[[n]], unlist(myfiles[-n])))
Any help would be greatly appreciated.
myfiles = lapply(temp, FUN = scan, what = character())
If the files are read this way, the lapply/setdiff loop will work, and it's much faster. - Rui Barradas
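A minimal sketch of the same idea for data that has already been read with read.table. The likely cause of the problem is that each element of myfiles is a one-column data frame rather than a character vector, so setdiff compares whole columns instead of individual gene names. The in-memory data frames below are hypothetical stand-ins for the files (their contents are made up for illustration), and the intermediate name genes is not from the original post:

```r
# Hypothetical stand-ins for files read with read.table(header = FALSE);
# each is a one-column data frame whose single column is named V1.
myfiles <- list(
  data.frame(V1 = c("SCA-6_Chr1v1_00001", "SCA-6_Chr1v1_00002", "SCA-6_Chr1v1_00003")),
  data.frame(V1 = c("SCA-6_Chr1v1_00002", "SCA-6_Chr1v1_00004")),
  data.frame(V1 = c("SCA-6_Chr1v1_00003", "SCA-6_Chr1v1_00005"))
)

# Reduce each data frame to a plain character vector of gene names
# (as.character also handles columns that were read in as factors).
genes <- lapply(myfiles, function(df) as.character(df[[1]]))

# For each file, keep only the names that appear in no other file.
unique.genes <- lapply(seq_along(genes), function(n) {
  setdiff(genes[[n]], unlist(genes[-n]))
})

print(unique.genes)
```

Reading the files with scan, as suggested above, skips the conversion step because each list element is already a character vector.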