I have a large data set of SNPs in R that looks like this:

OAR19_64675012.1 OAR19_64803054.1 OAR1_88143.1 s09912.1 s36301.1
              1                1            2        2        0
              1                1            1        0        1
              1                1            2        1        2
              0                2            2        1        0

...

> dim(data2)
[1]   501 42844

I want to use the SNPassoc library for quality control, so I have to run:

mydt <- setupSNP(data2)

as described in http://davinci.crg.es/estivill_lab/tools/SNPassoc/SupplementaryMaterial.pdf.

The output of that command is:

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
  cannot coerce class "try-error" to a data.frame
In addition: Warning message:
In mclapply(data[, colSNPs, drop = FALSE], snp, sep = sep, ...) :
  all scheduled cores encountered errors in user code

I have searched for this error but I can't fix it. If anyone has any idea about it, I would appreciate a reply.

Thank you all in advance.

Your data frame is not in the proper format. If you take a look at head(data(SNPs)), you'll see what setupSNP is expecting. The supplementary material you link to makes this clear as well. - emilliman5
First of all, thanks for your time, but I still don't understand what you mean. I just ran the head(data(SNPs)) you suggested and the output is "SNPs". What am I supposed to take from that? Can you be more specific? - Giorgos K
Sorry, that was my fault for the improper shorthand. If you load the SNPassoc example data and look at it, you will see what form your data has to be in: data(SNPs), then head(SNPs). Basically, setupSNP needs the dinucleotide sequence, not an integer representation of the SNPs, to work. - emilliman5
OK, so you suggest using strings instead of integers to represent the SNPs, like AA, AB, BB in place of the 0, 1, 2 that I have. But I have a second data set where the values are already replaced with strings like A_A, A_B and B_B. Those are my genotypes; they are different from the data in SNPassoc. I think the problem is somewhere else. Anyway, thank you. This is a very important project for me and I have to find a solution, so if you have some time we could talk about it. - Giorgos K
In my own testing I found that you cannot have more than 3 genotypes in any one SNP column. Without seeing your data in full, I don't think I can help any more than that. - emilliman5
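
For anyone following the comments above, here is a rough sketch of the two checks discussed: recoding the 0/1/2 integers into genotype strings, and counting how many distinct genotypes each SNP column ends up with. The toy values and the 0 -> "A/A", 1 -> "A/B", 2 -> "B/B" mapping are assumptions for illustration only; the real allele labels depend on your data.

```r
# Toy data in the question's 0/1/2 coding (hypothetical values).
data2 <- data.frame(
  OAR19_64675012.1 = c(1, 1, 1, 0),
  s09912.1         = c(2, 0, 1, 1)
)

# Assumed mapping: 0 -> "A/A", 1 -> "A/B", 2 -> "B/B".
# Any other code (or NA) indexes outside the vector and becomes NA.
labels <- c("A/A", "A/B", "B/B")
recoded <- as.data.frame(
  lapply(data2, function(x) labels[x + 1]),
  stringsAsFactors = FALSE
)

# Number of distinct non-missing genotypes per SNP column.
# Per the comment above, more than 3 in one column is a problem.
n_genotypes <- sapply(recoded, function(x) length(unique(x[!is.na(x)])))
print(n_genotypes)
```

Columns where n_genotypes is greater than 3 are the ones worth inspecting before calling setupSNP.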

1 Answer

We had the same issue. Some of our SNP columns contained "--" for missing calls, and those entries counted toward the limit of 3 genotypes per SNP. Setting them to NA did the trick.
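
A minimal sketch of that fix in base R, assuming the genotypes are stored as character columns and that "--" is the only missing-value placeholder (the toy table and the names geno, count_genotypes and too_many are made up for illustration):

```r
# Toy genotype table; "--" stands in for a missing call (hypothetical values).
geno <- data.frame(
  snp1 = c("A_A", "A_B", "B_B", "--"),
  snp2 = c("A_A", "A_A", "A_B", "A_B"),
  stringsAsFactors = FALSE
)

# Replace the "--" placeholder with NA in every column.
geno[geno == "--"] <- NA

# Count distinct non-missing genotypes per column and list any column
# that still has more than 3.
count_genotypes <- function(x) length(unique(x[!is.na(x)]))
too_many <- names(geno)[sapply(geno, count_genotypes) > 3]
print(too_many)
```

If too_many comes back empty, no column exceeds the limit. As far as I know, setupSNP also takes a sep argument (default "/"), so genotypes written like A_B would presumably need sep = "_"; check ?setupSNP before relying on that.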