Although other posts ask the same question, I couldn't use their solutions. I am trying to generate a matrix of correlation estimates for only the significant values. It is supposed to be simple, but I am getting the error "Error in mat[i, j] <- result["estimate"] : incorrect number of subscripts on matrix". Here is my GENE input:

Name    Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 Sample7 Sample8 Sample9 Sample10
Lrriq3  8.185794    5.691456    5.693373333 6.973468667 8.868912    5.915211333 6.718336667 6.212762667 6.424637333 13.01974667
Dnase2b 0   0.1749128   0   0.1685122   0.1784736   0.122940127 0.007396118 0   0   0.09347276
Lphn2   1.080010133 10.01754067 14.10849333 11.77894    1.2552028   1.702124667 11.52506    15.21622    0.093035673 0.019666988
Rpf1    7.439926667 8.863518    10.28811467 11.86218    13.45304667 13.44146667 20.04024    16.94706667 23.76358    17.00742667
Uox 7.458356667 10.01754067 14.10849333 11.77894    19.75814    12.14829333 14.58846667 11.52506    15.21622    14.57954
Ctbs    0.400568    0.134638993 3.450422667 0.164317553 0   0   0.3395462   0.079734033 0.2700658   0
Spata1  2.066878    2.079750667 1.7238  2.240882667 1.461403333 2.093744    1.67564 1.2552028   1.702124667 1.427768
Ptprh   1.080010133 0.09089988  0.621011133 0.3004404   0.228991467 0.063827739 0.188904267 0.093035673 0.256751333 0.424108067

My LNC input:

Name    Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 Sample7 Sample8 Sample9 Sample10
XX1 3.956263333 2.443864667 1.413482    1.486519333 2.20473 3.015326    1.1033612   0.977534    0.789298267 1.469496
XX2 2.759029333 2.371987333 3.434   4.004905333 5.198814667 2.889342    3.463316    4.039935333 5.038084667 5.113266667
XX3 4.214811333 3.470377333 8.075684667 5.115368    7.084812667 4.767865333 6.272181333 6.202424667 5.480058667 4.613682
XX4 3.256852667 2.944397333 2.047966    1.696964667 2.099414667 1.780854667 0.3989612   0.23245 0.257986867 1.676498
XX5 661.7403333 647.749 834.8288    670.8856    728.8326667 710.5224667 357.7705333 387.9334    404.3672667 694.4849333
XX6 7.458356667 10.01754067 14.10849333 11.77894    11.77894    19.75814    11.77894    1.2552028   1.702124667 11.52506
XX7 7.458356667 10.01754067 14.10849333 11.77894    19.75814    14.58846667 11.52506    13.45304667 13.44146667 0.23245

The script is designed to do correlation between each row of one file and each row of the other (note that samples 1 to 10 are arranged in the same order in both files) and output an Excel sheet with the p-value, estimate, and test, as well as a matrix of the estimates for only those with p <0.05. All of the script works except for one step.

The script is:

genes <- read.delim(file="SampleGene.txt", header=TRUE, row.names=1)
lnc <- read.delim(file="Samplelncs.txt", header=TRUE, row.names=1) 
x = rownames(genes[1:nrow(genes),])
y = rownames(lnc[1:nrow(lnc),])

d<-NULL #creates an empty dataframe
mat<-matrix(0,nrow(genes),nrow(lnc)) #creates a matrix with all values as 0
rownames(mat) <- rownames(genes) #assigns rownames to the matrix based on row names of the gene file
colnames(mat) <- rownames(lnc) #assigns colnames to the matrix based on the colnames of the lnc file

for (i in x){
    for (j in y) {
        result=cor.test(as.numeric(genes[i,]), as.numeric(lnc[j,]), method='pearson')#calculates the correlation and assigns it to result
        d<-rbind(d, data.frame(i, j, result[c("estimate","p.value","statistic","method")], stringsAsFactors=FALSE)) #rbind allows writing output of loop to an empty dataframe. Works perfectly.
        if (result["p.value"]<0.05){ #attempts to add the estimate to the matrix only of p.value <0.05
             mat[i,j] <- result["estimate"] #This is causing the error
             #print(result["estimate"]) #if I just print without adding to matrix, i dont get errors
}
}
}

write.table(file="Pearson.xls", as.data.frame(d), sep="\t")

As I indicated, if I remove the if statement from the loop, or if I just print result["estimate"], I don't get errors. Otherwise, I get the error every time.

I am a beginner in R and programming. Hence, if there are other suggestions to optimize the script above, please let me know.

And you were very close to something working, so I upvoted to balance. -HubertL
Perhaps for lack of clarity? What "other posts"? You haven't really said what a correct answer would be. How many rows are you expecting? 45 or 10? Do you seek two-way correlations of the numeric columns only between rows with matching row numbers? It took me three reads to get that this might be the goal. Or the fact that d<-NULL does NOT create an empty dataframe. -IRTFM
@42 In response to your questions: "posts" as in, literally, posts on forums. I clearly said that I am expecting a matrix with estimates as values. Read the part starting with "Script is designed to do correlation .... only those with p <0.05." I am not sure what the number of rows refers to. It is simply the nrow of lnc, as indicated in the script. -BioProgram
@HubertL thank you -BioProgram
You asked for a guess at the reason for the downvote (and then you deleted that comment). Don't complain when people respond to your request. -IRTFM

1 Answer

When you write result["estimate"] you get a list, whereas result[["estimate"]] gives you a numeric. Just use:

mat[i,j] <- result[["estimate"]]

and you won't get the error.
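
To see the difference between the two bracket forms for yourself, here is a minimal sketch. The tiny genes and lnc matrices are made-up stand-ins for your two files (not your real data), built as plain numeric matrices so the example runs on its own; as.matrix() on the data frames from read.delim should give you the same shape.

```r
# Made-up toy data: rows are features, columns are the 10 samples.
samples <- paste0("Sample", 1:10)
genes <- rbind(GeneA = 1:10,
               GeneB = c(5, 3, 8, 1, 9, 2, 7, 4, 10, 6))
lnc <- rbind(XX1 = c(2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, 20.0),
             XX2 = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3))
colnames(genes) <- samples
colnames(lnc) <- samples

# Matrix of estimates, pre-filled with 0 for the non-significant pairs.
mat <- matrix(0, nrow(genes), nrow(lnc),
              dimnames = list(rownames(genes), rownames(lnc)))

for (i in rownames(genes)) {
  for (j in rownames(lnc)) {
    result <- cor.test(genes[i, ], lnc[j, ], method = "pearson")
    # [[ ]] extracts the element itself, so both sides are plain numbers.
    if (result[["p.value"]] < 0.05) {
      mat[i, j] <- result[["estimate"]]
    }
  }
}

# Single brackets keep the list wrapper; double brackets drop it.
print(is.list(result["estimate"]))
print(is.numeric(result[["estimate"]]))
print(mat)
```

The same reasoning applies to the p-value test: result[["p.value"]] (or result$p.value) hands the comparison a plain number instead of a one-element list.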