I have a matrix of gene names with expression values in different tissues. However, the analyses were performed independently, and not all genes are present in all tissues. The gene lists for each tissue were simply pasted one below the other. Right now it looks like this:
GeneName  Tissue A  Tissue B
Gene A    1
Gene B    1
Gene C    2
Gene A              3
Gene D              2
I would like to collapse the duplicated gene names so that I get a matrix like the following:
GeneName  Tissue A  Tissue B
Gene A    1         3
Gene B    1
Gene C    2
Gene D              2
Edit: Thanks for the answer. However, I forgot to mention that the gene names are in a column of their own, while the row names are simply the numbers 1 to n. I tried to set the name column as the row names with

    row.names(mydataframe) <- mydataframe$GeneName

but got the following error message:

    Error in `row.names<-.data.frame`(`*tmp*`, value = c(578L, 510L, 1707L, :
      duplicate 'row.names' are not allowed
    In addition: Warning message:
    non-unique values when setting 'row.names':
As I understand it, I can't use a column with non-unique values as row names, which seems to put me in a catch-22 if I need to name the rows after the gene name column in order to collapse the matrix?
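For reference, here is a minimal sketch of one way that might sidestep the row-name problem: base R's aggregate() groups on an ordinary column, so the gene names never need to become row names. It assumes the missing measurements are stored as NA and that the tissue columns are named TissueA and TissueB; the example data and the helper first_non_na are hypothetical and only mirror the tables above.

```r
# Hypothetical reconstruction of the pasted-together data. NA marks a gene
# that was not measured in that tissue (an assumption about the real data).
mydataframe <- data.frame(
  GeneName = c("Gene A", "Gene B", "Gene C", "Gene A", "Gene D"),
  TissueA  = c(1, 1, 2, NA, NA),
  TissueB  = c(NA, NA, NA, 3, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing value in a group, NA if none exists.
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA_real_
}

# Group by the GeneName column and collapse each tissue column separately.
collapsed <- aggregate(
  mydataframe[c("TissueA", "TissueB")],
  by  = list(GeneName = mydataframe$GeneName),
  FUN = first_non_na
)

print(collapsed)
```

If a gene could have more than one value for the same tissue, taking the first non-missing value would silently discard the others, so a different summary (for example a mean) might be more appropriate in that case.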
Gene D become 3 in the output? – Ruthger Righart