I have a table that features 3 levels of risk alleles at different genomic loci. Ultimately, I need to set this table up as a key to identify the prevalence of the different alleles, grouped by risk status, across a large number of samples. An example of the risk table is below:
genomic.stuff <- data.frame(c("A A", "A G", "G A", "G G"), c("T T", "C T", "T C", "C C"),
row.names= c("Risk Level 1", "Risk Level 2", "Risk Level 3", "Risk Level 4"),
stringsAsFactors = TRUE)
colnames(genomic.stuff) <- c("Gene A", "Gene B")
genomic.stuff
             Gene A Gene B
Risk Level 1    A A    T T
Risk Level 2    A G    C T
Risk Level 3    G A    T C
Risk Level 4    G G    C C
str(genomic.stuff)
'data.frame': 4 obs. of 2 variables:
$ Gene A: Factor w/ 4 levels "A A","A G","G A",..: 1 2 3 4
$ Gene B: Factor w/ 4 levels "C C","C T","T C",..: 4 2 3 1
There are 2 things I would like to do with this data frame. Bear in mind that I have a large mapping file with many genes, so if this can be done across the entire table with dplyr or another tidyverse package, that would (I think?) be best.
1) I want to re-level the factors so that they are ranked according to risk status rather than automatically leveled in alphabetical order. (The data frame already exists, so I don't think I can do this when the data frame is constructed.)
2) I want to reassign the factor levels so that Risk Level 1 = 1, Risk Levels 2 and 3 = 2, and Risk Level 4 = 3.
Thank you all very much for your help!
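One possible way to do both steps, as a minimal base R sketch rather than a definitive answer. It assumes the rows of genomic.stuff are already sorted from lowest to highest risk and that each gene has a distinct genotype in every row. The names releveled, risk.codes and risk.coded are made up for illustration. The same idea should carry over to the tidyverse, where forcats::fct_inorder() also orders levels by first appearance.

```r
# Rebuild the example table from the question so the snippet is self-contained.
genomic.stuff <- data.frame(c("A A", "A G", "G A", "G G"),
                            c("T T", "C T", "T C", "C C"),
                            row.names = c("Risk Level 1", "Risk Level 2",
                                          "Risk Level 3", "Risk Level 4"),
                            stringsAsFactors = TRUE)
colnames(genomic.stuff) <- c("Gene A", "Gene B")

# 1) Re-level every column so the levels follow row order (assumed to be
#    risk order) instead of alphabetical order.
#    Assigning into releveled[] keeps the row and column names.
releveled <- genomic.stuff
releveled[] <- lapply(genomic.stuff, function(x) {
  factor(x, levels = unique(as.character(x)))
})

# 2) Recode the levels so that Risk Level 1 = 1, Risk Levels 2 and 3 = 2,
#    and Risk Level 4 = 3. Assigning duplicated level names merges levels.
#    risk.codes is a hypothetical lookup with one entry per row.
risk.codes <- c("1", "2", "2", "3")
risk.coded <- releveled
risk.coded[] <- lapply(releveled, function(x) {
  levels(x) <- risk.codes
  x
})

str(releveled)
risk.coded
```

Because both steps loop over every column with lapply(), they apply to a mapping file with many genes without naming the columns. If a gene repeats a genotype across risk levels, the number of levels will no longer match risk.codes and step 2 would need a per-gene lookup instead.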
Is Risk Level a row.name? Why not its own column? - NelsonGon