I need to create some box plots showing the abundance of some bacterial taxa in different samples. My data looks like:
my.data <- "Taxon 06.TO.VG 21.TO.V 02.TO.VG 41.TO.VG 30.TO.V 04.BA.V 34.TO.VG 01.BA.V 28.TO.VG 18.TO.O 44.TO.V 08.BA.O 07.BA.O 06.BA.V 11.TO.V 06.BA.VG 07.BA.VG 05.BA.VG 07.BA.V 05.BA.V 06.BA.O 02.BA.O 04.BA.O 01.BA.O 05.BA.O 03.BA.O 02.BA.VG 03.BA.V 02.BA.V 04.BA.VG 03.BA.VG 01.BA.VG 15.TO.O 31.TO.O 09.TO.O 27.TO.V 42.TO.VG 08.TO.VG 16.TO.O 07.TO.V 13.TO.O 32.TO.V 29.TO.VG 10.TO.V 25.TO.V 05.TO.VG 20.TO.O 19.TO.V 17.TO.O 35.TO.V 43.TO.O 24.TO.V 26.TO.VG 01.TO.VG 37.TO.O 04.TO.VG 33.TO.O 39.TO.VG 14.TO.O 12.TO.O 38.TO.VG 22.TO.O
Bacteroides 0.072745558 0.011789182 0.028956894 0.059031877 0.097387173 0.086673889 0.432662192 0.060246679 0.269535674 0.152713335 0.014511873 0.063421323 0.091253905 0.139856373 0.013677012 0.200847907 0.180712032 0.21332737 0.031756181 0.272166702 0.019861211 0.133804422 0.168692685 0.100862392 0.152431791 0.104702194 0.119352089 0.410334347 0.024104844 0.0493905 0.068065382 0.047854785 0.011860175 0.168986083 0.015748031 0.407974482 0.264409881 0.250364431 0.330547112 0.536443695 0.578045113 0.400459167 0.204446209 0.357879234 0.242751388 0.488863722 0.521495803 0.001852281 0.045638126 0.503566932 0.069072806 0.171181339 0.183629007 0.371751412 0.385231317 0.023690205 0.255697356 0.104054054 0.242741552 0.043973941 0.221033868 0.004587156
Prevotella 0.073080791 0.302011096 0.586048042 0.487603306 0.290973872 0.014897075 0 0.333254269 0.029445074 0 0.153034301 0.002399726 0.025658188 0.090664273 0.440294582 0.100688924 0 0 0 0 0 0.000227946 0.093623374 0 0.000197707 0.115987461 0.076442171 0 0.047507606 0.000210172 0.000243962 0.042079208 0.52184769 0 0.394750656 0 0 0.235787172 0 0.000936856 0.000300752 0 0.051607781 0 0 0 0.002289494 0.735586941 0.023828756 0 0.011200996 0 0.046374105 0 0.00044484 0.085421412 0.000455789 0.306756757 0 0.11970684 0.008912656 0.371559633"
I'm wondering about using ggplot2 to do the box plot, but I'm not sure how the data have to be formatted. I tried this:
df <- read.csv("my.data", header=T)
ggplot(data = df, aes(x=variable, y=value)) + geom_boxplot(aes(fill=Taxon))
but it gave me an error saying that the variable was not found. Can anyone help me?
Many thanks, Francesca
Do you have columns called value and variable? If not, then R will tell you it cannot find them. Also, your data looks wide; it needs to be long. Posting the result of dput(my.data) is much more productive than the format you have given your data in. – user1317221_G
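To illustrate the wide-to-long point made in the comment, here is a minimal sketch, not a definitive answer. It assumes the reshape2 package is installed (its melt() happens to name the new columns variable and value, the names used in the attempted ggplot call), it uses only the first four samples from the question as a stand-in for the full table, and it assumes one box per taxon is what is wanted.

```r
library(ggplot2)
library(reshape2)  # assumption: reshape2 is available; tidyr would work as well

# Shortened stand-in for the data in the question (first four samples only)
my.data <- "Taxon 06.TO.VG 21.TO.V 02.TO.VG 41.TO.VG
Bacteroides 0.072745558 0.011789182 0.028956894 0.059031877
Prevotella 0.073080791 0.302011096 0.586048042 0.487603306"

# Parse the string itself; read.csv("my.data") would instead look for a file
# literally named "my.data". check.names = FALSE keeps the sample names as they are.
df <- read.table(text = my.data, header = TRUE, check.names = FALSE)

# Wide to long: one row per taxon/sample pair, in columns Taxon, variable, value
df.long <- melt(df, id.vars = "Taxon")

# One box per taxon, summarising the abundances across samples
p <- ggplot(df.long, aes(x = Taxon, y = value, fill = Taxon)) +
  geom_boxplot()
print(p)
```

After melting, variable holds the sample name and value holds the abundance, so the columns the original call could not find now exist. Using x = variable would give one box per sample, each built from a single value per taxon, which is why the sketch puts Taxon on the x axis instead.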