I have the following code, which reads the CSV shown below:

library(ggpubr)
library(ggsci)
df = read.csv2("file.csv", row.names=1)
# Copy df
df2 = df
# Convert the perc variable to a factor
df2$perc <- as.factor(df2$perc)
# Add the name column
df2$name <- rownames(df)
ggbarplot(df2, x = "name", y = "perc",
          fill = "role",               # change fill color by role
          color = "white",            # Set bar border colors to white
          palette = "npg",            # npg journal color palette. see ?ggpar
          sort.val = "asc",          # Sort the values in ascending order
          sort.by.groups = FALSE,     # Don't sort inside each group
          x.text.angle = 0,           # Keep x axis texts horizontal
          rotate = TRUE,
          label = TRUE, label.pos = "out",
          #label = TRUE, lab.pos = "in", lab.col = "white",
          width = 0.5
)

The CSV is:

genes;perc;role
GATA-3;7,9;confirmed in this cancer
CCDC74A;6,8;prognostic in this cancer
LINC00621;6,1;none
POLRMTP1;4,1;none
IGF2BP3;3,2;confirmed in this cancer

This produced the following plot:

(plot image)

There are two things I don't get here:

1) Why does the x-axis tick of each bar correspond to the actual value plotted? I would expect the x-axis to run continuously from 0 to 8 instead. I hope this is clear.

2) The label values seem misaligned with the y-axis ticks. Am I missing an option here?

For (1): is your x-variable a factor perhaps? Continuous x-scales work best with numerical data - teunbrand
True, the df2$perc <- as.numeric(df2$perc) fixed the problem (1) - Benoit B.

1 Answer

To be honest, I would probably not use ggpubr here. Staying within the ggplot2 syntax is often safer, and arguably takes less code. (Also, don't use factors in this case, as user teunbrand commented.)
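To illustrate the point about factors, here is a minimal base-R sketch. It assumes the CSV from the question is pasted inline; the name `csv_text` is introduced here purely for illustration. It shows that `read.csv2` already yields a numeric `perc`, and what `as.factor` then does to it.

```r
# Inline copy of the CSV from the question (csv_text is a hypothetical name)
csv_text <- "genes;perc;role
GATA-3;7,9;confirmed in this cancer
CCDC74A;6,8;prognostic in this cancer
LINC00621;6,1;none
POLRMTP1;4,1;none
IGF2BP3;3,2;confirmed in this cancer"

# read.csv2 assumes ';' as field separator and ',' as decimal mark
df <- read.csv2(text = csv_text, row.names = 1)
is_num <- is.numeric(df$perc)   # perc is already numeric after reading

# This is the step from the question: every distinct value becomes a category
perc_factor <- as.factor(df$perc)
lev <- levels(perc_factor)

# as.numeric() on a factor returns the level codes, not the original values
codes <- as.numeric(perc_factor)
# Going through as.character() recovers the values
values <- as.numeric(as.character(perc_factor))

print(is_num)
print(lev)
print(codes)
print(values)
```

Because a factor is a set of discrete categories, the axis gets one tick per distinct value rather than a continuous 0 to 8 range. Note also that calling `as.numeric()` on a variable that is already a factor would give the level codes, so the simplest fix is probably to drop the `as.factor` line altogether.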

Two good options for horizontal bars:

library(tidyverse)
library(ggstance)
library(ggsci)

Option 1 - use coord_flip

ggplot(df2, aes(fct_reorder(genes, perc), perc, fill = role)) +
  geom_col() +
  geom_text(aes(label = perc), hjust = 0) +
  scale_fill_npg() +
  coord_flip(ylim = c(0,100)) +
  theme_classic() +
  theme(legend.position = 'top') +
  labs(x = 'gene', y = 'percent')

Option 2 - use the ggstance package

I prefer option 2, because using ggstance allows for more flexible combination with other plots.

ggplot(df2, aes(perc, fct_reorder(genes, perc), fill = role)) +
  geom_colh() +
  geom_text(aes(label = perc), hjust = 0) +
  scale_fill_npg() +
  coord_cartesian(xlim = c(0,100)) +
  theme_classic() +
  theme(legend.position = 'top') +
  # perc is mapped to x and genes to y here, so the labels are swapped relative to option 1
  labs(x = 'percent', y = 'gene')
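As an aside, and not part of the two options above: with ggplot2 3.3.0 or later, `geom_col()` can usually work out the orientation itself when the continuous variable is mapped to x, so a similar horizontal plot is possible without ggstance. The following is only a sketch under that version assumption. It builds the data frame by hand with numeric `perc`, uses base `reorder()` in place of `fct_reorder()`, and leaves out the npg palette and the axis limits to stay self-contained.

```r
library(ggplot2)  # assumes ggplot2 >= 3.3.0 for horizontal geom_col()

# Data from the question, typed in directly with perc as numeric
df2 <- data.frame(
  genes = c("GATA-3", "CCDC74A", "LINC00621", "POLRMTP1", "IGF2BP3"),
  perc  = c(7.9, 6.8, 6.1, 4.1, 3.2),
  role  = c("confirmed in this cancer", "prognostic in this cancer",
            "none", "none", "confirmed in this cancer")
)

# Continuous variable on x, genes ordered by perc on y
p <- ggplot(df2, aes(x = perc, y = reorder(genes, perc), fill = role)) +
  geom_col() +
  geom_text(aes(label = perc), hjust = 0) +  # left-align labels at the bar end
  theme_classic() +
  theme(legend.position = "top") +
  labs(x = "percent", y = "gene")

print(p)
```

Adding `scale_fill_npg()` from ggsci and a `coord_cartesian(xlim = ...)` call, as in the options above, should work the same way here.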

Created on 2020-03-27 by the reprex package (v0.3.0)

data

df2 <- read_delim("genes;perc;role
GATA-3;7,9;confirmed in this cancer
CCDC74A;6,8;prognostic in this cancer
LINC00621;6,1;none
POLRMTP1;4,1;none
IGF2BP3;3,2;confirmed in this cancer", ";",
  locale = locale(decimal_mark = ",")) %>% rownames_to_column("name")