Normally, when you want to plot one variable against another, you just supply the variable names and that's that. If the variable you want is the result of a computation, you can add it as a column to your data.frame or data.table and then use it. However, this creates a lot of redundant data if you have big data frames and only need to plot these new columns once. So I am essentially trying to find a way to apply functions to variables directly in the plotting call instead.
I'll try to illustrate that with an example:
data(iris)
ggboxplot(iris, x="Species", y="Sepal.Width", add = "jitter")
This plots the sepal width for the different species of iris flowers. However, if you want to apply a custom function to a variable, e.g.:
ggboxplot(iris, x=round("Sepal.Length"), y="Sepal.Width", add = "jitter")
Error in round("Sepal.Length") :
non-numeric argument to mathematical function
This makes sense, since the function doesn't know that the quoted text refers to a variable.
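To make the cause of that error concrete, here is a minimal base R sketch (no plotting packages involved). The variable names `col_name` and `rounded` are made up for this illustration:

```r
# "Sepal.Length" in quotes is only a character string, not the column itself.
col_name <- "Sepal.Length"
is.character(col_name)   # TRUE, so round(col_name) has nothing numeric to work on

# Extracting the column from the data frame by name yields the numeric values,
# which round() can handle.
rounded <- round(iris[[col_name]])
head(rounded)
```

So the string only becomes a variable once something looks it up in the data, which is what the plotting function does internally, after `round()` has already been evaluated on the bare string.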
Note that I have been using the ggpubr package for prettier plots, but I think the problem is essentially further down, in ggplot2:
ggplot(data = iris, aes(x=floor(Sepal.Length), y=Sepal.Width)) + geom_boxplot()
Warning message:
Continuous x aesthetic -- did you forget aes(group=...)?
One way to bypass this is to override the aes mapping; however, this results in a slightly weird x-axis:
ggplot(data = iris, aes(y=Sepal.Width, x=Sepal.Length)) + geom_boxplot(mapping = aes(group=floor(Sepal.Length)))
I am thinking there has to be a simpler way to get this done. Any advice? I would ideally like to keep using ggboxplot() from the ggpubr package, but if it can't be done there, I can consider using ggplot2 alone.
In ggplot2 at least, the problem (as the warning suggests) is that you are trying to plot a continuous x variable in a boxplot, which takes a categorical x. Another way to bypass this is to convert x to a factor: x = factor(floor(Sepal.Length)). That is essentially what the group argument is doing, but you don't get the "weird" axis because each integer is a category. Of course, if you don't have sequential integers (e.g. c(4, 5, 20, 40)), the axis will not be to scale numerically. – Gabriel Silva
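A rough sketch of the factor approach from this comment, assuming ggplot2 and ggpubr are installed. The second option is one possible way to stay with ggboxplot(), not something confirmed in the thread: it builds the computed column on the fly with base R's transform(), so iris itself is left untouched. The column name `Sepal.Length.Floor` is hypothetical, chosen only for this example.

```r
library(ggplot2)
library(ggpubr)

# Option 1 (ggplot2): compute the grouping inside aes() and wrap it in
# factor() so the x aesthetic is discrete. No column is added to iris.
p1 <- ggplot(iris, aes(x = factor(floor(Sepal.Length)), y = Sepal.Width)) +
  geom_boxplot() +
  labs(x = "floor(Sepal.Length)")

# Option 2 (ggpubr): ggboxplot() takes column names as strings, so pass it a
# modified copy of the data created inline with transform().
# Sepal.Length.Floor is a hypothetical name used only for this illustration.
p2 <- ggboxplot(
  transform(iris, Sepal.Length.Floor = factor(floor(Sepal.Length))),
  x = "Sepal.Length.Floor", y = "Sepal.Width", add = "jitter"
)

print(p1)
print(p2)
```

Both variants treat each integer as its own category, so the caveat about non-sequential values not being drawn to scale applies to either one.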