0
votes

I am using the basic boxplot function, boxplot(x, ..., range = 1.5). If I don't set range and let R use its default value, as in boxplot(x, ...), what quantiles do the whiskers correspond to? I have outliers that are larger or smaller than the upper/lower whiskers. How can I find the exact percentage of outliers above the upper whisker or below the lower whisker? In other words, without setting range, what percentage of the data do the upper/lower whiskers correspond to?


2 Answers

1
votes

For example, you could calculate the percentage of outliers as follows:

# Some data with outliers:
d <- rnorm(100)
d[sample(1:100, 10)] <- rnorm(10, mean = 0, sd = 10)
bp <- boxplot(d)

# Get the values of the outliers:
out <- bp$out

# The percentage of outliers:
length(out)/length(d)*100
9
0
votes

Not entirely sure what your question is, but: ?boxplot says the default value of range is 1.5, and then it says

range: this determines how far the plot whiskers extend out from the box. If ‘range’ is positive, the whiskers extend to the most extreme data point which is no more than ‘range’ times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.

In other words, the whiskers are not defined as a proportion of the data, but as a multiple of the interquartile range.
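To make the definition above concrete, here is a minimal sketch that rebuilds the default whisker ends by hand and compares them with boxplot.stats. The data and the names k, lower_fence, upper_fence and whiskers are made up for illustration; the hinges come from fivenum, which is what boxplot.stats uses internally.

```r
# Illustrative data: mostly normal values plus a few extreme ones
set.seed(1)
x <- c(rnorm(95), 8, -7, 9, 10, -12)

h <- fivenum(x)          # min, lower hinge, median, upper hinge, max
k <- 1.5                 # same value as the default 'range' in boxplot()
iqr <- h[4] - h[2]       # spread between the hinges

# Fences: 'range' times the interquartile range away from the box
lower_fence <- h[2] - k * iqr
upper_fence <- h[4] + k * iqr

# Whiskers stop at the most extreme data points still inside the fences
whiskers <- range(x[x >= lower_fence & x <= upper_fence])

# Compare with the whisker ends reported by boxplot.stats()
bs <- boxplot.stats(x, coef = k)
whiskers
bs$stats[c(1, 5)]
```

The fences depend only on the hinges, so the share of data beyond them changes from one data set to the next.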

If you want to know the proportions, you can use boxplot.stats:

set.seed(101)
x <- runif(100)
bb <- boxplot.stats(x)
c(mean(x < min(bb$stats)), mean(x > max(bb$stats)))
## [1] 0 0

mean(<logical value>) is a shortcut for computing a proportion. Because I drew the data from a uniform distribution, there are actually no points beyond the whiskers (confirmed by looking at boxplot(x)). If I were to redo this with rcauchy(), there would be lots ...
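As a rough sketch of the comparison described above, the same calculation can be wrapped in a small helper and applied to samples from different distributions. The helper name prop_beyond and the sample size of 1000 are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical helper: proportion of values below the lower whisker
# and above the upper whisker, using boxplot.stats()
prop_beyond <- function(x, coef = 1.5) {
  s <- boxplot.stats(x, coef = coef)$stats
  c(below = mean(x < s[1]), above = mean(x > s[5]))
}

set.seed(101)
p_unif   <- prop_beyond(runif(1000))    # bounded distribution
p_norm   <- prop_beyond(rnorm(1000))    # light tails
p_cauchy <- prop_beyond(rcauchy(1000))  # heavy tails
rbind(p_unif, p_norm, p_cauchy)

# Theoretical share below the lower fence for an exactly normal
# distribution (the upper side is the same by symmetry)
q <- qnorm(c(0.25, 0.75))
p_theory <- pnorm(q[1] - 1.5 * diff(q))
p_theory
```

For an exactly normal distribution this works out to roughly 0.35% on each side, but for real data the proportion has to be computed from the sample, as above.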