How would you handle the breaks for a dotPlot() when the data contain serious outliers?
I cannot log-transform the data or apply any similar transformation.
library(mosaic)
n <- 300
r <- 1:15
binwidth <- 1
outliers <- c(100, 400, 800, 700)
# outliers <- c(15, 14, 3, 5)
# 300 values drawn from 1..15, plus the four outliers
dat <- c(sample(r, size = n, replace = TRUE), outliers)
# upper fence: third quartile + 1.5 * IQR
quantile(dat)[4] + 1.5 * IQR(dat)
n <- n + length(outliers)
# my attempt at custom breaks (not used in the plot below yet)
brks <- c(seq(0, sd(dat) * 2, binwidth), tail(seq(0, sd(dat) * 2, binwidth), 1) + binwidth, tail(seq(0, max(dat), binwidth), 1) + binwidth)
d <- data.frame(x = dat, color = c(rep("red", n / 2), rep("green", n / 2)))
dotPlot(d$x, breaks = seq(min(d$x) - binwidth, max(d$x) + binwidth, binwidth), cex = .5)
If you run that code, you will see four outliers that make the plot unreadable. How would you deal with that?
Right now the breaks run from the minimum to the maximum of d$x in steps of binwidth, but I think some of the empty bins should be removed. What logic would you use to remove them? For example, should bins that are empty and lie beyond 2 standard deviations be removed? Can you give example code?
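To make the rule concrete, here is a rough sketch of "drop bins that are empty and lie beyond 2 standard deviations", using base R only. It assumes "beyond 2 standard deviations" means above mean + 2 * sd; the fixed seed and the names cutoff, keep, kept_counts and kept_upper are made up for the illustration.

```r
# Sketch only: drop empty bins whose right edge lies above mean + 2 * sd.
set.seed(1)                                   # assumption: fixed seed so the run is repeatable
binwidth <- 1
x <- c(sample(1:15, size = 300, replace = TRUE), 100, 400, 800, 700)

breaks <- seq(min(x) - binwidth, max(x) + binwidth, by = binwidth)
counts <- table(cut(x, breaks))               # count per bin, empty bins included
upper  <- breaks[-1]                          # right edge of each bin

cutoff <- mean(x) + 2 * sd(x)                 # assumed reading of "2 standard deviations"
keep   <- counts > 0 | upper <= cutoff        # keep every non-empty bin and every bin inside the cutoff

kept_counts <- as.vector(counts[keep])        # counts of the bins that survive
kept_upper  <- upper[keep]                    # their right edges, usable as axis labels
```

Note that the outliers inflate sd(x) themselves, so the cutoff can sit well above the bulk of the data. The upper fence computed earlier (third quartile + 1.5 * IQR) could be swapped in for cutoff if a tighter threshold is wanted.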
Any idea how to create your own dot plot without using dotPlot() or dotplot()?
I have the data in the "dat" data frame below.
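##### HERE CAN I CREATE MY OWN DOT PLOT?
library(qdapRegex)
binwidth <- 1
# count the values in each bin (0, 1], (1, 2], ...
tab <- table(cut(dat, seq(0, max(dat) + 1, binwidth)))
# keep only the non-empty bins and extract each bin's upper edge from its label
r_names <- names(tab)[tab > 0]
r_names <- as.numeric(unlist(rm_between(r_names, ',', ']', extract = TRUE)))
# as.vector() keeps the counts as a single column
dat <- data.frame(bin = r_names, data = as.vector(tab[tab > 0]))
dat # can you turn this into a dot plot where the x-axis consists ONLY of the bin column, i.e. no space between 15 and 100?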
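One possible way to draw such a plot by hand, as a minimal base-graphics sketch: treat each bin as a category, place it at its row index so the bins are evenly spaced, and stack one dot per observation. The data frame bins and its values are hypothetical stand-ins for a bin/count table, not real output.

```r
# Hypothetical bin/count table (made-up numbers, for illustration only)
bins <- data.frame(bin = c(1, 2, 3, 15, 100, 400), count = c(5, 3, 4, 2, 1, 1))

pos <- rep(seq_len(nrow(bins)), bins$count)   # x = row index of the bin, repeated once per dot
ht  <- sequence(bins$count)                   # y = 1, 2, ..., count within each bin

# Suppress the numeric x-axis, then label each position with its bin value
plot(pos, ht, pch = 16, xaxt = "n", xlab = "bin", ylab = "count",
     xlim = c(0.5, nrow(bins) + 0.5), ylim = c(0, max(bins$count) + 1))
axis(1, at = seq_len(nrow(bins)), labels = bins$bin)
```

Because the x positions are row indices rather than bin values, 15 and 100 end up as neighbours on the axis, with the true values shown only as tick labels.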
Thank you.