
I have a dataset with 25 variables and 112,095 observations.

I am trying to plot a graph using 5 of those variables.

Sample dataset

As the image shows, column 1 holds the date, column 2 the process type, column 3 the lower limit, column 4 the upper limit, and column 5 the measured value.

I would like to plot the measured value for each process and draw the upper and lower limits as lines. The sample dataset shows only 3 processes, but in reality I have 14 processes, and I want to display them all on a single panel. Below is a sample image.

Sample image

Could anyone help me get started with this? I am new to R and ggplot.

EDIT: For a single process, here is a sample of what the graph should look like:

Expected output

For example, this graph is for one process: the colored points are the measured values, the green lines above and below them are the upper and lower bounds, and the different point colors indicate different days (Thu, Fri, Sat).

Hi, for your scatter plot, could you clarify what you want on the x and y axes? - whalea
@whalea I have added an edit showing one process. - Jenny

1 Answer


Using some made-up data:

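library(lubridate)
library(dplyr)
library(ggplot2)

# Made-up data: two processes, each with its own lower and upper limit
df <- data.frame(date = as.Date(c("2018-05-04", "2018-05-06", "2018-09-04", "2018-09-07")),
                 process = c("P1", "P1", "P2", "P2"),
                 lower_bound = c(0.5, 0.5, 2.5, 2.5),
                 upper_bound = c(2.5, 2.5, 3.7, 3.7),
                 mv = c(1, 2, 3, 3.2)) %>%
  # Day of the week as a number (by default 1 = Sunday, ..., 7 = Saturday)
  mutate(wd = wday(date))

ggplot(df) +
  # Measured values, coloured by weekday and jittered slightly along x
  geom_jitter(aes(x = wd, y = mv, col = as.factor(wd)), width = 0.1) +
  # Lower and upper limits as green lines
  geom_line(aes(x = wd, y = lower_bound), colour = 'green') +
  geom_line(aes(x = wd, y = upper_bound), colour = 'green') +
  # One panel per process
  facet_wrap(~process, ncol = 3)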

sample plot
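
As a possible variation on the code above, the limits can be drawn with geom_hline() so that each line spans the full width of its panel, even when a process has data for only one weekday. This is only a sketch under some assumptions: the data below is hypothetical, the helper data frame `limits` is a name introduced here, and each process is assumed to have a single constant lower and upper bound. Weekday labels come from wday(label = TRUE) and depend on the locale.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Hypothetical data: two processes measured on Thu, Fri and Sat
df <- data.frame(
  date = as.Date(c("2018-05-03", "2018-05-04", "2018-05-05",
                   "2018-09-06", "2018-09-07", "2018-09-08")),
  process = rep(c("P1", "P2"), each = 3),
  lower_bound = rep(c(0.5, 2.5), each = 3),
  upper_bound = rep(c(2.5, 3.7), each = 3),
  mv = c(1, 2, 1.5, 3, 3.2, 2.9)
) %>%
  # Weekday as a labelled (ordered) factor instead of a number
  mutate(wd = wday(date, label = TRUE))

# One row of limits per process (assumes the bounds are constant within a process)
limits <- df %>% distinct(process, lower_bound, upper_bound)

p <- ggplot(df, aes(x = wd, y = mv)) +
  # Jitter only horizontally so the measured values are not distorted
  geom_jitter(aes(colour = wd), width = 0.1, height = 0) +
  # Horizontal limit lines, drawn once per panel from the limits table
  geom_hline(data = limits, aes(yintercept = lower_bound), colour = "green") +
  geom_hline(data = limits, aes(yintercept = upper_bound), colour = "green") +
  # One panel per process; free y scales because the limits differ by process
  facet_wrap(~process, scales = "free_y") +
  labs(x = "Weekday", y = "Measured value", colour = "Weekday")

print(p)
```

With 14 processes, facet_wrap() should lay the panels out in a grid automatically; ncol can be set if a particular layout is wanted.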