
I've just started learning R and have run into a problem regarding graph construction.

I have a df where str(df) gives

Date : chr  
Hour : int  
Street 1: int  
Street 2: int  
..  
Street 15: int  

where Date covers every day of the month, Hour is every hour of the day, and each street column holds the amount of traffic on that street for that hour.

I want to make a bar chart in ggplot that shows the total amount of traffic for each street over the month, so I can see which street has the heaviest traffic. But when I try making the graph with ggplot, it includes the hour data as well, which ruins the graph.

I looked through the various questions that have already been asked on Stack Overflow and tried melting the data, but either I did that incorrectly or it isn't suitable for my data, as it still didn't work.
I was able to reach a very simplistic solution by doing:

df2 <- colSums(df[3:17], na.rm = TRUE)  # columns 3 to 17 hold the 15 street columns
barplot(df2, las=2, xlab="Street", ylab="Amount of People", main="Pedestrian Traffic For January", cex.lab=0.75, ylim=c(0,1500000))  

But this graph is very basic and I can't modify the x-axis labels.

I also want to make a line graph showing the total amount of traffic per hour for a street. I think that because each hour has multiple values (hour 1 of 1/1, hour 1 of 2/1, etc.), the line graph does not show a single line.

Edit:
head(df): There are more streets but for the sake of formatting, I only posted the data for the first 3 streets.

        Date Hour Street 1 Street 2 Street 3
1 01/01/2014    0     1544      893      404
2 01/01/2014    1     1401      224      179
3 01/01/2014    2      608      127       97
4 01/01/2014    3      360      108       74
5 01/01/2014    4      156       75       33
6 01/01/2014    5       69       20        8
pogibas: Please post head(df)
TLo: Edited in head(df)

1 Answer


As I do not have the actual data you used, I generated a random data set.

library(tidyverse)

# Random data: 31 days x 24 hours, three streets
df <- data.frame(date = rep(seq(31), each = 24),
                 hour = rep(0:23, times = 31),
                 Street1 = rpois(24*31, 5),
                 Street2 = rpois(24*31, 10),
                 Street3 = rpois(24*31, 15))

# Transform to long format
# (one row per date, hour and street; street name in "key", count in "value")
df <- df %>%
  gather(key, value, -date, -hour)

# Create bar chart
# geom_col() stacks all rows of a street, so each bar's height is the monthly total
g <- ggplot(df, aes(x = as.factor(key), y = value))
g <- g + geom_col()
g <- g + xlab("Street") + ylab("Pedestrian Traffic for January")
g

This code produces a bar chart with one bar per street (image not included).

You can easily change the x-labels by changing the factor labels of the column "key".
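
As a minimal sketch of that relabelling: the data frame `totals`, its values and the display names in `street_labels` are all made up for illustration, so substitute your own street names.

```r
library(ggplot2)

# Hypothetical long-format data: one row per street with its monthly total
totals <- data.frame(
  key   = c("Street1", "Street2", "Street3"),
  value = c(3700, 7400, 11200)
)

# Hypothetical display names, in the same order as the original values of "key"
street_labels <- c("Main St", "High St", "Park Ave")

# Recode the factor: `levels` are the existing values, `labels` the names to show
totals$key <- factor(totals$key,
                     levels = c("Street1", "Street2", "Street3"),
                     labels = street_labels)

g <- ggplot(totals, aes(x = key, y = value)) +
  geom_col() +
  xlab("Street") +
  ylab("Pedestrian Traffic for January") +
  # Rotate the tick labels so longer street names do not overlap
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
g
```

If you would rather leave the data untouched, passing the new names to scale_x_discrete(labels = ...) changes only what is drawn on the axis.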

The line chart can be created by the following code:

# Sum the hourly values into one total per day and street
df <- df %>%
  group_by(date, key) %>%
  summarise(value = sum(value), .groups = "drop")

g <- ggplot(df, aes(x = date, y = value, color = key))
g <- g + geom_line()
g

This produces a line chart of daily totals with one line per street (image not included).
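
If you want the total per hour of the day instead of per day, the same idea applies with hour as the grouping variable. The sketch below is self-contained and again uses simulated data, so the column names `Street1` to `Street3` and the Poisson counts are assumptions. It uses pivot_longer(), the newer tidyr equivalent of gather().

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)
# Simulated data (an assumption): 31 days x 24 hours, three streets
df <- data.frame(
  date    = rep(1:31, each = 24),
  hour    = rep(0:23, times = 31),
  Street1 = rpois(24 * 31, 5),
  Street2 = rpois(24 * 31, 10),
  Street3 = rpois(24 * 31, 15)
)

# Long format, then one total per hour of day and street (summed over all days)
hourly <- df %>%
  pivot_longer(starts_with("Street"), names_to = "key", values_to = "value") %>%
  group_by(hour, key) %>%
  summarise(value = sum(value), .groups = "drop")

# One row per hour and street means geom_line() draws a single line per street
g <- ggplot(hourly, aes(x = hour, y = value, color = key)) +
  geom_line() +
  xlab("Hour of day") +
  ylab("Total traffic over the month")
g
```

Summing first is what collapses the repeated hour values (hour 1 of every day) into a single point per hour, which is why each street ends up as one line.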