0
votes

I have a data frame of car models per country with an associated value, which looks like this:

Car      Country      Value
Audi A6  US           23
Audi A6  UK           12
Audi A6  DE           19
BMW X5   UK           8
BMW X5   DE           5
etc

Now I want to make a histogram of the Value column, and I also want the colour of the bars to indicate, for example, whether a bar contains a large number of Audi A6 models.

I know how to make a histogram using ggplot:

qplot(beta_0jk[data$Value], 
  geom="histogram", fill=I("lightblue"))

But does someone know how I can let the colour depend on the Car or Country columns in this dataframe? Or does someone know a different way than a histogram for visualizing this?

2
Maybe a bar plot? - Rui Barradas
That is indeed a good alternative, but do you know how to get the y-axis of the bar plot to show counts, just like in a histogram? - Activation

2 Answers

1
votes

First of all, I would seriously recommend looking up the cheat sheets for R, which are very conveniently placed here

Personally, I'm more used to writing the full version of the ggplot function, because it is clearer once you become more familiar with this library.

Problem
First you need to understand the idea behind histograms: a histogram is useful when you don't already have a value and want to compute the count or density of some characteristic. In your case you just need simple dots to represent the values you already have in your data frame. That is easy to do with some understanding of ggplot.
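For contrast, here is a minimal sketch of what a histogram computes: it bins Value and counts the rows in each bin, and mapping fill to Car splits each bar by model. The rows are copied from the question; the data frame name cars_df and binwidth = 5 are assumptions chosen only for illustration.

```r
library(ggplot2)

# Example rows copied from the question; `cars_df` is a hypothetical name.
cars_df <- data.frame(
  Car     = c("Audi A6", "Audi A6", "Audi A6", "BMW X5", "BMW X5"),
  Country = c("US", "UK", "DE", "UK", "DE"),
  Value   = c(23, 12, 19, 8, 5)
)

# Histogram of Value: the y-axis is the number of rows per bin,
# and each bar is stacked by Car so the fill shows which models are in it.
# binwidth = 5 is an arbitrary choice for this tiny sample.
p_hist <- ggplot(cars_df, aes(x = Value, fill = Car)) +
  geom_histogram(binwidth = 5, colour = "white")

p_hist
```

With only a handful of rows the bars are very sparse, which is why plotting the values directly, as described next, may be more readable.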

Aesthetics
When you use ggplot() function it takes some basic arguments.

ggplot(data = NULL, mapping = aes(), ..., environment = parent.frame())  

The data you provide is simply the whole beta_0jk data frame. The mapping ties plot elements to columns of that data frame, so you need to specify them:

x - the variable to group your values by; I would use "Car" here to identify the model
y - "Value", the variable you measured, so it goes on the y axis
col - again a grouping, but it works differently from x: it gives each group its own colour. For distinct colours per group the column should be a factor or character vector; a numeric column is mapped to a continuous colour gradient instead

Implementation

ggplot2::ggplot(beta_0jk,ggplot2::aes(
  x = Car,
  y = Value,
  col = Country)
) + geom_jitter()

Start from this and use the ggplot2 cheat sheet to get your desired result, because to be honest I don't know exactly what you want to show. I also recommend looking up the dplyr and tidyr libraries.

1
votes

Is this what you are looking for? To have all bars of the same width I had to fill the data with an extra row, since there is no Country == 'US' when Car == 'BMW X5'. The data preparation pipe %>% was completely inspired by this answer.

library(tidyverse)  # attaches ggplot2 and tidyr, and makes %>% available

data %>% 
  spread(key = Car, value = Value, fill = NA) %>% 
  gather(key = Car, value = Value, -Country) %>% 
  ggplot(aes(x = Car, y = Value, fill = Country)) +
  geom_col(position = position_dodge())
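As an alternative sketch (not the method used above), tidyr::complete() can add the missing Car/Country combination in one step instead of the spread()/gather() round trip. The names cars_df and cars_full are hypothetical; the rows are the ones from the question.

```r
library(tidyr)
library(ggplot2)

# Example rows copied from the question; `cars_df` is a hypothetical name.
cars_df <- data.frame(
  Car     = c("Audi A6", "Audi A6", "Audi A6", "BMW X5", "BMW X5"),
  Country = c("US", "UK", "DE", "UK", "DE"),
  Value   = c(23, 12, 19, 8, 5)
)

# complete() adds one row for every Car/Country pair that is absent,
# with Value set to NA (here: BMW X5 in the US).
cars_full <- complete(cars_df, Car, Country)

# Same dodged bar plot as above; the NA row reserves the empty slot,
# so all bars keep the same width (ggplot2 warns about the missing value).
p_bar <- ggplot(cars_full, aes(x = Car, y = Value, fill = Country)) +
  geom_col(position = position_dodge())

p_bar
```

Another option that avoids padding the data is position_dodge(preserve = "single"), which keeps the bar width constant when a group is missing.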

Data.

data <- read.table(text = "
Car      Country      Value
'Audi A6'  US           23
'Audi A6'  UK           12
'Audi A6'  DE           19
'BMW X5'   UK           8
'BMW X5'   DE           5
", header = TRUE)