This may be an easy question, but I am still a beginner in R.

I need to calculate the correlation coefficient between each pair of the three numeric columns in my data frame and plot the results.

I want the correlations between columns 2 and 3, columns 2 and 4, and columns 3 and 4.

Thanks a lot in advance.

1 Answer

You could use the following code. I recreated the first three rows of your data set and put them in a data frame called "mydata":

# Recreate the first three rows of the data set
cname <- c("Albania", "Argentina", "Australia")
economic_growth_rate <- c(75.67, 6.87, 24.22)
ave_HDI_rate <- c(8.69, 7.03, 3.61)
ave_raw_EPI_growth_percentage <- c(16.61, -12.39, -1.77)
mydata <- data.frame(cname, economic_growth_rate, ave_HDI_rate, ave_raw_EPI_growth_percentage)

# Pairwise correlations of the numeric columns (2 to 4)
cor(mydata[, 2:4])

This returns a 3 x 3 correlation matrix. By default, cor computes Pearson correlation coefficients.

The last line in the above code selects columns 2 through 4 from the data frame mydata and passes them to the function cor.
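To pull out the three coefficients the question asks for, the matrix can be indexed by row and column name. The following is a minimal sketch using the same three-row sample data; the names cormat, r_23, r_24, r_34 and r_23_direct are chosen here for illustration only.

```r
# Same three-row sample data as in the answer
mydata <- data.frame(
  cname = c("Albania", "Argentina", "Australia"),
  economic_growth_rate = c(75.67, 6.87, 24.22),
  ave_HDI_rate = c(8.69, 7.03, 3.61),
  ave_raw_EPI_growth_percentage = c(16.61, -12.39, -1.77)
)

# Correlation matrix of columns 2 to 4
cormat <- cor(mydata[, 2:4])

# Read individual coefficients off the matrix by name
r_23 <- cormat["economic_growth_rate", "ave_HDI_rate"]                   # columns 2 & 3
r_24 <- cormat["economic_growth_rate", "ave_raw_EPI_growth_percentage"]  # columns 2 & 4
r_34 <- cormat["ave_HDI_rate", "ave_raw_EPI_growth_percentage"]          # columns 3 & 4

# The same coefficient from a direct call on two vectors
r_23_direct <- cor(mydata$economic_growth_rate, mydata$ave_HDI_rate)
```

The matrix is symmetric with ones on the diagonal, so each pair appears twice; reading one triangle is enough.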

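You could render a bar plot like this (cordf[, 1] is the first column of the matrix, i.e. the correlations of economic_growth_rate with each of the three variables, including itself):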

cordf <- cor(mydata[ , 2:4])
barplot(cordf[,1])
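If you would rather plot exactly the three pairs from the question (2 & 3, 2 & 4, 3 & 4), one possible approach is to take the upper triangle of the matrix. This is only a sketch with the same sample data; idx and pair_cor are illustrative names, and the bar labels assume the numeric columns sit in positions 2 to 4 of the data frame.

```r
# Same three-row sample data as in the answer
mydata <- data.frame(
  cname = c("Albania", "Argentina", "Australia"),
  economic_growth_rate = c(75.67, 6.87, 24.22),
  ave_HDI_rate = c(8.69, 7.03, 3.61),
  ave_raw_EPI_growth_percentage = c(16.61, -12.39, -1.77)
)

cormat <- cor(mydata[, 2:4])

# Row/column positions of the upper triangle: one entry per unique pair
idx <- which(upper.tri(cormat), arr.ind = TRUE)

# Extract the three coefficients with matrix indexing
pair_cor <- cormat[idx]

# Label each bar with the original column numbers (matrix index + 1,
# because column 1 of mydata holds the country names)
names(pair_cor) <- paste(idx[, "row"] + 1, idx[, "col"] + 1, sep = " & ")

# Correlations lie in [-1, 1], so fix the y-axis to that range
barplot(pair_cor, ylim = c(-1, 1), xlab = "Column pair", ylab = "Correlation")
```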

For more information, type the following in the console:

?cor
?barplot

Alternatively, you could look at the packages corrgram and corrplot, which are designed for plotting correlation matrices.
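As a rough illustration of the corrplot route, the sketch below draws the upper triangle of the matrix with the coefficients printed as numbers. It assumes the corrplot package is installed (install.packages("corrplot")) and skips the plot otherwise; the sample data and the name cormat are the same illustrative ones as before.

```r
# Same three-row sample data as in the answer
mydata <- data.frame(
  cname = c("Albania", "Argentina", "Australia"),
  economic_growth_rate = c(75.67, 6.87, 24.22),
  ave_HDI_rate = c(8.69, 7.03, 3.61),
  ave_raw_EPI_growth_percentage = c(16.61, -12.39, -1.77)
)

cormat <- cor(mydata[, 2:4])

# Only plot if corrplot is available (assumption: installed from CRAN)
if (requireNamespace("corrplot", quietly = TRUE)) {
  # method = "number" prints each coefficient; type = "upper" shows each pair once
  corrplot::corrplot(cormat, method = "number", type = "upper")
} else {
  message("Package 'corrplot' is not installed; skipping the plot.")
}
```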