1
votes

This is a sample of the dataset I'm working on, where I'm running a Pearson correlation test between the variables step and z:

> head(datacorr)
  Date & Time [Local]  Latitude Longitude     step   x   y         z
1 2018-06-18 15:32:00 -2.436589  34.81398 4410.099  14  10  18.24621
2 2018-06-18 15:36:00 -2.438691  34.81222 4620.307  11  15  18.60108
3 2018-06-18 15:40:00 -2.438472  34.81164 4682.904 112 164 198.84468
4 2018-06-18 15:44:00 -2.437794  34.81141 4702.586  90 278 293.42787
5 2018-06-18 15:48:00 -2.437766  34.81177 4662.585  11   7  13.05272
6 2018-06-18 15:52:00 -2.437416  34.81284 4541.207  16   2  16.17849

I have no issues running the test and creating a basic plot(), but I would like a more detailed visualization using ggscatter() from the ggpubr package. Here's my script with its output:

> corre<-cor.test(datacorr$step, datacorr$z, method=c("pearson"))
> print(corre)

    Pearson's product-moment correlation

data:  datacorr$step and datacorr$z
t = -6.2382, df = 15021, p-value = 4.546e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.06676964 -0.03487023
sample estimates:
       cor 
-0.0508329 

> plot(datacorr$step,datacorr$z)
> step<-datacorr$step
> activityz<-datacorr$z
> library("ggpubr")
> ggscatter(datacorr, x = step, y = activityz, 
+           add = "reg.line", conf.int = TRUE, 
+           cor.coef = TRUE, cor.method = "pearson",
+           xlab = "Step Length", ylab = "Activity Z")
Error in .check_data(data, x, y, combine = combine | merge != "none") : 
  Can't find the y elements in the data.

I based the ggscatter() code on another post. Does anyone know why I keep getting errors? I'm new to R, but it looks to me like I'm defining all of the arguments correctly. If you have any alternatives for visualizing Pearson correlation tests in R (featuring the line, r coefficient, p-value, etc.), I'm open to suggestions.

Any help is appreciated!

1
ggscatter(datacorr, x = step, y = z, [...] )? - heck1

I'm getting a "not found" error using y = z. I had already tried that with no luck, which is why I created the object activityz:
> plot(datacorr$step,datacorr$z)
> library("ggpubr")
> ggscatter(datacorr, x = step, y = z,
+           add = "reg.line", conf.int = TRUE,
+           cor.coef = TRUE, cor.method = "pearson",
+           xlab = "Step Length", ylab = "Activity Z")
Error in ggscatter(datacorr, x = step, y = z, add = "reg.line", conf.int = TRUE, :
  object 'z' not found
- juansalix

You should pass x and y as strings, like this: ggscatter(datacorr, x = 'step', y = 'z', ....... - DS_UNI

Take a look at the examples in the ggscatter documentation. - DS_UNI

@DS_UNI Thank you, that worked! By the way, I added color="red" to change the color of the line. However, both the line and the plotted points turned red. Any idea how to format the color of the line only? - juansalix

1 Answer

3
votes

The examples in the documentation of the function ggscatter show that you have to pass the x and y arguments as strings naming columns of the data frame. The documentation also states (in answer to your comment above) that you can use add.params to style the regression line.
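As a quick sanity check before plotting, you can confirm that the names you intend to pass actually exist as columns. The sketch below rebuilds only the step and z columns from the sample rows in the question; the helper variables has_z and has_activityz are made up for illustration. It shows that activityz is a free-standing vector in the workspace rather than a column of datacorr, which is consistent with the "Can't find the y elements in the data" error.

```r
# Two columns copied from the question's sample rows.
datacorr <- data.frame(
  step = c(4410.099, 4620.307, 4682.904, 4702.586, 4662.585, 4541.207),
  z    = c(18.24621, 18.60108, 198.84468, 293.42787, 13.05272, 16.17849)
)

# A free-standing vector in the workspace; it is not a column of datacorr.
activityz <- datacorr$z

# Check the candidate names against the data frame's column names.
has_z         <- "z" %in% names(datacorr)          # "z" is a column
has_activityz <- "activityz" %in% names(datacorr)  # "activityz" is not
print(c(z = has_z, activityz = has_activityz))
```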

Try this:

ggscatter(datacorr, x = 'step', y = 'z', 
          color = 'red',   # for the points
          add = "reg.line", 
          add.params = list(color = "blue", fill = "lightgray"),  # for the line
          conf.int = TRUE, 
          cor.coef = TRUE, cor.method = "pearson",
          xlab = "Step Length", ylab = "Activity Z")
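Since you asked about alternatives: a similar figure can probably be built with ggplot2 directly (ggpubr is built on top of it). This is only a sketch, not the ggpubr approach. It assumes a data frame with numeric columns step and z (rebuilt here from the sample rows), takes the coefficient and p-value from cor.test(), and places the label in the top-left corner. The label format and the names ct, label and p are my own choices.

```r
library(ggplot2)

# Two columns copied from the question's sample rows (assumption: the full
# data set has the same column names).
datacorr <- data.frame(
  step = c(4410.099, 4620.307, 4682.904, 4702.586, 4662.585, 4541.207),
  z    = c(18.24621, 18.60108, 198.84468, 293.42787, 13.05272, 16.17849)
)

# Pearson test; the estimate and p-value are reused for the plot label.
ct <- cor.test(datacorr$step, datacorr$z, method = "pearson")
label <- sprintf("R = %.2f, p = %.2g", ct$estimate, ct$p.value)

p <- ggplot(datacorr, aes(x = step, y = z)) +
  geom_point(color = "red") +                      # points
  geom_smooth(method = "lm", formula = y ~ x,      # linear fit with
              se = TRUE,                           # confidence band
              color = "blue", fill = "lightgray") +
  annotate("text", x = -Inf, y = Inf, label = label,
           hjust = -0.1, vjust = 1.5) +            # top-left corner
  labs(x = "Step Length", y = "Activity Z")

print(p)
```

Because the points and the fitted line are separate layers here, their colors are set independently, which is the same separation add.params gives you in ggscatter().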

Used data:

datacorr <- read.table(text = "Date Time  Latitude Longitude     step   x   y         z
1 2018-06-18 15:32:00 -2.436589  34.81398 4410.099  14  10  18.24621
2 2018-06-18 15:36:00 -2.438691  34.81222 4620.307  11  15  18.60108
3 2018-06-18 15:40:00 -2.438472  34.81164 4682.904 112 164 198.84468
4 2018-06-18 15:44:00 -2.437794  34.81141 4702.586  90 278 293.42787
5 2018-06-18 15:48:00 -2.437766  34.81177 4662.585  11   7  13.05272
6 2018-06-18 15:52:00 -2.437416  34.81284 4541.207  16   2  16.17849
", header = TRUE)