For the iris data, a scatterplot matrix can be drawn with the pairs() function as follows:

pairs(iris[1:4], 
      main = "Edgar Anderson's Iris Data", 
      lower.panel=panel.pearson, 
      pch = 21, 
      bg = c("red", "green3", "blue")[unclass(iris$Species)])

Here the function panel.pearson is defined as follows:

panel.pearson <- function(x, y, ...) {
  # Centre of the current panel in user coordinates
  horizontal <- (par("usr")[1] + par("usr")[2]) / 2
  vertical <- (par("usr")[3] + par("usr")[4]) / 2
  # Print the absolute Pearson correlation at the panel centre
  text(horizontal, vertical, format(abs(cor(x, y)), digits=2))
}

I need the lower panels to show the correlation matrix, and I want to remove the variable labels from the diagonal and place them along the right and bottom axes instead. I tried the following:

pairs(iris[1:4], 
      main = "Edgar Anderson's Iris Data", 
      labels=NULL, 
      lower.panel=panel.pearson, 
      xaxt='n', 
      yaxt='n', 
      pch = 21, 
      bg = c("red", "green3", "blue")[unclass(iris$Species)])

This gives me most of what I need, except that I do not understand how to get the variable labels (Sepal.Length, Sepal.Width, etc.) onto the bottom and right axes. Any help is tremendously appreciated. Thanks!

Could you please provide reproducible code? My R doesn't know panel.pearson. - Stephan Kolassa
Thanks Stephan. I missed the function and have now added it. Best! - Nerdstat

1 Answer

Is this what you had in mind?

# Horizontal axis
text(seq(.2, 2, length.out=4), 0,
     c("Sepal Length","Sepal Width","Petal Length","Petal Width"),
     xpd=TRUE, adj=c(0,.5), cex=.9)

# Vertical axis
text(0, seq(0.35, 2.05, length.out=4),
     rev(c("Sepal Length","Sepal Width","Petal Length","Petal Width")),
     xpd=TRUE, adj=c(0.5, 0), 
     srt=90,  # rotates text to be parallel to axis
     cex=.9)

I positioned the labels by trial and error. There's probably a better way, but at least this gets the labels in (nearly) the right place.

Update: A new SO question gave me an idea for a slightly better way to position the axis labels. As the linked answer points out, you can get the current coordinates of the plot area with par('usr'). So here's an update to the code, based on that:

x.coords = par('usr')[1:2]
y.coords = par('usr')[3:4]

# Offset is estimated distance between edge of plot area and beginning of actual plot
x.offset = 0.03 * (x.coords[2] - x.coords[1])  
xrng =  (x.coords[2] - x.coords[1]) - 2*x.offset
x.width = xrng/4  

y.offset = 0.028 * (y.coords[2] - y.coords[1])
yrng =  (y.coords[2] - y.coords[1]) - 2*y.offset
y.width = yrng/4  

# seq function below calculates the location of the midpoint of each panel

# x-axis labels
text(seq(x.coords[1] + x.offset + 0.5*x.width, x.coords[2] - x.offset - 0.5*x.width,
         length.out=4), 0,
     c("Sepal Length","Sepal Width","Petal Length","Petal Width"),
     xpd=TRUE,adj=c(.5,.5), cex=.9)

# y-axis labels
text(0, seq(y.coords[1] + y.offset + 0.5*y.width, y.coords[2] - 3*y.offset - 0.5*y.width,
            length.out=4),
     rev(c("Sepal Length","Sepal Width","Petal Length","Petal Width")),
     xpd=TRUE, adj=c(0.5, 0.5), 
     srt=90,  # rotates text to be parallel to axis
     cex=.9)

It's still not ideal, because the size of the offset is found by trial and error. If someone knows how R determines the gap between the boundary of the plot area and where the panels actually begin, the offset could be computed programmatically as well.
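
To make the midpoint arithmetic in the update easier to reuse, it could be wrapped in a small helper. This is only a sketch: panel_midpoints is a hypothetical name (not part of base R), and the offset fraction passed to it is still the trial-and-error value from above, not something R reports.

```r
# Hypothetical helper (not part of base R): midpoints of n equal-width panels
# laid out between lims[1] and lims[2], after trimming a margin of
# offset_frac * (lims[2] - lims[1]) from each end.
panel_midpoints <- function(lims, offset_frac, n) {
  span   <- lims[2] - lims[1]
  offset <- offset_frac * span
  width  <- (span - 2 * offset) / n
  lims[1] + offset + width * (seq_len(n) - 0.5)
}

# With no offset, four panels on [0, 1] are centred at 1/8, 3/8, 5/8, 7/8
panel_midpoints(c(0, 1), 0, 4)
```

After the pairs() call, panel_midpoints(par('usr')[1:2], 0.03, 4) should give the same x positions as the seq() call used for the x-axis labels. The y-axis call above uses an asymmetric offset at the top (3*y.offset), so it would need its own adjustment.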