2
votes

I have x-y data in which both variables have symmetric +/- errors. The errors are normally distributed in both the x and y directions. At the moment we plot the data either as typical x-y error crosses or using geom_rect(), but both have issues in showing what the data represents. I am looking for a solution that would allow each x-y data point to be drawn as some sort of normal/Gaussian distribution (instead of just as a +), as in my rough sketch below.

x-y plot with normal distributions for both errors

Below is an example data frame.

structure(list(Age = c(2003L, 1999L, 1995L, 1993L, 1993L, 1990L, 1988L, 1987L, 1985L, 1984L, 1983L, 1975L, 1974L, 1972L, 1963L, 1960L, 1959L, 1957L, 1953L, 1951L, 1951L, 1946L, 1940L, 1936L, 1930L, 1927L, 1919L, 1914L, 1906L, 1885L, 1864L, 1842L, 1830L, 1810L, 1803L, 1783L, 1762L, 1741L, 1720L, 1699L, 1678L, 1657L ), Age_error = c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 4L, 2L, 2L, 2L, 3L, 5L, 3L, 3L, 4L, 6L, 4L, 8L, 5L, 7L, 5L, 10L, 14L, 17L, 23L, 21L, 20L, 53L, 67L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L), Value = c(0, 0.07, 0, 0.09, 0.02, 0.06, -0.02, 0.154, 0.05, 0.02, -0.03, -0.024, -0.01, -0.06, -0.15, -0.04, 0.065, -0.1, -0.09, -0.02, -0.024, -0.11, -0.081, -0.13, -0.12, -0.07, -0.16, -0.122, -0.057, -0.18, -0.095, -0.105, -0.23, -0.19, -0.178, -0.267, -0.26, -0.158, -0.079, -0.218, -0.148, -0.193), Value_error = c(0.17, 0.143, 0.18, 0.18, 0.17, 0.19, 0.18, 0.163, 0.19, 0.18, 0.18, 0.142, 0.17, 0.18, 0.17, 0.17, 0.152, 0.17, 0.17, 0.17, 0.151, 0.17, 0.154, 0.17, 0.18, 0.26, 0.17, 0.144, 0.145, 0.18, 0.153, 0.153, 0.17, 0.18, 0.144, 0.155, 0.138, 0.141, 0.157, 0.14, 0.147, 0.137)), .Names = c("Age", "Age_error", "Value", "Value_error"), class = "data.frame", row.names = c(NA, -42L))

This is the sort of code I am using to just get a typical x-y error plot for this data frame.

library(ggplot2)
# vertical bars show the Value error, horizontal bars show the Age error
ggplot(mydata, aes(x = Age, y = Value)) +
  geom_linerange(aes(ymin = Value - Value_error, ymax = Value + Value_error)) +
  geom_errorbarh(aes(xmin = Age - Age_error, xmax = Age + Age_error))

I haven't found a function yet to do x-y normal distribution type plots and there might not be one, but thought someone might have some ideas! Many thanks in advance.

2
Why is it a cross and not a disc/ellipse (still with a radial gradient)? - baptiste
It could be an ellipse, but I sketched it as a cross (by overlaying two ellipses) because the corners between the two ellipses would need to be somewhat convex when two bell curves are sketched over the top of each other. If I could plot the x error as one ellipse and the y error as another, it would probably create this effect if I set alpha=0.5 or similar. Is there a way to plot shaded ellipses for the symbols? - ahsat

2 Answers

0
votes

Do you want a contour plot of Age versus Value as a 2d kernel density?

library(MASS)
# dat is the data frame from the question
dens <- with(dat, MASS::kde2d(Age, Value))
str(dens)
# List of 3
#  $ x: num [1:25] 1657 1671 1686 1700 1715 ...
#  $ y: num [1:25] -0.267 -0.249 -0.232 -0.214 -0.197 ...
#  $ z: num [1:25, 1:25] 0.00152 0.00187 0.00226 0.00267 0.00312 ...
# kde2d is designed for contour display: x-vector, y-vector, z-matrix
contour(dens)

I added the data points so that the connection between the contour plot and the data is more visible:

points(dat$Age, dat$Value, cex=0.3, col="red")
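Note that kde2d estimates the density of the points themselves and does not use the Age_error or Value_error columns. If the goal is to show each point as its own Gaussian, one possible approach is to evaluate a Gaussian kernel per point on a grid and sum them. The sketch below is only an illustration under stated assumptions: the error columns are treated as standard deviations, the x and y errors are assumed independent, each kernel is scaled to a peak of 1 so that imprecise points stay visible, and `gaussian_surface` is a hypothetical helper name, not a function from any package. The four rows are rows 1, 12, 23 and 30 of the question's data, retyped to keep the example self-contained.

```r
# Hypothetical helper: sums one axis-aligned Gaussian kernel per data point.
# sx and sy are treated as standard deviations (an assumption), and each
# kernel is scaled so that its peak is 1 rather than integrating to 1.
gaussian_surface <- function(x, y, sx, sy, n = 200, pad = 3) {
  gx <- seq(min(x - pad * sx), max(x + pad * sx), length.out = n)
  gy <- seq(min(y - pad * sy), max(y + pad * sy), length.out = n)
  z <- matrix(0, n, n)
  for (i in seq_along(x)) {
    kx <- exp(-0.5 * ((gx - x[i]) / sx[i])^2)  # profile along x
    ky <- exp(-0.5 * ((gy - y[i]) / sy[i])^2)  # profile along y
    z <- z + outer(kx, ky)                     # independent x and y errors
  }
  list(x = gx, y = gy, z = z)
}

# Rows 1, 12, 23 and 30 of the question's data
pts <- data.frame(
  Age         = c(2003, 1975, 1940, 1885),
  Age_error   = c(1, 4, 8, 23),
  Value       = c(0, -0.024, -0.081, -0.18),
  Value_error = c(0.17, 0.142, 0.154, 0.18)
)

surf <- with(pts, gaussian_surface(Age, Value, Age_error, Value_error))

# image() accepts a list with x, y and z components, like contour()
image(surf, col = colorRampPalette(c("white", "steelblue"))(50),
      xlab = "Age", ylab = "Value")
points(pts$Age, pts$Value, pch = 3)
```

Replacing the two `exp(...)` lines with `dnorm(gx, x[i], sx[i])` and `dnorm(gy, y[i], sy[i])` would give true probability densities instead, at the cost of points with large errors appearing much fainter than precise ones.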

0
votes

If you need each (Age, Value) pair to carry a positive and a negative error, then I think you may be looking for the smoothScatter function. It plots a smoothed 2D density of the points, using a color scheme that fades as you get farther from them.

smoothScatter(mydata$Age, mydata$Value)
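smoothScatter applies the same smoothing to every point, so the per-point Age_error and Value_error values are not used. Following the ellipse idea raised in the comments, here is a minimal base-graphics sketch that stacks semi-transparent ellipses sized by each point's own errors, which gives a rough fading effect. It is an approximation rather than a true Gaussian gradient. `ellipse_xy` is a hypothetical helper, the multiples of the error (0.5 to 2) and the alpha value are arbitrary choices, and the four rows are rows 1, 12, 23 and 30 of the question's data.

```r
# Hypothetical helper: outline of an axis-aligned ellipse centred on (x0, y0)
# with half-width a in the x direction and b in the y direction.
ellipse_xy <- function(x0, y0, a, b, n = 100) {
  t <- seq(0, 2 * pi, length.out = n)
  list(x = x0 + a * cos(t), y = y0 + b * sin(t))
}

# Rows 1, 12, 23 and 30 of the question's data
pts <- data.frame(
  Age         = c(2003, 1975, 1940, 1885),
  Age_error   = c(1, 4, 8, 23),
  Value       = c(0, -0.024, -0.081, -0.18),
  Value_error = c(0.17, 0.142, 0.154, 0.18)
)

# Empty plot sized to hold the largest (2 x error) ellipses
plot(pts$Age, pts$Value, type = "n",
     xlim = range(pts$Age - 2 * pts$Age_error, pts$Age + 2 * pts$Age_error),
     ylim = range(pts$Value - 2 * pts$Value_error,
                  pts$Value + 2 * pts$Value_error),
     xlab = "Age", ylab = "Value")

shade <- adjustcolor("steelblue", alpha.f = 0.2)
for (i in seq_len(nrow(pts))) {
  for (k in c(2, 1.5, 1, 0.5)) {  # multiples of the error, outermost first
    e <- ellipse_xy(pts$Age[i], pts$Value[i],
                    k * pts$Age_error[i], k * pts$Value_error[i])
    polygon(e$x, e$y, col = shade, border = NA)  # overlaps darken the centre
  }
}
points(pts$Age, pts$Value, pch = 3)
```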
