
I am attempting to create a surface plot of some randomized data I have, and I'm running into an issue where plot_ly is plotting the id column of the matrix.

Below is the code and a subsection of the random data.


library(readxl)  # read_excel()
library(plotly)  # plot_ly()

random_data <- read_excel("Regression_Builder.xlsx", sheet = "Yield")
lm.O1 = lm(O1 ~ X1 + X2 + X3 + I(X1^2) + I(X3^2), data = random_data)

three_dims = data.frame(random_data$O1, random_data$X1, random_data$X2)
three_dims_mat = data.matrix(three_dims, rownames.force = NA)
#I saw a post that mentioned that using data.matrix can lead to issues and to instead use cbind.
#I attempted that and got the same results.

O1_surface = plot_ly(z = three_dims_mat[,1:3], type = "surface")
#I also tried with z = ~three . . . and also without the [,1:3]. Neither of these helped.


As you can see, there are three columns of data with one ID column. Additionally, the only column that gets anywhere near 5000 is the id column.

When I create the surface plot, I get this graph: Surface Plot

The x and y axes are definitely off, and it appears the y axis is simply the id column?

I'm very new to R, so I was really just following another page's instructions, which can be seen here: https://plotly.com/r/3d-surface-plots/

They don't seem to be doing anything differently from what I'm doing. The data they are using comes directly from plotly, and "volcano" is structured like a matrix similar to mine.

I appreciate any help you can give!

edit: someone asked for a sample of the data. Here's the first 20 data points.

    X1               X2             X3
1   -568.4093212    -306.6656476    35.08753966
2   -758.2562177    -310.9201146    32.64751489
3   -467.4339846    -364.0556644    34.09746155
4   -529.7232277    -310.837259     36.28913812
5   -535.9391621    -323.411462     39.75818106
6   -494.4654867    -386.835529     30.5269416
7   -490.3442684    -363.7089394    33.8776127
8   -392.6493419    -327.10129      31.22857484
9   -720.6745211    -339.3230459    35.09282461
10  -425.0705298    -324.8479801    32.0451123
11  -529.9568075    -317.8269927    35.48054421
12  -445.4251925    -422.9827843    34.80734687
13  -730.3447224    -307.6357161    33.58775347
14  -309.4192505    -434.2465323    29.17980084
15  -609.6549563    -382.4879761    31.16542379
16  -731.8211673    -345.8748154    32.76108565
17  -745.736109     -299.1330659    36.46136652
18  -589.5006466    -368.9677558    31.87794536
19  -655.5712467    -344.9485136    32.50361267
20  -536.5405239    -401.9952118    30.72522988

I hope that helps. Thanks!

Please include a sample of your data. – vestland

1 Answer


OK, so I figured out the root cause here.

As I feared, it was mostly a misunderstanding of how the plot_ly function works. I had assumed it would work similar to a 3D scatter plot, which in retrospect doesn't make much sense.
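
For comparison, here is a minimal sketch of the 3D scatter approach that does accept one row per observation. The `pts` data frame is a made-up stand-in for the O1, X1 and X2 columns (it is not the data from the spreadsheet), and the plotly call is skipped if the package is not installed.

```r
# Hypothetical stand-in for the O1 / X1 / X2 columns; not the real data.
set.seed(1)
pts <- data.frame(
  X1 = runif(50, -760, -300),
  X2 = runif(50, -435, -300)
)
pts$O1 <- 0.01 * pts$X1 + 0.02 * pts$X2 + rnorm(50)

# A scatter3d trace draws one marker per row, so x, y and z can all be
# taken straight from columns of the data frame.
if (requireNamespace("plotly", quietly = TRUE)) {
  scatter_fig <- plotly::plot_ly(
    data = pts, x = ~X1, y = ~X2, z = ~O1,
    type = "scatter3d", mode = "markers"
  )
}
```

A surface trace cannot take the same three columns, because it expects heights laid out on a grid rather than a list of points.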

With type = "surface", plot_ly requires a matrix of height values only. The row and column positions of this matrix stand in for the Y and X "inputs" of the function you are trying to plot, and each cell holds the Z height at that position. I discovered this by playing around with a simple example:

L = cbind(c(1, 1, 1), c(1, 2, 1), c(1, 1, 1))
plot_l = plot_ly(z = L, type = "surface")

Which results in this:

Simple Surface Example
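
When no x or y is supplied, the axes appear to be just the column and row indices of the matrix, which would explain why an n-by-3 matrix shows the row number along the y axis. A small sketch of supplying real axis values alongside the height matrix follows. The names `x_vals`, `y_vals` and `heights` are made up for illustration, and the plotly call is guarded in case the package is missing.

```r
# Small height grid: 3 rows (y positions) by 4 columns (x positions).
x_vals <- c(10, 20, 30, 40)
y_vals <- c(100, 200, 300)
heights <- outer(y_vals, x_vals, function(y, x) x + y / 100)

# For a surface trace, rows of z line up with y and columns with x.
stopifnot(nrow(heights) == length(y_vals),
          ncol(heights) == length(x_vals))

if (requireNamespace("plotly", quietly = TRUE)) {
  grid_fig <- plotly::plot_ly(
    x = x_vals, y = y_vals, z = heights, type = "surface"
  )
}
```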

For anyone who runs across this problem and wants to see how I got the surface plot I was wanting, below is the code and plot. The plot is relatively anticlimactic, and the x and y axes are not technically correct, but it's a start!

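X1_V = -476:-233
X2_C = 33.96
X3_V = 185:423

# Predict O1 over X1 and X3 while holding X2 fixed at X2_C
pred_func = function(x1, x3) predict(lm.O1, newdata = data.frame(X1 = x1, X2 = X2_C, X3 = x3))

# pred_mat was not defined in the snippet as posted; evaluating pred_func over the grid with outer() is the assumed step
pred_mat = outer(X1_V, X3_V, pred_func)

O1_surface = plot_ly(z = pred_mat, type = "surface")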

Accurate Surface Plot
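
One possible way to get correct x and y axes is sketched below, under some loud assumptions: `fake_data` is synthetic (the real values live in the Excel sheet), the grid is coarser than the 1-unit steps above, and the prediction matrix is built with `outer()` and then transposed so that its rows follow the y axis. Treat it as a starting point rather than the definitive fix.

```r
# Synthetic stand-in for random_data; coefficients here are invented.
set.seed(42)
n <- 200
fake_data <- data.frame(
  X1 = runif(n, -476, -233),
  X2 = runif(n, 30, 40),
  X3 = runif(n, 185, 423)
)
fake_data$O1 <- 5 + 0.02 * fake_data$X1 + 0.5 * fake_data$X2 +
  0.01 * fake_data$X3 + rnorm(n)

fit <- lm(O1 ~ X1 + X2 + X3 + I(X1^2) + I(X3^2), data = fake_data)

# Grid over X1 and X3, holding X2 fixed.
x1_grid <- seq(-476, -233, length.out = 50)
x3_grid <- seq(185, 423, length.out = 40)
x2_fixed <- 33.96

pred_func <- function(x1, x3) {
  predict(fit, newdata = data.frame(X1 = x1, X2 = x2_fixed, X3 = x3))
}

# outer() returns rows = x1 and columns = x3; transpose so that rows
# follow the y axis (x3) and columns follow the x axis (x1).
pred_grid <- t(outer(x1_grid, x3_grid, pred_func))

if (requireNamespace("plotly", quietly = TRUE)) {
  surface_fig <- plotly::plot_ly(
    x = x1_grid, y = x3_grid, z = pred_grid, type = "surface"
  )
  # Label the axes with the variables they actually represent.
  surface_fig <- plotly::layout(
    surface_fig,
    scene = list(
      xaxis = list(title = "X1"),
      yaxis = list(title = "X3"),
      zaxis = list(title = "Predicted O1")
    )
  )
}
```

Passing `x` and `y` explicitly means the axes show the actual X1 and X3 values instead of matrix indices.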