I am attempting to create a surface plot of some randomized data I have, and I'm running into an issue where plot_ly is plotting the id column of the matrix.

Below is the code and a subsection of the random data.

library(plotly)
library(readxl)  # provides read_excel()

random_data <- read_excel("Regression_Builder.xlsx", sheet = "Yield")
lm.O1 = lm(O1 ~ X1 + X2 + X3 + I(X1^2) + I(X3^2), data = random_data)

three_dims = data.frame(random_data$O1, random_data$X1, random_data$X2)
three_dims_mat = data.matrix(three_dims, rownames.force = NA)
#I saw a post that mentioned that using data.matrix can lead to issues and to instead use cbind.
#I attempted that and got the same results.

O1_surface = plot_ly(z = three_dims_mat[,1:3], type = "surface")
#I also tried with z = ~three . . . and also without the [,1:3]. Neither of these helped.
O1_surface

Data

As you can see, there are three columns of data with one ID column. Additionally, the only column that gets anywhere near 5000 is the id column.

When I create the surface plot, I get this graph: Surface Plot

The x and y axes are definitely off, and it appears the y axis is simply the id column?

I'm very new to R, so I was really just following another page's instructions, which can be seen here: https://plotly.com/r/3d-surface-plots/

They don't seem to be doing anything differently from what I'm doing. The data they are using comes directly from plotly, and "volcano" is structured like a matrix similar to mine.

I appreciate any help you can give!

edit: someone asked for a sample of the data. Here's the first 20 data points.

    X1               X2             X3
1   -568.4093212    -306.6656476    35.08753966
2   -758.2562177    -310.9201146    32.64751489
3   -467.4339846    -364.0556644    34.09746155
4   -529.7232277    -310.837259     36.28913812
5   -535.9391621    -323.411462     39.75818106
6   -494.4654867    -386.835529     30.5269416
7   -490.3442684    -363.7089394    33.8776127
8   -392.6493419    -327.10129      31.22857484
9   -720.6745211    -339.3230459    35.09282461
10  -425.0705298    -324.8479801    32.0451123
11  -529.9568075    -317.8269927    35.48054421
12  -445.4251925    -422.9827843    34.80734687
13  -730.3447224    -307.6357161    33.58775347
14  -309.4192505    -434.2465323    29.17980084
15  -609.6549563    -382.4879761    31.16542379
16  -731.8211673    -345.8748154    32.76108565
17  -745.736109     -299.1330659    36.46136652
18  -589.5006466    -368.9677558    31.87794536
19  -655.5712467    -344.9485136    32.50361267
20  -536.5405239    -401.9952118    30.72522988

I hope that helps. Thanks!

Please include a sample of your data. - vestland

1 Answer

OK, so I figured out the root cause here.

As I feared, it came down to a misunderstanding of how plot_ly works with type = "surface". I had assumed it would work like a 3D scatter plot, taking one (x, y, z) triple per row, which in retrospect doesn't make much sense for a surface.

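A surface trace requires a matrix of height (z) values only. The row and column positions in that matrix stand in for the X and Y "inputs" of the function you are trying to plot: columns run along the x axis and rows along the y axis. That is why my three-column matrix came out as a strip three cells wide, with the row number (what I took to be the id column) along the other axis. I discovered this by playing around with a simple example: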

L = cbind(c(1, 1, 1), c(1, 2, 1), c(1, 1, 1))
plot_l = plot_ly(z = L, type = "surface")
plot_l

Which results in this:

Simple Surface Example
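
To make the row/column mapping concrete, here is a minimal sketch. The names x_vals, y_vals and z_mat are made up for illustration, and it assumes the usual plotly convention for type = "surface" that rows of z run along y and columns along x. The matrix part is base R; the plot is only built if plotly is installed.

```r
# Hypothetical grid: 4 x positions and 3 y positions (illustrative names).
x_vals <- c(10, 20, 30, 40)
y_vals <- c(1, 2, 3)

# outer(y, x, f) puts y along the rows and x along the columns,
# so z_mat[i, j] is the height at (x_vals[j], y_vals[i]).
z_mat <- outer(y_vals, x_vals, function(y, x) x + 100 * y)

dim(z_mat)  # 3 rows (y) by 4 columns (x)

# Build the surface only if plotly is available; type p at the console to view it.
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(x = x_vals, y = y_vals, z = z_mat, type = "surface")
}
```

Passing x and y alongside z is what makes the axes show real coordinates instead of matrix indices.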

For anyone who runs across this problem and wants to see how I got the surface plot I was after, below is the code and plot. The plot is relatively anticlimactic, and the x and y axes are not technically correct (they show matrix indices rather than the actual X1 and X3 values), but it's a start!

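# Ranges to predict over; X2 is held constant
X1_V = -476:-233
X2_C = 33.96
X3_V = 185:423

# Vectorized prediction from the fitted model lm.O1 defined in the question
pred_func = function(x1, x3) predict(lm.O1, newdata = data.frame(X1 = x1, X2 = X2_C, X3 = x3))

# This step was missing from the post as written; outer() is the presumed way pred_mat was built
pred_mat = outer(X1_V, X3_V, pred_func)

O1_surface = plot_ly(z = pred_mat, type = "surface")
O1_surface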

Accurate Surface Plot
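
For the axes, one possible fix is to pass the grid vectors as x and y together with the prediction matrix. The sketch below is not the original data: demo is synthetic, generated only so the example runs on its own, and the ranges, coefficients and the choice to hold X2 at its mean are all assumptions. The model formula is the one from the question.

```r
# Synthetic stand-in for the spreadsheet (assumed ranges, not the real data).
set.seed(1)
n <- 200
demo <- data.frame(X1 = runif(n, -760, -300),
                   X2 = runif(n, -435, -300),
                   X3 = runif(n, 29, 40))
demo$O1 <- 5 + 0.01 * demo$X1 + 0.02 * demo$X2 + 0.5 * demo$X3 +
  rnorm(n, sd = 0.1)

# Same model form as in the question.
fit <- lm(O1 ~ X1 + X2 + X3 + I(X1^2) + I(X3^2), data = demo)

# Grid over X1 and X3, with X2 held at its mean (an arbitrary choice).
x1_grid <- seq(min(demo$X1), max(demo$X1), length.out = 40)
x3_grid <- seq(min(demo$X3), max(demo$X3), length.out = 30)
x2_fixed <- mean(demo$X2)

pred_func <- function(x1, x3) {
  predict(fit, newdata = data.frame(X1 = x1, X2 = x2_fixed, X3 = x3))
}

# Rows follow x3_grid (y axis), columns follow x1_grid (x axis).
pred_mat <- outer(x3_grid, x1_grid, function(x3, x1) pred_func(x1, x3))

# Build the surface only if plotly is available; type O1_surface to view it.
if (requireNamespace("plotly", quietly = TRUE)) {
  O1_surface <- plotly::plot_ly(x = x1_grid, y = x3_grid, z = pred_mat,
                                type = "surface")
}
```

Note the argument order in outer(): the vector meant for the y axis goes first, so that it ends up along the rows of pred_mat.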