2
votes

I'm creating a non-linear regression model to fit the data below

Plot

enter image description here

I keep on getting the error:

singular gradient matrix at initial parameter estimates

Can anyone explain what's wrong with my model? The equation I'm trying to write is y = (p1*logbasep2 (x))/(x^p3), where p1, p2, and p3 are constants.

p1 <- 1
p2 <- 1 
p3 <- 2


model1 = nls(ydata ~ ((p1*log(xdata, p3))/(xdata)^p2), start = list(p1 = p1, p2 = p2, p3 = p3))

Below is the sample output for xdata and ydata

  > dput(ydata)
c(2.675967443, 5.262229596, 2.756358345, 2.582628563, 2.578517983, 
2.572149035, 2.572149035, 2.419269246, 4.342393324, 4.57849526, 
2.414960542, 2.414960542, 2.414960542, 2.414960542, 2.655729394, 
5.205391565, 3.137641723, 2.503911184, 2.499359843, 2.499198257, 
2.498768034, 2.693594595, 5.231803091, 2.998312831, 2.520387095, 
2.518634129, 2.518634129, 2.518634129, 2.711184536, 5.229303652, 
3.003137243, 2.551123783, 2.516552812, 2.504450358, 2.484247615, 
2.581875759, 5.157438135, 3.310365728, 2.620786825, 2.458420168, 
2.436577883, 2.434535502, 2.606225185, 5.265676214, 2.71775484, 
2.61596361, 2.598126717, 2.598126717, 2.598126717, 2.803018082, 
4.934368949, 3.595430381, 3.031594421, 2.227695807, 2.207278748, 
2.200613613, 2.594364366, 5.215228585, 3.07169941, 2.694566482, 
2.511361391, 2.456389883, 2.456389883, 2.862120485, 5.202934582, 
3.056182323, 2.469690653, 2.469690653, 2.469690653, 2.469690653, 
2.437314286, 4.587186915, 4.302037827, 2.711703229, 2.346318322, 
2.308501078, 2.306938344, 2.30614524, 4.657971143, 4.158221237, 
2.943632973, 2.350070603, 2.296930829, 2.287027975, 2.531924554, 
5.071156271, 3.541488012, 2.65287316, 2.420471714, 2.391688, 
2.39039829, 2.477102765, 5.030773262, 3.642446383, 2.620965051, 
2.424021444, 2.402895805, 2.40179529, 2.584714789, 5.03335416, 
3.619673092, 2.583602564, 2.533903128, 2.326437301, 2.318314966, 
2.49144927, 4.897950266, 3.585821617, 3.227165648, 2.53767512, 
2.221395797, 2.038542282, 2.354867369, 4.95865857, 3.766909175, 
2.715186396, 2.382613432, 2.372757351, 2.449007707, 2.20573524, 
4.55514547, 3.91611881, 3.606189025, 2.303604277, 2.20810652, 
2.205100659, 3.300879888, 5.151795375, 2.75624017, 2.449071065, 
2.447337834, 2.447337834, 2.447337834, 2.528936269, 4.955034368, 
3.754254308, 2.751181588, 2.399415789, 2.308263059, 2.302914619, 
2.350317116, 4.873892721, 3.39391574, 2.606991064, 2.443820718, 
2.33106264, 2.33106264, 2.621925026, 5.267786769, 2.622588101, 
2.621925026, 2.621925026, 2.621925026, 2.621925026, 2.425160063, 
5.022138529, 3.663550495, 2.612718078, 2.425326541, 2.42594623, 
2.425160063, 2.625820509, 5.265415337, 2.713068882, 2.638650782, 
2.585780084, 2.586285083, 2.584979323, 2.508232606, 4.902729122, 
3.746937795, 3.015086226, 2.332707845, 2.267248424, 2.227057981, 
2.947719346, 5.098315798, 3.368997979, 2.39886785, 2.402312015, 
2.392233622, 2.39155339, 2.548810552, 4.525931048, 4.3760105, 
2.589251919, 2.429896804, 2.281201495, 2.248897682)
dput(xdata)
c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L)
1
You need better starting values. Also from looking at your equation, when xdata is one, then ydata will be zero, irrespective of parameter value., which may make things difficult. Are you able to shate your data : if so can you edit your question with the results of dput(ydata) and dput(xdata)user20650
ps what is logbasep2user20650
by logbasep2 i mean that i want the base for the logarithm to be a parameter as wellR. Joe
we appreciate the effort, but please please please post the results of dput() as text within your question, not as a screenshot (that format is more or less useless for anyone trying to replicate your problem ...)Ben Bolker
your dput output is incomplete ... (please include more than just the first three lines ...)Ben Bolker

1 Answers

1
votes

Your problem is actually more mathematical than statistical, as it turns out. The problem is that log(x,b1) and log(x,b2) differ only by a constant factor (log(x,b1) = log(x,b2)*log(b2,b1)). This means that your p1 and p3 parameters are redundant. So everything works if you leave out p3 (this is just one choice of the many reasonable ways to change the formula ...)

Using nls with the data argument is preferred, as it makes it easier to do things like predict new values:

dd <- data.frame(x=xdata,y=ydata)
model1 = nls(y ~ ((p1*log(x))/(x)^p2),
             start = list(p1 = 1, p2 = 1),
             data=dd)

Now predict:

xvec <- seq(1,7,length=101)
par(las=1,bty="l")  ## cosmetic
plot(y~x,data=dd)
lines(xvec,predict(model1,newdata=data.frame(x=xvec)))

enter image description here

Note that there are some issues with this fit (the line systematically misses the data at both the lower and the upper end) ...