
I have to perform a nonlinear multiple regression with data that looks like the following:

ID    Customer   Country   Industry      Machine-type    Service hours
1     A          China     mass          A1              120
2     B          Europe    customized    A2              400
3     C          US        mass          A1               60
4     D          Rus       mass          A3              250
5     A          China     mass          A2              480
6     B          Europe    customized    A1              300
7     C          US        mass          A4              250
8     D          Rus       customized    A2              260
9     A          China     Customized    A2              310
10    B          Europe    mass          A1              110
11    C          US        Customized    A4               40
12    D          Rus       customized    A2              80

Dependent variable: Service hours
Independent variables: Customer, Country, Industry, Machine type

I did a linear regression, but because the assumption of linearity does not hold I have to perform a nonlinear regression.

I know nonlinear regression can be done with the nls function. How do I add the categorical variables to the nonlinear regression so that I get the statistical summary in R?

Column names after adding dummies:

ID  Customer.a  Customer.b  Customer.c  Customer.d  Country.China  Country.Europe  Country.Rus  Country.US  Industry.customized  Industry.Customized  Industry.mass  Machine type.A1  Machine type.A2  Machine type.A3  Service hours
1 1 0 0 0 1 0 0 0 0 0 1 1 0 0 120 
2 0 1 0 0 0 1 0 0 1 0 0 0 1 0 400 
3 0 0 1 0 0 0 0 1 0 0 1 0 0 1 60 
4 0 0 0 1 0 0 1 0 0 0 1 1 0 0 250 
5 1 0 0 0 1 0 0 0 1 0 0 0 0 1 480 
6 0 1 0 0 0 1 0 0 0 1 0 1 0 0 300 
7 0 0 1 0 0 0 0 1 0 0 1 0 0 1 250 
8 0 0 0 1 0 0 1 0 1 0 0 0 1 0 260 
9 1 0 0 0 1 0 0 0 0 0 1 0 1 0 210 
10 0 1 0 0 0 1 0 0 1 0 0 0 1 0 110 
11 0 0 1 0 0 0 0 1 0 0 1 0 0 1 40 
12 0 0 0 1 0 0 1 0 0 0 1 1 0 0 80
I'm not sure whether the function you are using accepts factor variables or whether you need to create dummy variables. Have a look at the dummies package. - Sam
Hi, thank you for the quick answer! Yes, I used the dummies package, so now I have multiple dummies. How can I put those dummies into a nonlinear regression so that I get the statistical summary? - Yannick
> datadum <- dummy.data.frame(Map1, sep = ".")
> names(datadum)
 [1] "ID"                  "Customer.a"          "Customer.b"
 [4] "Customer.c"          "Customer.d"          "Country.China"
 [7] "Country.Europe"      "Country.Rus"         "Country.US"
[10] "Industry.customized" "Industry.Customized" "Industry.mass"
[13] "Machine type.A1"     "Machine type.A2"     "Machine type.A3"
[16] "Service hours"
- Yannick
These are the dummies I get. I would like to add them to a nonlinear regression. Thank you in advance for helping me! - Yannick
Can you update the data frame in the question, please? - Sam

1 Answer


How you handle a categorical predictor depends on the number of levels it has.

A predictor such as gender that takes only 2 values (male or female) can simply be represented as a single binary (1,0) variable.

For predictors with more than 2 levels, we use 1-of-k dummy encoding, where k is the number of levels the variable takes. If the model has an intercept, include only k - 1 of those dummies: the remaining level is the baseline, and adding its dummy as well makes the columns perfectly collinear. See the dummies package for useful functions!
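As an alternative to the dummies package, base R's model.matrix can build the k - 1 coding directly from factors. The sketch below is only an illustration: it copies the Industry and Machine-type columns from the question's table, and it assumes that "Customized" and "customized" are the same category entered with inconsistent capitalisation. The names df and X are placeholders.

```r
# Industry and Machine-type columns copied from the question's table.
df <- data.frame(
  Industry = c("mass", "customized", "mass", "mass", "mass", "customized",
               "mass", "customized", "Customized", "mass", "Customized",
               "customized"),
  Machine.type = c("A1", "A2", "A1", "A3", "A2", "A1",
                   "A4", "A2", "A2", "A1", "A4", "A2")
)

# Assumption: "Customized" and "customized" are one category, so the
# case is normalised before the factor is built. Without this step the
# factor would have three levels instead of two.
df$Industry <- factor(tolower(df$Industry))
df$Machine.type <- factor(df$Machine.type)

# Treatment coding: an intercept plus k - 1 dummy columns per factor.
# The first level of each factor ("customized", "A1") is the baseline.
X <- model.matrix(~ Industry + Machine.type, data = df)
print(colnames(X))
```

Note that in the sample rows shown in the question, Customer and Country move together (A is always China, B always Europe, and so on), so their dummies carry the same information and only one of the two can be used in a model.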

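After this, you can fit the model with nls. Unlike lm, nls does not add coefficients for you: every coefficient has to appear as a named parameter in the formula, and each one needs a starting value. Column names that contain spaces (such as "Service hours") also have to be renamed or quoted with backticks first:

nls(Service.hours ~ b0 + b1 * predictor1 + b2 * predictor2 + bN * predictorN, data = df, start = list(b0 = 0, b1 = 0, b2 = 0, bN = 0))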
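A minimal sketch of such a fit, using the twelve rows from the question's table, could look like the following. The dummy columns (mass, A2, A3, A4), the parameter names (b0, b_mass, ...) and the starting values are illustrative choices, and the Industry case is normalised on the assumption that "Customized" is a typo. The mean function here is deliberately linear in the parameters, so the result should agree with lm; it only demonstrates how dummies enter an nls formula and how to get the summary.

```r
# Data copied from the question's table.
industry <- c("mass", "customized", "mass", "mass", "mass", "customized",
              "mass", "customized", "Customized", "mass", "Customized",
              "customized")
machine <- c("A1", "A2", "A1", "A3", "A2", "A1",
             "A4", "A2", "A2", "A1", "A4", "A2")
hours <- c(120, 400, 60, 250, 480, 300, 250, 260, 310, 110, 40, 80)

# k - 1 dummies per factor; "customized" and "A1" are the baselines.
dat <- data.frame(
  Service.hours = hours,
  mass = as.numeric(tolower(industry) == "mass"),
  A2   = as.numeric(machine == "A2"),
  A3   = as.numeric(machine == "A3"),
  A4   = as.numeric(machine == "A4")
)

# Every coefficient is an explicit, named parameter with a start value.
fit <- nls(
  Service.hours ~ b0 + b_mass * mass + b_A2 * A2 + b_A3 * A3 + b_A4 * A4,
  data  = dat,
  start = list(b0 = mean(hours), b_mass = 0, b_A2 = 0, b_A3 = 0, b_A4 = 0)
)

# Estimates, standard errors, t values and p values.
print(summary(fit))

# Sanity check: the same model fitted by ordinary least squares.
lm_fit <- lm(Service.hours ~ mass + A2 + A3 + A4, data = dat)
print(coef(lm_fit))
```

A genuinely nonlinear mean function is written the same way, with the dummies and their parameters placed inside the nonlinear expression; in that case convergence depends much more on the starting values.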