0
votes

TL;DR: Is my function not vectorised? How can I vectorise a function calling other functions with many if & else statements? Many thanks!

EDIT: The function works when manually doing one tree at a time by hand in the console, or dplyr-style when using dplyr::rowwise

I'm writing a function containing a lot of if's and 'else's.

The function decides, depending on which arguments are supplied (or missing), on which function to use to calculate the volume of a tree.

I want to use it like this, dplyr-style, over a huge data sheet with many cases.

tree_data %>% mutate(volume = tree.volume(dbh.cm = dbh.cm, height.m= height, species.latin=species.latin, latitude= latitude)

It is common that crown.base.height.m and double.bark.mm are missing (there are volume functions without these arguments).

tree.volume <- function(dbh.cm,height.m,crown.base.height.m,double.bark.mm,species.latin,latitude){
  if(grepl("^Pinus", ignore.case=TRUE, species.latin)){
    pinus_spp_vol(species.latin = species.latin,
                  dbh.cm = dbh.cm,
                  height.m = height.m,
                  latitude = latitude,
                  double.bark.mm = double.bark.mm,
                  crown.base.height.m = crown.base.height.m)
  } else if(grepl("^Picea|^Abies",ignore.case = TRUE, species.latin)){
    picea_spp_vol(dbh.cm = dbh.cm,
                  height.m = height.m,
                  latitude = latitude,
                  crown.base.height.m = crown.base.height.m)
  } else if(grepl("^Larix", ignore.case = TRUE, species.latin)){
    larix_sibirica_vol_carbonnier_1954(dbh.cm = dbh.cm,
                                       height.m = height.m,
                                       double.bark.mm = double.bark.mm,
                                       crown.base.height.m = crown.base.height.m)
  }

}

I get the warning messages:

1: In if (grepl("^Pinus", ignore.case = TRUE, species.latin)) { :
the condition has length > 1 and only the first element will be used

2: In if (species.latin == "Pinus contorta") { : the condition has length > 1 and only the first element will be used

3: In if (dbh.cm >= 4.5) { : the condition has length > 1 and only the first element will be used

4: In if (latitude >= 60) { : the condition has length > 1 and only the first element will be used

I read this as the function not handling one row in my data.frame at a time... and thus believes all my rows are under the same "category" as the first row?

Attached: The first function from which the if statements causing the errors are read:

pinus_spp_vol <- function(species.latin, dbh.cm, height.m, crown.base.height.m, double.bark.mm, latitude){
  if(!missing(crown.base.height.m)){
    if(!missing(double.bark.mm)){
      if(species.latin!="Pinus contorta"){
        if(dbh.cm >= 4.5){ #Brandel, 1990 function 100-4. for Pines above 4.5 cm dbh.
          if(latitude >= 60){
            (10^-1.12715)*(dbh.cm^2.13211)*((dbh.cm+20)^-0.13543)*(height.m^1.58121)*((height.m-1.3)^-0.73435)*(crown.base.height.m^0.06595)*(double.bark.mm^-0.10998)
          } else {
            (10^-1.20042)*(dbh.cm^2.10263)*((dbh.cm+20)^-0.07366)*(height.m^1.99751)*((height.m-1.3)^-1.11357)*(crown.base.height.m^0.06420)*(double.bark.mm^-0.14963)
          }
        } else { #Andersson,1954 for small trees.
          if(latitude >= 60){
            0.22 + 0.08786 * (dbh.cm^2) + 0.03045 * (dbh.cm^2) * height.m + 0.002809 * dbh.cm * (height.m^2)
          } else {
            0.22 + 0.1066 * (dbh.cm^2) + 0.02085 * (dbh.cm^2) * height.m + 0.008427 * dbh.cm * (height.m^2)
          }
        }
      }
    }
  } else {
    if(species.latin== "Pinus contorta"){ #Eriksson 1973
      0.1121*(dbh.cm^2) + 0.02870*(dbh.cm^2)*height.m - 0.000061*(dbh.cm^2)*(height.m^2) - 0.09176*dbh.cm*height.m + 0.01249*dbh.cm*(height.m^2)
    } else {
      if(dbh.cm >= 4.5){#Brandel 1990, 100-01
        if(latitude >= 60){
          (10^-1.20914)*(dbh.cm^1.94740)*((dbh.cm+20)^-0.05947)*(height.m^1.40958)*((height.m-1.3)^-0.45810)
        } else {
          (10^-1.38903)*(dbh.cm^1.84493)*((dbh.cm+20)^0.06563)*(height.m^2.02122)*((height.m-1.3)^-1.01095)
        }
      } else { #Andersson,1954 for small trees.
        if(latitude >= 60){
          0.22 + 0.08786 * (dbh.cm^2) + 0.03045 * (dbh.cm^2) * height.m + 0.002809 * dbh.cm * (height.m^2)
        } else {
          0.22 + 0.1066 * (dbh.cm^2) + 0.02085 * (dbh.cm^2) * height.m + 0.008427 * dbh.cm * (height.m^2)
        }
      }
    }
  }
}
1

1 Answers

1
votes

You are confusing non-vectorized if () ... else ... with vectorized ifelse.

if () ... else ... takes a single (!!) condition and performs an action on whether the condition is TRUE or FALSE. If you instead pass a (logical) vector for the condition then only the first element of the vector is chosen to decide which action to perform. As a notfication R will raise a warning. However, besides of that you will in general get a wrong or non-expected result.

ifelse on the other hand is vectorized, i.e. both conditions are checked and actions are taken elementwise. For all TRUE elements the TRUE action is chosen and for all FALSE elements the FALSE action chosen.

This can best be seen by a simple example:

power2 <- function(x) x^2
power3 <- function(x) x^3

myfun_warning <- function(cyl) {
  if (cyl == 4) {
    power2(cyl)
  } else {
    power3(cyl)
  }
}

myfun_no_warning <- function(cyl) {
  ifelse(cyl == 4, power2(cyl), power3(cyl))
}

# Warning and incorrect result
myfun_warning(mtcars$cyl)
#> Warning in if (cyl == 4) {: the condition has length > 1 and only the first
#> element will be used
#>  [1] 216 216  64 216 512 216 512  64  64 216 216 512 512 512 512 512 512  64  64
#> [20]  64  64 512 512 512 512  64  64  64 512 216 512  64

# No warning and correct result
myfun_no_warning(mtcars$cyl)
#>  [1] 216 216  16 216 512 216 512  16  16 216 216 512 512 512 512 512 512  16  16
#> [20]  16  16 512 512 512 512  16  16  16 512 216 512  16