
I am trying to learn R and I am having trouble understanding how it works. I wrote a binary entropy function of p and 1-p from scratch, and I run into problems when I add an if/else to avoid the NaN that appears at p = 0 and p = 1 (where the formula evaluates 0*log2(0)).

When I pass the version without the ifs to plot, it works, but the printed results contain NaN at both ends. When I add the ifs, I get this error instead:

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

entropy <- function(p){

    cat("p = " , p)

    if (p==0 || p==1) {

       result = 0

    }else{

        result = - p*log2(p)-(1-p)*log2((1-p))

    }

    cat("\nresult=",result)

    return(result) 


}

p <- seq(0,1,0.01)

plot(p, entropy(p), type='l', main='Funcion entropia con dos valores posibles')

I don't understand this: I am plotting an array as x and the function applied to that same array as y, so the two should have the same length with or without the ifs.

Console without the ifs:

p = 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.2 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.4 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.5 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 0.6 0.61 0.62 0.63 0.64 0.65 0.66 0.67 0.68 0.69 0.7 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.8 0.81 0.82 0.83 0.84 0.85 0.86 0.87 0.88 0.89 0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1

result= NaN 0.08079314 0.1414405 0.1943919 0.2422922 0.286397 0.3274449 0.3659237 0.4021792 0.4364698 0.4689956 0.499916 0.5293609 0.5574382 0.5842388 0.6098403 0.6343096 0.6577048 0.680077 0.7014715 0.7219281 0.7414827 0.7601675 0.7780113 0.7950403 0.8112781 0.8267464 0.8414646 0.8554508 0.8687212 0.8812909 0.8931735 0.9043815 0.9149264 0.9248187 0.9340681 0.9426832 0.9506721 0.958042 0.9647995 0.9709506 0.9765005 0.9814539 0.985815 0.9895875 0.9927745 0.9953784 0.9974016 0.9988455 0.9997114 1 0.9997114 0.9988455 0.9974016 0.9953784 0.9927745 0.9895875 0.985815 0.9814539 0.9765005 0.9709506 0.9647995 0.958042 0.9506721 0.9426832 0.9340681 0.9248187 0.9149264 0.9043815 0.8931735 0.8812909 0.8687212 0.8554508 0.8414646 0.8267464 0.8112781 0.7950403 0.7780113 0.7601675 0.7414827 0.7219281 0.7014715 0.680077 0.6577048 0.6343096 0.6098403 0.5842388 0.5574382 0.5293609 0.499916 0.4689956 0.4364698 0.4021792 0.3659237 0.3274449 0.286397 0.2422922 0.1943919 0.1414405 0.08079314 NaN

Console with the ifs:

p = 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.2 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.4 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.5 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 0.6 0.61 0.62 0.63 0.64 0.65 0.66 0.67 0.68 0.69 0.7 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.8 0.81 0.82 0.83 0.84 0.85 0.86 0.87 0.88 0.89 0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1 result= 0Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

3 Answers

1
votes

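Your function returned a scalar rather than a vector, because the if/else clause is not vectorized. if expects a single TRUE/FALSE, so with a vector p only the first element (p[1] = 0) was tested, and the function returned the single number 0. That is why x has 101 elements while y has only one. This should work: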

entropy <- function(p){

  # initialize the result vector with zeros, one per element of p
  result <- numeric(length(p))

  # TRUE for the elements other than 0 and 1, where the formula is defined
  inside <- !(p %in% c(0, 1))
  x <- p[inside]

  # overwrite only those positions with the value of the formula;
  # the entries for p = 0 and p = 1 keep the initial 0
  result[inside] <- -x * log2(x) - (1 - x) * log2(1 - x)

  return(result)


}

p <- seq(0,1,0.01)

plot(p, entropy(p), type='l', main='Funcion entropia con dos valores posibles')
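
As a complementary sketch (not part of the answer above), the same result can probably be obtained with ifelse, which evaluates its test element by element. The name entropy_ifelse is hypothetical and only used for illustration, and ifelse is swapped in here as an alternative to the subsetting approach. The first line also shows where the NaN comes from.

```r
p <- seq(0, 1, 0.01)

# 0 * log2(0) is 0 * -Inf, which R evaluates to NaN (no division is involved)
nan_at_zero <- 0 * log2(0)

# Hypothetical vectorized variant: ifelse() tests every element of p.
# Both branches are computed for the whole vector, so the NaN at the
# endpoints is still produced internally, but ifelse() picks 0 there.
entropy_ifelse <- function(p) {
  ifelse(p == 0 | p == 1, 0, -p * log2(p) - (1 - p) * log2(1 - p))
}

h <- entropy_ifelse(p)

# h has one value per element of p, so x and y lengths agree
same_length <- length(h) == length(p)
```

With matching lengths, plot(p, h, type='l') should draw the curve without the xy.coords error. Note the single | inside ifelse: it compares element by element, whereas || is meant for single values.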
0
votes

EDIT:

Even though vectorizing was suggested, I wanted to write it in a way closer to the other languages I know, since I am just starting with R. I got it working, although I ended up using a for loop and plotting two arrays instead of passing the function call directly to plot.

entropy <- function(p){

    if (p==0 || p==1) {

       result = 0

    }else{

        result = - p*log2(p)-(1-p)*log2((1-p))

    }

    return(result) 


}

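x <- seq(0,1,0.01)
# preallocate y with one slot per x value (length(x), not length(p):
# p is only defined as the loop variable below)
y <- numeric(length(x))
i = 1

for (p in x) {
    y[i] = entropy(p)
    cat(x[i],"=",y[i],"\n")
    i=i+1
}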

plot(x, y, type='l', main='Funcion entropia con dos valores posibles')
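
A possible variant of the loop above, sketched under the assumption that the scalar entropy function stays unchanged: iterating over the indices with seq_along removes the separate counter i and makes it harder for x and y to fall out of step.

```r
# Scalar entropy as in the post above (expects a single value of p)
entropy <- function(p) {
  if (p == 0 || p == 1) {
    return(0)
  }
  -p * log2(p) - (1 - p) * log2(1 - p)
}

x <- seq(0, 1, 0.01)
y <- numeric(length(x))      # preallocate one slot per x value

for (i in seq_along(x)) {    # i runs over 1, 2, ..., length(x)
  y[i] <- entropy(x[i])
}
```

After the loop, x and y have the same length and can be passed to plot(x, y, type='l') as before.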
0
votes

I kept your entropy function as it is and used sapply to apply it to each element of p before plotting, so the result has one value per element.

entropy <- function(p){
  
  cat("p = " , p)
  
  if (p==0 || p==1) {
    
    result = 0
    
  }else{
    
    result = - p*log2(p)-(1-p)*log2((1-p))
    
  }
  
  cat("\nresult=",result)
  
  return(result) 
  
}

p <- seq(0,1,0.01)

# Apply the function over all the values of 'p'
entropy_p <- sapply(p,FUN = entropy)

plot(p, entropy_p, type='l', main='Funcion entropia con dos valores posibles')
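
Two related options, shown as a hedged sketch rather than a recommendation: vapply behaves like sapply here but also checks that each call returns exactly one number, and Vectorize wraps the scalar function so it can be called on the whole vector. The name entropy_v is hypothetical, and the cat calls are left out to keep the output short.

```r
# Scalar entropy, same logic as above without the cat() calls
entropy <- function(p) {
  if (p == 0 || p == 1) {
    return(0)
  }
  -p * log2(p) - (1 - p) * log2(1 - p)
}

p <- seq(0, 1, 0.01)

# vapply() calls entropy() once per element and requires each result
# to be a single numeric value (numeric(1)), failing loudly otherwise
entropy_p <- vapply(p, entropy, numeric(1))

# Vectorize() returns a wrapper that loops over p internally,
# so the wrapper accepts the whole vector at once
entropy_v <- Vectorize(entropy)
entropy_p2 <- entropy_v(p)
```

Either result can be passed to plot(p, entropy_p, type='l'), since both have one value per element of p.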