0
votes

The contents of the CSV file are given below (image: data set with details of an automobile).

Here the horsepower column is read as character by default. When I applied the range function to horsepower as:

    sapply(Auto[,4],range)

The following error message appears:

    Error in Summary.factor(17L, na.rm = FALSE) : 
      ‘range’ not meaningful for factors

So I tried to convert the column to numeric as:

    as.numeric(as.character(Auto$horsepower))

This results in the warning message:

    NAs introduced by coercion 

Even after this step I am not able to apply the range function. How can I use the range function on the horsepower column? Please note that the data set contains the character '?' in the horsepower column at line 127.

2
As an FYI, it’s best to avoid posting images of code/data (and here’s why). Note that you can quickly get your data out of your R session and onto SO by calling dput(my_df) and copy/pasting the result. If your data are large, use dput(head(my_df)). - DanY

2 Answers

1
votes

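The underlying issue here is that horsepower was converted to a factor when the CSV file was read into R. This is due to the presence of the ? character: because "?" is not a number, read.csv treats the whole column as text, and with stringsAsFactors = TRUE (the default before R 4.0.0) text columns become factors.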

You can avoid this using e.g.

    Auto <- read.csv("myfile.csv", 
                     stringsAsFactors = FALSE, 
                     na.strings = "?") 
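
To see the effect without the original file, here is a minimal sketch. The inline CSV text, the name auto_demo, and the values in it are made up for illustration and are not the asker's data. It suggests that once "?" is declared as an NA string, horsepower is read as a numeric column, and range works as long as NAs are removed.

```r
# Hypothetical miniature of the CSV; the real file's contents are not shown in the question.
csv_text <- "mpg,horsepower\n18,130\n15,?\n16,150\n"

# text= lets read.csv parse a string instead of a file on disk.
auto_demo <- read.csv(text = csv_text,
                      stringsAsFactors = FALSE,
                      na.strings = "?")

# The "?" entry is now NA and the column is numeric rather than a factor.
str(auto_demo$horsepower)

# na.rm = TRUE drops the NA before computing the minimum and maximum.
hp_range <- range(auto_demo$horsepower, na.rm = TRUE)
print(hp_range)
```
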
1
votes

You need this:

    range(as.numeric(as.character(Auto$horsepower)), na.rm=TRUE)

If you want to convert a numeric-looking factor to an actual numeric, it is correct to use as.numeric(as.character()). For you, this introduces NAs because you have values like "?" in the column for horsepower and R doesn't know how to turn a "?" into a numeric, so it turns it into an NA.

Now, you can calculate the range, but you need to tell range to "skip" the NAs with the argument na.rm=TRUE.
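
A small sketch of why the as.character() step matters, using a made-up factor (hp and its values are illustrative, not taken from the Auto data). Calling as.numeric() directly on a factor returns the internal level codes rather than the values, while going through as.character() first recovers the numbers and turns "?" into NA.

```r
# Hypothetical factor resembling the horsepower column, including a "?" entry.
hp <- factor(c("130", "?", "150", "95"))

# Direct conversion yields the integer level codes, not the horsepower values.
direct <- as.numeric(hp)

# Converting via character recovers the numbers; "?" becomes NA (with a warning,
# suppressed here only to keep the output quiet).
converted <- suppressWarnings(as.numeric(as.character(hp)))

# Without na.rm = TRUE the NA propagates and range returns NA NA.
range_with_na <- range(converted)

# With na.rm = TRUE the NA is skipped.
hp_range <- range(converted, na.rm = TRUE)
print(hp_range)
```
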