Can someone explain why R does this? And the reason behind it?

"-1" < 0
#[1] TRUE
# expected [1] FALSE # OR better NA

"-abc" < 0
#[1] TRUE
# expected [1] FALSE # OR better NA

From ?Comparison:

If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw

This does not help either FWIW:

toString(-1) < 0
as.character(-1) < 0
toString("-abc") < 0
as.character("-abc") < 0

Am I wrong to expect a different result? I ask this because this seems to me something that could give unexpected results inside a function if not known.

1 Answer

To quote the precedence rules you already quoted:

the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw

So in the expression:

"-abc" < 0

the 0 on the right-hand side is coerced to character, because character ranks above numeric in that order. This leaves us with:

"-abc" < "0"

This comparison is TRUE because strings are compared lexicographically, using the collating sequence of the current locale, and "-" sorts before "0" (you may check for yourself). So the whole expression evaluates to TRUE. Note that if the coercion had gone the other way, namely if R had tried to coerce "-abc" to a numeric type, the result would have been NA, and the entire expression would have evaluated to NA, not TRUE:

"-abc" < 0
NA < 0
NA

So, this is how we know that R is coercing the RHS to character.
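
The coercion can be made visible by spelling out each step by hand. This is a minimal sketch using only base R. The result of the string comparisons depends on the collation of the current locale, so the last two lines are left without an expected value:

```r
# What R does implicitly: the number is converted to a string first.
as.character(0)                      # "0"
identical("-abc" < 0, "-abc" < "0")  # TRUE: both forms are the same comparison

# The direction the question expected: coerce the string to numeric instead.
# as.numeric() warns and yields NA for text that is not a number.
x <- suppressWarnings(as.numeric("-abc"))
is.na(x < 0)                         # TRUE: the comparison itself is NA

# A numeric-looking string compares as intended once converted explicitly.
as.numeric("-1") < 0                 # TRUE

# Strings are compared character by character, not by numeric value,
# so these two lines need not agree.
"10" < 9
10 < 9
```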

A good rule of thumb in R (or SQL, Java, JavaScript, really any language) is not to rely on implicit type conversion. If you know your data is numeric, work with a numeric type and treat it as such, and vice versa for character data.
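
One way to apply that rule inside a function is to convert explicitly and fail loudly rather than let the comparison coerce silently. This is only a sketch: `is_negative` is a hypothetical helper name, not a base R function, and stopping on non-numeric text is an assumed policy (returning NA would be another reasonable choice).

```r
# Hypothetical helper: TRUE where x is below zero, with explicit type handling.
is_negative <- function(x) {
  if (is.character(x)) {
    # Convert on purpose; text that is not a number becomes NA.
    converted <- suppressWarnings(as.numeric(x))
    # Flag values that were present but could not be parsed as numbers.
    bad <- is.na(converted) & !is.na(x)
    if (any(bad)) {
      stop("not numeric: ", paste(x[bad], collapse = ", "))
    }
    x <- converted
  }
  stopifnot(is.numeric(x))
  x < 0
}

is_negative("-1")      # TRUE
is_negative(c(-2, 3))  # TRUE FALSE
# is_negative("-abc") stops with an error instead of returning TRUE
```

With the conversion made explicit, a stray string such as "-abc" surfaces as an error at the call site rather than as a silently TRUE comparison.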