I am using the dismo package in R to fit a species distribution model. A few points are in the wrong location and I would like to remove them. I used identify(), which tells me the number of each point in the occurrence data, but I'm not sure what code to use to delete these points. One of the points I want to remove is 302339, for example.

library(dismo)
sparrow <- gbif("ammodramus", "maritimus*", geo=FALSE)
sparrow
names(sparrow)
dim(sparrow)
sparrow <- subset(sparrow, !is.na(lon) & !is.na(lat))
library(maptools)
data(wrld_simpl)
plot(wrld_simpl, xlim=c(-40,-10), ylim=c(20,100), axes=TRUE, col="light yellow")
points(sparrow$lon, sparrow$lat, col="orange", pch=20, cex=0.75) 
points(sparrow$lon, sparrow$lat, col="red", cex=0.75)
identify(sparrow$lon, sparrow$lat)
You probably only need to make one of lat or lon to be a NA. Try sparrow[ 302339, 'lat'] <- NA. Generally graphics functions just ignore items with coordinates that are NA. I don't suppose you could find a location with fewer sparrows? That example takes a longggggg time to load. - IRTFM
Well it's not a coordinate, it's the number associated with an individual. So say it's bird 302339 that I want to remove from the dataset. And unfortunately I have to use this area, I know it sucks to load. - user5899223
I'll see your "well" and raise you one more. I'm unsure what you mean by "it's not a coordinate". I was showing you how to address a value specified by the i-value given to "[" as the rowname and the j-value given as a column name. When you type sparrow[ 30439, 'lat'], you get [1] 57.2833, which is clearly the outlier in the plot that (eventually) appeared. When I followed my own advice that point (off the Newfoundland coast) disappears when plotting is repeated. - IRTFM
I see now. Sorry I'm still fairly new to R and I misunderstood your previous comment and how to use the code. Thanks for your help and advice! - user5899223

1 Answer

To address that value, give "[" the row as the i-value and the column name as the j-value: sparrow[ 30439, 'lat'] returns [1] 57.2833, which is clearly the outlier in the plot that (eventually) appeared. When I followed my own advice, that point (off the Newfoundland coast) disappeared when the plot was redrawn.
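
As a rough illustration of that i/j addressing, here is a minimal sketch on a made-up data frame. The name toy and all of its values are hypothetical stand-ins, not GBIF records. It also shows that after subset() the row names keep their original labels, so a numeric i-value (a position, which is what identify() returns) and a character i-value (a row name) can point at different rows:

```r
# Hypothetical stand-in for the occurrence data (values are made up)
toy <- data.frame(lon = c(-75.1, NA, -74.3, -52.0),
                  lat = c(38.9, 39.2, 39.6, 57.3))

# Same filtering step as in the question: drop rows lacking coordinates
toy <- subset(toy, !is.na(lon) & !is.na(lat))

rownames(toy)      # original labels survive the subset: "1" "3" "4"
toy[3, "lat"]      # numeric i-value: the third remaining row, by position
toy["4", "lat"]    # character i-value: the row whose name is "4"
```

Here both lookups happen to reach the same row, but only because position 3 and row name "4" belong to the same record after the NA row was dropped.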

So you then execute:

sparrow[ 302339, 'lat'] <- NA

... and replot, that point disappears. You could also have executed:

sparrow <- sparrow[ -302339, ]  # which removes that row

Note that a minus sign only works with numeric indices: negating a character value, as in -'302339', throws "invalid argument to unary operator", so neither rows in the i-position of [.data.frame nor columns in the j-position can be dropped that way. Also note that rownames are actually character values, which are matched as character when you index with a string, despite being displayed as if they were numeric. After removing that anomaly you see that the last 6 rows have these identifiers:

 rownames(tail(sparrow))
[1] "32855" "32904" "32951" "32953" "32972" "32993"
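
The options above can be compared side by side on a small made-up data frame. This is only a sketch: toy, bad, and the coordinates are hypothetical, and bad stands in for whatever index identify() returned for the outlier.

```r
# Hypothetical occurrence data; row 4 plays the role of the misplaced point
toy <- data.frame(lon = c(-75.1, -74.8, -74.3, -52.0),
                  lat = c(38.9, 39.2, 39.6, 57.3))
bad <- 4   # assumed position of the outlier, e.g. a value from identify()

# Option 1: blank one coordinate so plotting functions skip the point
masked <- toy
masked[bad, "lat"] <- NA

# Option 2: drop the row by position with a negative numeric index
by_position <- toy[-bad, ]

# Option 3: drop the row by name, comparing row names as character
by_name <- toy[!(rownames(toy) %in% "4"), ]

# Negating a character value is an error, so it cannot be used for removal
res <- try(toy[-"4", ], silent = TRUE)
inherits(res, "try-error")
```

Option 1 keeps the row (and its other columns) in the data, which matters if later modelling steps do not drop NA coordinates on their own; options 2 and 3 remove the record entirely.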