3
votes

I have a very large dBase file (1.64Gb). It takes a very long time to load the whole file in R using the standard foreign::read.dbf() function. I would like to load only a few variables in the dataset. Does anyone have a solution ?

1

1 Answers

7
votes

I think the read.dbf(...) function in package foreign was intended for reading the *.dbf part of a shapefile, in which case reading in part of the file really doesn't make sense. You seem to want to do something different.

Using RODBC might work, depending on how your system is configured. If you're running Windows, and if you have the dBASE ODBC drivers installed, this will probably work for you (note: When you install MSOffice, it sets up a user dsn called "dBase Files", which should be accessible from RODBC. So if you have MSOffice installed, this should work...).

Important Note: This will only work if you are running a 32-bit version of R. This is because there are no 64-bit dBASE ODBC drivers. Generally, when you download 64-bit R you get both 32- and 64-bit versions, so it's just a matter of switching between them.

library(RODBC)
# setwd("< directory with your files >")
conn <- odbcConnect(dsn="dBASE Files")
df   <- sqlFetch(conn,"myTable",max=10)   # grab first ten rows
head(df)
#       LENGTH COASTLN010
# 1 0.02482170          1
# 2 0.01832134          2
# 3 0.03117752          3
# 4 0.04269755          4
# 5 0.02696307          5
# 6 0.05047828          6

sqlQuery(conn,"select * from myTable where LENGTH<0.008")
#       LENGTH COASTLN010
# 1 0.00625200        186
# 2 0.00634897        379
# 3 0.00733319       1583
# 4 0.00369786       1617
# 5 0.00722233       1618
# 6 0.00524176       1636

The example above is just meant to give you an idea of how to use RODBC. In this example, I have a file, myTable.dbf in the "directory with all your files", and this dbf has two columns, LENGTH and COASTLN010 (this file actually is part of a coastline shapefile, but that is irrelevant...).

If this doesn't work try:

 conn <- odbcConnectDbase("myTable.dbf")