15
votes

I keep reading that DBI/odbc is faster than RODBC, so I tried the following:

require(DBI);require(odbc)
con <- DBI::dbConnect(odbc::odbc(), dsn = 'SQLSERVER1', database = 'AcumaticaDB')

I can make a successful connection to the DSN, but the following query:

rs <- dbGetQuery(con, "SELECT * FROM inventoryitem")
dbFetch(rs)

gives me the following error:

Error in result_fetch(res@ptr, n, ...) : nanodbc/nanodbc.cpp:3110: 07009: [Microsoft][ODBC Driver 13 for SQL Server]Invalid Descriptor Index

What am I doing wrong? Please, no RODBC solutions. Thanks!

7
Leaving out dbFetch(), does rs come through as a dataframe? - Parfait
From the source and man page, dbGetQuery() ... calls 'dbSendQuery()', then 'dbFetch()', ensuring that the result is always free-d by 'dbClearResult()'. - r2evans
rs <- dbSendQuery(con, "SELECT * FROM InventoryItem") works and creates rs as class <odbcresult> (it's not a dataframe). dbFetch(rs) then gives me the error: "Error in result_fetch(res@ptr, n, ...) : nanodbc/nanodbc.cpp:3110: 07009: [Microsoft][ODBC Driver 13 for SQL Server]Invalid Descriptor Index" - user2948714
I have the same problem as you and came to the conclusion that it has to be a bug. I wound up returning to RODBC instead, which, while a bit slower, at least works. - ErrantBard
This is a known issue, see here. - NGaffney

7 Answers

12
votes

I have also been struggling with this issue for several months. However, I have come across a solution that may help you as well.

In a nutshell, the issue occurs when certain text columns do not appear after integer/numeric columns. When the columns are not aligned properly in the query, an error of invalid index is thrown and your connection may freeze. The issue then is, how do I know what to put at the end of my query?

To determine this, one could typically examine a column using class() or typeof(). To examine such information from the database, you can use a query such as:

dbColumnInfo(dbSendQuery(con, "SELECT * from schema.table")) # You may not require the schema part...

This will return a table with a type field for every column in the data-set of interest. You can then use this table as an index to sort the select() statement. My particular difficulty is that the type field in the table was all numbers! However, I noticed that every column with a negative number, when placed at the end of the select statement, fixed my query and I could pull the whole table just fine. For example, my full solution:

library(DBI); library(dplyr); library(dbplyr)

# Create my index of column types (ref to the current order)
res <- dbSendQuery(con, "SELECT * from schema.table")
index <- dbColumnInfo(res)
dbClearResult(res) # Release the result so the connection is free again
index$type <- as.integer(index$type) # B/c they are + and - numbers!

# Create the ref to the table
mySQLTbl <- tbl(con, in_schema("schema", "table"))

# Use the select statement to put all the + numbered columns first!
mySQLTbl %>%
  select(c(which(index$type >= 0),
           which(index$type < 0)))

As for why this occurs, I am not sure, and I do not have the data access privileges to dig much deeper in my use case.

7
votes

I appreciate that this question was asked a long time ago, but I've managed to find a workaround. The answers above got me most of the way there. The problem I had was with columns of type nvarchar that had a CHARACTER_MAXIMUM_LENGTH in the schema table of -1, which I understand means they are the maximum length possible.

My solution was to look up the relevant table in the INFORMATION_SCHEMA.COLUMNS table and then rearrange my fields appropriately:

require(DBI);require(odbc)
library(tidyverse)
con <- DBI::dbConnect(odbc::odbc(), dsn = 'SQLSERVER1', database = 'AcumaticaDB')

column.types <- dbGetQuery(con, "SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME='inventoryitem'")

ct <- column.types %>%
  mutate(cml = case_when(
    is.na(CHARACTER_MAXIMUM_LENGTH) ~ 10,
    CHARACTER_MAXIMUM_LENGTH == -1 ~ 100000,
    TRUE ~ as.double(CHARACTER_MAXIMUM_LENGTH)
    )
  ) %>%
  arrange(cml) %>%
  pull(COLUMN_NAME)

fields <- paste(ct, collapse=", ")
query <- paste("SELECT", fields, "FROM inventoryitem")

tbl(con, sql(query)) %>% head(5)
6
votes
rs <- dbGetQuery(con, "SELECT * FROM inventoryitem")
dbFetch(rs)

If the inventoryitem table contains a mix of long data/variable-length columns (e.g. VARBINARY, VARCHAR) and columns of simple types (e.g. INT), you cannot query them in arbitrary order via ODBC.

Applications should make sure to place long data columns at the end of the select list.

Long data is retrieved from database using ODBC API call SQLGetData and it has to be retrieved after the other data in the row has been fetched.

These are known and documented ODBC restrictions:

To retrieve long data from a column, an application first calls SQLFetchScroll or SQLFetch to move to a row and fetch the data for bound columns. The application then calls SQLGetData.

See https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/getting-long-data
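
A minimal sketch of how the select list could be reordered so that long-data columns come last. The helper name order_long_columns_last, the list of long types, and the sample metadata (including its column names) are all assumptions made for illustration; in practice the data frame would come from a query against INFORMATION_SCHEMA.COLUMNS for the real table.

```r
# Hypothetical helper (not part of DBI or odbc): returns the column names
# with long-data columns moved to the end of the list.
order_long_columns_last <- function(column_info) {
  # Assumed set of long-data types; adjust for your own schema.
  long_types <- c("text", "ntext", "image", "xml")
  # A CHARACTER_MAXIMUM_LENGTH of -1 marks (n)varchar(max)/varbinary(max).
  is_long <- column_info$DATA_TYPE %in% long_types |
    (!is.na(column_info$CHARACTER_MAXIMUM_LENGTH) &
       column_info$CHARACTER_MAXIMUM_LENGTH == -1)
  # FALSE sorts before TRUE; ties keep their original order.
  column_info$COLUMN_NAME[order(is_long)]
}

# Made-up metadata standing in for an INFORMATION_SCHEMA.COLUMNS result.
column_info <- data.frame(
  COLUMN_NAME = c("InventoryID", "Descr", "Body", "BasePrice"),
  DATA_TYPE = c("int", "nvarchar", "nvarchar", "decimal"),
  CHARACTER_MAXIMUM_LENGTH = c(NA, 256, -1, NA),
  stringsAsFactors = FALSE
)

fields <- order_long_columns_last(column_info)
query <- paste("SELECT", paste(fields, collapse = ", "), "FROM inventoryitem")
print(query)
```

The resulting string can then be passed to dbGetQuery(con, query). Only the nvarchar(max) column is moved here; bounded columns keep their relative order.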

5
votes

There is a workaround:

Reorder your SELECT statements such that longer datatypes (typically strings) are last.

If you have a complex query that is generated by dbplyr itself, then get the SQL query directly via show_query(). Copy-paste and modify the first SELECT statement such that long datatypes are last in the list. It should then work.

EDIT: in many cases it is possible to reorder the fields by adding this to the query:

%>% select(var1, var2, textvar1, textvar2)
1
votes

I ran into this problem recently too. Here is my solution. Basically, you first have to fetch the column information from the database and then reorder the columns based on it. A table can mix columns with positive and negative type codes, so sorting the positive ones first and the negative ones last does the trick.

It works perfectly with my data when I hit the "Invalid Descriptor Index" issue. Please let me know whether it works for you too.

sqlFetchData <- function(connection, database, schema, table, nobs = 'All') {

  # Wrapper function to fetch data from SQL Server.
  #
  # connection: an established odbc connection
  # database: database name
  # schema: a schema under the main database
  # table: the name of the data table to be fetched
  # nobs: number of observations to be fetched. Either 'All' or an integer.
  #       The default value is 'All'; matching is case-insensitive
  #       ('all', 'ALL', etc.).

  if (is.character(nobs)) {
    if (toupper(nobs) == 'ALL') {
      obs_text <- 'select'
    } else {
      stop("nobs could either be 'ALL' or a scalar integer number")
    }
  } else {
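    # Accept whole numbers such as 10 as well as integer literals such as 10L
    if (is.numeric(nobs) && length(nobs) == 1 && nobs == as.integer(nobs)) {
      obs_text <- paste('select top ', as.integer(nobs), sep = '')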
    } else {
      stop("nobs could either be 'ALL' or a scalar integer number")
    }
  }

  initial_sql <- paste("select * from ", database, '.', schema, ".", table, 
                       sep = '')
  dbquery <- dbSendQuery(connection, initial_sql)
  cols <- dbColumnInfo(dbquery) 
  dbClearResult(dbquery)

  #' sort the rows by query type due to error message:
  #' Invalid Descriptor Index 

  colInfo <- cols
  colInfo$type <- as.integer(colInfo$type)
  cols_neg <- colInfo[which(colInfo$type < 0), ]
  cols_neg <- cols_neg[order(cols_neg[, 2]), ]
  cols_pos <- colInfo[which(colInfo$type >= 0), ]
  cols_pos <- cols_pos[order(cols_pos[, 2]), ]
  cols <- rbind(cols_pos, cols_neg)

  # Build "select col1, col2, ..." from the reordered column names
  sql1 <- paste(obs_text, paste(cols$name, collapse = ', '))
  data_sql <- paste(sql1, ' from ', database, '.', schema, '.', table, 
                    sep = '')

  dataFetch <- dbGetQuery(connection, data_sql)[, colInfo$name]
  return(dataFetch)
}
1
votes

ODBC/DBI converts character columns in the database to 'ntext' when the connection is made. So, you need to convert your character variables (say x) in the SQL string in R with CONVERT(varchar(100), x). The dbGetQuery function should then work.
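
As a rough sketch of that idea, the CONVERT expressions can be generated in R rather than typed by hand. The helper name convert_to_varchar and the column names are hypothetical, and the default length of 100 simply mirrors the example above; note that CONVERT to varchar(100) truncates any value longer than 100 characters.

```r
# Hypothetical helper: wraps each column in CONVERT(varchar(n), ...) and
# keeps the original column name as the alias.
convert_to_varchar <- function(columns, n = 100) {
  sprintf("CONVERT(varchar(%d), %s) AS %s", as.integer(n), columns, columns)
}

# Made-up column names: one numeric column left as-is, two text columns cast.
plain_columns <- c("InventoryID")
text_columns <- c("Descr", "Body")

select_list <- c(plain_columns, convert_to_varchar(text_columns))
query <- paste("SELECT", paste(select_list, collapse = ", "), "FROM inventoryitem")
print(query)
```

The query string would then be run with dbGetQuery(con, query).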

0
votes

I got this error as a result of trying to load in a timestamp variable. Try removing any timestamp variables from your query.

Try the below or similar. Let me know what works and I'll update my post.

require(DBI);require(odbc)
con <- DBI::dbConnect(odbc::odbc(), dsn = 'SQLSERVER1', database = 'AcumaticaDB')

# SQL Server string literals need single quotes, so the R string uses double quotes
column.types <- DBI::dbGetQuery(
    con,
    "SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'inventoryitem'"
)

sql <- paste0(
    'select ',
    paste(column.types$COLUMN_NAME[column.types$DATA_TYPE != 'timestamp'], collapse = ', '),
    ' from inventoryitem'
)

# dbGetQuery() already fetches the rows and returns a data frame
dbGetQuery(con, sql)