I ran across a database query written in R that runs against a MapR data store through the Apache Drill ODBC driver. My program hits a performance ceiling at about 700,000 rows, so I'm looking into alternatives to a conventional SQL database.
This question is about using R to run a SQL query and store the result in the working environment. I generalized the query to just SELECT * FROM ... for the sake of this question.
Say you're running a three-node MapR cluster and execute a SQL query against it from R. Will the query return results faster because it's MapR, or would a single RDBMS perform the same?
library(RODBC)
# initialize the connection
ch <- odbcConnect("drill64")
# run the query
df <- sqlQuery(ch, "SELECT * FROM state")
# code to write output to file goes here
# close the connection so we don't get a warning at the end
odbcClose(ch)
Performance-wise, is this the same as querying MySQL, whether through odbcConnect() with a MySQL DSN or through a package such as RMySQL?
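One rough way to check this empirically would be to time the same fetch against each connection. The sketch below is only an illustration: time_fetch() is a hypothetical helper (not part of RODBC), and fake_fetch() is an in-memory stand-in so the code runs without any database.

```r
# Hypothetical helper (not part of RODBC): run a fetch function several
# times and return the median elapsed wall-clock time in seconds.
time_fetch <- function(fetch, reps = 3) {
  elapsed <- vapply(
    seq_len(reps),
    function(i) system.time(fetch())[["elapsed"]],
    numeric(1)
  )
  median(elapsed)
}

# Stand-in for a real query so the sketch runs without a database;
# it just builds a data frame in memory.
fake_fetch <- function() {
  data.frame(id = seq_len(1e5), value = runif(1e5))
}

median_secs <- time_fetch(fake_fetch)
print(median_secs)
```

With real connections, the stand-in could be replaced by something like function() sqlQuery(ch, "SELECT * FROM state"), run once per data source, so the Drill and MySQL timings can be compared on the same table.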