How to connect HBase with Spark

Question

I want to load data from HBase and then process it using Spark. I use Spark 2.0.2 on Google Cloud and HBase 1.2.5.

On the internet, I have found some examples that use JavaHBaseContext, but I don't know where to find this class because I don't have any HBase JAR file called hbase-spark.

I have also found this code, which uses HBaseConfiguration and ConnectionFactory to connect to the HBase database:

    Configuration conf = HBaseConfiguration.create();
    conf.addResource(new Path("/etc/hbase/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
    conf.set(TableInputFormat.INPUT_TABLE, tableName);

    Connection connection = ConnectionFactory.createConnection(conf);

    Admin admin = connection.getAdmin(); 
    Table tab = connection.getTable(TableName.valueOf(tableName));
    byte [] row = Bytes.toBytes("TestSpark");
    byte [] family1 = Bytes.toBytes("MetaData");
    byte [] height = Bytes.toBytes("height");
    byte [] width = Bytes.toBytes("width");

    Put put = new Put(row);
    put.addColumn(family1, height, Bytes.toBytes("256"));
    put.addColumn(family1, width, Bytes.toBytes("384"));

    tab.put(put);

But I get the following compile error on the ConnectionFactory.createConnection(conf) line:

    error: unreported exception IOException; must be caught or declared to be thrown
        Connection connection = ConnectionFactory.createConnection(conf);

Can anyone tell me how to load data from HBase so that it can be processed using Spark?

PS: I program in Java.

hbase-spark.jar is the (emerging) standard HBase plugin for Spark, that was contributed by Cloudera and is available (a) in the CDH distro, (b) as an additional JAR for other distros using HBase 1.x, or (c) natively in HBase 2.x -- see blog.cloudera.com/blog/2014/12/… and blog.cloudera.com/blog/2015/08/… — Samson Scharfrichter
There's also shc, promoted by Hortonworks as a Spark package: docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/… and repo.hortonworks.com/content/repositories/releases/com/… — Samson Scharfrichter

JFPicard JFPicard · Accepted Answer · 2017-07-20T19:31:02

The error you've mentioned occurs because Connection connection = ConnectionFactory.createConnection(conf); can throw a checked exception (IOException). As the message says, you must surround it with try ... catch (or declare the exception with throws on the enclosing method):

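    // Needs: import java.io.IOException;
    // try-with-resources keeps the connection in scope for the block and closes it afterwards.
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
        // Use the connection here (getTable, put, and so on).
    } catch (IOException e) { // the checked exception declared by createConnection
        // Handle the failure; rethrowing as unchecked is only one possible choice.
        throw new RuntimeException("Could not connect to HBase", e);
    }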
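
The two ways of satisfying the compiler (catching the exception, or declaring it) can be tried without an HBase cluster. The sketch below is only an illustration, not HBase code: FakeConnection is a hypothetical stand-in for org.apache.hadoop.hbase.client.Connection, and its create method is assumed to throw IOException in the same way ConnectionFactory.createConnection is declared to.

```java
import java.io.IOException;

public class CheckedExceptionDemo {

    // Hypothetical stand-in for an HBase Connection (an assumption for illustration only).
    static class FakeConnection implements AutoCloseable {
        private boolean open = true;

        // Like ConnectionFactory.createConnection, this declares a checked IOException.
        static FakeConnection create(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("cannot reach cluster");
            }
            return new FakeConnection();
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    // Option 1: catch the checked exception where the connection is opened.
    static String connectAndCatch(boolean fail) {
        try (FakeConnection connection = FakeConnection.create(fail)) {
            return connection.isOpen() ? "connected" : "closed";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    // Option 2: declare the exception and let the caller decide what to do.
    static String connectOrThrow(boolean fail) throws IOException {
        try (FakeConnection connection = FakeConnection.create(fail)) {
            return connection.isOpen() ? "connected" : "closed";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(connectAndCatch(false));
        System.out.println(connectAndCatch(true));
        System.out.println(connectOrThrow(false));
    }
}
```

Either option removes the "unreported exception" compile error. Which one fits depends on whether the method that opens the connection is the right place to handle a failed connection.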
