1
votes

If I ssh onto the master node of my presto emr cluster, I can run queries. However, I would like to be able to run queries from java source code on my local machine that connects to the emr cluster. I set up my presto emr cluster with default configurations.

I have tried port forwarding, but it still does not seem to work. When I create the connection, I print it out and it is "com.facebook.presto.jdbc.PrestoConnection@XXXXXXX" but I still have doubts if it is actually connected since I can't execute any queries and it always times out.

import java.net.URI;
import java.net.URISyntaxException;
import java.sql.*;

public class PrestoJDBC {
    // JDBC driver name and database URL
    static final String JDBC_DRIVER = "com.facebook.presto.jdbc.PrestoDriver";
    //static final String JDBC_DRIVER = "com.teradata.presto.jdbc42.Driver";

    static final String DB_URL = "jdbc:presto://ec2-XX-XX-XXX-XX.us-east-2.compute.amazonaws.com:8889/hive/default";

    //  Database credentials
    static final String USER = "hadoop";
    static final String PASS = "";

    public static void main(String[] args) throws URISyntaxException {
        Connection conn = null;
        Statement stmt = null;
        try{
            //STEP 2: Register JDBC driver
            Class.forName(JDBC_DRIVER);

            //STEP 3: Open a connection
            //conn = DriverManager.getConnection(DB_URL,USER,PASS);
            conn = DriverManager.getConnection(DB_URL,USER,PASS);
            //STEP 4: Execute a query

            stmt = conn.createStatement();
            System.out.println(conn);
            String sql;
            sql = "select * from onedaytest where readOnly=true;";
            ResultSet rs = stmt.executeQuery(sql);

            //STEP 5: Extract data from result set
            while(rs.next()){

                //Display values
                System.out.println(rs.getString(3));
            }
            //STEP 6: Clean-up environment
            rs.close();
            stmt.close();

            conn.close();
        }catch(SQLException se){
            se.printStackTrace();
        }catch(Exception e){
            e.printStackTrace();
        }finally{
            try{
                if(stmt!=null)
                    stmt.close();
            }catch(SQLException se2){
            }
            try{
                if(conn!=null)
                    conn.close();
            }catch(SQLException se){
                se.printStackTrace();
            }
        }
        System.out.println("Done");
    }

}

I get a java.sql.SQLEXception:Error executing query, and a java.net.SocketTimeoutException. I was wondering if there are any other configurations or things I need to setup to be able to query.

1
Make sure your Security Groups are not preventing the connection from being established. - Piotr Findeisen
I have checked my inbound network connection rules for the emr cluster, but I'm not exactly sure what rule to add or what port number it should be. - moontartan
By default Presto runs on port 8080. Since protocol is HTTP-based, you can verify it works with curl, wget or a web browser - Piotr Findeisen
Resolved. Firewall was blocking connection. Just need to set up port forwarding - moontartan

1 Answers

0
votes

This could be because of firewall, which might not be allowing you to connect to the server. You would have to create an inbound rule in the Master group.

  1. For that, first select your cluster.
  2. Under "Security and access" heading, click on the link of "Security groups for Master".
  3. Now select the security group with name "ElasticMapReduce-master" and then click on "Edit inbound rules".
  4. At the bottom end, click on "Add Rule" and then in the type select "All TCP" and in the source field select "my ip" or add the ip address from where you want to access.
  5. Click on save rules.

Try Again. It will work!!