
I did a small test comparing a Phoenix batch upsert against straight HBase puts, and Phoenix was much, much slower (6.157 sec vs 0.083 sec for 10,000 records).

Here is my code:

    // PhoenixZK holds the Phoenix JDBC URL (e.g. "jdbc:phoenix:<zk-quorum>")
    SingleConnectionDataSource dataSource = new SingleConnectionDataSource();
    final JdbcTemplate template = new JdbcTemplate();
    template.setDataSource(dataSource);
    dataSource.setUrl(PhoenixZK);
    dataSource.setDriverClassName("org.apache.phoenix.jdbc.PhoenixDriver");

    final PlatformTransactionManager txnManager = new DataSourceTransactionManager(dataSource);
    TransactionTemplate txnTemplate = new TransactionTemplate(txnManager);

    // Recreate the target table
    txnTemplate.execute(new TransactionCallbackWithoutResult() {

        @Override
        protected void doInTransactionWithoutResult(TransactionStatus status) {
            template.execute("DROP TABLE IF EXISTS stats.prod_metrics");
            template.execute("CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date TIMESTAMP not null,\n" +
                    "    txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) ) SALT_BUCKETS=36, COMPRESSION='SNAPPY', REPLICATION_SCOPE=1");
        }
    });

    final int numRec = 10000; // number of rows to upsert
    long startTime = System.currentTimeMillis();
    final Random random = new Random();

    // Upsert numRec rows as a single JDBC batch inside one transaction
    txnTemplate.execute(new TransactionCallbackWithoutResult() {
        @Override
        protected void doInTransactionWithoutResult(TransactionStatus status) {
            template.batchUpdate("UPSERT INTO stats.prod_metrics VALUES (?,?,?)", new BatchPreparedStatementSetter() {

                @Override
                public void setValues(PreparedStatement ps, int i) throws SQLException {
                    ps.setString(1, "localhost-" + random.nextInt());
                    ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
                    ps.setInt(3, random.nextInt());
                }

                @Override
                public int getBatchSize() {
                    return numRec;
                }
            });
        }
    });

    System.out.println("elapse = " + (double) (System.currentTimeMillis() - startTime) / 1000.0);
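
The straight-put side of the benchmark isn't shown here; below is a minimal sketch of what such a test typically looks like with the plain HBase client API. The physical table name, the column family "0" and the qualifier "TXN_COUNT" are assumptions, and the surrounding method is assumed to declare `throws Exception`:

    // Rough sketch of the straight-HBase side of the comparison.
    // Uses org.apache.hadoop.hbase.client.* and org.apache.hadoop.hbase.util.Bytes.
    Configuration conf = HBaseConfiguration.create();
    try (Connection hbaseConn = ConnectionFactory.createConnection(conf);
         Table table = hbaseConn.getTable(TableName.valueOf("STATS.PROD_METRICS"))) {
        List<Put> puts = new ArrayList<>();
        Random random = new Random();
        for (int i = 0; i < 10000; i++) {
            // Row key mirrors the Phoenix primary key (host + timestamp)
            byte[] rowKey = Bytes.toBytes("localhost-" + random.nextInt() + "-" + System.currentTimeMillis());
            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("0"), Bytes.toBytes("TXN_COUNT"), Bytes.toBytes((long) random.nextInt()));
            puts.add(put);
        }
        long start = System.currentTimeMillis();
        table.put(puts); // the client buffers the whole list and sends it in batched RPCs
        System.out.println("elapse = " + (System.currentTimeMillis() - start) / 1000.0);
    }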

Here is the logging from Phoenix:

    16:03:42.544  DEBUG  MutationState                   Sending 10000 mutations for STATS.PROD_METRICS with 20000 key values of total size 1950000 bytes
    16:03:47.784  DEBUG  MutationState                   Total time for batch call of  10000 mutations into STATS.PROD_METRICS: 5235 ms

Why did Phoenix take so much longer than straight puts? Am I doing anything wrong?

Thanks, Sean

I have found that AutoCommit slows down the inserts immensely. I can't tell from the above whether you are using it or not. - markthegrea
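
For context on this comment: with plain Phoenix JDBC the usual pattern is to leave auto-commit off and commit once per batch, because each commit flushes the client-side mutation buffer to the server, whereas with auto-commit on every upsert is sent individually. A minimal sketch, assuming a placeholder JDBC URL and a surrounding method that declares `throws Exception`:

    // Sketch: batched Phoenix upserts over plain JDBC with auto-commit off.
    // Uses java.sql.* only; "jdbc:phoenix:<zk-quorum>" is a placeholder URL.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:<zk-quorum>")) {
        conn.setAutoCommit(false); // buffer mutations on the client
        try (PreparedStatement ps = conn.prepareStatement(
                "UPSERT INTO stats.prod_metrics VALUES (?,?,?)")) {
            for (int i = 0; i < 10000; i++) {
                ps.setString(1, "localhost-" + i);
                ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
                ps.setLong(3, i);
                ps.executeUpdate();
                if ((i + 1) % 1000 == 0) {
                    conn.commit(); // flush every 1,000 rows
                }
            }
            conn.commit(); // flush the remainder
        }
    }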

2 Answers

0 votes

> 0.083 sec for 10,000 records

How did you get these numbers? That is over 100K rows per second; you can only get that kind of throughput on a cluster with 10-20 nodes running multiple (multithreaded) clients.

~2K inserts per second (transactional?) looks pretty normal to me: this is just one client sending data to one RS (Region Server).
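
To illustrate the "multiple (multithreaded) clients" point, the same 10,000 rows could be split across several writer threads, each with its own Phoenix connection. A rough sketch, assuming the table from the question already exists and the surrounding method declares `throws Exception`:

    // Sketch: spreading the upserts over several client threads.
    // Thread count, row count and the JDBC URL are placeholders.
    int threads = 4;
    int rowsPerThread = 10000 / threads;
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<?>> futures = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
        futures.add(pool.submit(() -> {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:<zk-quorum>");
                 PreparedStatement ps = conn.prepareStatement(
                         "UPSERT INTO stats.prod_metrics VALUES (?,?,?)")) {
                conn.setAutoCommit(false);
                Random rnd = new Random();
                for (int i = 0; i < rowsPerThread; i++) {
                    ps.setString(1, "localhost-" + rnd.nextInt());
                    ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
                    ps.setLong(3, rnd.nextInt());
                    ps.executeUpdate();
                }
                conn.commit(); // one flush per thread
                return null;
            }
        }));
    }
    for (Future<?> f : futures) {
        f.get(); // wait for all writers to finish
    }
    pool.shutdown();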

0 votes

Anything will be slower compared to pure HBase. Phoenix is a layer on top of HBase ("a high performance relational database layer over HBase for low latency applications"), so every upsert goes through SQL parsing and type encoding before it becomes an HBase mutation.