2
votes

I'm using curl to access HBase over REST, and I'm having trouble inserting data. I followed the Stargate documentation, but when I use the same syntax I get 400 (Bad Request) and 405 (Method Not Allowed) errors. I have pasted the command below. Please tell me where I am going wrong.

The Stargate documentation says:

POST /<table>/<row>/<column> (:qualifier)?/<timestamp>
curl -H "Content-Type: text/xml" --data '[...]' http://localhost:8000/test/testrow/test:testcolumn

My curl command is as follows:

curl -H "Content-Type: text/xml" --data '[<CellSet><Row key="111"><Cell column="f1">xyz</Cell></Row></CellSet>]' http://localhost:8080/mytable/row/fam

What is the right way to do this? This command gives me a Bad Request error.

I'm also trying the same thing from a Python client, which gives me a ColumnFamilyNotFoundException. I read the XML data to be passed to the Stargate server from a file. The code is as follows:

url = 'http://localhost:8080/mytable/row/fam'
f = open('example.xml', 'r')
xmlData = f.read()
r = requests.post(url, data=xmlData, headers=headers)

example.xml has the following:

<CellSet>
    <Row key="111">
        <Cell column="fam:column1">
            xyz
        </Cell>
    </Row>
</CellSet>

2 Answers

4
votes

It was a very simple mistake. HBase's REST interface expects every value to be base64 encoded. The row key, the columnfamily:column name, and the cell value all have to be base64 encoded before they go into the XML.
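As a rough illustration of that fix, here is a minimal Python sketch that base64 encodes the row key, the column name, and the value from the question's example.xml before building the CellSet document. The helper names b64 and build_cellset are made up for this example and are not part of any HBase client library.

```python
import base64
from xml.sax.saxutils import quoteattr


def b64(text):
    """Base64-encode a str and return the result as an ASCII str."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_cellset(row_key, cells):
    """Build a CellSet XML document for a single row.

    cells maps "family:qualifier" to the cell value (plain str).
    Hypothetical helper, written only for this illustration.
    """
    parts = ["<CellSet>", "<Row key=%s>" % quoteattr(b64(row_key))]
    for column, value in cells.items():
        # Both the column name and the cell value are base64 encoded.
        parts.append(
            "<Cell column=%s>%s</Cell>" % (quoteattr(b64(column)), b64(value))
        )
    parts.append("</Row>")
    parts.append("</CellSet>")
    return "".join(parts)


# Same row key, column, and value as in the question's example.xml.
xml_body = build_cellset("111", {"fam:column1": "xyz"})
print(xml_body)
```

The resulting string could then presumably be sent the same way as in the question, for example with requests.post(url, data=xml_body, headers={'Content-Type': 'text/xml'}), assuming the REST server and the table from the question.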

0
votes

Inserting is easy using starbase.

$ pip install starbase

Create a table named table1 with columns col1 and col2:

from starbase import Connection
connection = Connection()
table = connection.table('table1')
table.create('col1', 'col2')

Insert a row into table1. The row key is row1.

table.insert(
    'row1', 
    {
        'col1': {'key1': 'val1', 'key2': 'val2'}, 
        'col2': {'key3': 'val3', 'key4': 'val4'}
    }
)

You may also insert in batch.

To avoid duplicating code, assume that our data is stored in a variable named data (a dict).

data = {
    'col1': {'key1': 'val1', 'key2': 'val2'}, 
    'col2': {'key3': 'val3', 'key4': 'val4'}
}

batch = table.batch()
for i in range(100, 5000):
    batch.insert('row_%s' % i, data)
batch.commit(finalize=True)

Updating is done with the update method, which works the same way as insert.

To fetch a row, use the fetch method.

Fetch entire row:

table.fetch('row1')

Fetch only col1 data:

table.fetch('row1', 'col1')

Fetch only col1 and col2 data:

table.fetch('row1', ['col1', 'col2'])

Fetch only col1:key1 and col2:key4 data:

table.fetch('row1', {'col1': ['key1'], 'col2': ['key4']})

Alter table schema:

Add columns col3 and col4

table.add_columns('col3', 'col4')

Drop columns

table.drop_columns('col1', 'col4')

Show table columns

table.columns()

Show all tables

connection.tables()