4
votes

I'm using a composite data type in the row key. The column family is defined as below:

create column family CompositeTest
with comparator = 'UTF8Type'
and key_validation_class = 'CompositeType(UTF8Type,UTF8Type)'
and default_validation_class = 'UTF8Type';

Sample data in this column family looks like this:

RowKey: s2:2222222
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3
-------------------
RowKey: s2:3333333
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3
-------------------
RowKey: s2:1111111
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3
-------------------
RowKey: s1:3333333
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3
-------------------
RowKey: s1:2222222
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3
-------------------
RowKey: s1:1111111
=> (column=param1, value=value1
=> (column=param2, value=value2
=> (column=param3, value=value3

I want to get all the rows whose row key has "s1" as its first component. Is that possible using the Hector client? If not, which Cassandra client makes it possible?

I've tried the following code, but it's not working:

Composite start = new Composite();
start.addComponent(0, "s1", ComponentEquality.EQUAL);

Composite end = new Composite();
end.addComponent(0, "s1", ComponentEquality.GREATER_THAN_EQUAL);

RangeSlicesQuery<Composite, String, String> rangeSlicesQuery = HFactory.createRangeSlicesQuery(keyspace, new CompositeSerializer(), StringSerializer.get(), StringSerializer.get());
rangeSlicesQuery.setKeys(start, end);
rangeSlicesQuery.setRange("param1", "param3", false, 100);
rangeSlicesQuery.setColumnFamily("CompositeTest");
rangeSlicesQuery.setRowCount(11);
QueryResult<OrderedRows<Composite, String, String>> queryResult = rangeSlicesQuery.execute();

Rows<Composite, String, String> rows = queryResult.get();
Iterator<Row<Composite, String, String>> rowsIterator = rows.iterator();

Thanks in advance...

2
It is possible to do that with both the Hector client and the Astyanax client. - Nikola Yovchev
If it is possible, could you please share how to do it using Hector? - Jignesh Dhua

2 Answers

1
votes

The problem is that you are trying to perform a slice on the row keys. That does not work if Cassandra is using a hash-based partitioner (e.g. RandomPartitioner or Murmur3Partitioner), because rows are then ordered by the token of the key rather than by the key itself. It might be possible (though I've never tried it) with an order-preserving partitioner. In your case that would have to be something like a CompositeKeyPartitioner, which unfortunately doesn't exist, so you would have to write it yourself. You would then also have to configure the cluster by calculating the right tokens according to your data. As you can see, it is not the easiest way.
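To make the partitioner point concrete, here is a minimal, self-contained sketch (plain Java, no Cassandra involved). It assumes a token that is simply the MD5 of the key string read as a non-negative integer, which is only loosely modeled on what RandomPartitioner does, and it uses the printable form "s1:1111111" rather than the real serialized composite bytes. The class and method names are made up for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TokenOrderSketch {

    // Illustrative token: MD5 of the key, read as a non-negative integer.
    // This is an assumption for the sketch, not Cassandra's exact token function.
    static BigInteger token(String rowKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(rowKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }

    // Returns the row keys in ascending token order, i.e. the order a
    // range scan would walk them under a hash-based partitioner.
    static List<String> keysInTokenOrder(List<String> rowKeys) {
        Map<BigInteger, String> byToken = new TreeMap<BigInteger, String>();
        for (String key : rowKeys) {
            byToken.put(token(key), key);
        }
        return new ArrayList<String>(byToken.values());
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "s1:1111111", "s1:2222222", "s1:3333333",
                "s2:1111111", "s2:2222222", "s2:3333333");
        // Print token and key so the scan order can be compared with key order.
        for (String key : keysInTokenOrder(keys)) {
            System.out.println(token(key).toString(16) + "  " + key);
        }
    }
}
```

Whether the "s1" and "s2" keys actually interleave for these six values depends on the hashes, but nothing in the token order keeps keys with a common prefix together, which is why a start/end pair built from "s1" does not describe a meaningful range.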

BUT you can achieve the same thing if you put the composite value in the column name instead of the key. You can define your CF this way:

create column family CompositeTest
   with comparator = 'CompositeType(UTF8Type,UTF8Type)'
   and key_validation_class = 'UTF8Type'
   and default_validation_class = 'UTF8Type';

And store the data like:

RowKey: s2
=> (column=2222222:param1, value=value1
=> (column=2222222:param2, value=value2
=> (column=2222222:param3, value=value3
=> (column=3333333:param1, value=value1
=> (column=3333333:param2, value=value2
=> (column=3333333:param3, value=value3
=> (column=1111111:param1, value=value1
=> (column=1111111:param2, value=value2
=> (column=1111111:param3, value=value3
-------------------
RowKey: s1
=> (column=3333333:param1, value=value1
=> (column=3333333:param2, value=value2
=> (column=3333333:param3, value=value3
=> (column=2222222:param1, value=value1
=> (column=2222222:param2, value=value2
=> (column=2222222:param3, value=value3
=> (column=1111111:param1, value=value1
=> (column=1111111:param2, value=value2
=> (column=1111111:param3, value=value3

With this structure the query you had in mind is quite easy: fetch the row by its key ("s1"), and you can always slice on the column name to select only the columns inside the interval you want.
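As a rough illustration of why this layout helps, here is a small in-memory model in plain Java (not Hector code). It assumes each row keeps its columns sorted by the composite column name, compared component by component; ColumnName, put and slice are hypothetical names invented for this sketch.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class WideRowSketch {

    // Hypothetical stand-in for a two-component composite column name.
    static final class ColumnName implements Comparable<ColumnName> {
        final String id;
        final String param;

        ColumnName(String id, String param) {
            this.id = id;
            this.param = param;
        }

        // Compare by first component, then by second.
        @Override
        public int compareTo(ColumnName other) {
            int byId = id.compareTo(other.id);
            return byId != 0 ? byId : param.compareTo(other.param);
        }

        @Override
        public String toString() {
            return id + ":" + param;
        }
    }

    // Stores one column in the row identified by rowKey.
    static void put(Map<String, NavigableMap<ColumnName, String>> cf,
                    String rowKey, String id, String param, String value) {
        NavigableMap<ColumnName, String> row = cf.get(rowKey);
        if (row == null) {
            row = new TreeMap<ColumnName, String>();
            cf.put(rowKey, row);
        }
        row.put(new ColumnName(id, param), value);
    }

    // Returns the columns of one row whose first name component equals id.
    static NavigableMap<ColumnName, String> slice(
            Map<String, NavigableMap<ColumnName, String>> cf, String rowKey, String id) {
        NavigableMap<ColumnName, String> row = cf.get(rowKey);
        if (row == null) {
            return new TreeMap<ColumnName, String>();
        }
        // "" sorts before every param; id + "\0" is the smallest string after id.
        return row.subMap(new ColumnName(id, ""), true,
                          new ColumnName(id + "\0", ""), false);
    }

    public static void main(String[] args) {
        Map<String, NavigableMap<ColumnName, String>> cf =
                new TreeMap<String, NavigableMap<ColumnName, String>>();
        for (String rowKey : new String[] {"s1", "s2"}) {
            for (String id : new String[] {"1111111", "2222222", "3333333"}) {
                for (int i = 1; i <= 3; i++) {
                    put(cf, rowKey, id, "param" + i, "value" + i);
                }
            }
        }
        System.out.println("everything for s1: " + cf.get("s1"));
        System.out.println("s1 / 2222222 only: " + slice(cf, "s1", "2222222"));
    }
}
```

In this model "all rows for s1" becomes a single lookup by row key, and narrowing to one id is a contiguous range over the sorted column names, which is the same idea as a column slice with a composite start and end.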

2
votes

This is not possible in Cassandra using any client. While the row key appears as a composite object to you, the application developer, in Cassandra itself the row key is a single byte array that is stored in Cassandra's SSTables as one atomic value.

This means you can only query a row by its entire key, not by just part of it. Otherwise, you'd have to scan the entire column family to find the matches, which would be tremendously expensive.
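A small sketch of what "a single byte array" means here. It assumes the usual CompositeType layout as I understand it: for each component a 2-byte big-endian length, the component bytes, then one end-of-component byte (0 for a plain value). The encode helper is written for this example and is not part of any client library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CompositeKeySketch {

    // Serializes the components into one byte array:
    // [2-byte length][UTF-8 bytes][end-of-component byte 0], repeated.
    // The layout is an assumption stated in the text above.
    static byte[] encode(String... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String component : components) {
            byte[] bytes = component.getBytes(StandardCharsets.UTF_8);
            out.write((bytes.length >> 8) & 0xFF); // high byte of length
            out.write(bytes.length & 0xFF);        // low byte of length
            out.write(bytes, 0, bytes.length);     // component value
            out.write(0);                          // end-of-component marker
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] full = encode("s1", "1111111");
        byte[] prefixOnly = encode("s1");
        System.out.println("full key:    " + Arrays.toString(full));
        System.out.println("prefix only: " + Arrays.toString(prefixOnly));
        // A key lookup matches the whole array, so the prefix alone finds nothing.
        System.out.println("same key? " + Arrays.equals(full, prefixOnly));
    }
}
```

The row is located by the whole array, so a key built from "s1" alone is simply a different key from any of the stored ones.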

That being said, if you do need to be able to query rows in a column family using only part of their row key, then I would strongly recommend creating separate index column families for those key parts. This will allow you to use standard key / column lookups to find all of the rows in your raw data column family which match your criteria.
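A minimal in-memory sketch of the index idea, in plain Java rather than against a real cluster. It assumes one "index" structure keyed by the first key component whose entries are the full row keys; in Cassandra that would be a second column family written alongside the data. All names (ManualIndexSketch, insert, findByFirstComponent) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class ManualIndexSketch {

    // Stand-in for the data column family: full row key -> columns.
    private final Map<String, Map<String, String>> data =
            new HashMap<String, Map<String, String>>();

    // Stand-in for the index column family: first component -> full row keys.
    private final Map<String, SortedSet<String>> index =
            new HashMap<String, SortedSet<String>>();

    // Writes the row and records its key under the first component.
    void insert(String first, String second, Map<String, String> columns) {
        String rowKey = first + ":" + second;
        data.put(rowKey, columns);
        SortedSet<String> keys = index.get(first);
        if (keys == null) {
            keys = new TreeSet<String>();
            index.put(first, keys);
        }
        keys.add(rowKey);
    }

    // Reads the index entry first, then fetches each row by its full key.
    Map<String, Map<String, String>> findByFirstComponent(String first) {
        Map<String, Map<String, String>> result =
                new TreeMap<String, Map<String, String>>();
        SortedSet<String> keys = index.get(first);
        if (keys == null) {
            return result;
        }
        for (String rowKey : keys) {
            result.put(rowKey, data.get(rowKey));
        }
        return result;
    }

    public static void main(String[] args) {
        ManualIndexSketch store = new ManualIndexSketch();
        Map<String, String> columns = new TreeMap<String, String>();
        columns.put("param1", "value1");
        store.insert("s1", "1111111", columns);
        store.insert("s1", "2222222", columns);
        store.insert("s2", "1111111", columns);
        System.out.println(store.findByFirstComponent("s1").keySet());
    }
}
```

The cost is a second write per insert (and a matching cleanup on delete), in exchange for turning the prefix query into ordinary key lookups.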