
We are using ByteOrderedPartitioner to store time series for a new project. CQL3 was fine for us for a while, but then we switched to Hector to move on, and now our range query doesn't work.

C* version: 2.0.7

Hector version: 1.0-5

Schema:

        ColumnFamilyDefinition cfd = HFactory.createColumnFamilyDefinition(
                keyspaceName, columnFamilyName,
                ComparatorType.UTF8TYPE);
        cfd.setComparatorTypeAlias("(IntegerType,IntegerType,IntegerType)");
        cfd.setKeyValidationClass("CompositeType(IntegerType,IntegerType,IntegerType)");
        cfd.setDefaultValidationClass(ComparatorType.UTF8TYPE.getClassName());

    RowKey: 100:20:11
    => (name=column1, value=AAL, timestamp=1401745673543000)
    => (name=column2, value=NYC, timestamp=1401745673543002)
    RowKey: 100:20:12
    => (name=column1, value=AAL, timestamp=1401745673543000)
    => (name=column2, value=TXA, timestamp=1401745673543002)

And so on.

Query used to iterate over all rows of the Cassandra column family:

    Composite startComposite = new Composite();
    startComposite.addComponent(0, 100, EQUAL);
    startComposite.addComponent(1, 20, EQUAL);
    startComposite.addComponent(2, 11, EQUAL);

    Composite endComposite = new Composite();
    endComposite.addComponent(0, 100, EQUAL);
    endComposite.addComponent(1, 20, EQUAL);
    endComposite.addComponent(2, 18, GREATER_THAN_EQUAL);

    int rowCount = 100;
    RangeSlicesQuery<Composite, String, String> rangeSlicesQuery = HFactory
            .createRangeSlicesQuery(ksp, CompositeSerializer.get(), StringSerializer.get(), StringSerializer.get())
            .setColumnFamily(columnFamilyName)
            .setRange("", "", false, rowCount);

    rangeSlicesQuery.setKeys(startComposite, endComposite);
    QueryResult<OrderedRows<Composite, String, String>> result = rangeSlicesQuery.execute();

    System.out.println(result.get());

The query returns empty rows:

    Rows({})


1 Answer


This is a Cassandra anti-pattern. There are very few good reasons to use ByteOrderedPartitioner, and this pattern is not one of them. You'll end up with all writes and queries essentially hitting one node (or a small number of nodes, depending on your cluster size).

There are many good examples of time-series data models in Cassandra. Here is one from DataStax.
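As a rough illustration of that approach (the table and column names below are made up for this sketch, not taken from the question), a typical CQL3 time-series model keeps the default Murmur3Partitioner, uses the data source as the partition key so writes spread across the cluster, and uses the timestamp as a clustering column so slices within a partition stay ordered and cheap to read:

    -- Hypothetical table; shown only to illustrate the pattern.
    -- With the default Murmur3Partitioner, partitions (source_id) are spread
    -- evenly across the cluster, while rows inside a partition are stored
    -- sorted by event_time, so a time-range slice is a single sequential read.
    CREATE TABLE measurements (
        source_id  int,
        event_time timestamp,
        column1    text,
        column2    text,
        PRIMARY KEY (source_id, event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);

    -- Time-range queries are then scoped to one partition:
    SELECT column1, column2
    FROM measurements
    WHERE source_id = 100
      AND event_time >= '2014-06-01 00:00:00'
      AND event_time <  '2014-06-02 00:00:00';

Queries that need to span many sources issue one such query per source (in parallel if needed) instead of relying on the token order of the row keys.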