2
votes

I store my data in Cassandra NoSQL database with the following schema:

CREATE TABLE bidding3(
    item_id bigint,
    user_id bigint,
    bid_price bigint,
    current_time text,
    PRIMARY KEY (item_id,current_time)
) WITH CLUSTERING ORDER BY (current_time,DESC);


CREATE TABLE bidding_user(
    item_id bigint,
    user_id bigint,
    bid_price bigint,
    current_time text,
    PRIMARY KEY (user_id,current_time)
) WITH CLUSTERING ORDER BY (current_time,DESC);

And then I use SELECT * FROM bidding_user LIMIT 5;. I would expect the data is arranged in time-series and it did for the first day, but things changed today.

Wed Jul 06 20:09:04 UTC 2016
Wed Jul 06 19:10:04 UTC 2016
Thu Jul 07 19:09:04 UTC 2016.

I think the database ignores the date but only care about the time.

Any idea how to fix this?

2

2 Answers

4
votes
SELECT * FROM bidding_user LIMIT 5;

The problem here, is that you have not specified a WHERE clause. Cassandra only maintains sort order within a partition key. Your partition key is user_id, so my guess (since you are only showing us the current_time column, and not the user_id) is that your first two rows share a user_id, and the user_id for the third row is different.

Consider the following table:

CREATE TABLE stackoverflow.timestamptest (
    userid text,
    activetime timestamp,
    value text,
    PRIMARY KEY (userid, activetime)
) WITH CLUSTERING ORDER BY (activetime ASC)

If I query without a WHERE clause, I get:

aploetz@cqlsh:stackoverflow> SELECT userid,activetime FROM timestamptest ;

 userid | activetime
--------+--------------------------
      a | 2015-09-25 11:33:33+0000
      a | 2015-10-22 14:26:00+0000
      c | 2015-12-28 19:12:00+0000
      d | 2016-01-22 14:57:54+0000
      d | 2016-01-22 14:57:54+0000
      d | 2016-07-07 19:44:04+0000
      b | 2015-10-22 14:26:00+0000
      b | 2016-07-07 19:44:10+0000

(8 rows)

As you can see, the order of my clustering key (activetime) is really only meaningful within each partition key (userid).

In other words, if you want your result set to be sorted, you will need to specify your user_id partition key in a WHERE clause with your query.

Not to self-promote or anything, but I wrote an article on this subject early last year, which may help you understand this better:

PlanetCassandra: We Shall Have Order!

0
votes

The current_time is of type text so sorting is done based on characters of the date string.

Try changing the type of current_time to timestamp.