3
votes

I have a Cassandra database and I want to use the CQL "IN" query. To do that I have to change the order of the elements in my compound primary key (only the last piece is available for "IN" queries). The table is quite big but does not span multiple nodes yet.

So what I have tried now (which is not working) is the following:

  1. create a new column family with identical columns but a different order of primary key elements
  2. stop write processes and run nodetool flush
  3. copy all /data/keyspace/columnfamily/ files
  4. rename the files to match the new column family name
  5. use sstableloader to load the files into the new column family

But afterwards the primary key is just messed up:

Failed to decode value '53ccb45d4ab0d3560e8c36fd' (for column 'cent') as int: unpack requires a string argument of length 4

I also cannot use COPY ... TO ... because it just times out ...

Any ideas?

1
Don't use an IN query :-) It is not effective (see, for example, lostechies.com/ryansvihla/2014/09/22/…) - jny
@jny Well, Scylla should solve this. But anyhow, if the alternative is just hammering the DB with lots of similar queries (which I am doing currently), the performance is also very poor. I really want to see the IN query for myself instead of just believing :-) - KIC
@jny Also, from your link: "The “in” keyword has it’s place such as when querying INSIDE of a partition, but by and large it’s something I wish wasn’t doable across partitions, I fixed a good dozen performance problems with it so far, and I’ve yet to see it be faster than separate queries plus async." Querying inside a partition is exactly what I want to do. - KIC

1 Answer

0
votes

There are a couple of good bulk loaders available on GitHub that work better and won't time out like the cqlsh COPY TO/FROM tool.

You can find them here or here.

Otherwise I'd recommend using something like Spark to move the data for you.

With Spark and the Cassandra connector, you could use Scala like this once your second table has been created:

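import com.datastax.spark.connector._  // adds cassandraTable and saveToCassandra

// Read the needed columns from the source table (sc is an existing SparkContext)
val mydata = sc.cassandraTable("mykeyspace", "mytable")
  .select("key", "column1", "column2", "column3")

// Write the rows into the new table, which has the reordered primary key
mydata.saveToCassandra("whateverkeyspace", "whatevertable", SomeColumns("key", "column1", "column2", "column3"))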
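
For completeness, here is a rough sketch of how the second table could be created and queried from the same spark-shell session. It assumes the Spark Cassandra connector is on the classpath and that `spark.cassandra.connection.host` is already set. The column types and the key layout are hypothetical placeholders (the question does not give the real schema); the only point is that the column you want to use with IN goes last in the primary key.

```scala
import com.datastax.spark.connector.cql.CassandraConnector

// Assumption: run inside spark-shell, so `sc` already exists.
val connector = CassandraConnector(sc.getConf)

connector.withSessionDo { session =>
  // Hypothetical schema: same columns as the source table, with the column
  // intended for IN ("column2" here) moved to the end of the primary key.
  session.execute(
    """CREATE TABLE IF NOT EXISTS whateverkeyspace.whatevertable (
      |  key text,
      |  column1 int,
      |  column2 int,
      |  column3 text,
      |  PRIMARY KEY (key, column1, column2)
      |)""".stripMargin)
}

// ... copy the data with saveToCassandra as shown above ...

connector.withSessionDo { session =>
  // IN on the last key column, restricted to a single partition.
  // The literal values are placeholders.
  session.execute(
    """SELECT key, column1, column2, column3
      |FROM whateverkeyspace.whatevertable
      |WHERE key = 'some-key' AND column1 = 1 AND column2 IN (10, 20)""".stripMargin)
}
```

Because the copy goes through the normal write path, every row is re-serialized under the new key order, so this route should avoid the decoding error you got from renaming the SSTable files.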