5
votes

The problem

I'm trying to save a serialized object (using cPickle) into a Cassandra 1.2 column using the Python cql library. I've tried defining the column as both text (UTF-8 string) and blob; in both cases I receive the same error (shown below).

The object is a Python dict:

import datetime

obj = {'id': 'sometextid',
       'time_created': datetime.datetime(2013, 5, 12),  # a datetime (illustrative value)
       'some other string property': 'some other value'
}

The error is this:

raise cql.ProgrammingError("Bad Request: %s" % ire.why)
cql.apivalues.ProgrammingError: Bad Request: line 31:36 no viable alternative at character '\'

And looking at the executed CQL statement I can see some '\' characters after pickling the object, for instance:

Part of the pickled object

cdatetime
datetime
p4
(S'\x07\xdd\x03\x1c\x000\x13\x05\xd0<'
tRp5

My questions

What is the usual way of serializing a Python dict (including datetimes) to save it into Cassandra 1.2 using the cql library? Is there a better or more straightforward way of doing this?

Thanks in advance!

2
I'm not going to make this an actual Answer because I'm unfamiliar with Cassandra. But do you suppose that it's failing because it stops reading at the single-quote? (Because it's interpreting it as a quoted string or something? Or perhaps it's trying to interpret the backslash-x as a control character?) If so, perhaps JSON encoding or pickle -> Base64 would work better (because they're all text with well-defined quoting rules). - Jim Pivarski
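A rough sketch of the two ideas in this comment (pickle -> Base64, and JSON), written for Python 3 with only the standard library. The dict contents and the `encode_datetime` helper are made up for illustration, and nothing here touches Cassandra; it only shows that both encodings produce plain text.

```python
import base64
import datetime
import json
import pickle

# Illustrative dict; the values are assumptions, not the asker's real data.
obj = {
    "id": "sometextid",
    "time_created": datetime.datetime(2013, 5, 12),
}

# Option 1: pickle, then Base64. The result only uses A-Z, a-z, 0-9, '+', '/', '='.
b64_text = base64.b64encode(pickle.dumps(obj)).decode("ascii")
restored = pickle.loads(base64.b64decode(b64_text))


# Option 2: JSON, with datetimes written as ISO 8601 strings.
# encode_datetime is a hypothetical helper passed to json.dumps via `default`.
def encode_datetime(value):
    if isinstance(value, datetime.datetime):
        return value.isoformat()
    raise TypeError("cannot serialize %r" % type(value))


json_text = json.dumps(obj, default=encode_datetime)
```

Note that the JSON route does not round-trip automatically: the datetime comes back as a string and has to be parsed again on load.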

2 Answers

2
votes

The complete solution to this problem is to define the column as a blob and hex-encode the pickled data (as described in the Cassandra docs for the blob type), like this:

obj_to_store = cPickle.dumps(input_obj).encode("hex")

In this way you can serialize a regular Python dict. By "regular" I mean it can contain anything a Python dict can, including datetimes, and it will be properly serialized and stored in Cassandra.
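The line above is Python 2 (`cPickle` and `str.encode("hex")`). For anyone on Python 3, a minimal sketch of the same round trip might look like this; `to_hex_blob` and `from_hex_blob` are names I made up, and how the hex text is then passed to the driver is left out because it depends on the library version.

```python
import binascii
import datetime
import pickle


def to_hex_blob(value):
    """Pickle a value and return it as a hex string (digits 0-9 and a-f only)."""
    return binascii.hexlify(pickle.dumps(value)).decode("ascii")


def from_hex_blob(hex_text):
    """Reverse of to_hex_blob: decode the hex string and unpickle it."""
    return pickle.loads(binascii.unhexlify(hex_text))


# Illustrative dict; the values are assumptions.
input_obj = {
    "id": "sometextid",
    "time_created": datetime.datetime(2013, 5, 12),
    "some other string property": "some other value",
}

obj_to_store = to_hex_blob(input_obj)
loaded = from_hex_blob(obj_to_store)
```

Since the hex text contains no quotes or backslashes, it cannot trip up string parsing the way the raw pickle did. As with any pickle, only unpickle data you wrote yourself.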

Maybe there is a better solution out there, but so far this is the only one I've found that actually works with an arbitrary Python dict.

Hope it helps somebody!

1
votes

Sounds like a problem with your CQL library not parsing strings properly. Until that's fixed, one approach would be to convert the pickle to a packed string using struct.

Alternatively, you could change the encoding for the offending values using something like urllib.
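A rough sketch of the urllib idea, assuming Python 3 (where the quoting functions live in `urllib.parse`) and percent-encoding the whole pickle rather than individual values. The dict is illustrative only.

```python
import datetime
import pickle
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Illustrative dict; the values are assumptions.
obj = {
    "id": "sometextid",
    "time_created": datetime.datetime(2013, 5, 12),
}

pickled = pickle.dumps(obj)

# Percent-encode every byte that is not a letter, digit, or one of "_.-~".
# safe="" means "/" is escaped as well.
quoted = quote_from_bytes(pickled, safe="")

# Reverse the encoding and unpickle.
restored = pickle.loads(unquote_to_bytes(quoted))
```

The quoted text contains no backslashes or single quotes, at the cost of being up to three times the size of the original bytes.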