1
votes

I have created table using below CQL: I want to run query to find all video by actor name (case insensitive).

CREATE TABLE video_by_actor(
    actor text, added_date timestamp, video_id timeuuid, 
    character_name text, description text,
    encoding frozen<video_encoding>,
    tags set<text>, title text, user_id uuid,
primary key ((actor), added_date)) with clustering order by (added_date desc);


select * from video_by_actor where actor='Tom Hanks'

I want to select all rows from table irrespective of actor's name case eg. "tom hanks," "Tom hanks," "tom Hanks" etc.

Is it possible?

1
From Cassandra developer forum: This is not possible with regular CQL queries, but you have a few options: 1. Create a special column in your table to store the normalized string that you want to search. 2. Use a SASI index. 3. Use DSE Search.Amit H

1 Answers

2
votes

I want to search all case

First of all, if you want to "search," you need a different tool, like ElasticSearch. Cassandra is for key-based querying, which is very different from searching.

No, what you're looking to do really isn't possible with Cassandra, as it cares about case. I created the table definition described above, and inserted four rows, each with a different case application to Tom Hanks' name. Then I queried the results with the token function:

aploetz@cqlsh:stackoverflow> SELECT actor,token(actor),title FROM video_by_actor ;

 actor     | system.token(actor)  | title
-----------+----------------------+---------------------
 Tom Hanks | -4258050846863339499 |        Forrest Gump
 Tom hanks | -3872727890651172910 | Saving Private Ryan
 tom Hanks | -3300209463718095087 | Joe vs. the Volcano
 tom hanks |  1022609553103151654 |           Apollo 13

(4 rows)

Notice how each different case of "Tom Hanks" generated a different token. As this table is partitioning on actor, this means that these rows will likely be stored on different nodes.

Again, you'll probably want to use an actual search engine for something like this. They will have tools like analyzers that can have features like "fuzzy matching" enabled.