I have been trying out Cassandra and need help understanding a few issues. I am new to Cassandra, and I am not sure whether translating a MySQL database to Cassandra would lead me into pitfalls because of my inexperience or limited knowledge of Cassandra. I hope to get some useful information from experienced Cassandra users and developers.
Below are the sample keyspaces I have created. I would like to know about any drawbacks in the design that someone can point out from experience.
create keyspace Students with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1};
use Students;
create column family StudentID with column_type = 'Super' and comparator = 'UTF8Type' and subcomparator = 'UTF8Type' and default_validation_class = 'UTF8Type' and column_metadata =
[{column_name : 'First Name', validation_class : UTF8Type},
{column_name : 'Last Name', validation_class : UTF8Type},
{column_name : 'Subjects', validation_class : UTF8Type},
{column_name : 'Class', validation_class : UTF8Type}];
set StudentID[utf8('1968')]['00001']['First Name'] = 'Mark';
set StudentID[utf8('1968')]['00001']['Last Name'] = 'Myers';
set StudentID[utf8('1968')]['00001']['Subjects'] = 'Maths, Chemistry';
set StudentID[utf8('1968')]['00001']['Class'] = '10th grade';
create keyspace Teachers with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1};
use Teachers;
create column family TeacherID with column_type = 'Super' and comparator = 'UTF8Type' and subcomparator = 'UTF8Type' and default_validation_class = 'UTF8Type' and column_metadata =
[{column_name : 'First Name', validation_class : UTF8Type},
{column_name : 'Last Name', validation_class : UTF8Type},
{column_name : 'Subjects', validation_class : UTF8Type},
{column_name : 'Class', validation_class : UTF8Type}];
set TeacherID[utf8('777')]['234-333']['First Name'] = 'Mark';
set TeacherID[utf8('777')]['234-333']['Last Name'] = 'Myers';
set TeacherID[utf8('777')]['234-333']['Subjects'] = 'Maths, Chemistry,physics';
set TeacherID[utf8('777')]['234-333']['Class'] = '10th grade, 11th grade, 9th grade';
create keyspace Subjects with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1};
use Subjects;
create column family SubjectNames with default_validation_class = 'UTF8Type' and comparator = 'UTF8Type' and column_metadata =
[{column_name : 'Name1', validation_class : UTF8Type}];
set SubjectNames[utf8('Current')]['Name1']= 'maths';
set SubjectNames[utf8('Current')]['Name2']= 'physics';
set SubjectNames[utf8('Current')]['Name3']= 'chemistry';
set SubjectNames[utf8('Current')]['Name4']= 'CS';
There are three keyspaces: Students, Teachers and Subjects. I would definitely need some relationships among these keyspaces, and I would also need to query the data, e.g.:
- Query for students with a certain subject and/or class
- Find a teacher with a certain class
- List all subjects taken by a certain student from a certain class
From what I know, I would definitely need to create secondary indices to make these queries work, that is, to retrieve data based on certain clauses.
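To make the secondary-index idea concrete, here is a minimal cassandra-cli sketch of what I think it would look like. It assumes a standard (non-super) column family, since as far as I understand secondary indexes cannot be built on super column families. The name StudentsFlat and the use of the student ID as the row key are hypothetical and not part of my schema above.

```
use Students;

-- Hypothetical standard column family; 'Class' carries a KEYS secondary index.
create column family StudentsFlat
  with comparator = UTF8Type
  and key_validation_class = UTF8Type
  and default_validation_class = UTF8Type
  and column_metadata = [
    {column_name : 'First Name', validation_class : UTF8Type},
    {column_name : 'Last Name', validation_class : UTF8Type},
    {column_name : 'Class', validation_class : UTF8Type, index_type : KEYS}
  ];

set StudentsFlat['00001']['First Name'] = 'Mark';
set StudentsFlat['00001']['Last Name'] = 'Myers';
set StudentsFlat['00001']['Class'] = '10th grade';

-- Equality lookup on the indexed column.
get StudentsFlat where Class = '10th grade';
```

This only covers exact matches on 'Class'; it is a sketch of the approach, not a tested design.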
What I know, if I am correct:
- We do not have a 'like' clause in Cassandra.
- The value of each column (the very last key-value pair) must be broken up into individual items. Say I want to get a list of subjects: each subject must then reside in its own distinct column. I cannot query column values like "subjectA,subjectB"; instead I would break the value up into SubjectA and SubjectB and put them in different columns.
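As an illustration of the second point, one way I imagine breaking up a value such as 'Maths, Chemistry' is a separate column family keyed by subject, with one column per student. This is only a sketch under my own assumptions: the column family name StudentsBySubject is hypothetical, and storing the student's name as the column value is just a convenience.

```
use Students;

-- Hypothetical lookup column family: row key = subject, column name = student ID.
create column family StudentsBySubject
  with comparator = UTF8Type
  and key_validation_class = UTF8Type
  and default_validation_class = UTF8Type;

-- 'Maths, Chemistry' for student 00001 becomes one column in each subject row.
set StudentsBySubject['Maths']['00001'] = 'Mark Myers';
set StudentsBySubject['Chemistry']['00001'] = 'Mark Myers';

-- All students taking Maths.
get StudentsBySubject['Maths'];
```

With this layout, each subject is an exact row key rather than part of a comma-separated string, so no 'like'-style matching is needed.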
Below are the keyspaces.