6
votes

How to implement auditing for Cassandra data? I am looking for an open-source option.

Are there any features of Cassandra that help with auditing? Can I use triggers to log the records into a table? I followed the Triggers example and was able to get a record inserted into the triggers_log table when updates occur on another table, but I am not sure how to capture the user/session details that triggered the update. From the cqlsh terminal, I created the users table and an audit log table:

CREATE TABLE AUDIT_LOG (
  transaction_id int,
  entries map<text, text>,  -- to capture the modifications done to the tables
  user varchar,             -- authenticated user
  time timestamp,
  PRIMARY KEY (transaction_id)
);

CREATE TABLE users (
  user_id int PRIMARY KEY,
  fname text,
  lname text
);

I then defined the trigger on the users table using the CREATE TRIGGER syntax from cqlsh.

Below is my code so far.

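import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.apache.cassandra.db.Column;
import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.triggers.ITrigger;

public class AuditTrigger implements ITrigger {

    @Override
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) {

        List<RowMutation> mutations = new ArrayList<RowMutation>();
        for (Column column : update) {
            // Skip columns with empty values.
            if (column.value().remaining() > 0) {
                RowMutation mutation = new RowMutation("mykeyspace", key);
                // What do I need here to capture the updates to the users
                // table and log them into the various columns of audit_log?
                mutations.add(mutation);
            }
        }
        return mutations;
    }
}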

If triggers are not the correct approach (is there a Spring AOP approach?), please suggest alternatives. I also tried the solution from "Cassandra vs logging activity", but it does not print the SQL executed or the authenticated user information.

3
DataStax Enterprise is a commercial product, so it requires a license, and I am looking for an open-source implementation. (suman j)

3 Answers

7
votes

Unfortunately, triggers cannot be used for this at the moment: what you need is the ClientState, which contains the user information, and it is not passed to triggers.

There are two approaches I can think of. (You will need to look at the Cassandra code base to understand them better.)

One approach is AOP: add an agent that weaves in an aspect, and start Cassandra with that agent. The pointcut should target the QueryProcessor#processStatement method. A call to this method receives the prepared statement and the QueryState as parameters. From the prepared statement you can identify the intention of the user, and QueryState.getClientState returns the ClientState, which is where the user information resides.
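To make the interception idea concrete, here is a minimal, self-contained sketch. It does not use Cassandra or a real AOP agent: ClientState, QueryState and StatementProcessor below are hypothetical stand-ins I made up for illustration, and a JDK dynamic proxy plays the role of the woven advice. In a real setup the advice would be attached to QueryProcessor#processStatement by the agent, and the audit entry would go to a log or table rather than an in-memory list.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class AuditInterceptSketch {

    // Hypothetical stand-in for Cassandra's ClientState (holds the user).
    public static final class ClientState {
        public final String user;
        public ClientState(String user) { this.user = user; }
    }

    // Hypothetical stand-in for Cassandra's QueryState.
    public static final class QueryState {
        private final ClientState clientState;
        public QueryState(ClientState clientState) { this.clientState = clientState; }
        public ClientState getClientState() { return clientState; }
    }

    // Hypothetical stand-in for the intercepted entry point.
    public interface StatementProcessor {
        String processStatement(String cql, QueryState state);
    }

    // In-memory audit trail, used only to keep the sketch self-contained.
    public static final List<String> AUDIT = new ArrayList<>();

    // Wraps a processor so each call is recorded before it is delegated.
    public static StatementProcessor audited(StatementProcessor target) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("processStatement")) {
                QueryState state = (QueryState) args[1];
                // The user comes from the ClientState, the intent from the statement.
                AUDIT.add(state.getClientState().user + " | " + args[0]);
            }
            return method.invoke(target, args);
        };
        return (StatementProcessor) Proxy.newProxyInstance(
                StatementProcessor.class.getClassLoader(),
                new Class<?>[] { StatementProcessor.class },
                handler);
    }

    public static void main(String[] args) {
        StatementProcessor real = (cql, state) -> "OK";
        StatementProcessor proxied = audited(real);
        proxied.processStatement(
                "UPDATE users SET fname = 'a' WHERE user_id = 1",
                new QueryState(new ClientState("alice")));
        System.out.println(AUDIT);
    }
}
```

The point of the sketch is only that the interception site sees both the statement and the session state in the same call, which is exactly what a trigger does not get.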

The other approach involves custom authenticators and authorizers. Configuring this in Cassandra is described here.

http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secure_about_native_authenticate_c.html

You can write a custom authorizer that extends AllowAllAuthorizer (this will disable permission caching). Whenever the authorizer receives an authorize request, you can log it. The downside of this approach is that you do not know what the user intends to do with the table, only that they are requesting some authorization on it. Permission is what describes the intended action on the table, but it is not passed to the authorizer.

If you decide on either of these approaches, you are free to post followups if you need more detail.

1
votes

Here is an example of Cassandra audit logging: https://github.com/xiaodong-xie/cassandra-audit

The solution is based on a system property named "cassandra.custom_query_handler_class". It also contains a user authentication part, which assumes that AWS Systems Manager Parameter Store and an LDAP server are used.

By the way, it seems audit logging will be supported in Cassandra 4.x (https://issues.apache.org/jira/browse/CASSANDRA-12151).

0
votes

ecAudit is an Apache Cassandra plug-in that provides full support for audit logs in Cassandra 3.0.x and 3.11.x.

https://github.com/Ericsson/ecaudit

It creates audit records for authentication attempts and CQL queries. It allows you to define filters that limit audit logs based on roles, keyspaces, tables, or queries.