
I have a table with a wide column schema. I save data to the row over time, and the set of columns effectively changes between writes. At time=t1 I might have columns a, b, c; at time=t2 I might have a, b and d.

Whenever I read a row from the table, I only want the cells whose timestamp is the latest timestamp for the entire row. So at t2, I would exclude the cell in column c.

If I set maxversions to 1, I can get the latest value for each cell in the row, but I don't want that. Instead I want only the cells from the last transaction.
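To make the difference concrete, here is a small plain-Java sketch of the result I am after. It does not touch HBase at all; `SimpleCell`, `latestTransaction` and the other names are made up for illustration, and t1 and t2 are simply taken to be 1 and 2.

```java
import java.util.ArrayList;
import java.util.List;

public class LatestTransactionDemo {
    /** Hypothetical stand-in for an HBase cell: qualifier, timestamp, value. */
    static final class SimpleCell {
        final String qualifier;
        final long timestamp;
        final String value;

        SimpleCell(String qualifier, long timestamp, String value) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Keeps only the cells whose timestamp equals the largest timestamp in the row. */
    static List<SimpleCell> latestTransaction(List<SimpleCell> row) {
        long max = Long.MIN_VALUE;
        for (SimpleCell c : row) {
            max = Math.max(max, c.timestamp);
        }
        List<SimpleCell> out = new ArrayList<>();
        for (SimpleCell c : row) {
            if (c.timestamp == max) {
                out.add(c);
            }
        }
        return out;
    }

    /** Qualifiers of the given cells, in order, for easy printing. */
    static List<String> qualifiers(List<SimpleCell> cells) {
        List<String> names = new ArrayList<>();
        for (SimpleCell c : cells) {
            names.add(c.qualifier);
        }
        return names;
    }

    /** Example row: a, b, c written at t1 = 1; a, b, d written at t2 = 2. */
    static List<SimpleCell> exampleRow() {
        List<SimpleCell> row = new ArrayList<>();
        row.add(new SimpleCell("a", 2, "a2"));
        row.add(new SimpleCell("a", 1, "a1"));
        row.add(new SimpleCell("b", 2, "b2"));
        row.add(new SimpleCell("b", 1, "b1"));
        row.add(new SimpleCell("c", 1, "c1"));
        row.add(new SimpleCell("d", 2, "d2"));
        return row;
    }

    public static void main(String[] args) {
        // Prints [a, b, d]: column c is dropped because it was not part of the write at t2.
        System.out.println(qualifiers(latestTransaction(exampleRow())));
    }
}
```

With maxversions set to 1 the same row would come back as a, b, c, d, because c's newest version (from t1) is still the latest value for that column.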

To speed things up when there is a big difference in columns from transaction to transaction, I would like to do this filtering server side. I had planned on using DependentColumnFilter and a column called ts. I would like to filter the cells such that their timestamps match the timestamp of the column ts - is that possible without filtering client side?

"I had planned on using DependentColumnFilter and a column called ts." How would you achieve your goal using that filter? Did you plan to insert a 'ts' column with each row? - sel-fish
Yeah, basically add a column from which I can derive the timestamp of the transaction. If I add it every time I write, this should be fine. However, that filter builds a hash set of timestamps from the dependency column, so it can return several. The idea I arrived at (I don't want to use custom filters because I would need to get another department to install the jar for me) was exactly the same as your answer. - user1310957

1 Answer


A custom filter can do this job:

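import java.util.ArrayList;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.exceptions.DeserializationException;
import org.apache.hadoop.hbase.filter.TimestampsFilter;

// Keeps only the cells whose timestamp equals that of the first cell seen in each row.
public class GetLatestColumnsFilter extends TimestampsFilter {
    private long max = -1;

    public GetLatestColumnsFilter() {
        super(new ArrayList<Long>());
    }

    // Called between rows; clear the remembered timestamp so a Scan handles every row.
    @Override
    public void reset() {
        max = -1;
    }

    @Override
    public ReturnCode filterKeyValue(Cell v) {
        if (max == -1) {
            // First cell of the row: remember its timestamp.
            max = v.getTimestamp();
        } else if (max != v.getTimestamp()) {
            return ReturnCode.SKIP;
        }
        return ReturnCode.INCLUDE;
    }

    // The filter carries no configuration, so deserialization just creates a new instance.
    public static GetLatestColumnsFilter parseFrom(byte[] pbBytes) throws DeserializationException {
        return new GetLatestColumnsFilter();
    }
}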

There is an example if you want to check it out.
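
One assumption in the filter is worth spelling out: it treats the timestamp of the first cell it sees in a row as the row's latest timestamp. HBase hands cells to a filter sorted by qualifier (ascending) and then by timestamp (descending), so this only holds when the column that sorts first is written in every transaction. The plain-Java simulation below mimics that decision logic outside HBase; `SimpleCell`, `cell` and `keepMatchingFirst` are hypothetical names, and t1 and t2 are taken to be 1 and 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FirstCellTimestampDemo {
    /** Hypothetical stand-in for an HBase cell within a single row. */
    static final class SimpleCell {
        final String qualifier;
        final long timestamp;

        SimpleCell(String qualifier, long timestamp) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    static SimpleCell cell(String qualifier, long timestamp) {
        return new SimpleCell(qualifier, timestamp);
    }

    /** Mimics the filter: remember the first timestamp seen, keep only cells that match it. */
    static List<String> keepMatchingFirst(List<SimpleCell> sortedRow) {
        long max = -1;
        List<String> kept = new ArrayList<>();
        for (SimpleCell c : sortedRow) {
            if (max == -1) {
                max = c.timestamp;
            }
            if (c.timestamp == max) {
                kept.add(c.qualifier + "@" + c.timestamp);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Cells are listed in HBase order: qualifier ascending, then timestamp descending.
        // Column a is written in both transactions, so the first cell carries t2.
        List<SimpleCell> aAlwaysWritten = Arrays.asList(
                cell("a", 2), cell("a", 1), cell("b", 2), cell("b", 1), cell("c", 1), cell("d", 2));
        System.out.println(keepMatchingFirst(aAlwaysWritten)); // [a@2, b@2, d@2]

        // Column a is only written at t1, so the first cell carries the older timestamp.
        List<SimpleCell> aWrittenOnce = Arrays.asList(
                cell("a", 1), cell("b", 2), cell("b", 1), cell("c", 1), cell("d", 2));
        System.out.println(keepMatchingFirst(aWrittenOnce)); // [a@1, b@1, c@1]
    }
}
```

In the second case the older transaction is returned instead of the latest one. If no existing column is guaranteed to be present in every write, one option (my suggestion, not something tested here) is to always write a marker column whose qualifier sorts before all the others.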