0
votes

I'm trying to work out what advantages there are to using filter expressions for the DynamoDB Query operation.

I understand that it's used to refine the results of a query, as explained here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.FilterExpression. But that page also says the same limits apply before the filter is evaluated, so I don't gain anything in terms of reduced read capacity consumption or getting around the 1 MB-per-query limit.

Why would I use this rather than a built-in language feature, such as Scala's filter?

1 Answer

2
votes

A filter expression is applied after a Query finishes, but before the results are returned.

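So, ultimately, it saves you bandwidth: items that fail the filter are discarded by DynamoDB and never sent over the network, even though they still count towards read capacity and the 1 MB limit. Scala's filter works the same way in that you build the whole collection first and then filter iterates over it, dropping the elements that don't match your predicate. The difference is that it runs on the client, after every item has already been transferred.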
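To make the bandwidth point concrete, here is a rough sketch in plain Scala. It does not use the AWS SDK: `Item`, `QueryResult`, `query` and the 1000-item data set are all made up for illustration, and `query` only imitates the documented ordering (read first, filter second, return last).

```scala
// Hypothetical model, NOT the AWS SDK: all names and numbers are illustrative.
final case class Item(id: Int, status: String)

// `items` is what comes back to the client; `scannedCount` is what was read.
final case class QueryResult(items: Vector[Item], scannedCount: Int)

object FilterDemo {
  // Items matching the key condition. All of them are read, filter or no filter.
  val matchedByKey: Vector[Item] =
    Vector.tabulate(1000)(i => Item(i, if (i % 10 == 0) "ACTIVE" else "INACTIVE"))

  // Stand-in for a Query: the optional filter is applied "server-side",
  // after the read but before the items are returned.
  def query(filter: Option[Item => Boolean]): QueryResult = {
    val read = matchedByKey
    val returned = filter.fold(read)(p => read.filter(p))
    QueryResult(returned, scannedCount = read.size)
  }

  val isActive: Item => Boolean = _.status == "ACTIVE"

  // Option A: filter expression, so only matching items are sent back.
  val serverSide: QueryResult = query(Some(isActive))

  // Option B: no filter expression, so everything is sent back
  // and Scala's filter runs on the client afterwards.
  val unfiltered: QueryResult = query(None)
  val clientSide: Vector[Item] = unfiltered.items.filter(isActive)
}
```

Both options read the same number of items and end up with the same matches; what differs in this model is how many items are returned to the client before the Scala code sees them.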

Spark's filter operation, on the other hand, is a transformation and is therefore lazily evaluated, which allows Spark to perform some optimisations.
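If the strict-versus-lazy distinction is unfamiliar, it can be seen without Spark at all. The sketch below uses the standard library's `view` as a loose analogy only: it is not Spark and shows none of Spark's actual optimisations, and the counters and the `LazyFilterDemo` name are just there for illustration.

```scala
object LazyFilterDemo {
  // Counters recording how many times each predicate is evaluated.
  var strictChecks = 0
  var lazyChecks = 0

  val numbers: Vector[Int] = (1 to 1000).toVector

  // Strict: filter tests every element and builds the full result before take(3).
  val strictResult: Vector[Int] =
    numbers.filter { n => strictChecks += 1; n % 2 == 0 }.take(3)

  // Lazy: the view defers the filter until toVector forces it,
  // so the predicate runs only for as many elements as take(3) needs.
  val lazyResult: Vector[Int] =
    numbers.view.filter { n => lazyChecks += 1; n % 2 == 0 }.take(3).toVector
}
```

Both results contain the same three elements, but the strict version evaluates the predicate for all 1000 elements while the lazy one stops early. Knowing the whole pipeline before running any of it is what gives a lazy engine room to optimise.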