2
votes

I would appreciate help from anyone familiar with how DynamoDB works. I need to perform a scan on a large DynamoDB table. I know that the DynamoDBClient scan operation is limited to 1 MB of returned data per call. Does the same restriction apply to the Table.scan operation? The thing is that Table.scan returns output of type "ItemCollection<ScanOutcome>", while the DynamoDBClient scan returns a ScanResult, and it is not clear to me whether these operations work in a similar way.

I have checked this example: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ScanJavaDocumentAPI.html, but it doesn't contain any hints about using the last returned key.

My questions are: If I use Table.scan, do I still need to make scan calls in a loop until the last returned key is null? If yes, how do I get the last key? If not, how can I enforce pagination? Any links to code examples would be appreciated. I have spent some time googling for examples, but most of them use either DynamoDBClient or DynamoDBMapper, while I need to use Table and Index objects instead.

Thanks!

1
You said you have a very large table, but you are looking for something in particular (or a set), so you can start by filtering your result (which is obvious, I guess). If the result still doesn't fit in a single response: yes, you have to keep searching in the next batch(es). - x80486
I am not sure I understood your comment. I do have a FilterExpression that filters my scan results, but that doesn't guarantee that my results will never exceed 1 MB. - Tofig Hasanov
So, you need to scan the next batch; you can do it in parallel by "playing" with Segment and/or TotalSegments; in that case the value of LastEvaluatedKey returned from the request must be used as the ExclusiveStartKey with the same segment ID in a subsequent scan operation. It's pretty much like SQL, but faster! - x80486
There is no "LastEvaluatedKey" parameter in the Table.scan output type. - Tofig Hasanov

1 Answer

1
votes

If you iterate over the output of Table.scan(), the SDK will do pagination for you.
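
To make that concrete, here is a minimal, self-contained sketch of the pattern such an auto-paginating collection follows: the iterator requests one page, hands out its items, and uses that page's last evaluated key as the start key of the next request until the key comes back null. This is only a model, not the SDK's code: `PagedScanSketch`, `FakeTable`, `Page`, `scanPage` and `scanAll` are hypothetical names, and the "table" is just the integers 0..size-1.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical model of an auto-paginating scan; none of these names come from the AWS SDK.
class PagedScanSketch {

    /** One page of results plus the key to resume from (null when the scan is finished). */
    static final class Page {
        final List<Integer> items;
        final Integer lastEvaluatedKey;

        Page(List<Integer> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Fake table holding 0..size-1 that returns at most pageSize items per request. */
    static final class FakeTable {
        final int size;
        final int pageSize;
        int requests = 0; // counts how many "network calls" were made

        FakeTable(int size, int pageSize) {
            this.size = size;
            this.pageSize = pageSize;
        }

        Page scanPage(Integer exclusiveStartKey) {
            requests++;
            int from = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
            int to = Math.min(from + pageSize, size);
            List<Integer> items = new ArrayList<>();
            for (int i = from; i < to; i++) {
                items.add(i);
            }
            // A non-null key means "more data may follow"; null means the scan is complete.
            Integer last = (to < size) ? Integer.valueOf(to - 1) : null;
            return new Page(items, last);
        }
    }

    /** Wraps the page-by-page calls so the caller sees one continuous sequence of items. */
    static Iterable<Integer> scanAll(FakeTable table) {
        return () -> new Iterator<Integer>() {
            private Iterator<Integer> current = null;
            private Integer nextStartKey = null;
            private boolean finished = false;

            @Override
            public boolean hasNext() {
                // Keep fetching while the current page is used up and more pages remain.
                // The loop also skips pages that come back empty but still carry a key.
                while ((current == null || !current.hasNext()) && !finished) {
                    Page page = table.scanPage(nextStartKey);
                    current = page.items.iterator();
                    nextStartKey = page.lastEvaluatedKey;
                    finished = (nextStartKey == null);
                }
                return current != null && current.hasNext();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable(10, 3);
        int count = 0;
        for (int item : scanAll(table)) {
            count++;
        }
        System.out.println(count + " items in " + table.requests + " requests");
        // prints "10 items in 4 requests"
    }
}
```

With the real Document API the calling side is the same kind of for-each loop over the `ItemCollection<ScanOutcome>` returned by `Table.scan`. If you need the key of each page yourself, the collection's `pages()` view and the low-level `ScanResult` behind each page appear to be the place to look (`getLastEvaluatedKey()`), and `ScanSpec` has page-size and start-key settings; check the Javadoc of your SDK version before relying on those details.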