
I'm having a problem with DynamoDB. I'm attempting to verify the data contained in a table, but scan seems to return only a subset of the data. Here is the code I'm using with the Python boto bindings:

#!/usr/bin/python
#Check the scanned length of a table against the Table Description
import boto.dynamodb
#Connect
TABLENAME = "MyTableName"
sdbconn = boto.dynamodb.connect_to_region(
    "eu-west-1",
    aws_access_key_id='-snipped-',
    aws_secret_access_key='-snipped-')

#Initial Scan
results = sdbconn.layer1.scan(TABLENAME, count=True)

#Create Counting Variable
count = results['Count']

#LastEvaluatedKey is absent from the response once the scan reaches the end of the table,
#so use .get() to avoid a KeyError on tables that fit in a single page
previouskey = results.get('LastEvaluatedKey')

#DynamoDB scan results are limited to 1MB per call but return a key to carry on from for the next MB,
#so loop until it does not return a continuation point
while previouskey is not None:
    results = sdbconn.layer1.scan(TABLENAME, exclusive_start_key=previouskey, count=True)
    count = count + results['Count']
    print(count)
    #get next key (None when no key is returned, so that's all folks!)
    previouskey = results.get('LastEvaluatedKey')

print("Reached End")

#These presumably should match; they don't on the MyTableName table, not even close
print(sdbconn.describe_table(TABLENAME)['Table']['ItemCount'])
print(count)

print(sdbconn.describe_table(TABLENAME)['Table']['ItemCount']) gives me 1748175 and print(count) gives me 583021. I was under the impression that these should always match. (I'm aware that ItemCount is only updated roughly every 6 hours, but only 300 rows have been added in the last 24 hours.) Does anyone know if this is an issue with DynamoDB, or does my code make a wrong assumption?

Was there more than one print(count) line printed? Perhaps the code you wrote doesn't handle the LastEvaluatedKey as expected, or perhaps you're hitting provisioned throughput limits. - Chen Harel
Provisioning was my first thought, so I tried with a tenfold increase in capacity and got the same result. LastEvaluatedKey works as expected until the final scan response from DynamoDB, which contains no LastEvaluatedKey at all; printing the entire result verifies this. This is a very old DynamoDB v1 table, so I'm wondering if there has been an issue in the past with DynamoDB. - user2903946
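
To make the end-of-scan condition discussed in these comments explicit, here is a minimal sketch of the pagination loop as a standalone helper. The names count_items and scan_page are hypothetical (they are not part of boto); the only assumption about the response is what the thread already describes: each page is a dict with a 'Count' field, and 'LastEvaluatedKey' is missing on the final page.

```python
def count_items(scan_page):
    """Sum 'Count' over every page of a paginated scan.

    scan_page is a hypothetical callable: given an exclusive start key
    (None for the first page) it returns one scan response as a dict.
    """
    total = 0
    start_key = None
    while True:
        page = scan_page(start_key)
        total += page['Count']
        # The final page carries no 'LastEvaluatedKey', so .get() yields None
        start_key = page.get('LastEvaluatedKey')
        if start_key is None:
            return total
```

With the connection from the question this would presumably be called as count_items(lambda key: sdbconn.layer1.scan(TABLENAME, exclusive_start_key=key, count=True)). Passing the scan in as a callable also means the loop can be checked against canned pages without touching a real table.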

1 Answer


Figured it out finally: it's to do with Local Secondary Indexes. They show up in the table description as unique items, and the table has two LSIs, causing it to show 3x the number of items actually present.
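
As a rough check of that explanation, here is a small sketch that estimates what ItemCount would be if every item were counted once for the table plus once per LSI. The function names are made up for illustration, and it assumes the behaviour observed above (index entries counted as items) and that every item appears in every index; 'LocalSecondaryIndexes' is the field under 'Table' in the describe_table response that lists the indexes.

```python
def index_multiplier(table_description):
    """Return 1 (the table itself) plus the number of local secondary indexes."""
    # Tables without LSIs have no 'LocalSecondaryIndexes' entry at all
    indexes = table_description['Table'].get('LocalSecondaryIndexes', [])
    return 1 + len(indexes)


def expected_item_count(scanned_count, table_description):
    """Estimate ItemCount, assuming each item is counted once per index too."""
    return scanned_count * index_multiplier(table_description)
```

For the numbers in the question, 583021 scanned items and two LSIs give 583021 * 3 = 1749063, which is close to the reported 1748175. The small gap would be consistent with ItemCount being refreshed only periodically, or with some items not being present in every index.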