I am currently pulling data into MongoDB and will later need to load it into a separate application. That application requires the _id field to be a 32-bit integer.

Be sure to explicitly set the _id attribute in the result document to unique 32 bit integers. source

I am making use of pymongo to insert documents into a collection.

def parse_tweet(in_t):
    t = {}
    t["text"] = in_t["text"]
    t["shape"] = in_t["coordinates"]["coordinates"][0], in_t["coordinates"]["coordinates"][1]
    return t

This gives me the expected documents:

{
  "_id" : ObjectId("50a0de04f26afb14f4bba03d"),
  "text" : "hello world",
  "shape" : [144.9557834, -37.8208589]
}

How can I explicitly set the _id value to a 32-bit integer?
I don't intend to store more than 6 million documents.

1 Answer

Just generate an id and pass it along in the document. The _id can be a value of any type except an array.

def parse_tweet(in_t):
    t = {}
    # get_me_an_int32_id() is a placeholder: supply your own unique 32-bit integer here
    t["_id"] = get_me_an_int32_id()
    t["text"] = in_t["text"]
    t["shape"] = in_t["coordinates"]["coordinates"][0], in_t["coordinates"]["coordinates"][1]
    return t

You will have to take care of uniqueness yourself. MongoDB only enforces it: an insert whose _id already exists in the collection is rejected with a duplicate key error. Where the unique values come from is up to you.

Here are some ideas: How to make an Autoincrementing field.
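
If a single process does all the inserting, one simple option is a local counter. The following is only a sketch under that assumption. The names int32_ids and INT32_MAX are made up for illustration, and parse_tweet is changed to take the id source as a second argument. It relies on pymongo storing a Python int that fits in the signed 32-bit range as a BSON 32-bit integer.

```python
import itertools

INT32_MAX = 2**31 - 1  # largest value that fits in a signed 32-bit integer


def int32_ids(start=1):
    """Yield consecutive integers, refusing to leave the signed 32-bit range."""
    for n in itertools.count(start):
        if n > INT32_MAX:
            raise OverflowError("ran out of 32-bit ids")
        yield n


def parse_tweet(in_t, ids):
    """Build the document to insert, taking its _id from the `ids` iterator."""
    lon, lat = in_t["coordinates"]["coordinates"][:2]
    return {"_id": next(ids), "text": in_t["text"], "shape": [lon, lat]}
```

Create the iterator once with ids = int32_ids() and pass it to every parse_tweet call before inserting the result. Six million documents is far below the 2,147,483,647 limit. The counter lives in memory only, so after a restart you would have to start it from the largest _id already in the collection, and it is not safe when several processes insert at the same time; in that case a counter kept in MongoDB itself, as in the linked article, is the safer route.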