3
votes

I am following this blog post from Microsoft to test out DocumentDB.

I have created a collection and inserted 2 documents via different POCO classes in my application. The documents were created, but I cannot filter them back into their respective POCO classes. I realized that I am querying the whole collection, so it is obviously retrieving all the documents stored inside that collection.

What is the best way to differentiate documents while querying so that I can query them separately by type?

I can add a type field to each document and filter with WHERE type="user", but I am not sure whether I could instead do SELECT * FROM users, with users being a document type (if there is such a thing in DocumentDB) rather than a collection.

Here is how I am creating documents:

    var user1 = new User()
    {
        UserTypeId = 0,
        UserName = "[email protected]",
        Password = "12345",
        PasswordSalt = "saltyPassword",
        UserStatusId = 1,
        ProfilePhotoKey = "KJSY"
    };
    await DocumentDBRepository<User>.CreateItemAsync(user1);

    var client = new Client()
    {
        ClientName = "client1",
        Secret = "rxPBsIVYya2Jg2ZHPNG8gL0P36TnutiBehvEFgk938M=",
        Title = "Administration Front End Application",
        ApplicationTypeId = 0,
        Active = false,
        RefreshTokenLifeTime = 60,
        AllowedOrigin = "http://localhost:8080",
        AllowedRoles = "admin"
    };
    await DocumentDBRepository<Client>.CreateItemAsync(client);

Here is the DocumentDB repository class:

public static class DocumentDBRepository<T>
{
    //Use the Database if it exists, if not create a new Database
    private static Database ReadOrCreateDatabase()
    {
        var db = Client.CreateDatabaseQuery()
                        .Where(d => d.Id == DatabaseId)
                        .AsEnumerable()
                        .FirstOrDefault();

        if (db == null)
        {
            db = Client.CreateDatabaseAsync(new Database { Id = DatabaseId }).Result;
        }

        return db;
    }

    //Use the DocumentCollection if it exists, if not create a new Collection
    private static DocumentCollection ReadOrCreateCollection(string databaseLink)
    {
        var col = Client.CreateDocumentCollectionQuery(databaseLink)
                          .Where(c => c.Id == CollectionId)
                          .AsEnumerable()
                          .FirstOrDefault();

        if (col == null)
        {
            var collectionSpec = new DocumentCollection { Id = CollectionId };
            var requestOptions = new RequestOptions { OfferType = "S1" };

            col = Client.CreateDocumentCollectionAsync(databaseLink, collectionSpec, requestOptions).Result;
        }

        return col;
    }

    //Expose the "database" value from configuration as a property for internal use
    private static string databaseId;
    private static String DatabaseId
    {
        get
        {
            if (string.IsNullOrEmpty(databaseId))
            {
                databaseId = ConfigurationManager.AppSettings["database"];
            }

            return databaseId;
        }
    }

    //Expose the "collection" value from configuration as a property for internal use
    private static string collectionId;
    private static String CollectionId
    {
        get
        {
            if (string.IsNullOrEmpty(collectionId))
            {
                collectionId = ConfigurationManager.AppSettings["collection"];
            }

            return collectionId;
        }
    }

    //Use the ReadOrCreateDatabase function to get a reference to the database.
    private static Database database;
    private static Database Database
    {
        get
        {
            if (database == null)
            {
                database = ReadOrCreateDatabase();
            }

            return database;
        }
    }

    //Use the ReadOrCreateCollection function to get a reference to the collection.
    private static DocumentCollection collection;
    private static DocumentCollection Collection
    {
        get
        {
            if (collection == null)
            {
                collection = ReadOrCreateCollection(Database.SelfLink);
            }

            return collection;
        }
    }

    //This property establishes a new connection to DocumentDB the first time it is used, 
    //and then reuses this instance for the duration of the application avoiding the
    //overhead of instantiating a new instance of DocumentClient with each request
    private static DocumentClient client;
    private static DocumentClient Client
    {
        get
        {
            // change policy to ConnectionMode: Direct and ConnectionProtocol: TCP on publishing to AZURE
            if (client == null)
            {
                string endpoint = ConfigurationManager.AppSettings["endpoint"];
                string authKey = ConfigurationManager.AppSettings["authKey"];
                Uri endpointUri = new Uri(endpoint);
                client = new DocumentClient(endpointUri, authKey);
            }

            return client;
        }
    }


    /* QUERY HELPERS */
    public static IEnumerable<T> GetAllItems()
    {
        return Client.CreateDocumentQuery<T>(Collection.DocumentsLink)
            .AsEnumerable();
    }
    public static IEnumerable<T> GetItems(Expression<Func<T, bool>> predicate)
    {
        return Client.CreateDocumentQuery<T>(Collection.DocumentsLink)
            .Where(predicate)
            .AsEnumerable();
    }
    public static async Task<Document> CreateItemAsync(T item)
    {
        return await Client.CreateDocumentAsync(Collection.SelfLink, item);
    }
    public static T GetItem(Expression<Func<T, bool>> predicate)
    {
        return Client.CreateDocumentQuery<T>(Collection.DocumentsLink)
                    .Where(predicate)
                    .AsEnumerable()
                    .FirstOrDefault();
    }

    public static async Task<Document> UpdateItemAsync(string id, T item)
    {
        Document doc = GetDocument(id);
        return await Client.ReplaceDocumentAsync(doc.SelfLink, item);
    }

    private static Document GetDocument(string id)
    {
        return Client.CreateDocumentQuery(Collection.DocumentsLink)
            .Where(d => d.Id == id)
            .AsEnumerable()
            .FirstOrDefault();
    }
}

I am trying to get:

    var q = DocumentDBRepository<User>.GetAllItems().ToList();
    var t = DocumentDBRepository<Client>.GetAllItems().ToList();

q should contain only the user documents that were created by

    await DocumentDBRepository<User>.CreateItemAsync(user1);

and t should contain only the client documents that were created by

    await DocumentDBRepository<Client>.CreateItemAsync(client);

2 Answers

2
votes

Since DocumentDB doesn't have any built-in type metadata for each document, you'd need to add one (such as the type property you suggested, or any other distinguishing property) when storing heterogeneous documents in the same collection, and use it in your WHERE clause. What you name this property, and what values you store in it, have nothing to do with your collection name.

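Regarding your specific example: SELECT * FROM users WHERE users.type='user' would work (in the FROM clause, users is only an alias for the collection being queried, and properties are referenced through that alias), but SELECT * FROM users would return all documents, regardless of type.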

By default, all properties are indexed, including your newly added type property, which lets you perform your WHERE-clause filtering efficiently without requiring a collection scan.
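
As a rough illustration of the type-property approach, here is a minimal sketch. The names TypedEntity, TypedQuery and OnlyOfType are hypothetical, and User is a simplified stand-in for the class in the question. The IQueryable<T> passed in stands in for whatever Client.CreateDocumentQuery<T>(Collection.DocumentsLink) returns in the repository; I have only exercised this against in-memory data, so treat the server-side translation as an assumption to verify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical base class: every POCO stored in the shared collection
// carries a Type discriminator that defaults to its CLR type name.
public abstract class TypedEntity
{
    protected TypedEntity()
    {
        Type = GetType().Name;
    }

    // Settable so that a deserializer can overwrite it with the stored value.
    public string Type { get; set; }
}

// Simplified stand-in for the User class from the question.
public class User : TypedEntity
{
    public string UserName { get; set; }
}

public static class TypedQuery
{
    // Appends the discriminator filter to any query over T.
    // 'source' stands in for the IQueryable<T> the repository builds.
    public static IQueryable<T> OnlyOfType<T>(this IQueryable<T> source)
        where T : TypedEntity
    {
        string typeName = typeof(T).Name; // evaluated once, outside the expression
        return source.Where(d => d.Type == typeName);
    }
}
```

With something like this, GetAllItems() in the repository could apply OnlyOfType() before AsEnumerable(), so that DocumentDBRepository<User> and DocumentDBRepository<Client> each see only their own documents.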

1
votes

Regarding how to differentiate document types within a collection...

I started out using a Type attribute which simply took the internal type name (a getter in the base class "Entity").

My expectation had been that we would use the Type attribute when querying.

However, we quickly moved on to using the type value as a suffix on the partition key for each entity type instead ("pkey", again inherited from Entity; since we're saving money by storing everything in one collection, we had to use one partition key attribute name across all document types).

If the type name is "Thing" and there is only one, then "id" is "Thing" and pkey is "-identifier-|Thing".

If -identifier- identifies a group, then multiple records will have unique values for "id", and range queries are easy: simply query on pkey and iterate.

The type name should be a pkey suffix to ensure you don't reduce read and write distribution.

id and pkey also work nicely as a unique index - a welcome feature when you find yourself missing relational SQL :-)
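
To make the key layout concrete, here is a small sketch of the singleton case described above. It is only an illustration: the constructor shape and the "tenant42" identifier in the usage note are made up, and our real Entity class has more in it.

```csharp
using System;

// Sketch of a shared base class: one id and one partition key
// attribute name ("pkey") across all document types.
public abstract class Entity
{
    public string id { get; set; }
    public string pkey { get; set; }

    // The internal type name, e.g. "Thing".
    public string Type
    {
        get { return GetType().Name; }
    }

    // Composes "<identifier>|<TypeName>" so the type is a suffix of the key.
    protected string BuildPartitionKey(string identifier)
    {
        if (string.IsNullOrEmpty(identifier))
            throw new ArgumentException("identifier is required", "identifier");
        return identifier + "|" + Type;
    }
}

// Singleton case: exactly one Thing per identifier, so id is the type name.
public class Thing : Entity
{
    public Thing(string identifier)
    {
        id = Type;
        pkey = BuildPartitionKey(identifier);
    }
}
```

For example, new Thing("tenant42") ends up with id "Thing" and pkey "tenant42|Thing".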

As regards POCOs: I'm seriously considering abandoning direct POCO operations, since we've had so much trouble with camelCase JSON serialization and SQL queries. We've ended up unable to trust global camelCase settings, and instead exhaustively set the JSON name on all fields.

What I'm thinking of moving to uses an internal POCO which persists to and from a Document. The POCO getters and setters refer to the Document instance via getAttributeValue() and setAttributeValue(). We can then swap our persistence layer for something else via DI.
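
Roughly what I have in mind, as an untested sketch. IPropertyStore, DictionaryStore and UserRecord are made-up names; the dictionary-backed store is just a stand-in so the example runs without the SDK. A DocumentDB-backed implementation would delegate to the Document instance instead (I believe the SDK exposes this as GetPropertyValue<T> and SetPropertyValue, but check that against the version you use).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over "a thing that holds named properties".
public interface IPropertyStore
{
    T Get<T>(string name);
    void Set(string name, object value);
}

// In-memory stand-in; a DocumentDB-backed store would wrap a Document.
public class DictionaryStore : IPropertyStore
{
    private readonly Dictionary<string, object> values =
        new Dictionary<string, object>();

    public T Get<T>(string name)
    {
        object value;
        return values.TryGetValue(name, out value) ? (T)value : default(T);
    }

    public void Set(string name, object value)
    {
        values[name] = value;
    }
}

// The POCO keeps no state of its own; getters and setters go through the
// injected store, and the JSON names are spelled out explicitly per field.
public class UserRecord
{
    private readonly IPropertyStore store;

    public UserRecord(IPropertyStore store)
    {
        if (store == null) throw new ArgumentNullException("store");
        this.store = store;
    }

    public string UserName
    {
        get { return store.Get<string>("userName"); }
        set { store.Set("userName", value); }
    }
}
```

Because UserRecord only sees IPropertyStore, the store can be swapped through DI without touching the POCO.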

The Document class has lots of interesting methods which we've hardly looked into yet.

Decoupling our POCO from persistence is also desirable.

Just some thoughts for you.