I'm building an address-book app that uses a back-end Cloudant database. The database stores 3 types of documents:

-> User Profile document
-> Group document
-> User-to-Group Link document

As the names suggest, the database holds users, groups of users (as in WhatsApp), and one link document per user per group. The link document also stores that user's settings and privileges in that group.

On login, my client-side app queries Cloudant for the user document and for each group document, using view collation over that user's link documents.

Then, using the groups identified above, I find all the other users of each group.

Now the challenge is that I need to monitor changes to the group and user documents. I am using PouchDB on the app side and can invoke the 'changes' API against the ids of all the group and user documents. But the scale could be around 500 users in each group, with a logged-in user belonging to 10-50 groups. Multiplied across thousands of users, that will become a nightmare for the back end to support.
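To make the scale concrete, here is a rough sketch of the watch described above. `estimateWatchedIds` and `buildChangesOptions` are hypothetical helpers, the member counts are the figures quoted above, and the estimate assumes no overlap between groups. The option names (`since`, `live`, `include_docs`, `doc_ids`) are ones PouchDB's `changes()` accepts.

```javascript
// Worst-case number of document ids one client would watch:
// every group document plus every member of every group (no overlap assumed).
function estimateWatchedIds(groupCount, usersPerGroup) {
  return groupCount + groupCount * usersPerGroup;
}

// Options object intended for PouchDB's db.changes(); doc_ids restricts the
// feed to the listed documents.
function buildChangesOptions(docIds) {
  return { since: "now", live: true, include_docs: true, doc_ids: docIds };
}

const low = estimateWatchedIds(10, 500);  // 10 groups of 500 users
const high = estimateWatchedIds(50, 500); // 50 groups of 500 users
console.log(low, high); // 5010 25050
```

With PouchDB loaded, the options would be passed as `db.changes(buildChangesOptions(ids)).on('change', handler)`.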

Is my scalability concern warranted? Or is this normal for Cloudant?

3 Answers

0
votes

If I understand your schema correctly, you have documents of this form:

{
   _id: "user:glynn",
   type: "user",
   name: "Glynn Bird"
}
{
   _id: "group:Developers",
   type: "group",
   name: "Software Developers"
}
{
   _id: "user:glynn:developers"
}

In the above example, the sort order of the primary key allows a user and all of its memberships to be retrieved by passing startkey and endkey parameters to the database's _all_docs endpoint.
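A minimal sketch of such a range, assuming the id scheme shown above. `userRange` and `selectRange` are hypothetical helpers, and `user:glynn:testers` and `user:glynnis` are made-up ids added to show what the range includes and excludes. `selectRange` only imitates the server locally with plain string comparison, which approximates but is not guaranteed to match the database's collation.

```javascript
// Build _all_docs range parameters for one user and their membership documents.
// "\ufff0" is a high Unicode character commonly used as an upper bound for prefix ranges.
function userRange(userId) {
  return { startkey: userId, endkey: userId + ":\ufff0", include_docs: true };
}

// Local stand-in for the server: keep the ids that fall inside an inclusive
// startkey/endkey range, in sorted order.
function selectRange(ids, range) {
  return ids
    .slice()
    .sort()
    .filter((id) => id >= range.startkey && id <= range.endkey);
}

const ids = [
  "user:glynnis",
  "group:Developers",
  "user:glynn:developers",
  "user:glynn",
  "user:glynn:testers",
];
const result = selectRange(ids, userRange("user:glynn"));
console.log(result);
```

With PouchDB, the same parameters could be passed as `db.allDocs(userRange("user:glynn"))`.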

This is "scalable" in the sense that it is efficient for Cloudant to retrieve data from a primary or secondary index: the index is held in a b-tree, so data with adjacent keys is stored next to each other. A limit parameter can be used to paginate through larger data sets.
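One common way to paginate with limit is to ask for one row more than the page size and use the extra row's key as the next startkey. The sketch below imitates that against an in-memory sorted list. `fetchPage` and the ids are hypothetical, and a real client would send startkey and limit to the database instead.

```javascript
// Fetch one page from a sorted id list, mimicking startkey + limit.
// Reading pageSize + 1 rows reveals the startkey of the next page.
function fetchPage(sortedIds, startkey, pageSize) {
  const rows = sortedIds
    .filter((id) => startkey === null || id >= startkey)
    .slice(0, pageSize + 1);
  const hasMore = rows.length > pageSize;
  return {
    rows: rows.slice(0, pageSize),
    nextStartkey: hasMore ? rows[pageSize] : null,
  };
}

const sortedIds = ["user:a", "user:b", "user:c", "user:d", "user:e"];
const pages = [];
let startkey = null;
do {
  const page = fetchPage(sortedIds, startkey, 2);
  pages.push(page.rows);
  startkey = page.nextStartkey;
} while (startkey !== null);
console.log(pages);
```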

0
votes

Yes, the documents are more or less as you've specified. Link documents are as follows:

{
  "_id": <AutoGeneratedID>,
  "type": "link",
  "user": user_id,
  "group": group_id
}

I've written the following view map function:

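function (doc) {
    // Index link documents only
    if (doc.type == "link") {
        emit(doc.user, {"_id": doc.user});                // key: user, fetches the user document
        emit([doc.user, doc.group], {"_id": doc.group});  // key: [user, group], fetches the group document
        emit([doc.group, doc.user], {"_id": doc.user});   // key: [group, user], fetches the user document
    }
}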
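The rows this map function produces can be checked locally by running it with a stub emit. This is only a sketch: it assumes the condition tests `doc.type`, passes `emit` in as a parameter so the function can run outside the database, and uses made-up ids (`abc123`, `user:alice`, `group:devs`). `runMap` and `linkMap` are hypothetical names.

```javascript
// Collect emitted rows by calling a CouchDB-style map function with a stub emit.
function runMap(mapFn, doc) {
  const rows = [];
  mapFn(doc, (key, value) => rows.push({ key, value }));
  return rows;
}

// Same logic as the view above, with emit passed in explicitly.
function linkMap(doc, emit) {
  if (doc.type === "link") {
    emit(doc.user, { _id: doc.user });
    emit([doc.user, doc.group], { _id: doc.group });
    emit([doc.group, doc.user], { _id: doc.user });
  }
}

const link = { _id: "abc123", type: "link", user: "user:alice", group: "group:devs" };
const rows = runMap(linkMap, link);
console.log(JSON.stringify(rows, null, 2));
```

Each link document yields three rows, and documents of any other type yield none.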

Using the above 3 indexes with include_docs=true, the 1st lets me get my logged-in user document, the 2nd lets me get all group documents for my logged-in user (using start and end keys), and the 3rd lets me get all other user documents for a group (using start and end keys again).
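The start and end keys for these lookups might be built as in the sketch below. `prefixRange` and `userLookup` are hypothetical helpers and the ids are made up. It relies on CouchDB view collation, where an object sorts after any string, so `[prefix, {}]` serves as an upper bound for every `[prefix, "..."]` key.

```javascript
// Key range covering every [prefix, x] row: arrays sharing the prefix sort
// between [prefix] and [prefix, {}] under CouchDB view collation.
function prefixRange(prefix) {
  return { startkey: [prefix], endkey: [prefix, {}], include_docs: true };
}

// Exact-key lookup for the plain-string row. The view emits one such row per
// link document, so limit: 1 avoids fetching the same user document repeatedly.
function userLookup(userId) {
  return { key: userId, include_docs: true, limit: 1 };
}

const groupsOfUser = prefixRange("user:alice"); // 2nd index: [user, group] rows
const usersOfGroup = prefixRange("group:devs"); // 3rd index: [group, user] rows
console.log(JSON.stringify(groupsOfUser), JSON.stringify(usersOfGroup));
```

With PouchDB these could be passed as `db.query("links/by_member", prefixRange("user:alice"))`, where `links/by_member` is a made-up design document and view name.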

Fetching the documents is done, but now I need to monitor changes to the users of each group. For this, don't I need to query the changes API with an array of user ids? Is there any other way?

"it is efficient for Cloudant to retrieve data from a primary or secondary index because the index is held in a b-tree so data with adjacent keys is stored next to each other"

Sorry, I did not understand this statement.

Thanks.

0
votes

Part 1. I recommend getting rid of the "link" type here. It's good for the SQL world, but not for CouchDB.

Instead, it is better to take advantage of document storage, i.e. store a user's groups in a "Groups" property on "User", and a group's users in a "Users" property on "Group".

With this approach you can set up filtered replication to process only the changes to specific groups, and those changes will already contain all the users of the group.
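A rough sketch of what this could look like. The document shapes and ids are hypothetical, as is the helper `makeGroupFilter`. PouchDB accepts a function as the `filter` option for changes and replication, but as far as I know a plain function like this is evaluated on the client, so limiting what the server sends would need a filter stored in a design document instead.

```javascript
// Hypothetical denormalized documents: each side carries the ids of the other.
const user = { _id: "user:alice", type: "user", name: "Alice", Groups: ["group:devs"] };
const group = {
  _id: "group:devs",
  type: "group",
  name: "Developers",
  Users: ["user:alice", "user:bob"],
};

// Build a predicate that accepts only the group documents a client cares about.
function makeGroupFilter(groupIds) {
  const wanted = new Set(groupIds);
  return (doc) => doc.type === "group" && wanted.has(doc._id);
}

const filter = makeGroupFilter(user.Groups);
console.log(filter(group), filter(user)); // true false
```

A change to `group:devs` passes the filter and already carries the member ids in `Users`.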

Note that I am assuming the number of groups per user and the number of users per group are reasonable (hundreds at most) and don't change frequently.

Part 2. You can store just the ids in these properties and then use views to "join" the other data. I was also thinking about another approach (for my use case, but yours is similar):

1) Group contains only ids of users - no views needed.

2) You create a view of each user's contacts, i.e. for each user, get all users with whom they share a group.

3) Replicate this view to client app.

When the user opens a group, values (such as names and pictures of contacts) are taken from this local "dictionary". This approach can save some traffic.
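One possible shape for the contacts view in step 2, as a sketch only. It assumes group documents carry a `Users` array of ids as described in Part 1. `contactsMap` and `runMap` are hypothetical names, and `emit` is passed in so the function can run outside the database.

```javascript
// For every group, emit one row per ordered pair of distinct members,
// keyed by [owner, contact] so a prefix range on owner lists their contacts.
function contactsMap(doc, emit) {
  if (doc.type !== "group" || !Array.isArray(doc.Users)) return;
  for (const owner of doc.Users) {
    for (const contact of doc.Users) {
      if (owner !== contact) emit([owner, contact], { _id: contact });
    }
  }
}

// Run a map function against one document with a stub emit.
function runMap(mapFn, doc) {
  const rows = [];
  mapFn(doc, (key, value) => rows.push({ key, value }));
  return rows;
}

const group = { _id: "group:devs", type: "group", Users: ["user:a", "user:b", "user:c"] };
const rows = runMap(contactsMap, group);
console.log(rows.length); // 6
```

A group of n members emits n × (n − 1) rows, so a 500-member group would produce 249,500 rows. That may matter when weighing this approach.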

Please let me know what you think, because right now I'm working on designing the architecture of my solution. Thank you!