How to handle data scheme change on non-SQL databases?

Question

As I experienced about non-SQL databases, one of biggest problem was schema change. Adding or removing columns on SQL database is easy operation and server guarantees data stability during scheme change. So it can handle data schema changes during service advancing. But how to non-SQL databases (especially objective style systems) handle those schema change? Is there reliable method?

It depends on the database in question, they're all different. — skaffman

dsegleau dsegleau · Accepted Answer · 2011-02-14T21:25:36

I agree with Skaffman, non-SQL databases covers a broad spectrum of products. Each one tends to provide different levels of schema management.

For example, key/value pair databases, like Oracle Berkeley DB are schema-free. What is placed in the key/value pair is an opaque structure, which is known to the applications that access it. In this case, I've often seen applications implement a field within the key/value pair data structure to indicate the schema version. The application, when reading or writing the record will take the appropriate action based on the schema version it finds. This can be advantageous for some applications, since schema changes can be applied as required on a given read/write operation rather than in bulk.

Another example, XML databases, like Oracle Berkeley DB XML store data in XML format which is self describing. Although it's common for most XML documents within a collection to have the same schema, it's certainly possible and even desirable for the schema to have additional or fewer attributes for a given document(s). These non-SQL databases employ query languages like XQuery, that allow you to query the structure (attributes) of the data as well as the content.

In yet another example, object-based data stores, like the Data Persistence Layer API provided with Berkeley DB can support application-directed schema evolution as part of the underlying API, as described here.

However, even with SQL databases it's only easy to change the schema on the surface. The application usually has to be made aware of any schema changes in order to operate properly. Adding a column in a SQL database can adversely affect applications that tend to do "SELECT *", while renaming or removing a column can adversely affect applications that assumed the existence of that column. SQL databases make schema change "easy" in the sense that there's a SQL command that allows you to add, drop and rename columns. The schema management requirements up the stack still need to be thought through and implemented properly.

Bottom line, typically schema evolution is either managed by the database engine, the application or the intervening API layer. As for how "easy" it is, depends a lot on the application layers above it and how they are affected by the schema change.

If you can be more specific about the problem that you're trying to solve we might be able to provide more specific suggestions. In particular, which database are you using and how do you see your schema evolving?

Regards,

Dave

How to handle data scheme change on non-SQL databases?

1 Answers