I'm researching distributed file system architectures and designs. Many of the distributed file systems (DFSs) I've come across share the following architecture:
- A namenode or metadata server that tracks the locations of data blocks/chunks as well as the filesystem hierarchy.
- A datanode or data server that stores the blocks/chunks of data belonging to one or more logical files.
- A client that talks to the namenode to locate the appropriate datanodes to read from or write to.
Many of these systems expose two primary tunable parameters: a block size and a replication factor.
My question is:
Are replication factor and forward error correction (FEC) schemes like Reed-Solomon erasure coding compatible here? Does it make sense to use both techniques to ensure high availability of data, or is it enough to use one or the other? What are the trade-offs?
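For concreteness, here is the back-of-the-envelope arithmetic I have in mind (a minimal sketch of my own, assuming n-way replication and an RS(k, m) code with k data fragments plus m parity fragments; the numbers are illustrative and not taken from any specific DFS):

```python
# Back-of-the-envelope comparison of raw-storage overhead and loss
# tolerance for n-way replication vs. Reed-Solomon RS(k, m) erasure
# coding. Illustrative model only, not any particular DFS's scheme.

def replication_overhead(r: int) -> tuple[float, int]:
    """r full copies: overhead = r, survives loss of r - 1 copies."""
    return float(r), r - 1

def erasure_overhead(k: int, m: int) -> tuple[float, int]:
    """RS(k, m): k data fragments + m parity fragments.
    Overhead = (k + m) / k, survives loss of any m fragments."""
    return (k + m) / k, m

if __name__ == "__main__":
    for r in (2, 3, 4):
        ovh, tol = replication_overhead(r)
        print(f"replication r={r}: {ovh:.2f}x storage, tolerates {tol} failures")
    for k, m in ((6, 3), (10, 4)):
        ovh, tol = erasure_overhead(k, m)
        print(f"RS({k},{m}):        {ovh:.2f}x storage, tolerates {tol} failures")
```

By this arithmetic, RS(6, 3) tolerates as many simultaneous fragment losses as 4-way replication while costing 1.5x storage instead of 4x, but reading a degraded stripe means fetching k surviving fragments and recomputing the missing data, which is where I suspect the real trade-off lies.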