1
votes

On checkpointing, Hadoop: The Definitive Guide says:

1. The secondary asks the primary to roll its edits file, so new edits go to a new file
2. The secondary retrieves fsimage and edits from primary (using HTTP GET)

and at the end of checkpointing, the secondary namenode sends the updated fsimage back to the namenode.

Now the secondary namenode has the latest fsimage. In the next checkpoint, will the secondary namenode copy the fsimage from the namenode again? If so, why? Can't it simply compare the two using a checksum?


1 Answer

2
votes

Yes. When the edit file on the namenode grows to a specific size (default: fs.checkpoint.size = 4194304 bytes), the secondary namenode copies the fsimage and the edits file from the namenode server.

This code from SecondaryNameNode.java shows the trigger:

        long size = namenode.getEditLogSize();
        if (size >= checkpointSize || 
            now >= lastCheckpointTime + 1000 * checkpointPeriod) {
          doCheckpoint();
          lastCheckpointTime = now;
        }

Note the two conditions under which doCheckpoint() is called: edit-log size and elapsed time.
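To make the trigger concrete, here is a hedged standalone sketch of that condition (the class, method, and parameter names are mine, not Hadoop's):

```java
public class CheckpointTrigger {
    // Mirrors the condition in SecondaryNameNode.java: checkpoint when the
    // edit log has grown past checkpointSize bytes, OR more than
    // checkpointPeriod seconds have passed since the last checkpoint.
    static boolean shouldCheckpoint(long editLogSize, long checkpointSize,
                                    long nowMillis, long lastCheckpointMillis,
                                    long checkpointPeriodSecs) {
        return editLogSize >= checkpointSize
            || nowMillis >= lastCheckpointMillis + 1000 * checkpointPeriodSecs;
    }

    public static void main(String[] args) {
        // 5 MB of edits against the 4 MB default: the size trigger fires.
        System.out.println(shouldCheckpoint(5_000_000L, 4_194_304L, 0L, 0L, 3600L));
        // Tiny edit log, but over an hour has elapsed: the time trigger fires.
        System.out.println(shouldCheckpoint(0L, 4_194_304L, 3_700_000L, 0L, 3600L));
    }
}
```

Either condition alone is enough, which is why even an idle cluster still checkpoints periodically.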

The answer to the why is in the design Hadoop follows (though I don't know why they chose this design). See the code below for what is being done (I'm keeping only the statements relevant to this question); note how downloadCheckpointFiles(sig) and doMerge(sig) are called.

/**
   * Create a new checkpoint
   */
  void doCheckpoint() throws IOException {

    //---other code skipped---

    // Tell the namenode to start logging transactions in a new edit file
    // Returns a token that would be used to upload the merged image.
    CheckpointSignature sig = (CheckpointSignature)namenode.rollEditLog();


    downloadCheckpointFiles(sig);   // Fetch fsimage and edits
    doMerge(sig);                   // Do the merge

    //
    // Upload the new image into the NameNode. Then tell the Namenode
    // to make this new uploaded image as the most current image.
    //
    putFSImage(sig);


    namenode.rollFsImage();
    checkpointImage.endCheckpoint();

     //----other code skipped----
  }

And here is downloadCheckpointFiles(sig), which is called from within doCheckpoint() above. See the code below:

/**
       * Download <code>fsimage</code> and <code>edits</code>
       * files from the name-node.
       * @throws IOException
       */
      private void downloadCheckpointFiles(final CheckpointSignature sig
                                          ) throws IOException {
        try {
          UserGroupInformation.getCurrentUser().doAs(new PrivilegedExceptionAction<Void>() {

            @Override
            public Void run() throws Exception {


              // get fsimage
              String fileid = "getimage=1";
              File[] srcNames = checkpointImage.getImageFiles();
              assert srcNames.length > 0 : "No checkpoint targets.";
              TransferFsImage.getFileClient(fsName, fileid, srcNames);
              LOG.info("Downloaded file " + srcNames[0].getName() + " size " +
                       srcNames[0].length() + " bytes.");

              // get edits file
              fileid = "getedit=1";
              srcNames = checkpointImage.getEditsFiles();
              assert srcNames.length > 0 : "No checkpoint targets.";
              TransferFsImage.getFileClient(fsName, fileid, srcNames);
              LOG.info("Downloaded file " + srcNames[0].getName() + " size " +
                  srcNames[0].length() + " bytes.");

              checkpointImage.checkpointUploadDone();
              return null;
            }
          });
        } catch (InterruptedException e) {
          throw new RuntimeException(e);
        }

      }
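Under the hood, TransferFsImage.getFileClient amounts to a plain HTTP GET against the namenode's built-in web server. As a rough, hedged illustration only: the /getimage servlet path and default port 50070 are taken from Hadoop 1.x, and the helper names below are mine, not Hadoop's.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FetchSketch {
    // Build the kind of URL the secondary namenode requests; the "fileid"
    // query strings ("getimage=1", "getedit=1") come from the code above.
    static String imageUrl(String nnHost, String fileid) {
        return "http://" + nnHost + ":50070/getimage?" + fileid;
    }

    // Stream the HTTP response body to a local file.
    static void download(String url, File dest) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream();
             OutputStream out = new FileOutputStream(dest)) {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) > 0; ) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) {
        // "namenode-host" is a placeholder; a real run would call
        // download(imageUrl(...), new File("fsimage.ckpt")) against a live cluster.
        System.out.println(imageUrl("namenode-host", "getimage=1"));
    }
}
```

So each checkpoint is a full-file transfer over HTTP, not an incremental or checksum-negotiated sync.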

And, for your last question - "can't it simply compare the two using a checksum?" -

One possible reason is that they don't want to take any risk: checksums for two different files can occasionally be the same. Say the namenode has an fsimage that differs from the one on the secondary namenode, but their checksums somehow come out equal; the secondary would then skip a copy it actually needed, and you might never know. Copying the files outright is the simplest way to ensure the copies are the same.
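Such collisions are not just theoretical. As a small demonstration with CRC32 (a checksum family Hadoop itself uses for block data), the strings "plumless" and "buckeroo" are a well-known collision pair: different contents, identical 32-bit checksum.

```java
import java.util.zip.CRC32;

public class ChecksumCollision {
    // CRC32 of a string's bytes.
    static long crc(String s) {
        CRC32 c = new CRC32();
        c.update(s.getBytes());
        return c.getValue();
    }

    public static void main(String[] args) {
        // Different contents, same checksum: a genuine CRC32 collision.
        System.out.println(crc("plumless") == crc("buckeroo"));
        System.out.println("plumless".equals("buckeroo"));
    }
}
```

A 32-bit checksum has only 2^32 possible values, so by the pigeonhole principle distinct files must sometimes share one; a checksum match therefore cannot prove two fsimage files are identical.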

Hope this helps.