
I have 2 questions that will help me understand how HDFS works in the context of blocks.

1. You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of the file, what would another user see when trying to access it?

A. They would see Hadoop throw a ConcurrentFileAccessException when they try to access this file.

B. They would see the current state of the file, up to the last bit written by the command.

C. They would see the current state of the file through the last completed block.

D. They would see no content until the whole file is written and closed.

As I see it, because the file is split into blocks, each block becomes available in HDFS as soon as it is written, so my answer is C, but I would like verification of this...
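The block arithmetic behind the question can be sketched in a few lines. This is a toy calculation, not actual HDFS code: a 300 MB file with a 64 MB block size splits into four full blocks plus one 44 MB partial block, and after 200 MB written, only the first three blocks are complete.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size):
    # Toy model of how HDFS divides a file into fixed-size blocks:
    # full blocks first, then one partial block for any remainder.
    full, remainder = divmod(file_size, block_size)
    return [block_size] * full + ([remainder] if remainder else [])

blocks = split_into_blocks(300 * MB, 64 * MB)
print(len(blocks))            # 5 blocks in total
print(blocks[-1] // MB)       # the last block holds only 44 MB

# After 200 MB have been written, only complete blocks are finalized:
written = 200 * MB
print(written // (64 * MB))   # 3 blocks fully written so far
```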

2. You need to move a file titled “weblogs” into HDFS. When you try to copy the file, you can’t. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
A. Increase the block size on all current files in HDFS.

B. Increase the block size on your remaining files.

C. Decrease the block size on your remaining files.

D. Increase the amount of memory for the NameNode.

E. Increase the number of disks (or size) for the NameNode.

F. Decrease the block size on all current files in HDFS.

My approach for this one is that the file is probably small enough to fit, but a much larger block will be allocated for it, so decreasing the block size would "defragment" some of the gaps. I can't figure out, though, whether it would be better to do this for the remaining files or for all files, or even whether my approach is correct at all.

Thank you!!


2 Answers

  1. If the writer has not called hflush(), then the reader will see an error, as the block has not been finalized yet. So I will go with D.

Here are two links on this: https://issues.apache.org/jira/browse/HDFS-1907 and "Hadoop HDFS: Read sequence files that are being written"

  2. One of the errors in this situation is that the NameNode is not aware of the space in HDFS. So I will go with E in this case.

Link: "error while copying the files from local file system to HDFS in Hadoop"
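To make the NameNode-resource concern concrete, here is a rough back-of-the-envelope sketch. It assumes the commonly cited rule of thumb from Hadoop: The Definitive Guide that each file and each block object costs on the order of 150 bytes of NameNode heap; the exact figure varies by Hadoop version.

```python
OBJECT_BYTES = 150  # assumed rough cost per file/block object in NameNode heap

def namenode_heap(num_files, blocks_per_file, object_bytes=OBJECT_BYTES):
    # Each file contributes one file object plus one object per block.
    return num_files * (1 + blocks_per_file) * object_bytes

# Ten million single-block files vs. the same number of blocks packed
# into 100,000 larger files:
small = namenode_heap(10_000_000, 1)
large = namenode_heap(100_000, 100)
print(small)  # 3000000000 bytes, about 3 GB of heap
print(large)  # 1515000000 bytes, about half that for the same block count
```

This is why running out of NameNode capacity can block new files even when DataNode disks have ample space.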


For the first question, see the discussion in another SO question. In that discussion, the answer could be either C or D, depending on what the question is trying to ask. Files are copied block by block, and there is technically a way to see the file as it is being written, through the last completed block, but only under a file with a different name.

For the second, one approach (answer C) is to have the remaining files fill the gaps between the blocks of the files that already exist. However, your assumption that small files have large blocks allocated to them is incorrect; files only take up as much space as they need. According to Hadoop: The Definitive Guide:

Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block’s worth of underlying storage.
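The quoted point can be illustrated with a toy calculation (the 128 MB figure is an assumed block size for illustration, a common default in later Hadoop versions): a 1 MB file consumes 1 MB of DataNode disk, not 128 MB, while still costing one block's worth of NameNode metadata.

```python
MB = 1024 * 1024
block_size = 128 * MB   # assumed block size, for illustration only
file_size = 1 * MB

disk_used = file_size                         # actual bytes on DataNode disks
blocks_tracked = -(-file_size // block_size)  # ceiling division

print(disk_used // MB)   # 1 -> only 1 MB of storage is consumed
print(blocks_tracked)    # 1 -> but the NameNode still tracks one block
```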