The HDFS filesystem shows that around 600K blocks on a cluster are under-replicated due to rack failure. Is there a way to know which files will be affected if these blocks are lost, before HDFS recovers? I can't do a 'fsck /' as the cluster is very large.
2 Answers
The Namenode UI lists the missing blocks, and JMX lists the corrupted/missing blocks; for under-replicated blocks, both the UI and JMX show only the count, not which files are affected.
There are two ways to find the under-replicated files: fsck or the WebHDFS REST API.
Using the WebHDFS REST API:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS"
This returns a response containing a FileStatuses JSON object. Parse it and filter for the files whose replication factor is less than the configured value.
A sample response returned from the NameNode:
curl -i "http://<NN_HOST>:<HTTP_PORT>/webhdfs/v1/<PATH_OF_DIRECTORY>?op=LISTSTATUS"
HTTP/1.1 200 OK
Cache-Control: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26.hwx)
{"FileStatuses":{"FileStatus":[
{"accessTime":1489059994224,"blockSize":134217728,"childrenNum":0,"fileId":209158298,"group":"hdfs","length":0,"modificationTime":1489059994227,"owner":"XXX","pathSuffix":"_SUCCESS","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059969939,"blockSize":134217728,"childrenNum":0,"fileId":209158053,"group":"hdfs","length":0,"modificationTime":1489059986846,"owner":"XXX","pathSuffix":"part-m-00000","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059982614,"blockSize":134217728,"childrenNum":0,"fileId":209158225,"group":"hdfs","length":0,"modificationTime":1489059993497,"owner":"XXX","pathSuffix":"part-m-00001","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1489059977524,"blockSize":134217728,"childrenNum":0,"fileId":209158188,"group":"hdfs","length":0,"modificationTime":1489059983034,"owner":"XXX","pathSuffix":"part-m-00002","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"}]}}
If the directory contains many files, you can also list them iteratively using ?op=LISTSTATUS_BATCH&startAfter=<CHILD>
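The iteration over LISTSTATUS_BATCH pages can be sketched as follows. This is an assumption-laden sketch, not a definitive client: `fetch` is a caller-supplied function (hypothetical) that performs the HTTP GET and returns the parsed JSON, and the response shape assumed here is the DirectoryListing object with partialListing and remainingEntries fields:

```python
def list_all(fetch, path):
    """Yield every FileStatus in a directory by paging through
    LISTSTATUS_BATCH responses.

    fetch(path, start_after) is assumed to GET
    .../webhdfs/v1/<path>?op=LISTSTATUS_BATCH[&startAfter=<start_after>]
    and return the parsed JSON body.
    """
    start_after = None
    while True:
        listing = fetch(path, start_after)["DirectoryListing"]
        batch = listing["partialListing"]["FileStatuses"]["FileStatus"]
        for status in batch:
            yield status
        if listing["remainingEntries"] == 0 or not batch:
            break
        # The last child of this page becomes &startAfter=<CHILD>
        start_after = batch[-1]["pathSuffix"]

# Demo with a stubbed fetch serving two hypothetical pages
pages = {
    None: {"DirectoryListing": {
        "partialListing": {"FileStatuses": {"FileStatus":
            [{"pathSuffix": "part-m-00000", "replication": 3, "type": "FILE"}]}},
        "remainingEntries": 1}},
    "part-m-00000": {"DirectoryListing": {
        "partialListing": {"FileStatuses": {"FileStatus":
            [{"pathSuffix": "part-m-00001", "replication": 2, "type": "FILE"}]}},
        "remainingEntries": 0}},
}
fake_fetch = lambda path, start_after: pages[start_after]
print([s["pathSuffix"] for s in list_all(fake_fetch, "/dir")])
# ['part-m-00000', 'part-m-00001']
```

Combined with the replication filter, this lets you scan a large directory tree without a full fsck.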