Hadoop: "check the logs or run fsck in order to identify the missing blocks"
- March 26, 2020
- Notes
Hadoop version: 2.8.3
Today I ran into an odd problem: as List-1 shows, HDFS reported that two blocks were missing.
List-1
There are 2 missing blocks. The following files may be corrupted:
blk_1073857294 /tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-exec-2.1.1.jar
blk_1073857295 /tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-hcatalog-core-3.0.0.jar
Please check the logs or run fsck in order to identify the missing blocks. See the Hadoop FAQ for common causes and potential solutions.
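Both files are under /tmp and are not real business data, so we simply deleted them, as shown in List-2. Afterwards the HDFS web page no longer showed the problem.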
List-2
[xx@xxx hadoop]# hadoop fsck -delete
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Connecting to namenode via http://xxxx:50070/fsck?ugi=root&delete=1&path=%2F
FSCK started by root (auth:SIMPLE) from /10.42.5.26 for path / at Wed Mar 25 12:35:39 CST 2020
..............................................................................
/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-exec-2.1.1.jar: CORRUPT blockpool BP-604784226-10.42.1.102-1577681916881 block blk_1073857294
/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-exec-2.1.1.jar: MISSING 1 blocks of total size 32441258 B..
/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-hcatalog-core-3.0.0.jar: CORRUPT blockpool BP-604784226-10.42.1.102-1577681916881 block blk_1073857295
/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-hcatalog-core-3.0.0.jar: MISSING 1 blocks of total size 269009 B......................
...
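List-2 runs the deprecated hadoop fsck against the whole of / and deletes in one step. A more cautious variant, sketched below, uses the hdfs command that the deprecation notice recommends, limits the check to /tmp, and looks before it deletes. The names FSCK_PATH, DRY_RUN and run are hypothetical helpers introduced only for this sketch; the fsck options themselves (-list-corruptfileblocks, -files, -blocks, -locations, -delete) are standard. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: inspect corrupt files first, then delete within a scoped path.
# FSCK_PATH, DRY_RUN and run() are hypothetical names, not part of Hadoop.

FSCK_PATH="${FSCK_PATH:-/tmp}"   # path to check; the note above used /
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands only, 0 = execute them

run() {
  # Print the command in dry-run mode; otherwise execute it unchanged.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. List files with corrupt or missing blocks under the path (read-only).
run hdfs fsck "$FSCK_PATH" -list-corruptfileblocks

# 2. Show per-file block details and DataNode locations (read-only).
run hdfs fsck "$FSCK_PATH" -files -blocks -locations

# 3. Delete the corrupted files. Destructive: only for data you can afford to lose.
run hdfs fsck "$FSCK_PATH" -delete
```

Running it as is prints the three commands prefixed with "+ ". Setting DRY_RUN=0 executes them, so it is worth reviewing the output of steps 1 and 2 before letting step 3 run.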
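Root cause analysis:
HDFS stores file data as blocks, here blk_1073857294 and blk_1073857295. Those two blocks had been deleted, but the metadata that refers to them was still there, so HDFS reported them as missing. I no longer needed this data, so the fix was simply to delete the metadata of the affected files as well, which is what fsck -delete does by removing the corrupted files.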
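To see exactly which files still have metadata but no block data, the CORRUPT lines in the fsck output can be reduced to a list of file paths. The following is a minimal sketch that assumes the output format shown in List-2; corrupt_files is a hypothetical helper name, and the sample input is just two lines copied from List-2.

```shell
#!/usr/bin/env bash
# Sketch only: corrupt_files is a hypothetical helper, not a Hadoop command.
# It reads fsck output on stdin and prints each path reported as CORRUPT once.
corrupt_files() {
  # Keep lines of the form "<path>: CORRUPT blockpool ..." and print <path>.
  sed -n 's|^\(/.*\): CORRUPT blockpool .*$|\1|p' | sort -u
}

# Sample input: two lines taken from List-2.
# Against a live cluster this would be: hdfs fsck /tmp | corrupt_files
sample='/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-exec-2.1.1.jar: CORRUPT blockpool BP-604784226-10.42.1.102-1577681916881 block blk_1073857294
/tmp/xxx/b9a11fe8-306a-42cc-b49f-2a7f098ecb5a/hive-exec-2.1.1.jar: MISSING 1 blocks of total size 32441258 B..'

printf '%s\n' "$sample" | corrupt_files
```

With the sample input this prints the single path of hive-exec-2.1.1.jar. The resulting list can be reviewed by hand and the files removed with hdfs dfs -rm, which is a different route from the fsck -delete used in List-2 but ends with the same stale metadata gone.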
Reference
1.https://blog.csdn.net/lsr40/article/details/79426333