Personal notes
SongPinru's small repository
Node decommissioning procedure:
1. Before decommissioning, check the HDFS block report for errors; if there are corrupt blocks, repair them first.
(PS: run these commands as the hdfs user)
Check command:
hdfs fsck / -list-corruptfileblocks -openforwrite -files -blocks -locations
Repair command:
hdfs debug recoverLease -path /path
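The check and repair steps above can be chained in a small script. This is a minimal sketch that pulls file paths out of the fsck corrupt-block listing and prints a recoverLease command for each one; the sample fsck lines and the "block-id, path" column layout are assumptions, so verify them against your Hadoop version's actual output before removing the echo.

```shell
# Hedged sketch: in real use, fsck_out would come from
#   hdfs fsck / -list-corruptfileblocks
# The two sample lines below are hypothetical.
fsck_out='blk_1073741825 /data/logs/app.log
blk_1073741830 /data/tmp/part-0001'

# Assumed format: each corrupt-block line starts with the block id,
# followed by the file path; take the unique paths.
paths=$(printf '%s\n' "$fsck_out" | awk '/^blk_/ {print $2}' | sort -u)

for p in $paths; do
    # Print the repair command; drop the echo to actually run it
    # (as the hdfs user). -retries is a real recoverLease option.
    echo "hdfs debug recoverLease -path $p -retries 3"
done
```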
2. Decommission the nodes
In CDH, search for dfs_hosts_exclude.txt
and add the IPs of the nodes to take offline.
Put the nodes to be removed in this exclude list, then save and refresh HDFS (this does not restart the cluster).
Alternatively, log in to the active NameNode, find the configuration directory, edit dfs_hosts_exclude.txt,
then run:
hdfs dfsadmin -refreshNodes
The nodes being removed will then show the state Decommissioning (visible in the web UI).
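For the command-line route, the edit-and-refresh step can be sketched as below. The mktemp file stands in for the real dfs_hosts_exclude.txt (whose path depends on your configuration directory, and which Cloudera Manager manages itself on CDH, so prefer the CDH UI there), and 10.0.0.49 is a hypothetical node IP.

```shell
# Stand-in for the real dfs_hosts_exclude.txt; on a live NameNode you
# would edit the actual file in the Hadoop configuration directory.
EXCLUDE_FILE=$(mktemp)

# Exclude-file format: one hostname or IP per line (hypothetical value).
printf '%s\n' '10.0.0.49' >> "$EXCLUDE_FILE"

# Tell the active NameNode to re-read the include/exclude files;
# drop the echo to actually run it.
echo 'hdfs dfsadmin -refreshNodes'
```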
3. Wait for decommissioning to finish
You can watch the web UI, or use the command below,
which shows whether replica re-replication is complete:
hdfs dfsadmin -report -decommissioning
root@bjuc49:/home/ops# hdfs dfsadmin -report -decommissioning
Configured Capacity: 67358447177728 (61.26 TB)
Present Capacity: 66731792277188 (60.69 TB)
DFS Remaining: 19512827293507 (17.75 TB)
DFS Used: 47218964983681 (42.95 TB)
DFS Used%: 70.76%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 6
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Decommissioning datanodes (0):
- Under replicated blocks: blocks whose replica count is below target (use this to track decommissioning progress)
- Blocks with corrupt replicas: blocks that have corrupt replicas
- Missing blocks: blocks that are lost entirely
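Instead of polling the web UI by hand, the report can be checked from a script. This minimal sketch parses the "Decommissioning datanodes (N)" line; here the report text is inlined from the sample output above, and in practice you would substitute the live dfsadmin output (and could also watch the Under replicated blocks counter the same way).

```shell
# Hedged sketch: in real use, capture the live report with
#   report=$(hdfs dfsadmin -report -decommissioning)
report='Decommissioning datanodes (0):
Under replicated blocks: 0'

# Extract N from the "Decommissioning datanodes (N):" header line.
count=$(printf '%s\n' "$report" \
    | sed -n 's/^Decommissioning datanodes (\([0-9]*\)).*/\1/p')

if [ "$count" -eq 0 ]; then
    msg="decommission complete"
else
    msg="still decommissioning: $count node(s)"
fi
echo "$msg"
```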