메뉴 건너뛰기

Cloudera, BigData, Semantic IoT, Hadoop, NoSQL

Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.


1. hdfs-site.xml과 yarn-site.xml의 설정을 다시 확인한다.

 가. hdfs-site.xml

   <property>
     <name>dfs.hosts.exclude</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.exclude</value>
   </property>
   <property>
     <name>dfs.host</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.include</value>
   </property>


 나. yarn-site.xml

  <property>
   <name>yarn.resourcemanager.nodes.include-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.include</value>
  </property>
  <property>
   <name>yarn.resourcemanager.nodes.exclude-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.exclude</value>
  </property>


2. hdfs fsck -storagepolicies 혹은 hdfs fsck -blocks / 를 실행하여 Block의 상태를 확인한다.

 결과는 하단 참조


3. 2의 결과가 Status: CORRUPT이면 적절한 조치를 취한다.

 hdfs fsck -delete 혹은 hdfs fsck -move


4. 2을 다시 실행하여 Status: HEALTHY인지 확인한다.

  결과는 하단 참조


5. 필요시 Decommission과정을 다시 수행한다.

hdfs dfsadmin -refreshNodes

yarn rmadmin -refreshNodes


* Decommission이 수일 혹은 수주 동안 진행될수도 있는데 속도를 증가시키는 방법으로 hdfs-site.xml에 다음을 추가/반영시켜준다.

   (참고 : https://community.hortonworks.com/questions/102621/node-decommissioning-progressing-too-slowly.html)

   <property>
     <name>dfs.namenode.replication.max-streams</name>
     <value>50</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.max-streams-hard-limit</name>
     <value>100</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
     <value>200</value>
   </property>



-----------hdfs fsck -storagepolicies실행 결과(Status: CORRUPT)----------

.....

(생략)

....

/user/hadoop/spark/local-1510393605261: MISSING 1 blocks of total size 134217728 B......
/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078958243_5217532. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078982811_5242100. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511756383245: MISSING 1 blocks of total size 8357126 B.....
/user/hadoop/spark/local-1511848071791:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079042189_5301478. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1511858124646: MISSING 1 blocks of total size 40291 B..
/user/hadoop/spark/local-1511858518707:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079043514_5302803. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1511861829455:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079044160_5303449. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1511921506635:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079057453_5316742. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511931435456: MISSING 1 blocks of total size 702011 B..
/user/hadoop/spark/local-1511932067927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079058984_5318273. Target Replicas is 3 but found 2 replica(s).
.......
/user/hadoop/spark/local-1511939175974:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060057_5319346. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511942070784:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060488_5319777. Target Replicas is 3 but found 2 replica(s).
.
.
/user/hadoop/spark/local-1511945803722:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061302_5320591. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511946633083:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061444_5320733. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512003403329:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074161_5333450. Target Replicas is 3 but found 1 replica(s).
.
/user/hadoop/spark/local-1512008787877:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074799_5334088. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512018010728:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079076017_5335306. Target Replicas is 3 but found 2 replica(s).
............
/user/hadoop/spark/local-1512121416466:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079096405_5355695. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512361519396:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079147739_5407029. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512373036884:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149109_5408399. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1512373950155:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149191_5408481. Target Replicas is 3 but found 1 replica(s).
..........................
/user/hadoop/spark/local-1512641606927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079196301_5455600. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512694548543:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079208186_5467485. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512712721899:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079272068_5531367. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512978676213:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079331781_5591080. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513040318768:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079345530_5604829. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513154800735:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079372104_5631403. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079391263_5650563. Target Replicas is 3 but found 2 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
.......
.....Status: CORRUPT
 Total size:    1136215827409 B (Total open files size: 939529123 B)
 Total dirs:    1415
 Total files:   1864205
 Total symlinks:                0 (Files currently being written: 12)
 Total blocks (validated):      1864295 (avg. block size 609461 B) (Total open file blocks (not validated): 18)
  ********************************
  UNDER MIN REPL'D BLOCKS:      11 (5.900354E-4 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        11
  MISSING BLOCKS:       11
  MISSING SIZE:         321480184 B
  ********************************
 Minimally replicated blocks:   1864284 (99.99941 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       545406 (29.255348 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.661861
 Corrupt blocks:                0
 Missing replicas:              630358 (11.270713 %)
 Number of data-nodes:          8
 Number of racks:               1
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
fsck encountered internal errors!


Fsck on path '/' FAILED



-----------hdfs fsck -storagepolicies실행 결과(Status: HEALTHY)----------

.....

(생략)

....

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/hadoop/spark/local-1514451281961:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079748353_6021470. Target Replicas is 3 but found 1 replica(s).
.
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
........
....Status: HEALTHY
 Total size:    1136067627332 B (Total open files size: 1312 B)
 Total dirs:    1446
 Total files:   1864304
 Total symlinks:                0 (Files currently being written: 4)
 Total blocks (validated):      1864376 (avg. block size 609355 B) (Total open file blocks (not validated): 3)
 Minimally replicated blocks:   1864376 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       526644 (28.247736 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.682025
 Corrupt blocks:                0
 Missing replicas:              592825 (10.599168 %)
 Number of data-nodes:          8
 Number of racks:               1

Blocks satisfying the specified storage policy:
Storage Policy                  # of blocks       % of blocks
DISK:5(HOT)                   796652              42.7302%
DISK:3(HOT)                   595925              31.9638%
DISK:4(HOT)                   471514              25.2907%
DISK:6(HOT)                      274               0.0147%
DISK:1(HOT)                       10               0.0005%
DISK:2(HOT)                        1               0.0001%

All blocks satisfy specified storage policy.
FSCK ended at Tue Jan 02 17:08:00 KST 2018 in 184737 milliseconds


The filesystem under path '/' is HEALTHY

번호 제목 날짜 조회 수
67 메이븐 (maven) 설치 및 이클립스 연동하기 file 2013.03.06 4943
66 Hadoop 설치 및 시작하기 file 2013.03.06 4776
65 Hadoop wordcount 소스 작성 file 2013.03.06 4596
64 이클립스에서 생성한 jar 파일 hadoop 으로 실행하기 file 2013.03.06 5539
63 ExWordCount jar파일 file 2013.03.06 4477
62 Hadoop Cluster 설치 (Hadoop+Zookeeper+Hbase) file 2013.03.07 6099
61 hadoop 설치(3대) file 2013.03.07 4671
60 hadoop설치시 참고사항 2013.03.08 4849
59 org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-root/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible. 2013.03.11 17000
58 Cacti로 Hadoop 모니터링 하기 file 2013.03.12 4902
57 hadoop설치시 오류 2013.12.18 4967
56 hadoop및 ecosystem에서 사용되는 명령문 정리 2014.05.28 6435
55 banana pi에(lubuntu)에 hadoop설치하고 테스트하기 - 성공 file 2014.07.05 5344
54 org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=WRITE, inode="":root:supergroup:rwxr-xr-x 오류 처리방법 2014.07.05 5385
53 access=WRITE, inode="staging":ubuntu:supergroup:rwxr-xr-x 오류 2014.07.05 4636
52 hadoop의 data디렉토리를 변경하는 방법 2014.08.24 4181
51 bananapi 5대(ubuntu계열 리눅스)에 yarn(hadoop 2.6.0)설치하기-ResourceManager HA/HDFS HA포함, JobHistory포함 2015.04.24 22230
50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable원인 2015.04.27 4461
49 Hadoop - 클러스터 세팅및 기동 2015.04.28 4003
48 hadoop 2.6.0 기동(에코시스템 포함)및 wordcount 어플리케이션을 이용한 테스트 2015.05.05 6542
위로