메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. hdfs-site.xml과 yarn-site.xml의 설정을 다시 확인한다.

 가. hdfs-site.xml

   <property>
     <name>dfs.hosts.exclude</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.exclude</value>
   </property>
   <property>
     <name>dfs.host</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.include</value>
   </property>


 나. yarn-site.xml

  <property>
   <name>yarn.resourcemanager.nodes.include-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.include</value>
  </property>
  <property>
   <name>yarn.resourcemanager.nodes.exclude-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.exclude</value>
  </property>


2. hdfs fsck -storagepolicies 혹은 hdfs fsck -blocks / 를 실행하여 Block의 상태를 확인한다.

 결과는 하단 참조


3. 2의 결과가 Status: CORRUPT이면 적절한 조치를 취한다.

 hdfs fsck -delete 혹은 hdfs fsck -move


4. 2을 다시 실행하여 Status: HEALTHY인지 확인한다.

  결과는 하단 참조


5. 필요시 Decommission과정을 다시 수행한다.

hdfs dfsadmin -refreshNodes

yarn rmadmin -refreshNodes


* Decommission이 수일 혹은 수주 동안 진행될수도 있는데 속도를 증가시키는 방법으로 hdfs-site.xml에 다음을 추가/반영시켜준다.

   (참고 : https://community.hortonworks.com/questions/102621/node-decommissioning-progressing-too-slowly.html)

   <property>
     <name>dfs.namenode.replication.max-streams</name>
     <value>50</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.max-streams-hard-limit</name>
     <value>100</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
     <value>200</value>
   </property>



-----------hdfs fsck -storagepolicies실행 결과(Status: CORRUPT)----------

.....

(생략)

....

/user/hadoop/spark/local-1510393605261: MISSING 1 blocks of total size 134217728 B......
/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078958243_5217532. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078982811_5242100. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511756383245: MISSING 1 blocks of total size 8357126 B.....
/user/hadoop/spark/local-1511848071791:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079042189_5301478. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1511858124646: MISSING 1 blocks of total size 40291 B..
/user/hadoop/spark/local-1511858518707:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079043514_5302803. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1511861829455:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079044160_5303449. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1511921506635:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079057453_5316742. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511931435456: MISSING 1 blocks of total size 702011 B..
/user/hadoop/spark/local-1511932067927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079058984_5318273. Target Replicas is 3 but found 2 replica(s).
.......
/user/hadoop/spark/local-1511939175974:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060057_5319346. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511942070784:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060488_5319777. Target Replicas is 3 but found 2 replica(s).
.
.
/user/hadoop/spark/local-1511945803722:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061302_5320591. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511946633083:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061444_5320733. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512003403329:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074161_5333450. Target Replicas is 3 but found 1 replica(s).
.
/user/hadoop/spark/local-1512008787877:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074799_5334088. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512018010728:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079076017_5335306. Target Replicas is 3 but found 2 replica(s).
............
/user/hadoop/spark/local-1512121416466:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079096405_5355695. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512361519396:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079147739_5407029. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512373036884:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149109_5408399. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1512373950155:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149191_5408481. Target Replicas is 3 but found 1 replica(s).
..........................
/user/hadoop/spark/local-1512641606927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079196301_5455600. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512694548543:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079208186_5467485. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512712721899:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079272068_5531367. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512978676213:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079331781_5591080. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513040318768:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079345530_5604829. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513154800735:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079372104_5631403. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079391263_5650563. Target Replicas is 3 but found 2 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
.......
.....Status: CORRUPT
 Total size:    1136215827409 B (Total open files size: 939529123 B)
 Total dirs:    1415
 Total files:   1864205
 Total symlinks:                0 (Files currently being written: 12)
 Total blocks (validated):      1864295 (avg. block size 609461 B) (Total open file blocks (not validated): 18)
  ********************************
  UNDER MIN REPL'D BLOCKS:      11 (5.900354E-4 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        11
  MISSING BLOCKS:       11
  MISSING SIZE:         321480184 B
  ********************************
 Minimally replicated blocks:   1864284 (99.99941 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       545406 (29.255348 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.661861
 Corrupt blocks:                0
 Missing replicas:              630358 (11.270713 %)
 Number of data-nodes:          8
 Number of racks:               1
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
fsck encountered internal errors!


Fsck on path '/' FAILED



-----------hdfs fsck -storagepolicies실행 결과(Status: HEALTHY)----------

.....

(생략)

....

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/hadoop/spark/local-1514451281961:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079748353_6021470. Target Replicas is 3 but found 1 replica(s).
.
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
........
....Status: HEALTHY
 Total size:    1136067627332 B (Total open files size: 1312 B)
 Total dirs:    1446
 Total files:   1864304
 Total symlinks:                0 (Files currently being written: 4)
 Total blocks (validated):      1864376 (avg. block size 609355 B) (Total open file blocks (not validated): 3)
 Minimally replicated blocks:   1864376 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       526644 (28.247736 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.682025
 Corrupt blocks:                0
 Missing replicas:              592825 (10.599168 %)
 Number of data-nodes:          8
 Number of racks:               1

Blocks satisfying the specified storage policy:
Storage Policy                  # of blocks       % of blocks
DISK:5(HOT)                   796652              42.7302%
DISK:3(HOT)                   595925              31.9638%
DISK:4(HOT)                   471514              25.2907%
DISK:6(HOT)                      274               0.0147%
DISK:1(HOT)                       10               0.0005%
DISK:2(HOT)                        1               0.0001%

All blocks satisfy specified storage policy.
FSCK ended at Tue Jan 02 17:08:00 KST 2018 in 184737 milliseconds


The filesystem under path '/' is HEALTHY

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로