메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. hdfs-site.xml과 yarn-site.xml의 설정을 다시 확인한다.

 가. hdfs-site.xml

   <property>
     <name>dfs.hosts.exclude</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.exclude</value>
   </property>
   <property>
     <name>dfs.host</name>
     <value>$HOME/hadoop/etc/hadoop/nodes.include</value>
   </property>


 나. yarn-site.xml

  <property>
   <name>yarn.resourcemanager.nodes.include-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.include</value>
  </property>
  <property>
   <name>yarn.resourcemanager.nodes.exclude-path</name>
   <value$HOME/hadoop/etc/hadoop/nodes.exclude</value>
  </property>


2. hdfs fsck -storagepolicies 혹은 hdfs fsck -blocks / 를 실행하여 Block의 상태를 확인한다.

 결과는 하단 참조


3. 2의 결과가 Status: CORRUPT이면 적절한 조치를 취한다.

 hdfs fsck -delete 혹은 hdfs fsck -move


4. 2을 다시 실행하여 Status: HEALTHY인지 확인한다.

  결과는 하단 참조


5. 필요시 Decommission과정을 다시 수행한다.

hdfs dfsadmin -refreshNodes

yarn rmadmin -refreshNodes


* Decommission이 수일 혹은 수주 동안 진행될수도 있는데 속도를 증가시키는 방법으로 hdfs-site.xml에 다음을 추가/반영시켜준다.

   (참고 : https://community.hortonworks.com/questions/102621/node-decommissioning-progressing-too-slowly.html)

   <property>
     <name>dfs.namenode.replication.max-streams</name>
     <value>50</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.max-streams-hard-limit</name>
     <value>100</value>
   </property>
  
   <property>
     <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
     <value>200</value>
   </property>



-----------hdfs fsck -storagepolicies실행 결과(Status: CORRUPT)----------

.....

(생략)

....

/user/hadoop/spark/local-1510393605261: MISSING 1 blocks of total size 134217728 B......
/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078958243_5217532. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1511150952538:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1078982811_5242100. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511756383245: MISSING 1 blocks of total size 8357126 B.....
/user/hadoop/spark/local-1511848071791:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079042189_5301478. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1511858124646: MISSING 1 blocks of total size 40291 B..
/user/hadoop/spark/local-1511858518707:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079043514_5302803. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1511861829455:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079044160_5303449. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1511921506635:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079057453_5316742. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511931435456: MISSING 1 blocks of total size 702011 B..
/user/hadoop/spark/local-1511932067927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079058984_5318273. Target Replicas is 3 but found 2 replica(s).
.......
/user/hadoop/spark/local-1511939175974:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060057_5319346. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1511942070784:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079060488_5319777. Target Replicas is 3 but found 2 replica(s).
.
.
/user/hadoop/spark/local-1511945803722:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061302_5320591. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1511946633083:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079061444_5320733. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512003403329:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074161_5333450. Target Replicas is 3 but found 1 replica(s).
.
/user/hadoop/spark/local-1512008787877:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079074799_5334088. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512018010728:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079076017_5335306. Target Replicas is 3 but found 2 replica(s).
............
/user/hadoop/spark/local-1512121416466:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079096405_5355695. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512361519396:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079147739_5407029. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512373036884:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149109_5408399. Target Replicas is 3 but found 2 replica(s).
..
/user/hadoop/spark/local-1512373950155:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079149191_5408481. Target Replicas is 3 but found 1 replica(s).
..........................
/user/hadoop/spark/local-1512641606927:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079196301_5455600. Target Replicas is 3 but found 2 replica(s).
.
/user/hadoop/spark/local-1512694548543:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079208186_5467485. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512712721899:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079272068_5531367. Target Replicas is 3 but found 2 replica(s).
.....
/user/hadoop/spark/local-1512978676213:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079331781_5591080. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513040318768:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079345530_5604829. Target Replicas is 3 but found 2 replica(s).
...
/user/hadoop/spark/local-1513154800735:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079372104_5631403. Target Replicas is 3 but found 1 replica(s).
..
/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079391263_5650563. Target Replicas is 3 but found 2 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
.......
.....Status: CORRUPT
 Total size:    1136215827409 B (Total open files size: 939529123 B)
 Total dirs:    1415
 Total files:   1864205
 Total symlinks:                0 (Files currently being written: 12)
 Total blocks (validated):      1864295 (avg. block size 609461 B) (Total open file blocks (not validated): 18)
  ********************************
  UNDER MIN REPL'D BLOCKS:      11 (5.900354E-4 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        11
  MISSING BLOCKS:       11
  MISSING SIZE:         321480184 B
  ********************************
 Minimally replicated blocks:   1864284 (99.99941 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       545406 (29.255348 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.661861
 Corrupt blocks:                0
 Missing replicas:              630358 (11.270713 %)
 Number of data-nodes:          8
 Number of racks:               1
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
FSCK ended at Tue Jan 02 15:42:32 KST 2018 in 196868 milliseconds
fsck encountered internal errors!


Fsck on path '/' FAILED



-----------hdfs fsck -storagepolicies실행 결과(Status: HEALTHY)----------

.....

(생략)

....

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079416745_5676128. Target Replicas is 3 but found 1 replica(s).

/user/hadoop/spark/local-1513239737761:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079454497_5713964. Target Replicas is 3 but found 1 replica(s).
....
/user/hadoop/spark/local-1513933165296:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079584889_5844405. Target Replicas is 3 but found 1 replica(s).
...
/user/hadoop/spark/local-1514451281961:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1079748353_6021470. Target Replicas is 3 but found 1 replica(s).
.
/user/pineone/gooper-test/icbms_2017-07-21_13-28-17.nq.gz:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1076598010_2857261. Target Replicas is 3 but found 2 replica(s).
.....
/user/pineone/in/tomcat-juli.jar:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073741825_1001. Target Replicas is 3 but found 2 replica(s).
.......
/user/pineone/out3/part-r-00000:  Under replicated BP-605282214-166.104.112.43-1498555165989:blk_1073744812_3988. Target Replicas is 3 but found 2 replica(s).
........
....Status: HEALTHY
 Total size:    1136067627332 B (Total open files size: 1312 B)
 Total dirs:    1446
 Total files:   1864304
 Total symlinks:                0 (Files currently being written: 4)
 Total blocks (validated):      1864376 (avg. block size 609355 B) (Total open file blocks (not validated): 3)
 Minimally replicated blocks:   1864376 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       526644 (28.247736 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.682025
 Corrupt blocks:                0
 Missing replicas:              592825 (10.599168 %)
 Number of data-nodes:          8
 Number of racks:               1

Blocks satisfying the specified storage policy:
Storage Policy                  # of blocks       % of blocks
DISK:5(HOT)                   796652              42.7302%
DISK:3(HOT)                   595925              31.9638%
DISK:4(HOT)                   471514              25.2907%
DISK:6(HOT)                      274               0.0147%
DISK:1(HOT)                       10               0.0005%
DISK:2(HOT)                        1               0.0001%

All blocks satisfy specified storage policy.
FSCK ended at Tue Jan 02 17:08:00 KST 2018 in 184737 milliseconds


The filesystem under path '/' is HEALTHY

번호 제목 글쓴이 날짜 조회 수
481 CDP에서 AD와 Kerberos를 활용하여 인증 환경을 구축하는 3가지 방법 gooper 2022.06.10 423
480 Tracking URL = N/A 가발생하는 경우 - 환경설정값을 잘못설정하는 경우에 발생함 총관리자 2015.06.17 423
479 conda를 이용한 jupyterhub(v0.9)및 jupyter설치 (v4.4.0) 총관리자 2018.07.30 421
478 컬럼및 라인의 구분자를 지정하여 sqoop으로 데이타를 가져오고 hive테이블을 생성하는 명령문 총관리자 2018.08.03 419
477 kafka 0.9.0.1 for scala 2.1.1 설치및 테스트 총관리자 2016.05.02 412
476 Permission denied: user=hadoop, access=EXECUTE, inode="/tmp":root:supergroup:drwxrwx--- 오류해결방법 총관리자 2015.05.17 412
475 S2RDF를 실행부분만 추출하여 1건의 triple data를 HDFS에 등록, sparql을 sql로 변환, sql실행하는 방법및 S2RDF소스 컴파일 방법 총관리자 2016.06.15 410
474 원보드 컴퓨터 비교표 file 총관리자 2014.08.04 408
473 2개 data를 join하고 마지막으로 code정보를 join하여 결과를 얻는 mr 프로그램 총관리자 2014.06.30 408
472 Job이 끝난 log을 볼수 있도록 설정하기 총관리자 2016.05.30 405
471 Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.http.HttpConfig.getSchemePrefix()Ljava/lang/String; 해결->실패 총관리자 2015.06.14 403
470 source, sink를 직접 구현하여 사용하는 예시 총관리자 2019.05.30 398
469 Eclipse실행시 Java was started but returned exit code=1이라는 오류가 발생할때 조치방법 총관리자 2016.11.07 398
468 Cassandra 3.4(3.10) 설치/설정 (5대로 clustering) 총관리자 2016.04.11 397
467 Error: E0501 : E0501: Could not perform authorization operation, User: hadoop is not allowed to impersonate hadoop 해결하는 방법 총관리자 2015.06.07 385
466 hive metadata(hive, impala, kudu 정보가 있음) 테이블에서 db, table, owner, location를 조회하는 쿼리 총관리자 2020.02.07 380
465 sparql 문법구조 설명 file 총관리자 2015.12.09 378
464 namenode오류 복구시 사용하는 명령 총관리자 2016.04.01 377
463 특정문자열이나 URI를 임의로 select 절에 지정하여 사용할때 사용하는 sparql 문장 총관리자 2016.08.25 376
462 scan의 startrow, stoprow지정하는 방법 총관리자 2015.04.08 375

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로