메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


Hadoop의 각 데몬을 기동하여 정상 작동중이다가 갑자기 DataNode가 아래와 같은 오류를 내면서 죽는 경우가 있다.

원인은 Heap메모리가 부족하여 발생하는 문제이다. 이때는 아래 내용을 참조하여 HEAP사이즈를 변경하여 각서버에 반영하고 Hadoop전체를 다시

재 기동시켜서 반영해준다.

(분제가 발생하는 노드는 전체 클러스트와 메모리는 같은데 HardDisk용량이 2.5배 정도 되는데 다른 노드에 비해서 데이타 유입량이 더 많아서

동한 다른 노드와 같은 설정을 하면 이용하면서 HEAP메모리 부족현상이 발생되는것으로 보임)


1. hadoop-env.sh에서
export HADOOP_HEAPSIZE을
export HADOOP_HEAPSIZE=3000 으로 설정한다.

export HADOOP_NAMENODE_INIT_HEAPSIZE=""을
export HADOOP_NAMENODE_INIT_HEAPSIZE="2000" 으로 설정한다.


2. mapred-env.sh에서
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000을
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=2000 으로 설정한다.


3. yarn-env.sh에서

JAVA_HEAP_MAX=-Xmx1000m 를
JAVA_HEAP_MAX=-Xmx2000m 으로 설정한다.

# YARN_HEAPSIZE=1000을
YARN_HEAPSIZE=2000 으로 설정한다.




-----------------------------------오류내용--------------------------------
2017-07-18 20:20:38,668 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1265ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1764ms
2017-07-18 20:20:32,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520983_2780234 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:30,934 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 beginning handsh
ake with NN
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /XXX.XXX.XXX.XXX:43840 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /166.104.112.69:45343 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:57,083 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 succe
ssfully registered with NN
2017-07-18 20:21:00,044 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1178ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1677ms
2017-07-18 20:21:05,452 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2133ms
GC pool 'PS MarkSweep' had collection(s): count=3 time=2632ms
2017-07-18 20:21:12,342 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1229ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1729ms
2017-07-18 20:21:14,056 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1214ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1713ms
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected exception in block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/1
66.104.112.43:9000
java.lang.OutOfMemoryError: Java heap space
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/166.1
04.112.43:9000
2017-07-18 20:21:29,958 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1205ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1704ms
2017-07-18 20:21:40,231 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1431ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1931ms
2017-07-18 20:21:45,597 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3310ms
GC pool 'PS MarkSweep' had collection(s): count=4 time=3808ms
2017-07-18 20:21:55,800 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392)
2017-07-18 20:21:58,707 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 5945ms
GC pool 'PS MarkSweep' had collection(s): count=12 time=13105ms
2017-07-18 20:22:00,356 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989
2017-07-18 20:22:03,000 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-07-18 20:22:03,001 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-07-18 20:22:03,002 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1085ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1233ms
2017-07-18 20:22:03,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at sda2/XXX.XXX.XXX.XXX
************************************************************/

번호 제목 글쓴이 날짜 조회 수
27 hortonworks에서 제공하는 메모리 설정값 계산기 사용법 file 총관리자 2015.06.14 719
26 CentOS의 서버 5대에 yarn(hadoop 2.7.2)설치하기-ResourceManager HA/HDFS HA포함, JobHistory포함 총관리자 2016.03.29 1138
25 resouce manager에 dr.who가 아닌 다른 사용자로 로그인 하기 총관리자 2018.06.28 1207
24 ExWordCount jar파일 file 구퍼 2013.03.06 1336
23 Journal Storage Directory /data/hadoop/journal/data/mycluster not formatted 오류시 조치사항 총관리자 2016.07.29 1518
22 physical memory used되면서 mapper가 kill되는 경우 오류 발생시 조치 총관리자 2018.09.20 1522
» 갑자기 DataNode가 java.io.IOException: Premature EOF from inputStream를 반복적으로 발생시키다가 java.lang.OutOfMemoryError: Java heap space를 내면서 죽는 경우 조치방법 총관리자 2017.07.19 1676
20 access=WRITE, inode="staging":ubuntu:supergroup:rwxr-xr-x 오류 총관리자 2014.07.05 1719
19 Hadoop wordcount 소스 작성 file 구퍼 2013.03.06 1888
18 Hadoop 설치 및 시작하기 file 구퍼 2013.03.06 1951
17 W/F수행후 Logs not available for 1. Aggregation may not to complete. 표시되며 로그내용이 보이지 않은 경우 총관리자 2020.05.08 2110
16 hadoop설치시 참고사항 구퍼 2013.03.08 2131
15 메이븐 (maven) 설치 및 이클립스 연동하기 file 구퍼 2013.03.06 2280
14 hadoop설치시 오류 총관리자 2013.12.18 2313
13 Cacti로 Hadoop 모니터링 하기 file 구퍼 2013.03.12 2367
12 hadoop 설치(3대) file 구퍼 2013.03.07 2613
11 banana pi에(lubuntu)에 hadoop설치하고 테스트하기 - 성공 file 총관리자 2014.07.05 2760
10 org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=WRITE, inode="":root:supergroup:rwxr-xr-x 오류 처리방법 총관리자 2014.07.05 2834
9 이클립스에서 생성한 jar 파일 hadoop 으로 실행하기 file 구퍼 2013.03.06 2836
8 "java.net.NoRouteToHostException: 호스트로 갈 루트가 없음" 오류시 확인및 조치할 사항 총관리자 2016.04.01 3002

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로