메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


Hadoop의 각 데몬을 기동하여 정상 작동중이다가 갑자기 DataNode가 아래와 같은 오류를 내면서 죽는 경우가 있다.

원인은 Heap메모리가 부족하여 발생하는 문제이다. 이때는 아래 내용을 참조하여 HEAP사이즈를 변경하여 각서버에 반영하고 Hadoop전체를 다시

재 기동시켜서 반영해준다.

(분제가 발생하는 노드는 전체 클러스트와 메모리는 같은데 HardDisk용량이 2.5배 정도 되는데 다른 노드에 비해서 데이타 유입량이 더 많아서

동한 다른 노드와 같은 설정을 하면 이용하면서 HEAP메모리 부족현상이 발생되는것으로 보임)


1. hadoop-env.sh에서
export HADOOP_HEAPSIZE을
export HADOOP_HEAPSIZE=3000 으로 설정한다.

export HADOOP_NAMENODE_INIT_HEAPSIZE=""을
export HADOOP_NAMENODE_INIT_HEAPSIZE="2000" 으로 설정한다.


2. mapred-env.sh에서
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000을
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=2000 으로 설정한다.


3. yarn-env.sh에서

JAVA_HEAP_MAX=-Xmx1000m 를
JAVA_HEAP_MAX=-Xmx2000m 으로 설정한다.

# YARN_HEAPSIZE=1000을
YARN_HEAPSIZE=2000 으로 설정한다.




-----------------------------------오류내용--------------------------------
2017-07-18 20:20:38,668 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1265ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1764ms
2017-07-18 20:20:32,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520983_2780234 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:30,934 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 beginning handsh
ake with NN
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /XXX.XXX.XXX.XXX:43840 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /166.104.112.69:45343 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:57,083 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 succe
ssfully registered with NN
2017-07-18 20:21:00,044 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1178ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1677ms
2017-07-18 20:21:05,452 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2133ms
GC pool 'PS MarkSweep' had collection(s): count=3 time=2632ms
2017-07-18 20:21:12,342 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1229ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1729ms
2017-07-18 20:21:14,056 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1214ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1713ms
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected exception in block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/1
66.104.112.43:9000
java.lang.OutOfMemoryError: Java heap space
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/166.1
04.112.43:9000
2017-07-18 20:21:29,958 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1205ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1704ms
2017-07-18 20:21:40,231 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1431ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1931ms
2017-07-18 20:21:45,597 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3310ms
GC pool 'PS MarkSweep' had collection(s): count=4 time=3808ms
2017-07-18 20:21:55,800 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392)
2017-07-18 20:21:58,707 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 5945ms
GC pool 'PS MarkSweep' had collection(s): count=12 time=13105ms
2017-07-18 20:22:00,356 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989
2017-07-18 20:22:03,000 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-07-18 20:22:03,001 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-07-18 20:22:03,002 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1085ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1233ms
2017-07-18 20:22:03,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at sda2/XXX.XXX.XXX.XXX
************************************************************/

번호 제목 글쓴이 날짜 조회 수
444 [oneM2M]Ontologies used for oneM2M 총관리자 2017.08.02 45
443 Windows7 64bit 환경에서 Apache Spark 2.2.0 설치하기 총관리자 2017.07.26 227
442 Windows7 64bit 환경에서 Apache Hadoop 2.7.1설치하기 총관리자 2017.07.26 200
441 jena/fuseki 3.4.0 설치 총관리자 2017.07.25 142
440 LUBM 데이타 생성구문 총관리자 2017.07.24 136
439 Core with name 'xx_shard4_replica1' already exists. 발생시 조치사항 총관리자 2017.07.22 43
438 9대가 hbase cluster로 구성된 서버에서 테스트 data를 halyard에 적재하고 테스트 하는 방법및 절차 총관리자 2017.07.21 53
» 갑자기 DataNode가 java.io.IOException: Premature EOF from inputStream를 반복적으로 발생시키다가 java.lang.OutOfMemoryError: Java heap space를 내면서 죽는 경우 조치방법 총관리자 2017.07.19 1010
436 Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation 총관리자 2017.07.18 878
435 HBase 설정 최적화하기(VCNC) file 총관리자 2017.07.18 98
434 HBase write 성능 튜닝 file 총관리자 2017.07.18 68
433 schema.xml vs managed-schema 지정 사용하기 - 두개를 동시에 사용할 수는 없음 총관리자 2017.07.09 137
432 halyard의 console스크립트에서 생성한 repository는 RDF4J Web Applications에서 공유가 되지 않는다. 총관리자 2017.07.05 42
431 halyard 1.3의 rdf4j-server.war와 rdf4j-workbench.war를 tomcat deploy후 조회시 java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/Cell발생시 조치사항 총관리자 2017.07.05 57
430 halyard 1.3을 다른 서버로 이전하는 방법 총관리자 2017.07.05 63
429 python test.py실행시 "ImportError: No module named pyspark" 혹은 "ImportError: No module named py4j.protocol"등의 오류 발생시 조치사항 총관리자 2017.07.04 678
428 solr명령 실행시 "Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect" 오류발생 총관리자 2017.06.30 168
427 mysql에서 외부 디비를 커넥션할 경우 접속 속도가 느려질때 총관리자 2017.06.30 812
426 solr 6.2에 한글 형태소 분석기(arirang 6.x) 적용 및 테스트 file 총관리자 2017.06.27 650
425 elasticsearch 기동시 permission denied on key 'vm.max_map_count' 오류발생시 조치사항 총관리자 2017.06.23 374

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로