
Bigdata, Semantic IoT, Hadoop, NoSQL

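This is where I organize what I have learned while working on projects involving Bigdata, the Hadoop ecosystem, Semantic IoT, and related topics.
It is published for anyone who may need it. Please send inquiries to gooper@gooper.com.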


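Sometimes, after every Hadoop daemon has started and been running normally, a DataNode suddenly dies with the errors shown below.

The cause is insufficient heap memory. In that case, change the heap sizes as described below, apply the change on every server, and restart the whole Hadoop cluster so that it takes effect.

(The node that hits the problem has the same amount of memory as the rest of the cluster but about 2.5 times the hard disk capacity, so it receives more data than the other nodes. Running it with the same settings as the other nodes appears to be what exhausts its heap.)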
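Before changing anything, it can help to confirm which heap limit the running DataNode actually has. The following is a minimal sketch, not part of the original procedure: `last_xmx` is a hypothetical helper that reads a Java command line and prints the last `-Xmx` flag on it, on the assumption that the JVM honors the last `-Xmx` it is given when several appear.

```shell
#!/bin/sh
# last_xmx: read one or more Java command lines from stdin and print the
# last -Xmx flag found. Hypothetical helper for illustration only.
last_xmx() {
    # Split the command line into one token per line, keep only -Xmx flags,
    # and print the final one.
    tr ' ' '\n' | grep -- '^-Xmx' | tail -n 1
}

# Possible usage on a DataNode host (assumes the process command line
# contains the DataNode class name):
#   ps -eo args | grep '[d]atanode.DataNode' | last_xmx
```

If the value printed is still the old one after a restart, the edited env file was probably not picked up on that host.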


1. In hadoop-env.sh
change export HADOOP_HEAPSIZE
to export HADOOP_HEAPSIZE=3000 (the value is in MB).

Change export HADOOP_NAMENODE_INIT_HEAPSIZE=""
to export HADOOP_NAMENODE_INIT_HEAPSIZE="2000".
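The edit above can also be scripted, which is convenient when the same change has to be made on several servers. This is a rough sketch under stated assumptions: `set_export` is a hypothetical helper (not a Hadoop tool) that rewrites an existing `export NAME=...` line, commented out or not, and appends the line if none exists. It assumes the value contains no `|` or `&` characters.

```shell
#!/bin/sh
# set_export FILE NAME VALUE
# Hypothetical helper: set `export NAME=VALUE` in a shell env file.
set_export() {
    file=$1 name=$2 value=$3
    tmp="${file}.tmp.$$"
    pattern="^[[:space:]]*#?[[:space:]]*export[[:space:]]+${name}="
    if grep -Eq "$pattern" "$file"; then
        # Replace the existing (possibly commented-out) export line.
        sed -E "s|${pattern}.*|export ${name}=${value}|" "$file" > "$tmp"
    else
        # No such line yet: keep the file and append a new export.
        cp "$file" "$tmp"
        printf 'export %s=%s\n' "$name" "$value" >> "$tmp"
    fi
    # Write back in place so the original file's permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Possible usage (assumes HADOOP_CONF_DIR points at the config directory):
#   set_export "$HADOOP_CONF_DIR/hadoop-env.sh" HADOOP_HEAPSIZE 3000
#   set_export "$HADOOP_CONF_DIR/hadoop-env.sh" HADOOP_NAMENODE_INIT_HEAPSIZE '"2000"'
```

Going through a temporary file instead of `sed -i` keeps the sketch portable between GNU and BSD sed.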


2. In mapred-env.sh
change export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000
to export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=2000.


3. In yarn-env.sh

change JAVA_HEAP_MAX=-Xmx1000m
to JAVA_HEAP_MAX=-Xmx2000m.

Uncomment # YARN_HEAPSIZE=1000 and change it
to YARN_HEAPSIZE=2000.
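Once the three files are edited, they have to reach every server and the cluster has to be restarted. The sketch below only prints the commands it would run, so the plan can be reviewed before anything is executed. The host names, the default paths, and the function name `rollout_plan` are assumptions for illustration; the `sbin` scripts named are the ones shipped with Hadoop 2.x and may differ in other versions.

```shell
#!/bin/sh
# Print (do not run) the commands that would copy the edited env files to
# each host and then restart YARN, HDFS and the JobHistoryServer.
# CONF_DIR and SBIN_DIR defaults are assumptions; adjust to the installation.
CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
SBIN_DIR=${HADOOP_HOME:-/opt/hadoop}/sbin

rollout_plan() {
    # One scp per host and per edited file.
    for host in "$@"; do
        for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
            echo "scp $CONF_DIR/$f $host:$CONF_DIR/$f"
        done
    done
    # Stop everything, then start it again so the new heap sizes apply.
    echo "$SBIN_DIR/mr-jobhistory-daemon.sh stop historyserver"
    echo "$SBIN_DIR/stop-yarn.sh"
    echo "$SBIN_DIR/stop-dfs.sh"
    echo "$SBIN_DIR/start-dfs.sh"
    echo "$SBIN_DIR/start-yarn.sh"
    echo "$SBIN_DIR/mr-jobhistory-daemon.sh start historyserver"
}

# Possible usage with hypothetical host names:
#   rollout_plan worker1 worker2            # review the plan
#   rollout_plan worker1 worker2 | sh -e    # execute it
```

Printing the plan first is a deliberate choice here: a full cluster restart is disruptive, so it is worth seeing the exact commands before running them.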




-----------------------------------Error log--------------------------------
2017-07-18 20:20:38,668 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1265ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1764ms
2017-07-18 20:20:32,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520983_2780234 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:30,934 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-605282214-XXX.XXX.XXX.XXX-1498555165989:blk_1076520963_2780213 received exception java.io.IOException: Premature EOF from inputStream
2017-07-18 20:20:47,191 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 beginning handshake with NN
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /XXX.XXX.XXX.XXX:43840 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:48,073 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: sda2:50010:DataXceiver error processing WRITE_BLOCK operation  src: /166.104.112.69:45343 dst: /XXX.XXX.XXX.XXX:50010
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
        at java.lang.Thread.run(Thread.java:745)
2017-07-18 20:20:57,083 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/XXX.XXX.XXX.XXX:9000 successfully registered with NN
2017-07-18 20:21:00,044 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1178ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1677ms
2017-07-18 20:21:05,452 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 2133ms
GC pool 'PS MarkSweep' had collection(s): count=3 time=2632ms
2017-07-18 20:21:12,342 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1229ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1729ms
2017-07-18 20:21:14,056 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1214ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1713ms
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected exception in block pool Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/166.104.112.43:9000
java.lang.OutOfMemoryError: Java heap space
2017-07-18 20:21:28,386 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392) service to sda1/166.104.112.43:9000
2017-07-18 20:21:29,958 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1205ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1704ms
2017-07-18 20:21:40,231 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1431ms
GC pool 'PS MarkSweep' had collection(s): count=2 time=1931ms
2017-07-18 20:21:45,597 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3310ms
GC pool 'PS MarkSweep' had collection(s): count=4 time=3808ms
2017-07-18 20:21:55,800 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989 (Datanode Uuid d4f1b1f7-0636-483d-91e8-4780b73fb392)
2017-07-18 20:21:58,707 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 5945ms
GC pool 'PS MarkSweep' had collection(s): count=12 time=13105ms
2017-07-18 20:22:00,356 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-605282214-XXX.XXX.XXX.XXX-1498555165989
2017-07-18 20:22:03,000 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2017-07-18 20:22:03,001 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2017-07-18 20:22:03,002 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1085ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=1233ms
2017-07-18 20:22:03,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at sda2/XXX.XXX.XXX.XXX
************************************************************/
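
The pattern in the log above (repeated JvmPauseMonitor pauses followed by an OutOfMemoryError) can be spotted quickly with a small summary script. This is a minimal sketch; `summarize_pauses` is a hypothetical helper name, and it assumes the JvmPauseMonitor message format shown in the log, where the pause length is the last field of the line.

```shell
#!/bin/sh
# summarize_pauses [LOGFILE...]
# Count JvmPauseMonitor pauses, sum and find the longest reported pause,
# and count OutOfMemoryError lines. Reads stdin when no file is given.
summarize_pauses() {
    awk '
        /JvmPauseMonitor: Detected pause/ {
            ms = $NF                 # last field, e.g. "1265ms"
            sub(/ms$/, "", ms)       # strip the unit
            n++
            total += ms
            if (ms + 0 > max) max = ms + 0
        }
        /java.lang.OutOfMemoryError/ { oom++ }
        END {
            printf "pauses=%d total_ms=%d max_ms=%d oom=%d\n", n, total, max, oom
        }
    ' "$@"
}

# Possible usage (the log path is an assumption; use the DataNode log of the
# affected host):
#   summarize_pauses /var/log/hadoop/hadoop-hdfs-datanode-sda2.log
```

A growing pause count and a nonzero `oom` value on one node only would be consistent with the heap shortage described in this post.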

Post: What to do when a DataNode suddenly throws java.io.IOException: Premature EOF from inputStream repeatedly and then dies with java.lang.OutOfMemoryError: Java heap space (총관리자, 2017.07.19)
