메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. application실행하면 "YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM"가 UI에 표시되면서 job이 hang이 걸리면서 진행이 되지 않는 경우가 있는데 그때는 yarn-site.xml의 "yarn.nodemanager.resource.memory-mb"의 값을 매우 크게 지정하여 appliction을 실행하면 hang은 걸리지 않으나 아래의 2번 항목의 문제가 발생한다. 
(yarn.nodemanager.vmem-check-enabled, yarn.nodemanager.vmem-pmem-ratio의 설정값이 없는 경우)

2. hadoop application을 실행할때 "Container [pid=19278,containerID=container_1493858350369_0001_01_000008] is running beyond virtual memory limits. Current usage: 636.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container." 같은 오류가 길생되면서 job이 실패하는 경우가 있는데 이것은 application이 가용한 가상메모리 보다 더 많은 메모리를 사용할때 발생하는 문제이다. 그래서 hadoop이 가상메모리 제한을 체크하지 않도록 하고 많은 메모리를 사용하는 appliction을 위해서 가상메모리와 물리메모리 비율을 높게 설정해 준다.(yarn-site.xml에 설정해줌)
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
  </property>

------------------------------------------오류내용-----------------------
root@gsda1:~/hadoop/etc/hadoop# yarn jar $HOME/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount in out-6
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/svc/apps/gsda/bin/hadoop/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/05/04 09:41:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/04 09:41:10 WARN ipc.Client: Failed to connect to server: gsda1/104.251.212.146:8032: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:681)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:777)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:409)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1542)
        at org.apache.hadoop.ipc.Client.call(Client.java:1373)
        at org.apache.hadoop.ipc.Client.call(Client.java:1337)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy13.getNewApplication(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
        at com.sun.proxy.$Proxy14.getNewApplication(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:242)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:250)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:193)
        at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:241)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:155)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
17/05/04 09:41:10 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
17/05/04 09:41:12 INFO input.FileInputFormat: Total input files to process : 3
17/05/04 09:41:12 INFO mapreduce.JobSubmitter: number of splits:3
17/05/04 09:41:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1493858350369_0001
17/05/04 09:41:13 INFO impl.YarnClientImpl: Submitted application application_1493858350369_0001
17/05/04 09:41:13 INFO mapreduce.Job: The url to track the job: http://gsda2:8088/proxy/application_1493858350369_0001/
17/05/04 09:41:13 INFO mapreduce.Job: Running job: job_1493858350369_0001
17/05/04 09:41:21 INFO mapreduce.Job: Job job_1493858350369_0001 running in uber mode : false
17/05/04 09:41:21 INFO mapreduce.Job:  map 0% reduce 0%
17/05/04 09:41:27 INFO mapreduce.Job:  map 33% reduce 0%
17/05/04 09:41:29 INFO mapreduce.Job:  map 67% reduce 0%
17/05/04 09:41:30 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_0, Status : FAILED
Container [pid=2161,containerID=container_1493858350369_0001_01_000002] is running beyond virtual memory limits. Current usage: 738.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000002 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 2165 2161 2161 2161 (java) 929 59 2580721664 188600 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_0 2 
        |- 2161 2159 2161 2161 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_0 2 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:36 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_1, Status : FAILED
Container [pid=19183,containerID=container_1493858350369_0001_01_000007] is running beyond virtual memory limits. Current usage: 651.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000007 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 19187 19183 19183 19183 (java) 646 48 2581504000 166331 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000007/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_1 7 
        |- 19183 19181 19183 19183 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000007/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_1 7 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:42 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_2, Status : FAILED
Container [pid=19278,containerID=container_1493858350369_0001_01_000008] is running beyond virtual memory limits. Current usage: 636.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000008 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 19278 19276 19278 19278 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000008/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_2 8 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008/stderr  
        |- 19282 19278 19278 19278 (java) 693 47 2579701760 162501 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000008/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_2 8 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:44 INFO mapreduce.Job:  map 67% reduce 22%
17/05/04 09:41:49 INFO mapreduce.Job:  map 100% reduce 100%
17/05/04 09:41:49 INFO mapreduce.Job: Job job_1493858350369_0001 failed with state FAILED due to: Task failed task_1493858350369_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/05/04 09:41:50 INFO mapreduce.Job: Counters: 41
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=7666113
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4036044
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters 
                Failed map tasks=4
                Killed map tasks=1
                Killed reduce tasks=1
                Launched map tasks=6
                Launched reduce tasks=1
                Other local map tasks=3
                Data-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=28729
                Total time spent by all reduces in occupied slots (ms)=40216
                Total time spent by all map tasks (ms)=28729
                Total time spent by all reduce tasks (ms)=20108
                Total vcore-milliseconds taken by all map tasks=28729
                Total vcore-milliseconds taken by all reduce tasks=20108
                Total megabyte-milliseconds taken by all map tasks=29418496
                Total megabyte-milliseconds taken by all reduce tasks=41181184
        Map-Reduce Framework
                Map input records=29535
                Map output records=70611
                Map output bytes=7250146
                Map output materialized bytes=7382759
                Input split bytes=248
                Combine input records=70611
                Combine output records=67818
                Spilled Records=67818
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=111
                CPU time spent (ms)=4610
                Physical memory (bytes) snapshot=1259335680
                Virtual memory (bytes) snapshot=5179203584
                Total committed heap usage (bytes)=1168113664
        File Input Format Counters 
                Bytes Read=4035796
번호 제목 글쓴이 날짜 조회 수
437 갑자기 DataNode가 java.io.IOException: Premature EOF from inputStream를 반복적으로 발생시키다가 java.lang.OutOfMemoryError: Java heap space를 내면서 죽는 경우 조치방법 총관리자 2017.07.19 1662
436 Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation 총관리자 2017.07.18 892
435 HBase 설정 최적화하기(VCNC) file 총관리자 2017.07.18 120
434 HBase write 성능 튜닝 file 총관리자 2017.07.18 87
433 schema.xml vs managed-schema 지정 사용하기 - 두개를 동시에 사용할 수는 없음 총관리자 2017.07.09 153
432 halyard의 console스크립트에서 생성한 repository는 RDF4J Web Applications에서 공유가 되지 않는다. 총관리자 2017.07.05 45
431 halyard 1.3의 rdf4j-server.war와 rdf4j-workbench.war를 tomcat deploy후 조회시 java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/Cell발생시 조치사항 총관리자 2017.07.05 65
430 halyard 1.3을 다른 서버로 이전하는 방법 총관리자 2017.07.05 66
429 python test.py실행시 "ImportError: No module named pyspark" 혹은 "ImportError: No module named py4j.protocol"등의 오류 발생시 조치사항 총관리자 2017.07.04 761
428 solr명령 실행시 "Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect" 오류발생 총관리자 2017.06.30 202
427 mysql에서 외부 디비를 커넥션할 경우 접속 속도가 느려질때 총관리자 2017.06.30 1058
426 solr 6.2에 한글 형태소 분석기(arirang 6.x) 적용 및 테스트 file 총관리자 2017.06.27 873
425 elasticsearch 기동시 permission denied on key 'vm.max_map_count' 오류발생시 조치사항 총관리자 2017.06.23 431
424 http://blog.naver.com... 총관리자 2017.06.23 85
423 Not enough replica available for query at consistency QUORUM가 발생하는 경우 총관리자 2017.06.21 256
422 cassandra cluster 문제가 있는 node제거 하기(DN상태의 노드가 있으면 cassandra cluster 전체에 문제가 발생하므로 반드시 제거할것) 총관리자 2017.06.21 308
421 VPS에서는 root로 실행해도 swap파일을 만들지 못하게 만들어 두었지만 swap파일을 생성하는 방법 총관리자 2017.06.20 120
420 Ubuntu에서 sbt및 scala설치하기 총관리자 2017.06.20 118
419 lagom을 이용한 샘플 경매 프로그램 실행방법 총관리자 2017.06.20 181
418 원격 리포지토리에서 최초 clone시 Permission denied (publickey). 오류발생시 조치사항 총관리자 2017.06.20 871

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로