메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. application실행하면 "YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM"가 UI에 표시되면서 job이 hang이 걸리면서 진행이 되지 않는 경우가 있는데 그때는 yarn-site.xml의 "yarn.nodemanager.resource.memory-mb"의 값을 매우 크게 지정하여 appliction을 실행하면 hang은 걸리지 않으나 아래의 2번 항목의 문제가 발생한다. 
(yarn.nodemanager.vmem-check-enabled, yarn.nodemanager.vmem-pmem-ratio의 설정값이 없는 경우)

2. hadoop application을 실행할때 "Container [pid=19278,containerID=container_1493858350369_0001_01_000008] is running beyond virtual memory limits. Current usage: 636.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container." 같은 오류가 길생되면서 job이 실패하는 경우가 있는데 이것은 application이 가용한 가상메모리 보다 더 많은 메모리를 사용할때 발생하는 문제이다. 그래서 hadoop이 가상메모리 제한을 체크하지 않도록 하고 많은 메모리를 사용하는 appliction을 위해서 가상메모리와 물리메모리 비율을 높게 설정해 준다.(yarn-site.xml에 설정해줌)
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
  </property>

------------------------------------------오류내용-----------------------
root@gsda1:~/hadoop/etc/hadoop# yarn jar $HOME/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount in out-6
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/svc/apps/gsda/bin/hadoop/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/05/04 09:41:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/04 09:41:10 WARN ipc.Client: Failed to connect to server: gsda1/104.251.212.146:8032: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:681)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:777)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:409)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1542)
        at org.apache.hadoop.ipc.Client.call(Client.java:1373)
        at org.apache.hadoop.ipc.Client.call(Client.java:1337)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy13.getNewApplication(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
        at com.sun.proxy.$Proxy14.getNewApplication(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:242)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:250)
        at org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:193)
        at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:241)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:155)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
17/05/04 09:41:10 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
17/05/04 09:41:12 INFO input.FileInputFormat: Total input files to process : 3
17/05/04 09:41:12 INFO mapreduce.JobSubmitter: number of splits:3
17/05/04 09:41:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1493858350369_0001
17/05/04 09:41:13 INFO impl.YarnClientImpl: Submitted application application_1493858350369_0001
17/05/04 09:41:13 INFO mapreduce.Job: The url to track the job: http://gsda2:8088/proxy/application_1493858350369_0001/
17/05/04 09:41:13 INFO mapreduce.Job: Running job: job_1493858350369_0001
17/05/04 09:41:21 INFO mapreduce.Job: Job job_1493858350369_0001 running in uber mode : false
17/05/04 09:41:21 INFO mapreduce.Job:  map 0% reduce 0%
17/05/04 09:41:27 INFO mapreduce.Job:  map 33% reduce 0%
17/05/04 09:41:29 INFO mapreduce.Job:  map 67% reduce 0%
17/05/04 09:41:30 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_0, Status : FAILED
Container [pid=2161,containerID=container_1493858350369_0001_01_000002] is running beyond virtual memory limits. Current usage: 738.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000002 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 2165 2161 2161 2161 (java) 929 59 2580721664 188600 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_0 2 
        |- 2161 2159 2161 2161 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000002/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_0 2 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000002/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:36 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_1, Status : FAILED
Container [pid=19183,containerID=container_1493858350369_0001_01_000007] is running beyond virtual memory limits. Current usage: 651.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000007 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 19187 19183 19183 19183 (java) 646 48 2581504000 166331 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000007/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_1 7 
        |- 19183 19181 19183 19183 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000007/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_1 7 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000007/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:42 INFO mapreduce.Job: Task Id : attempt_1493858350369_0001_m_000000_2, Status : FAILED
Container [pid=19278,containerID=container_1493858350369_0001_01_000008] is running beyond virtual memory limits. Current usage: 636.1 MB of 1 GB physical memory used; 2.4 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1493858350369_0001_01_000008 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 19278 19276 19278 19278 (bash) 0 0 12861440 351 /bin/bash -c /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000008/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_2 8 1>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008/stdout 2>/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008/stderr  
        |- 19282 19278 19278 19278 (java) 693 47 2579701760 162501 /usr/lib/jvm/java-8-oracle/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx819m -Djava.io.tmpdir=/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1493858350369_0001/container_1493858350369_0001_01_000008/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/svc/apps/gsda/bin/hadoop/hadoop-2.8.0/logs/userlogs/application_1493858350369_0001/container_1493858350369_0001_01_000008 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 104.251.212.191 43986 attempt_1493858350369_0001_m_000000_2 8 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/05/04 09:41:44 INFO mapreduce.Job:  map 67% reduce 22%
17/05/04 09:41:49 INFO mapreduce.Job:  map 100% reduce 100%
17/05/04 09:41:49 INFO mapreduce.Job: Job job_1493858350369_0001 failed with state FAILED due to: Task failed task_1493858350369_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/05/04 09:41:50 INFO mapreduce.Job: Counters: 41
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=7666113
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4036044
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters 
                Failed map tasks=4
                Killed map tasks=1
                Killed reduce tasks=1
                Launched map tasks=6
                Launched reduce tasks=1
                Other local map tasks=3
                Data-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=28729
                Total time spent by all reduces in occupied slots (ms)=40216
                Total time spent by all map tasks (ms)=28729
                Total time spent by all reduce tasks (ms)=20108
                Total vcore-milliseconds taken by all map tasks=28729
                Total vcore-milliseconds taken by all reduce tasks=20108
                Total megabyte-milliseconds taken by all map tasks=29418496
                Total megabyte-milliseconds taken by all reduce tasks=41181184
        Map-Reduce Framework
                Map input records=29535
                Map output records=70611
                Map output bytes=7250146
                Map output materialized bytes=7382759
                Input split bytes=248
                Combine input records=70611
                Combine output records=67818
                Spilled Records=67818
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=111
                CPU time spent (ms)=4610
                Physical memory (bytes) snapshot=1259335680
                Virtual memory (bytes) snapshot=5179203584
                Total committed heap usage (bytes)=1168113664
        File Input Format Counters 
                Bytes Read=4035796
번호 제목 글쓴이 날짜 조회 수
65 missing block및 관련 파일명 찾는 명령어 총관리자 2021.02.20 14
64 W/F수행후 Logs not available for 1. Aggregation may not to complete. 표시되며 로그내용이 보이지 않은 경우 총관리자 2020.05.08 1723
63 A Cluster의 HDFS 디렉토리및 파일을 사용자및 권한 유지 하여 다운 받아서 B Cluster에 넣기 총관리자 2020.05.06 29
62 기준일자 이전의 hdfs 데이타를 지우는 shellscript 샘플 총관리자 2019.06.14 124
61 Error: java.lang.RuntimeException: java.lang.OutOfMemoryError 오류가 발생하는 경우 총관리자 2018.09.20 111
60 physical memory used되면서 mapper가 kill되는 경우 오류 발생시 조치 총관리자 2018.09.20 1241
59 resouce manager에 dr.who가 아닌 다른 사용자로 로그인 하기 총관리자 2018.06.28 118
58 hadoop 클러스터 실행 스크립트 정리 총관리자 2018.03.20 501
57 HA(Namenode, ResourceManager, Kerberos) 및 보안(Zookeeper, Hadoop) 총관리자 2018.03.16 67
56 [Decommission]시 시간이 많이 걸리면서(수일) Decommission이 완료되지 않는 경우 조치 총관리자 2018.01.03 182
55 [2.7.2] distribute-exclude.sh사용할때 ssh 포트변경에 따른 오류발생시 조치사항 총관리자 2018.01.02 32
54 hadoop cluster에 포함된 노드중에서 문제있는 decommission하는 방법및 절차 file 총관리자 2017.12.28 168
53 Hadoop 2.7.x에서 사용할 수 있는 파일/디렉토리 관련 util성 클래스 파일 총관리자 2017.09.28 35
52 editLog의 문제로 발생하는 journalnode 기동 오류 발생시 조치사항 총관리자 2017.09.14 112
51 hadoop cluster구성된 노드를 확인시 Capacity를 보면 색이 붉은색으로 표시되어 있는 경우나 Unhealthy인 경우 처리방법 총관리자 2017.08.30 32
50 Windows7 64bit 환경에서 Apache Hadoop 2.7.1설치하기 총관리자 2017.07.26 148
49 갑자기 DataNode가 java.io.IOException: Premature EOF from inputStream를 반복적으로 발생시키다가 java.lang.OutOfMemoryError: Java heap space를 내면서 죽는 경우 조치방법 총관리자 2017.07.19 557
» mapreduce appliction을 실행시 "is running beyond virtual memory limits" 오류 발생시 조치사항 총관리자 2017.05.04 2468
47 hadoop에서 yarn jar ..를 이용하여 appliction을 실행하여 정상적(?)으로 수행되었으나 yarn UI의 어플리케이션 목록에 나타나지 않는 문제 총관리자 2017.05.02 50
46 hadoop에서 yarn jar ..를 이용하여 appliction을 실행하여 정상적으로 수행되었으나 yarn UI의 어플리케이션 목록에 나타나지 않는 문제 총관리자 2017.05.02 90

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로