메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


StringBuffer의 값을 toString()을 이용하여 문자열로 변환할때 "java.lang.OutOfMemoryError: Java heap space"가 발생하는데 이것은 StringBuffer.toString()하는 과정에서 값을 복사하는데 이때 heap메모리가 부족해서 발생하는 오류이다.

이때는 spark-submit에서 --driver-memory 5g처럼 지정하는 메모리를 크게 증가시켜서 -Xmx값을 증가시켜준다.


------------------오류내용------------------------

[2018-02-01 10:12:40,253] [internal.Logging$class] [logError(#70)] [ERROR] Task 0 in stage 20.0 failed 1 times; aborting job
[2018-02-01 10:12:40,267] [internal.Logging$class] [logError(#91)] [ERROR] Error running job streaming job 1517447030000 ms.0
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 20.0 failed 1 times, most recent failure: Lost task 0.0 in stage 20.0 (TID 20, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3664)
        at java.lang.StringBuffer.toString(StringBuffer.java:671)
        at com.pineone.icbms.sda.sf.TripleService.sendTripleFileToHalyard(TripleService.java:500)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe.sendTriples(AvroOneM2MDataSparkSubscribe.java:296)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe.access$100(AvroOneM2MDataSparkSubscribe.java:34)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$ConsumerT.go(AvroOneM2MDataSparkSubscribe.java:202)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$1.call(AvroOneM2MDataSparkSubscribe.java:101)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$1.call(AvroOneM2MDataSparkSubscribe.java:93)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:393)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
        at scala.collection.AbstractIterator.to(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1925)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1938)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1951)
        at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
        at org.apache.spark.rdd.RDD.take(RDD.scala:1327)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$3$1.apply(DStream.scala:734)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$print$2$$anonfun$foreachFunc$3$1.apply(DStream.scala:733)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
        at scala.util.Try$.apply(Try.scala:192)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:256)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:256)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:256)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:255)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3664)
        at java.lang.StringBuffer.toString(StringBuffer.java:671)
        at com.pineone.icbms.sda.sf.TripleService.sendTripleFileToHalyard(TripleService.java:500)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe.sendTriples(AvroOneM2MDataSparkSubscribe.java:296)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe.access$100(AvroOneM2MDataSparkSubscribe.java:34)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$ConsumerT.go(AvroOneM2MDataSparkSubscribe.java:202)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$1.call(AvroOneM2MDataSparkSubscribe.java:101)
        at com.pineone.icbms.sda.kafka.onem2m.AvroOneM2MDataSparkSubscribe$1.call(AvroOneM2MDataSparkSubscribe.java:93)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:393)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
        at scala.collection.AbstractIterator.to(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$29.apply(RDD.scala:1354)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1951)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
        ... 3 more

번호 제목 글쓴이 날짜 조회 수
620 hbase CustomFilter만들기 (0.98.X이상) 총관리자 2015.05.08 162
619 hbase가 기동시키는 zookeeper에서 받아드리는 ip가 IPv6로 사용되는 경우가 있는데 이를 IPv4로 강제적용하는 방법 총관리자 2015.05.08 267
618 secureCRT에서 backspace키가 작동하지 않는 경우 해결방법 총관리자 2015.05.11 716
617 Nodes of the cluster (unhealthy)중 1/1 log-dirs are bad: 오류 해결방법 총관리자 2015.05.17 599
616 Permission denied: user=hadoop, access=EXECUTE, inode="/tmp":root:supergroup:drwxrwx--- 오류해결방법 총관리자 2015.05.17 412
615 java.lang.ClassNotFoundException: org.apache.hadoop.util.ShutdownHookManager 오류조치사항 총관리자 2015.05.20 573
614 flume 1.5.2 설치및 테스트(source : file, sink : hdfs) in HA 총관리자 2015.05.21 1415
613 hbase shell 필드 검색 방법 총관리자 2015.05.24 1900
612 HAX is not working and emulator runs in emulation mode 메세지가 나오는 경우 file 총관리자 2015.05.25 159
611 apk 파일 위치 file 총관리자 2015.05.25 2227
610 센서테스트 file 총관리자 2015.05.25 167
609 Error: Could not find or load main class nodemnager 가 발생할때 해결하는 방법 총관리자 2015.06.05 426
608 Error: E0501 : E0501: Could not perform authorization operation, User: hadoop is not allowed to impersonate hadoop 해결하는 방법 총관리자 2015.06.07 385
607 "File /user/hadoop/share/lib does not exist" 오류 해결방법 총관리자 2015.06.07 655
606 hadoop 2.6.0에 sqoop2 (1.99.5) server및 client설치 == fail 총관리자 2015.06.11 1770
605 Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.http.HttpConfig.getSchemePrefix()Ljava/lang/String; 해결->실패 총관리자 2015.06.14 402
604 hortonworks에서 제공하는 메모리 설정값 계산기 사용법 file 총관리자 2015.06.14 719
603 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable to deserialize reduce input key from...오류해결방법 총관리자 2015.06.16 1699
602 Tracking URL = N/A 가발생하는 경우 - 환경설정값을 잘못설정하는 경우에 발생함 총관리자 2015.06.17 423
601 바나나 파이의 /tmp폴더를 외장하드로 변경하기 총관리자 2015.07.24 136

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로