Notes on development and operations for Cloudera CDH/CDP, the Hadoop ecosystem, Semantic IoT, and related technologies. For inquiries, contact gooper@gooper.com.

Using countByWindow in Spark with Scala (example)

Administrator 2018.03.08 14:26 Views: 4838

import org.apache.spark.SparkContext
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.Seconds

object StreamingLogsMB {
  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: stubs.StreamingLogsMB <hostname> <port>")
      System.exit(1)
    }

    // Get hostname and port of data source from application arguments
    val hostname = args(0)
    val port = args(1).toInt

    // Create a Spark Context
    val sc = new SparkContext()

    // Set log level to ERROR to avoid distracting extra output
    sc.setLogLevel("ERROR")

    // Configure the Streaming Context with a 1 second batch duration
    val ssc = new StreamingContext(sc, Seconds(1))

    // Create a DStream of log data from the server and port specified
    val logs = ssc.socketTextStream(hostname, port)

    // countByWindow requires checkpointing to be enabled
    ssc.checkpoint("logcheckpt")

    // Every 2 seconds, print the number of lines received in the last 5 seconds
    logs.countByWindow(Seconds(5), Seconds(2)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
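To make the window arithmetic in countByWindow(Seconds(5), Seconds(2)) easier to follow, here is a minimal sketch in plain Scala with no Spark dependency. It is not part of the original example: the object name WindowCountSketch, the helper windowedCounts, and the per-batch counts are all hypothetical. It assumes each element of batchCounts is the number of lines received in one 1-second batch, and that a window which reaches back before the start of the stream simply sums the batches that exist.

```scala
object WindowCountSketch {
  // batchCounts(i): lines received in the i-th 1-second batch (made-up data).
  // window and slide are expressed in batches, e.g. 5 and 2 for
  // countByWindow(Seconds(5), Seconds(2)) with a 1-second batch duration.
  def windowedCounts(batchCounts: Seq[Long], window: Int, slide: Int): Seq[Long] =
    (slide to batchCounts.length by slide).map { end =>
      // Each result covers the last `window` batches ending at `end`,
      // truncated at the start of the stream.
      val start = math.max(0, end - window)
      batchCounts.slice(start, end).sum
    }

  def main(args: Array[String]): Unit = {
    val batchCounts = Seq(3L, 1L, 4L, 1L, 5L, 9L, 2L, 6L)
    // One value per slide interval: 4, 9, 20, 23
    windowedCounts(batchCounts, window = 5, slide = 2).foreach(println)
  }
}
```

Because the window (5 batches) is longer than the slide (2 batches), consecutive results overlap: each batch contributes to more than one printed count.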
