Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.

flume flume 1.5.2 설치및 테스트(source : file, sink : hdfs) in HA

총관리자 2015.05.21 17:14 조회 수 : 4414

1. flume설치파일 다운로드

apache-flume-1.5.2-bin.tar.gz

2. 압축풀기

tar xvfz apache-flume-1.5.2-bin.tar.gz

3. 링크생성

ln -s apache-flume-1.5.2-bin flume

4. 환경변수 수정(vi /home/hadoop/.bashrc)

export FLUME_HOME=/hadoop/flume

export PATH=$PATH:$FLUME_HOME/bin

* 변경사항 반영 : source /home/hadoop/.bashrc

5. Flume Conf

cd $FLUME_HOME/conf

cp flume-conf.properties.template flume.conf

vi flume.conf

agent.sources = seqGenSrc

agent.channels = memoryChannel

agent.sinks = hdfsSink

# For each one of the sources, the type is defined

agent.sources.seqGenSrc.type = exec

agent.sources.seqGenSrc.command = tail -F /home/bigdata/hadoop-1.2.1/logs/hadoop-hadoop-namenode-localhost.localdomain.log

#가상분산환경에서 테스트용으로 잡은것.

# The channel can be defined as follows.

agent.sources.seqGenSrc.channels = memoryChannel

# Each sink's type must be defined

agent.sinks.hdfsSink.type = hdfs

agent.sinks.hdfsSink.hdfs.path = hdfs://mycluster/flume/data #테스트용

agent.sinks.hdfsSink.rollInterval = 30

agent.sinks.hdfsSink.sink.batchSize = 100

#Specify the channel the sink should use

agent.sinks.hdfsSink.channel = memoryChannel

# Each channel's type is defined.

agent.channels.memoryChannel.type = memory

# Other config values specific to each type of channel(sink or source)

# can be defined as well

# In this case, it specifies the capacity of the memory channel

agent.channels.memoryChannel.capacity = 100000

agent.channels.memoryChannel.transactionCapacity = 10000

6. agent기동

[hadoop@master]$ flume-ng agent -conf-file ./flume.conf --name agent

Info: Including Hadoop libraries found via (/usr/local/flume/bin/hadoop) for HDFS access

Info: Excluding /usr/local/flume/share/usr/local/common/lib/slf4j-api-1.7.5.jar from classpath

Info: Excluding /usr/local/flume/share/usr/local/common/lib/slf4j-log4j12-1.7.5.jar from classpath

Info: Including HBASE libraries found via (/usr/local/hbase/bin/hbase) for HBASE access

Info: Excluding /usr/local/hbase/lib/slf4j-api-1.6.4.jar from classpath

Info: Excluding /usr/local/hbase/lib/slf4j-log4j12-1.6.4.jar from classpath

Info: Excluding /usr/local/flume/share/usr/local/common/lib/slf4j-api-1.7.5.jar from classpath

Info: Excluding /usr/local/flume/share/usr/local/common/lib/slf4j-log4j12-1.7.5.jar from classpath

.....

15/05/21 17:38:57 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting

15/05/21 17:38:57 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:./flume.conf

15/05/21 17:38:57 INFO conf.FlumeConfiguration: Processing:hdfsSink

15/05/21 17:38:57 INFO conf.FlumeConfiguration: Added sinks: hdfsSink Agent: agent

15/05/21 17:38:57 INFO conf.FlumeConfiguration: Processing:hdfsSink

15/05/21 17:38:57 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent]

15/05/21 17:38:57 INFO node.AbstractConfigurationProvider: Creating channels

15/05/21 17:38:57 INFO channel.DefaultChannelFactory: Creating instance of channel memoryChannel type memory

15/05/21 17:38:57 INFO node.AbstractConfigurationProvider: Created channel memoryChannel

15/05/21 17:38:57 INFO source.DefaultSourceFactory: Creating instance of source seqGenSrc, type exec

15/05/21 17:38:57 INFO sink.DefaultSinkFactory: Creating instance of sink: hdfsSink, type: hdfs

15/05/21 17:38:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

15/05/21 17:38:58 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false

15/05/21 17:38:58 INFO node.AbstractConfigurationProvider: Channel memoryChannel connected to [seqGenSrc, hdfsSink]

15/05/21 17:38:58 INFO node.Application: Starting new configuration:{ sourceRunners:{seqGenSrc=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:seqGenSrc,state:IDLE} }} sinkRunners:{hdfsSink=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3b201837 counterGroup:{ name:null counters:{} } }} channels:{memoryChannel=org.apache.flume.channel.MemoryChannel{name: memoryChannel}} }

15/05/21 17:38:58 INFO node.Application: Starting Channel memoryChannel

15/05/21 17:38:58 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: memoryChannel: Successfully registered new MBean.

15/05/21 17:38:58 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: memoryChannel started

15/05/21 17:38:58 INFO node.Application: Starting Sink hdfsSink

15/05/21 17:38:58 INFO node.Application: Starting Source seqGenSrc

7. hdfs확인

cat aaa >> /home/hadoop/test.log로 데이타를 넣고 아래 명령으로 확인해본다.

[hadoop@master]$ hadoop fs -lsr /flume

drwxr-xr-x - hadoop supergroup 0 2015-05-21 17:39 /flume/data

-rw-r--r-- 3 hadoop supergroup 208 2015-05-21 17:39 /flume/data/FlumeData.1432197542415

이 게시물을

이 글의 추천인 목록 목록

번호	제목	날짜	조회 수
210	python실행시 ValueError: zero length field name in format오류 해결방법	2016.05.27	3929
209	S2RDF 테스트(벤치마크 테스트를 기준으로 python, scala소스가 만들어져서 기능은 파악되지 못함) [1]	2016.05.27	2865
208	CentOS6에 python3.5.1 소스코드로 빌드하여 설치하기	2016.05.27	4113
207	RDF storage조합에대한 test결과(4store, Jena+HBase, Hive+HBase, CumulusRDF, Couchbase) 페이지 링크	2016.05.26	3018
206	spark-submit으로 spark application실행하는 다양한 방법	2016.05.25	4414
205	spark 온라인 책자링크 (제목 : mastering-apache-spark)	2016.05.25	4248
204	"Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"오류 발생시 조치사항	2016.05.25	4629
203	spark-env.sh에서 사용할 수있는 항목.	2016.05.24	4380
202	Spark 1.6.1 설치후 HA구성	2016.05.24	4533
201	spark-shell실행시 "A read-only user or a user in a read-only database is not permitted to disable read-only mode on a connection."오류가 발생하는 경우 해결방법	2016.05.20	2953
200	Master rejected startup because clock is out of sync 오류 해결방법	2016.05.03	5074
199	kafka broker기동시 brokerId가 달라서 기동에 실패하는 경우 조치방법	2016.05.02	5190
198	kafka 0.9.0.1 for scala 2.1.1 설치및 테스트	2016.05.02	4052
197	solr 인스턴스 기동후 shard에 서버가 정상적으로 할당되지 않는 경우 해결책	2016.04.29	4036
196	collection생성시 -shards와 -replicationFactor값을 잘못지정하면 write.lock for client xxx.xxx.xxx.xxx already exists오류가 발생한다.	2016.04.28	3436
195	solrdf초기 기동시 "Caused by: java.lang.IllegalAccessError: tried to access field org.apache.solr.handler.RequestHandlerBase.log from class org.gazzax.labs.solrdf.handler.update.RdfUpdateRequestHandler" 오류가 발생시 조치사항	2016.04.22	4139
194	solrcloud에 solrdf1.1설치하고 테스트 하기	2016.04.22	3287
193	Spark 2.1.1 clustering(5대) 설치(YARN기반)	2016.04.22	4961
192	Hadoop 완벽 가이드 정리된 링크	2016.04.19	4354
191	cumulusRDF 1.0.1설치및 "KeyspaceCumulus" keyspace확인하기	2016.04.15	8731

쓰기 태그

첫 페이지 23 24 25 26 27 28 29 30 31 32 끝 페이지

Cloudera, BigData, Semantic IoT, Hadoop, NoSQL

Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.

flume flume 1.5.2 설치및 테스트(source : file, sink : hdfs) in HA

댓글 0

Cloudera, BigData, Semantic IoT, Hadoop, NoSQL

Cloudera CDH/CDP 및 Hadoop EcoSystem, Semantic IoT등의 개발/운영 기술을 정리합니다. gooper@gooper.com로 문의 주세요.

flume flume 1.5.2 설치및 테스트(source : file, sink : hdfs) in HA

댓글 0

LOGIN