
Bigdata, Semantic IoT, Hadoop, NoSQL

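This is where I organize what I have learned while working on projects involving Bigdata, the Hadoop ecosystem, Semantic IoT, and so on.
It is published for anyone who needs it. Please send questions by email to gooper@gooper.com.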


Miscellaneous: Installing Hadoop 2.0.5 on Ubuntu

Posted by 총관리자 (site administrator), 2013.12.16 22:09

Source: http://www.spikyjohn.com/cribsheets/20130609_hadoopinstall.html

 

Just the command lines to get Hadoop 2 installed on Ubuntu. These are all cribbed from the source notes listed below, and I am preserving them here for my own benefit so I can quickly repeat what I did. Note that many of these instructions are also in the main Hadoop docs from Apache.

Source material

Use Michael Noll's guide for Hadoop version 1 and the ssh setup
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://hadoop.apache.org/docs/r1.1.2/single_node_setup.html

Or this one for Hadoop 2
http://jugnu-life.blogspot.com/2012/05/hadoop-20-install-tutorial-023x.html
http://hadoop.apache.org/docs/r2.0.5-alpha/

Create the hadoop user and ssh

sudo apt-get install openssh-server openssh-client

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
su - hduser

If you cannot ssh to localhost without a passphrase, execute the following commands:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Test your SSH setup
ssh localhost
Answer yes when asked whether to accept the host key, then leave the session:
exit
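
The interactive test above can also be done non-interactively. The following is a minimal sketch, assuming the OpenSSH client installed earlier; the function name can_ssh_without_password is made up for illustration. It should be run after the host key has been accepted once, because in batch mode ssh cannot ask that question either.

```shell
# Hypothetical helper: succeeds only if key-based login to the given host works.
# BatchMode=yes disables password prompts, so a missing key is a failure
# rather than a hang; ConnectTimeout bounds the wait in seconds.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Usage (run as hduser, after accepting the host key once):
#   can_ssh_without_password localhost && echo "passwordless ssh OK"
```

If the check fails, the ssh-keygen and cat commands above are the first thing to revisit.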

Get hadoop all set up

As hduser, after downloading the tarball (into /home/hduser, to match the HADOOP_PREFIX set below):

tar -xvf hadoop-2.0.5-alpha.tar.gz
ln -s hadoop-2.0.5-alpha hadoop
# Append the following lines to ~/.bashrc (adjust JAVA_HOME to the JDK installed on your machine)
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_21/
export HADOOP_PREFIX="/home/hduser/hadoop"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin

export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}

The rest is taken entirely from JJ's tutorial, with the paths changed for my Ubuntu machine:
http://jugnu-life.blogspot.com/2012/05/hadoop-20-install-tutorial-023x.html
Please visit his blog.

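Log in again so that bash picks up the paths above. In Hadoop 2.x the default conf directory is etc/hadoop inside the installation directory (that is, $HADOOP_PREFIX/etc/hadoop, not the system-wide /etc/hadoop). We need to modify or create the following property files in that directory.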
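
After logging in again, it can help to confirm that the exports really reached the shell. This is only a sketch: the function name missing_hadoop_vars is invented here, and the variable list simply mirrors the export lines above.

```shell
# Hypothetical helper: prints the name of every expected variable that is
# unset or empty in the current shell; prints nothing when all are set.
missing_hadoop_vars() {
    for name in JAVA_HOME HADOOP_PREFIX HADOOP_MAPRED_HOME \
                HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME; do
        eval "value=\${$name:-}"   # POSIX-compatible indirect lookup
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Usage: run in the new login shell.
#   missing_hadoop_vars
```

Any name it prints points at a line in .bashrc that is missing or was not sourced.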

cd ~
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/dfs/name
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/dfs/data
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/mapred/system
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/mapred/local

Edit core-site.xml so that it has the following contents

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
    <final>true</final>
  </property>
</configuration>
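
To double-check what ended up in a config file after editing, a value can be read back from the shell. This is a rough sketch, not a real XML parser: the helper name get_property is invented, and it assumes each <name> and <value> element sits on a single line, as in the files shown in this guide.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Plain text matching, not a real XML parser: it
# expects <name> and <value> each to sit on a single line, as in this guide.
get_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1 }
        found && match($0, /<value>.*<\/value>/) {
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Usage:
#   get_property "$HADOOP_PREFIX/etc/hadoop/core-site.xml" fs.default.name
```

The same call works for the other property files by changing the file name and key.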

Edit hdfs-site.xml so that it has the following contents

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/name</value>
    <description>Determines where on the local filesystem the DFS name node
    should store the name table. If this is a comma-delimited list of
    directories then the name table is replicated in all of the directories,
    for redundancy.</description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/data</value>
    <description>Determines where on the local filesystem a DFS data node
    should store its blocks. If this is a comma-delimited list of directories,
    then data will be stored in all named directories, typically on different
    devices. Directories that do not exist are ignored.</description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

The paths
file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/name and
file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/data
are directories on your machine that provide space for the name table and edit files, and for the data blocks, respectively.

Each path must be specified as a URI.

Create a file mapred-site.xml inside $HADOOP_PREFIX/etc/hadoop with the following contents

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <property>
    <name>mapred.system.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/system</value>
    <final>true</final>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/local</value>
    <final>true</final>
  </property>
</configuration>

The paths
file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/system and
file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/local
are directories on your machine that provide space for MapReduce system and local data.

Each path must be specified as a URI.

Edit yarn-site.xml so that it has the following contents

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

Format the namenode

# hdfs namenode -format

Answer Y if it asks for confirmation, and let the format complete
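
One way to confirm that the format took effect is to look for the VERSION file that a format writes under current/ in the name directory. A small sketch, with the helper name name_dir_is_formatted made up for illustration:

```shell
# Hypothetical helper: succeeds if the given NameNode directory looks formatted,
# i.e. it contains the current/VERSION file written by a format.
name_dir_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Usage, with the dfs.namenode.name.dir path from hdfs-site.xml:
#   name_dir_is_formatted /home/hduser/workspace/hadoop_space/hadoop23/dfs/name \
#       && echo "namenode directory is formatted"
```

Note that the argument is the plain directory path, without the file: prefix used in the XML.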

Time to start the daemons

# hadoop-daemon.sh start namenode
# hadoop-daemon.sh start datanode

You can also start both of them together by

# start-dfs.sh

Start Yarn Daemons

# yarn-daemon.sh start resourcemanager
# yarn-daemon.sh start nodemanager

You can also start all yarn daemons together by

# start-yarn.sh

Time to check if Daemons have started

Enter the command

# jps
2539 NameNode
2744 NodeManager
3075 Jps
3030 DataNode
2691 ResourceManager
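
Rather than reading the list by eye, the jps output can be checked for the four daemons started above. This is a sketch only; the function name missing_daemons is invented, and the expected names are taken from the sample output.

```shell
# Hypothetical helper: reads `jps` output on stdin and prints the name of each
# expected daemon that is absent; prints nothing when all four are running.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        # Compare the second column exactly, so SecondaryNameNode does not
        # count as NameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            printf '%s\n' "$daemon"
    done
}

# Usage:
#   jps | missing_daemons
```

If a daemon is listed as missing, its log file under the logs directory of the Hadoop installation is the place to look.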

Time to launch UI

Open http://localhost:8088 in a browser to see the ResourceManager page
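
On a machine without a browser, the same page can be probed from the shell. A minimal sketch, assuming curl is installed (it is not part of the steps above); the function name ui_is_up is made up for illustration.

```shell
# Hypothetical helper: succeeds if an HTTP request to the URL gets a non-error
# response. -s silences progress output, -f turns HTTP errors (status 400 and
# above) into a non-zero exit, --max-time bounds the whole request in seconds.
ui_is_up() {
    curl -s -f -o /dev/null --max-time 5 "$1"
}

# Usage:
#   ui_is_up http://localhost:8088/ && echo "ResourceManager UI is reachable"
```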

Done :)

Happy Hadooping :)
