메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. 다운로드

http://archive.apache.org/dist/sqoop/1.99.5/ 에서 hadoop2는 hadoop200이 붙은 tar.gz파일을 다운로드한다.

(sqoop-1.99.5-bin-hadoop200.tar.gz)


2. /usr/local로 이동후 압축풀기

mv sqoop-1.99.5-bin-hadoop200.tar.gz /usr/local

tar xvfz sqoop-1.99.5-bin-hadoop200.tar.gz


3. link 생성

ln -s sqoop-1.99.5-bin-hadoop200 sqoop


4. catalina.properties 설정

hadoop1 혹은 hadoop2이 어디에 설치되어있던 상관없이 hadoop library파일과 configuration파일이 사용가능한 곳에 있어야 한다.

vi /usr/local/sqoop/server/conf/catalina.properties

내용중 common.loader내용을 현재 상황에 맞게 jar파일위치를 지정해준다.

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/usr/local/hadoop/share/usr/local/common/*.jar,/usr/local/hadoop/share/usr/local/hdfs/*.jar,/usr/local/hadoop/share/usr/local/mapreduce/*.jar,/usr/local/hadoop/share/usr/local/yarn/*.jar,/usr/local/hive/lib/*.jar,/usr/local/hadoop/share/usr/local/common/lib/*.jar


5. jdbc driver설정

mkdir /usr/local/sqoop/lib

cp postgresql-9.3-1103.jdbc4.jar /usr/local/sqoop/lib


6. server설정 확인

sqoop2-tool verify하면 "Caused by: java.sql.SQLNonTransientConnectionException: No current connection."메세지가 나올수 있는데

무시하고 진행하면 된다.(Verification was successful.라는 메세지가 보이므로..)


--->정상메세지

Verification was successful.

Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly


-- 오류메세지

sqoop2-tool verify

Sqoop home directory: /usr/lib/sqoop

Setting SQOOP_HTTP_PORT:     12000

Setting SQOOP_ADMIN_PORT:     12001

Using   CATALINA_OPTS:       

Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001

Mar 14, 2015 11:02:53 PM org.apache.catalina.startup.ClassLoaderFactory validateFile

WARNING: Problem with directory [/usr/lib/sqoop/lib], exists: [false], isDirectory: [false], canRead: [false]

Sqoop tool executor:

    Version: 1.99.5

    Revision: 9665c01f674d69d41a6fcfffb2c0b94590f70f59

    Compiled on Wed Feb 18 09:42:27 PST 2015 by vbasavaraj

Running tool: class org.apache.sqoop.tools.tool.VerifyTool

2015-03-14 23:02:54,477 INFO  [main] core.SqoopServer (SqoopServer.java:initialize(51)) - Booting up Sqoop server

2015-03-14 23:02:54,493 INFO  [main] core.PropertiesConfigurationProvider (PropertiesConfigurationProvider.java:initialize(96)) - Starting config file poller thread

log4j: Parsing for [root] with value=[WARN, file].

log4j: Level token is [WARN].

log4j: Category root set to WARN

log4j: Parsing appender named "file".

log4j: Parsing layout options for "file".

log4j: Setting property [conversionPattern] to [%d{ISO8601} %-5p %c{2} [%l] %m%n].

log4j: End of parsing for "file".

log4j: Setting property [file] to [@LOGDIR@/sqoop.log].

log4j: Setting property [maxBackupIndex] to [5].

log4j: Setting property [maxFileSize] to [25MB].

log4j: setFile called: @LOGDIR@/sqoop.log, true

log4j: setFile ended

log4j: Parsed "file" options.

log4j: Parsing for [org.apache.sqoop] with value=[DEBUG].

log4j: Level token is [DEBUG].

log4j: Category org.apache.sqoop set to DEBUG

log4j: Handling log4j.additivity.org.apache.sqoop=[null]

log4j: Parsing for [org.apache.derby] with value=[INFO].

log4j: Level token is [INFO].

log4j: Category org.apache.derby set to INFO

log4j: Handling log4j.additivity.org.apache.derby=[null]

log4j: Finished configuring.

log4j: Could not find root logger information. Is this OK?

log4j: Parsing for [default] with value=[INFO,defaultAppender].

log4j: Level token is [INFO].

log4j: Category default set to INFO

log4j: Parsing appender named "defaultAppender".

log4j: Parsing layout options for "defaultAppender".

log4j: Setting property [conversionPattern] to [%d %-5p %c: %m%n].

log4j: End of parsing for "defaultAppender".

log4j: Setting property [file] to [@LOGDIR@/default.audit].

log4j: setFile called: @LOGDIR@/default.audit, true

log4j: setFile ended

log4j: Parsed "defaultAppender" options.

log4j: Handling log4j.additivity.default=[null]

log4j: Finished configuring.

Exception in thread "PurgeThread" org.apache.sqoop.common.SqoopException: JDBCREPO_0009:Failed to finalize transaction

    at org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:115)

    at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:109)

    at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:61)

    at org.apache.sqoop.repository.JdbcRepository.purgeSubmissions(JdbcRepository.java:589)

    at org.apache.sqoop.driver.JobManager$PurgeThread.run(JobManager.java:648)

Caused by: java.sql.SQLNonTransientConnectionException: No current connection.

    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)

    at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)

    at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)

    at org.apache.derby.impl.jdbc.Util.noCurrentConnection(Unknown Source)

    at org.apache.derby.impl.jdbc.EmbedConnection.checkIfClosed(Unknown Source)

    at org.apache.derby.impl.jdbc.EmbedConnection.setupContextStack(Unknown Source)

    at org.apache.derby.impl.jdbc.EmbedConnection.commit(Unknown Source)

    at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)

    at org.apache.commons.dbcp.DelegatingConnection.commit(DelegatingConnection.java:334)

    at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.commit(PoolingDataSource.java:211)

    at org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:112)

    ... 4 more

Caused by: java.sql.SQLException: No current connection.

    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)

    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)

    ... 15 more

Verification was successful.

Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.

hduser@slave:/usr/lib/sqoop/bin$ hadoop version

Hadoop 2.4.1

Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1604318

Compiled by jenkins on 2014-06-21T05:43Z

Compiled with protoc 2.5.0

From source with checksum bb7ac0a3c73dc131f4844b873c74b630

This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.4.1.jar


7. server기등(실패)

bin>sqoop.sh server start

Sqoop home directory: /usr/local/sqoop

Setting SQOOP_HTTP_PORT:     12000

Setting SQOOP_ADMIN_PORT:     12001

Using   CATALINA_OPTS:

Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001

Using CATALINA_BASE:   /usr/local/sqoop/server

Using CATALINA_HOME:   /usr/local/sqoop/server

Using CATALINA_TMPDIR: /usr/local/sqoop/server/temp

Using JRE_HOME:        /usr/java/latest

Using CLASSPATH:       /usr/local/sqoop/server/bin/bootstrap.jar


(서버stop : bin>sqoop.sh -server stop)

* default로 12000, 12001을 사용하는데 필요시 server/bin/setenv.sh에서 

SQOOP_HTTP_PORT, SQOOP_ADMIN_PORT에서 변경가능함


=============아래와 같은 오류(로그폴더 bin>@LOGDIR@)가 발생하면서 sqoop2 server가 기동되지 않는다.=====>

*org.apache.sqoop.common.SqoopException: CONN_0007:Connector registration
failed*
* at
org.apache.sqoop.connector.ConnectorManager.registerConnectors(ConnectorManager.java:236)*
* at
org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:197)*
* at
org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:145)*


*Caused by: org.apache.sqoop.common.SqoopException: JDBCREPO_0013:Connector
metadata changed - upgrade may be required - Connector:
generic-jdbc-connector given:
connector-generic-jdbc-connector:-1:org.apache.sqoop.connector.jdbc.GenericJdbcConnector,
Connection: Forms:
form-connection:-1:[input-connection.jdbcDriver:-1:STRING:128,
input-connection.connectionString:-1:STRING:128,
input-connection.username:-1:STRING:40,
input-connection.password:-1:STRING:40,
input-connection.jdbcProperties:-1:MAP]Job type: EXPORTForms:
form-table:-1:[input-table.schemaName:-1:STRING:50,
input-table.tableName:-1:STRING:2000, input-table.sql:-1:STRING:50,
input-table.columns:-1:STRING:50]Job type: IMPORTForms:
form-table:-1:[input-table.schemaName:-1:STRING:50,
input-table.tableName:-1:STRING:50, input-table.sql:-1:STRING:2000,
input-table.columns:-1:STRING:50, input-table.partitionColumn:-1:STRING:50,
input-table.partitionColumnNull:-1:BOOLEAN,
input-table.boundaryQuery:-1:STRING:50] found:
connector-generic-jdbc-connector:1:org.apache.sqoop.connector.jdbc.GenericJdbcConnector,
Connection: Forms:
form-connection:1:[input-connection.jdbcDriver:1:STRING:128,
input-connection.connectionString:2:STRING:128,
input-connection.username:3:STRING:40,
input-connection.password:4:STRING:40,
input-connection.jdbcProperties:5:MAP]Job type: EXPORTForms:
form-table:2:[input-table.schemaName:6:STRING:50,
input-table.tableName:7:STRING:2000, input-table.sql:8:STRING:50,
input-table.columns:9:STRING:50, input-table.stageTableName:10:STRING:2000,
input-table.clearStageTable:11:BOOLEAN]Job type: IMPORTForms:
form-table:3:[input-table.schemaName:12:STRING:50,
input-table.tableName:13:STRING:50, input-table.sql:14:STRING:2000,
input-table.columns:15:STRING:50, input-table.partitionColumn:16:STRING:50,
input-table.partitionColumnNull:17:BOOLEAN,
input-table.boundaryQuery:18:STRING:50]*
* at
org.apache.sqoop.repository.JdbcRepository$3.doIt(JdbcRepository.java:195)*


그래서 sqoop.properties파일에 

org.apache.sqoop.connector.autoupgrade=true
org.apache.sqoop.framework.autoupgrade=true

를 설정하여도 마찬가지로 기동되지 않는다.....===> failed


8. client기동

bin>sqoop.sh client를 실행하면 된다.

sqoop:000>show connector하면 오류메세지가 나온다.


============> 그지 같은 SQOOP2 <===================



---------sqoop.properties---------------------

#

# Licensed to the Apache Software Foundation (ASF) under one or more

# contributor license agreements.  See the NOTICE file distributed with

# this work for additional information regarding copyright ownership.

# The ASF licenses this file to You under the Apache License, Version 2.0

# (the "License"); you may not use this file except in compliance with

# the License.  You may obtain a copy of the License at

#

#     http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

#


#

# Sqoop configuration file used by the built in configuration

# provider: org.apache.sqoop.core.PropertiesConfigurationProvider.

# This file must reside in the system configuration directory

# which is specified by the system property "sqoop.config.dir"

# and must be called sqoop.properties.

#

# NOTE: Tokens specified in this file that are marked by a

# leading and trailing '@' characters should be replaced by

# their appropriate values. For example, the token @LOGDIR@

# should be replaced  appropriately.

#

# The following tokens are used in this configuration file:

#

# LOGDIR

#   The absolute path to the directory where system genearated

#   log files will be kept.

#

# BASEDIR

#   The absolute path to the directory where Sqoop 2 is installed

#


#

# Logging Configuration

# Any property that starts with the prefix

# org.apache.sqoop.log4j is parsed out by the configuration

# system and passed to the log4j subsystem. This allows you

# to specify log4j configuration properties from within the

# Sqoop configuration.

#

org.apache.sqoop.log4j.appender.file=org.apache.log4j.RollingFileAppender

org.apache.sqoop.log4j.appender.file.File=@LOGDIR@/sqoop.log

org.apache.sqoop.log4j.appender.file.MaxFileSize=25MB

org.apache.sqoop.log4j.appender.file.MaxBackupIndex=5

org.apache.sqoop.log4j.appender.file.layout=org.apache.log4j.PatternLayout

org.apache.sqoop.log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} [%l] %m%n

org.apache.sqoop.log4j.debug=true

org.apache.sqoop.log4j.rootCategory=WARN, file

org.apache.sqoop.log4j.category.org.apache.sqoop=DEBUG

org.apache.sqoop.log4j.category.org.apache.derby=INFO


#

# Audit Loggers Configuration

# Multiple audit loggers could be given here. To specify an

# audit logger, you should at least add org.apache.sqoop.

# auditlogger.[LoggerName].class. You could also provide

# more configuration options by using org.apache.sqoop.

# auditlogger.[LoggerName] prefix, then all these options

# are parsed to the logger class.

#

org.apache.sqoop.auditlogger.default.class=org.apache.sqoop.audit.FileAuditLogger

org.apache.sqoop.auditlogger.default.file=@LOGDIR@/default.audit


#

# Repository configuration

# The Repository subsystem provides the special prefix which

# is "org.apache.sqoop.repository.sysprop". Any property that

# is specified with this prefix is parsed out and set as a

# system property. For example, if the built in Derby repository

# is being used, the sysprop prefixed properties can be used

# to affect Derby configuration at startup time by setting

# the appropriate system properties.

#


# Repository provider

org.apache.sqoop.repository.provider=org.apache.sqoop.repository.JdbcRepositoryProvider


# Repository upgrade

# If set to true, it will not upgrade the sqoop respository schema, by default it will iniate the upgrade on server start-up

#org.apache.sqoop.repository.schema.immutable=false

org.apache.sqoop.repository.schema.immutable=true


# JDBC repository provider configuration

#org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.derby.DerbyRepositoryHandler

#org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED

#org.apache.sqoop.repository.jdbc.maximum.connections=10

#org.apache.sqoop.repository.jdbc.url=jdbc:derby:@BASEDIR@/repository/db;create=true

#org.apache.sqoop.repository.jdbc.driver=org.apache.derby.jdbc.EmbeddedDriver

#org.apache.sqoop.repository.jdbc.user=sa

#org.apache.sqoop.repository.jdbc.password=


# JDBC repository provider configuration

org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.postgresql.PostgresqlRepositoryHandler

org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED

org.apache.sqoop.repository.jdbc.maximum.connections=10

org.apache.sqoop.repository.jdbc.url=jdbc:postgresql://postgres.dbserver.com:5432/sqoopdb

org.apache.sqoop.repository.jdbc.create.schema=true

org.apache.sqoop.repository.jdbc.driver=org.postgresql.Driver

#org.apache.sqoop.repository.jdbc.driver=org.apache.postgresql.jdbc.EmbeddedDriver

org.apache.sqoop.repository.jdbc.user=sqoop

org.apache.sqoop.repository.jdbc.password=sqoop


# System properties for embedded Derby configuration

org.apache.sqoop.repository.sysprop.derby.stream.error.file=@LOGDIR@/derbyrepo.log


#

# Sqoop Connector configuration

# If set to true will initiate Connectors config upgrade during server startup

#

org.apache.sqoop.connector.autoupgrade=true

#

# Sqoop Driver configuration

# If set to true will initiate the Driver config upgrade during server startup

#

org.apache.sqoop.driver.autoupgrade=true


# Sleeping period for reloading configuration file (once a minute)

org.apache.sqoop.core.configuration.provider.properties.sleep=60000


#

# Submission engine configuration

#


# Submission engine class

org.apache.sqoop.submission.engine=org.apache.sqoop.submission.mapreduce.MapreduceSubmissionEngine


# Number of milliseconds, submissions created before this limit will be removed, default is one day

#org.apache.sqoop.submission.purge.threshold=


# Number of milliseconds for purge thread to sleep, by default one day

#org.apache.sqoop.submission.purge.sleep=


# Number of milliseconds for update thread to sleep, by default 5 minutes

#org.apache.sqoop.submission.update.sleep=


#

# Configuration for Mapreduce submission engine (applicable if it's configured)

#


# Hadoop configuration directory

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop/


#

# Execution engine configuration

#

org.apache.sqoop.execution.engine=org.apache.sqoop.execution.mapreduce.MapreduceExecutionEngine


#

# Authentication configuration

#

#org.apache.sqoop.security.authentication.type=SIMPLE

#org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler

#org.apache.sqoop.security.authentication.anonymous=true

#org.apache.sqoop.security.authentication.type=KERBEROS

#org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.KerberosAuthenticationHandler

#org.apache.sqoop.security.authentication.kerberos.principal=sqoop/_HOST@NOVALOCAL

#org.apache.sqoop.security.authentication.kerberos.keytab=/home/kerberos/sqoop.keytab

#org.apache.sqoop.security.authentication.kerberos.http.principal=HTTP/_HOST@NOVALOCAL

#org.apache.sqoop.security.authentication.kerberos.http.keytab=/home/kerberos/sqoop.keytab

#org.apache.sqoop.security.authentication.enable.doAs=true

#org.apache.sqoop.security.authentication.proxyuser.#USER#.users=*

#org.apache.sqoop.security.authentication.proxyuser.#USER#.groups=*

#org.apache.sqoop.security.authentication.proxyuser.#USER#.hosts=*


#

# Authorization configuration

#

#org.apache.sqoop.security.authorization.handler=org.apache.sqoop.security.authorization.DefaultAuthorizationHandler

#org.apache.sqoop.security.authorization.access_controller=org.apache.sqoop.security.authorization.DefaultAuthorizationAccessController

#org.apache.sqoop.security.authorization.validator=org.apache.sqoop.security.authorization.DefaultAuthorizationValidator

#org.apache.sqoop.security.authorization.authentication_provider=org.apache.sqoop.security.authorization.DefaultAuthenticationProvider

#org.apache.sqoop.security.authorization.server_name=SqoopServer1


# External connectors load path

# "/path/to/external/connectors/": Add all the connector JARs in the specified folder

#

org.apache.sqoop.connector.external.loadpath=


번호 제목 글쓴이 날짜 조회 수
18 import 혹은 export할때 hive파일의 default 구분자는 --input-fields-terminated-by "x01"와 같이 지정해야함 총관리자 2014.05.20 4244
17 sqoop작업시 hdfs의 개수보다 더많은 값이 중복되어 oracle에 입력되는 경우가 있음 총관리자 2014.09.02 4093
16 sqoop 1.4.4 설치및 테스트 총관리자 2014.04.21 3134
» hadoop 2.6.0에 sqoop2 (1.99.5) server및 client설치 == fail 총관리자 2015.06.11 1770
14 sqoop에서 oracle관련 작업할때 테이블명, 사용자명, DB명은 모두 대문자로 사용할것 총관리자 2014.05.15 1527
13 oozie 에서 sqoop action실행 에러 - 컬럼개수 차이 총관리자 2014.07.17 1002
12 sqoop export/import등을 할때 driver를 못찾는 오류가 발생하면... 총관리자 2014.05.15 863
11 sqoop으로 mariadb에 접근해서 hive 테이블로 자동으로 생성하기 총관리자 2018.08.03 670
10 java.util.NoSuchElementException발생시 조치 총관리자 2014.08.27 476
9 컬럼및 라인의 구분자를 지정하여 sqoop으로 데이타를 가져오고 hive테이블을 생성하는 명령문 총관리자 2018.08.03 418
8 Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.http.HttpConfig.getSchemePrefix()Ljava/lang/String; 해결->실패 총관리자 2015.06.14 402
7 [sqoop] mapper를 2이상으로 설정하기 위한 split-by컬럼을 찾을때 유용하게 활용할 수 있는 쿼리 총관리자 2020.05.13 329
6 [sap] Error: java.io.IOException: SQLException in nextKeyValue 오류 발생 총관리자 2020.06.08 266
5 Oracle 12c DB의 LOB타입 컬럼이 있는 테이블을 import할 때 주의 할 사항 gooper 2022.09.14 157
4 oracle 12에 sqoop해서 데이터 import하기 (console에서 sqoop import하는 방법) 총관리자 2021.12.31 48
3 Oracle NLOB type의 데이터를 import하는 경우 No Java type for SQL type 2011 for column rst와 같은 오류 발생시 조치사항 총관리자 2022.01.14 35
2 oracle 접속 방식에 따른 --connect 지정 방법 총관리자 2022.02.11 24
1 [CDP7.1.7] oozie sqoop action으로 import혹은 export수행시 발생한 오류에 대한 자세한 로그 확인 하는 방법 new gooper 2024.04.19 0

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로