

1. Download and install Cassandra 1.2.11 (the version that was tested against and is compatible with CumulusRDF 1.0.1; a download-and-start sketch follows the notes below)

 http://archive.apache.org/dist/cassandra/


 *Note 1: https://www.gooper.com/ss/index.php?mid=bigdata&category=2803&document_srl=3110 (covers a different version, but the settings for the items mentioned there are the same, so configure by referring to it)

 *Note 2: CumulusRDF 1.1.0 only supports Cassandra 1.2.x, so a 1.2.x release must be downloaded and installed.
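
 A minimal download-and-start sketch for step 1 (the tarball name, install path, and configuration values below are assumptions; adjust them to your environment):

 wget http://archive.apache.org/dist/cassandra/1.2.11/apache-cassandra-1.2.11-bin.tar.gz
 tar xzf apache-cassandra-1.2.11-bin.tar.gz -C /usr/local
 cd /usr/local/apache-cassandra-1.2.11
 # edit conf/cassandra.yaml (listen_address, rpc_address, data/commitlog directories) before starting
 bin/cassandra -f     # -f keeps Cassandra in the foreground so the startup log stays visible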


2. Download and install the CumulusRDF 1.0.1 web application (building it yourself with Maven produces the *.jar and *.war files, but the result did not seem to work properly)

  https://github.com/cumulusrdf/cumulusrdf/wiki/Downloads

  From the page above, download the "March 11th 2014: CumulusRDF v1.0.1" WAR file and deploy it to your application server (WAS).

  (For example, with Tomcat, placing the file under the webapps folder deploys it automatically, using the file name as the context name; see the deployment sketch after the notes below.)


* Note 1: https://github.com/cumulusrdf/cumulusrdf/wiki
* Note 2: Open http://xxx.xxx.xxx.43:8080/cumulusrdf-1.0.1/info to run queries and bulk uploads from the web page.
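
A minimal deployment sketch for step 2 (the WAR file name and Tomcat install path are assumptions):

cp cumulusrdf-1.0.1.war /usr/local/tomcat/webapps/
/usr/local/tomcat/bin/startup.sh                  # start (or restart) Tomcat
tail -f /usr/local/tomcat/logs/catalina.out       # watch the log until the webapp finishes deploying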

3. Download and install the CumulusRDF 1.0.1 CLI tool

  https://github.com/cumulusrdf/cumulusrdf/wiki/Downloads

  From the page above, download the "March 11th 2014: CumulusRDF v1.0.1" CLI jar and copy it to a suitable location.

 * Note 1: https://github.com/cumulusrdf/cumulusrdf/wiki/CLI

 * Note 2: This jar can run the Dump, Load, Query, and Remove commands.


 

4. The attached 1.0.0 CLI jar is used as follows when loading data. (This version lets you specify the number of threads; the CumulusRDF 1.0.1 CLI has different usage and does not support some of these options.)

java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Load -i ./icbms_2016-04-15_15-10-16.nq -b 10000 -t 8

This uploads the .nq file with a batch size of 10000 and 8 threads.

: Use the attached jar file.


----------------------------Options available with the attached jar (they differ for Load, Dump, Query, and Remove)----------------------

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Load -help
***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help
usage: parameters:
 -b <arg>   batch size - number of triples (default: 100)
 -f <arg>   format ('nt', 'nq' or 'xml') (default: 'nt')
 -h         print help
 -i <arg>   name of file to read, - for stdin (but then need to specify -x
            option)
 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)
 -n <arg>   Cassandra hosts as comma-separated list
            ('host1:port1,host2:port2,...') (default localhost:9160)
 -r <arg>   replication factor  (default: 1)
 -s <arg>   storage layout to use (triple|quad) (needs to match webapp
            configuration)
 -t <arg>   number of loading threads (defaults to min(1,|hosts|/1.5))
time elapsed 6 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Dump -help
***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help
usage: parameters:
 -h         print help
 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)
 -n <arg>   Cassandra hosts as comma-separated list
            ('host1:port1,host2:port2,...') (default localhost:9160)
 -o <arg>   name of output file
time elapsed 6 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Query -help
***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help
usage: parameters:
 -h         print help
 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)
 -n <arg>   Cassandra hosts as comma-separated list
            ('host1:port1,host2:port2,...') (default localhost:9160)
 -q <arg>   sparql query string
 -s <arg>   storage layout to use (triple|quad) (needs to match webapp
            configuration)
time elapsed 10 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Remove -help
***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help
usage: parameters:
 -h         print help
 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)
 -n <arg>   Cassandra hosts as comma-separated list
            ('host1:port1,host2:port2,...') (default localhost:9160)
 -q <arg>   sparql construct query string. all its bindings will be
            removed.
 -s <arg>   storage layout to use (triple|quad) (needs to match webapp
            configuration)
time elapsed 11 ms
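
Based on the option listings above, the other commands can be invoked the same way. A hedged sketch (the keyspace, host, storage layout, output file, and SPARQL query below are placeholder values):

java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Query -n localhost:9160 -k KeyspaceCumulus -s triple -q "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Dump -n localhost:9160 -k KeyspaceCumulus -o ./dump.nt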

--------------------------------cqlsh--------------

A. Check the keyspaces
 cqlsh> select * from system.schema_keyspaces;
===>
 keyspace_name   | durable_writes | strategy_class                              | strategy_options
-----------------+----------------+---------------------------------------------+----------------------------
 KeyspaceCumulus |           True | org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"}
          system |           True |  org.apache.cassandra.locator.LocalStrategy |                         {}
   system_traces |           True | org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"2"}
---------------

* List keyspaces: cqlsh> describe keyspaces;
* List column families: describe columnfamilies;
* Show keyspace details: describe keyspace "KeyspaceCumulus";

B. Select the keyspace to use
use "KeyspaceCumulus";


C. List the tables

describe tables;


* View table contents

select * from "DICT_P" limit 10;


D. Tables used by KeyspaceCumulus (TRUNCATE statements that clear every table; a verification query follows the list)
TRUNCATE "DICT_P";
TRUNCATE "OSPC";
TRUNCATE "PREFIX_TO_NS";
TRUNCATE "SPOC";        
TRUNCATE "DICT_P_REVERSE";
TRUNCATE "POSC";          
TRUNCATE "SCHEMA_CLASSES";
TRUNCATE "SPO_RN_DT";     
TRUNCATE "DICT_SO";  
TRUNCATE "POS_RN_DT";
TRUNCATE "SCHEMA_D_PROPS";
TRUNCATE "SPO_RN_NUM";    
TRUNCATE "DICT_SO_REVERSE";
TRUNCATE "POS_RN_NUM";     
TRUNCATE "SCHEMA_O_PROPS";
TRUNCATE "counter";       

-----------------------------------KeyspaceCumulus details---------------------
cqlsh> describe keyspace "KeyspaceCumulus";
CREATE KEYSPACE "KeyspaceCumulus" WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '1'
};
USE "KeyspaceCumulus";
CREATE TABLE "DICT_P" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_P_REVERSE" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_SO" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_SO_REVERSE" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "OSPC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  value blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "POSC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  "03" blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE INDEX index_924575 ON "POSC" ("03");
CREATE TABLE "POS_RN_DT" (
  key blob,
  column1 bigint,
  column2 blob,
  value blob,
  PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "POS_RN_NUM" (
  key blob,
  column1 double,
  column2 blob,
  value blob,
  PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "PREFIX_TO_NS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_CLASSES" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_D_PROPS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_O_PROPS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPOC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  value blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPO_RN_DT" (
  key blob,
  column1 bigint,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPO_RN_NUM" (
  key blob,
  column1 double,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=0 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='false' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE counter (
  key text,
  column1 text,
  value counter,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};