메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


1. Cassandra 1.2.11 다운로드/설치(cumulusRDF 1.0.1와 호환되며 테스트 된 버젼임)

 http://archive.apache.org/dist/cassandra/


 *참고1 : https://www.gooper.com/ss/index.php?mid=bigdata&category=2803&document_srl=3110 (버젼이 다르지만 언급된 항목에 대한 설정은 같으므로 참고하여 설정해준다)

 *참고2 : CumulusRDF 1.1.0이 Cassandra 1.2.X만 지원하므로 1.2.X를 다운받아 설치해야한다.


2. CumulusRDF 1.0.1 Web Application다운로드및 설치(직접 maven으로 빌드하면.. *.jar, war파일이 만들어지나 잘안됨(?))

  https://github.com/cumulusrdf/cumulusrdf/wiki/Downloads

  에서 March 11th 2014: CumulusRDF v1.0.1 war파일을 다운로드 받아서 WAS에 deploy한다.

  (예, tomcat의 경우 webapps폴더 밑에 두면 파일명을 context명으로 자동설치된다)


* 참고1 : https://github.com/cumulusrdf/cumulusrdf/wiki
* 참고2 : http://xxx.xxx.xxx.43:8080/cumulusrdf-1.0.1/info에 접근하면 web페이지에서 query및 bulkupload를 할 수있다. 

3. CumulusRDF 1.0.1 CLI 툴 다운로드및 설치

  https://github.com/cumulusrdf/cumulusrdf/wiki/Downloads

  에서 March 11th 2014: CumulusRDF v1.0.1 CLI jar를 다운로드 받아서 적절한 위치에 복사한다.

 * 참고1 : https://github.com/cumulusrdf/cumulusrdf/wiki/CLI)

 * 참고2 : dump, load, query, remove를 실행할 수 있는 jar파일임


 

4. 첨부된 1.0.0버젼의 CLI jar파일은 load할때 아래와 같이 사용한다.(이것은 thread개수를 지정할 수 있는데.. CumulusRDF 1.0.1 CLI등은 사용법이르며 일부기능이 지원되지 않음)

java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Load -i ./icbms_2016-04-15_15-10-16.nq -b 10000 -t 8

배치 10000, 쓰레드 8개로 nq파일을 업로드함

: 첨부파일을 이용할것


----------------------------첨부된 jar파일 사용시 가능한 옵션(Load, Dump, Query, Remove별로 다름)----------------------

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Load -help

***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help

usage: parameters:

 -b <arg>   batch size - number of triples (default: 100)

 -f <arg>   format ('nt', 'nq' or 'xml') (default: 'nt')

 -h         print help

 -i <arg>   name of file to read, - for stdin (but then need to specify -x

            option)

 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)

 -n <arg>   Cassandra hosts as comma-separated list

            ('host1:port1,host2:port2,...') (default localhost:9160)

 -r <arg>   replication factor  (default: 1)

 -s <arg>   storage layout to use (triple|quad) (needs to match webapp

            configuration)

 -t <arg>   number of loading threads (defaults to min(1,|hosts|/1.5))

time elapsed 6 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Dump -help

***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help

usage: parameters:

 -h         print help

 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)

 -n <arg>   Cassandra hosts as comma-separated list

            ('host1:port1,host2:port2,...') (default localhost:9160)

 -o <arg>   name of output file

time elapsed 6 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Query -help

***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help

usage: parameters:

 -h         print help

 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)

 -n <arg>   Cassandra hosts as comma-separated list

            ('host1:port1,host2:port2,...') (default localhost:9160)

 -q <arg>   sparql query string

 -s <arg>   storage layout to use (triple|quad) (needs to match webapp

            configuration)

time elapsed 10 ms

-bash-4.1# java -cp ./cumulusrdf-1.0.0-jar-with-dependencies.jar edu.kit.aifb.cumulus.cli.Main Remove -help

***ERROR: class org.apache.commons.cli.UnrecognizedOptionException: Unrecognized option: -help

usage: parameters:

 -h         print help

 -k <arg>   Cassandra keyspace (default KeyspaceCumulus)

 -n <arg>   Cassandra hosts as comma-separated list

            ('host1:port1,host2:port2,...') (default localhost:9160)

 -q <arg>   sparql construct query string. all its bindings will be

            removed.

 -s <arg>   storage layout to use (triple|quad) (needs to match webapp

            configuration)

time elapsed 11 ms

--------------------------------cql.sh--------------

가. keyspace확인하기 
 cqlsh> select * from system.schema_keyspaces;
===>
 keyspace_name   | durable_writes | strategy_class                              | strategy_options
-----------------+----------------+---------------------------------------------+----------------------------
 KeyspaceCumulus |           True | org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"}
          system |           True |  org.apache.cassandra.locator.LocalStrategy |                         {}
   system_traces |           True | org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"2"}
---------------

* keyspace목록 보기 : cqlsh>describe keyspaces;
* columnfamilies목록 보기 : describe columnfamilies;
* keyspace정보 보기 : describe keyspace "KeyspaceCumulus";

나. 사용할 keyspace지정하기
use "KeyspaceCumulus";


다. 테이블 목록 조회

describe tables;


*테이블 내용조회

select * from "DICT_P" limit 10;


라. KeyspaceCumulus가 사용하는 테이블 목록
TRUNCATE "DICT_P";
TRUNCATE "OSPC";
TRUNCATE "PREFIX_TO_NS";
TRUNCATE "SPOC";        
TRUNCATE "DICT_P_REVERSE";
TRUNCATE "POSC";          
TRUNCATE "SCHEMA_CLASSES";
TRUNCATE "SPO_RN_DT";     
TRUNCATE "DICT_SO";  
TRUNCATE "POS_RN_DT";
TRUNCATE "SCHEMA_D_PROPS";
TRUNCATE "SPO_RN_NUM";    
TRUNCATE "DICT_SO_REVERSE";
TRUNCATE "POS_RN_NUM";     
TRUNCATE "SCHEMA_O_PROPS";
TRUNCATE "counter";       

-----------------------------------KeyspaceCumulus정보---------------------
cqlsh> describe keyspace "KeyspaceCumulus";
CREATE KEYSPACE "KeyspaceCumulus" WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '1'
};
USE "KeyspaceCumulus";
CREATE TABLE "DICT_P" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_P_REVERSE" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_SO" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "DICT_SO_REVERSE" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "OSPC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  value blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "POSC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  "03" blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE INDEX index_924575 ON "POSC" ("03");
CREATE TABLE "POS_RN_DT" (
  key blob,
  column1 bigint,
  column2 blob,
  value blob,
  PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "POS_RN_NUM" (
  key blob,
  column1 double,
  column2 blob,
  value blob,
  PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "PREFIX_TO_NS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_CLASSES" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_D_PROPS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SCHEMA_O_PROPS" (
  key blob,
  column1 blob,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPOC" (
  key blob,
  column1 blob,
  column2 blob,
  column3 blob,
  value blob,
  PRIMARY KEY (key, column1, column2, column3)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPO_RN_DT" (
  key blob,
  column1 bigint,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE "SPO_RN_NUM" (
  key blob,
  column1 double,
  value blob,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=0 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='false' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
CREATE TABLE counter (
  key text,
  column1 text,
  value counter,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=1.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
번호 제목 글쓴이 날짜 조회 수
19 fuseki용 config-examples.ttl 예시 내용 총관리자 2017.05.17 646
18 시맨틱 관련 논문 모음 사이트 총관리자 2017.06.13 94
17 숭실대 교수님등 강의영상(바이오데이터마이닝, 빅데이터분산컴퓨팅, 컴퓨터 그래픽스, 데이터베이스응용및 프로그램밍, 데이터베이스, 의생명영상처리, 웹그로그래밍, 데이터마이닝, 컴퓨터구조) file 총관리자 2017.06.13 207
16 halyard 1.3을 다른 서버로 이전하는 방법 총관리자 2017.07.05 66
15 halyard 1.3의 rdf4j-server.war와 rdf4j-workbench.war를 tomcat deploy후 조회시 java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/Cell발생시 조치사항 총관리자 2017.07.05 65
14 halyard의 console스크립트에서 생성한 repository는 RDF4J Web Applications에서 공유가 되지 않는다. 총관리자 2017.07.05 45
13 9대가 hbase cluster로 구성된 서버에서 테스트 data를 halyard에 적재하고 테스트 하는 방법및 절차 총관리자 2017.07.21 56
12 LUBM 데이타 생성구문 총관리자 2017.07.24 143
11 jena/fuseki 3.4.0 설치 총관리자 2017.07.25 172
10 [oneM2M]Ontologies used for oneM2M 총관리자 2017.08.02 49
9 DeviceType이 o:motion-sensor_33 이거나 o:motion-sensor_32 경우의 sparql문장은 다음과 같다. 총관리자 2017.08.16 40
8 RDF4J의 rdf4j-server.war가 제공하는 RESTFul API를 이용하여 repository에 CRUD테스트 총관리자 2017.08.30 51
7 RDF4J의 rdf4j-server.war가 제공하는 RESTFul API를 이용한 CRUD테스트(트랜잭션처리) 총관리자 2017.08.30 43
6 RDF4J의 RESTFul API처리 클래스 소스 파악(web module위주) 총관리자 2017.08.30 156
5 halyard 1.3의 console을 이용하여 100억건의 데이타에 대한 쿼리수행시 ScannerTimeoutException 발생시 조치사항 총관리자 2017.09.06 134
4 fuseki에서 제공하는 script중 s-post를 사용하는 예문 총관리자 2017.09.15 34
3 oneM2M Specification(Draft Release 3, 2, 1), Draft Technical Reports 총관리자 2017.10.25 81
2 전체 컨택스트 내용 file 총관리자 2017.12.19 66
1 fuseki의 endpoint를 이용한 insert, delete하는 sparql예시 총관리자 2018.02.14 102

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로