메뉴 건너뛰기

Bigdata, Semantic IoT, Hadoop, NoSQL

Bigdata, Hadoop ecosystem, Semantic IoT등의 프로젝트를 진행중에 습득한 내용을 정리하는 곳입니다.
필요한 분을 위해서 공개하고 있습니다. 문의사항은 gooper@gooper.com로 메일을 보내주세요.


hive index생성, 삭제, 활용

총관리자 2014.04.25 16:41 조회 수 : 1702

1. index설정

hive> create index h_price_info_index on table h_price_info (key_id) as 'COMPACT' WITH DEFERRED REBUILD;
OK
Time taken: 6.898 seconds

 

2. index 생성 정보 확인
hive> show formatted index on h_price_info;
OK
idx_name             tab_name             col_names            idx_tab_name         idx_type             comment             
h_price_info_index h_price_info key_id default__h_price_info_h_price_info_index__ compact
Time taken: 0.402 seconds, Fetched: 4 row(s)


3. index를 물리적으로 생성함
hive> alter index h_price_info_index on h_price_info rebuild;

--> 아래와 같은 오류가 발생할 수 있는데.. 아래와 같이 hive 실행시 libpath를 지정하고 실행한다.

(hive --auxpath /home/hadoop/hive/lib/hbase-0.94.6.1.jar,/home/hadoop/hive/lib/zookeeper-3.4.3.jar,/home/hadoop/hive/lib/hive-hbase-handler-0.11.0.jar,/home/hadoop/hive/lib/guava-11.0.2.jar,/home/hadoop/hive/lib/hive-contrib-0.11.0.jar -hiveconf hbase.master=localhost:60000 )

 

Total MapReduce jobs = 1

----------------------오류내용----------------------
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201404241444_0032, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201404241444_0032
Kill Command = /home/hadoop/hadoop/libexec/../bin/hadoop job  -kill job_201404241444_0032
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2014-04-25 16:39:27,621 Stage-1 map = 0%,  reduce = 0%
2014-04-25 16:40:22,624 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201404241444_0032 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201404241444_0032
Examining task ID: task_201404241444_0032_m_000003 (and more) from job job_201404241444_0032

Task with the most failures(4):
-----
Task ID:
  task_201404241444_0032_m_000000

URL:
  http://localhost:50030/taskdetails.jsp?jobid=job_201404241444_0032&tipid=task_201404241444_0032_m_000000
-----
Diagnostic Messages for this Task:
java.io.IOException: Cannot create an instance of InputSplit class = org.apache.hadoop.hive.hbase.HBaseSplit:org.apache.hadoop.hive.hbase.HBaseSplit
 at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:146)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
 at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:390)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:406)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HBaseSplit
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:270)
 at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.readFields(HiveInputFormat.java:143)
 ... 10 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 2  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
--------------------------------------------------------------------------------------------

 

4. index사용설정

set hive.optimize.autoindex=true;

 

5. 쿼리수행

 select * from h_price_info where key_id like '%고추%'

 

근데 속도가 빠른건지.. 모르겠다...

번호 제목 글쓴이 날짜 조회 수
420 [Impala jdbc]CDP7.1.7환경에서 java프로그램을 이용하여 kerberized impala cluster에 접근하여 SQL을 수행하는 방법 gooper 2023.08.22 58
419 [Hue metadata]Oracle에 있는 Hue 메타정보 테이블을 이용하여 coordinator와 workflow관계 목록을 추출하는 방법 gooper 2023.08.22 15
418 [Hue admin]Add/Sync LDAP user, Sync LDAP users/groups 버튼 기능 설명 gooper 2023.08.09 15
417 oozie의 sqoop action수행시 ooize:launcher의 applicationId를 이용하여 oozie:action의 applicationId및 관련 로그를 찾는 방법 gooper 2023.07.26 10
416 [CDP7.1.6,HDFS]HDFS파일을 삭제하고 Trash비움이 완료된후에도 HDFS 공간을 차지하고 있는 경우 확인/조치 방법 gooper 2023.07.17 16
415 [Encryption Zone]Encryption Zone에 생성된 table을 select할때 HDFS /tmp/zone1에 대한 권한이 없는 경우 gooper 2023.06.29 12
414 [EncryptionZone]User:testuser not allowed to do "DECRYPT_EEK" on 'testkey' gooper 2023.06.29 11
413 [HDFS]Encryption Zone에 생성된 테이블 조회시 Failed to open HDFS file hdfs://nameservice1/tmp/zone1/sec_test_file.txt Error(255): Unknown error 255 Root cause: AuthorizationException: User:impala not allowd to do 'DECRYPT_EEK' on 'testkey' gooper 2023.06.29 53
412 [Hadoop Encryption] Encryption Zone 생성/설정시 User:hadoop not allowed to do 'DECRYPT_EEK' ON 'testkey' 오류 발생 조치 사항 gooper 2023.06.28 18
411 [KTS Cluster의 Key Trustee Server]self-signed 인증서 발급및 설정 방법 gooper 2023.06.27 29
410 [Ranger]RangerAdminRESTClient Error gertting pplicies; Received NULL response!!, secureMode=true, user=rangerkms/node01.gooper.com@ GOOPER.COM (auth:KERBEROS), serviceName=cm_kms gooper 2023.06.27 24
409 [CDP7.1.3]Ranger WebUI에서 Error! Connection refused: Please check the KMS provider URL and whether the Ranager KMS is running발생시 조치 방법 gooper 2023.06.07 19
408 [impala]insert into db명.table명 select a, b from db명.table명 쿼리 수행시 "Memory limit exceeded: Failed to allocate memory for Parquet page index"오류 조치 방법 gooper 2023.05.31 22
407 Impala Admission Control 설정시 쿼리가 사용하는 메모리 사용량 판단 방법 gooper 2023.05.19 93
406 [Atlas Server]org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient permissions (user=atlas/node01.gooper.com@GOOPER.COM, scope=default:atlas_janus, params=[table=default:atlas_janus,], action-CREATE)] gooper 2023.05.15 66
405 [Ranger]계정에 admin권한(grant, create등)의 권한 부여 방법 gooper 2023.04.18 49
404 [DataNode]org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hdfs/datanode03@GOOPER.COM from keytab hdfs.keytab오류 gooper 2023.04.18 4567
403 [KUDU] kudu tablet server여러가지 원인에 의해서 corrupted상태가 된 경우 복구방법 gooper 2023.03.28 37
402 Hadoop Clsuter에 이미 포함된 host의 hostname변경시 처리 절차 gooper 2023.03.24 14
401 [Kudu]ERROR: Unable to advance iterator for node with id '2' for Kudu table 'impala::core.pm0_abdasubjct': Network error: recv error from unknown peer: Transport endpoint is not connected (error 107) gooper 2023.03.16 536

A personal place to organize information learned during the development of such Hadoop, Hive, Hbase, Semantic IoT, etc.
We are open to the required minutes. Please send inquiries to gooper@gooper.com.

위로