HadoopのHive

3997 ワード

詳細
Hive基本紹介
クエリはHDFSに格納されたビッグデータの集合を管理し、SQLのようなクエリ文HiveQLを提供する.HiveテーブルのメタデータはDerbyデータベースに格納され、MysqlとOracleはHiveの格納元としても使用できます.
Hive操作
use:データベースの切り替え
 
hive> use default;
OK
Time taken: 0.454 seconds
hive> show tables;
OK
categories
customers
departments
intermediate_access_logs
order_items
orders
products
tokenized_access_logs
wlslog
Time taken: 0.404 seconds, Fetched: 9 row(s)

CREATE DATABASE:データベースの作成
 
CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name //    
[COMMENT database_comment] //     
[LOCATION hdfs_path] //    
[WITH DBPROPERTIES (property_name=property_value, ...)]; //  

データベースの直接作成
hive> CREATE DATABASE IF NOT  EXISTS testdb;
OK
Time taken: 1.45 seconds
hive> DESC database extended testdb;
OK
testdb          hdfs://quickstart.cloudera:8020/user/hive/warehouse/testdb.db   cloudera        USER
Time taken: 0.046 seconds, Fetched: 1 row(s)

 
CREATE TABLE:テーブルの作成
ファイルを分割するためのROW FORMAT:DELIMITED
FIELDS TERMINTED BY:指定フィールド
LINES TERMINATE BY:行ID
 
CREATE TABLE wlslog(time_stamp STRING,category STRING, type STRING,servername STRING,code
STRING,msg STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '
'; hive> show tables; OK wlslog Time taken: 0.019 seconds, Fetched: 1 row(s)

LOAD DATA:データのインポート
 LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION(partcol1=val1, partcol2=val2 ...)]
ファイルのインポート
hive> LOAD DATA LOCAL INPATH 'cdh/wlslog.log' OVERWRITE INTO TABLE wlslog;
Loading data to table testdb.wlslog
Table testdb.wlslog stats: [numFiles=1, numRows=0, totalSize=717, rawDataSize=0]
OK
Time taken: 0.757 seconds
hive> select * from wlslog;
OK
Apr-8-2014-7:06:16-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000365      Server state changed to STANDBY
Apr-8-2014-7:06:17-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000365      Server state changed to STARTING
Apr-8-2014-7:06:18-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000365      Server state changed to ADMIN
Apr-8-2014-7:06:19-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000365      Server state changed to RESUMING
Apr-8-2014-7:06:20-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000331      Started WebLogic AdminServer
Apr-8-2014-7:06:21-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000365      Server state changed to RUNNING
Apr-8-2014-7:06:22-PM-PDT       Notice  WebLogicServer  AdminServer     BEA-000360      Server started in RUNNING mode
Time taken: 0.41 seconds, Fetched: 7 row(s)

テーブル構造のコピー
 
CREATE TABLE wlslog_2 LIKE wlslog;
データ挿入
 
 INSERT OVERWRITE TABLE tablename1 [PARTITION(partcol1=val1,partcol2=val2 ...) [IF NOT   EXISTS]]  select_statement1 FROM from_statement;
 INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1,partcol2=val2 ...)] select_statement1 FROM from_statement;
テーブル作成時にデータを挿入
CREATE TABLE wlslog_copy
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '
' STORED AS TEXTFILE(SEQUENCEFILE/ORC/PARQUET) AS SELECT time_stamp,category,type,servername,code,msg FROM wlslog 

ALTER TABLE:表名、コメント、属性、記憶の変更
TRUNCATE TABLE:データ行の削除
  TRUNCATE TABLE table_name [PARTITION partition_spec];
DROP TABLE:テーブルの削除
 DROP TABLE [IF EXISTS] table_name;