Hiveデータベースの作成からデータのインポートまでのケーススタディ
7091 ワード
DDL
hive> create database if not exists ruozedata_test comment 'hive test database' location '/user/hadoop';
OK
Time taken: 0.325 seconds
hive> desc database ruozedata_test;
OK
ruozedata_test hive test database hdfs://192.168.52.130:9000/user/hive/warehouse/ruozedata_test.db hadoop USER
Time taken: 0.126 seconds, Fetched: 1 row(s)
hive> alter database ruozedata_test set owner user root;
OK
Time taken: 0.198 seconds
hive> desc database ruozedata_test;
OK
ruozedata_test hive test database hdfs://192.168.52.130:9000/user/hive/warehouse/ruozedata_test.db root USER
Time taken: 0.061 seconds, Fetched: 1 row(s)
hive> create table MANAGED_TABLE_emp1(
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal int,
> comm int,
> deptno int
> )comment 'This is a emp test table'
> partitioned by(dt string)
> row format delimited fields terminated by '\t';
OK
Time taken: 0.203 seconds
方法2:既存のテーブルのテーブル構造をコピーします.
hive> create table managed_table_emp2 like managed_table_emp1;
OK
Time taken: 0.494 seconds
hive> show tables;
OK
方法3:既存のテーブル構造とデータをコピーする
hive> create table managed_table_emp3 as select * from managed_table_emp1;
hive> create external table external_TABLE_emp1(
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal int,
> comm int,
> deptno int
> )comment 'This is a emp test external table'
> row format delimited fields terminated by '\t'
> location '/user/hadoop/external';
OK
Time taken: 0.415 seconds
Time taken: 0.034 seconds
hive> create external table user(
> id int,
> name string,
> address string,
> birthday string
> )partitioned by (dt string)
> row format delimited fields terminated by '\t';
OK
Time taken: 0.123 seconds
hive> show create table user;
hive> insert overwrite table ruoze_emp_partition PARTITION(deptno)
> select empno,ename,job,mgr,hiredate,sal,comm,deptno from ruoze_emp;
Query ID = hadoop_20181029235151_8ea2d815-d0a8-4dd8-b4f7-f3b9e6a53b47
Total jobs = 3
[hadoop@hadoop001 data]$ hadoop fs -ls /user/hive/warehouse/ruoze_emp_partition
18/10/30 03:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=10
drwxr-xr-x - hadoop supergroup 0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=20
drwxr-xr-x - hadoop supergroup 0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=30
drwxr-xr-x - hadoop supergroup 0 2018-10-30 03:28 /user/hive/warehouse/ruoze_emp_partition/deptno=__HIVE_DEFAULT_PARTITION__
[hadoop@hadoop001 data]$ hadoop fs -ls /user/hive/warehouse/ruoze_emp_partition/deptno=__HIVE_DEFAULT_PARTITION__
load data [local] inpath 'file-path' [overwrite] into table emp [partition (deptno=10)]
hive> desc ruoze_emp;
OK
empno int
ename string
job string
mgr int
hiredate string
sal double
comm double
deptno int
Time taken: 0.296 seconds, Fetched: 8 row(s)
hive>
hive> desc formatted ruoze_emp;
OK
# col_name data_type comment
empno int
ename string
job string
mgr int
hiredate string
sal double
comm double
deptno int
# Detailed Table Information
Database: default
Owner: hadoop
CreateTime: Mon Oct 22 15:32:07 CST 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://hadoop001:9000/user/hive/warehouse/ruoze_emp
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 0
rawDataSize 0
totalSize 700
transient_lastDdlTime 1540817159
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim \t
serialization.format \t
Time taken: 1.14 seconds, Fetched: 39 row(s)
整理中.....
ヒント
set hive.cli.print.header
=trueを構成することにより、クエリーテーブルのヘッダset hive.cli.print.current.db =true
を構成することにより、現在のデータベース