Hive: Partitions


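Partition types:
1. Static partitions: the partition value is specified explicitly when the data is loaded.
2. Dynamic partitions: the partition values are not known in advance, and partitions are created from the values found in the data.
3. Mixed partitions: static and dynamic partition columns are used together.
Notes:
1. Hive partitions use columns that are not part of the table's data columns. A partition column is a pseudo-column, but it can still be used to filter queries.
2. Avoid Chinese characters in partition columns.
3. Dynamic partitioning is not especially recommended. It runs a MapReduce job to read the source data, and when the number of partitions grows too large, the NameNode and YARN become performance bottlenecks. Try to estimate the number of partitions before using dynamic partitioning.
4. Partition attributes can also be changed by manually editing the metadata and the data on HDFS (the changes are retained).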
1. Static partitions
Create a table with a single-level partition:

create table if not exists t_part1(
uid int,
uname string,
age int
)
partitioned by (dt string)
row format delimited fields terminated by ',';

Load data:

load data local inpath '/usr/local/hivedata/users.txt' into table t_part1 partition(dt='2018-07-04');
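
Each partition is stored as a subdirectory named column=value under the table's directory, and the partition column can be selected and filtered even though it is not in the data file. A minimal sketch for checking the load above, assuming it succeeded:

-- List the partitions registered for the table
show partitions t_part1;

-- dt is not stored in users.txt, but it behaves like a regular column in queries
select uid, uname, age, dt from t_part1 where dt = '2018-07-04';

On HDFS the loaded file should sit in a directory ending in t_part1/dt=2018-07-04; the warehouse prefix in front of it depends on the installation.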

Create a table with a two-level partition:

create table if not exists t_part2(
uid int,
uname string,
age int
)
partitioned by (year string,month string)
row format delimited fields terminated by ',';

Load data:

load data local inpath '/usr/local/hivedata/users.txt' into table t_part2 partition(year='2018',month=07);

-- Alternative form with the month value quoted, the safer choice for a string partition column
load data local inpath '/usr/local/hivedata/users.txt' into table t_part2 partition(year='2018',month='07');

Create a table with a three-level partition:

create table if not exists t_part3(
uid int,
uname string,
age int
)
partitioned by (year string,month string,days string)
row format delimited fields terminated by ',';

Load data:

load data local inpath '/usr/local/hivedata/users.txt' into table t_part3 partition(year='2018',month='07',days='04');

Query:

select * from t_part3;

select * from t_part3 where year = '2018';

select * from t_part3 where year = '2018' and month = '07';

select * from t_part3 where month = '07' and days='04';

Show partitions:

show partitions t_part2;
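
On a multi-level table the listing can be narrowed with a partial partition spec, and a single partition's metadata (including its storage location) can be inspected. A short sketch, assuming the (year='2018', month='07') partition loaded earlier exists:

-- Only the partitions under year='2018'
show partitions t_part2 partition(year='2018');

-- Location and other metadata of one partition
describe formatted t_part2 partition(year='2018',month='07');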

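Modifying partitions
1. How do you rename a partition? Renaming the directory on HDFS directly is a brute-force approach; Hive's DDL also provides ALTER TABLE ... PARTITION (...) RENAME TO PARTITION (...), which updates the metastore as well.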
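
Hive's DDL includes a RENAME TO PARTITION clause that changes a partition's values through the metastore rather than by editing HDFS by hand. A minimal sketch; the target month '08' is only an illustrative value and must not already exist as a partition:

-- Rename partition (year='2018', month='07') to (year='2018', month='08')
alter table t_part2 partition(year='2018',month='07') rename to partition(year='2018',month='08');

-- Confirm the new partition name
show partitions t_part2;
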
2. Add partitions:

alter table t_part2 add partition(year='2018',month='06');

alter table t_part2 add partition(year='2018',month='06') partition(year='2017',month='12') partition(year='2017',month='11');

Add partitions and set their data location:

alter table t_part2 add partition(year='2017',month='12') location '/user/hive/warehouse/gp1707.db/t_user_info';

alter table t_part2 add partition(year='2017',month='08') location '/user/hive/warehouse/gp1707.db/t_user_info' partition(year='2017',month='09') location '/user/hive/warehouse/gp1707.db/t_user_info';

3. Change a partition's storage location:

alter table t_part2 partition(year='2017',month='08') set location 'hdfs://hadoop01:9000/user/hive/warehouse/gp1707.db/t_userinfo';

4. Drop partitions:

alter table t_part2 drop partition(year='2017',month='06');

alter table t_part2 drop partition(year='2017',month='09'),partition(year='2018',month='07');

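Notes:
1. When changing a partition's location, write the full absolute path.
2. Adding and dropping several partitions at once use different syntax: when adding, separate the partition specs with spaces; when dropping, separate them with commas.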
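
The difference between the two bulk forms can be seen side by side in the sketch below. The 2016 partition values are made up for illustration, and the optional IF NOT EXISTS / IF EXISTS clauses are added so that the statements do not fail when rerun:

-- Bulk add: partition specs separated by spaces
alter table t_part2 add if not exists partition(year='2016',month='01') partition(year='2016',month='02');

-- Bulk drop: partition specs separated by commas
alter table t_part2 drop if exists partition(year='2016',month='01'), partition(year='2016',month='02');
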
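2. Dynamic partitions

Dynamic partition settings:
-- Enable dynamic partitioning (already true by default in Hive 0.9.0 and later)
set hive.exec.dynamic.partition=true;
-- strict requires at least one static partition column; nonstrict allows all of them to be dynamic
set hive.exec.dynamic.partition.mode=nonstrict;
-- Maximum number of dynamic partitions one statement may create in total
set hive.exec.max.dynamic.partitions=1000;
-- Maximum number of dynamic partitions each mapper or reducer node may create
set hive.exec.max.dynamic.partitions.pernode=100;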
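
Before changing these properties it can help to see what the current session is using. Running set with only the property name prints its current value, as in this short sketch:

-- Print the values currently in effect for this session
set hive.exec.dynamic.partition.mode;
set hive.exec.max.dynamic.partitions;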

Create a table for dynamic partitioning:

create table t_dypart1(
uid int,
uname string,
uage int
)
partitioned by (dt string)
row format delimited fields terminated by ',';

Load data (the load statement cannot be used for dynamic partitions; use insert ... select):

insert into table t_dypart1 partition(dt) select * from t_part1;
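
Because every partition column in the statement above is dynamic, it only runs when the partition mode is nonstrict. Hive takes the dynamic partition values from the trailing columns of the select, so a variant that names the columns explicitly may be easier to read. A minimal sketch, reusing t_part1 and t_dypart1 from above:

set hive.exec.dynamic.partition.mode=nonstrict;

-- dt is listed last, so its value decides which partition each row goes to
insert into table t_dypart1 partition(dt)
select uid, uname, age, dt from t_part1;

-- One partition should appear for each distinct dt value in the source
show partitions t_dypart1;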

Mixed partitions:

create table t_dypart3(
uid int,
uname string,
uage int
)
partitioned by (year string,month string,days string)
row format delimited fields terminated by ',';

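-- year is static; month and days are filled in dynamically from the last two select columns
insert into t_dypart3 partition(year='2018',month,days)
select uid,uname,age,month,days from t_part3 where year = '2018';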
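
In a mixed insert the static partition columns come first in the partition clause and the dynamic ones follow, with the select supplying the dynamic values last and in the same order. To confirm where the rows landed, something like the following can be run afterwards; the month value '07' assumes the sample data loaded into t_part3 earlier:

-- Every partition listed should start with year=2018
show partitions t_dypart3;

-- Read back a few rows from one of the dynamically created partitions
select * from t_dypart3 where year = '2018' and month = '07' limit 10;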
