Fixing garbled Chinese comments in Hive 2.0 on CDH 6


Symptom: Chinese text in table comments, column comments, and database comments is displayed as question marks:
hive> desc t_dm_test;
address                 string                  ????
rowtime                 string                  ??????????
inserttime              string                  ????
remark                  string                  ??
First, check the character set configuration of the MySQL server that hosts the metastore. The client, connection, database, and server settings are all utf8, as expected:
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
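Next, check the hive metastore database itself. Its tables and their text columns use CHARSET=latin1 (see COLUMNS_V2 below), so the metastore schema was most likely created before MySQL was switched to utf8, or while the utf8 setting was not yet in effect.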
mysql>  show create database hive;
+----------+---------------------------------------------------------------+
| Database | Create Database                                               |
+----------+---------------------------------------------------------------+
| hive     | CREATE DATABASE `hive` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+---------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table COLUMNS_V2;
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| COLUMNS_V2 | CREATE TABLE `COLUMNS_V2` (
  `CD_ID` bigint(20) NOT NULL,
  `COMMENT` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `COLUMN_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
  `TYPE_NAME` mediumtext,
  `INTEGER_IDX` int(11) NOT NULL,
  PRIMARY KEY (`CD_ID`,`COLUMN_NAME`),
  KEY `COLUMNS_V2_N49` (`CD_ID`),
  CONSTRAINT `COLUMNS_V2_FK1` FOREIGN KEY (`CD_ID`) REFERENCES `CDS` (`CD_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
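
Why a latin1 column turns Chinese comments into question marks can be illustrated with a minimal Python sketch. It assumes that MySQL's substitution of characters that cannot be represented in the column character set behaves like Python's errors="replace" (one "?" per character); the sample comment string is hypothetical, since the original comment text is not shown in the output above.

```python
def store_as_latin1(comment: str) -> str:
    """Round-trip a string through latin-1, replacing every character
    that latin-1 cannot represent with '?' (analogy for a latin1 column)."""
    return comment.encode("latin-1", errors="replace").decode("latin-1")


def store_as_utf8(comment: str) -> str:
    """Round-trip a string through UTF-8, which can represent any character."""
    return comment.encode("utf-8").decode("utf-8")


if __name__ == "__main__":
    sample = "入库时间"  # hypothetical four-character Chinese column comment
    print(store_as_latin1(sample))  # each Chinese character becomes '?'
    print(store_as_utf8(sample))    # the comment survives unchanged
```

The information is lost at write time, which is why the stored value contains literal question marks rather than bytes that could be decoded differently later.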

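With the cause identified, the fix is straightforward: change the character set of the metastore database and of the columns that store comments and descriptions.
Connect to the database with mysql -u <user> -p, then run: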
mysql> alter database hive character set utf8;
Query OK, 1 row affected (0.00 sec)

# Default character set of the metastore database
alter database hive character set utf8;

# Column comments
ALTER TABLE COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
# Table parameters (table comments are stored here)
# Note: varchar(40000) exceeds what a utf8 VARCHAR can hold (21845 characters), so MySQL either
# rejects such a statement (strict SQL mode) or converts the column to a TEXT type.
ALTER TABLE TABLE_PARAMS modify column PARAM_VALUE varchar(40000) character set utf8;
# Partition parameters and partition key comments
ALTER TABLE PARTITION_PARAMS modify column PARAM_VALUE varchar(40000) character set utf8;
ALTER TABLE PARTITION_KEYS modify column PKEY_COMMENT varchar(40000) character set utf8;
# Index parameters
ALTER TABLE INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
# View definitions
ALTER TABLE TBLS modify COLUMN VIEW_EXPANDED_TEXT mediumtext CHARACTER SET utf8;
ALTER TABLE TBLS modify COLUMN VIEW_ORIGINAL_TEXT mediumtext CHARACTER SET utf8;

# Database descriptions
ALTER TABLE `DBS` CHANGE COLUMN `DESC` `DESC` VARCHAR(4000) CHARACTER SET 'utf8' NULL DEFAULT NULL ;
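
To confirm which columns a given table still stores as latin1, the text printed by show create table can be inspected. The following Python sketch is one hedged way to do that: the function name column_charsets and the regular expressions are illustrative assumptions, and it only understands the simple layout shown in this post (one column per line, explicit CHARACTER SET or the table's DEFAULT CHARSET).

```python
import re

# One column definition per line: `NAME` type(optional length) rest-of-line
COLUMN_RE = re.compile(r"^\s*`(?P<name>[^`]+)`\s+(?P<type>\w+)(?:\(\d+\))?(?P<rest>.*)$")
CHARSET_RE = re.compile(r"CHARACTER SET (\w+)")
DEFAULT_CHARSET_RE = re.compile(r"DEFAULT CHARSET=(\w+)")
TEXT_TYPES = {"char", "varchar", "tinytext", "text", "mediumtext", "longtext"}


def column_charsets(ddl: str) -> dict:
    """Map each text column in a CREATE TABLE statement to its character set.

    Columns without an explicit CHARACTER SET inherit the table default.
    """
    default = DEFAULT_CHARSET_RE.search(ddl)
    table_default = default.group(1) if default else None
    result = {}
    for line in ddl.splitlines():
        match = COLUMN_RE.match(line)
        if not match or match.group("type").lower() not in TEXT_TYPES:
            continue  # skip keys, constraints, and non-text columns
        explicit = CHARSET_RE.search(match.group("rest"))
        result[match.group("name")] = explicit.group(1) if explicit else table_default
    return result


if __name__ == "__main__":
    # DDL as printed by "show create table COLUMNS_V2" before the fix
    ddl = """CREATE TABLE `COLUMNS_V2` (
  `CD_ID` bigint(20) NOT NULL,
  `COMMENT` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
  `COLUMN_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
  `TYPE_NAME` mediumtext,
  `INTEGER_IDX` int(11) NOT NULL,
  PRIMARY KEY (`CD_ID`,`COLUMN_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1"""
    for name, charset in column_charsets(ddl).items():
        print(name, charset)
```

After the ALTER statements have been applied, rerunning show create table and feeding the new output to the same function should report utf8 for COMMENT, while columns that were not modified keep their previous character set.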

 
Viewing the table in Hive again, the Chinese comments are now displayed correctly:
hive> desc t_dm_test;
OK
rowtime             	string              	          
inserttime          	string              	    
remark              	string