HiveServer2 HA mode deployment test


Configure ZooKeeper and start multiple HiveServer instances


The ZooKeeper-related HA settings in hive-site.xml are as follows.
    
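    <property>
      <name>hive.server2.transport.mode</name>
      <value>binary</value>
    </property>

    <property>
      <name>hive.server2.zookeeper.namespace</name>
      <value>hiveserver2-lsm</value>
    </property>

    <property>
      <name>hive.vectorized.execution.enabled</name>
      <value>true</value>
    </property>

    <property>
      <name>hive.zookeeper.quorum</name>
      <value>hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181</value>
    </property>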
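As a quick sanity check, the settings above can be read back programmatically. The following is a minimal sketch, assuming the four properties sit in a standard hive-site.xml `<configuration>` document; the embedded XML string, the helper names `read_properties` and `zookeeper_quorum`, and the fallback port 2181 are illustrative assumptions, not part of the original setup.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml fragment carrying the four settings from this post.
HIVE_SITE = """
<configuration>
  <property>
    <name>hive.server2.transport.mode</name>
    <value>binary</value>
  </property>
  <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>hiveserver2-lsm</value>
  </property>
  <property>
    <name>hive.vectorized.execution.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the document."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def zookeeper_quorum(props, default_port=2181):
    """Split hive.zookeeper.quorum into (host, port) pairs.

    default_port is an assumption used only when an entry has no port.
    """
    pairs = []
    for entry in props["hive.zookeeper.quorum"].split(","):
        host, _, port = entry.strip().partition(":")
        pairs.append((host, int(port) if port else default_port))
    return pairs


props = read_properties(HIVE_SITE)
print(props["hive.server2.zookeeper.namespace"])
print(zookeeper_quorum(props))
```

Every HiveServer instance that should join the same HA group needs the same namespace and quorum values, so a check like this can be run against each host's configuration before starting the processes.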
    

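First, start the MetaStore and HiveServer processes on host 1. Then start a HiveServer process on another machine to form a simple HA cluster. The startup log shows the ZooKeeper client connections: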
2017-06-27T10:31:18,263  INFO [main] zookeeper.ZooKeeper: Client environment:user.dir=/home/hzlishuming/env/apache-hive-2.1.1-bin
2017-06-27T10:31:18,264  INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181 sessionTimeout=1200000 watcher=org.apache.curator.ConnectionState@3f9b7fe1
2017-06-27T10:31:18,264  INFO [Thread-11] zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@47b17830
2017-06-27T10:31:18,295  INFO [main-SendThread(hadoop711.lt.163.org:2181)] zookeeper.Login: successfully logged in.
2017-06-27T10:31:18,296  INFO [Thread-12] zookeeper.Login: TGT refresh thread started.
2017-06-27T10:31:18,297  INFO [Thread-11-SendThread(hadoop710.lt.163.org:2181)] zookeeper.Login: successfully logged in.
2017-06-27T10:31:18,298  INFO [Thread-13] zookeeper.Login: TGT refresh thread started.
2017-06-27T10:31:18,301  INFO [main-SendThread(hadoop711.lt.163.org:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2017-06-27T10:31:18,301  INFO [Thread-11-SendThread(hadoop710.lt.163.org:2181)] client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism.
2017-06-27T10:31:18,304  INFO [Thread-11-SendThread(hadoop710.lt.163.org:2181)] zookeeper.ClientCnxn: Opening socket connection to server hadoop710.lt.163.org/10.120.219.54:2181. Will attempt to SASL-authenticate using Login Context section 'HiveZooKeeperClient'
2017-06-27T10:31:18,304  INFO [main-SendThread(hadoop711.lt.163.org:2181)] zookeeper.ClientCnxn: Opening socket connection to server hadoop711.lt.163.org/10.120.219.55:2181. Will attempt to SASL-authenticate using Login Context section 'HiveZooKeeperClient'
2017-06-27T10:31:18,304  INFO [Thread-11-SendThread(hadoop710.lt.163.org:2181)] zookeeper.ClientCnxn: Socket connection established to hadoop710.lt.163.org/10.120.219.54:2181, initiating session
2017-06-27T10:31:18,304  INFO [main-SendThread(hadoop711.lt.163.org:2181)] zookeeper.ClientCnxn: Socket connection established to hadoop711.lt.163.org/10.120.219.55:2181, initiating session

Connecting with Beeline


./bin/beeline
beeline> !connect jdbc:hive2://hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-lsm hadoop ""
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hzlishuming/env/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/ndp/0.1.0/yarn_client/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-lsm
17/06/27 14:04:09 [main]: INFO jdbc.HiveConnection: Connected to hadoop692.lt.163.org:10000
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
17/06/27 14:04:09 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop710.lt.163.org:2181,had> show databases;
+--------------------+--+
|   database_name    |
+--------------------+--+
| default            |
| hive_examples      |
| tpcds_data_test_2  |
+--------------------+--+
3 rows selected (2.017 seconds)
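The connection string used above follows a fixed pattern: the ZooKeeper quorum takes the place of a single HiveServer host, and two session variables select discovery mode and namespace. The sketch below builds and parses such a URL. The function names are hypothetical, and the parser deliberately ignores the optional `?` and `#` sections of Hive JDBC URLs, so treat it as an illustration rather than a full parser.

```python
def build_discovery_url(quorum, namespace, database=""):
    """Build a ZooKeeper service-discovery JDBC URL from (host, port) pairs."""
    hosts = ",".join(f"{host}:{port}" for host, port in quorum)
    return (
        f"jdbc:hive2://{hosts}/{database}"
        f";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


def parse_discovery_url(url, default_port=2181):
    """Split a discovery URL into (quorum, database, session variables).

    Simplification: Hive conf ('?') and Hive variable ('#') sections are
    not handled. default_port is an assumption for entries without a port.
    """
    prefix = "jdbc:hive2://"
    if not url.startswith(prefix):
        raise ValueError("not a jdbc:hive2 URL")
    authority, _, rest = url[len(prefix):].partition("/")
    database, *params = rest.split(";")
    session_vars = dict(p.split("=", 1) for p in params if p)
    quorum = []
    for entry in authority.split(","):
        host, _, port = entry.partition(":")
        quorum.append((host, int(port) if port else default_port))
    return quorum, database, session_vars


QUORUM = [
    ("hadoop710.lt.163.org", 2181),
    ("hadoop711.lt.163.org", 2181),
    ("hadoop712.lt.163.org", 2181),
]
url = build_discovery_url(QUORUM, "hiveserver2-lsm")
print(url)
print(parse_discovery_url(url))
```

Note that the hosts in the URL are ZooKeeper nodes on port 2181, not HiveServer instances; the actual server (hadoop692.lt.163.org:10000 in the session above) is resolved at connect time.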

ZooKeeper state


The cluster now consists of two machines. Inspect the state registered in ZooKeeper:
[zk: hadoop712.lt.163.org(CONNECTED) 1] ls /hiveserver2-lsm

[serverUri=hadoop691.lt.163.org:10000;version=2.1.1;sequence=0000000003, serverUri=hadoop692.lt.163.org:10000;version=2.1.1;sequence=0000000002]

[zk: hadoop712.lt.163.org(CONNECTED) 2] get /hiveserver2-lsm/serverUri=hadoop691.lt.163.org:10000;version=2.1.1;sequence=0000000003

hive.server2.authentication=KERBEROS;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=hadoop691.lt.163.org;hive.server2.thrift.port=10000;hive.server2.use.SSL=false;hive.server2.authentication.kerberos.principal=hive/[email protected]
cZxid = 0x70001d53d
ctime = Tue Jun 27 11:02:34 CST 2017
mZxid = 0x70001d53d
mtime = Tue Jun 27 11:02:34 CST 2017
pZxid = 0x70001d53d
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x25cc3695ce40f40
dataLength = 296
numChildren = 0
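Both the znode name and its data are semicolon-separated `key=value` lists, so a client can recover the server address from either. A minimal sketch of that parsing, using the values shown above; the helper names are hypothetical, and the Kerberos principal is left out of the sample data because it is garbled in the output above.

```python
def parse_kv(text, sep=";"):
    """Parse 'k1=v1;k2=v2' into a dict, splitting each item on the first '='."""
    return dict(item.split("=", 1) for item in text.split(sep) if item)


def server_from_znode_name(name):
    """Extract (host, port, version, sequence) from a registered znode name."""
    fields = parse_kv(name)
    host, _, port = fields["serverUri"].rpartition(":")
    return host, int(port), fields["version"], int(fields["sequence"])


ZNODE_NAME = "serverUri=hadoop691.lt.163.org:10000;version=2.1.1;sequence=0000000003"

# Subset of the znode data shown above (principal omitted).
ZNODE_DATA = (
    "hive.server2.authentication=KERBEROS;hive.server2.transport.mode=binary;"
    "hive.server2.thrift.sasl.qop=auth;"
    "hive.server2.thrift.bind.host=hadoop691.lt.163.org;"
    "hive.server2.thrift.port=10000;hive.server2.use.SSL=false"
)

print(server_from_znode_name(ZNODE_NAME))
conf = parse_kv(ZNODE_DATA)
print(conf["hive.server2.thrift.bind.host"], conf["hive.server2.thrift.port"])
```

The non-zero `ephemeralOwner` in the output above indicates an ephemeral znode, which is why the entry disappears once the owning HiveServer's ZooKeeper session ends.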

High-availability simulation


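The current session is connected to hadoop692.lt.163.org. Kill the hiveserver process on that machine and observe how the current session and sessions created afterwards are affected.
In the current session, the next statement fails with the following error.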
0: jdbc:hive2://hadoop710.lt.163.org:2181,had> show databases;
Unexpected end of file when reading from HS2 server. The root cause might be too many concurrent connections. Please ask the administrator to check the number of active connections, and adjust hive.server2.thrift.max.worker.threads if applicable.
Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)

Creating a new session connects successfully to the other host.
hzlishuming@hadoop691:~/env/hive$ ./bin/beeline
Beeline version 2.1.1 by Apache Hive
beeline> !connect jdbc:hive2://hadoop710.lt.163.org:2181,hadoop711.lt.163.org:2181,hadoop712.lt.163.org:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-lsm hadoop ""
17/06/28 10:06:30 [main]: INFO jdbc.HiveConnection: Connected to hadoop691.lt.163.org:10000
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
17/06/28 10:06:30 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop710.lt.163.org:2181,had> show databases;
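The test above shows that an open session is not moved to another server; the client has to open a new session itself. One way an application might handle this is a small reconnect-and-retry wrapper. The sketch below simulates it entirely in memory: `TransportError`, `connect`, `run_with_reconnect`, and the `alive` table are hypothetical stand-ins, not Hive or Thrift APIs.

```python
class TransportError(Exception):
    """Stand-in for the transport failure reported by the broken session."""


# Simulated liveness of the two HiveServer instances from this test.
alive = {
    "hadoop691.lt.163.org:10000": True,
    "hadoop692.lt.163.org:10000": True,
}


def connect(preferred=None):
    """Open a simulated session on a live server and return an execute() callable."""
    candidates = [server for server, up in alive.items() if up]
    if not candidates:
        raise TransportError("no live HiveServer instance")
    server = preferred if preferred in candidates else candidates[0]

    def execute(statement):
        # The session stays bound to its server; it fails if that server died.
        if not alive[server]:
            raise TransportError(f"connection to {server} lost")
        return f"{statement} ran on {server}"

    return execute


def run_with_reconnect(session, statement, reconnect, max_attempts=2):
    """Run statement; on a transport failure open a new session and retry."""
    for attempt in range(max_attempts):
        try:
            return session(statement)
        except TransportError:
            if attempt == max_attempts - 1:
                raise
            session = reconnect()


session = connect("hadoop692.lt.163.org:10000")
alive["hadoop692.lt.163.org:10000"] = False  # simulate killing the process
result = run_with_reconnect(session, "show databases", connect)
print(result)
```

A blind retry like this is only reasonable for statements that are safe to repeat, and any state held by the lost session would have to be recreated on the new one.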

Summary

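  • HiveServer2 HA mode is relatively simple to deploy. It provides high availability at the routing level through load balancing: ZooKeeper stores the host and port of each node, and when a session is created the client picks one of the registered hosts at random and connects to it.
  • When a machine goes down, its node is removed from ZooKeeper, so new sessions are routed only to the remaining servers. Sessions already open on the failed server are not migrated; as the test shows, they fail and the client must reconnect.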
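The routing behaviour summarized above can be sketched with an in-memory stand-in for the ZooKeeper namespace. This is a simplified model under stated assumptions: the `Registry` class and its methods are hypothetical, and a real client reads the children of the namespace from ZooKeeper instead of a Python set.

```python
import random


class Registry:
    """In-memory stand-in for the children of the ZooKeeper namespace."""

    def __init__(self, servers):
        self.servers = set(servers)

    def remove(self, server):
        """Model the znode disappearing when its HiveServer goes down."""
        self.servers.discard(server)

    def pick(self, rng=random):
        """Choose one registered server at random for a new session."""
        if not self.servers:
            raise RuntimeError("no HiveServer2 instance registered")
        return rng.choice(sorted(self.servers))


registry = Registry([
    "hadoop691.lt.163.org:10000",
    "hadoop692.lt.163.org:10000",
])

rng = random.Random(0)  # fixed seed only to make the demo repeatable
print("before failure:", {registry.pick(rng) for _ in range(50)})

registry.remove("hadoop692.lt.163.org:10000")
print("after failure: ", {registry.pick(rng) for _ in range(50)})
```

Because the choice is random per session, load spreads across servers only statistically, and a removed server stops receiving new sessions as soon as its entry is gone.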
