Hama Study Notes (5): An Example of a ZooKeeper Configuration Problem

With the Hama cluster started and running normally, I ran one of the bundled examples:
$ bin/hama jar hama-examples-0.6.0.jar pi

The job failed with an error: the tasks could not connect to ZooKeeper.
13/03/21 01:37:41 INFO bsp.BSPJobClient: Running job: job_201303210137_0001
13/03/21 01:37:44 INFO bsp.BSPJobClient: Current supersteps number: 0
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO sync.ZKSyncClient: Initializing ZK Sync Client
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At iir455-199/10.77.30.199:61002
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZooKeeperSyncClientImpl: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/peers
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:262)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:270)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:250)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:338)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:169)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:282)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:270)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:250)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:338)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:169)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: Starting SocketReader
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server Responder: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server listener on 61002: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO message.HadoopMessageManagerImpl:  BSPPeer address:iir455-199 port:61002
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server handler 0 on 61002: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:262)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0: 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:282)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:99)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 FATAL bsp.GroomServer: SyncError from child
attempt_201303210137_0001_000007_0: org.apache.hama.bsp.sync.SyncException
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:137)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0: 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
13/03/21 01:37:47 INFO bsp.BSPJobClient: Job failed.

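The ZooKeeper in use is not the instance bundled with Hama but a separate three-node cluster. Its logs (usually under the home directory of the user who started ZooKeeper) looked normal, so I tested ZooKeeper directly: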
$ bin/zkCli.sh -server ***:2181

No problems there either.
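The zkCli.sh check above confirms only that ZooKeeper answers on the port passed to it. As a rough sketch, assuming plain TCP reachability is a good enough signal, the same kind of check can be scripted for several candidate ports at once. The function names and the timeout value are illustrative choices, not part of Hama or ZooKeeper.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe_ports(host, ports, timeout=2.0):
    """Return {port: bool} telling which of the candidate ports accept connections."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling `probe_ports` with one of the ZooKeeper hosts and the ports a client might be using would show which of them actually has a listener. It does not prove that the listener is ZooKeeper, only that something accepts connections there.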
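The problem turned up while I was going through Hama's configuration files. Hama has two XML configuration files, hama-site.xml and hama-default.xml, and settings in the former override the defaults in the latter.
My hama-site.xml set only hama.zookeeper.quorum and did not configure the ZooKeeper port. I had assumed Hama's default would match ZooKeeper's own default, but Hama's default ZooKeeper client port is actually 21810, not 2181. So I added the following to hama-site.xml: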
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
    <description>Property from ZooKeeper's config zoo.cfg.
      The port at which the clients will connect.
    </description>
  </property>
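To make the override behaviour concrete, here is a minimal sketch of how a site file layered over a default file yields the effective port. It assumes the usual Hadoop-style `<configuration>`/`<property>` layout and ignores details such as `final` properties. The XML strings are cut-down stand-ins rather than the real hama-default.xml and hama-site.xml, the quorum host names are placeholders, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Layer the site file over the defaults: site values win on conflicts."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged


# Stand-in for hama-default.xml: only the default port mentioned in this note.
DEFAULT_XML = """<configuration>
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>21810</value>
  </property>
</configuration>"""

# Stand-in for hama-site.xml before the fix: quorum only (placeholder hosts).
SITE_BEFORE = """<configuration>
  <property>
    <name>hama.zookeeper.quorum</name>
    <value>zk1.example,zk2.example,zk3.example</value>
  </property>
</configuration>"""

# Stand-in for hama-site.xml after the fix: the port is set explicitly.
SITE_AFTER = SITE_BEFORE.replace(
    "</configuration>",
    """  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>""",
)

KEY = "hama.zookeeper.property.clientPort"
print(effective_config(DEFAULT_XML, SITE_BEFORE)[KEY])  # prints 21810
print(effective_config(DEFAULT_XML, SITE_AFTER)[KEY])   # prints 2181
```

With only the quorum set, the effective port falls back to 21810, which nothing in the external ZooKeeper cluster listens on. Once the property is added, the site value takes over.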

After restarting Hama and running the pi example again, the job succeeded:
[iir@iir455-200 hama-0.6.0]$ bin/hama jar hama-examples-0.6.0.jar pi
13/03/21 01:47:41 INFO bsp.BSPJobClient: Running job: job_201303210147_0001
13/03/21 01:47:44 INFO bsp.BSPJobClient: Current supersteps number: 0
13/03/21 01:47:50 INFO bsp.BSPJobClient: Current supersteps number: 1
13/03/21 01:47:50 INFO bsp.BSPJobClient: The total number of supersteps: 1
13/03/21 01:47:50 INFO bsp.BSPJobClient: Counters: 6
13/03/21 01:47:50 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
13/03/21 01:47:50 INFO bsp.BSPJobClient:     SUPERSTEPS=1
13/03/21 01:47:50 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
13/03/21 01:47:50 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=7313
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=21
Estimated value of PI is	3.1463428571428564
Job Finished in 10.379 seconds