spark on yarnで発生した問題(一)
6141 ワード
テストspark on yarn
sparkバージョン:spark-0.9.0-incubating-bin-hadoop 2
WordCount.scalaコード:
このプログラムを使用してspark on yarnテスト中に、次の異常を報告しました.
異常エラーにより、8030というポートに接続できないことが判明しました.このポートはRMのスケジューリングポートで、yarn-siteです.xmlでも構成されています.
しかし、つながっていないので、この問題に対応する解決策は:
yarn-site.xmlに以下の構成を追加し、RMのhostnameを明確に指定します.
これで接続できます.具体的な理由はsparkのソースコードを読まなければなりません.
sparkバージョン:spark-0.9.0-incubating-bin-hadoop 2
WordCount.scalaコード:
import org.apache.spark._
import SparkContext._
object WordCount {
def main(args: Array[String]) {
if (args.length != 3 ){
println("usage is org.test.WordCount <master> <input> <output>")
return
}
val sc = new SparkContext(args(0), "WordCount", System.getenv("SPARK_HOME"), Seq(System.getenv("SPARK_TEST_JAR")))
val textFile = sc.textFile(args(1))
val result = textFile.flatMap(line => line.split("\\s+")).map(word => (word, 1)).reduceByKey(_ + _)
result.saveAsTextFile(args(2))
}
}
このプログラムを使用してspark on yarnテスト中に、次の異常を報告しました.
14/03/05 15:57:48 DEBUG ipc.Client: The ping interval is 60000 ms.
14/03/05 15:57:48 DEBUG ipc.Client: Connecting to 0.0.0.0/0.0.0.0:8030
14/03/05 15:57:49 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:50 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:52 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:53 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:57 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/03/05 15:57:58 DEBUG ipc.Client: closing ipc connection to 0.0.0.0/0.0.0.0:8030:
java.net.ConnectException:
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
at org.apache.hadoop.ipc.Client.call(Client.java:1318)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy11.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:197)
at org.apache.spark.deploy.yarn.ApplicationMaster.registerApplicationMaster(ApplicationMaster.scala:138)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:102)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:429)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
14/03/05 15:57:58 DEBUG ipc.Client: IPC Client (1553281132) connection to 0.0.0.0/0.0.0.0:8030 from hadoop: closed
異常エラーにより、8030というポートに接続できないことが判明しました.このポートはRMのスケジューリングポートで、yarn-siteです.xmlでも構成されています.
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>master2:8030</value>
</property>
しかし、つながっていないので、この問題に対応する解決策は:
yarn-site.xmlに以下の構成を追加し、RMのhostnameを明確に指定します.
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master1</value>
</property>
これで接続できます.具体的な理由はsparkのソースコードを読まなければなりません.