Compiling Spark from Source


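Preface: Because production environments and real business requirements are complex, it is sometimes unavoidable to modify the Spark source, recompile it, test it, and then deploy the result to production. This article walks through recompiling the spark-2.2.1 source on Linux (CentOS 6.5) and the pitfalls encountered while setting up the build environment.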

1. Downloading the source

git clone https://github.com/apache/spark.git -b v2.2.1

When the command finishes, the Spark source has been downloaded to the /home/${user_name}/spark directory.
Alternatively, download the source tarball:
wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1.tgz

2. Compiling the source

./build/mvn -Phadoop-2.7 -Pyarn -Dhadoop.version=2.7.3 -Phive -Phive-thriftserver clean package -Dmaven.test.skip=true

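Parameter notes:
-Phadoop-2.7: selects the Hadoop build profile; without it the build defaults to Hadoop 2.6.5.
-Dhadoop.version: sets the exact Hadoop version to build against (2.7.3 here).
-Pyarn: enables Hadoop YARN support.
-Phive: enables Hive support in Spark SQL; the default Hive version is 1.2.1.
-Phive-thriftserver: used together with -Phive; adds the Thrift JDBC server.
-Dmaven.test.skip=true: skips running the test cases and also skips compiling the test classes.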
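
The flags can be gathered in a small wrapper so that the Hadoop version and the optional Hive profiles are set in one place. This is only a minimal sketch and not part of Spark: the function name spark_mvn_args and the variables HADOOP_PROFILE, HADOOP_VERSION and WITH_HIVE are hypothetical; only the Maven flags themselves come from the build command used in this article.

```shell
#!/usr/bin/env bash
# Sketch: assemble the Maven arguments used in this article.
# HADOOP_PROFILE, HADOOP_VERSION and WITH_HIVE are illustrative names,
# not settings that Spark itself reads.
spark_mvn_args() {
  local hadoop_profile="${HADOOP_PROFILE:-hadoop-2.7}"
  local hadoop_version="${HADOOP_VERSION:-2.7.3}"
  local args="-P${hadoop_profile} -Pyarn -Dhadoop.version=${hadoop_version}"
  # Hive support and the Thrift server are switched on together, as in the article.
  if [ "${WITH_HIVE:-yes}" = "yes" ]; then
    args="${args} -Phive -Phive-thriftserver"
  fi
  echo "${args} clean package -Dmaven.test.skip=true"
}

# Usage (run from the Spark source root):
#   ./build/mvn $(spark_mvn_args)
```

With the defaults, the function prints exactly the argument list used in the build command of this article; setting WITH_HIVE=no drops the two Hive profiles.
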
[Pitfall 1] SSL connect error. The error output is as follows:
[root@cbas-virt-20 spark]# ./build/mvn -Phadoop-2.7 -Pyarn -Dhadoop.version=2.7.3 -Phive -Phive-thriftserver clean package -Dmaven.test.skip=true
exec: curl --progress-bar -L https://downloads.typesafe.com/zinc/0.3.15/zinc-0.3.15.tgz
curl: (35) SSL connect error

gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
exec: curl --progress-bar -L https://downloads.typesafe.com/scala/2.11.8/scala-2.11.8.tgz
curl: (35) SSL connect error

gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
./build/mvn: line 119: cd: /root/spark/build/scala-2.11.8/bin/../lib: No such file or directory
./build/mvn: line 120: cd: /root/spark/build/scala-2.11.8/bin/../lib: No such file or directory
./build/mvn: line 143: /root/spark/build/zinc-0.3.15/bin/zinc: No such file or directory
./build/mvn: line 145: /root/spark/build/zinc-0.3.15/bin/zinc: No such file or directory
Using `mvn` from path: /root/spark/build/apache-maven-3.3.9/bin/mvn

Error analysis: when ./build/mvn compiles the source, it fails to download zinc-0.3.15 and scala-2.11.8 because the build server cannot establish an SSL connection to https://downloads.typesafe.com. Verification (downloading with wget):
[root@cbas-virt-20 ~]# wget https://downloads.typesafe.com/zinc/0.3.15/zinc-0.3.15.tgz
--2018-04-24 13:46:02--  https://downloads.typesafe.com/zinc/0.3.15/zinc-0.3.15.tgz
Resolving downloads.typesafe.com... 54.230.129.25, 54.230.129.53, 54.230.129.138, ...
Connecting to downloads.typesafe.com|54.230.129.25|:443... connected.
Unable to establish SSL connection.

Solution: open ./build/mvn and change the corresponding URLs.
Change the downloads to:
curl --progress-bar -L http://downloads.typesafe.com/scala/2.11.8/scala-2.11.8.tgz
curl --progress-bar -L http://downloads.typesafe.com/zinc/0.3.15/zinc-0.3.15.tgz

or to:
curl --progress-bar -L http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz
curl --progress-bar -L http://downloads.lightbend.com/zinc/0.3.15/zinc-0.3.15.tgz
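
Instead of editing by hand, the host can be swapped with sed. The following is a minimal sketch under one assumption: the target file contains the literal string https://downloads.typesafe.com. The function name switch_download_host is hypothetical, and the .bak copy is just a convenience for comparing the result.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the download host inside a script, keeping a backup copy.
# Assumes the file contains the literal string https://downloads.typesafe.com.
switch_download_host() {
  local script="$1"
  local new_base="${2:-http://downloads.lightbend.com}"
  cp "${script}" "${script}.bak"   # keep the original for comparison
  # Read from the backup and overwrite the script in place (keeps its permissions).
  sed "s#https://downloads.typesafe.com#${new_base}#g" "${script}.bak" > "${script}"
}

# Usage (run from the Spark source root):
#   switch_download_host build/mvn
#   diff build/mvn.bak build/mvn
```

Passing http://downloads.typesafe.com as the second argument gives the plain-HTTP variant instead.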

[Pitfall 2] Invalid source release. The error output is as follows:
[INFO] Using zinc server for incremental compilation
[info] 'compiler-interface' not yet compiled for Scala 2.11.8. Compiling...
[info]   Compilation completed in 12.407 s
[warn] Pruning sources from previous analysis, due to incompatible CompileSetup.
[info] Compiling 2 Scala sources and 6 Java sources to /root/spark/common/tags/target/scala-2.11/classes...
[error] javac: invalid source release: 1.8
[error] Usage: javac <options> <source files>
[error] use -help for a list of possible options
[error] Compile failed at 2018-4-24 14:48:20 [14.062s]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [03:33 min]
[INFO] Spark Project Tags ................................. FAILURE [ 36.154 s]

Error analysis: the JDK version is wrong. Spark 2.2.1 cannot be compiled with JDK 1.7; both compilation and execution must use JDK 1.8.
Solution: download JDK 1.8 from the official site: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
mv jdk-8u171-linux-x64.tar.gz /usr/java/
cd /usr/java
tar -zxvf jdk-8u171-linux-x64.tar.gz
vim /etc/profile    # add: export JAVA_HOME=/usr/java/jdk1.8.0_171
source /etc/profile
java -version
javac -version

If the JDK on the Linux machine still has not switched to JDK 1.8, do the following.
Check whether the newly installed JDK appears in the system's JDK alternatives:
update-alternatives --config java
update-alternatives --config javac
If it does not, add it:
update-alternatives --install /usr/bin/java java /usr/java/jdk1.8.0_171/bin/java 300
update-alternatives --install /usr/bin/javac javac /usr/java/jdk1.8.0_171/bin/javac 300
Select the new JDK (enter its number at the prompt):
update-alternatives --config java
update-alternatives --config javac
Verify:
java -version
javac -version
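
To catch the wrong JDK before a long build starts, the version check can be scripted. This is a minimal sketch, not something Spark ships: the function names parse_java_version and is_jdk8 are hypothetical, and the parser assumes the usual quoted form of the version line, for example java version "1.8.0_171".

```shell
#!/usr/bin/env bash
# Sketch: fail fast when the active JDK is not 1.8.

# Read `java -version`-style output on stdin and print the quoted
# version string from the first matching line, e.g. 1.8.0_171.
parse_java_version() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Return 0 when the given version string belongs to JDK 1.8.
is_jdk8() {
  case "$1" in
    1.8.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Usage (java -version writes to stderr, hence 2>&1):
#   v="$(java -version 2>&1 | parse_java_version)"
#   is_jdk8 "$v" || echo "JDK 1.8 required, found: $v" >&2
```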

3. Confirming a successful build

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  4.242 s]
[INFO] Spark Project Tags ................................. SUCCESS [  3.027 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  9.362 s]
[INFO] Spark Project Local DB ............................. SUCCESS [  3.107 s]
[INFO] Spark Project Networking ........................... SUCCESS [  6.164 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 10.022 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [  3.123 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  5.299 s]
[INFO] Spark Project Core ................................. SUCCESS [01:31 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [02:34 min]
[INFO] Spark Project GraphX ............................... SUCCESS [ 23.556 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 31.368 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [03:23 min]
[INFO] Spark Project SQL .................................. SUCCESS [05:03 min]
[INFO] Spark Project ML Library ........................... SUCCESS [01:35 min]
[INFO] Spark Project Tools ................................ SUCCESS [  9.692 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:02 min]
[INFO] Spark Project REPL ................................. SUCCESS [  5.806 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 12.565 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 32.809 s]
[INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 24.966 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  4.969 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [01:00 min]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [ 15.672 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 25.670 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  5.885 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 20:10 min
[INFO] Finished at: 2018-04-24T19:37:26+08:00
[INFO] Final Memory: 85M/1177M
[INFO] ------------------------------------------------------------------------
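
If the build output is saved to a file, the check can be automated. The sketch below assumes the log was captured to a file; the function name check_build_log and the file name build.log are hypothetical. It prints any reactor lines marked FAILURE or SKIPPED and succeeds only when the log contains BUILD SUCCESS.

```shell
#!/usr/bin/env bash
# Sketch: summarize a saved Maven build log.
check_build_log() {
  local log="$1"
  # Show reactor lines such as:
  #   [INFO] Spark Project Tags ......... FAILURE [ 36.154 s]
  # "|| true" keeps the function going when no module failed.
  grep -E '\.\.\. (FAILURE|SKIPPED)' "${log}" || true
  # The exit status of the function is that of this final check.
  grep -q 'BUILD SUCCESS' "${log}"
}

# Usage (run from the Spark source root):
#   ./build/mvn -Phadoop-2.7 -Pyarn -Dhadoop.version=2.7.3 -Phive -Phive-thriftserver \
#     clean package -Dmaven.test.skip=true 2>&1 | tee build.log
#   check_build_log build.log && echo "build OK"
```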

Blog home page: https://www.jianshu.com/u/e97bb429f278