Sqoop2: MySQL -> HDFS & MySQL -> Kafka
I. MySQL to HDFS
1. Start sqoop2-shell and configure the Sqoop server parameters
2. List all supported connectors
3. Create the MySQL link
4. Create the HDFS link
5. Create the job
6. Run the job
II. MySQL to Kafka
1. Create the MySQL link
2. Create the Kafka link
3. Create the job
4. Run the job
I. MySQL to HDFS
1. Start sqoop2-shell and configure the Sqoop server parameters
[sivan@sivan ~]$ sqoop2-shell
Setting conf dir: /home/sivan/software/sqoop/sqoop-1.99.7-bin-hadoop200/bin/../conf
Sqoop home directory: /home/sivan/software/sqoop/sqoop-1.99.7-bin-hadoop200
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000> set server --host localhost --port 12000 --webapp sqoop
Server is set successfully
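To confirm that the shell can actually reach the server (rather than merely having stored the parameters), query the server version; this round-trips to the server and fails immediately if the host or port is wrong:

# verify client-server connectivity
sqoop:000> show version --all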
2. List all supported connectors
The Supported Directions column shows whether a connector can act as a data source (FROM), a target (TO), or both. Note that kafka-connector supports TO only, so in section II Kafka can appear only on the destination side of a job.
sqoop:000> show connector
+------------------------+---------+------------------------------------------------------------+----------------------+
| Name | Version | Class | Supported Directions |
+------------------------+---------+------------------------------------------------------------+----------------------+
| generic-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO |
| kite-connector | 1.99.7 | org.apache.sqoop.connector.kite.KiteConnector | FROM/TO |
| oracle-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO |
| ftp-connector | 1.99.7 | org.apache.sqoop.connector.ftp.FtpConnector | TO |
| hdfs-connector | 1.99.7 | org.apache.sqoop.connector.hdfs.HdfsConnector | FROM/TO |
| kafka-connector | 1.99.7 | org.apache.sqoop.connector.kafka.KafkaConnector | TO |
| sftp-connector | 1.99.7 | org.apache.sqoop.connector.sftp.SftpConnector | TO |
+------------------------+---------+------------------------------------------------------------+----------------------+
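Before filling in the prompts that follow, it can help to see exactly which inputs each connector expects. The same shell can print every link and job configuration input per connector:

# list each connector together with all of its link/job inputs
sqoop:000> show connector --all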
3. Create the MySQL link
Note: the MySQL JDBC driver jar must already be on the Sqoop server's classpath, otherwise the driver class entered below cannot be loaded.
sqoop:000> create link -connector generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: mysql_link
Database connection
Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://172.19.225.156:3306/test
Username: root
Password: ****
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry#
SQL Dialect
Identifier enclose: #
New link was successfully created with validation status OK and name mysql_link
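The job created later reads from a table named user in the test schema. The article never shows its definition, so purely as a hypothetical illustration, a minimal source table with an integer primary key (convenient as the partition column later) could be prepared from the OS shell like this:

# hypothetical schema; the real test.user table may look different
[sivan@sivan ~]$ mysql -u root -p test -e "
CREATE TABLE IF NOT EXISTS user (id INT PRIMARY KEY, name VARCHAR(64));
INSERT INTO user VALUES (1,'alice'),(2,'bob'),(3,'carol'),(4,'dave');"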
4. Create the HDFS link
sqoop:000> create link -connector hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: hdfs_link
HDFS cluster
URI: hdfs://172.19.225.156:9000
Conf directory: /home/sivan/bigdata/softwares/hadoop/etc/hadoop
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name hdfs_link
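Both links now exist. A quick sanity check before wiring them into a job is to list them; the output shows each link's name, connector, and whether it is enabled:

sqoop:000> show link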
5. Create the job
sqoop:000> create job -f "mysql_link" -t "hdfs_link"
Creating job for links with from name mysql_link and to name hdfs_link
Please fill following values to create new job object
Name: mysql_to_hdfs
Database source
Schema name: test
Table name: user
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:
Incremental read
Check column:
Last value:
Target configuration
Override null value:
Null value:
File format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
2 : PARQUET_FILE
Choose: 0
Compression codec:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom codec:
Output directory: hdfs://172.19.225.156:9000/user/sivan/sqoop
Append mode:
Throttling resources
Extractors:
Loaders:
Classpath configuration
Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name mysql_to_hdfs
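The job definition can likewise be reviewed before it is started; show job lists each job together with its FROM and TO links:

sqoop:000> show job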
6. Run the job
# start the job
sqoop:000> start job -n mysql_to_hdfs
# check the job status
sqoop:000> status job -n mysql_to_hdfs
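Once the status reports SUCCEEDED, the result can be inspected from the OS shell. Assuming a configured Hadoop client and the output directory chosen above, the generated text files can be listed and printed like this:

[sivan@sivan ~]$ hdfs dfs -ls /user/sivan/sqoop
[sivan@sivan ~]$ hdfs dfs -cat /user/sivan/sqoop/*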
II. MySQL to Kafka
1. Create the MySQL link
The mysql_link created in section I is reused as the source here, so no new link needs to be created.
2. Create the Kafka link
sqoop:000> create link -c kafka-connector
Creating link for connector with name kafka-connector
Please fill following values to create new link object
Name: kafka_link
Kafka cluster
Kafka brokers: sivan:9092       # Kafka broker host:port; separate multiple brokers with commas
Zookeeper quorum: sivan:2181    # ZooKeeper host:port; separate multiple nodes with commas
New link was successfully created with validation status OK and name kafka_link
3. Create the job
sqoop:000> create job -f "mysql_link" -t "kafka_link"
Creating job for links with from name mysql_link and to name kafka_link
Please fill following values to create new job object
Name: mysql_to_kafka
Database source
Schema name: test
Table name: user
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:
Incremental read
Check column:
Last value:
Output configuration
Topic: test    # Kafka topic the rows will be written to
Throttling resources
Extractors:
Loaders:
Classpath configuration
Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name mysql_to_kafka
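Unless the brokers auto-create topics, the target topic should exist before the job starts. With Kafka's bin directory on the PATH and the ZooKeeper quorum configured above, a single-partition test topic could be created like this (replication factor and partition count are illustrative):

[sivan@sivan ~]$ kafka-topics.sh --create --zookeeper sivan:2181 \
    --replication-factor 1 --partitions 1 --topic test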
4. Run the job
sqoop:000> start job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:36 CST: BOOTING - Progress is not available
sqoop:000> status job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:49 CST: RUNNING - 16.67 %
sqoop:000> status job -n mysql_to_kafka
Submission details
Job Name: mysql_to_kafka
Server URL: http://localhost:12000/sqoop/
Created by: sivan
Creation date: 2019-01-02 16:25:36 CST
Lastly updated by: sivan
External ID: job_1546391357312_0009
http://sivan:8088/proxy/application_1546391357312_0009/
2019-01-02 16:25:53 CST: SUCCEEDED
Counters:
org.apache.hadoop.mapreduce.FileSystemCounter
FILE_LARGE_READ_OPS: 0
FILE_WRITE_OPS: 0
HDFS_READ_OPS: 3
HDFS_BYTES_READ: 424
HDFS_LARGE_READ_OPS: 0
FILE_READ_OPS: 0
FILE_BYTES_WRITTEN: 523026
FILE_BYTES_READ: 0
HDFS_WRITE_OPS: 0
HDFS_BYTES_WRITTEN: 0
org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
BYTES_WRITTEN: 0
org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
BYTES_READ: 0
org.apache.hadoop.mapreduce.JobCounter
TOTAL_LAUNCHED_MAPS: 3
MB_MILLIS_MAPS: 11265024
SLOTS_MILLIS_REDUCES: 0
VCORES_MILLIS_MAPS: 11001
SLOTS_MILLIS_MAPS: 11001
OTHER_LOCAL_MAPS: 3
MILLIS_MAPS: 11001
org.apache.sqoop.submission.counter.SqoopCounters
ROWS_READ: 4
ROWS_WRITTEN: 4
org.apache.hadoop.mapreduce.TaskCounter
SPILLED_RECORDS: 0
MERGED_MAP_OUTPUTS: 0
VIRTUAL_MEMORY_BYTES: 6396149760
MAP_INPUT_RECORDS: 0
SPLIT_RAW_BYTES: 424
MAP_OUTPUT_RECORDS: 4
FAILED_SHUFFLE: 0
PHYSICAL_MEMORY_BYTES: 770457600
GC_TIME_MILLIS: 528
CPU_MILLISECONDS: 7140
COMMITTED_HEAP_BYTES: 521142272
Job executed successfully
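The Sqoop counters (ROWS_READ: 4, ROWS_WRITTEN: 4) show that every source row reached the topic. To see the messages themselves, replay the topic with a console consumer; newer Kafka releases take --bootstrap-server as shown, while very old ones expect --zookeeper instead:

[sivan@sivan ~]$ kafka-console-consumer.sh --bootstrap-server sivan:9092 \
    --topic test --from-beginning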