Working with CassandraRow in spark-shell


Starting spark-shell

cd spark-1.5.0
bin/spark-shell --jars spark-cassandra-connector-assembly-1.5.0-M2-SNAPSHOT.jar --conf spark.cassandra.connection.host=127.0.0.1

Getting a cassandraTable

import com.datastax.spark.connector._     // Imports basic RDD functions
import com.datastax.spark.connector.cql._ // (Optional) Imports Java driver helper functions

val c = CassandraConnector(sc.getConf)

val d = sc.cassandraTable("test_from_spark", "fun")

d.collect.foreach(println)

CassandraRow{k: 1, v: 10}
CassandraRow{k: 2, v: 20}


Selecting columns, getting column data, and filtering with a where clause


d.select("k").where("k = ?", 10).foreach(println)

CassandraRow{k: 10}

d.select("k", "v").where("k = ?", 12).map(row => row.get[Int]("v")).collect
res2: Array[Int] = Array(120)
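Instead of pulling values out of each CassandraRow by hand with get[T], the connector can map rows straight to a case class via the type-parameterized cassandraTable. A minimal sketch, assuming the fun table's columns k and v are both Ints (the case class name is arbitrary):

// Hypothetical case class whose fields match the k/v columns of test_from_spark.fun
case class Fun(k: Int, v: Int)

// cassandraTable[Fun] returns RDD[Fun] instead of RDD[CassandraRow]
val funs = sc.cassandraTable[Fun]("test_from_spark", "fun")
funs.collect.foreach(println)

Field names are matched to column names, so this avoids per-column casts scattered through the code.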


Writing to a Cassandra table

val d = [some RDD]

d.saveToCassandra("keyspace", "table", SomeColumns("col1", "col2", ....))
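As a concrete sketch, an RDD of tuples can be written back to the test_from_spark.fun table used above (assumes the table already exists with columns k and v; tuple elements are matched positionally to the listed columns):

// Build a small RDD of (k, v) pairs and write it to Cassandra
val rows = sc.parallelize(Seq((3, 30), (4, 40)))
rows.saveToCassandra("test_from_spark", "fun", SomeColumns("k", "v"))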


