January 2019
Beginner to intermediate
154 pages
4h 31m
English
Spark SQL also enables users to query directly from different RDBMS data sources. The results of the query are returned as a DataFrame that can be further queried with Spark SQL or joined with other datasets.
To use a JDBC connection, you need to add the JDBC driver JAR for the target database to the Spark classpath.
For example, Spark SQL can connect to MySQL with the following code:
import org.apache.spark.sql.SparkSession

object JDBCMySQL {
  def main(args: Array[String]) {
    // First, create a SparkSession as the entry point of your app
    val spark: SparkSession = SparkSession
      .builder()
      .appName("JDBC-MYSQL")
      .master("local[*]")
      .config("spark.sql.warehouse.dir", "C:/Spark")
      .getOrCreate()

    val dataframe_mysql ...
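As a minimal sketch of the connection settings a JDBC read of this kind typically needs, the helper below assembles a MySQL JDBC URL and the option keys that Spark's JDBC data source recognizes (`url`, `dbtable`, `user`, `password`, `driver`). The object name `MySqlJdbcOptions`, its methods, and every host, database, table, and credential value are hypothetical placeholders chosen for illustration, not taken from the listing above. The driver class shown assumes MySQL Connector/J 5.x.

```scala
// Hypothetical helper: all names and values here are illustrative assumptions.
object MySqlJdbcOptions {

  // Builds a MySQL JDBC URL of the form jdbc:mysql://host:port/database.
  def jdbcUrl(host: String, port: Int, database: String): String =
    s"jdbc:mysql://$host:$port/$database"

  // Collects the option keys understood by Spark's JDBC data source.
  def options(
      url: String,
      table: String,
      user: String,
      password: String
  ): Map[String, String] =
    Map(
      "url"      -> url,
      "dbtable"  -> table,
      "user"     -> user,
      "password" -> password,
      // Assumes Connector/J 5.x; Connector/J 8.x uses com.mysql.cj.jdbc.Driver.
      "driver"   -> "com.mysql.jdbc.Driver"
    )
}
```

With a SparkSession named `spark` in scope and the driver JAR on the classpath, a map built this way could be passed to `spark.read.format("jdbc").options(opts).load()`, which returns the table as a DataFrame. Keeping the settings in a plain map makes them easy to inspect or swap per environment before any connection is attempted.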