Create Spark Dataset from MySQL table

Views: 145   

Sample Program :

SparkSession sparkSession = SparkSession.builder().appName("mysqltable").master("local[2]").getOrCreate();

Map<String, String> jdbcOptionsMap = new HashMap<String, String>(); 
String jdbcUrl = "jdbc:mysql://"+hostname+":"+portNo+"/"+databasename; 

      jdbcOptionsMap.put("url", jdbcUrl);  
      jdbcOptionsMap.put("driver", <driver>);
      jdbcOptionsMap.put("dbtable", <tableName>); 
      jdbcOptionsMap.put("user", <username>);
      jdbcOptionsMap.put("password", <password>);

        System.out.println("Reading From MySQL ");

Dataset<Row> mysqlData = sparkSession.read().format("jdbc").options(jdbcOptionsMap).load();
System.out.println("Column Names : " + Arrays.asList(tableData.columns()));
System.out.println("Table Schema : " + tableData.schema());

Bullet Points:
1.) In above code initically I have created SparkSession to run spark job in standalone mode.
2.) In second step we should provide mysql table details like its URL, driver, tableName, username and password
3.) In third step, we read mysql table in jdbc format and load it into mysqlData Dataset using load command.
4.) We can know the table column names using columns() method
5.) We can view the table schema using schema() method

On By