Spark SQL Using Hive
In this blog I’m going to describe how to integrate Hive with Spark. You may find this code on Spark’s official GitHub page; my aim here is to explain each step of the code.
For a Spark word count example please follow my previous blog, and for Spark SQL you can go through the SparkSQL blog. The basic configuration is similar to the Spark word count example, i.e. a SparkConf and a SparkContext. The only difference is the creation of a HiveContext. Spark provides direct support for Hive tables, and using a HiveContext we can run our SQL queries against them.
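For reference, the familiar configuration from the word count example boils down to a SparkConf and a SparkContext (the app name and `local[*]` master here match the complete code at the end of this post):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Standard Spark setup, identical to the word count example.
val sparkconf = new SparkConf()
  .setMaster("local[*]")          // run locally using all available cores
  .setAppName("Spark SQL Test")
val sc = new SparkContext(sparkconf)
```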
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
Using this context you can query existing Hive tables, or you can create a new table and load data into it:
sqlContext.sql("CREATE TABLE IF NOT EXISTS employee(id INT, name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")
sqlContext.sql("LOAD DATA LOCAL INPATH 'src/data/employee.txt' INTO TABLE employee")
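The table definition expects comma-separated fields with one record per line, so `src/data/employee.txt` might look like this (the rows below are illustrative, not from the original data set):

```
1,John,28
2,Asha,32
3,Miguel,25
```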
The next step is to run your intended query:
val result = sqlContext.sql("SELECT id, name, age FROM employee")
The query above is lazily evaluated; Spark only executes the job when you trigger an action. In this code, result.show() is an example of an action:
result.show()
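show() prints the first rows of the result as an ASCII table. Assuming a hypothetical data file with the rows `1,John,28` and `2,Asha,32`, the output would look roughly like:

```
+---+----+---+
| id|name|age|
+---+----+---+
|  1|John| 28|
|  2|Asha| 32|
+---+----+---+
```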
Complete code:
```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkSqlHiveExample {
  def main(args: Array[String]) {
    val sparkconf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("Spark SQL Test")
    val sc = new SparkContext(sparkconf)

    // HiveContext adds support for Hive tables and HiveQL on top of SQLContext.
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

    // Create the table (if needed) and load the comma-delimited data file.
    sqlContext.sql("CREATE TABLE IF NOT EXISTS employee(id INT, name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")
    sqlContext.sql("LOAD DATA LOCAL INPATH 'src/data/employee.txt' INTO TABLE employee")

    // Lazily define the query, then trigger execution with the show() action.
    val result = sqlContext.sql("SELECT id, name, age FROM employee")
    result.show()
  }
}
```
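Note that HiveContext lives in the spark-hive module, which is not pulled in by spark-core alone. With sbt the dependencies would look something like this (the version number is illustrative; pick the one matching your cluster):

```scala
// build.sbt -- version numbers are illustrative
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.2",
  "org.apache.spark" %% "spark-sql"  % "1.6.2",
  "org.apache.spark" %% "spark-hive" % "1.6.2"
)
```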