Spark SQL Using Parquet
Today, I’m focusing on how to use the Parquet format in Spark. If you are new to Parquet, please read up on the format first. Parquet: Apache Parquet is a...
Hadoop / Scala / Spark / Spark Sql
by beginnershadoop · Published May 30, 2016 · Last modified July 16, 2016
Today, I’m shedding light on how to use Avro, a data serialization system and data format, with Spark SQL. Unlike Hive, Spark does not provide direct support for the...
Hadoop / Scala / Spark / Spark Sql
by beginnershadoop · Published May 23, 2016 · Last modified October 7, 2016
In this post I’m going to describe how to integrate Hive with Spark. You can find this code on Spark’s official GitHub page. My effort is to describe...
Spark Streaming makes it easy to build scalable, fault-tolerant streaming applications. The Spark Streaming API offers near-real-time stream processing and supports Java, Scala, Python and R....
Scala / Spark / Spark Streaming
by beginnershadoop · Published April 19, 2016 · Last modified April 29, 2016
Network streaming in Spark is another interesting topic. Here I am going to explain how network streaming works and provide complete Spark Scala code. Before jumping to the code, I...