Category: Spark

Add constant column in Spark

Spark lets you add a column with a default (constant) value to a DataFrame. In Spark 2.2 there are two ways to add a constant value in...

Salesforce connector in Spark

Salesforce is a customer relationship management solution that brings companies and customers together. It’s one integrated CRM platform that gives all your departments — including marketing, sales, commerce,...

Difference between DataFrame, Dataset, and RDD in Spark

RDD: An RDD is a fault-tolerant collection of elements that can be operated on in parallel. DataFrame: A DataFrame is a Dataset organised into named columns. It is conceptually equivalent to a...

Spark Jobs, Stages, Tasks

Every distributed computation is divided into smaller parts called jobs, stages, and tasks. Knowing them is useful, especially during monitoring, because it helps to detect bottlenecks. Job -> Stages -> Tasks...

JDBC in Spark SQL

Apache Spark has a very powerful built-in API for gathering data from a relational database. Following the usual Spark approach, effectiveness and efficiency are managed in a transparent way...