Author: beginnershadoop

Spark Jobs, Stages, Tasks

Every distributed computation is divided into smaller parts called jobs, stages, and tasks. Knowing them is especially useful during monitoring, because it helps you detect bottlenecks. Job -> Stages -> Tasks...

Predictive Modeling

Classification: What are the advantages of different classification algorithms? What are the advantages of using a decision tree for classification? What are the disadvantages of using a decision...

Scala 101

In this tutorial, we will go over the Scala programming language. Scala is a powerful high-level programming language that incorporates object-oriented and functional programming. It’s a type-safe language that relies on the JVM...

JDBC in Spark SQL

Apache Spark has a very powerful built-in API for reading data from a relational database. Following the usual Spark approach, effectiveness and efficiency are managed in a transparent way....