Predictive Modeling

Scala 101

In this tutorial, we will go over the Scala programming language. Scala is a powerful high-level programming language that incorporates object-oriented and functional programming. It’s a type-safe language that relies on the JVM...

JDBC in Spark SQL

Apache Spark has a very powerful built-in API for gathering data from a relational database. Effectiveness and efficiency, following the usual Spark approach, are managed in a transparent way....

Useful commands for Hadoop developers

This post collects the commands a Hadoop developer uses most frequently for Spark, EMR, YARN, and AWS. Kill Spark job: This command will kill all running Spark jobs.