Beginner's Hadoop

Barrier Execution Mode in Spark

The barrier execution mode is experimental and handles only limited scenarios. See SPIP: Barrier Execution Mode and Design Doc. In case of a task failure, instead of only restarting the...

Impala Export to CSV

Apache Impala is an open-source, massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. In some cases, impala-shell is installed manually...

Salesforce connector in Spark

Salesforce is a customer relationship management solution that brings companies and customers together. It’s one integrated CRM platform that gives all your departments — including marketing, sales, commerce,...

Spark Jobs, Stages, Tasks

Every distributed computation is divided into smaller parts called jobs, stages, and tasks. Knowing them is useful, especially during monitoring, because it helps detect bottlenecks. Job -> Stages -> Tasks...