Data Engineering

Detailed Guide to Setting up Scalable Apache Spark Infrastructure on Docker - Standalone Cluster With History Server

This post is a complete guide to building a scalable Apache Spark cluster on Docker. We will also see how to enable the History Server for log persistence.

Pavan Kulkarni

10 minute read

This post is a complete guide to building a scalable Apache Spark cluster on Docker, including how to enable the History Server for log persistence. The ability to scale up and down is one of the key requirements of today’s distributed infrastructure. By the end of this guide, you should have a fair understanding of how to set up Apache Spark on Docker, and we will see how to run a sample program.