Video description
An advanced Apache Spark course to help you prepare for and succeed in Spark job interviews.
About This Video
- Practice common job interview questions and answers
- Deep dive into Spark 3 architecture and memory management
- Prepare for Databricks Spark certification in a structured way
In Detail
Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. Since its release, Apache Spark has seen rapid adoption by enterprises across a wide range of industries. Internet powerhouses such as Netflix, Yahoo, and eBay have deployed Spark at massive scale. Its community has quickly become the largest open-source community in big data, so mastering Apache Spark opens a wide range of professional opportunities.
This course covers advanced topics such as Spark 3 architecture and memory management, Adaptive Query Execution (AQE), Dynamic Partition Pruning (DPP), broadcast variables, accumulators, and multithreading in Spark 3, along with common job interview questions and answers. The objective of this course is to prepare you for advanced certification topics.
By the end of this course, you will have learned advanced topics and concepts that come up in the Databricks Spark certification and in Spark job interviews. This will help you not only develop advanced skills in Apache Spark but also succeed in your job interviews.
Audience
This course is for anyone who wants to develop advanced skills in Apache Spark, including those who are preparing for Spark job interviews.
Before proceeding with the course, you will need basic knowledge of Spark programming in Python (PySpark).
Table of Contents
Chapter 1: Before You Begin
Course Introduction
Chapter 2: Spark Architecture
Spark Cluster and Runtime Architecture
Spark Submit and Important Options
Deploy Modes - Client and Cluster Mode
Spark Jobs - Stage, Shuffle, Task, Slots
Spark SQL Engine and Query Planning
Let’s Practice - Quiz 1 Solution Video
Let’s Practice - Quiz 2 Solution Video
Chapter 3: Performance and Applied Understanding
Spark Memory Allocation
Spark Memory Management
Spark Adaptive Query Execution
Spark AQE Dynamic Join Optimization
Handling Data Skew in Spark Joins
Spark Dynamic Partition Pruning
Data Caching in Spark
Repartition and Coalesce
DataFrame Hints
Broadcast Variables
Accumulators
Speculative Execution
Dynamic Resource Allocation
Spark Schedulers
Let’s Practice - Quiz 3 Solution Video
Let’s Practice - Quiz 4 Solution Video
Chapter 4: Open-Ended Topics
Help Me Add More Content - Demand for More