Pinned · Subham Khandelwal · Apache Spark Interview Series — Test your Knowledge 🧠 · Questions on Apache Spark to test your ability for Interviews and knowledge on Spark background · 3 min read · Mar 17, 2024
Pinned · Subham Khandelwal · LearnBigData101 — Spark Series · Series follows learning Apache Spark from Scratch with Python. Click on the links below to learn more. · 2 min read · Oct 1, 2022
Subham Khandelwal · PySpark — Spark Streaming Error and Exception Handling · Understand How to handle Spark Streaming Errors and Exceptions · 7 min read · Mar 20, 2024
Subham Khandelwal in Dev Genius · PySpark — Spark Streaming Checkpoint Directory · Understand the use of different folders and contents inside Spark Streaming Checkpoint Directory · 5 min read · Feb 26, 2024
Subham Khandelwal in Dev Genius · PySpark — Dynamic Resource Allocation in Spark · Configure Dynamic Resource Allocation using PySpark for proper resource utilization · 5 min read · Jan 7, 2024
Subham Khandelwal in Dev Genius · PySpark — Optimize Joins in Spark · Shuffle Hash Join, Sort Merge Join, Broadcast joins and Bucketing for better Join Performance. · 8 min read · Dec 30, 2023
Subham Khandelwal · PySpark — DAG & Explain Plans · Understand How Spark divides Jobs into Stages and Tasks? · 4 min read · Nov 19, 2023
Subham Khandelwal in Dev Genius · PySpark — Unit Test Cases using PyTest · Understand how to write unit test cases for PySpark using PyTest module. · 5 min read · Sep 30, 2023
Subham Khandelwal · PySpark — Optimize Parquet Files · Understand How Parquet files can be compacted efficiently utilizing RLE (Run Length Encoding) · 5 min read · Apr 16, 2023
Subham Khandelwal in Dev Genius · PySpark — Estimate Partition Count for File Read · Understand how Spark estimates the number of Partitions required to read a file · 5 min read · Mar 21, 2023