Pinned | Subham Khandelwal | Apache Spark Interview Series — Test your Knowledge 🧠 | Questions on Apache Spark to test your interview readiness and your background knowledge of Spark | Mar 17
Pinned | Subham Khandelwal | LearnBigData101 — Spark Series | This series covers learning Apache Spark from scratch with Python. Click on the links below to learn more. | Oct 1, 2022
Subham Khandelwal | PySpark — Spark Streaming Error and Exception Handling | Understand how to handle Spark Streaming errors and exceptions | Mar 20
Subham Khandelwal in Dev Genius | PySpark — Spark Streaming Checkpoint Directory | Understand the use of the different folders and contents inside the Spark Streaming checkpoint directory | Feb 26
Subham Khandelwal in Dev Genius | PySpark — Dynamic Resource Allocation in Spark | Configure Dynamic Resource Allocation using PySpark for proper resource utilization | Jan 7
Subham Khandelwal in Dev Genius | PySpark — Optimize Joins in Spark | Shuffle Hash Join, Sort Merge Join, Broadcast joins and Bucketing for better join performance | Dec 30, 2023
Subham Khandelwal | PySpark — DAG & Explain Plans | Understand how Spark divides jobs into stages and tasks | Nov 19, 2023
Subham Khandelwal in Dev Genius | PySpark — Unit Test Cases using PyTest | Understand how to write unit test cases for PySpark using the PyTest module | Sep 30, 2023
Subham Khandelwal | PySpark — Optimize Parquet Files | Understand how Parquet files can be compacted efficiently using RLE (run-length encoding) | Apr 16, 2023
Subham Khandelwal in Dev Genius | PySpark — Estimate Partition Count for File Read | Understand how Spark estimates the number of partitions required to read a file | Mar 21, 2023