Apache Spark Interview Series — Test your Knowledge 🧠
Apache Spark questions to test your interview readiness and your background knowledge of Spark
A curated Apache Spark/PySpark interview challenge series to help you understand Spark better. Challenges are posted on LinkedIn, so make sure to follow me there. The answer to each challenge is posted in the comments section.
Note of Caution ⚠️
These questions test your practical understanding of and skills in Apache Spark; they are not generic Spark knowledge questions. Proceed with caution 😉
Challenges 💬
Spark Challenge1️⃣: Code execution on Driver or on Executor ??
Spark Challenge2️⃣: Read all CSV files in nested folders
Spark Challenge3️⃣: File name in CSV file in column while reading data
Spark Challenge4️⃣: How many JOBs while reading file ??
Spark Challenge5️⃣: Handling JSON data
Spark Challenge6️⃣: Coalesce vs Repartition, how many files ??
Spark Challenge7️⃣: Stages and DAG, determine number of Stages ??
Spark Challenge8️⃣: Spark Core API, Scala vs Python ??
Spark Challenge9️⃣: JDBC Optimization, Read data faster ??
Spark Challenge1️⃣0️⃣: Reading Complex JSON data
Spark Challenge1️⃣1️⃣: Writing data
Spark Challenge1️⃣2️⃣: Estimate number of Partitions
This page will be updated as new questions are posted on LinkedIn. Make sure to follow me there so you don't miss any content.
Important Links and References 🏷️
LinkedIn Profile: https://www.linkedin.com/in/subhamkharwal
PySpark Zero to Hero Series on YouTube: https://youtube.com/playlist?list=PL2IsFZBGM_IHCl9zhRVC1EXTomkEp_1zm&si=Q664l-TFXf4wj1We
Spark Streaming with PySpark on YouTube: https://youtube.com/playlist?list=PL2IsFZBGM_IEtp2fF5xxZCS9CYBSHV2WW&si=4rF9V-Px9EJTiIiU
Check out the Ease With Data YouTube channel: https://www.youtube.com/@easewithdata
To connect with me: https://topmate.io/subham_khandelwal