We always have use cases where we have to flatten the complex JSON/Struct Data Frame into flattened simple Data Frame just like the example below:
example.this.that => example_this_that
Press enter or click to view image in full size
Flatten JSON/Struct Data Frame Data
Following code snippet does the exact job dynamically. No manual effort required to expand the data structure or to determine the schema.
Lets first create an example Data Frame for the job
# Lets create an Example Data Frame to hold JSON data# Example Data Frame with column having JSON data _data = [ ['EMP001', '{"dept" : "account", "fname": "Ramesh", "lname": "Singh", "skills": ["excel", "tally", "word"]}'], ['EMP002', '{"dept" : "sales", "fname": "Siv", "lname": "Kumar", "skills": ["biking", "sales"]}'], ['EMP003', '{"dept" : "hr", "fname": "MS Raghvan", "skills": ["communication", "soft-skills"], "hobbies" : {"cycling": "expert", "computers":"basic"}}'] ]# Columns for the data _cols = ['emp_no', 'raw_data']# Lets create the raw Data Frame df_raw = spark.createDataFrame(data = _data, schema = _cols)# Determine the schema of the JSON payload from the column…