Question 252 - DP-203 discussion


You are creating an Apache Spark job in Azure Databricks that will ingest JSON-formatted data. You need to convert a nested JSON string into a DataFrame that will contain multiple rows. Which Spark SQL function should you use?

A. explode
B. filter
C. coalesce
D. extract
Suggested answer: A

Explanation:

Convert nested JSON to a flattened DataFrame

You can flatten nested JSON using only the $"column.*" selector and the explode function.

Use $"column.*" and explode to flatten the struct and array types before displaying the flattened DataFrame. Scala:

display(
  DF.select($"id".as("main_id"), $"name", $"batters", $"ppu", explode($"topping")) // explode the topping column, since it is an array type
    .withColumn("topping_id", $"col.id")     // extract topping_id from col using dot notation
    .withColumn("topping_type", $"col.type") // extract topping_type from col using dot notation
    .drop($"col")
    .select($"*", $"batters.*") // flatten the struct type batters, which exposes the batter array
    .drop($"batters")
    .select($"*", explode($"batter")) // explode the batter array
    .drop($"batter")
    .withColumn("batter_id", $"col.id")     // extract batter_id from col using dot notation
    .withColumn("batter_type", $"col.type") // extract batter_type from col using dot notation
    .drop($"col")
)

Reference: https://learn.microsoft.com/en-us/azure/databricks/kb/scala/flatten-nested-columns-dynamically
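As a smaller self-contained illustration of why A is correct: explode emits one output row per array element, so a single nested JSON record becomes multiple DataFrame rows. This is a minimal sketch, not the quoted KB code; it assumes a Spark environment (in a Databricks notebook the SparkSession named spark already exists, so the builder line below simply returns it — the local[*] master is only an assumption for running outside Databricks).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.explode

val spark = SparkSession.builder.master("local[*]").appName("explode-demo").getOrCreate()
import spark.implicits._

// One nested JSON string: a single record whose "topping" field is an array
val json = """{"id": 1, "topping": [{"id": "t1", "type": "Glazed"}, {"id": "t2", "type": "Sugar"}]}"""
val df = spark.read.json(Seq(json).toDS)

// explode produces one row per array element, turning one record into multiple rows
val flat = df
  .select($"id", explode($"topping").as("t"))
  .select($"id", $"t.id".as("topping_id"), $"t.type".as("topping_type"))

flat.show() // 2 rows: one per topping
```

filter, coalesce, and extract do not multiply rows this way, which is why they are wrong here: filter only removes rows, coalesce returns the first non-null column value per row, and extract pulls fields from dates.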

asked 02/10/2024
brandon landaal