You are loading CSV files from Cloud Storage to BigQuery. The files have known data quality issues, including mismatched data types, such as STRINGS and INT64s in the same column, and inconsistent formatting of values such as phone numbers or addresses. You need to create the data pipeline to maintain data quality and perform the required cleansing and transformation. What should you do?

Question

Mattie Hendricks · Accepted Answer

Use Data Fusion to transform the data before loading it into BigQuery.

Mattie Hendricks · Answer

Load the CSV files into a staging table with the desired schema, perform the transformations with SQL, and then write the results to the final destination table.

Mattie Hendricks · Answer

Create a table with the desired schema, load the CSV files into the table, and perform the transformations in place using SQL.

Mattie Hendricks · Answer

Use Data Fusion to convert the CSV files to a self-describing data format, such as Avro, before loading the data into BigQuery.
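
For readers who want to see what the "cleansing and transformation" step involves regardless of which option runs it, here is a minimal Python sketch. It is not Data Fusion or BigQuery code. The column names (`quantity`, `phone`), the helper names, and the 10-digit phone rule are all assumptions made for illustration. `safe_int` loosely mirrors what BigQuery's `SAFE_CAST(x AS INT64)` does in SQL (NULL instead of an error on bad input), and `normalize_phone` plays the role of a `REGEXP_REPLACE` cleanup.

```python
import csv
import io
import re
from typing import Dict, List, Optional


def safe_int(value: str) -> Optional[int]:
    """Return the value as an int, or None when it is not a clean integer."""
    text = value.strip()
    return int(text) if re.fullmatch(r"[+-]?\d+", text) else None


def normalize_phone(value: str) -> Optional[str]:
    """Strip non-digits; return None unless exactly 10 digits remain (assumed rule)."""
    digits = re.sub(r"\D", "", value)
    return digits if len(digits) == 10 else None


def cleanse(raw_csv: str) -> List[Dict[str, object]]:
    """Read every field as a string first, then cast and normalize per column."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [
        {
            "quantity": safe_int(row["quantity"]),
            "phone": normalize_phone(row["phone"]),
        }
        for row in reader
    ]


# Hypothetical sample with mixed types and inconsistent phone formatting.
SAMPLE = "quantity,phone\n12,(555) 010-4477\ntwelve,555.010.4477\n 7 ,12345\n"
rows = cleanse(SAMPLE)
```

The point of the sketch is that every field is read as a string first and values that fail a rule become nulls, so one bad row does not fail the whole load. The same idea applies whether the rules run in a Data Fusion pipeline before the load or in SQL against a staging table.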

Related questions

Question 309 - Professional Data Engineer discussion

Suggested answer: A
