Question 294 - Professional Data Engineer discussion

You have data located in BigQuery that is used to generate reports for your company. You have noticed that some fields in the weekly executive reports do not conform to company format standards; for example, report errors include different telephone formats and different country code identifiers. This is a frequent issue, so you need to create a recurring job to normalize the data. You want a quick solution that requires no coding. What should you do?

A.
Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.
B.
Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery.
C.
Create a Spark job and submit it to Dataproc Serverless.
D.
Use Dataflow SQL to create a job that normalizes the data, and after the first run of the job, schedule the pipeline to execute recurrently.
Suggested answer: A

Explanation:

Cloud Data Fusion is a fully managed, cloud-native data integration service that allows you to build and manage data pipelines with a graphical interface. Wrangler is a feature of Cloud Data Fusion that enables you to interactively explore, clean, and transform data using a spreadsheet-like UI. You can use Wrangler to normalize the data in BigQuery by applying various directives, such as parsing, formatting, replacing, and validating data. You can also preview the results and export the wrangled data to BigQuery or other destinations. You can then set up a recurring job in Cloud Data Fusion to run the Wrangler pipeline on a schedule, such as weekly or daily. This way, you can create a quick and code-free solution to normalize the data for your reports.

Reference:

Cloud Data Fusion overview

Wrangler overview

Wrangle data from BigQuery

Scheduling pipelines
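
For illustration only (the suggested answer is the code-free Wrangler approach), the sketch below shows in Python the kind of normalization a Wrangler recipe would apply to the report fields. The sample values, the default country code, and the country-identifier mapping are assumptions, not part of the question.

import re

# Illustrative sketch only; in the suggested answer this cleanup is done code-free
# with Wrangler directives and scheduled as a recurring Cloud Data Fusion pipeline.
# The default country code and the country mapping below are assumptions.

def normalize_phone(raw, default_country_code="1"):
    """Strip formatting characters and return a +<country code><digits> string."""
    digits = re.sub(r"\D", "", raw)             # drop spaces, dashes, dots, parentheses
    if raw.strip().startswith("+"):
        return "+" + digits                     # number already carries a country code
    if digits.startswith("00"):
        return "+" + digits[2:]                 # 00-prefixed international format
    return "+" + default_country_code + digits  # otherwise assume the default country

def normalize_country(identifier):
    """Map mixed country identifiers to one standard form (ISO alpha-2 here)."""
    mapping = {"usa": "US", "united states": "US", "u.s.": "US", "uk": "GB", "gbr": "GB"}
    return mapping.get(identifier.strip().lower(), identifier.strip().upper())

print(normalize_phone("(415) 555-0132"))    # +14155550132
print(normalize_phone("+44 20 7946 0958"))  # +442079460958
print(normalize_country("usa"))             # US

In the Wrangler UI the same transformations are built interactively and scheduled as a recurring pipeline, so no code is written or deployed.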

asked 18/09/2024 by Yuriy Kitsis