ExamGecko
Question list
Search
Search

List of questions

Search

Related questions











Question 28 - Professional Data Engineer discussion

Report
Export

Your company is performing data preprocessing for a learning algorithm in Google Cloud Dataflow.

Numerous data logs are being are being generated during this step, and the team wants to analyze them. Due to the dynamic nature of the campaign, the data is growing exponentially every hour.

The data scientists have written the following code to read the data for a new key features in the logs.

BigQueryIO.Read

.named("ReadLogData")

.from("clouddataflow-readonly:samples.log_data")

You want to improve the performance of this data read. What should you do?

A.
Specify the TableReference object in the code.
Answers
A.
Specify the TableReference object in the code.
B.
Use .fromQuery operation to read specific fields from the table.
Answers
B.
Use .fromQuery operation to read specific fields from the table.
C.
Use of both the Google BigQuery TableSchema and TableFieldSchema classes.
Answers
C.
Use of both the Google BigQuery TableSchema and TableFieldSchema classes.
D.
Call a transform that returns TableRow objects, where each element in the PCollexction represents a single row in the table.
Answers
D.
Call a transform that returns TableRow objects, where each element in the PCollexction represents a single row in the table.
Suggested answer: D
asked 18/09/2024
ahmed bin shehab
45 questions
User
Your answer:
0 comments
Sorted by

Leave a comment first