You are testing a Dataflow pipeline to ingest and transform text files. The files are gzip-compressed, errors are written to a dead-letter queue, and you are using SideInputs to join data. You noticed that the pipeline is taking longer to complete than expected. What should you do to expedite the Dataflow job?

Question

Malik Spamu · Accepted Answer

Reduce the batch size

Malik Spamu · Answer

Switch to compressed Avro files

Malik Spamu · Answer

Retry records that throw an error

Malik Spamu · Answer

Use CoGroupByKey instead of the SideInput
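
For readers unfamiliar with the two join styles named in the last option, here is a minimal standard-library sketch of how they differ. It is not Beam code and says nothing about which option is correct: the function names `side_input_join` and `co_group_by_key` and the sample `events` and `users` data are hypothetical, chosen only to model the semantics. In Apache Beam the corresponding constructs would be a side input such as `beam.pvalue.AsDict(...)` and the `beam.CoGroupByKey()` transform.

```python
from collections import defaultdict


def side_input_join(main, lookup_pairs):
    """Side-input style: materialize the whole lookup table as a dict,
    then let every main element probe it."""
    lookup = dict(lookup_pairs)  # assumed to fit in memory on each worker
    return [(key, value, lookup.get(key)) for key, value in main]


def co_group_by_key(**tagged):
    """CoGroupByKey style: group every tagged collection by key, so all
    values that share a key end up together under their tag."""
    grouped = defaultdict(lambda: {tag: [] for tag in tagged})
    for tag, pairs in tagged.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    return dict(grouped)


# Hypothetical sample data: keyed events joined against keyed user names.
events = [("u1", "click"), ("u2", "view"), ("u1", "purchase")]
users = [("u1", "Ada"), ("u2", "Lin")]

joined_side = side_input_join(events, users)
joined_cogroup = co_group_by_key(events=events, users=users)
```

In the first style the full lookup table has to be available to every worker, which can become costly as it grows. In the second, both inputs are grouped by key, so no single worker needs the whole table.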
