Question 261 - Professional Data Engineer discussion


You've migrated a Hadoop job from an on-premises cluster to Dataproc and Cloud Storage. Your Spark job is a complex analytical workload that consists of many shuffling operations, and the initial data are Parquet files (on average 200-400 MB each). You see some degradation in performance after the migration to Dataproc, so you'd like to optimize for it. Your organization is very cost-sensitive, so you'd like to continue using Dataproc on preemptibles (with 2 non-preemptible workers only) for this workload. What should you do?

A.
Switch from HDDs to SSDs; override the preemptible VMs configuration to increase the boot disk size
B.
Increase the size of your Parquet files to ensure they are at least 1 GB each
C.
Switch to TFRecords format (approx. 200 MB per file) instead of Parquet files
D.
Switch from HDDs to SSDs; copy the initial data from Cloud Storage to Hadoop Distributed File System (HDFS), run the Spark job, and copy the results back to Cloud Storage
Suggested answer: A
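The reasoning behind answer A: Spark shuffle data spills to the workers' local boot disks, and preemptible Dataproc workers have no separate persistent disks, so a shuffle-heavy job is bottlenecked by small HDD boot disks. Switching the boot disk type to SSD and enlarging it directly speeds up shuffle I/O without abandoning the cheap preemptible setup. A minimal sketch of such a cluster creation with gcloud (cluster name, region, worker counts, and disk sizes are placeholder assumptions, not from the question):

```shell
# Hypothetical example: 2 non-preemptible workers plus preemptible
# (secondary) workers, all with SSD boot disks sized for shuffle spill.
gcloud dataproc clusters create shuffle-heavy-cluster \
    --region=us-central1 \
    --num-workers=2 \
    --worker-boot-disk-type=pd-ssd \
    --worker-boot-disk-size=500GB \
    --num-secondary-workers=10 \
    --secondary-worker-boot-disk-type=pd-ssd \
    --secondary-worker-boot-disk-size=500GB
```

The `--secondary-worker-*` flags override the preemptible VMs' disk configuration, which by default mirrors a smaller boot disk; this is the "override the preemptible VMs configuration" part of answer A. Option D is worse for a cost-sensitive setup because HDFS on preemptible workers loses data when nodes are reclaimed, and options B and C address file format rather than the shuffle bottleneck.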
asked 18/09/2024
Jennifer Okai Addey