ExamGecko
Question 364 - Professional Data Engineer discussion


You are deploying a batch pipeline in Dataflow. This pipeline reads data from Cloud Storage, transforms the data, and then writes the data into BigQuery. The security team has enabled an organizational constraint in Google Cloud, requiring all Compute Engine instances to use only internal IP addresses and no external IP addresses. What should you do?

Answers
A. Ensure that the firewall rules allow access to Cloud Storage and BigQuery. Use Dataflow with only internal IP addresses.
B. Ensure that your workers have network tags to access Cloud Storage and BigQuery. Use Dataflow with only internal IP addresses.
C. Create a VPC Service Controls perimeter that contains the VPC network and add Dataflow, Cloud Storage, and BigQuery as allowed services in the perimeter. Use Dataflow with only internal IP addresses.
D. Ensure that Private Google Access is enabled in the subnetwork. Use Dataflow with only internal IP addresses.
Suggested answer: D

Explanation:

To deploy a batch pipeline in Dataflow under an organizational constraint that permits only internal IP addresses, enable Private Google Access on the subnetwork the workers run in. Here's why option D is the best choice:

Private Google Access:

Private Google Access lets VMs that have only internal IP addresses (no external IP addresses) reach Google APIs and services. It is enabled per subnetwork.

This ensures compliance with the organizational constraint of using only internal IPs while allowing Dataflow to access Cloud Storage and BigQuery.

Dataflow with Internal IPs:

Dataflow can be configured to use only internal IP addresses for its worker nodes, ensuring that no external IP addresses are assigned.

Without external IP addresses, the workers can reach Cloud Storage and BigQuery only if Private Google Access is enabled on their subnetwork, which is why the two settings go together.

Firewall and Network Configuration:

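Private Google Access only works if the network's routes and egress firewall rules permit traffic from the workers to the IP ranges used by Google APIs and services. The default route and the implied allow-egress rule normally cover this unless they have been removed or overridden.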

Steps to Implement:

Enable Private Google Access:

Enable Private Google Access on the subnetwork used by the Dataflow pipeline:

gcloud compute networks subnets update [SUBNET_NAME] \
    --region [REGION] \
    --enable-private-ip-google-access

Configure Dataflow:

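Configure the Dataflow job to use only internal IP addresses. When launching a template with gcloud, the flag is --disable-public-ips, and --gcs-location (the template path in Cloud Storage, shown here as a placeholder) is required:

gcloud dataflow jobs run [JOB_NAME] \
    --gcs-location [TEMPLATE_PATH] \
    --region [REGION] \
    --network [VPC_NETWORK] \
    --subnetwork [SUBNETWORK] \
    --disable-public-ips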
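
When the pipeline is launched from the Apache Beam Python SDK instead of a gcloud template, the same settings are passed as pipeline options. The following is a minimal sketch, assuming the Python SDK's --no_use_public_ips option and the regions/REGION/subnetworks/SUBNET short form for the subnetwork. The helper name build_worker_network_flags and the script name my_pipeline.py are hypothetical.

```shell
# Hypothetical helper: prints the Dataflow pipeline options that keep
# workers on internal IP addresses, one option per line.
#   $1 = region (for example us-central1)
#   $2 = subnetwork name (assumed to be in the same region)
build_worker_network_flags() {
  printf '%s\n' \
    "--region=$1" \
    "--subnetwork=regions/$1/subnetworks/$2" \
    "--no_use_public_ips"
}

# Example usage (not executed here; my_pipeline.py is a placeholder):
#   python my_pipeline.py --runner=DataflowRunner \
#     $(build_worker_network_flags us-central1 my-subnet)
```

For the Java SDK the corresponding option is --usePublicIps=false.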

Verify Access:

Ensure that firewall rules allow the necessary traffic from the Dataflow workers to Cloud Storage and BigQuery using internal IPs.
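
Before launching the job, it can help to confirm that Private Google Access is actually on for the subnetwork. The sketch below assumes the subnetwork resource's privateIpGoogleAccess field, read with gcloud's value() format. The function names is_pga_enabled and check_subnet are hypothetical helpers, not part of gcloud.

```shell
# Interprets the value printed by gcloud for privateIpGoogleAccess.
# Returns 0 (success) only when the value is a form of "true".
is_pga_enabled() {
  case "$1" in
    True|true|TRUE) return 0 ;;
    *) return 1 ;;
  esac
}

# Queries the subnetwork and reports whether Private Google Access is on.
#   $1 = subnetwork name, $2 = region
check_subnet() {
  value="$(gcloud compute networks subnets describe "$1" \
    --region "$2" \
    --format='value(privateIpGoogleAccess)')" || return 2
  if is_pga_enabled "$value"; then
    echo "Private Google Access is enabled on $1"
  else
    echo "Private Google Access is NOT enabled on $1"
    return 1
  fi
}
```

Calling check_subnet [SUBNET_NAME] [REGION] requires an authenticated gcloud installation. The parsing step is kept separate so it can be checked without one.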


asked 18/09/2024
Marcel Wienhusen