Question 341 - Professional Data Engineer discussion
You migrated your on-premises Apache Hadoop Distributed File System (HDFS) data lake to Cloud Storage. The data scientist team needs to process the data by using Apache Spark and SQL. Security policies need to be enforced at the column level. You need a cost-effective solution that can scale into a data mesh. What should you do?
A.
1. Deploy a long-lived Dataproc cluster with Apache Hive and Ranger enabled. 2. Configure Ranger for column-level security. 3. Process with Dataproc Spark or Hive SQL.
B.
1. Define a BigLake table. 2. Create a taxonomy of policy tags in Data Catalog. 3. Add policy tags to columns. 4. Process with the Spark-BigQuery connector or BigQuery SQL. (Illustrative sketches of steps 2-4 follow the option list.)
C.
1. Load the data to BigQuery tables. 2. Create a taxonomy of policy tags in Data Catalog. 3. Add policy tags to columns. 4. Process with the Spark-BigQuery connector or BigQuery SQL.
D.
1. Apply an Identity and Access Management (IAM) policy at the file level in Cloud Storage. 2. Define a BigQuery external table for SQL processing. 3. Use Dataproc Spark to process the Cloud Storage files.
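The column-level security in options B and C relies on a Data Catalog policy-tag taxonomy attached to table columns. Below is a minimal, hypothetical sketch of steps 2 and 3 using the google-cloud-datacatalog and google-cloud-bigquery Python clients; the project, location, dataset, table, and column names are placeholders, not values from the question, and exact client behavior should be confirmed against the current library documentation.

```python
# Hypothetical sketch: create a policy-tag taxonomy and tag one column.
# PROJECT_ID, LOCATION, and the dataset/table/column names are placeholders.
from google.cloud import bigquery, datacatalog_v1

PROJECT_ID = "my-project"   # assumption: placeholder project ID
LOCATION = "us"             # assumption: placeholder taxonomy location

ptm = datacatalog_v1.PolicyTagManagerClient()

# Step 2: create a taxonomy with fine-grained access control enabled.
taxonomy = ptm.create_taxonomy(
    parent=f"projects/{PROJECT_ID}/locations/{LOCATION}",
    taxonomy=datacatalog_v1.Taxonomy(
        display_name="pii",
        activated_policy_types=[
            datacatalog_v1.Taxonomy.PolicyType.FINE_GRAINED_ACCESS_CONTROL
        ],
    ),
)

# Create a policy tag inside the taxonomy, e.g. for email addresses.
policy_tag = ptm.create_policy_tag(
    parent=taxonomy.name,
    policy_tag=datacatalog_v1.PolicyTag(display_name="email"),
)

# Step 3: attach the policy tag to a column of the BigLake (or native) table.
bq = bigquery.Client(project=PROJECT_ID)
table = bq.get_table(f"{PROJECT_ID}.my_dataset.my_table")  # placeholder table
new_schema = []
for field in table.schema:
    if field.name == "email":
        field = bigquery.SchemaField(
            field.name,
            field.field_type,
            mode=field.mode,
            policy_tags=bigquery.PolicyTagList([policy_tag.name]),
        )
    new_schema.append(field)
table.schema = new_schema
bq.update_table(table, ["schema"])
```

Access to the tagged column is then granted by giving principals the Data Catalog Fine-Grained Reader role on the policy tag rather than on the table.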
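Step 4 of options B and C processes the governed table from Spark. A minimal PySpark sketch using the Spark-BigQuery connector is shown below; it assumes the connector jar is available on the cluster (Dataproc Serverless includes it) and uses placeholder table and column names.

```python
# Hypothetical sketch: read a BigLake/BigQuery table through the
# Spark-BigQuery connector. Table and column names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("biglake-column-level-demo").getOrCreate()

# Columns protected by policy tags are only readable by principals holding
# the Fine-Grained Reader role on the tag, so select only permitted columns.
df = (
    spark.read.format("bigquery")
    .option("table", "my-project.my_dataset.my_table")
    .load()
    .select("order_id", "order_total")  # non-sensitive columns
)

df.createOrReplaceTempView("orders")
spark.sql(
    "SELECT order_id, SUM(order_total) AS total FROM orders GROUP BY order_id"
).show()
```

The same table can be queried directly with BigQuery SQL, which is what lets a BigLake-based design scale into a data mesh without maintaining a long-lived cluster.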