Amazon DAS-C01 Practice Test - Questions Answers, Page 15

A global company has different sub-organizations, and each sub-organization sells its products and services in various countries. The company's senior leadership wants to quickly identify which sub-organization is the strongest performer in each country. All sales data is stored in Amazon S3 in Parquet format.

Which approach can provide the visuals that senior leadership requested with the least amount of effort?

A. Use Amazon QuickSight with Amazon Athena as the data source. Use heat maps as the visual type.
B. Use Amazon QuickSight with Amazon S3 as the data source. Use heat maps as the visual type.
C. Use Amazon QuickSight with Amazon Athena as the data source. Use pivot tables as the visual type.
D. Use Amazon QuickSight with Amazon S3 as the data source. Use pivot tables as the visual type.

Suggested answer: C
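
Whichever visual type is chosen, options A and C both put Amazon Athena in front of the Parquet data as QuickSight's data source. A minimal sketch of the kind of aggregation Athena would run underneath such a visual, where the database, table, column, and result-bucket names are hypothetical:

```python
import boto3

athena = boto3.client("athena")

# Hypothetical database, table, column, and result-bucket names.
QUERY = """
SELECT country, sub_organization, SUM(revenue) AS total_revenue
FROM sales_db.sales
GROUP BY country, sub_organization
ORDER BY country, total_revenue DESC
"""

response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(response["QueryExecutionId"])
```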

A retail company’s data analytics team recently created multiple product sales analysis dashboards for the average selling price per product using Amazon QuickSight. The dashboards were created from .csv files uploaded to Amazon S3. The team is now planning to share the dashboards with the respective external product owners by creating individual users in Amazon QuickSight. For compliance and governance reasons, restricting access is a key requirement. The product owners should view only their respective product analysis in the dashboard reports.

Which approach should the data analytics team take to allow product owners to view only their products in the dashboard?

A. Separate the data by product and use S3 bucket policies for authorization.
B. Separate the data by product and use IAM policies for authorization.
C. Create a manifest file with row-level security.
D. Create dataset rules with row-level security.

Suggested answer: B
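
Option D refers to QuickSight row-level security, which is driven by a separate dataset of rules mapping each user (or group) to the rows that user may see. A minimal sketch that generates such a rules file, assuming hypothetical QuickSight user names and a `product` column that matches a field in the dashboard dataset:

```python
import csv

# Hypothetical QuickSight user names and product values; the "product"
# column name must match the corresponding field in the dashboard dataset.
rules = [
    {"UserName": "owner-widgets", "product": "widgets"},
    {"UserName": "owner-gadgets", "product": "gadgets"},
]

with open("dataset_rules.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["UserName", "product"])
    writer.writeheader()
    writer.writerows(rules)

# The rules file is uploaded as its own QuickSight dataset and attached to the
# sales dataset as row-level security rules (for example through the
# RowLevelPermissionDataSet parameter of the CreateDataSet/UpdateDataSet APIs).
```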

An operations team notices that a few AWS Glue jobs for a given ETL application are failing. The AWS Glue jobs read a large number of small JSON files from an Amazon S3 bucket and write the data to a different S3 bucket in Apache Parquet format with no major transformations. Upon initial investigation, a data engineer notices the following error message in the History tab on the AWS Glue console:

“Command Failed with Exit Code 1.”

Upon further investigation, the data engineer notices that the driver memory profile of the failed jobs crosses the safe threshold of 50% usage quickly and reaches 90–95% soon after. The average memory usage across all executors continues to be less than 4%.

The data engineer also notices the following error while examining the related Amazon CloudWatch Logs.

What should the data engineer do to solve the failure in the MOST cost-effective way?

A. Change the worker type from Standard to G.2X.
B. Modify the AWS Glue ETL code to use the ‘groupFiles’: ‘inPartition’ feature.
C. Increase the fetch size setting by using an AWS Glue dynamic frame.
D. Modify maximum capacity to increase the total maximum data processing units (DPUs) used.

Suggested answer: D
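
Option B refers to AWS Glue's file-grouping feature, which coalesces many small input files so the driver no longer tracks each file individually. A minimal sketch of a Glue ETL script that reads the small JSON files with ‘groupFiles’: ‘inPartition’ and writes Parquet, where the bucket paths and group size are hypothetical:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the many small JSON files, grouping them into ~128 MB chunks so the
# driver tracks groups rather than individual files.
dyf = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://example-source-bucket/raw-json/"],  # hypothetical bucket
        "recurse": True,
        "groupFiles": "inPartition",
        "groupSize": "134217728",
    },
    format="json",
)

# Write the data back out in Apache Parquet format with no major transformations.
glue_context.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="s3",
    connection_options={"path": "s3://example-target-bucket/parquet/"},
    format="parquet",
)

job.commit()
```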

A healthcare company uses AWS data and analytics tools to collect, ingest, and store electronic health record (EHR) data about its patients. The raw EHR data is stored in Amazon S3 in JSON format partitioned by hour, day, and year and is updated every hour. The company wants to maintain the data catalog and metadata in an AWS Glue Data Catalog to be able to access the data using Amazon Athena or Amazon Redshift Spectrum for analytics.

When defining tables in the Data Catalog, the company has the following requirements:

Choose the catalog table name and do not rely on the catalog table naming algorithm.

Keep the table updated with new partitions loaded in the respective S3 bucket prefixes.

Which solution meets these requirements with minimal effort?

A. Run an AWS Glue crawler that connects to one or more data stores, determines the data structures, and writes tables in the Data Catalog.
B. Use the AWS Glue console to manually create a table in the Data Catalog and schedule an AWS Lambda function to update the table partitions hourly.
C. Use the AWS Glue API CreateTable operation to create a table in the Data Catalog. Create an AWS Glue crawler and specify the table as the source.
D. Create an Apache Hive catalog in Amazon EMR with the table schema definition in Amazon S3, and update the table partition with a scheduled job. Migrate the Hive catalog to the Data Catalog.

Suggested answer: B

Explanation:


Reference: https://docs.aws.amazon.com/glue/latest/dg/tables-described.html
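
The suggested approach (option B) creates the table once with a chosen name and then relies on a scheduled AWS Lambda function to register each new hourly partition. A minimal sketch of such a partition-registration handler with boto3, where the database, table, bucket, and partition layout are hypothetical:

```python
import datetime

import boto3

glue = boto3.client("glue")

# Hypothetical Data Catalog names and S3 layout; make sure the partition value
# order matches the table's declared partition keys.
DATABASE = "ehr_db"
TABLE = "raw_ehr"
BUCKET_PREFIX = "s3://example-ehr-bucket/raw"


def lambda_handler(event, context):
    """Scheduled hourly (e.g. by EventBridge) to register the newest partition."""
    now = datetime.datetime.now(datetime.timezone.utc)
    year, day, hour = str(now.year), f"{now.day:02d}", f"{now.hour:02d}"
    location = f"{BUCKET_PREFIX}/year={year}/day={day}/hour={hour}/"

    glue.batch_create_partition(
        DatabaseName=DATABASE,
        TableName=TABLE,
        PartitionInputList=[
            {
                "Values": [year, day, hour],
                "StorageDescriptor": {
                    "Location": location,
                    "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                    "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                    "SerdeInfo": {
                        "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
                    },
                },
            }
        ],
    )
```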

A company has 10–15 TB of uncompressed .csv files in Amazon S3. The company is evaluating Amazon Athena as a one-time query engine. The company wants to transform the data to optimize query runtime and storage costs.

Which option for data format and compression meets these requirements?

A. CSV compressed with zip
B. JSON compressed with bzip2
C. Apache Parquet compressed with Snappy
D. Apache Avro compressed with LZO

Suggested answer: B

Explanation:


Reference: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/
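
Option C refers to converting the CSV data to Apache Parquet with Snappy compression; one common way to do this is an Athena CTAS statement. A minimal sketch with boto3, where the database, tables, and bucket names are hypothetical:

```python
import boto3

athena = boto3.client("athena")

# Hypothetical source table over the raw CSV files and a hypothetical target location.
CTAS = """
CREATE TABLE sales_db.sales_parquet
WITH (
    format = 'PARQUET',
    parquet_compression = 'SNAPPY',
    external_location = 's3://example-bucket/sales-parquet/'
)
AS SELECT * FROM sales_db.sales_csv
"""

athena.start_query_execution(
    QueryString=CTAS,
    QueryExecutionContext={"Database": "sales_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```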

A manufacturing company uses Amazon S3 to store its data. The company wants to use AWS Lake Formation to provide granular-level security on those data assets. The data is in Apache Parquet format. The company has set a deadline for a consultant to build a data lake.

How should the consultant create the MOST cost-effective solution that meets these requirements?

A. Run Lake Formation blueprints to move the data to Lake Formation. Once Lake Formation has the data, apply permissions on Lake Formation.
B. To create the data catalog, run an AWS Glue crawler on the existing Parquet data. Register the Amazon S3 path and then apply permissions through Lake Formation to provide granular-level security.
C. Install Apache Ranger on an Amazon EC2 instance and integrate with Amazon EMR. Using Ranger policies, create role-based access control for the existing data assets in Amazon S3.
D. Create multiple IAM roles for different users and groups. Assign IAM roles to different data assets in Amazon S3 to create table-based and column-based access controls.

Suggested answer: C
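
Option B's flow (catalog the existing Parquet data, register the S3 location with Lake Formation, then grant granular permissions) can be sketched with boto3; the ARNs, database, table, columns, and principal below are hypothetical:

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Register the existing S3 location with Lake Formation (hypothetical ARN).
lakeformation.register_resource(
    ResourceArn="arn:aws:s3:::example-manufacturing-data",
    UseServiceLinkedRole=True,
)

# Grant column-level SELECT on a catalog table (hypothetical names) to an IAM
# role, so access is governed through Lake Formation permissions.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/AnalystRole"},
    Resource={
        "TableWithColumns": {
            "DatabaseName": "manufacturing_db",
            "Name": "sensor_readings",
            "ColumnNames": ["plant_id", "metric", "value"],
        }
    },
    Permissions=["SELECT"],
)
```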

A retail company uses Amazon Athena for ad-hoc queries against an AWS Glue Data Catalog. The data analytics team manages the data catalog and data access for the company. The data analytics team wants to separate queries and manage the cost of running those queries by different workloads and teams. Ideally, the data analysts want to group the queries run by different users within a team, store the query results in individual Amazon S3 buckets specific to each team, and enforce cost constraints on the queries run against the Data Catalog.

Which solution meets these requirements?

A. Create IAM groups and resource tags for each team within the company. Set up IAM policies that control user access and actions on the Data Catalog resources.
B. Create Athena resource groups for each team within the company and assign users to these groups. Add S3 bucket names and other query configurations to the properties list for the resource groups.
C. Create Athena workgroups for each team within the company. Set up IAM workgroup policies that control user access and actions on the workgroup resources.
D. Create Athena query groups for each team within the company and assign users to the groups.

Suggested answer: A
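
Option C refers to Athena workgroups, which can isolate each team's queries, direct results to a team-specific S3 bucket, and cap the data scanned per query. A minimal sketch with boto3, where the workgroup name, bucket, tags, and limits are hypothetical:

```python
import boto3

athena = boto3.client("athena")

athena.create_work_group(
    Name="marketing-analytics",                       # hypothetical team workgroup
    Description="Queries run by the marketing analytics team",
    Configuration={
        "ResultConfiguration": {
            "OutputLocation": "s3://example-marketing-athena-results/"
        },
        "EnforceWorkGroupConfiguration": True,        # users cannot override settings
        "PublishCloudWatchMetricsEnabled": True,      # per-workgroup usage metrics
        "BytesScannedCutoffPerQuery": 1_000_000_000,  # cap each query at ~1 GB scanned
    },
    Tags=[{"Key": "team", "Value": "marketing"}],
)
```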

A team of data scientists plans to analyze market trend data for their company’s new investment strategy. The trend data comes from five different data sources in large volumes. The team wants to utilize Amazon Kinesis to support their use case.

The team uses SQL-like queries to analyze trends and wants to send notifications based on certain significant patterns in the trends. Additionally, the data scientists want to save the data to Amazon S3 for archival and historical reprocessing, and use AWS managed services wherever possible. The team wants to implement the lowest-cost solution.

Which solution meets these requirements?

A. Publish data to one Kinesis data stream. Deploy a custom application using the Kinesis Client Library (KCL) for analyzing trends, and send notifications using Amazon SNS. Configure Kinesis Data Firehose on the Kinesis data stream to persist data to an S3 bucket.
B. Publish data to one Kinesis data stream. Deploy Kinesis Data Analytics to the stream for analyzing trends, and configure an AWS Lambda function as an output to send notifications using Amazon SNS. Configure Kinesis Data Firehose on the Kinesis data stream to persist data to an S3 bucket.
C. Publish data to two Kinesis data streams. Deploy Kinesis Data Analytics to the first stream for analyzing trends, and configure an AWS Lambda function as an output to send notifications using Amazon SNS. Configure Kinesis Data Firehose on the second Kinesis data stream to persist data to an S3 bucket.
D. Publish data to two Kinesis data streams. Deploy a custom application using the Kinesis Client Library (KCL) to the first stream for analyzing trends, and send notifications using Amazon SNS. Configure Kinesis Data Firehose on the second Kinesis data stream to persist data to an S3 bucket.

Suggested answer: A
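
In all four options, Kinesis Data Firehose is what lands the stream data in Amazon S3 for archival. A minimal sketch of attaching a Firehose delivery stream to an existing Kinesis data stream with boto3, where the stream, role, and bucket ARNs are hypothetical:

```python
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="market-trends-to-s3",  # hypothetical name
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/market-trends",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-read-stream",
    },
    ExtendedS3DestinationConfiguration={
        "BucketARN": "arn:aws:s3:::example-market-trend-archive",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-write-s3",
        "BufferingHints": {"IntervalInSeconds": 300, "SizeInMBs": 64},
    },
)
```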

A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into Amazon Redshift. The engineering team will combine all the user data and apply complex calculations that require I/O intensive resources. The company needs to adjust the cluster's capacity to support the change in analytical and storage requirements.

Which solution meets these requirements?

A. Resize the cluster using elastic resize with dense compute nodes.
B. Resize the cluster using classic resize with dense compute nodes.
C. Resize the cluster using elastic resize with dense storage nodes.
D. Resize the cluster using classic resize with dense storage nodes.

Suggested answer: C

Explanation:


Reference: https://aws.amazon.com/redshift/pricing/
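
Both resize styles are requested through the same Amazon Redshift API; the Classic flag selects between elastic (False) and classic (True). A minimal sketch with boto3, where the cluster identifier, node type, and node count are hypothetical:

```python
import boto3

redshift = boto3.client("redshift")

# Request an elastic resize (Classic=False); set Classic=True for a classic resize.
redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # hypothetical cluster name
    ClusterType="multi-node",
    NodeType="dc2.8xlarge",                 # dense compute node type
    NumberOfNodes=4,
    Classic=False,
)
```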

A data architect is building an Amazon S3 data lake for a bank. The goal is to provide a single data repository for customer data needs, such as personalized recommendations. The bank uses Amazon Kinesis Data Firehose to ingest customers’ personal information, bank accounts, and transactions in near-real time from a transactional relational database. The bank requires all personally identifiable information (PII) that is stored in the AWS Cloud to be masked.

Which solution will meet these requirements?

A. Invoke an AWS Lambda function from Kinesis Data Firehose to mask PII before delivering the data into Amazon S3.
B. Use Amazon Macie, and configure it to discover and mask PII.
C. Enable server-side encryption (SSE) in Amazon S3.
D. Invoke Amazon Comprehend from Kinesis Data Firehose to detect and mask PII before delivering the data into Amazon S3.

Suggested answer: C

Explanation:


Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingServerSideEncryption.html
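
Option A relies on Kinesis Data Firehose's Lambda data transformation: Firehose passes the function a batch of records, and the function returns them with PII fields masked before delivery to S3. A minimal sketch of such a handler, where the field names to mask are hypothetical:

```python
import base64
import json

# Hypothetical PII field names; assumes each Firehose record is one JSON object.
PII_FIELDS = {"name", "email", "account_number"}


def lambda_handler(event, context):
    """Kinesis Data Firehose transformation handler that masks PII fields."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        for field in PII_FIELDS & payload.keys():
            payload[field] = "****"
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(
                    (json.dumps(payload) + "\n").encode("utf-8")
                ).decode("utf-8"),
            }
        )
    return {"records": output}
```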
