Amazon DAS-C01 Practice Test - Questions Answers, Page 5

A market data company aggregates external data sources to create a detailed view of product consumption in different countries. The company wants to sell this data to external parties through a subscription. To achieve this goal, the company needs to make its data securely available to external parties who are also AWS users.

What should the company do to meet these requirements with the LEAST operational overhead?

A. Store the data in Amazon S3. Share the data by using presigned URLs for security.
B. Store the data in Amazon S3. Share the data by using S3 bucket ACLs.
C. Upload the data to AWS Data Exchange for storage. Share the data by using presigned URLs for security.
D. Upload the data to AWS Data Exchange for storage. Share the data by using the AWS Data Exchange sharing wizard.
Suggested answer: D
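
For contrast with the suggested answer, here is a minimal sketch (boto3, with a hypothetical bucket and key) of the presigned-URL approach in option A. It shows the recurring work the seller would take on — generating and redistributing short-lived URLs for every subscriber — which the AWS Data Exchange subscription model avoids.

import boto3

# Hypothetical bucket and key; presigned URLs expire (at most 7 days with SigV4),
# so option A implies an ongoing URL-generation and distribution process.
s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "market-data-exports", "Key": "consumption/2023/eu.parquet"},
    ExpiresIn=3600,  # seconds
)
print(url)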

A company uses Amazon Redshift as its data warehouse. A new table has columns that contain sensitive data. The data in the table will eventually be referenced by several existing queries that run many times a day. A data analyst needs to load 100 billion rows of data into the new table. Before doing so, the data analyst must ensure that only members of the auditing group can read the columns containing sensitive data. How can the data analyst meet these requirements with the lowest maintenance overhead?

A. Load all the data into the new table and grant the auditing group permission to read from the table. Load all the data except for the columns containing sensitive data into a second table. Grant the appropriate users read-only permissions to the second table.
B. Load all the data into the new table and grant the auditing group permission to read from the table. Use the GRANT SQL command to allow read-only access to a subset of columns to the appropriate users.
C. Load all the data into the new table and grant all users read-only permissions to non-sensitive columns. Attach an IAM policy to the auditing group with explicit ALLOW access to the sensitive data columns.
D. Load all the data into the new table and grant the auditing group permission to read from the table. Create a view of the new table that contains all the columns, except for those considered sensitive, and grant the appropriate users read-only permissions to the table.
Suggested answer: D
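
A minimal sketch of the view-based approach in option D, issued through the Amazon Redshift Data API with boto3; the cluster, database, table, column, and group names are hypothetical.

import boto3

# Hypothetical identifiers; adjust to the actual cluster, database, and user groups.
rsd = boto3.client("redshift-data")

statements = [
    # The view exposes every column except the sensitive ones.
    """CREATE VIEW public.transactions_public AS
       SELECT transaction_id, product_id, amount, created_at
       FROM public.transactions;""",
    # Auditors read the full table; everyone else reads only the view.
    "GRANT SELECT ON public.transactions TO GROUP auditing;",
    "GRANT SELECT ON public.transactions_public TO GROUP analysts;",
]

for sql in statements:
    rsd.execute_statement(
        ClusterIdentifier="analytics-cluster",
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )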

A hospital is building a research data lake to ingest data from electronic health records (EHR) systems from multiple hospitals and clinics. The EHR systems are independent of each other and do not have a common patient identifier. The data engineering team is not experienced in machine learning (ML) and has been asked to generate a unique patient identifier for the ingested records. Which solution will accomplish this task?

A. An AWS Glue ETL job with the FindMatches transform
B. Amazon Kendra
C. Amazon SageMaker Ground Truth
D. An AWS Glue ETL job with the ResolveChoice transform
Suggested answer: A

Explanation:


Matching Records with AWS Lake Formation FindMatches

Reference: https://docs.aws.amazon.com/glue/latest/dg/machine-learning.html
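
A sketch of how the FindMatches transform is typically invoked from a Glue ETL (PySpark) job script. The catalog database, table, transform ID, and output path are hypothetical, and a FindMatches ML transform is assumed to have already been created and trained in AWS Glue.

# Runs inside an AWS Glue ETL job (PySpark); these modules are provided by the
# Glue job environment, not by a plain local Python interpreter.
from awsglue.context import GlueContext
from awsglueml.transforms import FindMatches
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Hypothetical catalog table holding the combined EHR records.
patients = glue_context.create_dynamic_frame.from_catalog(
    database="ehr_lake", table_name="patient_records"
)

# transformId refers to a previously trained FindMatches transform; records that
# FindMatches considers the same patient share a match ID that can serve as the
# unique patient identifier.
matched = FindMatches.apply(frame=patients, transformId="tfm-0123456789abcdef")

glue_context.write_dynamic_frame.from_options(
    frame=matched,
    connection_type="s3",
    connection_options={"path": "s3://research-data-lake/matched/"},
    format="parquet",
)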

An online retailer is rebuilding its inventory management system and inventory reordering system to automatically reorder products by using Amazon Kinesis Data Streams. The inventory management system uses the Kinesis Producer Library (KPL) to publish data to a stream. The inventory reordering system uses the Kinesis Client Library (KCL) to consume data from the stream. The stream has been configured to scale as needed. Just before production deployment, the retailer discovers that the inventory reordering system is receiving duplicated data. Which factors could be causing the duplicated data? (Choose two.)

A. The producer has a network-related timeout.
B. The stream’s value for the IteratorAgeMilliseconds metric is too high.
C. There was a change in the number of shards, record processors, or both.
D. The AggregationEnabled configuration property was set to true.
E. The max_records configuration property was set to a number that is too high.
Suggested answer: A, C
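
Because producer retries and shard or worker changes (options A and C) can both redeliver records, the usual mitigation is to make the consumer idempotent. A minimal sketch, assuming each inventory record carries a unique order ID; the field names and in-memory store are hypothetical (a durable store such as DynamoDB would be used in practice).

# Idempotent record processing: a record delivered twice (producer retry or
# shard/worker rebalancing) triggers the reorder logic only once.
processed_order_ids = set()  # stand-in for a durable deduplication store

def reorder_product(product_id, quantity):
    print(f"Reordering {quantity} units of {product_id}")

def process_record(record):
    order_id = record["order_id"]
    if order_id in processed_order_ids:
        return  # duplicate delivery; skip
    processed_order_ids.add(order_id)
    reorder_product(record["product_id"], record["quantity"])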

A marketing company is storing its campaign response data in Amazon S3. A consistent set of sources has generated the data for each campaign. The data is saved into Amazon S3 as .csv files. A business analyst will use Amazon Athena to analyze each campaign’s data. The company needs the cost of ongoing data analysis with Athena to be minimized. Which combination of actions should a data analytics specialist take to meet these requirements? (Choose two.)

A. Convert the .csv files to Apache Parquet.
B. Convert the .csv files to Apache Avro.
C. Partition the data by campaign.
D. Partition the data by source.
E. Compress the .csv files.
Suggested answer: A, C
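
A sketch of both recommended actions carried out together with an Athena CTAS statement issued through boto3; the database, tables, columns, and S3 locations are hypothetical.

import boto3

athena = boto3.client("athena")

# CTAS rewrites the .csv data as Parquet and partitions it by campaign, so later
# Athena queries scan only the columns and the campaign they need.
ctas = """
CREATE TABLE campaign_responses_parquet
WITH (
    format = 'PARQUET',
    external_location = 's3://marketing-data/responses-parquet/',
    partitioned_by = ARRAY['campaign']
) AS
SELECT response_id, source, responded_at, campaign
FROM campaign_responses_csv
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "marketing"},
    ResultConfiguration={"OutputLocation": "s3://marketing-data/athena-results/"},
)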

A company wants to research user turnover by analyzing the past 3 months of user activities. With millions of users, 1.5 TB of uncompressed data is generated each day. A 30-node Amazon Redshift cluster with 2.56 TB of solid state drive (SSD) storage for each node is required to meet the query performance goals.

The company wants to run an additional analysis on a year’s worth of historical data to examine trends indicating which features are most popular. This analysis will be done once a week. What is the MOST cost-effective solution?

A. Increase the size of the Amazon Redshift cluster to 120 nodes so it has enough storage capacity to hold 1 year of data. Then use Amazon Redshift for the additional analysis.
B. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in Apache Parquet format partitioned by date. Then use Amazon Redshift Spectrum for the additional analysis.
C. Keep the data from the last 90 days in Amazon Redshift. Move data older than 90 days to Amazon S3 and store it in Apache Parquet format partitioned by date. Then provision a persistent Amazon EMR cluster and use Apache Presto for the additional analysis.
D. Resize the cluster node type to the dense storage node type (DS2) for an additional 16 TB storage capacity on each individual node in the Amazon Redshift cluster. Then use Amazon Redshift for the additional analysis.
Suggested answer: B
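
A sketch of the data movement behind option B, again via the Redshift Data API; the table name, S3 paths, IAM role, and catalog database are hypothetical.

import boto3

rsd = boto3.client("redshift-data")

# UNLOAD writes data older than 90 days to S3 as date-partitioned Parquet;
# Redshift Spectrum then queries it through an external schema.
unload_sql = """
UNLOAD ('SELECT * FROM user_activity WHERE activity_date < DATEADD(day, -90, CURRENT_DATE)')
TO 's3://analytics-archive/user_activity/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
FORMAT AS PARQUET
PARTITION BY (activity_date);
"""

external_schema_sql = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
FROM DATA CATALOG DATABASE 'analytics_archive'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
"""

for sql in (unload_sql, external_schema_sql):
    rsd.execute_statement(
        ClusterIdentifier="activity-cluster",
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )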

An online retail company uses Amazon Redshift to store historical sales transactions. The company is required to encrypt data at rest in the clusters to comply with the Payment Card Industry Data Security Standard (PCI DSS). A corporate governance policy mandates management of encryption keys using an on-premises hardware security module (HSM). Which solution meets these requirements?

A. Create and manage encryption keys using AWS CloudHSM Classic. Launch an Amazon Redshift cluster in a VPC with the option to use CloudHSM Classic for key management.
B. Create a VPC and establish a VPN connection between the VPC and the on-premises network. Create an HSM connection and client certificate for the on-premises HSM. Launch a cluster in the VPC with the option to use the on-premises HSM to store keys.
C. Create an HSM connection and client certificate for the on-premises HSM. Enable HSM encryption on the existing unencrypted cluster by modifying the cluster. Connect to the VPC where the Amazon Redshift cluster resides from the on-premises network using a VPN.
D. Create a replica of the on-premises HSM in AWS CloudHSM. Launch a cluster in a VPC with the option to use CloudHSM to store keys.
Suggested answer: B
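
A sketch of the API calls behind option B, assuming the VPN and the on-premises HSM already exist; every identifier, address, password, and certificate below is hypothetical.

import boto3

redshift = boto3.client("redshift")

# Register the on-premises HSM and the client certificate the cluster presents to it.
redshift.create_hsm_client_certificate(
    HsmClientCertificateIdentifier="retail-hsm-client-cert"
)
redshift.create_hsm_configuration(
    HsmConfigurationIdentifier="onprem-hsm",
    Description="On-premises HSM reachable over the VPN",
    HsmIpAddress="10.0.12.34",
    HsmPartitionName="PARTITION_1",
    HsmPartitionPassword="example-password",
    HsmServerPublicCertificate="-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----",
)

# Launch the cluster with HSM-backed encryption enabled.
redshift.create_cluster(
    ClusterIdentifier="sales-history",
    NodeType="ra3.4xlarge",
    NumberOfNodes=4,
    MasterUsername="admin",
    MasterUserPassword="ExamplePassw0rd!",
    ClusterSubnetGroupName="private-subnets",
    Encrypted=True,
    HsmClientCertificateIdentifier="retail-hsm-client-cert",
    HsmConfigurationIdentifier="onprem-hsm",
)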

A web retail company wants to implement a near-real-time clickstream analytics solution. The company wants to analyze the data with an open-source package. The analytics application will process the raw data only once, but other applications will need immediate access to the raw data for up to 1 year.

Which solution meets these requirements with the LEAST amount of operational effort?

A. Use Amazon Kinesis Data Streams to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Kinesis data stream. Set the retention period of the Kinesis data stream to 8,760 hours.
B. Use Amazon Kinesis Data Streams to collect the data. Use Amazon Kinesis Data Analytics with Apache Flink to process the data in real time. Set the retention period of the Kinesis data stream to 8,760 hours.
C. Use Amazon Managed Streaming for Apache Kafka (Amazon MSK) to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Amazon MSK stream. Set the log retention hours to 8,760.
D. Use Amazon Kinesis Data Streams to collect the data. Use Amazon EMR with Apache Flink to consume and process the data from the Kinesis data stream. Create an Amazon Kinesis Data Firehose delivery stream to store the data in Amazon S3. Set an S3 Lifecycle policy to delete the data after 365 days.
Suggested answer: B

Explanation:


Reference: https://docs.aws.amazon.com/streams/latest/dev/kinesis-dg.pdf
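
The retention setting from option B is a single API call; a minimal sketch with a hypothetical stream name (8,760 hours equals 365 days, the maximum retention Kinesis Data Streams supports).

import boto3

kinesis = boto3.client("kinesis")

# Keep raw clickstream records available to other consumers for a full year.
kinesis.increase_stream_retention_period(
    StreamName="clickstream",
    RetentionPeriodHours=8760,
)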

A company’s data analyst needs to ensure that queries run in Amazon Athena cannot scan more than a prescribed amount of data for cost control purposes. Queries that exceed the prescribed threshold must be canceled immediately. What should the data analyst do to achieve this?

A. Configure Athena to invoke an AWS Lambda function that terminates queries when the prescribed threshold is crossed.
B. For each workgroup, set the control limit for each query to the prescribed threshold.
C. Enforce the prescribed threshold on all Amazon S3 bucket policies.
D. For each workgroup, set the workgroup-wide data usage control limit to the prescribed threshold.
Suggested answer: D

Explanation:


Reference: https://docs.aws.amazon.com/athena/latest/ug/workgroups-setting-control-limits-cloudwatch.html
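
A sketch of option D with boto3: the workgroup's per-query data usage control (BytesScannedCutoffPerQuery) cancels any query that scans more than the configured amount. The workgroup name, result location, and 1 TB threshold are hypothetical.

import boto3

athena = boto3.client("athena")

# Queries in this workgroup are canceled as soon as they scan more than 1 TB.
athena.create_work_group(
    Name="cost-controlled-analysts",
    Configuration={
        "BytesScannedCutoffPerQuery": 1_000_000_000_000,  # bytes
        "ResultConfiguration": {"OutputLocation": "s3://query-results-bucket/athena/"},
        "EnforceWorkGroupConfiguration": True,
    },
)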

A retail company is using an Amazon S3 bucket to host an ecommerce data lake. The company is using AWS Lake Formation to manage the data lake.

A data analytics specialist must provide access to a new business analyst team. The team will use Amazon Athena from the AWS Management Console to query data from existing web_sales and customer tables in the ecommerce database. The team needs read-only access and the ability to uniquely identify customers by using first and last names. However, the team must not be able to see any other personally identifiable data. The table structure is as follows:

Which combination of steps should the data analytics specialist take to provide the required permission by using the principle of least privilege? (Choose three.)

A. In AWS Lake Formation, grant the business_analyst group SELECT and ALTER permissions for the web_sales table.
B. In AWS Lake Formation, grant the business_analyst group the SELECT permission for the web_sales table.
C. In AWS Lake Formation, grant the business_analyst group the SELECT permission for the customer table. Under columns, choose filter type “Include columns” with columns first_name, last_name, and customer_id.
D. In AWS Lake Formation, grant the business_analyst group SELECT and ALTER permissions for the customer table. Under columns, choose filter type “Include columns” with columns first_name and last_name.
E. Create users under a business_analyst IAM group. Create a policy that allows the lakeformation:GetDataAccess action, the athena:* action, and the glue:Get* action.
F. Create users under a business_analyst IAM group. Create a policy that allows the lakeformation:GetDataAccess action, the athena:* action, and the glue:Get* action. In addition, allow the s3:GetObject action, the s3:PutObject action, and the s3:GetBucketLocation action for the Athena query results S3 bucket.
Suggested answer: B, C, F
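
A sketch of the column-filtered Lake Formation grant on the customer table with boto3; the database name and the principal (shown here as a hypothetical IAM role ARN used by the analyst team) are assumptions, and the web_sales grant would be the same call without the column filter.

import boto3

lakeformation = boto3.client("lakeformation")

# Grant SELECT on only the columns the analysts are allowed to see.
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/business-analyst-role"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "ecommerce",
            "Name": "customer",
            "ColumnNames": ["first_name", "last_name", "customer_id"],
        }
    },
    Permissions=["SELECT"],
)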