Amazon DAS-C01 Practice Test - Questions Answers, Page 7


A software company hosts an application on AWS, and new features are released weekly. As part of the application testing process, a solution must be developed that analyzes logs from each Amazon EC2 instance to ensure that the application is working as expected after each deployment. The collection and analysis solution should be highly available with the ability to display new information with minimal delays. Which method should the company use to collect and analyze the logs?

A. Enable detailed monitoring on Amazon EC2, use the Amazon CloudWatch agent to store logs in Amazon S3, and use Amazon Athena for fast, interactive log analytics.
B. Use the Amazon Kinesis Producer Library (KPL) agent on Amazon EC2 to collect and send data to Kinesis Data Streams to further push the data to Amazon OpenSearch Service (Amazon Elasticsearch Service) and visualize using Amazon QuickSight.
C. Use the Amazon Kinesis Producer Library (KPL) agent on Amazon EC2 to collect and send data to Kinesis Data Firehose to further push the data to Amazon OpenSearch Service (Amazon Elasticsearch Service) and OpenSearch Dashboards (Kibana).
D. Use Amazon CloudWatch subscriptions to get access to a real-time feed of logs and have the logs delivered to Amazon Kinesis Data Streams to further push the data to Amazon OpenSearch Service (Amazon Elasticsearch Service) and OpenSearch Dashboards (Kibana).
Suggested answer: D

Explanation:
Reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html
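
For the suggested answer, the key building block is a CloudWatch Logs subscription filter that streams log events into Kinesis Data Streams. A minimal boto3 sketch is shown below; the log group, stream, and role names are hypothetical, and the stream consumer that indexes the events into Amazon OpenSearch Service is assumed to exist separately.

```python
import boto3

logs = boto3.client("logs")

# Hypothetical names/ARNs for illustration only.
logs.put_subscription_filter(
    logGroupName="/app/ec2/deployment-tests",
    filterName="ship-to-kinesis",
    filterPattern="",  # an empty pattern forwards every log event
    destinationArn="arn:aws:kinesis:us-east-1:111122223333:stream/app-log-stream",
    roleArn="arn:aws:iam::111122223333:role/CWLtoKinesisRole",
)
```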

A financial company uses Apache Hive on Amazon EMR for ad-hoc queries. Users are complaining of sluggish performance.

A data analyst notes the following:

Approximately 90% of queries are submitted 1 hour after the market opens.
Hadoop Distributed File System (HDFS) utilization never exceeds 10%.

Which solution would help address the performance issues?

A. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch CapacityRemainingGB metric.
B. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch YARNMemoryAvailablePercentage metric.
C. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch CapacityRemainingGB metric.
D. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch YARNMemoryAvailablePercentage metric.
Suggested answer: C
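
Whichever option is chosen, the mechanism is an EMR automatic scaling policy attached to an instance group and triggered by a CloudWatch metric. A minimal boto3 sketch follows; the cluster and instance-group IDs are hypothetical, and the metric shown (CapacityRemainingGB, as in the suggested answer) could be swapped for YARNMemoryAvailablePercentage.

```python
import boto3

emr = boto3.client("emr")

# Hypothetical cluster and instance-group IDs for illustration.
emr.put_auto_scaling_policy(
    ClusterId="j-EXAMPLECLUSTER",
    InstanceGroupId="ig-EXAMPLETASKGROUP",
    AutoScalingPolicy={
        "Constraints": {"MinCapacity": 2, "MaxCapacity": 20},
        "Rules": [
            {
                "Name": "ScaleOutOnLowRemainingCapacity",
                "Action": {
                    "SimpleScalingPolicyConfiguration": {
                        "AdjustmentType": "CHANGE_IN_CAPACITY",
                        "ScalingAdjustment": 2,
                        "CoolDown": 300,
                    }
                },
                "Trigger": {
                    "CloudWatchAlarmDefinition": {
                        "ComparisonOperator": "LESS_THAN",
                        "EvaluationPeriods": 1,
                        # Metric per the suggested answer; threshold is illustrative.
                        "MetricName": "CapacityRemainingGB",
                        "Namespace": "AWS/ElasticMapReduce",
                        "Period": 300,
                        "Statistic": "AVERAGE",
                        "Threshold": 100.0,
                    }
                },
            }
        ],
    },
)
```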

A company hosts an on-premises PostgreSQL database that contains historical data. An internal legacy application uses the database for read-only activities. The company’s business team wants to move the data to a data lake in Amazon S3 as soon as possible and enrich the data for analytics.

The company has set up an AWS Direct Connect connection between its VPC and its on-premises network. A data analytics specialist must design a solution that achieves the business team’s goals with the least operational overhead. Which solution meets these requirements?

A. Upload the data from the on-premises PostgreSQL database to Amazon S3 by using a customized batch upload process. Use the AWS Glue crawler to catalog the data in Amazon S3. Use an AWS Glue job to enrich and store the result in a separate S3 bucket in Apache Parquet format. Use Amazon Athena to query the data.
B. Create an Amazon RDS for PostgreSQL database and use AWS Database Migration Service (AWS DMS) to migrate the data into Amazon RDS. Use AWS Data Pipeline to copy and enrich the data from the Amazon RDS for PostgreSQL table and move the data to Amazon S3. Use Amazon Athena to query the data.
C. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Create an Amazon Redshift cluster and use Amazon Redshift Spectrum to query the data.
D. Configure an AWS Glue crawler to use a JDBC connection to catalog the data in the on-premises database. Use an AWS Glue job to enrich the data and save the result to Amazon S3 in Apache Parquet format. Use Amazon Athena to query the data.
Suggested answer: B
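
The suggested answer's first step is an AWS DMS full-load task from the on-premises PostgreSQL database into Amazon RDS. A minimal boto3 sketch of that task is below; the endpoint and replication-instance ARNs are hypothetical and assumed to have been created beforehand, and the later Data Pipeline and Athena steps are omitted.

```python
import json
import boto3

dms = boto3.client("dms")

# Hypothetical ARNs: source = on-premises PostgreSQL reached over Direct
# Connect, target = Amazon RDS for PostgreSQL.
dms.create_replication_task(
    ReplicationTaskIdentifier="historical-data-full-load",
    SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:SRCEXAMPLE",
    TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:TGTEXAMPLE",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:EXAMPLE",
    MigrationType="full-load",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all-tables",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)
```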

An airline has been collecting metrics on flight activities for analytics. A recently completed proof of concept demonstrates how the company provides insights to data analysts to improve on-time departures. The proof of concept used objects in Amazon S3, which contained the metrics in .csv format, and used Amazon Athena for querying the data. As the amount of data increases, the data analyst wants to optimize the storage solution to improve query performance.

Which options should the data analyst use to improve performance as the data lake grows? (Choose three.)

A. Add a randomized string to the beginning of the keys in S3 to get more throughput across partitions.
B. Use an S3 bucket in the same account as Athena.
C. Compress the objects to reduce the data transfer I/O.
D. Use an S3 bucket in the same Region as Athena.
E. Preprocess the .csv data to JSON to reduce I/O by fetching only the document keys needed by the query.
F. Preprocess the .csv data to Apache Parquet to reduce I/O by fetching only the data blocks needed for predicates.
Suggested answer: A, C, E
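
Option C, one of the suggested answers, simply calls for compressing the objects that Athena scans. Below is a minimal sketch under the assumption that the raw files are plain .csv and gzip is an acceptable codec; the bucket and key names are hypothetical.

```python
import gzip
import boto3

s3 = boto3.client("s3")

# Re-upload a raw .csv object as a gzip-compressed copy so Athena scans fewer
# bytes; Athena reads .csv.gz files based on the file extension.
obj = s3.get_object(Bucket="flight-metrics-raw", Key="2023/06/metrics.csv")
compressed = gzip.compress(obj["Body"].read())

s3.put_object(
    Bucket="flight-metrics-optimized",
    Key="2023/06/metrics.csv.gz",
    Body=compressed,
)
```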

A technology company is creating a dashboard that will visualize and analyze time-sensitive data. The data will come in through Amazon Kinesis Data Firehose with the buffer interval set to 60 seconds. The dashboard must support near-real-time data.

Which visualization solution will meet these requirements?

A. Select Amazon OpenSearch Service (Amazon Elasticsearch Service) as the endpoint for Kinesis Data Firehose. Set up OpenSearch Dashboards (Kibana) using the data in Amazon OpenSearch Service (Amazon ES) with the desired analyses and visualizations.
B. Select Amazon S3 as the endpoint for Kinesis Data Firehose. Read data into an Amazon SageMaker Jupyter notebook and carry out the desired analyses and visualizations.
C. Select Amazon Redshift as the endpoint for Kinesis Data Firehose. Connect Amazon QuickSight with SPICE to Amazon Redshift to create the desired analyses and visualizations.
D. Select Amazon S3 as the endpoint for Kinesis Data Firehose. Use AWS Glue to catalog the data and Amazon Athena to query it. Connect Amazon QuickSight with SPICE to Athena to create the desired analyses and visualizations.
Suggested answer: A
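
The suggested answer points Kinesis Data Firehose at an Amazon OpenSearch Service (Amazon ES) domain. A minimal boto3 sketch of that delivery stream is below, mirroring the 60-second buffer interval from the question; the ARNs, domain name, and index name are hypothetical.

```python
import boto3

firehose = boto3.client("firehose")

# Hypothetical ARNs; the S3 configuration receives records that fail indexing.
firehose.create_delivery_stream(
    DeliveryStreamName="dashboard-metrics",
    DeliveryStreamType="DirectPut",
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::111122223333:role/FirehoseToESRole",
        "DomainARN": "arn:aws:es:us-east-1:111122223333:domain/metrics",
        "IndexName": "metrics",
        "IndexRotationPeriod": "OneDay",
        # 60-second buffer interval, per the question.
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 5},
        "S3BackupMode": "FailedDocumentsOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::111122223333:role/FirehoseToESRole",
            "BucketARN": "arn:aws:s3:::metrics-firehose-backup",
        },
    },
)
```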

A technology company has an application with millions of active users every day. The company queries daily usage data with Amazon Athena to understand how users interact with the application. The data includes the date and time, the location ID, and the services used. The company wants to use Athena to run queries to analyze the data with the lowest latency possible. Which solution meets these requirements?

A. Store the data in Apache Avro format with the date and time as the partition, with the data sorted by the location ID.
B. Store the data in Apache Parquet format with the date and time as the partition, with the data sorted by the location ID.
C. Store the data in Apache ORC format with the location ID as the partition, with the data sorted by the date and time.
D. Store the data in .csv format with the location ID as the partition, with the data sorted by the date and time.
Suggested answer: C

Explanation:
Reference: https://cwiki.apache.org/confluence/display/hive/languagemanual+orc
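
One way to land data in the layout the suggested answer describes is an Athena CTAS statement that rewrites the raw table as ORC partitioned by location ID. A minimal sketch follows; the database, table, column names, and S3 locations are hypothetical, and per-file sorting by date and time is left to the job that writes the data.

```python
import boto3

athena = boto3.client("athena")

# CTAS rewrites the raw usage table as ORC partitioned by location_id.
# The partition column must be the last column in the SELECT list.
ctas = """
CREATE TABLE usage_optimized
WITH (
    format = 'ORC',
    external_location = 's3://usage-data-optimized/orc/',
    partitioned_by = ARRAY['location_id']
)
AS
SELECT event_time, service_used, location_id
FROM usage_raw
"""

athena.start_query_execution(
    QueryString=ctas,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/"},
)
```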

A company is hosting an enterprise reporting solution with Amazon Redshift. The application provides reporting capabilities to three main groups: an executive group to access financial reports, a data analyst group to run long-running ad hoc queries, and a data engineering group to run stored procedures and ETL processes. The executive team requires queries to run with optimal performance. The data engineering team expects queries to take minutes. Which Amazon Redshift feature meets the requirements for this task?

A. Concurrency scaling
B. Short query acceleration (SQA)
C. Workload management (WLM)
D. Materialized views
Suggested answer: D

Explanation:
Materialized views
Reference: https://aws.amazon.com/redshift/faqs/
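
The suggested answer, materialized views, precomputes the executive reports inside Amazon Redshift so they can be read with minimal latency. A minimal sketch using the Redshift Data API is below; the cluster, database, and table names are hypothetical.

```python
import boto3

rsd = boto3.client("redshift-data")

# Create a materialized view over the (hypothetical) financial fact table.
# REFRESH MATERIALIZED VIEW mv_financial_summary; would be run after each ETL load.
rsd.execute_statement(
    ClusterIdentifier="reporting-cluster",
    Database="analytics",
    DbUser="admin",
    Sql="""
        CREATE MATERIALIZED VIEW mv_financial_summary AS
        SELECT fiscal_quarter, business_unit, SUM(revenue) AS total_revenue
        FROM fact_financials
        GROUP BY fiscal_quarter, business_unit;
    """,
)
```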

A retail company is building its data warehouse solution using Amazon Redshift. As a part of that effort, the company is loading hundreds of files into the fact table created in its Amazon Redshift cluster. The company wants the solution to achieve the highest throughput and optimally use cluster resources when loading data into the company’s fact table. How should the company meet these requirements?

A. Use multiple COPY commands to load the data into the Amazon Redshift cluster.
B. Use S3DistCp to load multiple files into the Hadoop Distributed File System (HDFS) and use an HDFS connector to ingest the data into the Amazon Redshift cluster.
C. Use LOAD commands equal to the number of Amazon Redshift cluster nodes and load the data in parallel into each node.
D. Use a single COPY command to load the data into the Amazon Redshift cluster.
Suggested answer: B
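
Whichever staging location is used, the load into Amazon Redshift ends with a COPY command, which splits the input files across the cluster's slices so a single command loads many files in parallel; COPY can read from HDFS on an EMR cluster (emr://) as well as from Amazon S3 (s3://). A minimal sketch using the Redshift Data API is below; the cluster, table, path, and role identifiers are hypothetical.

```python
import boto3

rsd = boto3.client("redshift-data")

# Single COPY that loads all matching HDFS part files from a (hypothetical)
# EMR cluster into the fact table in parallel.
rsd.execute_statement(
    ClusterIdentifier="dw-cluster",
    Database="warehouse",
    DbUser="loader",
    Sql="""
        COPY sales_fact
        FROM 'emr://j-EXAMPLECLUSTER/fact-output/part-*'
        IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftLoadRole'
        DELIMITER '|';
    """,
)
```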

A company is migrating from an on-premises Apache Hadoop cluster to an Amazon EMR cluster. The cluster runs only during business hours. Due to a company requirement to avoid intraday cluster failures, the EMR cluster must be highly available. When the cluster is terminated at the end of each business day, the data must persist. Which configurations would enable the EMR cluster to meet these requirements? (Choose three.)

A. EMR File System (EMRFS) for storage
B. Hadoop Distributed File System (HDFS) for storage
C. AWS Glue Data Catalog as the metastore for Apache Hive
D. MySQL database on the master node as the metastore for Apache Hive
E. Multiple master nodes in a single Availability Zone
F. Multiple master nodes in multiple Availability Zones
Suggested answer: B, C, F
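
Two of the suggested configurations, multiple master nodes and the AWS Glue Data Catalog as the Hive metastore, are set when the cluster is launched. A minimal boto3 sketch is below; the release label, instance types, subnet, and role names are hypothetical.

```python
import boto3

emr = boto3.client("emr")

# Three master nodes give the cluster a highly available primary, and the
# hive-site classification points Hive at the AWS Glue Data Catalog so table
# metadata survives cluster termination.
emr.run_job_flow(
    Name="business-hours-cluster",
    ReleaseLabel="emr-6.10.0",
    Applications=[{"Name": "Hive"}, {"Name": "Spark"}],
    Instances={
        "Ec2SubnetId": "subnet-EXAMPLE",
        "KeepJobFlowAliveWhenNoSteps": True,
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 3},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
    },
    Configurations=[
        {
            "Classification": "hive-site",
            "Properties": {
                "hive.metastore.client.factory.class":
                    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
            },
        }
    ],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```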

An online retail company with millions of users around the globe wants to improve its ecommerce analytics capabilities.

Currently, clickstream data is uploaded directly to Amazon S3 as compressed files. Several times each day, an application running on Amazon EC2 processes the data and makes search options and reports available for visualization by editors and marketers. The company wants to make website clicks and aggregated data available to editors and marketers in minutes to enable them to connect with users more effectively.

Which options will help meet these requirements in the MOST efficient way? (Choose two.)

A. Use Amazon Kinesis Data Firehose to upload compressed and batched clickstream records to Amazon OpenSearch Service (Amazon Elasticsearch Service).
B. Upload clickstream records to Amazon S3 as compressed files. Then use AWS Lambda to send data to Amazon OpenSearch Service (Amazon Elasticsearch Service) from Amazon S3.
C. Use Amazon OpenSearch Service (Amazon Elasticsearch Service) deployed on Amazon EC2 to aggregate, filter, and process the data. Refresh content performance dashboards in near-real time.
D. Use OpenSearch Dashboards (Kibana) to aggregate, filter, and visualize the data stored in Amazon OpenSearch Service (Amazon Elasticsearch Service). Refresh content performance dashboards in near-real time.
E. Upload clickstream records from Amazon S3 to Amazon Kinesis Data Streams and use a Kinesis Data Streams consumer to send records to Amazon OpenSearch Service (Amazon Elasticsearch Service).
Suggested answer: C, E
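
Option E, one of the suggested answers, moves uploaded clickstream files from Amazon S3 into Amazon Kinesis Data Streams so a consumer can index them into Amazon OpenSearch Service. A minimal producer-side sketch is below, assuming the uploaded objects are gzip-compressed, newline-delimited records; the bucket, key, and stream names are hypothetical.

```python
import gzip
import boto3

s3 = boto3.client("s3")
kinesis = boto3.client("kinesis")

# Read one newly uploaded clickstream object and forward its records to a
# Kinesis data stream; a downstream consumer indexes them into OpenSearch.
obj = s3.get_object(Bucket="clickstream-uploads", Key="2023/06/clicks-0001.json.gz")
lines = gzip.decompress(obj["Body"].read()).decode("utf-8").splitlines()

records = [
    {"Data": line.encode("utf-8"), "PartitionKey": str(i % 8)}
    for i, line in enumerate(lines)
    if line
]

# put_records accepts up to 500 records per call, so send in chunks.
for start in range(0, len(records), 500):
    kinesis.put_records(
        StreamName="clickstream-stream",
        Records=records[start:start + 500],
    )
```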