Amazon DAS-C01 Practice Test - Questions Answers, Page 18
List of questions
Related questions
A machinery company wants to collect data from sensors. A data analytics specialist needs to implement a solution that aggregates the data in near-real time and saves the data to a persistent data store. The data must be stored in nested JSON format and must be queried from the data store with a latency of single-digit milliseconds.
Which solution will meet these requirements?
A financial company uses Amazon Athena to query data from an Amazon S3 data lake. Files are stored in the S3 data lake in Apache ORC format. Data analysts recently introduced nested fields in the data lake ORC files, and noticed that queries are taking longer to run in Athena. A data analysts discovered that more data than what is required is being scanned for the queries.
What is the MOST operationally efficient solution to improve query performance?
A gaming company is building a serverless data lake. The company is ingesting streaming data into Amazon Kinesis Data Streams and is writing the data to Amazon S3 through Amazon Kinesis Data Firehose. The company is using 10 MB as the S3 buffer size and is using 90 seconds as the buffer interval. The company runs an AWS Glue ET L job to merge and transform the data to a different format before writing the data back to Amazon S3.
Recently, the company has experienced substantial growth in its data volume. The AWS Glue ETL jobs are frequently showing an OutOfMemoryError error.
Which solutions will resolve this issue without incurring additional costs? (Select TWO.)
A company uses Amazon Connect to manage its contact center. The company uses Salesforce to manage its customer relationship management (CRM) data. The company must build a pipeline to ingest data from Amazon Connect and Salesforce into a data lake that is built on Amazon S3.
Which solution will meet this requirement with the LEAST operational overhead?
A company is designing a data warehouse to support business intelligence reporting. Users will access the executive dashboard heavily each Monday and Friday morning for I hour. These read-only queries will run on the active Amazon Redshift cluster, which runs on dc2.8xIarge compute nodes 24 hours a day, 7 days a week. There are three queues set up in workload management: Dashboard, ETL, and System. The Amazon Redshift cluster needs to process the queries without wait time.
What is the MOST cost-effective way to ensure that the cluster processes these queries?
A marketing company has an application that stores event data in an Amazon RDS database. The company is replicating this data to Amazon Redshift for reporting and business intelligence (BI) purposes. New event data is continuously generated and ingested into the RDS database throughout the day and captured by a change data capture (CDC) replication task in AWS Database Migration Service (AWS DMS). The company requires that the new data be replicated to Amazon Redshift in near-real time.
Which solution meets these requirements?
A company is creating a data lake by using AWS Lake Formation. The data that will be stored in the data lake contains sensitive customer information and must be encrypted at rest using an AWS Key Management Service (AWS KMS) customer managed key to meet regulatory requirements.
How can the company store the data in the data lake to meet these requirements?
An ecommerce company uses Amazon Aurora PostgreSQL to process and store live transactional data and uses Amazon Redshift for its data warehouse solution. A nightly ET L job has been implemented to update the Redshift cluster with new data from the PostgreSQL database. The business has grown rapidly and so has the size and cost of the Redshift cluster. The company's data analytics team needs to create a solution to archive historical data and only keep the most recent 12 months of data in Amazon
Redshift to reduce costs. Data analysts should also be able to run analytics queries that effectively combine data from live transactional data in PostgreSQL, current data in Redshift, and archived historical data.
Which combination of tasks will meet these requirements? (Select THREE.)
A large marketing company needs to store all of its streaming logs and create near-real-time dashboards. The dashboards will be used to help the company make critical business decisions and must be highly available.
Which solution meets these requirements?
A company's system operators and security engineers need to analyze activities within specific date ranges of AWS CloudTrail logs. All log files are stored in an Amazon S3 bucket, and the size of the logs is more than 5 T B. The solution must be cost-effective and maximize query performance.
Which solution meets these requirements?
Question