Amazon BDS-C00 Practice Test - Questions Answers, Page 3

A data engineer is about to perform a major upgrade to the DDL contained within an Amazon Redshift cluster to support a new data warehouse application. The upgrade scripts will include user permission updates, view and table structure changes, as well as additional loading and data manipulation tasks.

The data engineer must be able to restore the database to its existing state in the event of issues. Which action should be taken prior to performing this upgrade task?

A. Run an UNLOAD command for all data in the warehouse and save it to S3.
B. Create a manual snapshot of the Amazon Redshift cluster.
C. Make a copy of the automated snapshot on the Amazon Redshift cluster.
D. Call the waitForSnapshotAvailable command from either the AWS CLI or an AWS SDK.
Suggested answer: B

Explanation:

Reference: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-snapshots.html#working-with-snapshot-restore-table-from-snapshot
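
For reference, a manual snapshot (option B) is a single API call taken before the upgrade; a minimal boto3 sketch, where the cluster and snapshot identifiers are hypothetical:

```python
import boto3

redshift = boto3.client("redshift")

# Take a point-in-time manual snapshot before running the DDL upgrade.
# Cluster and snapshot identifiers below are hypothetical.
redshift.create_cluster_snapshot(
    SnapshotIdentifier="pre-upgrade-snapshot",
    ClusterIdentifier="dwh-cluster",
)

# Block until the snapshot is available before starting the upgrade scripts.
waiter = redshift.get_waiter("snapshot_available")
waiter.wait(SnapshotIdentifier="pre-upgrade-snapshot")
```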

A large oil and gas company needs to provide near real-time alerts when peak thresholds are exceeded in its pipeline system. The company has developed a system to capture pipeline metrics such as flow rate, pressure, and temperature using millions of sensors. The sensors deliver their data to AWS IoT.

What is a cost-effective way to provide near real-time alerts on the pipeline metrics?

A. Create an AWS IoT rule to generate an Amazon SNS notification.
B. Store the data points in an Amazon DynamoDB table and poll it for peak metrics data from an Amazon EC2 application.
C. Create an Amazon Machine Learning model and invoke it with AWS Lambda.
D. Use Amazon Kinesis Streams and a KCL-based application deployed on AWS Elastic Beanstalk.
Suggested answer: C
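
For reference, option C's pattern is an AWS Lambda function that calls an Amazon Machine Learning real-time endpoint for each reading and alerts via SNS; a minimal sketch, where the model ID, endpoint, topic ARN, label, and event shape are all hypothetical:

```python
import boto3

ml = boto3.client("machinelearning")
sns = boto3.client("sns")

def handler(event, context):
    # 'event' is one sensor reading forwarded from AWS IoT (hypothetical shape).
    prediction = ml.predict(
        MLModelId="ml-pipeline-anomaly",  # hypothetical model ID
        Record={
            "flow_rate": str(event["flow_rate"]),
            "pressure": str(event["pressure"]),
            "temperature": str(event["temperature"]),
        },
        PredictEndpoint="https://realtime.machinelearning.us-east-1.amazonaws.com",
    )
    # "1" is a hypothetical label meaning "peak threshold exceeded".
    if prediction["Prediction"]["predictedLabel"] == "1":
        sns.publish(
            TopicArn="arn:aws:sns:us-east-1:123456789012:pipeline-alerts",
            Message="Peak threshold exceeded",
        )
```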

A company is using Amazon Machine Learning as part of a medical software application. The application will predict the most likely blood type for a patient based on a variety of other clinical tests that are available when blood type knowledge is unavailable. What is the appropriate model choice and target attribute combination for this problem?

A. Multi-class classification model with a categorical target attribute.
B. Regression model with a numeric target attribute.
C. Binary classification model with a categorical target attribute.
D. K-Nearest Neighbors model with a multi-class target attribute.
Suggested answer: A
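
For reference, option A maps directly onto the Amazon Machine Learning API: the model type is set to MULTICLASS and the training datasource's schema marks a categorical target attribute. A minimal boto3 sketch with hypothetical identifiers:

```python
import boto3

ml = boto3.client("machinelearning")

# A multiclass model predicts one of several categories (here: blood types).
# All identifiers below are hypothetical; the datasource's schema is assumed
# to declare the categorical blood-type attribute as the target.
ml.create_ml_model(
    MLModelId="ml-blood-type",
    MLModelName="blood-type-predictor",
    MLModelType="MULTICLASS",
    TrainingDataSourceId="ds-clinical-tests",
)
```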

A data engineer is running a data warehouse (DWH) on a 25-node Amazon Redshift cluster for a SaaS service. The data engineer needs to build a dashboard that will be used by customers. Five big customers represent 80% of usage, and there is a long tail of dozens of smaller customers. The data engineer has selected the dashboarding tool.

How should the data engineer make sure that the larger customer workloads do NOT interfere with the smaller customer workloads?

A. Apply query filters based on customer-id that can NOT be changed by the user and apply distribution keys on customer-id.
B. Place the largest customers into a single user group with a dedicated query queue and place the rest of the customers into a different query queue.
C. Push aggregations into an RDS for Aurora instance. Connect the dashboard application to Aurora rather than Redshift for faster queries.
D. Route the largest customers to a dedicated Redshift cluster. Raise the concurrency of the multi-tenant Redshift cluster to accommodate the remaining customers.
Suggested answer: D
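
For reference, the "raise the concurrency" half of option D is a workload management (WLM) change on the multi-tenant cluster's parameter group; a minimal boto3 sketch, with a hypothetical parameter group name and queue layout:

```python
import json

import boto3

redshift = boto3.client("redshift")

# Raise the WLM query concurrency on the multi-tenant cluster.
# Parameter group name and slot count are hypothetical.
wlm = [{"query_concurrency": 15}]  # single default queue, more slots

redshift.modify_cluster_parameter_group(
    ParameterGroupName="multi-tenant-dwh",
    Parameters=[
        {
            "ParameterName": "wlm_json_configuration",
            "ParameterValue": json.dumps(wlm),
        }
    ],
)
```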

An Amazon Kinesis stream needs to be encrypted.

Which approach should be used to accomplish this task?

A. Perform a client-side encryption of the data before it enters the Amazon Kinesis stream on the producer.
B. Use a partition key to segment the data by MD5 hash functions, which makes it undecipherable while in transit.
C. Perform a client-side encryption of the data before it enters the Amazon Kinesis stream on the consumer.
D. Use a shard to segment the data, which has built-in functionality to make it indecipherable while in transit.
Suggested answer: B

Explanation:

Reference: https://docs.aws.amazon.com/firehose/latest/dev/encryption.html
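
For context, the client-side approach described in option A (encrypt on the producer before the record enters the stream) might look like the sketch below; the stream name and record are hypothetical, and the `cryptography` package's Fernet recipe stands in for real key management (in practice the key would come from AWS KMS or another secret store):

```python
import boto3
from cryptography.fernet import Fernet

kinesis = boto3.client("kinesis")

# Hypothetical key handling; a real producer would fetch the key from KMS.
fernet = Fernet(Fernet.generate_key())

def put_encrypted(record_bytes: bytes, partition_key: str) -> None:
    # Encrypt on the producer so the payload is ciphertext in the stream.
    kinesis.put_record(
        StreamName="pipeline-stream",  # hypothetical stream name
        Data=fernet.encrypt(record_bytes),
        PartitionKey=partition_key,
    )

put_encrypted(b'{"sensor": 42, "pressure": 98.6}', "sensor-42")
```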

An online photo album app has a key design feature to support multiple screens (e.g., desktop, mobile phone, and tablet) with high-quality displays. Multiple versions of the image must be saved in different resolutions and layouts.

The image-processing Java program takes an average of five seconds per upload, depending on the image size and format. Each image upload captures the following image metadata: user, album, photo label, upload timestamp.

The app should support the following requirements:

Hundreds of user image uploads per second

Maximum image upload size of 10 MB

Maximum image metadata size of 1 KB

Image displayed in optimized resolution on all supported screens no later than one minute after image upload

Which strategy should be used to meet these requirements?

A. Write images and metadata to Amazon Kinesis. Use a Kinesis Client Library (KCL) application to run the image processing and save the image output to Amazon S3 and metadata to the app repository DB.
B. Write image and metadata to RDS with BLOB data type. Use AWS Data Pipeline to run the image processing and save the image output to Amazon S3 and metadata to the app repository DB.
C. Upload image with metadata to Amazon S3, use a Lambda function to run the image processing and save the image output to Amazon S3 and metadata to the app repository DB.
D. Write image and metadata to Amazon Kinesis. Use Amazon Elastic MapReduce (EMR) with Spark Streaming to run image processing and save the image output to Amazon S3 and metadata to the app repository DB.
Suggested answer: C
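
For reference, option C's flow (an S3 upload event triggers a Lambda that renders the resized versions and records the metadata) might be sketched like this; the bucket names, table name, and resizing stub are hypothetical, and the actual image processing is elided:

```python
import urllib.parse

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
metadata_table = dynamodb.Table("photo-metadata")  # hypothetical table

def resize_for(image_bytes: bytes, screen: str) -> bytes:
    # Placeholder for the real image processing (e.g. Pillow resizing).
    return image_bytes

def handler(event, context):
    # Invoked by an S3 ObjectCreated event for each uploaded image.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        original = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        # One rendition per supported screen type.
        for screen in ("desktop", "mobile", "tablet"):
            s3.put_object(
                Bucket="photo-renditions",  # hypothetical output bucket
                Key=f"{screen}/{key}",
                Body=resize_for(original, screen),
            )

        # The upload's metadata (user, album, label, timestamp) is assumed
        # to ride along as S3 user-defined object metadata.
        head = s3.head_object(Bucket=bucket, Key=key)
        metadata_table.put_item(Item={"photo": key, **head.get("Metadata", {})})
```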

A customer needs to determine the optimal distribution strategy for the ORDERS fact table in its Redshift schema. The ORDERS table has foreign key relationships with multiple dimension tables in this schema.

How should the company determine the most appropriate distribution key for the ORDERS table?

A. Identify the largest and most frequently joined dimension table and ensure that it and the ORDERS table both have EVEN distribution.
B. Identify the largest dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
C. Identify the smallest dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
D. Identify the largest and the most frequently joined dimension table and designate the key of this dimension table as the distribution key of the ORDERS table.
Suggested answer: D

Explanation:

Reference: https://aws.amazon.com/blogs/big-data/optimizing-for-star-schemas-and-interleaved-sorting-on-amazon-redshift/
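
For reference, once the largest, most frequently joined dimension is identified (option D), the choice shows up as matching DISTKEY declarations so joined rows co-locate on the same slice; a hypothetical DDL sketch, held as Python strings since the table and column names are invented:

```python
# Option D expressed as Redshift DDL. CUSTOMER is assumed to be the
# largest and most frequently joined dimension; all names are hypothetical.
CUSTOMER_DDL = """
CREATE TABLE customer (
    customer_key  INT NOT NULL DISTKEY,
    customer_name VARCHAR(128)
);
"""

ORDERS_DDL = """
CREATE TABLE orders (
    order_key    BIGINT NOT NULL,
    customer_key INT NOT NULL DISTKEY,  -- same key => co-located joins
    order_date   DATE,
    amount       DECIMAL(12, 2)
);
"""
```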

A customer is collecting clickstream data using Amazon Kinesis and is grouping the events by IP address into 5-minute chunks stored in Amazon S3.

Many analysts in the company use Hive on Amazon EMR to analyze this data. Their queries always reference a single IP address. Data must be optimized for querying based on IP address using Hive running on Amazon EMR. What is the most efficient method to query the data with Hive?

A. Store an index of the files by IP address in the Amazon DynamoDB metadata store for EMRFS.
B. Store the Amazon S3 objects with the following naming scheme: bucket_name/source=ip_address/year=yy/month=mm/day=dd/hour=hh/filename.
C. Store the data in an HBase table with the IP address as the row key.
D. Store the events for an IP address as a single file in Amazon S3 and add metadata with keys: Hive_Partitioned_IPAddress.
Suggested answer: A
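
For context, the naming scheme in option B is Hive's partition-path convention, under which a partitioned external table prunes everything except the queried IP's files; a hypothetical sketch of the corresponding HiveQL, held as Python strings since the bucket and column names are invented:

```python
# Option B's S3 layout matches Hive partition paths; names are hypothetical.
CLICKSTREAM_DDL = """
CREATE EXTERNAL TABLE clickstream (event STRING)
PARTITIONED BY (source STRING, year STRING, month STRING, day STRING, hour STRING)
LOCATION 's3://bucket_name/';
"""

# A query for one IP address then reads only that IP's partitions.
SINGLE_IP_QUERY = """
SELECT * FROM clickstream WHERE source = '10.0.0.1';
"""
```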

An online retailer is using Amazon DynamoDB to store data related to customer transactions. The items in the table contain several string attributes describing the transaction as well as a JSON attribute containing the shopping cart and other details corresponding to the transaction. Average item size is ~250 KB, most of which is associated with the JSON attribute. The average customer generates ~3 GB of data per month.

Customers access the table to display their transaction history and review transaction details as needed. Ninety percent of the queries against the table are executed when building the transaction history view, with the other 10% retrieving transaction details. The table is partitioned on CustomerID and sorted on transaction date.

The client has very high read capacity provisioned for the table and experiences very even utilization, but complains about the cost of Amazon DynamoDB compared to other NoSQL solutions.

Which strategy will reduce the cost associated with the client's read queries while not degrading quality?

A. Modify all database calls to use eventually consistent reads and advise customers that transaction history may be one second out-of-date.
B. Change the primary table to partition on TransactionID, create a GSI partitioned on customer and sorted on date, project small attributes into the GSI, and then query the GSI for summary data and the primary table for JSON details.
C. Vertically partition the table, store base attributes on the primary table, and create a foreign key reference to a secondary table containing the JSON data. Query the primary table for summary data and the secondary table for JSON details.
D. Create an LSI sorted on date, project the JSON attribute into the index, and then query the primary table for summary data and the LSI for JSON details.
Suggested answer: D
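
For reference, the split read pattern in option D comes down to which index each query targets: the history view reads only small projected attributes, while the detail view pulls the large JSON attribute. A minimal boto3 sketch with hypothetical table, index, and attribute names:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("transactions")  # hypothetical table name

# History view: fetch only the small summary attributes for a customer.
history = table.query(
    KeyConditionExpression="CustomerID = :c",
    ProjectionExpression="TransactionDate, OrderTotal, #s",
    ExpressionAttributeNames={"#s": "Status"},  # 'Status' is a reserved word
    ExpressionAttributeValues={":c": "customer-123"},
)

# Detail view: fetch the large JSON attribute via the LSI (hypothetical name).
details = table.query(
    IndexName="txn-details-lsi",
    KeyConditionExpression="CustomerID = :c AND TransactionDate = :d",
    ExpressionAttributeValues={":c": "customer-123", ":d": "2024-01-01"},
)
```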

A company that manufactures and sells smart air conditioning units also offers add-on services so that customers can see real-time dashboards in a mobile application or a web browser. Each unit sends its sensor information in JSON format every two seconds for processing and analysis. The company also needs to consume this data to predict possible equipment problems before they occur. A few thousand pre-purchased units will be delivered in the next couple of months.

The company expects high market growth in the next year and needs to handle a massive amount of data and scale without interruption. Which ingestion solution should the company use?

A. Write sensor data records to Amazon Kinesis Streams. Process the data using KCL applications for the end-consumer dashboard and anomaly detection workflows.
B. Batch sensor data to Amazon Simple Storage Service (S3) every 15 minutes. Flow the data downstream to the end-consumer dashboard and to the anomaly detection application.
C. Write sensor data records to Amazon Kinesis Firehose with Amazon Simple Storage Service (S3) as the destination. Consume the data with a KCL application for the end-consumer dashboard and anomaly detection.
D. Write sensor data records to Amazon Relational Database Service (RDS). Build both the end-consumer dashboard and anomaly detection application on top of Amazon RDS.
Suggested answer: C
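
For reference, the ingestion side of option C is a single Firehose API call per reading; a minimal boto3 sketch with a hypothetical delivery stream name and record shape:

```python
import json

import boto3

firehose = boto3.client("firehose")

# Each unit sends one JSON reading every two seconds.
# Delivery stream name and record fields are hypothetical.
reading = {"unit_id": "ac-0001", "temp_c": 21.5, "ts": "2024-01-01T00:00:00Z"}

firehose.put_record(
    DeliveryStreamName="ac-sensor-stream",
    Record={"Data": (json.dumps(reading) + "\n").encode("utf-8")},
)
```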