
Amazon BDS-C00 Practice Test - Questions Answers, Page 7

A company hosts a portfolio of e-commerce websites across the Oregon, N. Virginia, Ireland, and Sydney AWS regions. Each site keeps log files that capture user behavior. The company has built an application that generates batches of product recommendations with collaborative filtering in Oregon. Oregon was selected because the flagship site is hosted there and provides the largest collection of data to train machine learning models against. The other regions do NOT have enough historic data to train accurate machine learning models.

Which set of data processing steps improves recommendations for each region?

A. Use the e-commerce application in Oregon to write replica log files in each other region.
B. Use Amazon S3 bucket replication to consolidate log entries and build a single model in Oregon.
C. Use Kinesis as a buffer for web logs and replicate logs to the Kinesis stream of a neighboring region.
D. Use the CloudWatch Logs agent to consolidate logs into a single CloudWatch Logs group.
Suggested answer: D
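
For reference, a minimal boto3 sketch of the approach in option D: hosts in every region ship their web logs into a single CloudWatch Logs group in Oregon (us-west-2), where the models are trained. The group, stream, and message contents below are hypothetical placeholders, not part of the question.

```python
import time
import boto3

# All regions write to one log group in Oregon (us-west-2), where the
# recommendation models are trained. All names below are assumptions.
logs = boto3.client("logs", region_name="us-west-2")

GROUP = "/ecommerce/user-behavior"   # consolidated log group (assumed name)
STREAM = "ireland-site"              # one stream per originating site (assumed)

try:
    logs.create_log_group(logGroupName=GROUP)
    logs.create_log_stream(logGroupName=GROUP, logStreamName=STREAM)
except logs.exceptions.ResourceAlreadyExistsException:
    pass

# Ship a batch of raw log lines; in practice the CloudWatch Logs agent
# does this continuously from each web server.
logs.put_log_events(
    logGroupName=GROUP,
    logStreamName=STREAM,
    logEvents=[{"timestamp": int(time.time() * 1000),
                "message": "GET /product/123 user=abc"}],
)
```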

There are thousands of text files on Amazon S3. The total size of the files is 1 PB. The files contain retail order information for the past 2 years. A data engineer needs to run multiple interactive queries to manipulate the data. The data engineer has AWS access to spin up an Amazon EMR cluster and needs to use an application on the cluster to process this data and return the results in an interactive time frame. Which application on the cluster should the data engineer use?

A. Oozie
B. Apache Pig with Tachyon
C. Apache Hive
D. Presto
Suggested answer: C
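
As a minimal sketch of the suggested setup, the boto3 call below launches an EMR cluster with Hive installed for querying the S3 data (Presto could be added to the Applications list the same way). The cluster name, instance types and counts, and the log bucket are illustrative assumptions.

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # region is an assumption

# Launch a long-lived cluster with Hive (the suggested answer) installed.
# Names, sizes, and the log URI are hypothetical placeholders.
resp = emr.run_job_flow(
    Name="retail-orders-analysis",
    ReleaseLabel="emr-5.30.0",
    Applications=[{"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.2xlarge",
        "InstanceCount": 20,
        "KeepJobFlowAliveWhenNoSteps": True,   # keep alive for interactive use
    },
    LogUri="s3://my-emr-logs/",                # hypothetical bucket
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(resp["JobFlowId"])
```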

A media advertising company handles a large number of real-time messages sourced from over 200 websites. The company's data engineer needs to collect and process records in real time for analysis using Spark Streaming on Amazon Elastic MapReduce (EMR). The data engineer needs to fulfill a corporate mandate to keep ALL raw messages as they are received as a top priority. Which Amazon Kinesis configuration meets these requirements?

A. Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Pull messages off Firehose with Spark Streaming in parallel to persistence to Amazon S3.
B. Publish messages to Amazon Kinesis Streams. Pull messages off Streams with Spark Streaming in parallel to AWS Lambda pushing messages from Streams to Firehose backed by Amazon Simple Storage Service (S3).
C. Publish messages to Amazon Kinesis Firehose backed by Amazon Simple Storage Service (S3). Use AWS Lambda to pull messages from Firehose to Streams for processing with Spark Streaming.
D. Publish messages to Amazon Kinesis Streams, pull messages off with Spark Streaming, and write raw data to Amazon Simple Storage Service (S3) before and after processing.
Suggested answer: C
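
A minimal sketch of the Firehose-backed-by-S3 piece that options A and C share: every record published to the delivery stream is buffered and persisted to S3, which is what satisfies the "keep ALL raw messages" mandate. The stream name, bucket, and IAM role ARN are hypothetical.

```python
import boto3

firehose = boto3.client("firehose", region_name="us-east-1")  # assumed region

# Every record published to this delivery stream is batched and written
# to S3 as-is. All names and ARNs are placeholders.
firehose.create_delivery_stream(
    DeliveryStreamName="raw-ad-messages",
    DeliveryStreamType="DirectPut",
    S3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-to-s3",
        "BucketARN": "arn:aws:s3:::raw-ad-messages-archive",
        "Prefix": "raw/",
        "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 60},
    },
)
```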

A solutions architect for a logistics organization ships packages from thousands of suppliers to end customers. The architect is building a platform where suppliers can view the status of one or more of their shipments. Each supplier can have multiple roles that will only allow access to specific fields in the resulting information.

Which strategy allows the appropriate level of access control and requires the LEAST amount of management work?

A. Send the tracking data to Amazon Kinesis Streams. Use AWS Lambda to store the data in an Amazon DynamoDB table. Generate temporary AWS credentials for the suppliers' users with AWS STS, specifying fine-grained security policies to limit access only to their applicable data.
B. Send the tracking data to Amazon Kinesis Firehose. Use Amazon S3 notifications and AWS Lambda to prepare files in Amazon S3 with appropriate data for each supplier's roles. Generate temporary AWS credentials for the suppliers' users with AWS STS. Limit access to the appropriate files through security policies.
C. Send the tracking data to Amazon Kinesis Streams. Use Amazon EMR with Spark Streaming to store the data in HBase. Create one table per supplier. Use HBase Kerberos integration with the suppliers' users. Use HBase ACL-based security to limit access for the roles to their specific table and columns.
D. Send the tracking data to Amazon Kinesis Firehose. Store the data in an Amazon Redshift cluster. Create views for the suppliers' users and roles. Allow suppliers access to the Amazon Redshift cluster using a user limited to the applicable view.
Suggested answer: B
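
A minimal sketch of the credential-issuing step from option B: STS federation tokens scoped to a single supplier/role prefix in S3. The bucket name and key layout are hypothetical assumptions; per the option, a Lambda function has already written role-appropriate files under each prefix.

```python
import json
import boto3

sts = boto3.client("sts")

def credentials_for_supplier(supplier_id: str, role: str) -> dict:
    """Return temporary credentials limited to one supplier/role's files.

    Bucket name and key layout are hypothetical placeholders.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::shipment-status/{supplier_id}/{role}/*",
        }],
    }
    resp = sts.get_federation_token(
        Name=f"{supplier_id}-{role}"[:32],   # federated user name, max 32 chars
        Policy=json.dumps(policy),
        DurationSeconds=3600,
    )
    return resp["Credentials"]
```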

A company's social media manager requests more staff on the weekends to handle an increase in customer contacts from a particular region. The company needs a report to visualize the trends on weekends over the past 6 months using QuickSight. How should the data be represented?

A. A line graph plotting customer contacts vs. time, with a line for each region
B. A pie chart per region plotting customer contacts per day of week
C. A map of regions with a heatmap overlay to show the volume of customer contacts
D. A bar graph plotting region vs. volume of social media contacts
Suggested answer: C

How should an Administrator BEST architect a large multi-layer Long Short-Term Memory (LSTM) recurrent neural network (RNN) running with MXNet on Amazon EC2? (Choose two.)

A. Use data parallelism to partition the workload over multiple devices and balance the workload within the GPUs.
B. Use compute-optimized EC2 instances with an attached elastic GPU.
C. Use general purpose GPU computing instances such as G3 and P3.
D. Use processing parallelism to partition the workload over multiple storage devices and balance the workload within the GPUs.
Suggested answer: A, C
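
A minimal sketch of what A and C look like together with MXNet's Gluon API: each training batch is split across all local GPUs of a G3/P3 instance, forward/backward runs per device, and the trainer aggregates the gradients. Layer sizes, batch shape, and the synthetic data are illustrative assumptions.

```python
import mxnet as mx
from mxnet import autograd, gluon

# One context per GPU on a G3/P3 instance; fall back to CPU for testing.
ctx = [mx.gpu(i) for i in range(mx.context.num_gpus())] or [mx.cpu()]

# Multi-layer LSTM; sizes are illustrative, not from the question.
net = gluon.rnn.LSTM(hidden_size=256, num_layers=4, layout="NTC")
net.initialize(mx.init.Xavier(), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), "adam")
loss_fn = gluon.loss.L2Loss()

batch = mx.nd.random.uniform(shape=(32, 50, 128))    # (batch, seq, feature)
labels = mx.nd.random.uniform(shape=(32, 50, 256))

# Data parallelism: split the batch over devices, run forward/backward on
# each GPU, then let the trainer aggregate gradients across devices.
data_parts = gluon.utils.split_and_load(batch, ctx)
label_parts = gluon.utils.split_and_load(labels, ctx)
with autograd.record():
    losses = [loss_fn(net(x), y) for x, y in zip(data_parts, label_parts)]
for loss in losses:
    loss.backward()
trainer.step(batch.shape[0])
```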

An organization is soliciting public feedback through a web portal that has been deployed to track the number of requests and other important data. As part of reporting and visualization, Amazon QuickSight connects to an Amazon RDS database to visualize the data. Management wants to understand some important metrics about the feedback and how it has changed over the last four weeks in a visual representation.

What would be the MOST effective way to represent multiple iterations of an analysis in Amazon QuickSight that would show how the data has changed over the last four weeks?

A. Use the analysis option for data captured in each week and view the data by a date range.
B. Use a pivot table as a visual option to display measured values and weekly aggregate data as a row dimension.
C. Use a dashboard option to create an analysis of the data for each week and apply filters to visualize the data change.
D. Use a story option to preserve multiple iterations of an analysis and play the iterations sequentially.
Suggested answer: D

An organization is setting up a data catalog and metadata management environment for their numerous data stores currently running on AWS. The data catalog will be used to determine the structure and other attributes of data in the data stores. The data stores are composed of Amazon RDS databases, Amazon Redshift, and CSV files residing on Amazon S3. The catalog should be populated on a scheduled basis, and minimal administration is required to manage the catalog.

How can this be accomplished?

A. Set up Amazon DynamoDB as the data catalog and run a scheduled AWS Lambda function that connects to data sources to populate the database.
B. Use an Amazon database as the data catalog and run a scheduled AWS Lambda function that connects to data sources to populate the database.
C. Use AWS Glue Data Catalog as the data catalog and schedule crawlers that connect to data sources to populate the database.
D. Set up Apache Hive metastore on an Amazon EC2 instance and run a scheduled bash script that connects to data sources to populate the metastore.
Suggested answer: C
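
A minimal sketch of option C: a scheduled Glue crawler that catalogs the CSV files on S3. The RDS and Redshift stores would be covered the same way with JdbcTargets referencing a Glue connection. The crawler name, IAM role, database, path, and schedule are hypothetical.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")  # assumed region

# Nightly crawler that catalogs the CSV files on S3; RDS and Redshift
# would be added as JdbcTargets referencing a Glue connection. All names
# and the IAM role are placeholders.
glue.create_crawler(
    Name="s3-csv-crawler",
    Role="arn:aws:iam::123456789012:role/glue-crawler-role",
    DatabaseName="enterprise_catalog",
    Targets={"S3Targets": [{"Path": "s3://company-data/csv/"}]},
    Schedule="cron(0 3 * * ? *)",        # run daily at 03:00 UTC
)
```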

An organization is currently using an Amazon EMR long-running cluster with the latest Amazon EMR release for analytic jobs and is storing data as external tables on Amazon S3.

The company needs to launch multiple transient EMR clusters to access the same tables concurrently, but the metadata about the Amazon S3 external tables is defined and stored on the long-running cluster.

Which solution will expose the Hive metastore with the LEAST operational effort?

A. Export Hive metastore information to Amazon DynamoDB and configure the Amazon EMR hive-site classification to point to the Amazon DynamoDB table.
B. Export Hive metastore information to a MySQL table on Amazon RDS and configure the Amazon EMR hive-site classification to point to the Amazon RDS database.
C. Launch an Amazon EC2 instance, install and configure Apache Derby, and export the Hive metastore information to Derby.
D. Create and configure an AWS Glue Data Catalog as a Hive metastore for Amazon EMR.
Suggested answer: B
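
A minimal sketch of option B as applied to each transient cluster: the hive-site classification points Hive's metastore at the shared MySQL database on RDS. The javax.jdo properties are the documented external-metastore settings for EMR; the endpoint, credentials, and cluster sizing are placeholders.

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # assumed region

# Point every transient cluster's Hive metastore at the shared MySQL
# database on RDS. Endpoint, user, and password are placeholders.
hive_site = {
    "Classification": "hive-site",
    "Properties": {
        "javax.jdo.option.ConnectionURL":
            "jdbc:mysql://metastore.abc123.us-east-1.rds.amazonaws.com:3306/hive",
        "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
        "javax.jdo.option.ConnectionUserName": "hive",
        "javax.jdo.option.ConnectionPassword": "example-password",
    },
}

emr.run_job_flow(
    Name="transient-analytics",
    ReleaseLabel="emr-5.30.0",
    Applications=[{"Name": "Hive"}],
    Configurations=[hive_site],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,  # transient: terminate when done
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```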

An organization is using Amazon Kinesis Data Streams to collect data generated from thousands of temperature devices and is using AWS Lambda to process the data. Devices generate 10 to 12 million records every day, but Lambda is processing only around 450,000 of them. Amazon CloudWatch indicates that throttling on Lambda is not occurring.

What should be done to ensure that all data is processed? (Choose two.)

A. Increase the BatchSize value on the EventSource, and increase the memory allocated to the Lambda function.
B. Decrease the BatchSize value on the EventSource, and increase the memory allocated to the Lambda function.
C. Create multiple Lambda functions that will consume the same Amazon Kinesis stream.
D. Increase the number of vCores allocated for the Lambda function.
E. Increase the number of shards on the Amazon Kinesis stream.
Suggested answer: A, E
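
A minimal sketch combining A and E: raise the stream's shard count (more shards means more concurrent Lambda invocations, one per shard) and increase the event source mapping's batch size and the function's memory. The stream name, mapping UUID, function name, and target values are hypothetical.

```python
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")  # assumed region
lam = boto3.client("lambda", region_name="us-east-1")

# (E) More shards => more parallel Lambda invocations (one per shard).
kinesis.update_shard_count(
    StreamName="temperature-readings",     # hypothetical stream name
    TargetShardCount=16,
    ScalingType="UNIFORM_SCALING",
)

# (A) Larger batches per invocation; the mapping UUID would come from
# list_event_source_mappings and is a placeholder here.
lam.update_event_source_mapping(
    UUID="11111111-2222-3333-4444-555555555555",
    BatchSize=500,                         # default is 100 for Kinesis
)
lam.update_function_configuration(
    FunctionName="process-temperature",    # hypothetical function name
    MemorySize=1024,
)
```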