ExamGecko
Question 280 - MLS-C01 discussion


A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded into the S3 data lake from the data stream by using an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 is relatively constant. There also is an increasing backlog of data for Kinesis Data Streams and Kinesis Data Firehose to ingest.

Which next step is MOST likely to improve the data ingestion rate into Amazon S3?

A. Increase the number of S3 prefixes for the delivery stream to write to.
B. Decrease the retention period for the data stream.
C. Increase the number of shards for the data stream.
D. Add more consumers using the Kinesis Client Library (KCL).
Suggested answer: C

Explanation:

Option C is the most likely to improve the data ingestion rate into Amazon S3. The number of shards determines the throughput capacity of the data stream: each shard supports up to 1 MB per second (or 1,000 records per second) of data input and 2 MB per second of data output. A roughly constant delivery rate despite growing volume, together with a growing backlog, indicates that the stream has reached this ceiling. Increasing the number of shards raises the ceiling proportionally. The company can use the UpdateShardCount API operation to change the number of shards in the data stream [1].
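As a rough illustration of the sizing arithmetic, the sketch below estimates a shard count from the per-shard write limits and then requests that count through boto3's `update_shard_count`. The load figures, the helper names `shards_needed` and `scale_stream`, and the choice of boto3 are assumptions made for this example, not part of the question. The real API also restricts how far a stream can be scaled in a single call, so treat this as a starting point rather than a complete procedure.

```python
import math

# Per-shard write limits for a provisioned Kinesis data stream.
SHARD_WRITE_MB_PER_SEC = 1.0
SHARD_WRITE_RECORDS_PER_SEC = 1000


def shards_needed(mb_per_sec: float, records_per_sec: float) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    by_bytes = math.ceil(mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


def scale_stream(stream_name: str, target_shards: int) -> None:
    """Request a uniform reshard. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the sizing helper works without it

    kinesis = boto3.client("kinesis")
    kinesis.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shards,
        ScalingType="UNIFORM_SCALING",
    )


if __name__ == "__main__":
    # Hypothetical load: 7.5 MB/s and 4,000 records/s of click events.
    print(shards_needed(7.5, 4000))
```

In this hypothetical case the byte rate, not the record rate, is the binding limit, so the estimate is driven by the 1 MB per second ceiling.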

The other options are not likely to improve the data ingestion rate into Amazon S3 because:

Option A: Increasing the number of S3 prefixes for the delivery stream to write to will not affect the data ingestion rate, because it only changes how the data is organized in the S3 bucket. More prefixes can help downstream applications that read the data from S3, but they do not raise the throughput of Kinesis Data Firehose [2].

Option B: Decreasing the retention period for the data stream will not affect the data ingestion rate, because it only changes how long records are stored in the data stream. The retention period governs how long data remains available for reading; it does not change the throughput capacity of the data stream [3].

Option D: Adding more consumers using the Kinesis Client Library (KCL) will not affect the data ingestion rate, because it only changes how downstream applications process the data. Additional consumers can help scale processing and handle failures, but they do not affect how Kinesis Data Firehose delivers data into S3 [4].
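The symptom in the question (flat delivery rate, growing backlog) can be reproduced with a toy model in which the stream accepts at most 1 MB per second per shard. This is a simplified sketch: the function name `simulate`, the offered-load figures, and the one-second time step are all assumptions for illustration. Prefixes, retention, and extra consumers do not appear as parameters because none of them changes the write ceiling.

```python
def simulate(offered_mb_per_sec, shards, shard_mb_per_sec=1.0):
    """Return (delivered, backlog) lists with one entry per second."""
    delivered, backlog = [], []
    pending = 0.0
    for offered in offered_mb_per_sec:
        pending += offered
        # The stream can take at most shards * 1 MB in each second.
        sent = min(pending, shards * shard_mb_per_sec)
        pending -= sent
        delivered.append(sent)
        backlog.append(pending)
    return delivered, backlog


# Hypothetical click volume that grows every second.
offered = [2.0, 3.0, 4.0, 5.0, 6.0]

# Two shards: delivery stays flat while the backlog keeps growing.
flat, growing = simulate(offered, shards=2)

# Six shards: delivery follows the offered load and no backlog forms.
scaled, empty = simulate(offered, shards=6)
```

With two shards the delivered rate is pinned at the stream's capacity, which matches the "relatively constant" rate in the question; adding shards is the only change in this model that lets delivery track the offered load.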

References:

[1] Resharding - Amazon Kinesis Data Streams

[2] Amazon S3 Prefixes - Amazon Kinesis Data Firehose

[3] Data Retention - Amazon Kinesis Data Streams

[4] Developing Consumers Using the Kinesis Client Library - Amazon Kinesis Data Streams

asked 16/09/2024
Maxime SELLY