Question 146 - MLS-C01 discussion

A Data Scientist wants to gain real-time insights into a data stream of GZIP files. Which solution would allow the use of SQL to query the stream with the LEAST latency?

A. Amazon Kinesis Data Analytics with an AWS Lambda function to transform the data.
B. AWS Glue with a custom ETL script to transform the data.
C. An Amazon Kinesis Client Library to transform the data and save it to an Amazon ES cluster.
D. Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket.
Suggested answer: A

Explanation:

Amazon Kinesis Data Analytics is a service that enables you to analyze streaming data in real time using SQL or Apache Flink applications. You can use Kinesis Data Analytics to process and gain insights from data streams such as web logs, clickstreams, IoT data, and more.

To query a stream of GZIP-compressed records with SQL, the data must first be decompressed into a format that Kinesis Data Analytics for SQL applications can parse, namely JSON or CSV. Kinesis Data Analytics can invoke an AWS Lambda function to preprocess records from the streaming source before the application's SQL code runs, so the function decompresses each record and hands the result back to the application. This gives the least latency of the four options because records are transformed in flight and queried as soon as they arrive, without first being written to intermediate storage.
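The Lambda transformation step can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes each incoming record is a self-contained GZIP payload whose decompressed content is already JSON or CSV, and it assumes the preprocessing event and response shapes (`records`, `recordId`, `data`, `result`) described in the Kinesis Data Analytics Lambda preprocessing documentation. The handler name `lambda_handler` is simply the conventional default.

```python
import base64
import gzip
import zlib


def lambda_handler(event, context):
    """Decompress GZIP records so the SQL application can parse them.

    Assumption: every record's base64-decoded payload is one complete
    GZIP member containing JSON or CSV text.
    """
    output = []
    for record in event["records"]:
        try:
            compressed = base64.b64decode(record["data"])
            payload = gzip.decompress(compressed)
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                # The response data must be base64-encoded again.
                "data": base64.b64encode(payload).decode("utf-8"),
            })
        except (OSError, EOFError, ValueError, zlib.error):
            # Not valid base64 or not valid GZIP: flag the record
            # instead of failing the whole batch.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Marking a bad record as `ProcessingFailed` rather than raising an exception lets the rest of the batch continue to the SQL code; whether that is the right policy depends on how the application should treat malformed input.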

The other options add latency or complexity. AWS Glue is a serverless data integration service for ETL (extract, transform, and load) jobs; it is not designed for low-latency SQL queries on a live stream. The Amazon Kinesis Client Library is a Java library for building custom consumers of Kinesis data streams; it requires more code and configuration than a Lambda function, and the data would still have to be indexed in the Amazon ES cluster before it could be queried. Amazon Kinesis Data Firehose delivers streaming data to destinations such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service, and Splunk, but it buffers records before delivery and does not itself support SQL queries on the data.

References:

What Is Amazon Kinesis Data Analytics for SQL Applications?

Using AWS Lambda with Amazon Kinesis Data Streams

Using AWS Lambda with Amazon Kinesis Data Firehose

asked 16/09/2024
Mario Herrera González