Question 180 - DAS-C01 discussion

A company's system operators and security engineers need to analyze activity within specific date ranges in AWS CloudTrail logs. All log files are stored in an Amazon S3 bucket, and the logs total more than 5 TB. The solution must be cost-effective and maximize query performance.

Which solution meets these requirements?

A.
Copy the logs to a new S3 bucket with a prefix structure of <PARTITION COLUMN_NAME>. Use the date column as a partition key. Create a table in Amazon Athena based on the objects in the new bucket. Automatically add metadata partitions by using the MSCK REPAIR TABLE command in Athena. Use Athena to query the table and partitions.
B.
Create a table in Amazon Athena. Manually add metadata partitions by using the ALTER TABLE ADD PARTITION statement, and use multiple columns for the partition key. Use Athena to query the table and partitions.
C.
Launch an Amazon EMR cluster and use Amazon S3 as a data store for Apache HBase. Load the logs from the S3 bucket into an HBase table on Amazon EMR. Use Amazon Athena to query the table and partitions.
D.
Create an AWS Glue job to copy the logs from the source S3 bucket to a new S3 bucket, and create a table using the Apache Parquet file format, Snappy as the compression codec, and partitioning by date. Use Amazon Athena to query the table and partitions.
Suggested answer: D

Explanation:

This solution meets the requirements because:

AWS Glue is a fully managed extract, transform, and load (ETL) service that can be used to prepare and load data for analytics. You can use AWS Glue to create a job that copies the CloudTrail logs from the source S3 bucket to a new S3 bucket and converts them to the Apache Parquet format. Parquet is a columnar storage format that is optimized for analytics and supports compression. Snappy is a compression codec that provides a good balance between compression ratio and speed.
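As a minimal sketch (not a prescribed implementation), such a Glue job could look like the following PySpark script. The bucket names and the derived eventdate column are hypothetical placeholders, and the script assumes CloudTrail's usual file layout, where each log file wraps its events in a top-level Records array:

```python
# Minimal AWS Glue (PySpark) job sketch: convert raw CloudTrail JSON logs
# to Snappy-compressed Parquet, partitioned by date.
# Bucket names and the derived "eventdate" column are hypothetical.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql.functions import col, explode, to_date

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# CloudTrail delivers gzipped JSON files whose events sit in a top-level
# "Records" array, so flatten that array into one row per event.
raw = glue_context.spark_session.read.json(
    "s3://example-cloudtrail-source-bucket/AWSLogs/"
)
events = raw.select(explode(col("Records")).alias("r")).select("r.*")

# Derive a date column from the event timestamp to use as the partition key.
events = events.withColumn("eventdate", to_date(col("eventTime")))

# Write Parquet with Snappy compression, partitioned by date, so Athena can
# prune partitions when a query filters on a date range.
(events.write
    .mode("overwrite")
    .partitionBy("eventdate")
    .option("compression", "snappy")
    .parquet("s3://example-cloudtrail-parquet-bucket/cloudtrail/"))

job.commit()
```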

AWS Glue can also create a table based on the Parquet files in the new S3 bucket and partition the table by date. Partitioning is a technique that divides a large dataset into smaller subsets based on a partition key, such as date. Partitioning can improve query performance by reducing the amount of data scanned and filtering out irrelevant data.
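For illustration, the resulting table could be registered with DDL like the sketch below, issued here through boto3. The table, database, and bucket names are made up, and the date partitions written by the job would still need to be added to the catalog (for example with MSCK REPAIR TABLE, or by having the Glue job update the Data Catalog directly):

```python
# Hypothetical Athena DDL, run through boto3, registering the Parquet output
# as an external table partitioned by eventdate. All names are placeholders.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS cloudtrail_parquet (
    eventtime       string,
    eventname       string,
    eventsource     string,
    awsregion       string,
    sourceipaddress string
)
PARTITIONED BY (eventdate date)
STORED AS PARQUET
LOCATION 's3://example-cloudtrail-parquet-bucket/cloudtrail/'
TBLPROPERTIES ('parquet.compression' = 'SNAPPY')
"""

athena.start_query_execution(
    QueryString=ddl,
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-query-results/"},
)
```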

Amazon Athena is an interactive query service that allows you to analyze data in S3 using standard SQL. You can use Athena to query the table created by AWS Glue and restrict the scan to the partitions that fall within the requested date range. Athena can leverage the benefits of the Parquet format and partitioning to run queries faster and more cost-effectively.
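A date-range query against that hypothetical table might then look like this; the filter on the eventdate partition column is what lets Athena prune partitions and keep the scanned data (and cost) small:

```python
# Hypothetical date-range query: filtering on the eventdate partition column
# lets Athena prune partitions and scan only the matching Parquet objects.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

query = """
SELECT eventtime, eventname, eventsource, sourceipaddress
FROM cloudtrail_parquet
WHERE eventdate BETWEEN DATE '2024-09-01' AND DATE '2024-09-16'
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-query-results/"},
)
print("Query started:", response["QueryExecutionId"])
```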
