Question 182 - DAS-C01 discussion
A company analyzes historical data and needs to query data that is stored in Amazon S3. New data is generated daily as .csv files that are stored in Amazon S3. The company's data analysts are using Amazon Athena to perform SQL queries against a recent subset of the overall data.
The amount of data ingested into Amazon S3 has grown to 5 PB over time, and query latency has increased with it. The company needs to segment the data to reduce the amount of data that is scanned.
Which solutions will improve query performance? (Select TWO.)
A. Use MySQL Workbench on an Amazon EC2 instance. Connect to Athena by using a JDBC connector. Run the query from MySQL Workbench instead of Athena directly.
B. Configure Athena to use S3 Select to load only the files of the data subset.
C. Create the data subset in Apache Parquet format each day by using the Athena CREATE TABLE AS SELECT (CTAS) statement. Query the Parquet data.
D. Run a daily AWS Glue ETL job to convert the data files to Apache Parquet format and to partition the converted files. Create a periodic AWS Glue crawler to automatically crawl the partitioned data each day.
E. Create an S3 gateway endpoint. Configure VPC routing to access Amazon S3 through the gateway endpoint.
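
For readers unfamiliar with the CTAS statement mentioned in option C, here is a minimal sketch of how a daily statement could be assembled. It is an illustration only, not part of the question: the function name build_daily_ctas, the table names, the S3 prefix, and the event_date column are all hypothetical, and the statement uses only the documented format and external_location CTAS properties.

```python
from datetime import date


def build_daily_ctas(source_table: str, target_prefix: str, day: date) -> str:
    """Build an Athena CTAS statement that rewrites one day's rows as Parquet.

    All identifiers here are assumptions for illustration: the source table,
    the S3 prefix, and the event_date column do not come from the question.
    """
    suffix = day.strftime("%Y_%m_%d")   # table names cannot contain hyphens
    iso_day = day.isoformat()           # e.g. 2024-01-15
    return (
        f"CREATE TABLE subset_{suffix}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{target_prefix}/dt={iso_day}/'\n"
        f")\n"
        f"AS SELECT *\n"
        f"FROM {source_table}\n"
        f"WHERE event_date = DATE '{iso_day}'"
    )


if __name__ == "__main__":
    # Hypothetical inputs; the resulting SQL would be submitted to Athena.
    print(build_daily_ctas("raw_events", "s3://example-bucket/subset", date(2024, 1, 15)))
```

The sketch only builds the SQL text. Submitting it to Athena (for example through the console or an SDK) is left out, and the output location must be empty before the statement runs.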