Question 61 - DEA-C01 discussion
A data engineer has a one-time task to read data from objects that are in Apache Parquet format in an Amazon S3 bucket. The data engineer needs to query only one column of the data.
Which solution will meet these requirements with the LEAST operational overhead?
Configure an AWS Lambda function to load data from the S3 bucket into a pandas DataFrame. Write a SQL SELECT statement on the DataFrame to query the required column.
Use S3 Select to write a SQL SELECT statement to retrieve the required column from the S3 objects.
Prepare an AWS Glue DataBrew project to consume the S3 objects and to query the required column.
Run an AWS Glue crawler on the S3 objects. Use a SQL SELECT statement in Amazon Athena to query the required column.
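For reference, the S3 Select option can be sketched in a few lines of Python. This is a minimal illustration, not an official answer key: it assumes a boto3-style S3 client is passed in, that S3 Select is available in the account, and that the bucket, key, and column names used in the usage note are hypothetical placeholders. The helper names `build_select_request` and `select_column` are made up for this example.

```python
def build_select_request(bucket, key, column):
    """Build keyword arguments for an S3 Select (SelectObjectContent) call.

    All three arguments are caller-supplied; nothing here comes from the question.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # Project a single column; S3Object is the table name S3 Select expects.
        "Expression": f'SELECT s."{column}" FROM S3Object s',
        # The objects are stored as Apache Parquet.
        "InputSerialization": {"Parquet": {}},
        # Return the selected values as CSV text.
        "OutputSerialization": {"CSV": {}},
    }


def select_column(s3_client, bucket, key, column):
    """Run the query and return the selected column as CSV text.

    s3_client is assumed to behave like boto3.client("s3").
    """
    response = s3_client.select_object_content(
        **build_select_request(bucket, key, column)
    )
    chunks = []
    # The response payload is a stream of events; only "Records" events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")
```

A call might look like `select_column(boto3.client("s3"), "example-bucket", "data/part-0000.parquet", "customer_id")`, where every argument is a placeholder. The query runs directly against the object, so there is no function to deploy, no crawler to run, and no project to set up, which is the "least operational overhead" angle the question is probing.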