Question 6 - DEA-C01 discussion
A company uses Amazon S3 as a data lake. The company sets up a data warehouse by using a multi-node Amazon Redshift cluster. The company organizes the data files in the data lake based on the data source of each data file.
The company loads all the data files into one table in the Redshift cluster by using a separate COPY command for each data file location. This approach takes a long time to load all the data files into the table. The company must increase the speed of the data ingestion. The company does not want to increase the cost of the process.
Which solution will meet these requirements?
A. Use a provisioned Amazon EMR cluster to copy all the data files into one folder. Use a COPY command to load the data into Amazon Redshift.
B. Load all the data files in parallel into Amazon Aurora. Run an AWS Glue job to load the data into Amazon Redshift.
C. Use an AWS Glue job to copy all the data files into one folder. Use a COPY command to load the data into Amazon Redshift.
D. Create a manifest file that contains the data file locations. Use a COPY command to load the data into Amazon Redshift.
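
For readers unfamiliar with the manifest approach named in the last option, here is a minimal sketch of what such a file and the matching COPY statement could look like. The bucket names, file keys, table name, and IAM role ARN are all hypothetical placeholders, not values from the question. The script only builds the JSON and SQL text; it does not connect to AWS.

```python
import json


def build_manifest(urls, mandatory=True):
    """Build a Redshift COPY manifest: one entry per S3 object to load."""
    return {"entries": [{"url": url, "mandatory": mandatory} for url in urls]}


def build_copy_sql(table, manifest_url, iam_role):
    """Build a single COPY statement that reads its file list from a manifest."""
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "MANIFEST;"
    )


# Hypothetical file locations, one folder per data source.
urls = [
    "s3://example-data-lake/source-a/part-0001.csv",
    "s3://example-data-lake/source-b/part-0001.csv",
    "s3://example-data-lake/source-c/part-0001.csv",
]

manifest = build_manifest(urls)
manifest_json = json.dumps(manifest, indent=2)

# Hypothetical table, manifest location, and role ARN.
copy_sql = build_copy_sql(
    table="sales_staging",
    manifest_url="s3://example-data-lake/manifests/load.manifest",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)

print(manifest_json)
print(copy_sql)
```

With a manifest, one COPY command covers files in several locations, so Redshift can spread the load across the cluster's slices in parallel instead of running one COPY per location, and no extra service is needed to move the files first.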