ExamGecko
Question 71 - DP-203 discussion


You plan to implement an Azure Data Lake Storage Gen2 container that will contain CSV files. The size of the files will vary based on the number of events that occur per hour. File sizes range from 4 KB to 5 GB.

You need to ensure that the files stored in the container are optimized for batch processing. What should you do?

A. Convert the files to JSON
B. Convert the files to Avro
C. Compress the files
D. Merge the files
Suggested answer: B

Explanation:

Avro is well suited to batch processing and is also highly relevant for streaming workloads.

Note: Avro is a framework developed within the Apache Hadoop project. It is a row-based storage format widely used for serialization. Avro stores its schema in JSON format, making the schema easy to read and interpret by any program. The data itself is stored in a binary format, which makes it compact and efficient.
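To illustrate the schema-in-JSON point, here is a minimal Python sketch that derives an Avro record schema from a CSV header. The CSV content and column names are hypothetical examples, not part of the question; a production pipeline would map columns to richer Avro types rather than treating everything as a string.

```python
import csv
import io
import json

# Hypothetical CSV sample; real files would be read from the
# Data Lake Storage Gen2 container.
csv_text = "event_id,event_time,payload\n1,2024-02-10T00:00:00Z,login\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)  # force the header to be parsed

# Build a minimal Avro record schema: every CSV column becomes a
# string field. Avro expresses this schema as plain JSON.
schema = {
    "type": "record",
    "name": "Event",
    "fields": [{"name": col, "type": "string"} for col in reader.fieldnames],
}

print(json.dumps(schema, indent=2))
```

A library such as `fastavro` or the official `avro` package could then use this schema to write the rows in Avro's compact binary encoding.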

Reference:

https://www.adaltas.com/en/2020/07/23/benchmark-study-of-different-file-format/

asked 02/10/2024
Ajay Vijayan