ExamGecko
Question 274 - Professional Data Engineer discussion

You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?

A.
Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day.
B.
Store your data in BigQuery. Concatenate the sensor ID and timestamp and use it as the primary key.
C.
Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
D.
Store your data in BigQuery. Use the metric as a primary key.
Suggested answer: A

Explanation:

To store your data in a way that meets both access patterns, you should:

A) Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day. This option uses Bigtable for low-latency point lookups on sensor data and BigQuery for complex analytic queries on large datasets. Bigtable stores rows sorted by row key, so a key made of the sensor ID followed by the timestamp identifies exactly one reading, and the metric for a specific sensor and time can be fetched with a single-row read. Leading with the sensor ID rather than the timestamp also spreads writes across the key space instead of concentrating all new writes at the end of the table. The daily export moves the data into BigQuery's columnar storage, which is optimized for analytical queries and joins, and lets you use BigQuery features such as partitioning, clustering, and caching.
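To make the row key idea in option A concrete, here is a minimal sketch in plain Python. It does not use the Bigtable client library: SortedStore is a toy stand-in for a table kept sorted by row key, and the key layout in make_row_key (zero-padded sensor ID, a '#' separator, zero-padded epoch seconds) is an assumption chosen for illustration, not something the question prescribes.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone


def make_row_key(sensor_id: int, ts: datetime) -> bytes:
    """Hypothetical layout: zero-padded sensor ID, '#', zero-padded epoch seconds."""
    epoch = int(ts.timestamp())
    return f"sensor{sensor_id:04d}#{epoch:010d}".encode()


class SortedStore:
    """Toy stand-in for a table sorted by row key (not the Bigtable API)."""

    def __init__(self):
        self._keys = []    # row keys, kept in sorted order
        self._values = {}  # row key -> metric value

    def put(self, key: bytes, value: float) -> None:
        if key not in self._values:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._values[key] = value

    def get(self, key: bytes):
        """Point lookup: one exact row key, as in the first access pattern."""
        return self._values.get(key)

    def scan_prefix(self, prefix: bytes):
        """Range scan over all rows whose key starts with the prefix."""
        i = bisect_left(self._keys, prefix)
        rows = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            rows.append((self._keys[i], self._values[self._keys[i]]))
            i += 1
        return rows


store = SortedStore()
t0 = datetime(2024, 9, 18, 12, 0, 0, tzinfo=timezone.utc)
for sensor_id in (7, 42):
    for offset in range(3):
        ts = t0 + timedelta(seconds=offset)
        store.put(make_row_key(sensor_id, ts), 20.0 + sensor_id + offset)

# One sensor at one timestamp: a single-key read.
point = store.get(make_row_key(42, t0))

# All readings for one sensor are contiguous and ordered by time.
history = store.scan_prefix(b"sensor0042#")
```

Zero-padding both parts keeps the byte order of the keys equal to numeric order, which is why the prefix scan returns one sensor's readings in time order.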

B) Store your data in BigQuery. Concatenate the sensor ID and timestamp and use it as the primary key. This option is not optimal because BigQuery is not designed for low-latency point queries; it is an analytics engine and does not target single-digit millisecond lookups. Declaring a primary key does not change that: BigQuery primary key constraints are not enforced, so a key neither guarantees uniqueness nor turns the table into a key-value store. Moreover, under on-demand pricing BigQuery charges by the amount of data scanned, so serving frequent point lookups from it can also be costly.

C) Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day. This option is not optimal because a row key made of the sensor ID and the metric does not contain the timestamp, so you cannot retrieve the reading for a specific sensor at a specific time without already knowing the metric value or scanning that sensor's rows. Readings from the same sensor that share a metric value would also map to the same row key. Both problems defeat the low-latency point lookup that the first access pattern requires.
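A small sketch can show why the key choice in option C does not fit the first access pattern. The readings and both key functions below are made up for illustration, and the grouping is done with a plain dictionary rather than Bigtable itself.

```python
from collections import defaultdict

# Hypothetical readings: (sensor_id, epoch_seconds, metric_value).
readings = [
    (42, 1726660800, 21.5),
    (42, 1726660801, 21.5),  # same value as the previous second
    (42, 1726660802, 22.0),
]


def key_by_timestamp(sensor_id, epoch, metric):
    """Option A style key: sensor ID plus timestamp."""
    return f"sensor{sensor_id:04d}#{epoch:010d}"


def key_by_metric(sensor_id, epoch, metric):
    """Option C style key: sensor ID plus metric value."""
    return f"sensor{sensor_id:04d}#{metric}"


def rows_per_key(key_fn):
    """Group readings by the row key that key_fn would assign them."""
    rows = defaultdict(list)
    for reading in readings:
        rows[key_fn(*reading)].append(reading)
    return rows


by_ts = rows_per_key(key_by_timestamp)
by_metric = rows_per_key(key_by_metric)

# With the timestamp in the key, every reading gets its own row key.
distinct_ts_keys = len(by_ts)

# With the metric in the key, two readings share "sensor0042#21.5".
distinct_metric_keys = len(by_metric)
shared = by_metric["sensor0042#21.5"]
```

Under the metric-based key, the timestamp is no longer part of the lookup key, so finding the reading at a given second would require knowing its value in advance or scanning every row for that sensor.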

D) Store your data in BigQuery. Use the metric as a primary key. This option is not optimal because BigQuery is not designed for low-latency point queries, and the metric does not identify a reading: many sensors can report the same metric value, and one sensor can report the same value at different times. A key on the metric alone also gives no way to look up a reading by sensor and timestamp, which is what the first access pattern requires.

asked 18/09/2024
leonie lira