
Question 69 - Professional Cloud DevOps Engineer discussion

You encounter a large number of outages in the production systems you support. You receive alerts for all the outages that wake you up at night. The alerts are due to unhealthy systems that are automatically restarted within a minute. You want to set up a process that would prevent staff burnout while following Site Reliability Engineering practices. What should you do?

A. Eliminate unactionable alerts.
B. Create an incident report for each of the alerts.
C. Distribute the alerts to engineers in different time zones.
D. Redefine the related Service Level Objective so that the error budget is not exhausted.
Suggested answer: A

Explanation:

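SRE practice calls for eliminating bad monitoring, which includes unactionable alerts (i.e., spam). The systems in this scenario restart automatically within a minute, so a paged engineer has nothing to act on; removing those alerts reduces burnout without hiding real problems. Source: https://cloud.google.com/blog/products/management-tools/meeting-reliability-challenges-with-sre-principles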

I agree with kyubiblaze that unactionable items, i.e. spam, have to be removed: 'good monitoring alerts on actionable problems' (https://cloud.google.com/blog/products/management-tools/meeting-reliability-challenges-with-sre-principles)
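
To make "alert only on actionable problems" concrete, here is a minimal Python sketch of one possible routing rule. It is not taken from the linked article or from any Google Cloud API. The names Outage and route_alert and the 120-second grace window are hypothetical choices for illustration: an outage that has already self-healed is logged for later review, and a page is sent only when a system stays unhealthy past the grace window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outage:
    """Hypothetical record of one unhealthy period for a service."""
    service: str
    started_at: float                # seconds since some reference time
    recovered_at: Optional[float]    # None while the service is still unhealthy


def route_alert(outage: Outage, now: float, grace_seconds: float = 120.0) -> str:
    """Decide what to do with an outage event.

    Returns "log"  if the service already recovered on its own,
            "page" if it is still unhealthy after the grace window,
            "wait" if it is still unhealthy but inside the grace window.
    The grace window is an assumed value, not one given in the question.
    """
    if outage.recovered_at is not None:
        # Self-healed: nothing for a human to do right now, so record it
        # for trend analysis instead of waking someone up.
        return "log"
    if now - outage.started_at >= grace_seconds:
        # Automatic restart did not fix it: a human can act on this.
        return "page"
    return "wait"


if __name__ == "__main__":
    print(route_alert(Outage("api", 0.0, 45.0), now=60.0))
    print(route_alert(Outage("api", 0.0, None), now=300.0))
```

In a setup like the one in the question, every outage would fall into the "log" branch, which is why answer A removes the nightly pages while still leaving a record that the restarts are happening.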

asked 18/09/2024
Velmurugan P