Question 31 - Professional Cloud DevOps Engineer discussion


You are on-call for an infrastructure service that has a large number of dependent systems. You receive an alert indicating that the service is failing to serve most of its requests, and all of its dependent systems, with hundreds of thousands of users, are affected. As part of your Site Reliability Engineering (SRE) incident management protocol, you declare yourself Incident Commander (IC) and pull in two experienced people from your team as Operations Lead (OL) and Communications Lead (CL). What should you do next?

A. Look for ways to mitigate user impact and deploy the mitigations to production.
B. Contact the affected service owners and update them on the status of the incident.
C. Establish a communication channel where incident responders and leads can communicate with each other.
D. Start a postmortem, add incident information, circulate the draft internally, and ask internal stakeholders for input.
Suggested answer: A

Explanation:

https://sre.google/sre-book/managing-incidents/

Asked 18/09/2024 by Pawel Lenart