Question 146 - Professional Cloud DevOps Engineer discussion


You recently noticed that one of your services has exceeded the error budget for the current rolling window period. Your company's product team is about to launch a new feature. You want to follow Site Reliability Engineering (SRE) practices.

What should you do?

A. Notify the team that their error budget is used up. Negotiate with the team for a launch freeze or tolerate a slightly worse user experience.
B. Look through other metrics related to the product and find SLOs with remaining error budget. Reallocate the error budgets and allow the feature launch.
C. Escalate the situation and request additional error budget.
D. Notify the team about the lack of error budget and ensure that all their tests are successful so the launch will not further risk the error budget.
Suggested answer: A

Explanation:

The correct answer is A: Notify the team that their error budget is used up. Negotiate with the team for a launch freeze or tolerate a slightly worse user experience.

According to Site Reliability Engineering (SRE) practices, an error budget is the amount of unreliability that a service can tolerate without harming user satisfaction [1]. The error budget is derived from the service-level objectives (SLOs), the measurable goals for service quality [2]; for example, a 99.9% availability SLO leaves a 0.1% error budget. A service that has exceeded its error budget has violated its SLO and may have negatively affected users. In this case, the SRE team should notify the product team that the error budget is used up and negotiate either a launch freeze or a lower SLO [3]. A launch freeze means that no new features are deployed until service reliability is restored. A lower SLO means that the product team accepts a slightly worse user experience in exchange for launching new features. Both options trade reliability against innovation and should be agreed upon by both teams.
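As a rough illustration of how an error budget follows from an SLO, here is a minimal Python sketch. The function names, the request-based budget, and the sample numbers are assumptions made for this example; they do not come from the question or from any Google Cloud API.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of failed events the SLO permits in the window."""
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, failed_events: int) -> float:
    """Fraction of the error budget still unspent; negative means exceeded."""
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        # A 100% SLO leaves no budget at all: any failure exceeds it.
        return 0.0 if failed_events == 0 else float("-inf")
    return 1.0 - failed_events / budget


def launch_decision(remaining: float) -> str:
    """Hypothetical policy: launch only while budget remains."""
    if remaining > 0:
        return "proceed with launch"
    return "negotiate launch freeze or accept lower reliability"


# Assumed sample window: 99.9% SLO, 1,000,000 requests, 1,500 failures.
remaining = budget_remaining(0.999, 1_000_000, 1_500)
decision = launch_decision(remaining)
print(decision)
```

With these assumed numbers the budget allows roughly 1,000 failed requests, so 1,500 failures put the service over budget and the sketch lands on the negotiation outcome described in option A.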

The other options are incorrect because they do not follow SRE practices. Option B is incorrect because it violates the principle of error budget autonomy: each service should have its own error budget and SLOs, and should not borrow or reallocate budget from other services [4]. Option C is incorrect because it does not address the root cause of the error budget overspend and may create unrealistic expectations for service reliability. Option D is incorrect because passing tests does not rule out new errors or bugs introduced by the feature launch, which may further degrade service quality and user satisfaction.

References: [1] Error Budgets. [2] Service Level Objectives. [3] Error Budget Policies. [4] Error Budget Autonomy.

asked 18/09/2024
William Hanna