Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: openshift-4.12.z
Component/s: Network Edge
Labels: None
Blocked: False
Blocked Reason: None
Ready: False
1. Proposed title of this feature request
Split CoreDNSErrorsHigh alert rule by creating one alert rule for each plugin in the default root zone.

2. What is the nature and description of the request?
At the moment this alert rule fires when SERVFAIL responses exceed a certain percentage of all responses served by the DNS pods, measured across the entire root zone.
According to the upstream documentation, the same metric also carries other labels, including the plugin that generated the response code:

coredns_dns_responses_total{server, zone, view, rcode, plugin} - responses per zone, rcode and plugin.

Since the default root zone includes both the forward and kubernetes plugins, the customer requests two alert rules, for example:

CoreDNSErrorsHigh would cover SERVFAIL responses from the kubernetes plugin;
UpstreamDNSErrorsHigh would cover SERVFAIL responses from all upstream resolvers.
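
The split described above could look roughly like the following Prometheus rule group. This is a sketch only, not the shipped rule: the group name, the 1% threshold, the 5m window, the severity, and the plugin label values "kubernetes" and "forward" are all assumptions made for illustration. Only the metric name, the rcode and plugin labels, and the two alert names come from this request.

```yaml
# Hypothetical rule group; every threshold and label value below is an assumption.
groups:
  - name: coredns-split-example        # assumed group name
    rules:
      # Internal resolution errors: SERVFAIL ratio for the kubernetes plugin only.
      - alert: CoreDNSErrorsHigh
        expr: |
          sum(rate(coredns_dns_responses_total{rcode="SERVFAIL", plugin="kubernetes"}[5m]))
            /
          sum(rate(coredns_dns_responses_total{plugin="kubernetes"}[5m]))
            > 0.01
        for: 5m                        # assumed duration
        labels:
          severity: warning            # assumed severity
      # Upstream resolver errors: SERVFAIL ratio for the forward plugin only.
      - alert: UpstreamDNSErrorsHigh
        expr: |
          sum(rate(coredns_dns_responses_total{rcode="SERVFAIL", plugin="forward"}[5m]))
            /
          sum(rate(coredns_dns_responses_total{plugin="forward"}[5m]))
            > 0.01
        for: 5m
        labels:
          severity: warning
```

With two rules, a silence on UpstreamDNSErrorsHigh would leave CoreDNSErrorsHigh active. Before relying on the plugin selector, it would be worth checking the label values actually exported by the DNS pods, since the plugin label may be empty for responses written by the server itself rather than by a plugin.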

3. Why does the customer need this? (List the business requirements here)
This would help to quickly narrow down the origin of the errors, so less time is spent searching inside and outside the platform to find out what is failing and where.
Right now a silence applies to the entire alert, which is not ideal: silencing upstream errors also hides potentially important alerts about internal DNS errors.

4. List any affected packages or components.
CoreDNS Prometheus metrics

account is impacted by: OCPBUGS-51193 Add runbook link to CoreDNSErrorsHigh (Verified)

Assignee: Marc Curry
Reporter: Andre Costa
Votes: 0
Watchers: 3
Created: 2024/06/19 10:00 AM
Updated: 2025/02/24 2:26 PM
