- Bug
- Resolution: Won't Do
- Normal
- None
- 4.12, 4.11
- Important
- No
- 1
- NE Sprint 257
- 1
- Rejected
- False
Description of problem:
In short, the DNS operator is shown as unavailable, and it's blocking an upgrade from 4.11.55 -> 4.12.48. It seems to me that the DNS operator is doing fine, but I'm wondering if the presence of a second dns.operator object is causing issues reconciling the operator's status and that's why the operator is shown as being unavailable. It could still be something else though.
The details:
- DNS cluster operator status shows that the 'DNS "default" is unavailable'. However, the expected number of DNS pods is running, and their logs don't show anything unusual or concerning. So the operator appears to be working fine to me.
- The cluster is in the middle of an upgrade from 4.11 -> 4.12, and the DNS operator's unavailable status is the last thing that's preventing the upgrade from finishing.
- The customer seems to have added a second dns.operator object to the cluster with the name example, and DNS operator logs show 'skipping unexpected dns example'. (Could this cause issues reconciling the DNS operator's status?)
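To make the "second dns.operator object" check concrete, here is a minimal Python sketch that flags any dns.operator object not named default. It assumes the input is JSON in the usual Kubernetes list shape (an `items` array whose entries carry `metadata.name`), such as what `oc get dns.operator.openshift.io -o json` would return. The function name, the constant, and the sample data are hypothetical and only for illustration; this is not the operator's own logic.

```python
import json
from typing import List

# Assumption: "default" is the only dns.operator object the operator reconciles,
# which is what the 'skipping unexpected dns example' log line suggests.
SUPPORTED_DNS_NAME = "default"


def unexpected_dns_names(dns_list_json: str) -> List[str]:
    """Return the names of dns.operator objects other than the supported one."""
    items = json.loads(dns_list_json).get("items", [])
    names = [item["metadata"]["name"] for item in items]
    return sorted(name for name in names if name != SUPPORTED_DNS_NAME)


# Hypothetical sample mirroring the cluster described in this report.
SAMPLE = json.dumps(
    {
        "items": [
            {"metadata": {"name": "default"}},
            {"metadata": {"name": "example"}},
        ]
    }
)

if __name__ == "__main__":
    for name in unexpected_dns_names(SAMPLE):
        print(f"unexpected dns.operator object: {name}")
```

A non-empty result only confirms that an extra object exists. It does not by itself show that the extra object is what keeps the operator's status at unavailable.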
Version-Release number of selected component (if applicable):
Image pullspec I got in Slack from art-bot (sorry if this is not what was meant to go in this field): docker://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1cbf7bae1c097ffde786d7911c4095d0e654b0e98a4f1763c263a5a07f0620c8
How reproducible:
Not sure - I am an ARO SRE opening this Jira after seeing this issue on one particular production ARO cluster.
Steps to Reproduce:
N/A - not sure.
Actual results:
DNS operator is shown as unavailable and cluster upgrade is blocked.
Expected results:
DNS operator is temporarily unavailable during upgrade and then returns to a healthy status, allowing the upgrade to complete.
Additional info:
Should we put some sort of guardrail in place to prevent customers from creating a second dns.operator?
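As a rough illustration of what such a guardrail could check, the sketch below models an admission-style validation that rejects creation of any dns.operator object not named default. This is a hypothetical sketch, not an existing OpenShift feature: the function name, the return shape, and the rejection message are all assumptions, and a real guardrail would more likely live in CRD validation or an admission webhook.

```python
from typing import Tuple

# Assumption: only the object named "default" is meant to exist.
ALLOWED_DNS_NAME = "default"


def admit_dns_create(obj: dict) -> Tuple[bool, str]:
    """Decide whether a dns.operator create request should be allowed.

    Returns (allowed, message); message is empty when the request is allowed.
    """
    name = obj.get("metadata", {}).get("name", "")
    if name == ALLOWED_DNS_NAME:
        return True, ""
    return (
        False,
        f'dns.operator "{name}" rejected: only "{ALLOWED_DNS_NAME}" is supported',
    )


if __name__ == "__main__":
    # Hypothetical request matching the object found on the affected cluster.
    print(admit_dns_create({"metadata": {"name": "example"}}))
```

Rejecting at creation time would surface the problem to the customer immediately, instead of leaving only a 'skipping unexpected dns' line in the operator logs.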