Bug
Resolution: Unresolved
4.14.z
Quality / Stability / Reliability
Description of problem:
The node running the etcd-2 pod for a cluster cannot be drained: evicting etcd-2 would violate the PDB, because etcd-0 is stuck in a Pending state. We believe etcd-0 is stuck because its PVC is bound to a PV that must live in the ap-southeast-2c AZ. The scheduler refuses to place etcd-0 because of the podAntiAffinity rules and, specifically, a volume node affinity conflict reported on the pod itself. etcd-1 also lives in the ap-southeast-2c AZ, and that pod is running without issue. etcd-2 is scheduled to the ap-southeast-2a AZ.
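To make the suspected conflict concrete, here is a minimal sketch that flags any AZ holding more than one etcd PV. The zone values are the ones reported above; the function name is hypothetical, and the check assumes the podAntiAffinity rule spreads etcd pods across zones, which we have not confirmed from the pod spec.

```python
from collections import defaultdict

# PV zone per etcd pod, as described in this report.
pv_zone_by_pod = {
    "etcd-0": "ap-southeast-2c",
    "etcd-1": "ap-southeast-2c",
    "etcd-2": "ap-southeast-2a",
}


def zones_with_multiple_pvs(pv_zones):
    """Return {zone: [pods]} for every zone that holds more than one PV.

    Assumption: anti-affinity is zone-scoped, so two PVs pinned to the
    same zone mean at most one of their pods can be scheduled.
    """
    pods_by_zone = defaultdict(list)
    for pod, zone in sorted(pv_zones.items()):
        pods_by_zone[zone].append(pod)
    return {zone: pods for zone, pods in pods_by_zone.items() if len(pods) > 1}


conflicts = zones_with_multiple_pvs(pv_zone_by_pod)
print(conflicts)  # {'ap-southeast-2c': ['etcd-0', 'etcd-1']}
```

If the assumption holds, this matches what we see: etcd-1 already occupies ap-southeast-2c, so etcd-0 cannot be placed in the only zone its volume allows.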
Version-Release number of selected component (if applicable):
The HyperShift operator is running the image quay.io/acm-d/rhtap-hypershift-operator:1986ca3
How reproducible:
Not sure - we have at least two HCP clusters seeing this issue.
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
We expect each etcd PV to be bound to a separate availability zone, but PVs are somehow being bound to availability zones that already hold another etcd PV. We suspect this is triggered by automation we run to remove nodes after they reach a certain age.
Additional info: