- Bug
- Resolution: Unresolved
- Undefined
- None
- odf-4.17, odf-4.16
Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI:
Upon storage deployment, the StorageCluster typically utilizes nodes available in a single machine pool. During new deployments and day-two node operations, the StorageCluster can operate across multiple machine pools, but the OSD pod fails to relocate to a new node and gets stuck in Pending.
OSD description
topology.rook.io/rack:DoNotSchedule when max skew 1 is exceeded for selector app in (rook-ceph-osd)
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  40s  default-scheduler  0/4 nodes are available: 1 node(s) were unschedulable, 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.
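The message above comes from the pod topology spread constraint Rook sets on OSD pods. A quick way to look at the pending pod's constraints and scheduling events on a live cluster (namespace is the default openshift-storage; the pod selector is the one from the message above, the pod name is a placeholder):

# Show the topology spread constraints on the OSD pods
oc -n openshift-storage get pod -l app=rook-ceph-osd -o yaml | grep -A 6 topologySpreadConstraints
# Full description (constraints + scheduling events) of the pending OSD pod
oc -n openshift-storage describe pod <pending-osd-pod>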
StorageCluster nodeTopology:
status:
  ...
  nodeTopologies:
    labels:
      failure-domain.beta.kubernetes.io/region:
      - us-west-2
      failure-domain.beta.kubernetes.io/zone:
      - us-west-2a
      kubernetes.io/hostname:
      - ip-10-0-0-235.us-west-2.compute.internal
      - ip-10-0-0-60.us-west-2.compute.internal
      - ip-10-0-0-71.us-west-2.compute.internal
      - ip-10-0-0-94.us-west-2.compute.internal
      topology.rook.io/rack:
      - rack0
      - rack1
      - rack2
  phase: Ready
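The nodeTopologies block above can be read straight from the StorageCluster status; a minimal check, assuming the default StorageCluster name ocs-storagecluster:

# Print the topology labels the operator has recorded for the cluster
oc -n openshift-storage get storagecluster ocs-storagecluster -o jsonpath='{.status.nodeTopologies.labels}'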
Investigation from kmajumder@redhat.com
If I am not wrong, the new node is ip-10-0-0-145.us-west-2.compute.internal. It should have been labeled as rack1. I will check in the code where the new nodes get their labels from.
oc get nodes -o custom-columns='NAME:.metadata.name,STATUS:.status.conditions[-1].type,TAINTS:.spec.taints[*].key,RACK:.metadata.labels.topology\.rook\.io/rack,INSTANCE_TYPE:.metadata.labels.beta\.kubernetes\.io/instance-type'
NAME                                       STATUS  TAINTS                            RACK   INSTANCE_TYPE
ip-10-0-0-109.us-west-2.compute.internal   Ready   <none>                            rack0  m5.12xlarge
ip-10-0-0-145.us-west-2.compute.internal   Ready   <none>                            rack0  m5.xlarge
ip-10-0-0-149.us-west-2.compute.internal   Ready   node.kubernetes.io/unschedulable  rack1  m5.12xlarge
ip-10-0-0-40.us-west-2.compute.internal    Ready   <none>                            rack2  m5.12xlarge
The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):
ROSA HCP
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc):
Internal
The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
Does this issue impact your ability to continue to work with the product?
Yes
Is there any workaround available to the best of your knowledge?
Yes; label the node with the rack label manually (see the example below).
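For example, a sketch of the manual workaround, using the new node and the rack value from the investigation above (adjust both to your cluster):

# Relabel the new node into the rack of the cordoned node so the OSD can schedule there
oc label node ip-10-0-0-145.us-west-2.compute.internal topology.rook.io/rack=rack1 --overwrite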
Can this issue be reproduced? If so, please provide the hit rate
Yes, 5/5.
Can this issue be reproduced from the UI?
yes
If this is a regression, please provide more details to justify this:
new deployment type
Steps to Reproduce:
1. Create a machine pool with a node and label it with the "openshift-storage" tag (a command sketch follows this list)
2. Select any node hosting an OSD and cordon it
3. Delete the OSD pod on the cordoned node
4. Verify that all OSD pods are running
5. Verify that rebalancing completes in a reasonable time
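A command-level sketch of these steps; the machine pool name, node and pod names are placeholders, the storage label key is assumed, and the exact rosa flags may vary by CLI version:

# 1. Create a machine pool whose nodes carry the ODF storage label (label key assumed)
rosa create machinepool --cluster <cluster-name> --name odf-extra --replicas 1 \
  --labels 'cluster.ocs.openshift.io/openshift-storage='
# 2. Cordon a node that currently hosts an OSD
oc adm cordon <node-with-osd>
# 3. Delete the OSD pod that was running on the cordoned node
oc -n openshift-storage delete pod <rook-ceph-osd-pod>
# 4. Verify all OSD pods return to Running
oc -n openshift-storage get pods -l app=rook-ceph-osd
# 5. Watch Ceph health while rebalancing completes (requires the toolbox; deployment name assumed)
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph status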
The exact date and time when the issue was observed, including timezone details:
Actual results:
The OSD pod is stuck in Pending and the Ceph cluster is not healthy.
Expected results:
The StorageCluster can operate across multiple machine pools, and rebalancing completes in a reasonable time.
Logs collected and log location:
ocs and ocp must-gather: https://url.corp.redhat.com/e7a31de
Logs were collected after the issue was fixed by @Kaustav Majumder.
Additional info:
Should be addressed on ROSA HCP 4.16 and 4.17.
We need to ensure that nodes across different machine pools can be used in a new StorageSystem installation.
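A possible spot check after installation, reusing the custom-columns query from the investigation above (the storage label key is assumed): every ODF-labeled node should carry a rack label regardless of its machine pool.

# List ODF-labeled nodes with their rack assignment
oc get nodes -l cluster.ocs.openshift.io/openshift-storage -o custom-columns='NAME:.metadata.name,RACK:.metadata.labels.topology\.rook\.io/rack'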