Bug
Resolution: Done
4.13.z
Quality / Stability / Reliability
Important
Description of problem:
The cluster reports different attachable-volume limits at different endpoints. Examples:

* From node description: 25
* From csinode: 39
* Machine Type: m7a.xlarge

=====

* From node description: 25
* From csinode: 11
* Machine Type: r5.2xlarge

Deleting the csinode object and the CSI node pod did not help.
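To make the discrepancy concrete, here is a minimal Python sketch that compares the two reported limits for one node. It assumes the "node description" value corresponds to `.status.capacity."attachable-volumes-aws-ebs"` and that the csinode value is `.spec.drivers[].allocatable.count` for the `ebs.csi.aws.com` driver; the helper name `compare_limits` is hypothetical, and the sample values are the ones quoted above for the m7a.xlarge node.

```python
from typing import Optional


def compare_limits(node: dict, csinode: dict, driver: str = "ebs.csi.aws.com") -> dict:
    """Compare the Node capacity value with the CSINode allocatable count."""
    # Node capacity values are serialized as strings in the API output.
    node_cap = int(node["status"]["capacity"]["attachable-volumes-aws-ebs"])

    csi_count: Optional[int] = None
    for entry in csinode["spec"]["drivers"]:
        if entry["name"] == driver:
            # allocatable may be absent if the driver reports no limit
            csi_count = entry.get("allocatable", {}).get("count")

    return {"node": node_cap, "csinode": csi_count, "match": node_cap == csi_count}


# Sample objects trimmed to the relevant fields (values from the m7a.xlarge example).
sample_node = {"status": {"capacity": {"attachable-volumes-aws-ebs": "25"}}}
sample_csinode = {
    "spec": {"drivers": [{"name": "ebs.csi.aws.com", "allocatable": {"count": 39}}]}
}

result = compare_limits(sample_node, sample_csinode)
print(result)
```

In practice the two dicts would come from `oc get node <name> -o json` and `oc get csinode <name> -o json`, parsed with `json.loads`.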
Version-Release number of selected component (if applicable):
4.13.39
Steps to Reproduce:
I tried to reproduce the issue but could not get an exact match of the output. In addition, even with an EBS volume attached, neither endpoint was updated. I would expect that if an instance has 2 EBS volumes attached, both the node object and the csinode allocatable count would report TOTAL CAPACITY - NUMBER OF VOLUMES CURRENTLY IN USE, which would then be 23.

$ for i in $(oc get nodes --no-headers|grep -v master | awk '{ print $1 }'); do oc get node $i -o json | jq -r '{"Instance": .metadata.labels."node.kubernetes.io/instance-type","VolumesInUse": .status.volumesInUse,"Capacity-Attachable-aws-ebs": .status.capacity."attachable-volumes-aws-ebs"}' ; done
{
  "Instance": "r5.xlarge",
  "VolumesInUse": [
    "kubernetes.io/csi/ebs.csi.aws.com^vol-0619d3d702085ea87",
    "kubernetes.io/csi/ebs.csi.aws.com^vol-0d27cf00137d5b2a9"
  ],
  "Capacity-Attachable-aws-ebs": "25"
}
{
  "Instance": "m5.xlarge",
  "VolumesInUse": [
    "kubernetes.io/csi/ebs.csi.aws.com^vol-05308553453c98956"
  ],
  "Capacity-Attachable-aws-ebs": "25"
}
{
  "Instance": "m5.xlarge",
  "VolumesInUse": null,
  "Capacity-Attachable-aws-ebs": "25"
}
{
  "Instance": "m5.xlarge",
  "VolumesInUse": null,
  "Capacity-Attachable-aws-ebs": "25"
}
{
  "Instance": "r5.xlarge",
  "VolumesInUse": [
    "kubernetes.io/csi/ebs.csi.aws.com^vol-0f45810f5e6017def",
    "kubernetes.io/csi/ebs.csi.aws.com^vol-0fc0812941b66eb15"
  ],
  "Capacity-Attachable-aws-ebs": "25"
}

$ for i in $(oc get nodes --no-headers -l node-role.kubernetes.io/worker= | grep -v infra | awk '{ print $1 }'); do oc get csinode $i -o json | jq -r '.metadata.name,.spec.drivers[]' ; done
ip-10-0-157-188.eu-west-2.compute.internal
{
  "allocatable": {
    "count": 25
  },
  "name": "ebs.csi.aws.com",
  "nodeID": "i-028f163741bc9a192",
  "topologyKeys": [
    "topology.ebs.csi.aws.com/zone"
  ]
}
ip-10-0-201-140.eu-west-2.compute.internal
{
  "allocatable": {
    "count": 25
  },
  "name": "ebs.csi.aws.com",
  "nodeID": "i-023fd8a280a1199a3",
  "topologyKeys": [
    "topology.ebs.csi.aws.com/zone"
  ]
}
ip-10-0-222-63.eu-west-2.compute.internal
{
  "allocatable": {
    "count": 25
  },
  "name": "ebs.csi.aws.com",
  "nodeID": "i-047eb4c719a8d984e",
  "topologyKeys": [
    "topology.ebs.csi.aws.com/zone"
  ]
}

$ oc get volumeattachments.storage.k8s.io csi-85b05528f1e0d71ec7fd15e6e01ca68b8056be0721590f1929cbf41fb6ee63fc -o yaml
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  annotations:
    csi.alpha.kubernetes.io/node-id: i-028f163741bc9a192
  creationTimestamp: "2024-04-23T09:27:22Z"
  finalizers:
  - external-attacher/ebs-csi-aws-com
  name: csi-85b05528f1e0d71ec7fd15e6e01ca68b8056be0721590f1929cbf41fb6ee63fc
  resourceVersion: "87817"
  uid: e4c45eb6-3af1-454f-ba13-cb62f5f09cfe
spec:
  attacher: ebs.csi.aws.com
  nodeName: ip-10-0-157-188.eu-west-2.compute.internal
  source:
    persistentVolumeName: pvc-84290bca-5129-4b42-ad58-044b353617e1
status:
  attached: true
  attachmentMetadata:
    devicePath: /dev/xvdba
Actual results:
Expected results:
1. Which source does the scheduler use? The main concern is that neither we nor the customer know which endpoint to trust when checking how many volumes can still be attached to an instance.
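The expectation stated under Steps to Reproduce (total capacity minus volumes currently in use) can be sketched in Python as follows. This only illustrates the reporter's arithmetic; it is not a description of how the scheduler or the CSI driver actually computes the limit, which is the open question here. The helper name `expected_remaining_slots` is hypothetical, and the sample is the first r5.xlarge node from the output above.

```python
EBS_CSI_PREFIX = "kubernetes.io/csi/ebs.csi.aws.com^"


def expected_remaining_slots(node: dict) -> int:
    """Reporter's expected value: capacity minus EBS CSI volumes in use."""
    status = node.get("status", {})
    # Capacity values are strings in the Node API output.
    capacity = int(status.get("capacity", {}).get("attachable-volumes-aws-ebs", 0))
    # volumesInUse is null (None) when nothing is attached.
    in_use = status.get("volumesInUse") or []
    ebs_in_use = [v for v in in_use if v.startswith(EBS_CSI_PREFIX)]
    return capacity - len(ebs_in_use)


sample_node = {
    "status": {
        "capacity": {"attachable-volumes-aws-ebs": "25"},
        "volumesInUse": [
            "kubernetes.io/csi/ebs.csi.aws.com^vol-0619d3d702085ea87",
            "kubernetes.io/csi/ebs.csi.aws.com^vol-0d27cf00137d5b2a9",
        ],
    }
}

remaining = expected_remaining_slots(sample_node)
print(remaining)  # 23
```

For the nodes that report `"VolumesInUse": null`, the same function returns the full capacity of 25.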
Additional info:
- Checked similar issues: https://issues.redhat.com/browse/OCPBUGS-23260 https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1258
- Must-gather available at support case: https://attachments.access.redhat.com/hydra/rest/cases/03794810/attachments/b8cf565a-afb5-4b9c-8e1a-1d3ae99deca6?usePresignedUrl=true
- is triggering: STOR-2350 Corrective Measure for OCPBUGS-32738: OCP 4.13.39 Different values for volumes in different endpoints (Closed)