- Bug
- Resolution: Unresolved
- Normal
- None
- 4.14
This issue follows from OCPBUGS-29940: https://issues.redhat.com/browse/OCPBUGS-29940
After applying the suggested workaround [1][2] with "oc adm must-gather --node-name", we found another issue: must-gather creates a debug pod on every master node and gets stuck for a while, because the gather_network_logs_basics script loops over all of them, including nodes that are NotReady. Filtering out the NotReady nodes would allow us to apply the workaround.
The gather_network_logs_basics script gets the master nodes by label (node-role.kubernetes.io/master) and saves them in the CLUSTER_NODES variable. It then passes that list as arguments to gather_multus_logs (gather_multus_logs $CLUSTER_NODES), which loops through the master nodes and runs oc debug against each one.
collection-scripts/gather_network_logs_basics
...
CLUSTER_NODES="${@:-$(oc get node -l node-role.kubernetes.io/master -oname)}"
/usr/bin/gather_multus_logs $CLUSTER_NODES
...
collection-scripts/gather_multus_logs
...
function gather_multus_logs {
  for NODE in "$@"; do
    nodefilename=$(echo "$NODE" | sed -e 's|node/||')
    out=$(oc debug "${NODE}" -- \
      /bin/bash -c "cat $INPUT_LOG_PATH" 2>/dev/null) && echo "$out" 1> "${OUTPUT_LOG_PATH}/multus-log-$nodefilename.log"
  done
}
This could be resolved with something similar to this:
CLUSTER_NODES="${@:-$(oc get node -l node-role.kubernetes.io/master -o json | jq -r '.items[] | select(.status.conditions[] | select(.type=="Ready" and .status=="True")).metadata.name')}"
/usr/bin/gather_multus_logs $CLUSTER_NODES
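A jq-free variant of the same idea, as a minimal sketch only. The helper name filter_ready_nodes is hypothetical and not part of must-gather. It assumes the default "oc get node --no-headers" column layout, where the second column is a comma-separated status such as Ready, NotReady, or Ready,SchedulingDisabled. It prints names with the node/ prefix, matching what -oname produces and what gather_multus_logs strips with sed.

```shell
#!/bin/bash
# Hypothetical helper (not part of must-gather): reads the output of
# `oc get node --no-headers` on stdin and prints "node/<name>" for every
# node whose STATUS column contains the exact token "Ready".
# "NotReady" does not match, because each comma-separated token is
# compared as a whole string.
filter_ready_nodes() {
  awk '{
    n = split($2, conds, ",")
    for (i = 1; i <= n; i++) {
      if (conds[i] == "Ready") {
        print "node/" $1
        break
      }
    }
  }'
}

# Possible use inside gather_network_logs_basics (assumption, untested
# against a real cluster):
#   CLUSTER_NODES="${@:-$(oc get node -l node-role.kubernetes.io/master --no-headers | filter_ready_nodes)}"
#   /usr/bin/gather_multus_logs $CLUSTER_NODES
```

Matching whole tokens rather than a substring keeps NotReady nodes out while still accepting cordoned but reachable nodes (Ready,SchedulingDisabled). Whether cordoned nodes should be included is a judgment call left to the maintainers.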
[1] - https://access.redhat.com/solutions/6962230
[2] - https://issues.redhat.com/browse/OCPBUGS-29940
- clones: OCPBUGS-33959 gather_network_logs_basics script when node is in the NotReady (Closed)
- depends on: OCPBUGS-33959 gather_network_logs_basics script when node is in the NotReady (Closed)
- is cloned by: OCPBUGS-43052 gather_network_logs_basics script when node is in the NotReady [backport 4.16] (Closed)
- is depended on by: OCPBUGS-43052 gather_network_logs_basics script when node is in the NotReady [backport 4.16] (Closed)
- links to: RHBA-2024:9610 OpenShift Container Platform 4.17.z bug fix update