-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
4.15
-
Quality / Stability / Reliability
-
False
-
OCP Node Sprint 272 (Blue)
-
1
Description of problem:
During a chaos scenario on an ARO cluster in which we block all inbound and outbound traffic to a worker node, the worker node never reports NotReady status. The master nodes cannot ping or open a connection to the worker node, yet its status does not change.
Version-Release number of selected component (if applicable):
4.15.35
How reproducible:
100%
Steps to Reproduce:
1. Create an ARO cluster.
2. Find the private IP address of one of the worker nodes.
3. Create a "chaos" network security group with the following rules:
   Inbound: Deny, source 0.0.0.0, port *, destination: IP address of the worker node, port *, any protocol
   Outbound: Deny, source: IP address of the worker node, port *, destination 0.0.0.0, port *, any protocol
4. Find the virtual network for the set of nodes and set the worker-subnet security group to the chaos security group just created.
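The reproduction steps above could be scripted with the Azure CLI roughly as follows. This is a minimal sketch, not the procedure used in this report: the function names, the rule names, the priority value 100, and the use of `0.0.0.0/0` as the "any address" prefix are all assumptions, and the resource group, virtual network, subnet, and NSG names must be supplied by the caller. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch only: rule names, priority, and the 0.0.0.0/0 prefix are assumptions,
# not values taken from this report.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: apply_chaos_nsg RESOURCE_GROUP VNET SUBNET NSG_NAME WORKER_IP
apply_chaos_nsg() {
  local rg="$1" vnet="$2" subnet="$3" nsg="$4" ip="$5"

  # Step 3: create the "chaos" network security group.
  run az network nsg create --resource-group "$rg" --name "$nsg"

  # Deny all inbound traffic to the worker node.
  run az network nsg rule create --resource-group "$rg" --nsg-name "$nsg" \
    --name deny-inbound-worker --priority 100 \
    --direction Inbound --access Deny --protocol '*' \
    --source-address-prefixes '0.0.0.0/0' --source-port-ranges '*' \
    --destination-address-prefixes "$ip" --destination-port-ranges '*'

  # Deny all outbound traffic from the worker node.
  run az network nsg rule create --resource-group "$rg" --nsg-name "$nsg" \
    --name deny-outbound-worker --priority 100 \
    --direction Outbound --access Deny --protocol '*' \
    --source-address-prefixes "$ip" --source-port-ranges '*' \
    --destination-address-prefixes '0.0.0.0/0' --destination-port-ranges '*'

  # Step 4: attach the chaos group to the worker subnet.
  run az network vnet subnet update --resource-group "$rg" \
    --vnet-name "$vnet" --name "$subnet" --network-security-group "$nsg"
}
```

For example, `DRY_RUN=1 apply_chaos_nsg <resource-group> <vnet> <worker-subnet> chaos 10.0.2.5` prints the four `az` commands for review before anything is changed.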
Actual results:
All nodes stay Ready while the chaos security group blocks communication. We cannot run oc debug against the blocked node or get pods running on it:

% oc debug node/prubenda-aro5-jg26m-worker-northcentralus-tkgn6
Starting pod/prubenda-aro5-jg26m-worker-northcentralus-tkgn6-debug-8gt5p ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.2.5
If you don't see a command prompt, try pressing enter.
Expected results:
When communication to the node is blocked, the node should go NotReady.
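One way to check this expectation more directly than reading the `oc get nodes` table is to look at the node's Ready condition and at its heartbeat lease, which the kubelet renews in the `kube-node-lease` namespace. The helpers below are a hedged sketch: the function names are hypothetical, and only standard `oc get` options are used.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Read `oc get nodes --no-headers` output on stdin and print the names of
# nodes whose STATUS column is not Ready (Ready,SchedulingDisabled counts
# as Ready).
not_ready_nodes() {
  awk '$2 != "Ready" && $2 !~ /^Ready,/ { print $1 }'
}

# Print the status of the Ready condition for a node: True, False, or Unknown.
node_ready_status() {
  oc get node "$1" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Print the time the kubelet on the node last renewed its heartbeat lease.
node_lease_renew_time() {
  oc get lease "$1" -n kube-node-lease -o jsonpath='{.spec.renewTime}'
}
```

For example, `oc get nodes --no-headers | not_ready_nodes` lists nothing while every node is Ready, and calling `node_lease_renew_time` on the blocked worker a few times shows whether its heartbeat is still reaching the API server.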
Additional info:
% oc get nodes
NAME                                              STATUS   ROLES                  AGE   VERSION
prubenda-aro5-jg26m-master-0                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-master-1                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-master-2                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-84znv   Ready    worker                 8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-tkgn6   Ready    worker                 8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-xszjq   Ready    worker                 8h    v1.28.13+2ca1a23

// test connection from master to worker
% oc debug node/prubenda-aro5-jg26m-master-0
Starting pod/prubenda-aro5-jg26m-master-0-debug-c8clg ...
To use host binaries, run `chroot /host`
chroot /hos
Pod IP: 10.0.0.9
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-5.1# ping 10.0.2.5. // try communicating with blocked worker ip
PING 10.0.2.5 (10.0.2.5) 56(84) bytes of data.
^C
--- 10.0.2.5 ping statistics ---
25 packets transmitted, 0 received, 100% packet loss, time 24563ms
sh-5.1# ^C
sh-5.1# ping 10.0.2.4 // communication to a second worker
PING 10.0.2.4 (10.0.2.4) 56(84) bytes of data.
64 bytes from 10.0.2.4: icmp_seq=1 ttl=64 time=2.12 ms
64 bytes from 10.0.2.4: icmp_seq=2 ttl=64 time=1.28 ms
64 bytes from 10.0.2.4: icmp_seq=3 ttl=64 time=1.15 ms
64 bytes from 10.0.2.4: icmp_seq=4 ttl=64 time=1.15 ms
^C
--- 10.0.2.4 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.152/1.425/2.119/0.403 ms
sh-5.1# exit
exit
sh-4.4# exit
exit
Removing debug pod ...

// test connection from worker node to worker node
% oc debug node/prubenda-aro5-jg26m-worker-northcentralus-84znv
Starting pod/prubenda-aro5-jg26m-worker-northcentralus-84znv-debug-4z9l5 ...
To use host binaries, run `chroot /host`
chroot /host
Pod IP: 10.0.2.6
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-5.1# ping 10.0.2.4. // ping to other worker node
PING 10.0.2.4 (10.0.2.4) 56(84) bytes of data.
64 bytes from 10.0.2.4: icmp_seq=1 ttl=64 time=2.13 ms
64 bytes from 10.0.2.4: icmp_seq=2 ttl=64 time=6.10 ms
64 bytes from 10.0.2.4: icmp_seq=3 ttl=64 time=0.720 ms
64 bytes from 10.0.2.4: icmp_seq=4 ttl=64 time=1.64 ms
^C
--- 10.0.2.4 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3020ms
rtt min/avg/max/mdev = 0.720/2.647/6.100/2.056 ms
sh-5.1# ^C
sh-5.1# 10.0.2.5
sh: 10.0.2.5: command not found
sh-5.1# ping 10.0.2.5
PING 10.0.2.5 (10.0.2.5) 56(84) bytes of data.
^C
--- 10.0.2.5 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5111ms
sh-5.1# ^C

// Get nodes during chaos
% oc get nodes
NAME                                              STATUS   ROLES                  AGE   VERSION
prubenda-aro5-jg26m-master-0                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-master-1                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-master-2                      Ready    control-plane,master   8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-84znv   Ready    worker                 8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-tkgn6   Ready    worker                 8h    v1.28.13+2ca1a23
prubenda-aro5-jg26m-worker-northcentralus-xszjq   Ready    worker                 8h    v1.28.13+2ca1a23