Bug
Resolution: Done-Errata
Critical
4.13.z
Description of problem:
- After upgrading from 4.13.24 to 4.13.30, traffic routed from HostNetworked pods (router-default) to backends intermittently times out or fails to reach its destination. This affects all nodes/projects and was replicated on two clusters that underwent the same upgrade.
- This manifests as the router pods marking backends as DOWN and dropping traffic. The behavior can also be replicated with curl outside of the HAProxy pods by entering a debug shell on a host node (or using SSH) and curling the pod IP directly: a significant percentage of calls to the target backend time out intermittently.
- We narrowed the behavior down to the moment we applied the NetworkPolicy for `allow-from-ingress` as outlined below - immediately the namespace began to drop packets on a curl loop running from an infra node directly against the pod IP (some 2-3% of all calls timed out).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
  namespace: testing
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          policy-group.network.openshift.io/ingress: ""
  podSelector: {}
  policyTypes:
  - Ingress
Version-Release number of selected component (if applicable):
How reproducible:
- Every time, in all namespaces with this network policy on this cluster version (replicated on two clusters that underwent the same upgrade).
Steps to Reproduce:
1. Upgrade cluster to 4.13.30
2. Deploy a test pod running a basic HTTP instance on a random port
3. Apply the allow-from-ingress NetworkPolicy and begin a curl loop against the target pod directly from an ingress node (or other worker node) at host chroot level (nodeIP).
4. Observe that curls time out intermittently. The replicator curl loop is below (note the --connect-timeout flag, which lets the loop continue more rapidly instead of waiting for the full 2m connect timeout on a typical SYN failure).
$ while true; do curl --connect-timeout 5 --noproxy '*' -k -w "dnslookup: %{time_namelookup} | connect: %{time_connect} | appconnect: %{time_appconnect} | pretransfer: %{time_pretransfer} | starttransfer: %{time_starttransfer} | total: %{time_total} | size: %{size_download} | response: %{response_code}\n" -o /dev/null -s https://<POD>:<PORT>; done
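To put a number on the failure rate (the report mentions roughly 2-3% of calls), the loop can be bounded and its results tallied. The following is a minimal sketch and was not part of the original reproduction: `timeout_rate` is a hypothetical helper that reads one curl `%{response_code}` value per line and counts `000` (what curl writes when no HTTP response was received) as a failed attempt. `POD` and `PORT` in the usage comment are placeholders for the target pod IP and port.

```shell
# Hypothetical helper: reads one curl %{response_code} per line on stdin
# and reports how many attempts received no HTTP response (curl writes 000).
timeout_rate() {
  awk '
    $1 == "000" { failed++ }
    NF          { total++ }
    END {
      if (total == 0) { print "no samples"; exit }
      printf "%d/%d failed (%.1f%%)\n", failed, total, 100 * failed / total
    }'
}

# Example usage (assumes POD and PORT are set in the environment):
#   for i in $(seq 1 200); do
#     curl --connect-timeout 5 --noproxy '*' -k -s -o /dev/null \
#          -w '%{response_code}\n' "https://${POD}:${PORT}"
#   done | timeout_rate
```

Running the same bounded loop before and after applying the NetworkPolicy gives two comparable figures rather than an eyeballed estimate.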
Actual results:
- Traffic to all backends is dropped or degraded: the intermittent connection failures cause valid, healthy pods to be marked as unavailable.
Expected results:
- Traffic should not be impeded, especially once the NetworkPolicy that allows said traffic has been applied.
Additional info:
- This behavior began immediately after completed upgrade from 4.13.24 to 4.13.30 and has been replicated on two separate clusters.
- The customer has been forced to reinstall a cluster at a downgraded version to ensure stability/deliverables for their user base; this is a critical-impact outage scenario for them.
- additional required template details in first comment below.
RCA UPDATE:
The problem is that the host-network namespace is not labeled by the ingress controller, so when router pods are hostNetworked, a network policy with the `policy-group.network.openshift.io/ingress: ""` selector won't allow their incoming connections. To reproduce, run the ingress controller with `EndpointPublishingStrategy=HostNetwork` (https://docs.openshift.com/container-platform/4.14/networking/nw-ingress-controller-endpoint-publishing-strategies.html) and then check the host-network namespace labels with:
oc get ns openshift-host-network --show-labels
# expected this
kubernetes.io/metadata.name=openshift-host-network,network.openshift.io/policy-group=ingress,policy-group.network.openshift.io/host-network=,policy-group.network.openshift.io/ingress=
# but before the fix you will see
kubernetes.io/metadata.name=openshift-host-network,policy-group.network.openshift.io/host-network=
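The comparison of the two label strings above can be scripted. This is a rough sketch only: `missing_ingress_labels` is a hypothetical helper, and the two labels it looks for are the ones that differ between the "expected" and "before the fix" outputs in this report. It takes the comma-separated LABELS value and prints each expected label that is absent.

```shell
# Hypothetical helper: given the LABELS column from
#   oc get ns openshift-host-network --show-labels
# print each expected label (per the "expected" output in this report)
# that is missing. Prints nothing when both labels are present.
missing_ingress_labels() {
  labels=",$1,"   # wrap in commas so every label can be matched whole
  for want in \
    "policy-group.network.openshift.io/ingress=" \
    "network.openshift.io/policy-group=ingress"; do
    case "$labels" in
      *",$want,"*) ;;                  # present with the expected value
      *) printf '%s\n' "$want" ;;      # absent
    esac
  done
}

# Example usage (the LABELS value is the last column of the output):
#   missing_ingress_labels "$(oc get ns openshift-host-network \
#       --show-labels --no-headers | awk '{print $NF}')"
```

Empty output suggests the namespace carries the labels the NetworkPolicy selector depends on. Any printed line names a label to investigate.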
Another way to verify that this is the same problem (disruptive, only recommended for test environments) is to make CNO unmanaged:
oc scale deployment cluster-version-operator -n openshift-cluster-version --replicas=0
oc scale deployment network-operator -n openshift-network-operator --replicas=0
and then label the openshift-host-network namespace manually with the expected labels shown above and see if the problem disappears.
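The manual labeling step might look like the sketch below. It assumes the two labels missing from the "before the fix" output are the ones to add, and it is meant only for a test cluster where the operators have already been scaled down as described. The `label_host_network_ns` wrapper and the `OC` override are hypothetical conveniences that allow a dry run; `oc label namespace <name> key=value` is the underlying command.

```shell
# Hypothetical wrapper around `oc label`. Set OC=echo to print the command
# instead of running it against a cluster.
label_host_network_ns() {
  ${OC:-oc} label namespace openshift-host-network \
    "policy-group.network.openshift.io/ingress=" \
    "network.openshift.io/policy-group=ingress"
}

# Dry run:   OC=echo label_host_network_ns
# Real run:  label_host_network_ns
```

After labeling, re-running the curl loop from a node should show whether the timeouts stop.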
Potentially affected versions (may need to reproduce to confirm)
4.16.0, 4.15.0, 4.14.0 since https://issues.redhat.com//browse/OCPBUGS-8070
4.13.30 https://issues.redhat.com/browse/OCPBUGS-22293
4.12.48 https://issues.redhat.com/browse/OCPBUGS-24039
Mitigation/support KCS:
https://access.redhat.com/solutions/7055050
- clones: OCPBUGS-28920 OCP 4.13.30 - allow-from-ingress NetworkPolicy does not consistently allow traffic from HostNetworked pods or from node IP's (packet timeout) (Closed)
- depends on: OCPBUGS-28920 OCP 4.13.30 - allow-from-ingress NetworkPolicy does not consistently allow traffic from HostNetworked pods or from node IP's (packet timeout) (Closed)
- is cloned by: OCPBUGS-29300 [4.14] OCP 4.13.30 - allow-from-ingress NetworkPolicy does not consistently allow traffic from HostNetworked pods or from node IP's (packet timeout) (Closed)
- is depended on by: OCPBUGS-29300 [4.14] OCP 4.13.30 - allow-from-ingress NetworkPolicy does not consistently allow traffic from HostNetworked pods or from node IP's (packet timeout) (Closed)
- links to: RHSA-2023:7198 OpenShift Container Platform 4.15 security update