- Bug
- Resolution: Unresolved
- Critical
- None
- 4.19
- Critical
- Yes
- 5
- MCO Sprint 267
- 1
- Rejected
- False
- Release Note Not Required
- In Progress
Description of problem:
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
Issue found in Prow CI:
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-arm-mixarch-f14 #1890061783440297984
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-amd-mixarch-f28-destructive #1890035862469611520
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-arm-mixarch-f14 #1890279505117843456
must-gather logs for the second job: https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-o[…]r-must-gather/artifacts/must-gather.tar
% omg get nodes
NAME                                        STATUS   ROLES                  AGE     VERSION
ci-op-9pmd0iim-3eaf1-dcw66-master-0         Ready    control-plane,master   1h12m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-master-1         Ready    control-plane,master   1h13m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-master-2         Ready    control-plane,master   1h11m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7   Ready    worker                 1h0m    v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp   Ready    worker                 58m     v1.32.1

% omg get pods -n openshift-ovn-kubernetes -o wide
NAME                                     READY   STATUS    RESTARTS   AGE    IP           NODE
ovn-ipsec-host-2qfqh                     2/2     Running   0          33m    10.0.0.4     ci-op-9pmd0iim-3eaf1-dcw66-master-2
ovn-ipsec-host-bqh5n                     0/2     Pending   0          33m    10.0.128.3   ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp
ovn-ipsec-host-hdjtx                     2/2     Running   0          33m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovn-ipsec-host-jwn8s                     2/2     Running   0          33m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovn-ipsec-host-n4cpv                     0/2     Pending   0          33m    10.0.128.2   ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
ovnkube-control-plane-85cbb47f9d-n6rps   2/2     Running   1          55m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovnkube-control-plane-85cbb47f9d-slb94   2/2     Running   0          47m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovnkube-node-2hwb6                       8/8     Running   0          1h0m   10.0.128.2   ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
ovnkube-node-9nhj6                       8/8     Running   1          53m    10.0.0.4     ci-op-9pmd0iim-3eaf1-dcw66-master-2
ovnkube-node-h2fd2                       8/8     Running   2          53m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovnkube-node-hwng4                       8/8     Running   0          56m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovnkube-node-k6rfl                       8/8     Running   0          58m    10.0.128.3   ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp
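Both ovn-ipsec-host pods scheduled to the worker nodes stay Pending while the copies on the masters run. A minimal sketch of how one might dig into this from the attached must-gather with omg (subcommand and resource support can vary by omg version, and the pod name is the one from this particular run):

% omg get pod ovn-ipsec-host-n4cpv -n openshift-ovn-kubernetes -o yaml
# Look at .status.conditions and .status.initContainerStatuses for the waiting
# reason; the full output is pasted below.
% omg get events -n openshift-ovn-kubernetes
# Namespace events, if captured in the must-gather, usually show scheduling,
# sandbox, or image-pull problems for the Pending pod.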
% omg get pod ovn-ipsec-host-n4cpv -n openshift-ovn-kubernetes -o yaml apiVersion: v1 kind: Pod metadata: annotations: cluster-autoscaler.kubernetes.io/enable-ds-eviction: 'false' creationTimestamp: '2025-02-13T14:54:05Z' generateName: ovn-ipsec-host- labels: app: ovn-ipsec component: network controller-revision-hash: 8b4dd5dc7 kubernetes.io/os: linux openshift.io/component: network pod-template-generation: '1' type: infra managedFields: - apiVersion: v1 fieldsType: FieldsV1 fieldsV1: f:metadata: f:annotations: .: {} f:cluster-autoscaler.kubernetes.io/enable-ds-eviction: {} f:target.workload.openshift.io/management: {} f:generateName: {} f:labels: .: {} f:app: {} f:component: {} f:controller-revision-hash: {} f:kubernetes.io/os: {} f:openshift.io/component: {} f:pod-template-generation: {} f:type: {} f:ownerReferences: .: {} k:{"uid":"61870386-d205-465b-832c-061c3bf7366e"}: {} f:spec: f:affinity: .: {} f:nodeAffinity: .: {} f:requiredDuringSchedulingIgnoredDuringExecution: {} f:containers: k:{"name":"ovn-ipsec"}: .: {} f:command: {} f:env: .: {} k:{"name":"K8S_NODE"}: .: {} f:name: {} f:valueFrom: .: {} f:fieldRef: {} f:image: {} f:imagePullPolicy: {} f:lifecycle: .: {} f:preStop: .: {} f:exec: .: {} f:command: {} f:livenessProbe: .: {} f:exec: .: {} f:command: {} f:failureThreshold: {} f:initialDelaySeconds: {} f:periodSeconds: {} f:successThreshold: {} f:timeoutSeconds: {} f:name: {} f:resources: .: {} f:requests: .: {} f:cpu: {} f:memory: {} f:securityContext: .: {} f:privileged: {} f:terminationMessagePath: {} f:terminationMessagePolicy: {} f:volumeMounts: .: {} k:{"mountPath":"/etc"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/etc/cni/net.d"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/etc/openvswitch"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/usr/libexec/ipsec"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/usr/sbin/ipsec"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/var/lib"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/var/log/openvswitch/"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/var/run"}: .: {} f:mountPath: {} f:name: {} k:{"name":"ovn-ipsec-cleanup"}: .: {} f:command: {} f:image: {} f:imagePullPolicy: {} f:name: {} f:resources: .: {} f:requests: .: {} f:cpu: {} f:memory: {} f:securityContext: .: {} f:privileged: {} f:terminationMessagePath: {} f:terminationMessagePolicy: {} f:volumeMounts: .: {} k:{"mountPath":"/etc"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/etc/ovn/"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/var/run"}: .: {} f:mountPath: {} f:name: {} f:dnsPolicy: {} f:enableServiceLinks: {} f:hostNetwork: {} f:hostPID: {} f:initContainers: .: {} k:{"name":"ovn-keys"}: .: {} f:command: {} f:env: .: {} k:{"name":"K8S_NODE"}: .: {} f:name: {} f:valueFrom: .: {} f:fieldRef: {} f:image: {} f:imagePullPolicy: {} f:name: {} f:resources: .: {} f:requests: .: {} f:cpu: {} f:memory: {} f:securityContext: .: {} f:privileged: {} f:terminationMessagePath: {} f:terminationMessagePolicy: {} f:volumeMounts: .: {} k:{"mountPath":"/etc"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/etc/openvswitch"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/etc/ovn/"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/signer-ca"}: .: {} f:mountPath: {} f:name: {} k:{"mountPath":"/var/run"}: .: {} f:mountPath: {} f:name: {} f:nodeSelector: {} f:priorityClassName: {} f:restartPolicy: {} f:schedulerName: {} f:securityContext: {} f:serviceAccount: {} f:serviceAccountName: {} f:terminationGracePeriodSeconds: {} f:tolerations: {} 
f:volumes: .: {} k:{"name":"etc-openvswitch"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"etc-ovn"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"host-cni-netd"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"host-etc"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"host-var-lib"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"host-var-log-ovs"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"host-var-run"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"ipsec-bin"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"ipsec-lib"}: .: {} f:hostPath: .: {} f:path: {} f:type: {} f:name: {} k:{"name":"signer-ca"}: .: {} f:configMap: .: {} f:defaultMode: {} f:name: {} f:name: {} manager: kube-controller-manager operation: Update time: '2025-02-13T14:54:04Z' - apiVersion: v1 fieldsType: FieldsV1 fieldsV1: f:status: f:conditions: k:{"type":"ContainersReady"}: .: {} f:lastProbeTime: {} f:lastTransitionTime: {} f:message: {} f:reason: {} f:status: {} f:type: {} k:{"type":"Initialized"}: .: {} f:lastProbeTime: {} f:lastTransitionTime: {} f:message: {} f:reason: {} f:status: {} f:type: {} k:{"type":"PodReadyToStartContainers"}: .: {} f:lastProbeTime: {} f:lastTransitionTime: {} f:status: {} f:type: {} k:{"type":"Ready"}: .: {} f:lastProbeTime: {} f:lastTransitionTime: {} f:message: {} f:reason: {} f:status: {} f:type: {} f:containerStatuses: {} f:hostIP: {} f:hostIPs: {} f:initContainerStatuses: {} f:podIP: {} f:podIPs: .: {} k:{"ip":"10.0.128.2"}: .: {} f:ip: {} f:startTime: {} manager: kubelet operation: Update subresource: status time: '2025-02-13T14:54:05Z' name: ovn-ipsec-host-n4cpv namespace: openshift-ovn-kubernetes ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: DaemonSet name: ovn-ipsec-host uid: 61870386-d205-465b-832c-061c3bf7366e resourceVersion: '38812' uid: ce7f6619-3015-414d-9de4-5991d74258fd spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchFields: - key: metadata.name operator: In values: - ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7 containers: - command: - /bin/bash - -c - "#!/bin/bash\nset -exuo pipefail\n\n# Don't start IPsec until ovnkube-node has\ \ finished setting up the node\ncounter=0\nuntil [ -f /etc/cni/net.d/10-ovn-kubernetes.conf\ \ ]\ndo\n counter=$((counter+1))\n sleep 1\n if [ $counter -gt 300 ];\n \ \ then\n echo \"ovnkube-node pod has not started after $counter seconds\"\ \n exit 1\n fi\ndone\necho \"ovnkube-node has configured node.\"\n\ \nif ! pgrep pluto; then\n echo \"pluto is not running, enable the service\ \ and/or check system logs\"\n exit 2\nfi\n\n# The ovs-monitor-ipsec doesn't\ \ set authby, so when it calls ipsec auto --start\n# the default ones defined\ \ at Libreswan's compile time will be used. On restart,\n# Libreswan will use\ \ authby from libreswan.config. If libreswan.config is\n# incompatible with\ \ the Libreswan's compiled-in defaults, then we'll have an\n# authentication\ \ problem. But OTOH, ovs-monitor-ipsec does set ike and esp algorithms,\n# so\ \ those may be incompatible with libreswan.config as well. Hence commenting\ \ out the\n# \"include\" from libreswan.conf to avoid such conflicts.\ndefaultcpinclude=\"\ include \\/etc\\/crypto-policies\\/back-ends\\/libreswan.config\"\nif ! 
grep\ \ -q \"# ${defaultcpinclude}\" /etc/ipsec.conf; then\n sed -i \"/${defaultcpinclude}/s/^/#\ \ /\" /etc/ipsec.conf\n # since pluto is on the host, we need to restart it\ \ after changing connection\n # parameters.\n chroot /proc/1/root ipsec restart\n\ \n counter=0\n until [ -r /run/pluto/pluto.ctl ]; do\n counter=$((counter+1))\n\ \ sleep 1\n if [ $counter -gt 300 ];\n then\n echo \"ipsec has\ \ not started after $counter seconds\"\n exit 1\n fi\n done\n echo\ \ \"ipsec service is restarted\"\nfi\n\n# Workaround for https://github.com/libreswan/libreswan/issues/373\n\ ulimit -n 1024\n\n/usr/libexec/ipsec/addconn --config /etc/ipsec.conf --checkconfig\n\ # Check kernel modules\n/usr/libexec/ipsec/_stackmanager start\n# Check nss\ \ database status\n/usr/sbin/ipsec --checknss\n\n# Start ovs-monitor-ipsec which\ \ will monitor for changes in the ovs\n# tunnelling configuration (for example\ \ addition of a node) and configures\n# libreswan appropriately.\n# We are running\ \ this in the foreground so that the container will be restarted when ovs-monitor-ipsec\ \ fails.\n/usr/libexec/platform-python /usr/share/openvswitch/scripts/ovs-monitor-ipsec\ \ \\\n --pidfile=/var/run/openvswitch/ovs-monitor-ipsec.pid --ike-daemon=libreswan\ \ --no-restart-ike-daemon \\\n --ipsec-conf /etc/ipsec.d/openshift.conf --ipsec-d\ \ /var/lib/ipsec/nss \\\n --log-file --monitor unix:/var/run/openvswitch/db.sock\n" env: - name: K8S_NODE valueFrom: fieldRef: apiVersion: v1 fieldPath: spec.nodeName image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imagePullPolicy: IfNotPresent lifecycle: preStop: exec: command: - /bin/bash - -c - '#!/bin/bash set -exuo pipefail # In order to maintain traffic flows during container restart, we # need to ensure that xfrm state and policies are not flushed. # Don''t allow ovs monitor to cleanup persistent state kill "$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid 2>/dev/null)" 2>/dev/null || true ' livenessProbe: exec: command: - /bin/bash - -c - "#!/bin/bash\nif [[ $(ipsec whack --trafficstatus | wc -l) -eq 0 ]]; then\n\ \ echo \"no ipsec traffic configured\"\n exit 10\nfi\n" failureThreshold: 3 initialDelaySeconds: 15 periodSeconds: 60 successThreshold: 1 timeoutSeconds: 1 name: ovn-ipsec resources: requests: cpu: 10m memory: 100Mi securityContext: privileged: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /etc/cni/net.d name: host-cni-netd - mountPath: /var/run name: host-var-run - mountPath: /var/log/openvswitch/ name: host-var-log-ovs - mountPath: /etc/openvswitch name: etc-openvswitch - mountPath: /var/lib name: host-var-lib - mountPath: /etc name: host-etc - mountPath: /usr/sbin/ipsec name: ipsec-bin - mountPath: /usr/libexec/ipsec name: ipsec-lib - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true - command: - /bin/bash - -c - "#!/bin/bash\n\n# When NETWORK_NODE_IDENTITY_ENABLE is true, use the per-node\ \ certificate to create a kubeconfig\n# that will be used to talk to the API\n\ \n\n# Wait for cert file\nretries=0\ntries=20\nkey_cert=\"/etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\"\ \nwhile [ ! 
-f \"${key_cert}\" ]; do\n (( retries += 1 ))\n if [[ \"${retries}\"\ \ -gt ${tries} ]]; then\n echo \"$(date -Iseconds) - ERROR - ${key_cert}\ \ not found\"\n return 1\n fi\n sleep 1\ndone\n\ncat << EOF > /var/run/ovnkube-kubeconfig\n\ apiVersion: v1\nclusters:\n - cluster:\n certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\ \ server: https://api-int.ci-op-9pmd0iim-3eaf1.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:6443\n\ \ name: default-cluster\ncontexts:\n - context:\n cluster: default-cluster\n\ \ namespace: default\n user: default-auth\n name: default-context\n\ current-context: default-context\nkind: Config\npreferences: {}\nusers:\n -\ \ name: default-auth\n user:\n client-certificate: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\ \ client-key: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\ EOF\nexport KUBECONFIG=/var/run/ovnkube-kubeconfig\n\n\n# It is safe to flush\ \ xfrm states and policies and delete openshift.conf\n# file when east-west\ \ ipsec is disabled. This fixes a race condition when\n# ovs-monitor-ipsec is\ \ not fast enough to notice ipsec config change and\n# delete entries before\ \ it's being killed.\n# Since it's cleaning up all xfrm states and policies,\ \ it may cause slight\n# interruption until ipsec is restarted in case of external\ \ ipsec config.\n# We must do this before killing ovs-monitor-ipsec script,\ \ otherwise\n# preStop hook doesn't get a chance to run it because ovn-ipsec\ \ container\n# is abruptly terminated.\n# When east-west ipsec is not disabled,\ \ then do not flush xfrm states and\n# policies in order to maintain traffic\ \ flows during container restart.\nipsecflush() {\n if [ \"$(kubectl get networks.operator.openshift.io\ \ cluster -ojsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.ipsecConfig.mode}')\"\ \ != \"Full\" ] && \\\n [ \"$(kubectl get networks.operator.openshift.io\ \ cluster -ojsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.ipsecConfig}')\"\ \ != \"{}\" ]; then\n ip x s flush\n ip x p flush\n rm -f /etc/ipsec.d/openshift.conf\n\ \ # since pluto is on the host, we need to restart it after the flush\n \ \ chroot /proc/1/root ipsec restart\n fi\n}\n\n# Function to handle SIGTERM\n\ cleanup() {\n echo \"received SIGTERM, flushing ipsec config\"\n # Wait upto\ \ 15 seconds for ovs-monitor-ipsec process to terminate before\n # cleaning\ \ up ipsec entries.\n counter=0\n while kill -0 \"$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid\ \ 2>/dev/null)\"; do\n counter=$((counter+1))\n sleep 1\n if [ $counter\ \ -gt 15 ];\n then\n echo \"ovs-monitor-ipsec has not terminated after\ \ $counter seconds\"\n break\n fi\n done\n ipsecflush\n exit 0\n\ }\n\n# Trap SIGTERM and call cleanup function\ntrap cleanup SIGTERM\n\ncounter=0\n\ until [ -r /var/run/openvswitch/ovs-monitor-ipsec.pid ]; do\n counter=$((counter+1))\n\ \ sleep 1\n if [ $counter -gt 300 ];\n then\n echo \"ovs-monitor-ipsec\ \ has not started after $counter seconds\"\n exit 1\n fi\ndone\necho \"\ ovs-monitor-ipsec is started\"\n\n# Monitor the ovs-monitor-ipsec process.\n\ while kill -0 \"$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid 2>/dev/null)\"\ ; do\n sleep 1\ndone\n\n# Once the ovs-monitor-ipsec process terminates, execute\ \ the cleanup command.\necho \"ovs-monitor-ipsec is terminated, flushing ipsec\ \ config\"\nipsecflush\n\n# Continue running until SIGTERM is received (or exit\ \ naturally)\nwhile true; do\n sleep 1\ndone\n" image: 
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imagePullPolicy: IfNotPresent name: ovn-ipsec-cleanup resources: requests: cpu: 10m memory: 50Mi securityContext: privileged: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /etc/ovn/ name: etc-ovn - mountPath: /var/run name: host-var-run - mountPath: /etc name: host-etc - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true dnsPolicy: Default enableServiceLinks: true hostNetwork: true hostPID: true imagePullSecrets: - name: ovn-kubernetes-node-dockercfg-sds8g initContainers: - command: - /bin/bash - -c - "#!/bin/bash\nset -exuo pipefail\n\n# When NETWORK_NODE_IDENTITY_ENABLE is true,\ \ use the per-node certificate to create a kubeconfig\n# that will be used to\ \ talk to the API\n\n\n# Wait for cert file\nretries=0\ntries=20\nkey_cert=\"\ /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\"\nwhile [ ! -f \"${key_cert}\"\ \ ]; do\n (( retries += 1 ))\n if [[ \"${retries}\" -gt ${tries} ]]; then\n\ \ echo \"$(date -Iseconds) - ERROR - ${key_cert} not found\"\n return\ \ 1\n fi\n sleep 1\ndone\n\ncat << EOF > /var/run/ovnkube-kubeconfig\napiVersion:\ \ v1\nclusters:\n - cluster:\n certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\ \ server: https://api-int.ci-op-9pmd0iim-3eaf1.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:6443\n\ \ name: default-cluster\ncontexts:\n - context:\n cluster: default-cluster\n\ \ namespace: default\n user: default-auth\n name: default-context\n\ current-context: default-context\nkind: Config\npreferences: {}\nusers:\n -\ \ name: default-auth\n user:\n client-certificate: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\ \ client-key: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\ EOF\nexport KUBECONFIG=/var/run/ovnkube-kubeconfig\n\n\n# Every time we restart\ \ this container, we will create a new key pair if\n# we are close to key expiration\ \ or if we do not already have a signed key pair.\n#\n# Each node has a key\ \ pair which is used by OVS to encrypt/decrypt/authenticate traffic\n# between\ \ each node. The CA cert is used as the root of trust for all certs so we need\n\ # the CA to sign our certificate signing requests with the CA private key. In\ \ this way,\n# we can validate that any signed certificates that we receive\ \ from other nodes are\n# authentic.\necho \"Configuring IPsec keys\"\n\ncert_pem=/etc/openvswitch/keys/ipsec-cert.pem\n\ \n# If the certificate does not exist or it will expire in the next 6 months\n\ # (15770000 seconds), we will generate a new one.\nif ! openssl x509 -noout\ \ -dates -checkend 15770000 -in $cert_pem; then\n # We use the system-id as\ \ the CN for our certificate signing request. This\n # is a requirement by\ \ OVN.\n cn=$(ovs-vsctl --retry -t 60 get Open_vSwitch . 
external-ids:system-id\ \ | tr -d \"\\\"\")\n\n mkdir -p /etc/openvswitch/keys\n\n # Generate an SSL\ \ private key and use the key to create a certitificate signing request\n umask\ \ 077 && openssl genrsa -out /etc/openvswitch/keys/ipsec-privkey.pem 2048\n\ \ openssl req -new -text \\\n -extensions v3_req \\\n \ \ -addext \"subjectAltName = DNS:${cn}\" \\\n -subj \"/C=US/O=ovnkubernetes/OU=kind/CN=${cn}\"\ \ \\\n -key /etc/openvswitch/keys/ipsec-privkey.pem \\\n \ \ -out /etc/openvswitch/keys/ipsec-req.pem\n\n csr_64=$(base64 -w0 /etc/openvswitch/keys/ipsec-req.pem)\ \ # -w0 to avoid line-wrap\n\n # Request that our generated certificate signing\ \ request is\n # signed by the \"network.openshift.io/signer\" signer that\ \ is\n # implemented by the CNO signer controller. This will sign the\n #\ \ certificate signing request using the signer-ca which has been\n # set up\ \ by the OperatorPKI. In this way, we have a signed certificate\n # and our\ \ private key has remained private on this host.\n cat <<EOF | kubectl create\ \ -f -\n apiVersion: certificates.k8s.io/v1\n kind: CertificateSigningRequest\n\ \ metadata:\n generateName: ipsec-csr-$(hostname)-\n labels:\n k8s.ovn.org/ipsec-csr:\ \ $(hostname)\n spec:\n request: ${csr_64}\n signerName: network.openshift.io/signer\n\ \ usages:\n - ipsec tunnel\nEOF\n # Wait until the certificate signing\ \ request has been signed.\n counter=0\n until [ -n \"$(kubectl get csr -lk8s.ovn.org/ipsec-csr=\"\ $(hostname)\" --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1:].status.certificate}'\ \ 2>/dev/null)\" ]\n do\n counter=$((counter+1))\n sleep 1\n if [\ \ $counter -gt 60 ];\n then\n echo \"Unable to sign certificate\ \ after $counter seconds\"\n exit 1\n fi\n done\n\n # Decode\ \ the signed certificate.\n kubectl get csr -lk8s.ovn.org/ipsec-csr=\"$(hostname)\"\ \ --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1:].status.certificate}'\ \ | base64 -d | openssl x509 -outform pem -text -out $cert_pem\n\n # kubectl\ \ delete csr/$(hostname)\n\n # Get the CA certificate so we can authenticate\ \ peer nodes.\n openssl x509 -in /signer-ca/ca-bundle.crt -outform pem -text\ \ -out /etc/openvswitch/keys/ipsec-cacert.pem\nfi\n\n# Configure OVS with the\ \ relevant keys for this node. This is required by ovs-monitor-ipsec.\n#\n#\ \ Updating the certificates does not need to be an atomic operation as\n# the\ \ will get read and loaded into NSS by the ovs-monitor-ipsec process\n# which\ \ has not started yet.\novs-vsctl --retry -t 60 set Open_vSwitch . 
other_config:certificate=$cert_pem\ \ \\\n other_config:private_key=/etc/openvswitch/keys/ipsec-privkey.pem\ \ \\\n other_config:ca_cert=/etc/openvswitch/keys/ipsec-cacert.pem\n" env: - name: K8S_NODE valueFrom: fieldRef: apiVersion: v1 fieldPath: spec.nodeName image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imagePullPolicy: IfNotPresent name: ovn-keys resources: requests: cpu: 10m memory: 100Mi securityContext: privileged: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: FallbackToLogsOnError volumeMounts: - mountPath: /etc/ovn/ name: etc-ovn - mountPath: /var/run name: host-var-run - mountPath: /signer-ca name: signer-ca - mountPath: /etc/openvswitch name: etc-openvswitch - mountPath: /etc name: host-etc - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true nodeName: ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7 nodeSelector: kubernetes.io/os: linux preemptionPolicy: PreemptLowerPriority priority: 2000001000 priorityClassName: system-node-critical restartPolicy: Always schedulerName: default-scheduler securityContext: {} serviceAccount: ovn-kubernetes-node serviceAccountName: ovn-kubernetes-node terminationGracePeriodSeconds: 10 tolerations: - operator: Exists volumes: - hostPath: path: /var/lib/ovn-ic/etc type: '' name: etc-ovn - hostPath: path: /var/log/openvswitch type: DirectoryOrCreate name: host-var-log-ovs - configMap: defaultMode: 420 name: signer-ca name: signer-ca - hostPath: path: /var/lib/openvswitch/etc type: DirectoryOrCreate name: etc-openvswitch - hostPath: path: /var/run/multus/cni/net.d type: '' name: host-cni-netd - hostPath: path: /var/run type: DirectoryOrCreate name: host-var-run - hostPath: path: /var/lib type: DirectoryOrCreate name: host-var-lib - hostPath: path: /etc type: Directory name: host-etc - hostPath: path: /usr/sbin/ipsec type: File name: ipsec-bin - hostPath: path: /usr/libexec/ipsec type: Directory name: ipsec-lib - name: kube-api-access-7rvbc projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: '2025-02-13T14:54:05Z' status: 'False' type: PodReadyToStartContainers - lastProbeTime: null lastTransitionTime: '2025-02-13T14:54:05Z' message: 'containers with incomplete status: [ovn-keys]' reason: ContainersNotInitialized status: 'False' type: Initialized - lastProbeTime: null lastTransitionTime: '2025-02-13T14:54:05Z' message: 'containers with unready status: [ovn-ipsec ovn-ipsec-cleanup]' reason: ContainersNotReady status: 'False' type: Ready - lastProbeTime: null lastTransitionTime: '2025-02-13T14:54:05Z' message: 'containers with unready status: [ovn-ipsec ovn-ipsec-cleanup]' reason: ContainersNotReady status: 'False' type: ContainersReady - lastProbeTime: null lastTransitionTime: '2025-02-13T14:54:05Z' status: 'True' type: PodScheduled containerStatuses: - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imageID: '' lastState: {} name: ovn-ipsec ready: false restartCount: 0 started: false state: waiting: reason: PodInitializing volumeMounts: - mountPath: /etc/cni/net.d 
name: host-cni-netd - mountPath: /var/run name: host-var-run - mountPath: /var/log/openvswitch/ name: host-var-log-ovs - mountPath: /etc/openvswitch name: etc-openvswitch - mountPath: /var/lib name: host-var-lib - mountPath: /etc name: host-etc - mountPath: /usr/sbin/ipsec name: ipsec-bin - mountPath: /usr/libexec/ipsec name: ipsec-lib - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true recursiveReadOnly: Disabled - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imageID: '' lastState: {} name: ovn-ipsec-cleanup ready: false restartCount: 0 started: false state: waiting: reason: PodInitializing volumeMounts: - mountPath: /etc/ovn/ name: etc-ovn - mountPath: /var/run name: host-var-run - mountPath: /etc name: host-etc - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true recursiveReadOnly: Disabled hostIP: 10.0.128.2 hostIPs: - ip: 10.0.128.2 initContainerStatuses: - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf imageID: '' lastState: {} name: ovn-keys ready: false restartCount: 0 started: false state: waiting: reason: PodInitializing volumeMounts: - mountPath: /etc/ovn/ name: etc-ovn - mountPath: /var/run name: host-var-run - mountPath: /signer-ca name: signer-ca - mountPath: /etc/openvswitch name: etc-openvswitch - mountPath: /etc name: host-etc - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-7rvbc readOnly: true recursiveReadOnly: Disabled phase: Pending podIP: 10.0.128.2 podIPs: - ip: 10.0.128.2 qosClass: Burstable startTime: '2025-02-13T14:54:05Z'
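In the status above, the ovn-keys init container never gets past PodInitializing (started: false), so the ovn-ipsec and ovn-ipsec-cleanup containers stay unready and the DaemonSet pod remains Pending on the worker. A minimal sketch of how one might check whether the node ever reached the IPsec certificate-signing step that ovn-keys drives (the label and signer name are taken from the init-container script above; this assumes a live reproducer, since CSR objects are not always present in a must-gather, and that the node hostname matches the node name; substitute names from your own run):

% oc get csr -l k8s.ovn.org/ipsec-csr=ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
% oc get csr -l k8s.ovn.org/ipsec-csr=ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7 \
    --sort-by=.metadata.creationTimestamp \
    -o jsonpath='{.items[-1:].status.certificate}' | wc -c
# No CSR at all suggests ovn-keys never ran its request step; an unsigned CSR
# would point at the network.openshift.io/signer controller in CNO.

% oc get events -n openshift-ovn-kubernetes --field-selector involvedObject.name=ovn-ipsec-host-n4cpv
# Events for the Pending pod should show whether the sandbox and init
# container ever started on the worker node.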
1.
2.
3.
Actual results:
Expected results:
Additional info:
Please fill in the following template when reporting a bug and provide as much relevant information as possible. Doing so gives us the best chance of finding a prompt resolution.
Affected Platforms:
Is it an:
- internal CI failure
- customer issue / SD
- internal Red Hat testing failure
If it is an internal Red Hat testing failure:
- Please share a kubeconfig or creds to a live cluster for the assignee to debug/troubleshoot, along with reproducer steps (especially if it's a telco use case like ICNI, secondary bridges or BM+kubevirt).
If it is a CI failure:
- Did it happen in different CI lanes? If so, please provide links to multiple failures with the same error instance
- Did it happen in both sdn and ovn jobs? If so, please provide links to multiple failures with the same error instance
- Did it happen on other platforms (e.g. AWS, Azure, GCP, bare metal, etc.)? If so, please provide links to multiple failures with the same error instance
- When did the failure start happening? Please provide the UTC timestamp of the networking outage window from a sample failure run
- If it's a connectivity issue,
- What is the srcNode, srcIP and srcNamespace and srcPodName?
- What is the dstNode, dstIP and dstNamespace and dstPodName?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
If it is a customer / SD issue:
- Provide enough information in the bug description that Engineering doesn’t need to read the entire case history.
- Don’t presume that Engineering has access to Salesforce.
- Do presume that Engineering will access attachments through supportshell.
- Describe what each relevant attachment is intended to demonstrate (failed pods, log errors, OVS issues, etc).
- Referring to the attached must-gather, sosreport or other attachment, please provide the following details:
- If the issue is in a customer namespace then provide a namespace inspect.
- If it is a connectivity issue:
- What is the srcNode, srcNamespace, srcPodName and srcPodIP?
- What is the dstNode, dstNamespace, dstPodName and dstPodIP?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
- Please provide the UTC timestamp of the networking outage window from the must-gather
- Please provide tcpdump pcaps taken during the outage, filtered on the src/dst IPs provided above (see the example sketch after this list)
- If it is not a connectivity issue:
- Describe the steps taken so far to analyze the logs from networking components (cluster-network-operator, OVNK, SDN, openvswitch, ovs-configure, etc.) and the actual component where the issue was seen, based on the attached must-gather. Please attach snippets of relevant logs around the window when the problem happened, if any.
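A minimal sketch of the kind of capture requested above; the interface name and IP values are placeholders, not values taken from this bug:

% tcpdump -i <node-interface> -w outage-window.pcap host <srcPodIP> and host <dstPodIP>
# With IPsec enabled, pod-to-pod traffic between nodes is encapsulated in ESP,
# so a node-level capture of ESP between the two node IPs can also help:
% tcpdump -i <node-interface> -w outage-esp.pcap 'esp and host <srcNodeIP> and host <dstNodeIP>'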
- When showing the results from commands, include the entire command in the output.
- For OCPBUGS in which the issue has been identified, label with “sbr-triaged”
- For OCPBUGS in which the issue has not been identified and needs Engineering help for root cause, label with “sbr-untriaged”
- Do not set the priority; that is owned by Engineering and will be set when the bug is evaluated
- Note: bugs that do not meet these minimum standards will be closed with label “SDN-Jira-template”
- For guidance on using this template, please see
OCPBUGS Template Training for Networking components
- blocks
  - SDN-5330 Add ipsec upgrade ci job as mandatory lane (In Progress)
- links to
  - RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update