-
Bug
-
Resolution: Done-Errata
-
Critical
-
4.19
-
Quality / Stability / Reliability
-
False
-
-
5
-
Critical
-
Yes
-
None
-
Rejected
-
MCO Sprint 267
-
1
-
In Progress
-
Release Note Not Required
-
None
-
None
-
None
-
None
-
None
Description of problem:
In the 4.19 multi-arch IPsec CI lanes on GCP, the ovn-ipsec-host pods scheduled on the worker nodes stay Pending (0/2) while the pods on the control-plane nodes run normally; the affected pods never get past PodInitializing.
Version-Release number of selected component (if applicable):
4.19 nightly (multi-arch payload), nodes at v1.32.1
How reproducible:
Seen in several Prow CI runs (links below).
Steps to Reproduce:
Issue found in Prow CI:
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-arm-mixarch-f14 #1890061783440297984
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-amd-mixarch-f28-destructive #1890035862469611520
periodic-ci-openshift-openshift-tests-private-release-4.19-multi-nightly-gcp-ipi-ovn-ipsec-arm-mixarch-f14 #1890279505117843456
must-gather logs for the second run: https://gcsweb-qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/qe-private-deck/logs/periodic-ci-o[…]r-must-gather/artifacts/must-gather.tar
% omg get nodes
NAME                                        STATUS   ROLES                  AGE     VERSION
ci-op-9pmd0iim-3eaf1-dcw66-master-0         Ready    control-plane,master   1h12m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-master-1         Ready    control-plane,master   1h13m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-master-2         Ready    control-plane,master   1h11m   v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7   Ready    worker                 1h0m    v1.32.1
ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp   Ready    worker                 58m     v1.32.1

% omg get pods -n openshift-ovn-kubernetes -o wide
NAME                                     READY   STATUS    RESTARTS   AGE    IP           NODE
ovn-ipsec-host-2qfqh                     2/2     Running   0          33m    10.0.0.4     ci-op-9pmd0iim-3eaf1-dcw66-master-2
ovn-ipsec-host-bqh5n                     0/2     Pending   0          33m    10.0.128.3   ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp
ovn-ipsec-host-hdjtx                     2/2     Running   0          33m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovn-ipsec-host-jwn8s                     2/2     Running   0          33m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovn-ipsec-host-n4cpv                     0/2     Pending   0          33m    10.0.128.2   ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
ovnkube-control-plane-85cbb47f9d-n6rps   2/2     Running   1          55m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovnkube-control-plane-85cbb47f9d-slb94   2/2     Running   0          47m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovnkube-node-2hwb6                       8/8     Running   0          1h0m   10.0.128.2   ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
ovnkube-node-9nhj6                       8/8     Running   1          53m    10.0.0.4     ci-op-9pmd0iim-3eaf1-dcw66-master-2
ovnkube-node-h2fd2                       8/8     Running   2          53m    10.0.0.3     ci-op-9pmd0iim-3eaf1-dcw66-master-1
ovnkube-node-hwng4                       8/8     Running   0          56m    10.0.0.6     ci-op-9pmd0iim-3eaf1-dcw66-master-0
ovnkube-node-k6rfl                       8/8     Running   0          58m    10.0.128.3   ci-op-9pmd0iim-3eaf1-dcw66-worker-b-97qfp
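The two ovn-ipsec-host pods on the worker nodes never become ready. One way to dig further into the per-container state and events from the extracted must-gather; a minimal sketch (assuming the `omg` build in use supports use/describe/get/logs against a must-gather directory; pod and namespace names are taken from the output above):

% mkdir -p must-gather && tar -xf must-gather.tar -C must-gather            # archive from the gcsweb link above
% omg use ./must-gather                                                     # point omg at the extracted data
% omg describe pod ovn-ipsec-host-n4cpv -n openshift-ovn-kubernetes         # events and per-container state
% omg get events -n openshift-ovn-kubernetes                                # scheduling / sandbox / probe events
% omg logs ovn-ipsec-host-n4cpv -n openshift-ovn-kubernetes -c ovn-keys     # init container log, if it ever started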
% omg get pod ovn-ipsec-host-n4cpv -n openshift-ovn-kubernetes -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
cluster-autoscaler.kubernetes.io/enable-ds-eviction: 'false'
creationTimestamp: '2025-02-13T14:54:05Z'
generateName: ovn-ipsec-host-
labels:
app: ovn-ipsec
component: network
controller-revision-hash: 8b4dd5dc7
kubernetes.io/os: linux
openshift.io/component: network
pod-template-generation: '1'
type: infra
managedFields:
- apiVersion: v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:cluster-autoscaler.kubernetes.io/enable-ds-eviction: {}
f:target.workload.openshift.io/management: {}
f:generateName: {}
f:labels:
.: {}
f:app: {}
f:component: {}
f:controller-revision-hash: {}
f:kubernetes.io/os: {}
f:openshift.io/component: {}
f:pod-template-generation: {}
f:type: {}
f:ownerReferences:
.: {}
k:{"uid":"61870386-d205-465b-832c-061c3bf7366e"}: {}
f:spec:
f:affinity:
.: {}
f:nodeAffinity:
.: {}
f:requiredDuringSchedulingIgnoredDuringExecution: {}
f:containers:
k:{"name":"ovn-ipsec"}:
.: {}
f:command: {}
f:env:
.: {}
k:{"name":"K8S_NODE"}:
.: {}
f:name: {}
f:valueFrom:
.: {}
f:fieldRef: {}
f:image: {}
f:imagePullPolicy: {}
f:lifecycle:
.: {}
f:preStop:
.: {}
f:exec:
.: {}
f:command: {}
f:livenessProbe:
.: {}
f:exec:
.: {}
f:command: {}
f:failureThreshold: {}
f:initialDelaySeconds: {}
f:periodSeconds: {}
f:successThreshold: {}
f:timeoutSeconds: {}
f:name: {}
f:resources:
.: {}
f:requests:
.: {}
f:cpu: {}
f:memory: {}
f:securityContext:
.: {}
f:privileged: {}
f:terminationMessagePath: {}
f:terminationMessagePolicy: {}
f:volumeMounts:
.: {}
k:{"mountPath":"/etc"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/cni/net.d"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/openvswitch"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/usr/libexec/ipsec"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/usr/sbin/ipsec"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/var/lib"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/var/log/openvswitch/"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/var/run"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"name":"ovn-ipsec-cleanup"}:
.: {}
f:command: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:resources:
.: {}
f:requests:
.: {}
f:cpu: {}
f:memory: {}
f:securityContext:
.: {}
f:privileged: {}
f:terminationMessagePath: {}
f:terminationMessagePolicy: {}
f:volumeMounts:
.: {}
k:{"mountPath":"/etc"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/ovn/"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/var/run"}:
.: {}
f:mountPath: {}
f:name: {}
f:dnsPolicy: {}
f:enableServiceLinks: {}
f:hostNetwork: {}
f:hostPID: {}
f:initContainers:
.: {}
k:{"name":"ovn-keys"}:
.: {}
f:command: {}
f:env:
.: {}
k:{"name":"K8S_NODE"}:
.: {}
f:name: {}
f:valueFrom:
.: {}
f:fieldRef: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:resources:
.: {}
f:requests:
.: {}
f:cpu: {}
f:memory: {}
f:securityContext:
.: {}
f:privileged: {}
f:terminationMessagePath: {}
f:terminationMessagePolicy: {}
f:volumeMounts:
.: {}
k:{"mountPath":"/etc"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/openvswitch"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/ovn/"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/signer-ca"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/var/run"}:
.: {}
f:mountPath: {}
f:name: {}
f:nodeSelector: {}
f:priorityClassName: {}
f:restartPolicy: {}
f:schedulerName: {}
f:securityContext: {}
f:serviceAccount: {}
f:serviceAccountName: {}
f:terminationGracePeriodSeconds: {}
f:tolerations: {}
f:volumes:
.: {}
k:{"name":"etc-openvswitch"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"etc-ovn"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"host-cni-netd"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"host-etc"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"host-var-lib"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"host-var-log-ovs"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"host-var-run"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"ipsec-bin"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"ipsec-lib"}:
.: {}
f:hostPath:
.: {}
f:path: {}
f:type: {}
f:name: {}
k:{"name":"signer-ca"}:
.: {}
f:configMap:
.: {}
f:defaultMode: {}
f:name: {}
f:name: {}
manager: kube-controller-manager
operation: Update
time: '2025-02-13T14:54:04Z'
- apiVersion: v1
fieldsType: FieldsV1
fieldsV1:
f:status:
f:conditions:
k:{"type":"ContainersReady"}:
.: {}
f:lastProbeTime: {}
f:lastTransitionTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Initialized"}:
.: {}
f:lastProbeTime: {}
f:lastTransitionTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"PodReadyToStartContainers"}:
.: {}
f:lastProbeTime: {}
f:lastTransitionTime: {}
f:status: {}
f:type: {}
k:{"type":"Ready"}:
.: {}
f:lastProbeTime: {}
f:lastTransitionTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:containerStatuses: {}
f:hostIP: {}
f:hostIPs: {}
f:initContainerStatuses: {}
f:podIP: {}
f:podIPs:
.: {}
k:{"ip":"10.0.128.2"}:
.: {}
f:ip: {}
f:startTime: {}
manager: kubelet
operation: Update
subresource: status
time: '2025-02-13T14:54:05Z'
name: ovn-ipsec-host-n4cpv
namespace: openshift-ovn-kubernetes
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: DaemonSet
name: ovn-ipsec-host
uid: 61870386-d205-465b-832c-061c3bf7366e
resourceVersion: '38812'
uid: ce7f6619-3015-414d-9de4-5991d74258fd
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
containers:
- command:
- /bin/bash
- -c
- "#!/bin/bash\nset -exuo pipefail\n\n# Don't start IPsec until ovnkube-node has\
\ finished setting up the node\ncounter=0\nuntil [ -f /etc/cni/net.d/10-ovn-kubernetes.conf\
\ ]\ndo\n counter=$((counter+1))\n sleep 1\n if [ $counter -gt 300 ];\n \
\ then\n echo \"ovnkube-node pod has not started after $counter seconds\"\
\n exit 1\n fi\ndone\necho \"ovnkube-node has configured node.\"\n\
\nif ! pgrep pluto; then\n echo \"pluto is not running, enable the service\
\ and/or check system logs\"\n exit 2\nfi\n\n# The ovs-monitor-ipsec doesn't\
\ set authby, so when it calls ipsec auto --start\n# the default ones defined\
\ at Libreswan's compile time will be used. On restart,\n# Libreswan will use\
\ authby from libreswan.config. If libreswan.config is\n# incompatible with\
\ the Libreswan's compiled-in defaults, then we'll have an\n# authentication\
\ problem. But OTOH, ovs-monitor-ipsec does set ike and esp algorithms,\n# so\
\ those may be incompatible with libreswan.config as well. Hence commenting\
\ out the\n# \"include\" from libreswan.conf to avoid such conflicts.\ndefaultcpinclude=\"\
include \\/etc\\/crypto-policies\\/back-ends\\/libreswan.config\"\nif ! grep\
\ -q \"# ${defaultcpinclude}\" /etc/ipsec.conf; then\n sed -i \"/${defaultcpinclude}/s/^/#\
\ /\" /etc/ipsec.conf\n # since pluto is on the host, we need to restart it\
\ after changing connection\n # parameters.\n chroot /proc/1/root ipsec restart\n\
\n counter=0\n until [ -r /run/pluto/pluto.ctl ]; do\n counter=$((counter+1))\n\
\ sleep 1\n if [ $counter -gt 300 ];\n then\n echo \"ipsec has\
\ not started after $counter seconds\"\n exit 1\n fi\n done\n echo\
\ \"ipsec service is restarted\"\nfi\n\n# Workaround for https://github.com/libreswan/libreswan/issues/373\n\
ulimit -n 1024\n\n/usr/libexec/ipsec/addconn --config /etc/ipsec.conf --checkconfig\n\
# Check kernel modules\n/usr/libexec/ipsec/_stackmanager start\n# Check nss\
\ database status\n/usr/sbin/ipsec --checknss\n\n# Start ovs-monitor-ipsec which\
\ will monitor for changes in the ovs\n# tunnelling configuration (for example\
\ addition of a node) and configures\n# libreswan appropriately.\n# We are running\
\ this in the foreground so that the container will be restarted when ovs-monitor-ipsec\
\ fails.\n/usr/libexec/platform-python /usr/share/openvswitch/scripts/ovs-monitor-ipsec\
\ \\\n --pidfile=/var/run/openvswitch/ovs-monitor-ipsec.pid --ike-daemon=libreswan\
\ --no-restart-ike-daemon \\\n --ipsec-conf /etc/ipsec.d/openshift.conf --ipsec-d\
\ /var/lib/ipsec/nss \\\n --log-file --monitor unix:/var/run/openvswitch/db.sock\n"
env:
- name: K8S_NODE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imagePullPolicy: IfNotPresent
lifecycle:
preStop:
exec:
command:
- /bin/bash
- -c
- '#!/bin/bash
set -exuo pipefail
# In order to maintain traffic flows during container restart, we
# need to ensure that xfrm state and policies are not flushed.
# Don''t allow ovs monitor to cleanup persistent state
kill "$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid 2>/dev/null)" 2>/dev/null
|| true
'
livenessProbe:
exec:
command:
- /bin/bash
- -c
- "#!/bin/bash\nif [[ $(ipsec whack --trafficstatus | wc -l) -eq 0 ]]; then\n\
\ echo \"no ipsec traffic configured\"\n exit 10\nfi\n"
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 60
successThreshold: 1
timeoutSeconds: 1
name: ovn-ipsec
resources:
requests:
cpu: 10m
memory: 100Mi
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/cni/net.d
name: host-cni-netd
- mountPath: /var/run
name: host-var-run
- mountPath: /var/log/openvswitch/
name: host-var-log-ovs
- mountPath: /etc/openvswitch
name: etc-openvswitch
- mountPath: /var/lib
name: host-var-lib
- mountPath: /etc
name: host-etc
- mountPath: /usr/sbin/ipsec
name: ipsec-bin
- mountPath: /usr/libexec/ipsec
name: ipsec-lib
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
- command:
- /bin/bash
- -c
- "#!/bin/bash\n\n# When NETWORK_NODE_IDENTITY_ENABLE is true, use the per-node\
\ certificate to create a kubeconfig\n# that will be used to talk to the API\n\
\n\n# Wait for cert file\nretries=0\ntries=20\nkey_cert=\"/etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\"\
\nwhile [ ! -f \"${key_cert}\" ]; do\n (( retries += 1 ))\n if [[ \"${retries}\"\
\ -gt ${tries} ]]; then\n echo \"$(date -Iseconds) - ERROR - ${key_cert}\
\ not found\"\n return 1\n fi\n sleep 1\ndone\n\ncat << EOF > /var/run/ovnkube-kubeconfig\n\
apiVersion: v1\nclusters:\n - cluster:\n certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\
\ server: https://api-int.ci-op-9pmd0iim-3eaf1.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:6443\n\
\ name: default-cluster\ncontexts:\n - context:\n cluster: default-cluster\n\
\ namespace: default\n user: default-auth\n name: default-context\n\
current-context: default-context\nkind: Config\npreferences: {}\nusers:\n -\
\ name: default-auth\n user:\n client-certificate: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\
\ client-key: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\
EOF\nexport KUBECONFIG=/var/run/ovnkube-kubeconfig\n\n\n# It is safe to flush\
\ xfrm states and policies and delete openshift.conf\n# file when east-west\
\ ipsec is disabled. This fixes a race condition when\n# ovs-monitor-ipsec is\
\ not fast enough to notice ipsec config change and\n# delete entries before\
\ it's being killed.\n# Since it's cleaning up all xfrm states and policies,\
\ it may cause slight\n# interruption until ipsec is restarted in case of external\
\ ipsec config.\n# We must do this before killing ovs-monitor-ipsec script,\
\ otherwise\n# preStop hook doesn't get a chance to run it because ovn-ipsec\
\ container\n# is abruptly terminated.\n# When east-west ipsec is not disabled,\
\ then do not flush xfrm states and\n# policies in order to maintain traffic\
\ flows during container restart.\nipsecflush() {\n if [ \"$(kubectl get networks.operator.openshift.io\
\ cluster -ojsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.ipsecConfig.mode}')\"\
\ != \"Full\" ] && \\\n [ \"$(kubectl get networks.operator.openshift.io\
\ cluster -ojsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.ipsecConfig}')\"\
\ != \"{}\" ]; then\n ip x s flush\n ip x p flush\n rm -f /etc/ipsec.d/openshift.conf\n\
\ # since pluto is on the host, we need to restart it after the flush\n \
\ chroot /proc/1/root ipsec restart\n fi\n}\n\n# Function to handle SIGTERM\n\
cleanup() {\n echo \"received SIGTERM, flushing ipsec config\"\n # Wait upto\
\ 15 seconds for ovs-monitor-ipsec process to terminate before\n # cleaning\
\ up ipsec entries.\n counter=0\n while kill -0 \"$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid\
\ 2>/dev/null)\"; do\n counter=$((counter+1))\n sleep 1\n if [ $counter\
\ -gt 15 ];\n then\n echo \"ovs-monitor-ipsec has not terminated after\
\ $counter seconds\"\n break\n fi\n done\n ipsecflush\n exit 0\n\
}\n\n# Trap SIGTERM and call cleanup function\ntrap cleanup SIGTERM\n\ncounter=0\n\
until [ -r /var/run/openvswitch/ovs-monitor-ipsec.pid ]; do\n counter=$((counter+1))\n\
\ sleep 1\n if [ $counter -gt 300 ];\n then\n echo \"ovs-monitor-ipsec\
\ has not started after $counter seconds\"\n exit 1\n fi\ndone\necho \"\
ovs-monitor-ipsec is started\"\n\n# Monitor the ovs-monitor-ipsec process.\n\
while kill -0 \"$(cat /var/run/openvswitch/ovs-monitor-ipsec.pid 2>/dev/null)\"\
; do\n sleep 1\ndone\n\n# Once the ovs-monitor-ipsec process terminates, execute\
\ the cleanup command.\necho \"ovs-monitor-ipsec is terminated, flushing ipsec\
\ config\"\nipsecflush\n\n# Continue running until SIGTERM is received (or exit\
\ naturally)\nwhile true; do\n sleep 1\ndone\n"
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imagePullPolicy: IfNotPresent
name: ovn-ipsec-cleanup
resources:
requests:
cpu: 10m
memory: 50Mi
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/ovn/
name: etc-ovn
- mountPath: /var/run
name: host-var-run
- mountPath: /etc
name: host-etc
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
dnsPolicy: Default
enableServiceLinks: true
hostNetwork: true
hostPID: true
imagePullSecrets:
- name: ovn-kubernetes-node-dockercfg-sds8g
initContainers:
- command:
- /bin/bash
- -c
- "#!/bin/bash\nset -exuo pipefail\n\n# When NETWORK_NODE_IDENTITY_ENABLE is true,\
\ use the per-node certificate to create a kubeconfig\n# that will be used to\
\ talk to the API\n\n\n# Wait for cert file\nretries=0\ntries=20\nkey_cert=\"\
/etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\"\nwhile [ ! -f \"${key_cert}\"\
\ ]; do\n (( retries += 1 ))\n if [[ \"${retries}\" -gt ${tries} ]]; then\n\
\ echo \"$(date -Iseconds) - ERROR - ${key_cert} not found\"\n return\
\ 1\n fi\n sleep 1\ndone\n\ncat << EOF > /var/run/ovnkube-kubeconfig\napiVersion:\
\ v1\nclusters:\n - cluster:\n certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n\
\ server: https://api-int.ci-op-9pmd0iim-3eaf1.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:6443\n\
\ name: default-cluster\ncontexts:\n - context:\n cluster: default-cluster\n\
\ namespace: default\n user: default-auth\n name: default-context\n\
current-context: default-context\nkind: Config\npreferences: {}\nusers:\n -\
\ name: default-auth\n user:\n client-certificate: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\
\ client-key: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem\n\
EOF\nexport KUBECONFIG=/var/run/ovnkube-kubeconfig\n\n\n# Every time we restart\
\ this container, we will create a new key pair if\n# we are close to key expiration\
\ or if we do not already have a signed key pair.\n#\n# Each node has a key\
\ pair which is used by OVS to encrypt/decrypt/authenticate traffic\n# between\
\ each node. The CA cert is used as the root of trust for all certs so we need\n\
# the CA to sign our certificate signing requests with the CA private key. In\
\ this way,\n# we can validate that any signed certificates that we receive\
\ from other nodes are\n# authentic.\necho \"Configuring IPsec keys\"\n\ncert_pem=/etc/openvswitch/keys/ipsec-cert.pem\n\
\n# If the certificate does not exist or it will expire in the next 6 months\n\
# (15770000 seconds), we will generate a new one.\nif ! openssl x509 -noout\
\ -dates -checkend 15770000 -in $cert_pem; then\n # We use the system-id as\
\ the CN for our certificate signing request. This\n # is a requirement by\
\ OVN.\n cn=$(ovs-vsctl --retry -t 60 get Open_vSwitch . external-ids:system-id\
\ | tr -d \"\\\"\")\n\n mkdir -p /etc/openvswitch/keys\n\n # Generate an SSL\
\ private key and use the key to create a certitificate signing request\n umask\
\ 077 && openssl genrsa -out /etc/openvswitch/keys/ipsec-privkey.pem 2048\n\
\ openssl req -new -text \\\n -extensions v3_req \\\n \
\ -addext \"subjectAltName = DNS:${cn}\" \\\n -subj \"/C=US/O=ovnkubernetes/OU=kind/CN=${cn}\"\
\ \\\n -key /etc/openvswitch/keys/ipsec-privkey.pem \\\n \
\ -out /etc/openvswitch/keys/ipsec-req.pem\n\n csr_64=$(base64 -w0 /etc/openvswitch/keys/ipsec-req.pem)\
\ # -w0 to avoid line-wrap\n\n # Request that our generated certificate signing\
\ request is\n # signed by the \"network.openshift.io/signer\" signer that\
\ is\n # implemented by the CNO signer controller. This will sign the\n #\
\ certificate signing request using the signer-ca which has been\n # set up\
\ by the OperatorPKI. In this way, we have a signed certificate\n # and our\
\ private key has remained private on this host.\n cat <<EOF | kubectl create\
\ -f -\n apiVersion: certificates.k8s.io/v1\n kind: CertificateSigningRequest\n\
\ metadata:\n generateName: ipsec-csr-$(hostname)-\n labels:\n k8s.ovn.org/ipsec-csr:\
\ $(hostname)\n spec:\n request: ${csr_64}\n signerName: network.openshift.io/signer\n\
\ usages:\n - ipsec tunnel\nEOF\n # Wait until the certificate signing\
\ request has been signed.\n counter=0\n until [ -n \"$(kubectl get csr -lk8s.ovn.org/ipsec-csr=\"\
$(hostname)\" --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1:].status.certificate}'\
\ 2>/dev/null)\" ]\n do\n counter=$((counter+1))\n sleep 1\n if [\
\ $counter -gt 60 ];\n then\n echo \"Unable to sign certificate\
\ after $counter seconds\"\n exit 1\n fi\n done\n\n # Decode\
\ the signed certificate.\n kubectl get csr -lk8s.ovn.org/ipsec-csr=\"$(hostname)\"\
\ --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1:].status.certificate}'\
\ | base64 -d | openssl x509 -outform pem -text -out $cert_pem\n\n # kubectl\
\ delete csr/$(hostname)\n\n # Get the CA certificate so we can authenticate\
\ peer nodes.\n openssl x509 -in /signer-ca/ca-bundle.crt -outform pem -text\
\ -out /etc/openvswitch/keys/ipsec-cacert.pem\nfi\n\n# Configure OVS with the\
\ relevant keys for this node. This is required by ovs-monitor-ipsec.\n#\n#\
\ Updating the certificates does not need to be an atomic operation as\n# the\
\ will get read and loaded into NSS by the ovs-monitor-ipsec process\n# which\
\ has not started yet.\novs-vsctl --retry -t 60 set Open_vSwitch . other_config:certificate=$cert_pem\
\ \\\n other_config:private_key=/etc/openvswitch/keys/ipsec-privkey.pem\
\ \\\n other_config:ca_cert=/etc/openvswitch/keys/ipsec-cacert.pem\n"
env:
- name: K8S_NODE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imagePullPolicy: IfNotPresent
name: ovn-keys
resources:
requests:
cpu: 10m
memory: 100Mi
securityContext:
privileged: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /etc/ovn/
name: etc-ovn
- mountPath: /var/run
name: host-var-run
- mountPath: /signer-ca
name: signer-ca
- mountPath: /etc/openvswitch
name: etc-openvswitch
- mountPath: /etc
name: host-etc
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
nodeName: ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7
nodeSelector:
kubernetes.io/os: linux
preemptionPolicy: PreemptLowerPriority
priority: 2000001000
priorityClassName: system-node-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: ovn-kubernetes-node
serviceAccountName: ovn-kubernetes-node
terminationGracePeriodSeconds: 10
tolerations:
- operator: Exists
volumes:
- hostPath:
path: /var/lib/ovn-ic/etc
type: ''
name: etc-ovn
- hostPath:
path: /var/log/openvswitch
type: DirectoryOrCreate
name: host-var-log-ovs
- configMap:
defaultMode: 420
name: signer-ca
name: signer-ca
- hostPath:
path: /var/lib/openvswitch/etc
type: DirectoryOrCreate
name: etc-openvswitch
- hostPath:
path: /var/run/multus/cni/net.d
type: ''
name: host-cni-netd
- hostPath:
path: /var/run
type: DirectoryOrCreate
name: host-var-run
- hostPath:
path: /var/lib
type: DirectoryOrCreate
name: host-var-lib
- hostPath:
path: /etc
type: Directory
name: host-etc
- hostPath:
path: /usr/sbin/ipsec
type: File
name: ipsec-bin
- hostPath:
path: /usr/libexec/ipsec
type: Directory
name: ipsec-lib
- name: kube-api-access-7rvbc
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- configMap:
items:
- key: service-ca.crt
path: service-ca.crt
name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
lastTransitionTime: '2025-02-13T14:54:05Z'
status: 'False'
type: PodReadyToStartContainers
- lastProbeTime: null
lastTransitionTime: '2025-02-13T14:54:05Z'
message: 'containers with incomplete status: [ovn-keys]'
reason: ContainersNotInitialized
status: 'False'
type: Initialized
- lastProbeTime: null
lastTransitionTime: '2025-02-13T14:54:05Z'
message: 'containers with unready status: [ovn-ipsec ovn-ipsec-cleanup]'
reason: ContainersNotReady
status: 'False'
type: Ready
- lastProbeTime: null
lastTransitionTime: '2025-02-13T14:54:05Z'
message: 'containers with unready status: [ovn-ipsec ovn-ipsec-cleanup]'
reason: ContainersNotReady
status: 'False'
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: '2025-02-13T14:54:05Z'
status: 'True'
type: PodScheduled
containerStatuses:
- image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imageID: ''
lastState: {}
name: ovn-ipsec
ready: false
restartCount: 0
started: false
state:
waiting:
reason: PodInitializing
volumeMounts:
- mountPath: /etc/cni/net.d
name: host-cni-netd
- mountPath: /var/run
name: host-var-run
- mountPath: /var/log/openvswitch/
name: host-var-log-ovs
- mountPath: /etc/openvswitch
name: etc-openvswitch
- mountPath: /var/lib
name: host-var-lib
- mountPath: /etc
name: host-etc
- mountPath: /usr/sbin/ipsec
name: ipsec-bin
- mountPath: /usr/libexec/ipsec
name: ipsec-lib
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
recursiveReadOnly: Disabled
- image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imageID: ''
lastState: {}
name: ovn-ipsec-cleanup
ready: false
restartCount: 0
started: false
state:
waiting:
reason: PodInitializing
volumeMounts:
- mountPath: /etc/ovn/
name: etc-ovn
- mountPath: /var/run
name: host-var-run
- mountPath: /etc
name: host-etc
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
recursiveReadOnly: Disabled
hostIP: 10.0.128.2
hostIPs:
- ip: 10.0.128.2
initContainerStatuses:
- image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e262b9ed22e74a3a8d7a345b775645267acfbcd571b510e1ace519cc2f658bf
imageID: ''
lastState: {}
name: ovn-keys
ready: false
restartCount: 0
started: false
state:
waiting:
reason: PodInitializing
volumeMounts:
- mountPath: /etc/ovn/
name: etc-ovn
- mountPath: /var/run
name: host-var-run
- mountPath: /signer-ca
name: signer-ca
- mountPath: /etc/openvswitch
name: etc-openvswitch
- mountPath: /etc
name: host-etc
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7rvbc
readOnly: true
recursiveReadOnly: Disabled
phase: Pending
podIP: 10.0.128.2
podIPs:
- ip: 10.0.128.2
qosClass: Burstable
startTime: '2025-02-13T14:54:05Z'
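From the YAML above the pod is still Pending: PodReadyToStartContainers is False and the ovn-keys init container is stuck waiting in PodInitializing, so it never reached the CSR-signing step its script performs. If a live cluster is available, one way to narrow this down; a minimal sketch (assumes `oc` access to the cluster, that the node hostname matches the node name used in the k8s.ovn.org/ipsec-csr label, and reuses the node/pod names from the spec above):

# Was an ipsec CSR for this node ever created and signed by network.openshift.io/signer?
oc get csr -l k8s.ovn.org/ipsec-csr=ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7

# Why did the sandbox / init container never start? Check kubelet and CRI-O logs on the node.
oc adm node-logs ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7 -u kubelet | grep ovn-ipsec-host-n4cpv
oc adm node-logs ci-op-9pmd0iim-3eaf1-dcw66-worker-a-d6sw7 -u crio | grep ovn-ipsec-host-n4cpv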
Actual results:
The ovn-ipsec-host pods on the two worker nodes stay Pending (0/2); the ovn-keys init container never starts and the pod remains in PodInitializing.
Expected results:
All ovn-ipsec-host pods reach Running (2/2) on every node, as they do on the control-plane nodes.
Additional info:
Please fill in the following template while reporting a bug and provide as much relevant information as possible. Doing so will give us the best chance to find a prompt resolution.
Affected Platforms:
Is it an:
- internal CI failure
- customer issue / SD
- internal Red Hat testing failure
If it is an internal Red Hat testing failure:
- Please share a kubeconfig or creds to a live cluster for the assignee to debug/troubleshoot, along with reproducer steps (especially if it's a telco use case like ICNI, secondary bridges or BM+kubevirt).
If it is a CI failure:
- Did it happen in different CI lanes? If so, please provide links to multiple failures with the same error instance.
- Did it happen in both sdn and ovn jobs? If so, please provide links to multiple failures with the same error instance.
- Did it happen on other platforms (e.g. aws, azure, gcp, baremetal, etc.)? If so, please provide links to multiple failures with the same error instance.
- When did the failure start happening? Please provide the UTC timestamp of the networking outage window from a sample failure run.
- If it's a connectivity issue:
- What is the srcNode, srcIP, srcNamespace and srcPodName?
- What is the dstNode, dstIP, dstNamespace and dstPodName?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
If it is a customer / SD issue:
- Provide enough information in the bug description that Engineering doesn’t need to read the entire case history.
- Don’t presume that Engineering has access to Salesforce.
- Do presume that Engineering will access attachments through supportshell.
- Describe what each relevant attachment is intended to demonstrate (failed pods, log errors, OVS issues, etc).
- Referring to the attached must-gather, sosreport or other attachment, please provide the following details:
- If the issue is in a customer namespace then provide a namespace inspect.
- If it is a connectivity issue:
- What is the srcNode, srcNamespace, srcPodName and srcPodIP?
- What is the dstNode, dstNamespace, dstPodName and dstPodIP?
- What is the traffic path? (examples: pod2pod, pod2external, pod2svc, pod2Node, etc.)
- Please provide the UTC timestamp of the networking outage window from the must-gather.
- Please provide tcpdump pcaps taken during the outage, filtered on the src/dst IPs provided above.
- If it is not a connectivity issue:
- Describe the steps taken so far to analyze the logs from networking components (cluster-network-operator, OVNK, SDN, openvswitch, ovs-configure, etc.) and the actual component where the issue was seen, based on the attached must-gather. Please attach snippets of the relevant logs around the window when the problem happened, if any.
- When showing the results from commands, include the entire command in the output.
- For OCPBUGS in which the issue has been identified, label with “sbr-triaged”
- For OCPBUGS in which the issue has not been identified and needs Engineering help for root cause, label with “sbr-untriaged”
- Do not set the priority, that is owned by Engineering and will be set when the bug is evaluated
- Note: bugs that do not meet these minimum standards will be closed with label “SDN-Jira-template”
- For guidance on using this template please see
OCPBUGS Template Training for Networking components
- blocks
-
CORENET-5581 Add ipsec upgrade ci job as mandatory lane
-
- Closed
-
- links to
-
RHEA-2024:11038
OpenShift Container Platform 4.19.z bug fix update