Bug
Status: CLOSED
Resolution: Done-Errata
Severity: Minor
Priority: Medium
Environment: cnv-netperf-20 230122
+++ This bug was initially created as a clone of Bug #2156902 +++
Target release: v4.12.1
Description of problem:
When a checkup encounters a setup failure, the components created by the job are not deleted.
Version-Release number of selected component (if applicable):
NAME                                       DISPLAY                    VERSION   REPLACES                                   PHASE
kubevirt-hyperconverged-operator.v4.12.0   OpenShift Virtualization   4.12.0    kubevirt-hyperconverged-operator.v4.11.1   Succeeded
Client Version: 4.12.0-rc.6
Kustomize Version: v4.5.7
Server Version: 4.12.0-rc.6
Kubernetes Version: v1.25.4+77bec7a
How reproducible:
Create a checkup ConfigMap with a nonexistent node specified as the source node. The first virt-launcher pod stays in the Pending state, never reaches Running, and the actual checkup never starts.
Steps to Reproduce:
1. Create a namespace:
oc new-project test-latency
2. Create a Bridge with this yaml:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br10
spec:
  desiredState:
    interfaces:
      - bridge:
          options:
            stp:
              enabled: false
          port:
            - name: ens9
        ipv4:
          auto-dns: true
          dhcp: false
          enabled: false
        ipv6:
          auto-dns: true
          autoconf: false
          dhcp: false
          enabled: false
        name: br10
        state: up
        type: linux-bridge
  nodeSelector:
    node-role.kubernetes.io/worker: ''
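Not part of the original steps, but a quick way to confirm the policy was actually applied before continuing (short resource names assumed from the kubernetes-nmstate operator):
# Policy status should eventually report SuccessfullyConfigured
oc get nncp br10
# Per-node enactments created for this policy
oc get nnce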
3. Create a NAD with this yaml:
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: bridge-network-nad
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "br10",
      "plugins": [
      ]
    }
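Optionally confirm the NAD exists in the test namespace (command assumed, not from the original report):
oc get network-attachment-definitions bridge-network-nad -n test-latency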
4. Create a service-account, role, and role-binding:
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vm-latency-checkup-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubevirt-vm-latency-checker
rules:
  - apiGroups: ["kubevirt.io"]
    resources: ["virtualmachineinstances"]
    verbs: ["get", "create", "delete"]
  - apiGroups: ["subresources.kubevirt.io"]
    resources: ["virtualmachineinstances/console"]
    verbs: ["get"]
  - apiGroups: ["k8s.cni.cncf.io"]
    resources: ["network-attachment-definitions"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubevirt-vm-latency-checker
subjects:
  - kind: ServiceAccount
    name: vm-latency-checkup-sa
roleRef:
  kind: Role
  name: kubevirt-vm-latency-checker
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kiagnose-configmap-access
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kiagnose-configmap-access
subjects:
  - kind: ServiceAccount
    name: vm-latency-checkup-sa
roleRef:
  kind: Role
  name: kiagnose-configmap-access
  apiGroup: rbac.authorization.k8s.io
EOF
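A sanity check (not in the original report) that the service account received the intended permissions, using the names from the manifests above:
oc auth can-i create virtualmachineinstances.kubevirt.io --as=system:serviceaccount:test-latency:vm-latency-checkup-sa -n test-latency
oc auth can-i update configmaps --as=system:serviceaccount:test-latency:vm-latency-checkup-sa -n test-latency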
5. Create the ConfigMap with the "spec.param.max_desired_latency_milliseconds" field set to 0 and "spec.param.source_node" set to a nonexistent node:
cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubevirt-vm-latency-checkup-config
data:
  spec.timeout: 5m
  spec.param.network_attachment_definition_namespace: "manual-latency-check"
  spec.param.network_attachment_definition_name: "bridge-network-nad"
  spec.param.max_desired_latency_milliseconds: "0"
  spec.param.sample_duration_seconds: "5"
  spec.param.source_node: non-existent-node
  spec.param.target_node: cnv-qe-14.cnvqe.lab.eng.rdu2.redhat.com
EOF
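To double-check the parameters before running the job (command assumed; spec.param.source_node is the nonexistent node that triggers the setup failure):
oc get configmap kubevirt-vm-latency-checkup-config -n test-latency -o yaml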
6. Create a job:
cat <<EOF | kubectl apply -f -
---
apiVersion: batch/v1
kind: Job
metadata:
  name: kubevirt-vm-latency-checkup
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: vm-latency-checkup-sa
      restartPolicy: Never
      containers:
        - name: vm-latency-checkup
          image: brew.registry.redhat.io/rh-osbs/container-native-virtualization-vm-network-latency-checkup:v4.12.0
          securityContext:
            runAsUser: 1000
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: "RuntimeDefault"
          env:
            - name: CONFIGMAP_NAMESPACE
              value: test-latency
            - name: CONFIGMAP_NAME
              value: kubevirt-vm-latency-checkup-config
EOF
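The setup error can be watched from the checkup pod's log (command assumed; the job name is the one created in step 6, while the output below happens to come from a run that used a different job name). Once the run ends, the checkup is expected to write its status (e.g. status.succeeded, status.failureReason) back into the same ConfigMap shown in step 5.
oc logs -f job/kubevirt-vm-latency-checkup -n test-latency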
Actual results:
When the job is deleted, the pods and VMIs are not deleted:
oc get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/latency-nonexistent-node-job-qt4wk         0/1     Error     0          74m
pod/virt-launcher-latency-check-source-4fqgk   0/2     Pending   0          74m
pod/virt-launcher-latency-check-target-smj9r   2/2     Running   0          74m

NAME                                     COMPLETIONS   DURATION   AGE
job.batch/latency-nonexistent-node-job   0/1           74m        74m

NAME                                                      AGE   PHASE        IP               NODENAME                                  READY
virtualmachineinstance.kubevirt.io/latency-check-source   74m   Scheduling                                                              False
virtualmachineinstance.kubevirt.io/latency-check-target   74m   Running      192.168.100.20   cnv-qe-14.cnvqe.lab.eng.rdu2.redhat.com   True
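The source virt-launcher pod stays Pending because the requested source node does not exist; the scheduling failure shows up in the pod events (pod name taken from the output above, namespace assumed from step 1):
oc describe pod virt-launcher-latency-check-source-4fqgk -n test-latency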
Expected results:
All the resources created by the Job are deleted when the Job is deleted.
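Until the teardown is fixed, the leftover resources apparently have to be removed by hand; a sketch using the names from the output above (namespace assumed from step 1; deleting the VMIs also removes their virt-launcher pods):
oc delete job latency-nonexistent-node-job -n test-latency
oc delete vmi latency-check-source latency-check-target -n test-latency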
is blocked by: CNV-23732 [2156902] VM latency checkup - Checkup not performing a teardown in case of setup failure (Closed)