- Bug
- Resolution: Done
- Normal
- None
- 4.13
- None
Description of problem:
TargetDown and KubeletDown alerts are firing in a cluster that has been upgraded from 4.12 to 4.13. The bootstrap configmap was deleted from the kube-system namespace while the cluster was still on 4.12.
Version-Release number of selected component (if applicable):
How reproducible:
always
Steps to Reproduce:
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.26   True        False         106m    Cluster version is 4.12.26

$ oc get configmap -n kube-system bootstrap
NAME        DATA   AGE
bootstrap   1      118m

$ oc delete configmap -n kube-system bootstrap
configmap "bootstrap" deleted

$ oc get configmap -n kube-system bootstrap
Error from server (NotFound): configmaps "bootstrap" not found

$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.12-kube-1.26-api-removals-in-4.13":"true"}}' --type=merge
configmap/admin-acks patched

Upgrade from 4.12 to 4.13:

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.11   True        False         15m     Cluster version is 4.13.11

$ oc get secret metrics-client-certs -o yaml | grep crt | awk '{print $2}' | base64 -d > secret.pem
$ openssl x509 -in secret.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            d7:9e:f0:6a:dd:86:28:b9:b4:45:95:76:af:ca:8b:1d
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: OU = openshift, CN = kubelet-signer
        Validity
            Not Before: Oct 26 13:59:46 2023 GMT
            Not After : Oct 27 13:48:08 2023 GMT
        Subject: CN = system:serviceaccount:openshift-monitoring:prometheus-k8s
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:50:a9:d9:d4:36:66:29:fd:ca:51:22:b3:41:cf:
                    e1:bc:3e:63:bb:fe:84:82:3a:23:f2:74:99:76:d4:
                    a7:09:fa:df:a8:25:81:a3:fc:94:f3:6f:2b:85:3b:
                    90:9f:89:e1:01:68:6f:55:49:16:ee:7e:7a:e4:5a:
                    b5:d2:5d:9c:34
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier:
                E5:80:A2:00:E1:8D:9F:ED:78:5F:EF:17:BA:84:A6:1B:33:F4:32:B3
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        42:c8:e3:0f:7a:01:31:52:87:8d:e7:30:36:f4:c6:e1:d6:09:
        08:b1:d5:20:d9:1e:b3:e9:e6:65:a5:2c:fe:79:ad:f9:26:39:
        71:90:4f:16:8e:e0:05:df:2c:29:ec:f5:9f:a5:18:3f:3d:9e:
        5f:38:d6:9c:a7:f8:59:f6:c8:20:85:1b:26:1d:72:4a:dd:09:
        2d:7d:2f:1f:e4:1c:bc:df:e2:31:78:a9:f0:f8:de:11:cc:da:
        66:87:d5:c5:8b:9e:73:15:96:d3:b7:38:1b:e2:69:2d:6e:4d:
        4d:fc:1f:87:e5:e4:2e:5a:e2:6d:77:9a:2f:ce:60:7c:27:4c:
        25:b0:94:37:55:3e:c6:46:e0:b7:15:1a:2b:5a:f1:35:e6:8a:
        c1:19:ed:39:aa:79:ba:7b:1f:d6:2f:5a:b1:b2:e1:25:c3:b8:
        49:f8:0a:27:b6:67:87:43:b1:be:e2:48:48:ec:df:60:75:2d:
        7f:bb:3a:c1:0b:5a:5b:bc:38:29:bc:09:07:cd:2f:b8:8e:13:
        88:ee:71:45:fa:d5:f9:03:19:b0:66:5b:c6:24:4b:2d:31:8b:
        ac:d3:06:79:16:47:ab:d2:37:5c:f7:45:b2:66:1c:53:bb:0b:
        92:dd:a6:84:22:a4:fc:5c:35:76:64:1d:f5:16:0b:ce:29:4a:
        ef:9d:a1:cb

$ oc delete secret metrics-client-certs
secret "metrics-client-certs" deleted

$ oc get secrets metrics-client-certs
NAME                   TYPE     DATA   AGE
metrics-client-certs   Opaque   2      44s

$ oc get secret metrics-client-certs -o yaml | grep crt | awk '{print $2}' | base64 -d > secret.pem
$ openssl x509 -in secret.pem -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            6c:b3:48:68:f2:7e:27:77:13:bc:7b:0e:e0:f5:2b:6d
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = kube-csr-signer_@1698329032
        Validity
            Not Before: Oct 26 17:26:27 2023 GMT
            Not After : Oct 27 13:48:08 2023 GMT
        Subject: CN = system:serviceaccount:openshift-monitoring:prometheus-k8s
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:20:dd:2f:62:c3:1e:d6:bd:f1:dc:c8:f1:cb:22:
                    31:27:5c:30:d8:01:19:93:da:ed:20:2e:4f:5d:f9:
                    5d:02:95:f4:10:f0:79:bf:b6:be:b6:54:66:17:29:
                    a4:61:62:ec:8b:cf:6f:2f:01:4e:8f:71:3b:cb:9d:
                    24:84:c5:52:08
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier:
                77:7A:20:8D:D4:ED:A8:EE:B1:0B:94:87:C8:A7:8E:1C:C0:B3:3E:B3
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        28:a6:c9:e0:22:ad:af:15:04:89:23:7f:fc:92:41:c2:b3:29:
        5d:c6:26:ac:d5:6b:71:e1:44:45:37:90:35:04:9b:6e:7e:ae:
        43:6d:3e:8e:c3:5f:25:d7:79:62:6c:e6:7f:6a:10:a6:e4:39:
        1e:1f:34:6e:7d:63:02:33:a7:32:b6:29:bc:f6:c3:bf:e6:47:
        52:90:03:b8:45:30:46:a8:f2:32:c0:c2:31:68:2e:94:b4:4e:
        51:10:7f:e0:fd:ff:33:94:1a:d6:96:dd:0c:ce:29:90:c6:ca:
        fc:88:62:02:0e:4a:01:6f:48:54:0a:ba:1d:0f:76:df:ce:ff:
        8c:91:cf:16:93:e8:59:6d:ef:24:31:bc:53:cf:3f:df:bf:7b:
        96:ea:f6:30:de:98:32:47:be:46:8a:f8:e2:44:84:57:47:d4:
        2c:cc:1a:f9:7a:3f:a0:80:90:1f:eb:75:35:9d:84:2f:a5:da:
        ff:9e:5e:1f:5a:f4:cc:7a:68:15:d7:e7:2b:38:af:8b:d5:ed:
        00:7a:02:0a:07:8f:8d:8f:0b:50:9f:81:a2:d2:35:79:81:11:
        04:e5:e2:95:7b:8d:fa:a6:97:1c:0f:4f:ab:55:ce:50:1f:6f:
        d7:8d:76:06:20:39:d0:b3:ec:d8:a3:25:f0:db:88:74:6b:1f:
        61:84:8b:74

$ oc get configmap -n kube-system bootstrap
Error from server (NotFound): configmaps "bootstrap" not found

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=ALERTS{alertname='KubeletDown'}" 'http://localhost:9090/api/v1/query?' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "KubeletDown",
          "alertstate": "pending",
          "namespace": "kube-system",
          "severity": "critical"
        },
        "value": [
          1698342150.849,
          "1"
        ]
      }
    ]
  }
}

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=up{job='kubelet',metrics_path='/metrics'}" 'http://localhost:9090/api/v1/query?' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.90.25:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "worker-0.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.91.12:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "worker-1.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.91.159:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "worker-2.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.91.228:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "master-1.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.92.103:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "master-2.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "up",
          "endpoint": "https-metrics",
          "instance": "10.0.95.36:10250",
          "job": "kubelet",
          "metrics_path": "/metrics",
          "namespace": "kube-system",
          "node": "master-0.nigeltest.lab.upshift.rdu2.redhat.com",
          "service": "kubelet"
        },
        "value": [
          1698342160.462,
          "0"
        ]
      }
    ]
  }
}

$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=ALERTS{alertname='KubeletDown'}" 'http://localhost:9090/api/v1/query?' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "KubeletDown",
          "alertstate": "firing",
          "namespace": "kube-system",
          "severity": "critical"
        },
        "value": [
          1698343484.200,
          "1"
        ]
      }
    ]
  }
}
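The two certificate dumps in the steps above differ mainly in their Issuer field (kubelet-signer before the secret was deleted, kube-csr-signer_@1698329032 after it was recreated). The following is a minimal sketch for making that comparison without reading the full dump. The function name classify_issuer is hypothetical and not part of any OpenShift tooling, and the report itself does not state which issuer is the expected one, so the sketch only labels what it finds.

```shell
#!/bin/sh
# classify_issuer (hypothetical helper): reads the output of
# `openssl x509 -noout -issuer` on stdin and prints which of the two
# signers seen in this report issued the certificate.
classify_issuer() {
    case "$(cat)" in
        *kubelet-signer*)  echo "kubelet-signer" ;;
        *kube-csr-signer*) echo "kube-csr-signer" ;;
        *)                 echo "unknown" ;;
    esac
}

# Example against a live cluster, reusing the extraction pipeline from the
# steps above (left commented out because it needs cluster access):
# oc get secret metrics-client-certs -o yaml | grep crt | awk '{print $2}' \
#   | base64 -d | openssl x509 -noout -issuer | classify_issuer
```

Running the commented pipeline before and after deleting the secret should make a change of signer visible in a single word of output, assuming the secret is read from the same project as in the steps above.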
Actual results:
KubeletDown alerts fire in the cluster because the kubelet targets have disappeared.
Expected results:
The kubelet targets remain up after the upgrade.
Additional info:
https://issues.redhat.com/browse/OCPBUGS-4521
https://issues.redhat.com/browse/OCPBUGS-4521?focusedId=21376346&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-21376346
https://issues.redhat.com/browse/OCPBUGS-17926
https://issues.redhat.com/browse/OCPBUGS-5888
is related to:
- OCPBUGS-4521 all kubelet targets are down after a few hours (Closed)
- OCPBUGS-17926 Kubelet targets down after upgrade from 4.12 to 4.13 (Closed)