OpenShift Bugs / OCPBUGS-22450

Kubelet targets down after upgrade from 4.12 to 4.13 - in cluster with missing bootstrap configmap

Details

    • Bug
    • Resolution: Done
    • Normal
    • None
    • 4.13
    • Monitoring
    • None
    • Moderate
    • No
    • MON Sprint 244
    • 1
    • False

    Description

      Description of problem:

      TargetDown and KubeletDown alerts are firing in a cluster that has been upgraded from 4.12 to 4.13.
      
      The bootstrap configmap was deleted from the kube-system namespace while the cluster was still on 4.12.

      Version-Release number of selected component (if applicable):

      4.12.26 upgraded to 4.13.11 (per the oc get clusterversion output in the steps to reproduce)

      How reproducible:

      always

      Steps to Reproduce:

      $ oc get clusterversion 
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.26   True        False         106m    Cluster version is 4.12.26
      $ oc get configmap -n kube-system bootstrap 
      NAME        DATA   AGE
      bootstrap   1      118m
      $ oc delete configmap -n kube-system bootstrap 
      configmap "bootstrap" deleted
      $ oc get configmap -n kube-system bootstrap 
      Error from server (NotFound): configmaps "bootstrap" not found
      $ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.12-kube-1.26-api-removals-in-4.13":"true"}}' --type=merge 
      configmap/admin-acks patched

      Upgrade from 4.12 to 4.13

      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.13.11   True        False         15m     Cluster version is 4.13.11
      $ oc get secret metrics-client-certs -o yaml | grep crt | awk '{print $2}' | base64 -d > secret.pem 
      $ openssl x509 -in secret.pem -text -noout 
      Certificate:
          Data:
              Version: 3 (0x2)
              Serial Number:
                  d7:9e:f0:6a:dd:86:28:b9:b4:45:95:76:af:ca:8b:1d
              Signature Algorithm: sha256WithRSAEncryption
              Issuer: OU = openshift, CN = kubelet-signer
              Validity
                  Not Before: Oct 26 13:59:46 2023 GMT
                  Not After : Oct 27 13:48:08 2023 GMT
              Subject: CN = system:serviceaccount:openshift-monitoring:prometheus-k8s
              Subject Public Key Info:
                  Public Key Algorithm: id-ecPublicKey
                      Public-Key: (256 bit)
                      pub:
                          04:50:a9:d9:d4:36:66:29:fd:ca:51:22:b3:41:cf:
                          e1:bc:3e:63:bb:fe:84:82:3a:23:f2:74:99:76:d4:
                          a7:09:fa:df:a8:25:81:a3:fc:94:f3:6f:2b:85:3b:
                          90:9f:89:e1:01:68:6f:55:49:16:ee:7e:7a:e4:5a:
                          b5:d2:5d:9c:34
                      ASN1 OID: prime256v1
                      NIST CURVE: P-256
              X509v3 extensions:
                  X509v3 Key Usage: critical
                      Digital Signature, Key Encipherment
                  X509v3 Extended Key Usage: 
                      TLS Web Client Authentication
                  X509v3 Basic Constraints: critical
                      CA:FALSE
                  X509v3 Authority Key Identifier: 
                      E5:80:A2:00:E1:8D:9F:ED:78:5F:EF:17:BA:84:A6:1B:33:F4:32:B3
          Signature Algorithm: sha256WithRSAEncryption
          Signature Value:
              42:c8:e3:0f:7a:01:31:52:87:8d:e7:30:36:f4:c6:e1:d6:09:
              08:b1:d5:20:d9:1e:b3:e9:e6:65:a5:2c:fe:79:ad:f9:26:39:
              71:90:4f:16:8e:e0:05:df:2c:29:ec:f5:9f:a5:18:3f:3d:9e:
              5f:38:d6:9c:a7:f8:59:f6:c8:20:85:1b:26:1d:72:4a:dd:09:
              2d:7d:2f:1f:e4:1c:bc:df:e2:31:78:a9:f0:f8:de:11:cc:da:
              66:87:d5:c5:8b:9e:73:15:96:d3:b7:38:1b:e2:69:2d:6e:4d:
              4d:fc:1f:87:e5:e4:2e:5a:e2:6d:77:9a:2f:ce:60:7c:27:4c:
              25:b0:94:37:55:3e:c6:46:e0:b7:15:1a:2b:5a:f1:35:e6:8a:
              c1:19:ed:39:aa:79:ba:7b:1f:d6:2f:5a:b1:b2:e1:25:c3:b8:
              49:f8:0a:27:b6:67:87:43:b1:be:e2:48:48:ec:df:60:75:2d:
              7f:bb:3a:c1:0b:5a:5b:bc:38:29:bc:09:07:cd:2f:b8:8e:13:
              88:ee:71:45:fa:d5:f9:03:19:b0:66:5b:c6:24:4b:2d:31:8b:
              ac:d3:06:79:16:47:ab:d2:37:5c:f7:45:b2:66:1c:53:bb:0b:
              92:dd:a6:84:22:a4:fc:5c:35:76:64:1d:f5:16:0b:ce:29:4a:
              ef:9d:a1:cb
      $ oc delete secret metrics-client-certs  
      secret "metrics-client-certs" deleted
      $ oc get secrets metrics-client-certs 
      NAME                   TYPE     DATA   AGE
      metrics-client-certs   Opaque   2      44s
      $ oc get secret metrics-client-certs -o yaml | grep crt | awk '{print $2}' | base64 -d > secret.pem 
      $ openssl x509 -in secret.pem -text -noout 
      Certificate:
          Data:
              Version: 3 (0x2)
              Serial Number:
                  6c:b3:48:68:f2:7e:27:77:13:bc:7b:0e:e0:f5:2b:6d
              Signature Algorithm: sha256WithRSAEncryption
              Issuer: CN = kube-csr-signer_@1698329032
              Validity
                  Not Before: Oct 26 17:26:27 2023 GMT
                  Not After : Oct 27 13:48:08 2023 GMT
              Subject: CN = system:serviceaccount:openshift-monitoring:prometheus-k8s
              Subject Public Key Info:
                  Public Key Algorithm: id-ecPublicKey
                      Public-Key: (256 bit)
                      pub:
                          04:20:dd:2f:62:c3:1e:d6:bd:f1:dc:c8:f1:cb:22:
                          31:27:5c:30:d8:01:19:93:da:ed:20:2e:4f:5d:f9:
                          5d:02:95:f4:10:f0:79:bf:b6:be:b6:54:66:17:29:
                          a4:61:62:ec:8b:cf:6f:2f:01:4e:8f:71:3b:cb:9d:
                          24:84:c5:52:08
                      ASN1 OID: prime256v1
                      NIST CURVE: P-256
              X509v3 extensions:
                  X509v3 Key Usage: critical
                      Digital Signature, Key Encipherment
                  X509v3 Extended Key Usage: 
                      TLS Web Client Authentication
                  X509v3 Basic Constraints: critical
                      CA:FALSE
                  X509v3 Authority Key Identifier: 
                      77:7A:20:8D:D4:ED:A8:EE:B1:0B:94:87:C8:A7:8E:1C:C0:B3:3E:B3
          Signature Algorithm: sha256WithRSAEncryption
          Signature Value:
              28:a6:c9:e0:22:ad:af:15:04:89:23:7f:fc:92:41:c2:b3:29:
              5d:c6:26:ac:d5:6b:71:e1:44:45:37:90:35:04:9b:6e:7e:ae:
              43:6d:3e:8e:c3:5f:25:d7:79:62:6c:e6:7f:6a:10:a6:e4:39:
              1e:1f:34:6e:7d:63:02:33:a7:32:b6:29:bc:f6:c3:bf:e6:47:
              52:90:03:b8:45:30:46:a8:f2:32:c0:c2:31:68:2e:94:b4:4e:
              51:10:7f:e0:fd:ff:33:94:1a:d6:96:dd:0c:ce:29:90:c6:ca:
              fc:88:62:02:0e:4a:01:6f:48:54:0a:ba:1d:0f:76:df:ce:ff:
              8c:91:cf:16:93:e8:59:6d:ef:24:31:bc:53:cf:3f:df:bf:7b:
              96:ea:f6:30:de:98:32:47:be:46:8a:f8:e2:44:84:57:47:d4:
              2c:cc:1a:f9:7a:3f:a0:80:90:1f:eb:75:35:9d:84:2f:a5:da:
              ff:9e:5e:1f:5a:f4:cc:7a:68:15:d7:e7:2b:38:af:8b:d5:ed:
              00:7a:02:0a:07:8f:8d:8f:0b:50:9f:81:a2:d2:35:79:81:11:
              04:e5:e2:95:7b:8d:fa:a6:97:1c:0f:4f:ab:55:ce:50:1f:6f:
              d7:8d:76:06:20:39:d0:b3:ec:d8:a3:25:f0:db:88:74:6b:1f:
              61:84:8b:74
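
The two certificate dumps above differ in their Issuer field (kubelet-signer before the secret was deleted, kube-csr-signer_@1698329032 after it was recreated), while the Not After date is unchanged. The following is a minimal sketch, not part of the original report, that reduces a saved `openssl x509 -text` dump to those fields so the two certificates can be compared at a glance. The function name cert_summary is hypothetical.

```shell
# Reduce `openssl x509 -text` output (read from stdin) to the issuer and
# the validity window. All other lines are dropped.
cert_summary() {
  sed -n \
    -e 's/^[[:space:]]*Issuer:[[:space:]]*/issuer: /p' \
    -e 's/^[[:space:]]*Not Before:[[:space:]]*/not_before: /p' \
    -e 's/^[[:space:]]*Not After[[:space:]]*:[[:space:]]*/not_after: /p'
}

# Example use against the file written in the steps above:
#   openssl x509 -in secret.pem -text -noout | cert_summary
```

When the PEM file itself is at hand, `openssl x509 -in secret.pem -noout -issuer -dates` prints the same information directly. The filter is mainly useful for dumps that were already captured as text.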
      
      $ oc get configmap -n kube-system bootstrap 
      Error from server (NotFound): configmaps "bootstrap" not found
      
      $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=ALERTS{alertname='KubeletDown'}" 'http://localhost:9090/api/v1/query?' | jq 
      {
        "status": "success",
        "data": {
          "resultType": "vector",
          "result": [
            {
              "metric": {
                "__name__": "ALERTS",
                "alertname": "KubeletDown",
                "alertstate": "pending",
                "namespace": "kube-system",
                "severity": "critical"
              },
              "value": [
                1698342150.849,
                "1"
              ]
            }
          ]
        }
      }
      $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=up{job='kubelet',metrics_path='/metrics'}" 'http://localhost:9090/api/v1/query?' | jq 
      {
        "status": "success",
        "data": {
          "resultType": "vector",
          "result": [
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.90.25:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "worker-0.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            },
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.91.12:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "worker-1.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            },
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.91.159:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "worker-2.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            },
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.91.228:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "master-1.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            },
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.92.103:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "master-2.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            },
            {
              "metric": {
                "__name__": "up",
                "endpoint": "https-metrics",
                "instance": "10.0.95.36:10250",
                "job": "kubelet",
                "metrics_path": "/metrics",
                "namespace": "kube-system",
                "node": "master-0.nigeltest.lab.upshift.rdu2.redhat.com",
                "service": "kubelet"
              },
              "value": [
                1698342160.462,
                "0"
              ]
            }
          ]
        }
      }
      
      $ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s --data-urlencode "query=ALERTS{alertname='KubeletDown'}" 'http://localhost:9090/api/v1/query?' | jq 
      {
        "status": "success",
        "data": {
          "resultType": "vector",
          "result": [
            {
              "metric": {
                "__name__": "ALERTS",
                "alertname": "KubeletDown",
                "alertstate": "firing",
                "namespace": "kube-system",
                "severity": "critical"
              },
              "value": [
                1698343484.200,
                "1"
              ]
            }
          ]
        }
      }
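
The up query in the transcript returns one series per kubelet endpoint, and every sample value is "0". The following is a minimal sketch, not part of the original report, for listing the affected nodes from such a response. The function name down_nodes is hypothetical, and the awk parsing assumes the jq pretty-printed layout shown in the transcript (one key or value per line).

```shell
# Print the node label of every series whose sample value is "0".
# Input: jq pretty-printed Prometheus query response on stdin.
down_nodes() {
  awk -F'"' '
    # Remember the node label of the series currently being read.
    /"node":/ { node = $4 }
    # A line holding only "0" is the sample value of a down target.
    /^[ \t]*"0"[ \t]*$/ && node != "" { print node; node = "" }
  '
}

# Example use, reusing the query from the transcript:
#   oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- \
#     curl -s --data-urlencode "query=up{job='kubelet',metrics_path='/metrics'}" \
#     'http://localhost:9090/api/v1/query?' | jq | down_nodes
```

Since jq is already used in the transcript, a filter such as `jq -r '.data.result[] | select(.value[1] == "0") | .metric.node'` should give the same list without relying on the line layout.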
        

      Actual results:

      KubeletDown alerts fire in the cluster because the kubelet targets are down (up reports 0 for every node).

      Expected results:

      The kubelet targets remain up after the upgrade.

      Additional info:

      https://issues.redhat.com/browse/OCPBUGS-4521
      
      https://issues.redhat.com/browse/OCPBUGS-4521?focusedId=21376346&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-21376346
      
      https://issues.redhat.com/browse/OCPBUGS-17926
      
      https://issues.redhat.com/browse/OCPBUGS-5888 
      
      

            People

              rh-ee-amrini Ayoub Mrini
              rhn-support-nigsmith Nigel Smith
              Junqi Zhao Junqi Zhao