  1. OpenShift Bugs
  2. OCPBUGS-2071

revert "force cert rotation every couple days for development" in 4.12


    • Bug
    • Resolution: Done
    • Critical
    • 4.12.z
    • 4.12
    • kube-apiserver
    • None
    • Approved
    • False
    • NA

      Description of problem:

      revert "force cert rotation every couple days for development" in 4.12
      
      We want short expiry times during development and long expiry times when we ship.
      
      --- Additional comment from Eric Paris on 2020-04-02 19:57:29 CEST ---
      
      This bug has been set to target the 4.5.0 release without specifying a severity. As part of triage, when determining the priority of bugs, a severity should be specified. Since these bugs have not been properly triaged, I am removing the target release. Teams will need to add a severity before deferring these bugs again.
      
      --- Additional comment from Michal Fojtik on 2020-05-12 12:45:25 CEST ---
      
      This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.
      
      As such, we're marking this bug as "LifecycleStale" and decreasing the severity. 
      
      If you have further information on the current state of the bug, please update it, otherwise this bug will be automatically closed in 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.
      
      --- Additional comment from Standa Laznicka on 2020-05-12 14:53:12 CEST ---
      
      you don't really want to close this
      
      --- Additional comment from Stefan Schimanski on 2020-05-19 13:11:00 CEST ---
      
      Waiting for master to open. We will fix it then on the release branch.
      
      --- Additional comment from Stefan Schimanski on 2020-06-18 12:23:34 CEST ---
      
      Will be done when 4.6 branches from master.
      
      --- Additional comment from Michal Fojtik on 2020-07-09 14:46:02 CEST ---
      
      Stefan is PTO, adding UpcomingSprint to his bugs to fulfill the duty.
      
      --- Additional comment from Michal Fojtik on 2020-08-24 15:12:08 CEST ---
      
      This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.
      
      --- Additional comment from Michal Fojtik on 2020-08-31 15:59:33 CEST ---
      
      This bug hasn't had any activity 7 days after it was marked as LifecycleStale, so we are closing this bug as WONTFIX. If you consider this bug still valuable, please reopen it or create a new bug.
      
      --- Additional comment from Michal Fojtik on 2020-08-31 17:00:25 CEST ---
      
      The LifecycleStale keyword was removed because the bug got commented on recently.
      The bug assignee was notified.
      
      --- Additional comment from Stefan Schimanski on 2020-09-11 13:00:27 CEST ---
      
      This is waiting for Eric Paris to stop fast forwarding release-4.6 from master.
      
      --- Additional comment from Michal Fojtik on 2020-10-30 11:12:07 CET ---
      
      This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
      
      --- Additional comment from Nick Stielau on 2021-01-20 18:49:09 CET ---
      
      Can we get some context on why this is blocker+?  Would we further delay the release if we don't get a fix in for this?
      
      --- Additional comment from Stefan Schimanski on 2021-03-16 17:28:08 CET ---
      
      --- Additional comment from Eric Paris on 2021-06-08 14:00:16 CEST ---
      
      This bug sets blocker+ without setting a Target Release. This is an invalid state as it is impossible to determine what is being blocked. Please be sure to set Priority, Severity, and Target Release before you attempt to set blocker+
      
      --- Additional comment from Michal Fojtik on 2021-06-10 10:49:36 CEST ---
      
      This is a blocker? until we have Target Release 4.9 (it is a blocker+ for 4.9).
      
      --- Additional comment from Wally on 2021-06-11 15:14:26 CEST ---
      
      Setting blocker- until next week to clear reports heading to code freeze.  Will reset once 4.9 opens.
      
      --- Additional comment from Wally on 2021-08-31 19:26:13 UTC ---
      
      Setting blocker- until next week to clear reports heading to code freeze.  Will reset once 4.10 opens.
      
      --- Additional comment from Michal Fojtik on 2022-02-03 21:53:15 UTC ---
      
      ** A NOTE ABOUT USING URGENT **
      
      This BZ has been set to urgent severity and priority. When a BZ is marked urgent priority Engineers are asked to stop whatever they are doing, putting everything else on hold.
      Please be prepared to have reasonable justification ready to discuss, and ensure your own and engineering management are aware and agree this BZ is urgent. Keep in mind, urgent bugs are very expensive and have maximal management visibility.
      
      NOTE: This bug was automatically assigned to an engineering manager with the severity reset to *unspecified* until the emergency is vetted and confirmed. Please do not manually override the severity.
      
      ** INFORMATION REQUIRED **
      
      Please answer these questions before escalation to engineering:
      
      1. Has a link to must-gather output been provided in this BZ? We cannot work without it. If must-gather fails to run, attach all relevant logs and provide the must-gather error message.
      2. Give the output of "oc get clusteroperators -o yaml".
      3. In case of degraded/unavailable operators, have all their logs and the logs of the operands been analyzed [yes/no]
      4. List the top 5 relevant errors from the logs of the operators and operands in (3).
      5. Order the list of degraded/unavailable operators according to which is likely the cause of the failure of the other, root-cause at the top.
      6. Explain why (5) is likely the right order and list the information used for that assessment.
      7. Explain why Engineering is necessary to make progress.
      
      --- Additional comment from Wally on 2022-02-09 20:11:25 UTC ---
      
      Setting blocker- for now but will add reminder and keep in my queue for visibility.
      
      --- Additional comment from Red Hat Bugzilla on 2022-05-09 08:32:21 UTC ---
      
      Account disabled by LDAP Audit for extended failure
      
      --- Additional comment from OpenShift Automated Release Tooling on 2022-06-24 01:06:13 UTC ---
      
      Elliott changed bug status from MODIFIED to ON_QA.
      This bug is expected to ship in the next 4.11 release.
      
      --- Additional comment from Ke Wang on 2022-06-24 15:24:03 UTC ---
      
      To verify the bug, refer to https://bugzilla.redhat.com/show_bug.cgi?id=1921139#c6
      
      --- Additional comment from OpenShift BugZilla Robot on 2022-06-25 12:40:12 UTC ---
      
      Bugfix included in accepted release 4.11.0-0.nightly-2022-06-25-081133
      Bug will not be automatically moved to VERIFIED for the following reasons:
      - PR openshift/cluster-kube-apiserver-operator#1307 not approved by QA contact
      
      This bug must now be manually moved to VERIFIED by dpunia@redhat.com
      
      --- Additional comment from Deepak Punia on 2022-06-27 08:20:33 UTC ---
      
      Below are the steps to verify this bug:
      
      # oc adm release info --commits registry.ci.openshift.org/ocp/release:4.11.0-0.nightly-2022-06-25-081133|grep -i cluster-kube-apiserver-operator
        cluster-kube-apiserver-operator                https://github.com/openshift/cluster-kube-apiserver-operator                7764681777edfa3126981a0a1d390a6060a840a3
      
      # git log --date local --pretty="%h %an %cd - %s" 776468 |grep -i "#1307"
      08973b820 openshift-ci[bot] Thu Jun 23 22:40:08 2022 - Merge pull request #1307 from tkashem/revert-cert-rotation
      
      # oc get clusterversions.config.openshift.io 
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.0-0.nightly-2022-06-25-081133   True        False         64m     Cluster version is 4.11.0-0.nightly-2022-06-25-081133
      
      $ cat scripts/check_secret_expiry.sh
      #!/usr/bin/env bash
      # Print the validity dates of the TLS cert for each "<namespace> <secret>" line in $1.
      FILE="$1"
      if [ ! -f "$FILE" ]; then
        echo "usage: $0 <file of '<namespace> <secret>' lines>" >&2
        exit 1
      fi
      # Read the list on fd 3 so that oc cannot consume it via stdin.
      while read -r NS SECRET _ <&3 || [ -n "$NS" ]; do
        # Skip blank lines and comment lines.
        case "$NS" in ''|'#'*) continue ;; esac
        rm -f tls.crt
        oc extract "secret/$SECRET" -n "$NS" --confirm > /dev/null
        echo "Check cert dates of $SECRET in project $NS:"
        openssl x509 -noout -dates -in tls.crt; echo
      done 3< "$FILE"
      
      $ cat certs.txt
      openshift-kube-controller-manager-operator csr-signer-signer
      openshift-kube-controller-manager-operator csr-signer
      openshift-kube-controller-manager kube-controller-manager-client-cert-key
      openshift-kube-apiserver-operator aggregator-client-signer
      openshift-kube-apiserver aggregator-client
      openshift-kube-apiserver external-loadbalancer-serving-certkey
      openshift-kube-apiserver internal-loadbalancer-serving-certkey
      openshift-kube-apiserver service-network-serving-certkey
      openshift-config-managed kube-controller-manager-client-cert-key
      openshift-config-managed kube-scheduler-client-cert-key
      openshift-kube-scheduler kube-scheduler-client-cert-key
      
      Checking the certs: they have one-day expiry times, which is as expected.
      # ./check_secret_expiry.sh certs.txt
      Check cert dates of csr-signer-signer in project openshift-kube-controller-manager-operator:
      notBefore=Jun 27 04:41:38 2022 GMT
      notAfter=Jun 28 04:41:38 2022 GMT
      
      Check cert dates of csr-signer in project openshift-kube-controller-manager-operator:
      notBefore=Jun 27 04:52:21 2022 GMT
      notAfter=Jun 28 04:41:38 2022 GMT
      
      Check cert dates of kube-controller-manager-client-cert-key in project openshift-kube-controller-manager:
      notBefore=Jun 27 04:52:26 2022 GMT
      notAfter=Jul 27 04:52:27 2022 GMT
      
      Check cert dates of aggregator-client-signer in project openshift-kube-apiserver-operator:
      notBefore=Jun 27 04:41:37 2022 GMT
      notAfter=Jun 28 04:41:37 2022 GMT
      
      Check cert dates of aggregator-client in project openshift-kube-apiserver:
      notBefore=Jun 27 04:52:26 2022 GMT
      notAfter=Jun 28 04:41:37 2022 GMT
      
      Check cert dates of external-loadbalancer-serving-certkey in project openshift-kube-apiserver:
      notBefore=Jun 27 04:52:26 2022 GMT
      notAfter=Jul 27 04:52:27 2022 GMT
      
      Check cert dates of internal-loadbalancer-serving-certkey in project openshift-kube-apiserver:
      notBefore=Jun 27 04:52:49 2022 GMT
      notAfter=Jul 27 04:52:50 2022 GMT
      
      Check cert dates of service-network-serving-certkey in project openshift-kube-apiserver:
      notBefore=Jun 27 04:52:28 2022 GMT
      notAfter=Jul 27 04:52:29 2022 GMT
      
      Check cert dates of kube-controller-manager-client-cert-key in project openshift-config-managed:
      notBefore=Jun 27 04:52:26 2022 GMT
      notAfter=Jul 27 04:52:27 2022 GMT
      
      Check cert dates of kube-scheduler-client-cert-key in project openshift-config-managed:
      notBefore=Jun 27 04:52:47 2022 GMT
      notAfter=Jul 27 04:52:48 2022 GMT
      
      Check cert dates of kube-scheduler-client-cert-key in project openshift-kube-scheduler:
      notBefore=Jun 27 04:52:47 2022 GMT
      notAfter=Jul 27 04:52:48 2022 GMT
      # 
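To read the lifetimes off the output above without doing date arithmetic by hand, something like the following could be used. It is only a sketch: `cert_lifetime_hours` is a hypothetical helper, not part of the verification scripts in this bug, and it assumes GNU `date`, which can parse the timestamp format printed by `openssl x509 -dates`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original scripts).
# usage: cert_lifetime_hours "<notBefore value>" "<notAfter value>"
# Prints the validity period in whole hours (rounded down).
# Assumes GNU date, e.g. for input such as "Jun 27 04:41:38 2022 GMT".
cert_lifetime_hours() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 3600 ))
}

# Values copied from the output above.
cert_lifetime_hours "Jun 27 04:41:38 2022 GMT" "Jun 28 04:41:38 2022 GMT"   # csr-signer-signer
cert_lifetime_hours "Jun 27 04:52:26 2022 GMT" "Jul 27 04:52:27 2022 GMT"   # kube-controller-manager-client-cert-key
```

For these two pairs it should report 24 hours and 720 hours (30 days) respectively.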
      
      # cat check_secret_expiry_within.sh
      #!/usr/bin/env bash
      # usage: ./check_secret_expiry_within.sh 1day # or 15min, 2days, 2day, 2month, 1year
      WITHIN=${1:-24hours}
      echo "Checking validity within $WITHIN ..."
      # List secrets whose certificate-not-after annotation is at or before now + $WITHIN (GNU date syntax).
      oc get secret --insecure-skip-tls-verify -A -o json |
        jq -r --argjson deadline "$(date --date="+$WITHIN" +%s)" '.items[]
          | select(.metadata.annotations."auth.openshift.io/certificate-not-after" | . != null and fromdateiso8601 <= $deadline)
          | "\(.metadata.annotations."auth.openshift.io/certificate-not-before")  \(.metadata.annotations."auth.openshift.io/certificate-not-after")  \(.metadata.namespace)\t\(.metadata.name)"'
      
      # ./check_secret_expiry_within.sh 1day
      Checking validity within 1day ...
      2022-06-27T04:41:37Z  2022-06-28T04:41:37Z  openshift-kube-apiserver-operator	aggregator-client-signer
      2022-06-27T04:52:26Z  2022-06-28T04:41:37Z  openshift-kube-apiserver	aggregator-client
      2022-06-27T04:52:21Z  2022-06-28T04:41:38Z  openshift-kube-controller-manager-operator	csr-signer
      2022-06-27T04:41:38Z  2022-06-28T04:41:38Z  openshift-kube-controller-manager-operator	csr-signer-signer
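The cutoff test that check_secret_expiry_within.sh applies to each secret can be sketched in plain shell as below. `expires_within` is a hypothetical name used only for illustration, and the sketch assumes GNU `date` for both the relative window (`+1day`) and the ISO 8601 timestamps stored in the annotations.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original scripts).
# usage: expires_within <not-after ISO 8601 timestamp> [window, default 24hours]
# Returns 0 if the timestamp is at or before now + window, 1 if it is later,
# and 2 if GNU date cannot parse one of the inputs.
expires_within() {
  local not_after=$1 window=${2:-24hours} deadline expiry
  deadline=$(date --date="+$window" +%s) || return 2
  expiry=$(date --date="$not_after" +%s) || return 2
  [ "$expiry" -le "$deadline" ]
}

# Example with a value from the output above.
if expires_within "2022-06-28T04:41:37Z" 1day; then
  echo "aggregator-client-signer expires within 1day (or has already expired)"
fi
```

Because the comparison is "less than or equal to the deadline", certificates that have already expired also match, just as they do in the jq filter.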
      
      


            akashem@redhat.com Abu H Kashem
            akashem@redhat.com Abu H Kashem
            Ke Wang Ke Wang