Machine Config Operator / MCO-638

MCO behavior to consider during cluster hibernation


    • Resilient Cert Refresh
    • OCPSTRAT-543 - Shutdown/Resume of managed OSD/ROSA clusters

      Hive is working on a feature that will enable an OCP cluster to hibernate and resume on demand.

      Without proper handling, hibernating and resuming a cluster can lead to undesired behavior in MCO, such as expired certs. To make sure that a cluster can be safely hibernated and resumed in an automated fashion, we need to know:

      • Which certs can cause issues when they expire, and how frequently are they renewed?
      • Will MCO be able to successfully resume updating any expired certs when the cluster resumes from hibernation?
      • What happens when certs like the kubelet CA expire during hibernation? Does the kube-apiserver automatically generate a new cert bundle when the cluster resumes?
      • How can we tell that an MCP update is in progress?
      • In 4.13, cert updates will happen automatically for paused pools. Can we detect whether a cert update is in progress for a paused pool?
      • With 4.14, the cert update workflow will be decoupled from the rendered MC. In that case, how will we detect that a cert is being applied?
      • How long can a cluster be safely hibernated?
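
      As a starting point for two of the questions above, here is a minimal Python sketch. It assumes the MachineConfigPool object has been fetched as JSON (for example with `oc get machineconfigpool -o json`) and that its `status.conditions` list carries an `Updating` condition. The helper names and the one-day safety margin are hypothetical, chosen only for illustration. Whether a cert-only update on a paused pool (4.13) or a cert update decoupled from the rendered MC (4.14) shows up in that condition is exactly what this issue asks, so the sketch does not settle it.

```python
from datetime import datetime, timedelta, timezone


def pool_is_updating(pool: dict) -> bool:
    """Return True if the pool's status reports an 'Updating' condition set to True.

    `pool` is a single MachineConfigPool object parsed from JSON.
    """
    conditions = pool.get("status", {}).get("conditions", [])
    return any(
        cond.get("type") == "Updating" and cond.get("status") == "True"
        for cond in conditions
    )


def pool_is_paused(pool: dict) -> bool:
    """Return True if the pool has spec.paused set."""
    return bool(pool.get("spec", {}).get("paused", False))


def safe_hibernation_window(not_after_times, now, margin=timedelta(days=1)):
    """Time left before the earliest cert expiry, minus a safety margin.

    `not_after_times` are the notAfter timestamps of the certs of interest.
    The margin is an illustrative assumption, not a recommended value.
    Returns a zero duration if the earliest cert is already inside the margin.
    """
    earliest = min(not_after_times)
    return max(earliest - now - margin, timedelta(0))


if __name__ == "__main__":
    example_pool = {
        "spec": {"paused": False},
        "status": {
            "conditions": [
                {"type": "Updated", "status": "False"},
                {"type": "Updating", "status": "True"},
            ]
        },
    }
    now = datetime.now(timezone.utc)
    print("updating:", pool_is_updating(example_pool))
    print("paused:", pool_is_paused(example_pool))
    print("window:", safe_hibernation_window([now + timedelta(days=30)], now))
```

      A hibernation controller could refuse to hibernate while any pool reports an update in progress, and cap the hibernation period at the computed window. Which certs to feed into that calculation depends on the answer to the first question in the list.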

      Related Hive user story: https://issues.redhat.com/browse/HIVE-2224

              Team MCO (team-mco)
              Sinny Kumari (rhn-engineering-skumari)
              Mark Russell
