OpenShift Bugs / OCPBUGS-4504

Default to floating automaticRestart for new GCP instances

Details

    • Moderate
    • False
      * Previously, instances were not set to respect the GCP infrastructure default option for automated restarts. As a result, instances could be created without using the infrastructure default for automatic restarts. This sometimes meant that instances were terminated in GCP but their associated machines were still listed in the `Running` state because they did not automatically restart. With this release, the code for passing the automatic restart option has been improved to better detect and pass on the default option selection from users. Instances now use the infrastructure default properly and are automatically restarted when the user requests the default functionality.
      (link:https://issues.redhat.com/browse/OCPBUGS-4504[*OCPBUGS-4504*])
    • Bug Fix
    • Done

    Description

      This is a clone of issue OCPBUGS-1557. The following is the description of the original issue:

      Seen in an instance created recently by a 4.12.0-ec.2 GCP provider:

        "scheduling": {
          "automaticRestart": false,
          "onHostMaintenance": "MIGRATE",
          "preemptible": false,
          "provisioningModel": "STANDARD"
        },
      

      According to GCP's docs, GCP may stop instances because of hardware failures and other causes, and we would need automaticRestart: true to auto-recover from that. Also according to GCP's docs, the default for automaticRestart is true. On the Go provider side, we document:

      If omitted, the platform chooses a default, which is subject to change over time, currently that default is "Always".

      But the implementing code does not actually float the setting. This looks like a regression from the 'migrate to openshift/api' commit, which is part of 4.10:

      $ git clone https://github.com/openshift/machine-api-provider-gcp.git
      $ cd machine-api-provider-gcp
      $ git log --oneline origin/release-4.10 | grep 'migrate to openshift/api'
      44f0f958 migrate to openshift/api
      

      But that repository is not where the 4.9 and earlier code is located; its release branches start at 4.10:

      $ git branch -a | grep origin/release
        remotes/origin/release-4.10
        remotes/origin/release-4.11
        remotes/origin/release-4.12
        remotes/origin/release-4.13
      

      Hunting for 4.9 code:

      $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.9.48-x86_64 | grep gcp
        gcp-machine-controllers                        https://github.com/openshift/cluster-api-provider-gcp                       c955c03b2d05e3b8eb0d39d5b4927128e6d1c6c6
        gcp-pd-csi-driver                              https://github.com/openshift/gcp-pd-csi-driver                              48d49f7f9ef96a7a42a789e3304ead53f266f475
        gcp-pd-csi-driver-operator                     https://github.com/openshift/gcp-pd-csi-driver-operator                     d8a891de5ae9cf552d7d012ebe61c2abd395386e
      

      So looking there:

      $ git clone https://github.com/openshift/cluster-api-provider-gcp.git
      $ cd cluster-api-provider-gcp
      $ git log --oneline | grep 'migrate to openshift/api'
      ...no hits...
      $ git grep -i automaticRestart origin/release-4.9  | grep -v '"description"\|compute-gen.go'
      origin/release-4.9:vendor/google.golang.org/api/compute/v1/compute-api.json:        "automaticRestart": {
      

      It is not clear to me how that code is structured. So 4.10 and later GCP machine-API providers are impacted, and I am unsure about 4.9 and earlier.
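
For illustration, here is a minimal sketch of what "floating" the setting could look like. It is not the provider's actual code: the RestartPolicy type, the Scheduling struct, and the automaticRestart helper are hypothetical stand-ins. The sketch assumes that an omitted policy is represented by an empty string, and that a nil pointer means the field is left out of the request so GCP applies its own default.

```go
package main

import "fmt"

// RestartPolicy is a hypothetical stand-in for the restart policy field in
// the provider spec. The empty string means the user omitted the field.
type RestartPolicy string

const (
	RestartPolicyAlways RestartPolicy = "Always"
	RestartPolicyNever  RestartPolicy = "Never"
)

// Scheduling is a hypothetical, trimmed-down stand-in for the scheduling
// block sent to GCP. A nil AutomaticRestart means "do not send the field",
// which lets the platform default apply.
type Scheduling struct {
	AutomaticRestart *bool
}

// automaticRestart maps the user-facing policy to the tri-state value:
// nil (float with the platform default), true, or false.
func automaticRestart(p RestartPolicy) (*bool, error) {
	switch p {
	case "":
		// Omitted: float with whatever GCP currently defaults to.
		return nil, nil
	case RestartPolicyAlways:
		v := true
		return &v, nil
	case RestartPolicyNever:
		v := false
		return &v, nil
	default:
		return nil, fmt.Errorf("unsupported restart policy %q", p)
	}
}

// describe renders the tri-state value for printing.
func describe(b *bool) string {
	if b == nil {
		return "unset (platform default)"
	}
	return fmt.Sprintf("%t", *b)
}

func main() {
	for _, p := range []RestartPolicy{"", RestartPolicyAlways, RestartPolicyNever} {
		v, err := automaticRestart(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		s := Scheduling{AutomaticRestart: v}
		fmt.Printf("policy %q -> automaticRestart %s\n", p, describe(s.AutomaticRestart))
	}
}
```

The point of the pointer is to keep "unset" distinct from "false". With a plain bool, an omitted policy collapses to false, which would match the "automaticRestart": false seen in the instance above.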

            People

              Michael McCune (mimccune@redhat.com)
              OpenShift Prow Bot (openshift-crt-jira-prow)
              Zhaohua Sun
              Jeana Routh