OCPBUGS-56913

ABI failures after cluster is registered are ignored


    • Quality / Stability / Reliability
    • False
    • In Progress
    • Bug Fix
      *Cause*: The configuration from the AgentClusterInstall cluster-manifest is applied in two steps: the cluster is registered first, and the install-config overrides are then applied in a separate step. If this second step failed, it was not retried, because the partial success of getting to that point caused the retry to skip it.
      *Consequence*: Any error while applying the install-config overrides meant that all config passed in the install-config overrides was ignored. This includes the setting of FIPS mode.
      *Fix*: On retry, each step is checked for successful completion.
      *Result*: Either all of the install-config overrides are applied successfully, or cluster installation does not proceed.

      This issue was found when debugging https://issues.redhat.com/browse/OCPBUGS-56596. When agent-register-cluster.service registers the cluster with assisted-service, it first registers the cluster (https://github.com/openshift/assisted-service/blob/master/cmd/agentbasedinstaller/register.go#L111) and then sets any installConfig overrides (https://github.com/openshift/assisted-service/blob/master/cmd/agentbasedinstaller/register.go#L123-L147).

      If setting the installConfig overrides fails, the service terminates with a failure and is then restarted. The problem is that, because the cluster HAS already been registered, the restarted service checks for the cluster, finds it, and skips the registration here:
      https://github.com/openshift/assisted-service/blob/master/cmd/agentbasedinstaller/client/main.go#L160-L164

      The service then terminates successfully, and the fact that a failure occurred when overriding the installConfig is lost.
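
      The sequence described above can be reproduced with a small, self-contained Go sketch. It is not the assisted-service code: fakeService, runBuggy and runChecked are hypothetical names, and tracking completion with an in-memory flag is an assumption made only for illustration. It contrasts a restart that skips everything once the cluster exists with a restart that checks each step separately.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeService is a hypothetical in-memory stand-in for assisted-service.
type fakeService struct {
	registered       bool
	overridesApplied bool
	failOverrides    int // number of override attempts that should fail
}

func (s *fakeService) registerCluster() { s.registered = true }

func (s *fakeService) applyOverrides() error {
	if s.failOverrides > 0 {
		s.failOverrides--
		return errors.New("applying install-config overrides failed")
	}
	s.overridesApplied = true
	return nil
}

// runBuggy models the reported behaviour: once the cluster is found,
// the whole registration path, overrides included, is skipped.
func runBuggy(s *fakeService) error {
	if s.registered {
		return nil // exits successfully; the earlier failure is lost
	}
	s.registerCluster()
	return s.applyOverrides()
}

// runChecked models the idea of the fix: each step is checked for
// successful completion independently on every run.
func runChecked(s *fakeService) error {
	if !s.registered {
		s.registerCluster()
	}
	if !s.overridesApplied {
		return s.applyOverrides()
	}
	return nil
}

func main() {
	buggy := &fakeService{failOverrides: 1}
	fmt.Println("buggy first run error:", runBuggy(buggy))
	fmt.Println("buggy restart error:", runBuggy(buggy))
	fmt.Println("buggy overrides applied:", buggy.overridesApplied)

	checked := &fakeService{failOverrides: 1}
	fmt.Println("checked first run error:", runChecked(checked))
	fmt.Println("checked restart error:", runChecked(checked))
	fmt.Println("checked overrides applied:", checked.overridesApplied)
}
```

      In the sketch, the restarted runBuggy returns without an error while the overrides remain unapplied, which mirrors the lost failure. runChecked keeps returning the error until the overrides step itself succeeds.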

              zabitter Zane Bitter
              bfournie@redhat.com Robert Fournier
              zhenying niu zhenying niu