OpenShift Hive / HIVE-2256

ClusterPools + MachinePools: Rethink the whole thing


    • Type: Epic
    • Resolution: Unresolved
    • Status: To Do

      For reasons lost to history (read: I haven't dug into it yet), the ClusterPool controller creates a MachinePool for each pool cluster by default. This can be disabled via ClusterPool.spec.skipMachinePool.

      The problem is that the MachinePool we generate has hardcoded values that match the default worker pool in the install-config only by coincidence. If anything in the install-config is nonstandard, your spoke MachineSets can end up representing some non-intuitive combination of the install-config values and the hardcoded MachinePool values:

      • If your install-config's instance type doesn't match the one we hardcode in the MachinePool (e.g. m5.xlarge for AWS), the install-config will win... unless Machines are deleted, in which case MAPI will recreate them with the MachinePool's instance type. The same applies to other settings under platform.
      • We hardcode replicas to 3, meaning whatever replica count you have in your install-config is effectively ignored.
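
The mismatch described in the two bullets above can be made concrete with a small sketch in Go. Everything here is hypothetical and simplified: `workerPool`, `hardcodedPool`, and `driftReport` are illustrative names, not Hive or installer types, and only the two fields discussed above are modeled. The hardcoded values (m5.xlarge, 3 replicas) are taken from the description.

```go
package main

import "fmt"

// workerPool is a hypothetical, simplified view of a worker pool. It models
// only the two fields discussed above and is not a Hive or installer type.
type workerPool struct {
	InstanceType string
	Replicas     int
}

// hardcodedPool mirrors the values the description says are hardcoded in the
// generated MachinePool for AWS.
var hardcodedPool = workerPool{InstanceType: "m5.xlarge", Replicas: 3}

// driftReport lists every field where the install-config worker pool
// disagrees with the generated MachinePool.
func driftReport(installConfig, machinePool workerPool) []string {
	var drift []string
	if installConfig.InstanceType != machinePool.InstanceType {
		drift = append(drift, fmt.Sprintf("instanceType: install-config=%q, MachinePool=%q",
			installConfig.InstanceType, machinePool.InstanceType))
	}
	if installConfig.Replicas != machinePool.Replicas {
		drift = append(drift, fmt.Sprintf("replicas: install-config=%d, MachinePool=%d",
			installConfig.Replicas, machinePool.Replicas))
	}
	return drift
}

func main() {
	// A nonstandard install-config: larger instances and more workers.
	fromInstallConfig := workerPool{InstanceType: "m5.2xlarge", Replicas: 5}
	for _, d := range driftReport(fromInstallConfig, hardcodedPool) {
		fmt.Println(d)
	}
}
```

An empty report means the install-config happens to match the hardcoded values, which is the only case where today's behavior is unsurprising.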

      We do properly set up RBAC for the owner of the ClusterClaim, so that user would be able to edit the MachinePool – but we only allow editing the replica count!

      Bringing the matter more to the foreground, ACM is starting to consume both ClusterPool and MachinePool, so we need to consider their UX in whatever we decide here.

      We also need to consider that MAPI is on the way out – but CAPI isn't here yet – so we need to be careful not to invest too much in making the existing thing perfect; and we can't expect MAPI changes.

      Initial thoughts:

      • Consider changing the default to skipMachinePool: true. This is technically a breaking change, so it will need to be thought out carefully. But I think it's pretty safe given the limitations of what's happening today.
      • Populate MachinePool based on values from install-config. This is where we want to be careful about how much we're investing, as this will be a nontrivial amount of work – both code and test surface.
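
As a rough illustration of the second idea, here is a minimal sketch in Go of deriving the MachinePool values from the install-config's worker pool. The types and the `machinePoolFromInstallConfig` helper are hypothetical stand-ins, not Hive's API. Falling back to today's hardcoded values when the install-config omits a field is an assumption made for the sketch, not a proposed design.

```go
package main

import "fmt"

// installConfigPool is a hypothetical, simplified compute pool entry from an
// install-config. Replicas is a pointer so "unset" can be told apart from 0.
type installConfigPool struct {
	Name         string
	Replicas     *int64
	InstanceType string
}

// machinePoolSpec is a hypothetical, simplified MachinePool spec.
type machinePoolSpec struct {
	Name         string
	Replicas     int64
	InstanceType string
}

// Today's hardcoded values, per the description above (AWS example).
const (
	defaultReplicas     int64 = 3
	defaultInstanceType       = "m5.xlarge"
)

// machinePoolFromInstallConfig copies the worker pool's values into the
// MachinePool. Assumption: any field the install-config leaves unset keeps
// the current hardcoded default.
func machinePoolFromInstallConfig(pools []installConfigPool) machinePoolSpec {
	mp := machinePoolSpec{
		Name:         "worker",
		Replicas:     defaultReplicas,
		InstanceType: defaultInstanceType,
	}
	for _, p := range pools {
		if p.Name != "worker" {
			continue
		}
		if p.Replicas != nil {
			mp.Replicas = *p.Replicas
		}
		if p.InstanceType != "" {
			mp.InstanceType = p.InstanceType
		}
		break
	}
	return mp
}

func main() {
	replicas := int64(5)
	pools := []installConfigPool{
		{Name: "worker", Replicas: &replicas, InstanceType: "m5.2xlarge"},
	}
	fmt.Printf("%+v\n", machinePoolFromInstallConfig(pools))
}
```

Even this toy version hints at the test surface: every platform has its own set of fields under platform, and each would need the same copy-or-default treatment.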

      Here's the thread that prompted this.

              Assignee: Unassigned
              Reporter: Eric Fried (efried.openshift)