Type: Epic
Resolution: Done
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: openstack-operator-dataplane
Labels:
None

Epic Name:
Allow re-execution of Ansible jobs against already deployed Nodes
Blocked:
False
Blocked Reason:

None
Ready:
False
Dev Approval:
Committed
Docs Approval:
Committed
Epic Status:
To Do
PM Approval:
Proposed
QE Approval:
Proposed
Hierarchy Progress Bar:

0% To Do, 0% In Progress, 100% Done
Intelligence Requested:
Market:

Planning Target:

2024Q1

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Our current workflow requires users to delete and recreate the `OpenStackDataPlaneNodeSet` CR for any configuration change. While this works for PreProvisioned nodes, it doesn't work for baremetal nodes, since deleting the NodeSet also deletes the baremetal node from Metal3.

Thus, we need to give users the ability to make changes to their already deployed nodes, such as network configuration changes (MTU, additional networks, gateways, DNS servers, etc.) or any other service configuration that might need to change as part of a day 2 operation.

Issues with current implementation

When we have a node deployed:

NAME         STATE         CONSUMER              ONLINE   ERROR   AGE
compute-01   provisioned   openstack-edpm-ipam   true             6m20s

Then we delete the nodeSet to make changes:

[m3@localhost ~]$ oc delete osdpns --all
openstackdataplanenodeset.dataplane.openstack.org "openstack-edpm-ipam" deleted

We can see that the related baremetal node is deprovisioned as well:

[m3@localhost ~]$ oc get bmh -n openshift-machine-api compute-01
NAME         STATE            CONSUMER   ONLINE   ERROR   AGE
compute-01   deprovisioning              false            6m54s

Even if we blocked the deletion of the `BareMetalHost` when the NodeSet is deleted, we would then be putting the burden back on the user to reconfigure the host as a PreProvisioned node.

Proposal

While I understand the original idea of designing something consistent with Ansible, we should instead maintain consistency with Kubernetes constructs, since the primary user interface is Kubernetes and not Ansible. Users are never expected to interact directly with Ansible, so it makes more sense for our design to follow Kubernetes constructs and design patterns rather than trying to make Kubernetes function more like Ansible.

To address this issue, I propose we do the following:

1. We implement a validating admission webhook that rejects updates to the `baremetalSetTemplate`. This will ensure no changes are made to the `openstackbaremetalset`.
2. We decide whether we want to re-execute all `services` for any change that is made. If we only want to execute a subset of the service playbooks, then we will need a mechanism to determine which services should run only during the initial configuration. One option is a `runOnce` field on each service:

apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  labels:
    app.kubernetes.io/name: openstackdataplaneservice
    app.kubernetes.io/instance: openstackdataplaneservice-bootstrap
    app.kubernetes.io/part-of: dataplane-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: dataplane-operator
  name: bootstrap
spec:
  label: dataplane-deployment-bootstrap
  playbook: osp.edpm.bootstrap
  runOnce: true

We would then add logic so that, if the `NodeSet` is already deployed, only the playbooks of services that do not set `service.Spec.RunOnce` are executed.
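
The `runOnce` selection described above could look roughly like the following. This is only a minimal sketch: the `Service` struct is a hypothetical, trimmed-down stand-in for `OpenStackDataPlaneService`, and `servicesToRun` and its `alreadyDeployed` flag are illustrative names, not existing operator code.

```go
package main

// Service is a hypothetical, simplified stand-in for OpenStackDataPlaneService.
type Service struct {
	Name     string
	Playbook string
	RunOnce  bool // mirrors the proposed spec.runOnce field
}

// servicesToRun returns the services whose playbooks should be executed.
// On the initial deployment every service runs. Once the NodeSet is already
// deployed, services marked RunOnce are skipped and the rest run again.
func servicesToRun(services []Service, alreadyDeployed bool) []Service {
	selected := make([]Service, 0, len(services))
	for _, svc := range services {
		if alreadyDeployed && svc.RunOnce {
			continue
		}
		selected = append(selected, svc)
	}
	return selected
}
```

With this shape, `bootstrap` (which sets `runOnce: true`) would run on the first deployment only, while services without the field would be re-executed on every later change to the NodeSet.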

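The webhook check from the first point of the proposal can be sketched in the same spirit. The types below are hypothetical simplifications (the real `baremetalSetTemplate` has more fields than shown here), and `validateUpdate` is an illustrative helper that a webhook's update-validation hook might call. It is not the operator's actual API.

```go
package main

import (
	"errors"
	"reflect"
)

// BaremetalSetTemplate and NodeSetSpec are hypothetical, simplified stand-ins
// for the corresponding parts of the OpenStackDataPlaneNodeSet spec.
type BaremetalSetTemplate struct {
	OSImage           string
	CtlplaneInterface string
}

type NodeSetSpec struct {
	BaremetalSetTemplate BaremetalSetTemplate
	Services             []string
}

var errTemplateImmutable = errors.New(
	"spec.baremetalSetTemplate cannot be changed after the NodeSet is created")

// validateUpdate rejects an update if the baremetalSetTemplate differs between
// the old and new spec. All other fields are free to change.
func validateUpdate(oldSpec, newSpec NodeSetSpec) error {
	if !reflect.DeepEqual(oldSpec.BaremetalSetTemplate, newSpec.BaremetalSetTemplate) {
		return errTemplateImmutable
	}
	return nil
}
```

Comparing the whole template with `reflect.DeepEqual` keeps the check independent of individual fields. If some template fields should remain editable later, the comparison would need to become field-by-field instead.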
Alternative

The alternative would be to remove baremetal provisioning from the dataplane-operator and have users provision their baremetal nodes themselves, then provide them to us as PreProvisioned nodes that we can configure.

Assignee: Unassigned

Reporter: Brendan Shephard

Team: rhos-dfg-df

Votes: 0

Watchers: 7

Created: 2023/10/22 11:32 PM

Updated: 2024/11/14 7:08 PM

Resolved: 2024/11/14 7:08 PM
