  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-20811

Documentation bug: Incomplete procedure for removing compute nodes causes database integrity errors on redeployment.


    • Bug
    • Resolution: Not a Bug
    • Undefined
    • None
    • rhos-18.0.10 FR 3
    • documentation
    • None
    • False
    • False
    • ?
    • rhos-workloads-compute
    • None
    • Moderate

      The official procedures for removing an OpenStackDataPlaneNode and an OpenStackDataPlaneNodeSet are missing a critical step, which results in a broken state that prevents the same compute node from being successfully redeployed. 

      Affected Documentation

      • "Removing an OpenStackDataPlaneNodeSet resource" (proc_removing-an-OpenStackDataPlaneNodeSet-resource.adoc)
      • "Removing a Compute node from the data plane" (proc_removing-a-Compute-node-from-the-data-plane.adoc)

      Problem Details

      The current procedures correctly guide the user to delete the nova-compute service with openstack compute service delete. However, if instances are still assigned to the host being removed (including instances in an error state), this action does not remove the corresponding compute_nodes record from the Nova database (see the linked Nova issue).

      This leaves an orphaned record in the database. If a user then attempts to reprovision and redeploy the same physical node, the new nova-compute service fails to start, logging a pymysql.err.IntegrityError: (1062, "Duplicate entry '...' for key 'uniq_compute_nodes0host0hypervisor_hostname0deleted'").
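
      To spot this failure in a nova-compute log, the error quoted above can be matched with a regular expression. The following is a minimal sketch, not part of the documented procedure: the function name find_duplicate_compute_node is hypothetical, the constraint name is copied from this report, and the optional prefix before it allows for database versions that qualify the key with the table name (an assumption).

```python
import re

# Matches the duplicate-key error quoted in this report.
# "entry" captures the duplicated value; the constraint name is taken
# verbatim from the report and may be preceded by a table-name prefix.
DUPLICATE_COMPUTE_NODE = re.compile(
    r"\(1062, \"Duplicate entry '(?P<entry>[^']*)' for key "
    r"'[^']*uniq_compute_nodes0host0hypervisor_hostname0deleted'\"\)"
)


def find_duplicate_compute_node(log_line):
    """Return the duplicated entry value if the line shows the error, else None."""
    match = DUPLICATE_COMPUTE_NODE.search(log_line)
    return match.group("entry") if match else None
```

      A line that returns a value here suggests that an orphaned compute_nodes record for that host is still present.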

      Impact

      This documentation gap makes the node removal procedure a destructive, one-way action and prevents reliable node replacement or maintenance workflows.

      Workaround

      unknown

      Suggested Resolution

      The documentation or KB article must be updated to include an additional step to manually delete the compute_nodes entry after deleting the nova-compute service. Proposed workaround and the new step:


       (to be inserted before deleting the compute service and network agents)

      Make sure no instances are assigned to the host being deleted. Account for VMs in every state, including VMs in the 'error' state.
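
      The check above could be scripted roughly as follows. This is a sketch under assumptions: it expects JSON such as that produced by an admin running openstack server list --all-projects --host <host> -f json, and it assumes each entry carries "ID" and "Status" keys. The function names are hypothetical. No status is filtered out, so instances in the ERROR state count as still assigned.

```python
import json


def instances_still_on_host(server_list_json):
    """Return (id, status) for every listed instance, whatever its state."""
    servers = json.loads(server_list_json)
    # Deliberately no status filter: ERROR instances also block a clean removal.
    return [(s.get("ID"), s.get("Status")) for s in servers]


def host_is_safe_to_remove(server_list_json):
    """True only when the listing for the host is empty."""
    return not instances_still_on_host(server_list_json)
```

      If host_is_safe_to_remove returns False, the listed instances would need to be migrated or deleted before the compute service is removed.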

       

              joflynn@redhat.com Joanne O'Flynn
              bdobreli@redhat.com Bohdan Dobrelia
              rhos-workloads-compute