Satellite / SAT-42156

Race condition between hammer host delete and API host creation leads to "Name has already been taken" validation error and orphaned ghost records.


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • 6.16.z, 6.17.z, 6.17.2, 6.18.z, 6.16.5.6
    • API, Hammer, Hosts
    • sat-endeavour

      Description of problem:

      A race condition exists between the hammer host delete command and the POST /api/v2/hosts (create) endpoint in Satellite 6.16/6.17. When a host is deleted via Hammer, the CLI returns "Host deleted" to the user or automation script before the background database transaction has fully committed. An immediately following API call to create a host with the same name then fails with "Validation failed: Name has already been taken".

      How reproducible:

      High (reproduced on the first iteration of a shell script loop on Satellite 6.17).

      Is this issue a regression from an earlier version:

      No

      Steps to Reproduce:

      • Create a managed or unmanaged host:
        hammer host create --name "repro-host.example.com" --organization "rhsat" --location "apac" --managed false
      • Run the deletion and an immediate creation in a single command line (to mimic CI/CD speed):
        hammer host delete --name "repro-host.example.com" && curl -u admin:password -k -X POST -H "Content-Type: application/json" -d '{"host": {"name": "repro-host.example.com", "organization_name": "rhsat", "location_name": "apac"}}' https://localhost/api/v2/hosts
      • Observe the API response.
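
      The delete-then-create cycle above can be wrapped in a small loop to see how quickly the error appears. This is only a sketch under assumptions: SAT_URL, SAT_AUTH, and HOST_NAME are placeholder variables, the function names are made up for illustration, and the host is assumed to exist before the first iteration. The loop stops at the first response containing the validation error.

```shell
#!/bin/sh
# Placeholder connection settings (assumptions, adjust for the environment).
SAT_URL="${SAT_URL:-https://localhost}"
SAT_AUTH="${SAT_AUTH:-admin:password}"
HOST_NAME="${HOST_NAME:-repro-host.example.com}"

# Delete the host with Hammer, then immediately recreate it through the API.
# Prints the API response body; returns non-zero if either command fails.
delete_then_create() {
  hammer host delete --name "$HOST_NAME" >/dev/null || return 1
  curl -s -k -u "$SAT_AUTH" -X POST \
    -H "Content-Type: application/json" \
    -d "{\"host\": {\"name\": \"$HOST_NAME\", \"organization_name\": \"rhsat\", \"location_name\": \"apac\"}}" \
    "$SAT_URL/api/v2/hosts"
}

# Repeat the cycle up to $1 times and stop at the first validation error.
# Returns 1 when the error is seen, 2 when a command fails, 0 otherwise.
repro_loop() {
  max="$1"
  i=1
  while [ "$i" -le "$max" ]; do
    response="$(delete_then_create)" || { echo "iteration $i: command failed"; return 2; }
    case "$response" in
      *"has already been taken"*)
        echo "iteration $i: name has already been taken"
        return 1 ;;
    esac
    i=$((i + 1))
  done
  echo "no failure in $max iterations"
}
```

      For example, repro_loop 20 runs at most 20 cycles; each successful create leaves the host in place for the next delete.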

      Actual behavior:

      The API returns:

      {
        "error": {
          "id": null,
          "errors": {
            "name": ["has already been taken"]
          },
          "full_messages": ["name has already been taken"]
        }
      }

      If checked via foreman-rake console immediately after the error, the host record is still present:

      irb(main):001:0> Host.unscoped.find_by_name("repro-host.example.com")
      => #<Host::Managed id: 18, name: "repro-host.example.com", type: "Host::Managed" ...>
      

      The record remains visible for a short period (milliseconds to seconds) after Hammer has reported success.

      Expected behavior:

      • hammer host delete should provide an option to block/wait until the database transaction is fully committed.
      • The API should handle the "in-flight" deletion gracefully, or the orchestration engine should ensure transactional integrity across the CLI and API.
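
      Until such an option exists, automation could approximate the blocking behavior on the client side by polling the API after the delete. The sketch below is a possible workaround, not the requested fix. It assumes that GET /api/v2/hosts/<name> answers 404 once the record is gone and that a 404 there means the name is free for a new create; SAT_URL, SAT_AUTH, and both function names are hypothetical.

```shell
#!/bin/sh
# Placeholder connection settings (assumptions, adjust for the environment).
SAT_URL="${SAT_URL:-https://localhost}"
SAT_AUTH="${SAT_AUTH:-admin:password}"

# Print only the HTTP status code of a host lookup by name.
host_status() {
  curl -s -k -u "$SAT_AUTH" -o /dev/null -w '%{http_code}' \
    "$SAT_URL/api/v2/hosts/$1"
}

# Poll until the lookup returns 404 or the attempts run out.
# Arguments: host name, max attempts (default 30), delay in seconds (default 1).
# Returns 0 once the host is gone, 1 if it is still present after all attempts.
wait_for_host_gone() {
  name="$1"
  attempts="${2:-30}"
  delay="${3:-1}"
  n=0
  while [ "$n" -lt "$attempts" ]; do
    if [ "$(host_status "$name")" = "404" ]; then
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}
```

      A script would then run hammer host delete --name "repro-host.example.com" && wait_for_host_gone "repro-host.example.com" before issuing the POST. Polling adds latency and extra API calls, so a server-side guarantee would still be preferable.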

      Business Impact / Additional info:

       

              Assignee: Unassigned
              Reporter: Chandra Sekhar Reddy Bora (rhn-engineering-cbora), Red Hat Employee
              Votes: 0
              Watchers: 2