CPE Infrastructure / CPE-4002

[SN#1648] Figure out a way to help Duffy clean up failed provisions


    • 5
    • Testable
    • rhel-cle-pnp

      https://gitlab.com/CentOS/infra/tracker/-/issues/1648

      We recently discovered many machines in EC2 that are left over from failed Duffy provisions.

      When Duffy runs the Ansible playbook to create a machine, the playbook can fail, and Duffy then retries. However, it seems the host may already have been created before something else goes wrong (e.g. a communication error), so Duffy retries without deleting the host from the failed attempt.

      We should look into some form of reporting or cleanup mechanism (even just a Zabbix notification) so that these hosts do not build up again.
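
As a starting point for the reporting side, here is a rough sketch of the comparison such a check might do: list what exists in EC2, subtract what Duffy still tracks, and flag the rest once it is older than a grace period. Everything here is an assumption for illustration: the `Instance` shape, the `find_orphans` and `format_report` names, and the one-hour grace period are hypothetical and not part of Duffy. Fetching the inputs (the EC2 instance list and Duffy's known node IDs) is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Instance:
    """Minimal view of a cloud instance (hypothetical shape, not Duffy's model)."""
    instance_id: str
    launch_time: datetime


def find_orphans(instances, known_ids, now, grace=timedelta(hours=1)):
    """Return instances the provisioner does not track and that are older than `grace`.

    The grace period (an assumed default) avoids flagging hosts that are
    still in the middle of being provisioned.
    """
    known = set(known_ids)
    return [
        inst for inst in instances
        if inst.instance_id not in known and now - inst.launch_time > grace
    ]


def format_report(orphans):
    """Render a short plain-text report, oldest instance first."""
    if not orphans:
        return "no orphaned instances found"
    ordered = sorted(orphans, key=lambda o: o.launch_time)
    lines = [f"{len(ordered)} orphaned instance(s):"]
    lines += [f"  {o.instance_id} launched {o.launch_time.isoformat()}" for o in ordered]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data; real input would come from EC2 and from Duffy.
    now = datetime.now(timezone.utc)
    sample = [
        Instance("i-aaa", now - timedelta(hours=6)),     # untracked and old
        Instance("i-bbb", now - timedelta(minutes=5)),   # untracked but still fresh
        Instance("i-ccc", now - timedelta(hours=6)),     # tracked by the provisioner
    ]
    print(format_report(find_orphans(sample, {"i-ccc"}, now)))
```

The report text could then be sent to whatever alerting we settle on (a Zabbix notification, for example). Deleting the flagged hosts automatically would be a separate step, and probably one to gate behind a dry-run mode at first.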

      _This issue ticket was originally created [here](https://pagure.io/centos-infra/issue/1648) on a Pagure repository,
      [**centos-infra**](https://pagure.io/centos-infra) by [**Greg Sutcliffe**](https://accounts.fedoraproject.org/user/gwmngilfen) on
      [**Wed Apr 23 13:59:21 2025** UTC](https://savvytime.com/converter/utc/apr-23-2025/13-59)._

      _This issue ticket was automatically created by the
      [**Pagure Exporter**](https://github.com/gridhead/pagure-exporter)._

              gsutclif Greg Sutcliffe
              rh-ee-mkonecny Michal Konecny