Satellite / SAT-19275

satellite-change-hostname too destructive when it fails due to inability to forward resolve the new hostname

    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • 6.11.0

      Description of problem:
      When running satellite-change-hostname with a new FQDN that DNS cannot forward resolve (i.e. attempting a local-only rename), the action fails. I don't consider the failure itself a bug, but the way it fails is quite destructive.
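
A check along the following lines, run before any service is stopped or any certificate is removed, would turn this case into a harmless early exit. This is only a minimal Python sketch, not the tool's actual code: `forward_resolves` and `preflight` are hypothetical names, and the `resolver` parameter is an assumption added so the check can be exercised without a working DNS.

```python
import socket


def forward_resolves(fqdn, resolver=socket.getaddrinfo):
    """Return the sorted addresses that fqdn resolves to, or [] if it does not resolve."""
    try:
        infos = resolver(fqdn, None)
    except socket.gaierror:
        # The name could not be forward resolved (DNS, /etc/hosts, ...).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


def preflight(fqdn, resolver=socket.getaddrinfo):
    """Abort before any destructive step when the new hostname is not resolvable."""
    addresses = forward_resolves(fqdn, resolver)
    if not addresses:
        raise SystemExit(
            f"{fqdn} does not forward resolve; nothing has been changed"
        )
    return addresses
```

Calling `preflight("fish.example.com")` first would either return the resolved addresses or exit with a message while the Satellite is still untouched.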

      After the failure, the Satellite doesn't run and I can't connect to the WebUI.
      After `foreman-maintain service restart`, there is an issue with Candlepin; the WebUI loads only to show this error: `Oops, we're sorry but something went wrong A backend service [ Candlepin ] is unreachable`
      Attempting to revert the changes to get a working satellite again:
      ```

      # hammer ping
        Could not load the API description from the server: SSL certificate verification failed
        Make sure you configured the correct URL and have the server's CA certificate installed on your system.

      The following configuration option were used for the SSL connection:
      ssl_ca_file = /etc/pki/katello/certs/katello-server-ca.crt

      Make sure the location contains an unexpired and valid CA certificate for https://fish.example.com.

      Warning: An error occured while loading module hammer_cli_foreman.
      Could not load the API description from the server: SSL certificate verification failed
      Make sure you configured the correct URL and have the server's CA certificate installed on your system.

      The following configuration option were used for the SSL connection:
      ssl_ca_file = /etc/pki/katello/certs/katello-server-ca.crt

      Make sure the location contains an unexpired and valid CA certificate for https://fish.example.com.

      [...]

      Warning: An error occured while loading module hammer_cli_katello.
      Error: No such sub-command 'ping'.

      See: 'hammer --help'.
      ```
      Trying to work around this by manually resetting the hostname doesn't help either:

      1. hostnamectl set-hostname dhcp-3-121.vms.sat.rdu2.redhat.com
      2. foreman-maintain service restart
        => restarts successfully, but in WebUI: `Oops, we're sorry but something went wrong A backend service [ Candlepin ] is unreachable`

      Because I ran a tool without DNS being able to resolve my new FQDN, I ended up with a bricked Satellite. Perhaps it can be fixed but this is way too destructive for such a simple mishap.

      The log of satellite-change-hostname was:
      ```

      # satellite-change-hostname -uadmin -p<PASSWORD> fish.example.com
        [...]
        stopping services
        removing old cert rpms
        No Match for argument: dhcp-3-121.vms.sat.rdu2.redhat.com-apache*
        No Match for argument: dhcp-3-121.vms.sat.rdu2.redhat.com-foreman-client*
        No Match for argument: dhcp-3-121.vms.sat.rdu2.redhat.com-foreman-proxy*
        [...]
        deleting old certs
        backed up /var/www/html/pub to /var/www/html/pub/dhcp-3-121.vms.sat.rdu2.redhat.com-20220609093929.backup
        updating hostname in /etc/hosts
        updating hostname in foreman installer scenarios
        updating hostname in hammer configuration
        backing up last_scenario.yaml
        removing last_scenario.yaml
        re-running the installer
        satellite-installer --scenario satellite -v --disable-system-checks --certs-regenerate=true --foreman-proxy-register-in-foreman true
        Output of 'facter fqdn' is different from 'hostname -f'

      Make sure above command gives the same output. If needed, change the hostname permanently via the
      'hostname' or 'hostnamectl set-hostname' command
      and editing the appropriate configuration file.
      (e.g. on Red Hat systems /etc/sysconfig/network,
      on Debian based systems /etc/hostname).

      If 'hostname -f' still returns an unexpected result, check /etc/hosts and put
      the hostname entry in the correct order, for example:

      1.2.3.4 hostname.example.com hostname

      The fully qualified hostname must be the first entry on the line
      [...]
      Something went wrong with the Satellite installer.
      Please check the above output and the corresponding logs.

      Once the issue is resolved you may complete the hostname change[...quite complicated steps...]
      ```
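
The installer message above asks for the fully qualified hostname to be the first name on its /etc/hosts line. A rough Python sketch of that ordering check follows; `misordered_hosts_lines` is a hypothetical helper, not part of the installer, and it takes the file content as a string so it can be tried without touching the real file.

```python
def misordered_hosts_lines(hosts_text, fqdn):
    """Return hosts lines that mention fqdn but do not list it first after the address."""
    wanted = fqdn.lower()
    offending = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = raw.split("#", 1)[0].split()
        names = [name.lower() for name in fields[1:]]
        if wanted in names and names[0] != wanted:
            offending.append(raw.strip())
    return offending


# The layout from the installer message passes; the swapped order is reported.
print(misordered_hosts_lines("1.2.3.4 hostname.example.com hostname\n", "hostname.example.com"))
print(misordered_hosts_lines("1.2.3.4 hostname hostname.example.com\n", "hostname.example.com"))
```

An empty result means no line lists the FQDN in the wrong position; it does not prove that `facter fqdn` and `hostname -f` agree, which still has to be checked on the system itself.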

      Version-Release number of selected component (if applicable):
      6.11 snap 23, probably not a regression

      How reproducible:
      Deterministic

      Steps to Reproduce:

      1. satellite-change-hostname nonsense-hostname.example.com -uadmin -p<PASSWORD>

      Actual results:
      Destructive failure

      Expected results:
      Graceful failure, with a simple way to revert the changes and get a working Satellite with the original hostname.
      I would also like a --local-only switch that would ignore the fact that I am using an FQDN that nobody will be able to reach me with.
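
To illustrate what "a simple way to revert" could look like for the file edits in the log (such as /etc/hosts and the hammer configuration), here is a minimal Python sketch that snapshots files and restores them if a later step fails. `restore_on_failure` is a hypothetical name, and the sketch covers plain files only; a real rollback would also have to deal with the removed certificates, the removed cert RPMs and last_scenario.yaml, which this does not attempt.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def restore_on_failure(paths):
    """Snapshot existing files; copy them back if the wrapped block raises."""
    backup_dir = Path(tempfile.mkdtemp(prefix="hostname-change-"))
    saved = {}
    for index, path in enumerate(map(Path, paths)):
        if path.exists():
            copy = backup_dir / f"{index}-{path.name}"
            shutil.copy2(path, copy)
            saved[path] = copy
    try:
        yield backup_dir
    except BaseException:
        # A step failed: put every snapshotted file back, then re-raise.
        # Files created inside the block are not removed by this sketch.
        for original, copy in saved.items():
            shutil.copy2(copy, original)
        raise
    finally:
        shutil.rmtree(backup_dir, ignore_errors=True)
```

Wrapping the file-changing steps in `with restore_on_failure([...]):` would leave those files as they were whenever the installer run inside the block fails.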

      Additional info:
      See also bug 1861831, which also shows the inability to re-run satellite-change-hostname after a failure; that bug has been fixed, but re-running is effectively still not possible.
      Setting severity to medium: although this bricks a Satellite, it follows a previous mistake by the user in a not-so-common workflow, and it is probably recoverable.
