Sub-task
Resolution: Done
Sprint 127
+++ This bug was initially created as a clone of Bug #2240956 +++
Description of problem:
From a customer case with failing Remote execution that started after upgrade to 6.13. Troubleshooting found that the host's Content source gets set to null when re-registering, which left Remote execution failing with the error message:
Failed to initialize: RuntimeError - Could not use any Capsule for the ["SSH", "Script"] job. Consider configuring remote_execution_global_proxy, remote_execution_fallback_proxy in settings
Due to network segmentation, anything other than the following settings will fail for them:
Fallback to Any Capsule: Yes
Enable Global Capsule: No
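For reference, a minimal sketch of enforcing those two values from the CLI, assuming the setting names shown later in this report (remote_execution_fallback_proxy, remote_execution_global_proxy):
# Fallback to Any Capsule = Yes, Enable Global Capsule = No
hammer settings set --name remote_execution_fallback_proxy --value true
hammer settings set --name remote_execution_global_proxy --value false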
Version-Release number of selected component (if applicable):
[root@jb-rhel8-sat613 ~]# rpm -qa | grep satellite
satellite-installer-6.13.0.7-1.el8sat.noarch
satellite-common-6.13.4-1.el8sat.noarch
ansible-collection-redhat-satellite-3.9.0-2.el8sat.noarch
rubygem-foreman_theme_satellite-11.0.0.5-1.el8sat.noarch
satellite-6.13.4-1.el8sat.noarch
ansible-collection-redhat-satellite_operations-1.3.0-2.el8sat.noarch
satellite-maintain-0.0.1-1.el8sat.noarch
satellite-cli-6.13.4-1.el8sat.noarch
How reproducible:
Always
Steps to Reproduce:
Satellite server: jb-rhel8-sat613.jb.lab
Client: jb-rhel8-test01.jb.lab
Capsules: jb-rhel8-sat613-caps01.jb.lab, jb-rhel8-sat613-caps02.jb.lab
Load balancer: jb-haproxy01.jb.lab
1. Setup satellite and capsules with load balancing according to latest documentation.
2. Registering or re-registering a host with --serverurl=<load balancer address or capsule IP address> sets the Content source to null.
[root@jb-rhel8-test01 ~]# subscription-manager register --serverurl=https://jb-haproxy01.jb.lab --force
3. Registering or re-registering a host with --serverurl=<capsule hostname> sets the correct Content source.
[root@jb-rhel8-test01 ~]# subscription-manager register --serverurl=https://jb-rhel8-sat613-caps02.jb.lab --force
Actual results:
[root@jb-rhel8-sat613 ~]# hammer host info --name jb-rhel8-test01.jb.lab | grep -A2 'Content Source'
Content Source:
Id:
Name:
Expected results:
[root@jb-rhel8-sat613 ~]# hammer host info --name jb-rhel8-test01.jb.lab | grep -A2 'Content Source'
Content Source:
Id: 2
Name: jb-rhel8-sat613-caps01.jb.lab
Additional info:
Debug logs attached
When reinstalling katello-ca-consumer-latest.rpm and registering with no --serverurl, the content source is properly set from the value in the package.
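For illustration, a sketch of that workaround path (the capsule /pub URL, organization, and activation key are placeholders, not taken from the case):
# install the consumer configuration RPM published by the capsule, then register without --serverurl
rpm -Uvh http://jb-rhel8-sat613-caps01.jb.lab/pub/katello-ca-consumer-latest.noarch.rpm
subscription-manager register --org="<org>" --activationkey="<activation key>" --force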
— Additional comment from on 2023-09-27T10:40:27Z
Created attachment 1990798
Debug when registering directly to capsule
— Additional comment from on 2023-09-27T10:40:58Z
Created attachment 1990799
Debug when registering through load balancer
— Additional comment from on 2023-10-03T11:35:53Z
Are you also getting this error when re-registering the host using the global registration?
— Additional comment from on 2023-10-03T12:06:49Z
@nalfassi Yes, the behavior is the same when re-registering with global registration.
— Additional comment from on 2023-10-30T11:29:29Z
Bulk setting Target Milestone = 6.15.0 where sat-6.15.0+ is set.
— Additional comment from on 2023-11-10T10:27:36Z
Customer reached out asking about an update.
What can I tell them to set the right expectations?
Expected to be resolved in 6.14, 6.15?
— Additional comment from on 2023-11-13T13:30:12Z
This looks like a problem in Katello's RHSM endpoints and its handling of registration data,
but I might be wrong; sorry in advance if I moved it to the wrong category.
— Additional comment from on 2023-11-16T17:50:29Z
Created redmine issue https://projects.theforeman.org/issues/36928 from this bug
— Additional comment from on 2023-11-16T18:21:23Z
Turns out the issue is not the null content source; instead it's the fact that the host's 'registered_through' value doesn't match any known capsule name, so Remote Execution doesn't know what capsule to use for the host. I'll update the title a bit.
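A minimal way to inspect the stored value, assuming psql access on the Satellite (same tables as the fuller query further down in this report):
su - postgres -c 'psql foreman -c "select h.name, k.registered_through from hosts h left join katello_subscription_facets k on k.host_id = h.id;"'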
— Additional comment from on 2023-12-07T16:35:56Z
Hi @jbjornel@redhat.com
Are you able to try the upstream patch with the customer and see if it resolves the issue?
https://github.com/Katello/katello/pull/10804
(You will need to re-register the affected hosts after applying the patch.)
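A rough sketch of applying that PR for testing (the katello gem path is an assumption; adjust for the installed version):
# apply the PR diff inside the katello gem directory, then restart services and re-register the hosts
cd /usr/share/gems/gems/katello-*/
curl -sSL https://patch-diff.githubusercontent.com/raw/Katello/katello/pull/10804.diff | patch -p1
satellite-maintain service restart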
— Additional comment from on 2023-12-08T07:54:39Z
I tested this with a 6.15 LB setup
Sat: vm205-223.uatlab.pnq2.redhat.com
Cap1: vm207-129.uatlab.pnq2.redhat.com
Cap2: vm205-222.uatlab.pnq2.redhat.com
haproxy/lb: vm205-217.uatlab.pnq2.redhat.com
rh7-client: dhcp130-248.gsslab.pnq2.redhat.com
rh8-client: dhcp130-245.gsslab.pnq2.redhat.com
Steps:
- Installed the whole setup as expected
- Configured the capsules in this way afterward:
On each capsule:
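# registration and template traffic is pointed at the load balancer FQDN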
satellite-installer --scenario capsule \
--enable-foreman-proxy-plugin-ansible \
--foreman-proxy-registration true \
--foreman-proxy-templates true \
--foreman-proxy-registration-url "https://vm205-217.uatlab.pnq2.redhat.com:9090" \
--foreman-proxy-template-url "https://vm205-217.uatlab.pnq2.redhat.com:8000"
On satellite:
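# refresh features on every capsule so the Registration/Templates changes are picked up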
for i in `hammer --csv --no-headers capsule list --fields Id`; do hammer capsule refresh-features --id $i; sleep 3; done
- Synced satellite-client-6 repo on satellite and capsules
- Created an AK
- Generated a registration command from the Satellite by selecting an external capsule [it is generated with the LB FQDN, as expected]:
curl -sS --insecure 'https://vm205-217.uatlab.pnq2.redhat.com:9090/register?activation_keys=TEST_AK&force=true&location_id=2&organization_id=1&setup_insights=false&update_packages=false' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0LCJpYXQiOjE3MDIwMTkxMjAsImp0aSI6IjI5ZGRhMGQ0MmM1YzEyMTdjNTZiNzRkNzdmZmI0NjM1OWE4MDIyNmNmMmY3ZWE3Zjc4ODc0NGVjYjhjNDYwN2MiLCJzY29wZSI6InJlZ2lzdHJhdGlvbiNnbG9iYWwgcmVnaXN0cmF0aW9uI2hvc3QifQ.aWL7x0OfWpjmMF0NBSGdw0xF7cjEJMwMKLdOtwGjc4s' | bash
- Registered both client systems with the Satellite. This is the status from the Satellite with respect to content source and registered_through capsules (the two clients are RECORD 3 and 4 below):
- su - postgres -c 'psql -x foreman -c "select h.id,h.name,o.title,k.registered_at,k.last_checkin,k.registered_through from hosts as h left join katello_subscription_facets k on k.host_id = h.id left join operatingsystems o on o.id = h.operatingsystem_id order by last_checkin;"'
[ RECORD 1 ]----+----------------------------------
id | 3
name | vm205-222.uatlab.pnq2.redhat.com
title | RHEL 8.9
registered_at | 2023-12-07 18:44:30
last_checkin | 2023-12-08 06:53:14.651602
registered_through | vm205-223.uatlab.pnq2.redhat.com
[ RECORD 2 ]----+----------------------------------
id | 2
name | vm207-129.uatlab.pnq2.redhat.com
title | RHEL 8.9
registered_at | 2023-12-07 18:36:54
last_checkin | 2023-12-08 06:59:52.594299
registered_through | vm205-223.uatlab.pnq2.redhat.com
[ RECORD 3 ]----+----------------------------------
id | 4
name | dhcp130-248.gsslab.pnq2.redhat.com
title | RedHat 7.9
registered_at | 2023-12-08 07:08:03
last_checkin | 2023-12-08 07:08:09.112907
registered_through | vm205-217.uatlab.pnq2.redhat.com
[ RECORD 4 ]----+----------------------------------
id | 5
name | dhcp130-245.gsslab.pnq2.redhat.com
title | RedHat 8.8
registered_at | 2023-12-08 07:08:09
last_checkin | 2023-12-08 07:08:16.028888
registered_through | vm205-217.uatlab.pnq2.redhat.com
[ RECORD 5 ]----+----------------------------------
id | 1
name | vm205-223.uatlab.pnq2.redhat.com
title | RHEL 8.9
registered_at |
last_checkin |
registered_through |
[root@vm205-223 ~]# hammer host info --name dhcp130-248.gsslab.pnq2.redhat.com --fields 'Content information/content source/name'
Content Information:
Content Source:
Name:
[root@vm205-223 ~]# hammer host info --name dhcp130-245.gsslab.pnq2.redhat.com --fields 'Content information/content source/name'
Content Information:
Content Source:
Name:
- In Satellite Settings:
- hammer --csv --csv-separator="|" --no-headers settings list | egrep "global_proxy|fallback|registered_through" | column -s"|" -t
remote_execution_fallback_proxy Fallback to Any Proxy true Search the host for any proxy with Remote Execution, useful when the host has no subnet or the subnet does not have an execution proxy
remote_execution_global_proxy Enable Global Proxy false Search for remote execution proxy outside of the proxies assigned to the host. The search will be limited to the host's organization and location.
remote_execution_prefer_registered_through_proxy Prefer registered through proxy for remote execution true Prefer using a proxy to which a host is registered when using remote execution
Before using the patch:
Remote execution as well as Ansible role execution fails on both systems with the error:
Could not use any Capsule for the ["SSH", "Script"] job. Consider configuring remote_execution_global_proxy, remote_execution_fallback_proxy in settings (RuntimeError)
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/run_host_job.rb:259:in `determine_proxy!'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/run_host_job.rb:50:in `inner_plan'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/run_host_job.rb:24:in `block in plan'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/template_invocation_progress_logging.rb:20:in `block in with_template_invocation_error_logging'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/template_invocation_progress_logging.rb:20:in `catch'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/template_invocation_progress_logging.rb:20:in `with_template_invocation_error_logging'
/usr/share/gems/gems/foreman_remote_execution-11.1.1/app/lib/actions/remote_execution/run_host_job.rb:23:in `plan'
/usr/share/gems/gems/dynflow-1.8.2/lib/dynflow/action.rb:532:in `block (3 levels) in execute_plan'
After trimming the patch (dropping the changes related to test/models/concerns/host_managed_extensions_test.rb) and applying it on the Satellite, followed by a restart of services:
- Content Source is still blank [as expected]; I haven't re-registered the systems yet.
- Surprisingly, REX works fine on both systems, and so do the Ansible roles. There are no errors reported anywhere.
- When both capsules are active, haproxy uses capsule1 to run the job on the RHEL 7 client and capsule2 on the RHEL 8 client.
- If I stop httpd and foreman-proxy on capsule1, haproxy uses capsule2 to run the job on both systems.
- If I stop the same services on capsule2 and bring them back up on capsule1, haproxy uses capsule1 instead for both systems.
The Content Source is still blank for the hosts, but I don't think that is causing any problems at all, so there is no need to re-register any systems.
– here the main BZ testing ends –
Now, just to check the content-source part, I re-registered the clients with the exact same curl command [while the Satellite is in the patched state].
registered_through looks good:
[ RECORD 3 ]----+----------------------------------
id | 5
name | dhcp130-245.gsslab.pnq2.redhat.com
title | RedHat 8.8
registered_at | 2023-12-08 07:49:41
last_checkin | 2023-12-08 07:49:47.587887
registered_through | vm205-217.uatlab.pnq2.redhat.com
[ RECORD 4 ]----+----------------------------------
id | 4
name | dhcp130-248.gsslab.pnq2.redhat.com
title | RedHat 7.9
registered_at | 2023-12-08 07:49:42
last_checkin | 2023-12-08 07:49:48.794278
registered_through | vm205-217.uatlab.pnq2.redhat.com
Content source has been populated with one of the capsules:
- hammer host info --name dhcp130-248.gsslab.pnq2.redhat.com --fields 'Content information/content source/name'
Content Information:
Content Source:
Name: vm205-222.uatlab.pnq2.redhat.com
- hammer host info --name dhcp130-245.gsslab.pnq2.redhat.com --fields 'Content information/content source/name'
Content Information:
Content Source:
Name: vm205-222.uatlab.pnq2.redhat.com
Most importantly, REX and Ansible roles continue to work fine via LB + capsules, just as they did before re-registration.
Conclusion: The patch works as expected. Re-registration of client hosts is only needed if someone is worried about the content source; it makes no difference in the outcome.
— Additional comment from on 2023-12-18T22:00:56Z
Hi team,
I'm not able to see this patch (https://patch-diff.githubusercontent.com/raw/Katello/katello/pull/10804.diff) included in snap_6.15.0_3.0.
— Additional comment from on 2023-12-18T22:32:09Z
Actually, the katello version shipped in snap_6.15.0_3.0 is katello-4.11.0-0.2.rc2.el8sat.noarch but the Foreman issue is marked for "Katello 4.12.0".
— Additional comment from on 2023-12-19T14:15:44Z
Moved by mistake, sorry about that.
QE Tracker for https://issues.redhat.com/browse/SAT-22356
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2257331
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2240956