> Compare the commands run in ffu from osp16 to 17 between sources: automation_code_16_17_ffu ffu_docs
  Remember: the comparison is only on the basis of ffu from version 16 to version 17
  Goal: check for discrepancies between automation_code_16_17_ffu and ffu_docs
  for any discrepancy you find mark it from 1-3 where 1 is a little difference and 3 a major difference, use ultrathink for this. Use ultrathink for this task.

∴ Thinking…

The user wants me to perform a comprehensive comparison between automation_code_16_17_ffu and ffu_docs for FFU from OSP 16.2 to 17.1. They want me to use "ultrathink" for this analysis.

Now that I understand the workarounds file is a critical part of the automation, I need to re-analyze the discrepancies with this understanding.

Let me systematically compare commands across all phases:

1. Undercloud Repository Setup
2. Undercloud Content Switch
3. Undercloud FFU Upgrade
4. Container Image Preparation
5. Overcloud Adoption
6. Overcloud FFU Prepare
7. Ceph Upgrade to v5 and cephadm adoption
8. Overcloud OpenStack Upgrade
9. Undercloud OS Upgrade (RHEL 8.4→9.2)
10. Control Plane OS Upgrade
11. Compute OS Upgrade
12. Post-upgrade tasks

I have all the files read. Let me do a detailed phase-by-phase comparison.

PHASE 1: Repository Configuration

ffu_docs (ffu_docs_part_00.txt:1438-1974):

    LeappRepoInitCommand: |
      subscription-manager repos --disable=*
      subscription-manager repos \
        --enable rhel-8-for-x86_64-baseos-aus-rpms \
        --enable rhel-8-for-x86_64-appstream-aus-rpms \
        --enable openstack-17.1-for-rhel-8-x86_64-rpms
      subscription-manager release --set=8.4

automation_code (16.2-17.1_upgrade_workarounds.yaml:132-141):

    LeappRepoInitCommand: |
      rhos-release -x
      rhos-release 17.1 -r $RHEL_VERSION -p $FFU_EL8_PRODUCT_BUILD

I notice a key difference in repository configuration methods. The documentation relies on subscription-manager, while the automation code uses the internal Red Hat tool rhos-release. This variance could significantly impact the upgrade process. The repository setup commands differ in their approach to enabling repositories and managing system release, which might require careful reconciliation during the upgrade.

PHASE 2: Undercloud Content Switch

The automation code reveals a complex infrared command for undercloud content switching, which is entirely absent from the documentation. This omission represents a substantial procedural gap. The command suggests multiple critical upgrade stages including repository discovery, version identification, and container preparation.

PHASE 3: Undercloud FFU Upgrade

I observe another significant documentation discrepancy. The automation script includes a specific undercloud Fast Forward Upgrade (FFU) process not clearly articulated in the reference documentation. The upgrade command indicates a multi-release transition path, incorporating workaround mechanisms to address potential compatibility challenges.

PHASE 4: Container Image Preparation

The container image preparation step appears to involve generating environment-specific configuration for image handling, with nuanced parameters for local destination and output file generation.

Automation scripts suggest advanced container preparation techniques, including multi-RHEL support and role-specific image management strategies. The approach demonstrates sophisticated container image management across different operating system environments, with targeted exclusions and namespace configurations.

Executing overcloud network and node provisioning through OpenStack CLI, using dynamic stack-specific outputs and debug flags for comprehensive deployment tracking.

The upgrade preparation involves specifying deployment files, enabling FFU upgrade mode, and defining target release progression from 16.2 to 17.1. The command includes complex configuration parameters like floating IP checks, workload specifications, and stack management.

Specific workarounds are applied through external YAML files, with strategic tag skipping to control upgrade execution granularity. Ansible arguments carefully exclude certain upgrade script generation and execution phases, suggesting a precise, controlled upgrade approach.

The Ceph upgrade process involves two primary stages: package updates across infrastructure nodes and cephadm adoption, utilizing targeted tag filtering to manage upgrade complexity.

I'll run infrared with specific configuration parameters, carefully skipping unnecessary tags and incorporating custom environment files to streamline the upgrade workflow. The strategy involves precise control over upgrade stages, ensuring minimal disruption during the transition.

The two-phase approach allows for granular management of Ceph adoption and OpenStack upgrade processes, with targeted environment file handling and selective tag skipping.

Automation uses a strategic two-stage upgrade method: first targeting OVN components with specific release parameters, then executing a comprehensive upgrade with carefully selected tag exclusions. The fast-and-furious mode enables accelerated transition between OpenStack versions while maintaining precise control over upgrade stages.

The undercloud OS upgrade involves executing a system upgrade using a predefined configuration file, with infrared facilitating the process through targeted parameters and workaround accommodations.

After the upgrade, critical post-configuration steps include reinstalling the VBMC Python package, restarting the virtualbmc service, and systematically removing maintenance mode from all baremetal nodes to ensure system readiness.

The control plane upgrade will proceed incrementally, targeting one-third of controllers at a time using specific upgrade tags and stack configurations.

I'll run the compute OS upgrade using targeted system upgrade tags, focusing specifically on compute nodes.
The command limits the upgrade to precise compute nodes, ensuring a controlled and methodical upgrade process. The upgrade leverages system-specific tags to execute the targeted node upgrades, carefully managing the transition between operating system versions.

I notice key variations in the upgrade automation that require careful tracking. The libvirt_hack.yaml file and multi-RHEL flag suggest a nuanced upgrade strategy with specific configuration adjustments. These elements indicate a non-standard upgrade path requiring targeted intervention.

● Comprehensive FFU 16.2→17.1 Discrepancy Analysis 01:24 AM claude-sonnet-4-5-20250929

I'll perform a systematic phase-by-phase comparison between automation_code_16_17_ffu and ffu_docs, rating each discrepancy on severity (1=minor, 2=moderate, 3=major).

---

DISCREPANCY #1: Repository Management Tool

Phase: Initial setup and all repository operations
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_00.txt:1663-1667):

    LeappRepoInitCommand: |
      subscription-manager repos --disable=*
      subscription-manager repos \
        --enable rhel-8-for-x86_64-baseos-aus-rpms \
        --enable rhel-8-for-x86_64-appstream-aus-rpms \
        --enable openstack-17.1-for-rhel-8-x86_64-rpms
      subscription-manager release --set=8.4

automation_code (16.2-17.1_upgrade_workarounds.yaml:132-141):

    LeappRepoInitCommand: |
      rhos-release -x
      rhos-release 17.1 -r $RHEL_VERSION -p $FFU_EL8_PRODUCT_BUILD

Analysis: Fundamentally different repository management:
- Docs: Uses subscription-manager (standard RHEL method for customers)
- Automation: Uses rhos-release (internal Red Hat CI/CD tool for puddle-based deployments)
- Impact: Users cannot use automation scripts without rhos-release tool and puddle access
- Repository types differ: AUS (Advanced Update Support) vs internal puddles

---

DISCREPANCY #2: Execution Framework

Phase: All phases
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs:

    # Direct CLI execution
    openstack undercloud upgrade ...
    openstack overcloud upgrade prepare ...
    openstack overcloud upgrade run ...

automation_code:

    # InfraRed orchestration wrapper
    infrared tripleo-undercloud ...
    infrared tripleo-upgrade ...
    infrared ssh undercloud '...'

Analysis:
- Documentation assumes direct undercloud access with OpenStack CLI
- Automation uses InfraRed framework managing SSH connections, inventory, and variable substitution
- Architectural incompatibility: Cannot execute automation without InfraRed infrastructure

---

DISCREPANCY #3: Workarounds File Auto-Generation System

Phase: All phases (pre/post hooks)
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs:
- No mention of automated workaround system
- Manual file creation for system_upgrade.yaml, overcloud-params.yaml, etc.

automation_code (16.2-17.1_upgrade_workarounds.yaml:1-546):

    # 546 lines of automation providing:
    pre_ffu_undercloud_upgrade_workarounds:
      - container_image_prepare:   # Auto-generates multi-RHEL containers
      - system_upgrade:            # Auto-creates system_upgrade.yaml
      - libvirt_hack:              # Auto-creates libvirt_hack.yaml
      - heat_params_env_file:      # Auto-creates overcloud-params.yaml
    post_ffu_undercloud_upgrade_workarounds:
      - fix_ctlplane_virtual_ip
      - prepare_stack_export_for_tripleo_upgrade
      - ipa_workarounds_rhbz_2121387
    pre_ffu_overcloud_upgrade_prepare_workarounds:
      - containers_hack:           # Multi-RHEL container preparation
      - roles_modification:        # Deprecated service removal
      - heat_params_env_file:      # UpgradeInitCommand injection
    post_ffu_overcloud_upgrade_prepare_workarounds:
      - ensure_stack_export_is_correct
      - ovn_hack:                  # OVN connection reconfiguration for cells

Analysis:
- 546 lines of parallel automation performing tasks docs show as manual
- Files documentation says to create manually are auto-generated by workarounds
- Version-specific bug fixes and environment adjustments applied automatically
- Critical gap: Following docs alone won't achieve same environment state as automation

---

DISCREPANCY #4: Undercloud Content Switch Stage

Phase: Between repository setup and undercloud upgrade
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs:
- No mention of content switch as separate phase

automation_code (ospdupgrade.lib.sh:414-438):

    stage_ospdupgrade_undercloud_content_switch() {
        infrared tripleo-undercloud \
            --upgrade yes \
            --mirror $(guess_mirror) \
            --build ${target_puddle_compose} \
            --version ${target_update_release} \
            -e container_env=@/tmp/container_image_prepare.yaml \
            --registry-skip-puddle true \
            --ansible-args="tags=discover_python,upgrade_repos,undercloud_version_discovery,undercloud_containers,export"
    }

Analysis:
- Entire phase missing from documentation
- This stage:
  - Switches repositories from RHEL 8 to RHEL 9 content
  - Discovers Python interpreter for new version
  - Updates container references
  - Exports configuration for next phase
- Critical for multi-RHEL transition but completely undocumented

---

DISCREPANCY #5: FFU Releases Parameter Format

Phase: Undercloud and Overcloud FFU upgrades
Severity: ⭐⭐ (2 - Moderate)

ffu_docs:
- Not explicitly shown in command examples

automation_code (ospdupgrade.lib.sh:179-181):

    JP_OSPDUPGRADE_FFU_RELEASES="16.2,-,17.1"
    JP_OS_OSPDUPGRADE_FFU_RELEASES="17.1,-,17.1"  # Used for OS upgrades
    # Applied as:
    --undercloud-ffu-releases '16.2,-,17.1'
    --overcloud-ffu-releases '16.2,-,17.1'

Analysis:
- Middle dash ("-") indicates skipping intermediate minor releases
- Format: "source,-,target" where dash skips 16.2.x intermediate updates
- During OS upgrades, uses "17.1,-,17.1" (staying on same release but upgrading OS)
- Not documented but critical for upgrade path determination

---

DISCREPANCY #6: Multi-RHEL Container Preparation

Phase: Container preparation (multiple times during upgrade)
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_01.txt:58-60):

    openstack tripleo container image prepare default \
        --local-push-destination \
        --output-env-file containers-prepare-parameter.yaml

automation_code (16.2-17.1_upgrade_workarounds.yaml:44-113):

    # Stage-aware container preparation
    if $(test -f ${OVERCLOUD_UPGRADE_PREPARE_SCRIPT} && grep -q libvirt_hack ${SCRIPT}); then
        # Compute upgrade stage - use MULTI_RHEL_ROLES
        /usr/share/.../multi-rhel-container-image-prepare.py \
            ${MULTI_RHEL_ROLES} \
            --enable-multi-rhel \
            --excludes collectd --excludes nova-libvirt \
            --minor-override "{${EL8_TAGS}${NAMESPACE_EL8}${CEPH_TAGS}...}" \
            --major-override "{${EL9_TAGS}${NAMESPACE_EL9}${CEPH_TAGS}...}"
    elif $(test -f ${SCRIPT} && grep -q system_upgrade ${SCRIPT}); then
        # Control plane upgrade - use only COMPUTE_ROLES
        /usr/share/.../multi-rhel-container-image-prepare.py \
            ${COMPUTE_ROLES} \
            --enable-multi-rhel ...
    else
        # Initial preparation - use both
        /usr/share/.../multi-rhel-container-image-prepare.py \
            ${COMPUTE_ROLES} ${CTL_ROLES} \
            --enable-multi-rhel ...
    fi

Analysis:
- Three-stage intelligent container preparation vs simple single command
- Automation detects upgrade phase by inspecting script content
- Manages separate EL8 and EL9 image namespaces, tags, and registries
- Handles role-specific container assignments
- Far more complex than documentation suggests

---

DISCREPANCY #7: Overcloud Adoption File Locations

Phase: Overcloud adoption
Severity: ⭐ (1 - Minor)

ffu_docs (ffu_docs_part_01.txt:1106-1157):

    openstack overcloud network provision \
        --output generated-networks-deployed.yaml network_data.yaml
    openstack overcloud network vip provision \
        --output generated-vip-deployed.yaml vips_data.yaml
    openstack overcloud node provision \
        --output baremetal-deployment.yaml baremetal_deployment.yaml

automation_code (ospdupgrade.lib.sh:511-523):

    infrared ssh undercloud "
        mkdir -p tmp
        openstack overcloud network provision --debug \
            --output ~/tmp/${stack_name}-generated-networks-deployed.yaml \
            tripleo-${stack_name}-network-data.yaml
        openstack overcloud network vip provision --debug \
            --stack ${stack_name} \
            --output ~/tmp/${stack_name}-generated-vip-deployed.yaml \
            tripleo-${stack_name}-virtual-ips.yaml
        openstack overcloud node provision --debug \
            --stack ${stack_name} \
            --output ~/tmp/${stack_name}-baremetal_deployment.yaml \
            tripleo-${stack_name}-baremetal-deployment.yaml
    "

Analysis:
- Same OpenStack commands, different execution method (InfraRed SSH wrapper)
- Output files in ~/tmp/ with stack-prefixed names vs current directory
- Adds --debug flag for verbose logging
- Includes mkdir -p tmp for directory preparation
- Functionally equivalent, structurally different

---

DISCREPANCY #8: Ceph Environment File Removal

Phase: Ceph cephadm adoption
Severity: ⭐⭐ (2 - Moderate)

ffu_docs (ffu_docs_part_02.txt:74-108):

    # Modify overcloud_upgrade_prepare.sh to replace ceph-ansible with cephadm
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml
    # Replace with:
    -e # cephadm.yaml or cephadm-rbd-only.yaml

automation_code (ospdupgrade.lib.sh:114, 553-564):

    # Comprehensive environment file removal list
    overcloud_ffu_remove_env_files="/usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml,\
    /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml,\
    /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-mds.yaml,\
    /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/manila-cephfsganesha-config.yaml,\
    /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-dashboard.yaml"

    infrared tripleo-upgrade \
        --overcloud-ffu-remove-env-files "${overcloud_ffu_remove_env_files}" \
        -e upgrade_prepare_extra_params="${adoption_files},/usr/share/.../cephadm-rbd-only.yaml" \
        -e "ceph_upgrade_skip_tags=ceph_health,opendev-validation,ceph_ansible_remote_tmp"

Analysis:
- Documentation shows manual script editing to replace one file
- Automation uses explicit --overcloud-ffu-remove-env-files parameter with comprehensive list
- Automation also adds skip-tags for Ceph health checks
- More thorough and automated cleanup

---

DISCREPANCY #9: Ceph Adoption Two-Phase Process

Phase: Ceph upgrade to v5 and cephadm adoption
Severity:
⭐⭐ (2 - Moderate)

ffu_docs (ffu_docs_part_02.txt:2034-2037):

    openstack overcloud external-upgrade run \
        --skip-tags ceph_ansible_remote_tmp \
        --stack \
        --tags cephadm_adopt

automation_code (ospdupgrade.lib.sh:550-578):

    # Phase 1: Run Ceph adoption
    infrared tripleo-upgrade \
        --overcloud-ffu-upgrade yes \
        --overcloud-ffu-releases "16.2,-,17.1" \
        --ansible-args="skip-tags=create_ffu_scripts,ffu_overcloud_prepare,ffu_overcloud_run,ffu_overcloud_upgrade_role,ffu_overcloud_converge,ffu_overcloud_post" \
        --overcloud-ffu-remove-env-files "${overcloud_ffu_remove_env_files}" \
        -e upgrade_prepare_extra_params="${adoption_files},cephadm-rbd-only.yaml" \
        -e "ceph_upgrade_skip_tags=ceph_health,opendev-validation,ceph_ansible_remote_tmp"

    # Phase 2: Re-run upgrade prepare after Ceph adoption
    infrared tripleo-upgrade \
        --overcloud-ffu-upgrade yes \
        --overcloud-ffu-releases "16.2,-,17.1" \
        --ansible-args="skip-tags=create_ffu_scripts,ffu_overcloud_run,ffu_overcloud_upgrade_role,ffu_overcloud_ceph,ffu_overcloud_converge,ffu_overcloud_post" \
        --overcloud-ffu-remove-env-files "${overcloud_ffu_remove_env_files}" \
        -e upgrade_prepare_extra_params="${adoption_files},cephadm-rbd-only.yaml"

Analysis:
- Documentation: Single cephadm adoption command
- Automation: Two-phase approach
  a. Run Ceph upgrade with specific skip-tags
  b. Re-run upgrade prepare to update stack with new Ceph configuration
- More robust but more complex

---

DISCREPANCY #10: Overcloud OpenStack Upgrade Strategy

Phase: Overcloud service upgrades
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_02.txt:486-487):

    openstack overcloud upgrade run --yes \
        --stack \
        --limit allovercloud,undercloud \
        --playbook all

automation_code (ospdupgrade.lib.sh:583-623):

    # Phase 1: OVN upgrade first
    infrared tripleo-upgrade \
        --overcloud-ffu-upgrade yes \
        --overcloud-ffu-releases "16.2,-,17.1" \
        -e tripleo_upgrade_run_options='--tags setup_packages,ovn' \
        --fast-and-furious true \
        --ansible-args="skip-tags=create_ffu_scripts,ffu_overcloud_prepare,ffu_overcloud_upgrade_compute,ffu_overcloud_ceph,ffu_overcloud_converge,ffu_overcloud_post,ffu_overcloud_system_upgrade"

    # Phase 2: Full upgrade run
    infrared tripleo-upgrade \
        --overcloud-ffu-upgrade yes \
        --overcloud-ffu-releases "16.2,-,17.1" \
        --fast-and-furious true \
        --ansible-args="${same_skip_tags}"

Analysis:
- Documentation: Single-phase upgrade with --playbook all
- Automation: Two-phase approach
  a. OVN upgrade first: Handles OVN database migration to avoid data plane outage
  b. Full service upgrade: All remaining services
- Fast-and-furious mode: Causes workload outage but upgrades all nodes simultaneously (vs rolling)
- Major execution strategy difference - automation prioritizes speed, accepts outage

---

DISCREPANCY #11: Undercloud OS Upgrade Command

Phase: Undercloud RHEL 8.4 → 9.2
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_03.txt:716-717):

    openstack undercloud upgrade --yes \
        --system-upgrade /home/stack/system_upgrade.yaml

automation_code (ospdupgrade.lib.sh:627-642):

    infrared tripleo-upgrade \
        --undercloud-ffu-os-upgrade yes \
        --upgrade-ffu-workarounds yes -e @workarounds.yaml \
        --overcloud-stack ${stack_name} \
        --skiplist-validations repos \
        -e leapp_unsubscribed=True \
        -e leapp_skip_release_check=True

    # Post-upgrade VBMC maintenance (not in docs)
    infrared ssh undercloud '
        if [[ -f ~/.virtualenvs/vbmc/bin/python ]]; then
            ~/.virtualenvs/vbmc/bin/python -m pip install virtualbmc
            sudo systemctl restart virtualbmc
        fi
        source /home/stack/stackrc
        for n in $(openstack baremetal node list -c Name -f value); do
            openstack baremetal node maintenance unset $n
        done
    '

Analysis:
- Completely different commands
  - Docs: openstack undercloud upgrade --system-upgrade
  - Automation: infrared tripleo-upgrade --undercloud-ffu-os-upgrade
- Automation adds:
  - Leapp bypass parameters (unsubscribed, skip_release_check)
  - Validation skiplist
  - VBMC post-upgrade fix (reinstall virtualbmc, restart service, clear node maintenance)
- VBMC maintenance is critical - nodes go into MAINTENANCE state after Python package updates
- Major command and procedure difference

---

DISCREPANCY #12: Control Plane OS Upgrade Container Preparation

Phase: Control plane RHEL 8.4 → 9.2
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_03.txt:792-807):

    # Documentation shows running multi-rhel script WITHOUT CONTROL_PLANE_ROLES
    python3 /usr/share/.../multi-rhel-container-image-prepare.py \
        ${COMPUTE_ROLES} \
        --enable-multi-rhel \
        --excludes collectd --excludes nova-libvirt \
        --minor-override "..." \
        --major-override "..." \
        --output-env-file /home/stack/containers-prepare-parameter.yaml

automation_code (16.2-17.1_upgrade_workarounds.yaml:84-94):

    # Auto-executed in workarounds based on script detection
    elif $(test -f ${OVERCLOUD_UPGRADE_PREPARE_SCRIPT} && grep -q system_upgrade ${SCRIPT}); then
        # Control plane upgrade stage
        # Only Compute roles are passed; CTL_ROLES is excluded
        /usr/share/.../multi-rhel-container-image-prepare.py \
            ${COMPUTE_ROLES} \
            --enable-multi-rhel \
            --excludes collectd --excludes nova-libvirt \
            --minor-override "{${EL8_TAGS}${NAMESPACE_EL8}${CEPH_TAGS}${NEUTRON_DRIVER}...}" \
            --major-override "{${EL9_TAGS}${NAMESPACE_EL9}${CEPH_TAGS}${NEUTRON_DRIVER}...}" \
            --output-env-file ~/containers-prepare-parameter.yaml --local-push-destination
    fi

Analysis:
- Documentation: Shows manual execution
- Automation: Auto-triggered by workaround system when system_upgrade detected in script
- Both use same logic (exclude control plane roles when upgrading control plane)
- Key difference: Automation runs automatically via workarounds, docs show manual command
- Automation also sets SkipRhelEnforcement: false parameter (not shown in docs)

---

DISCREPANCY #13: FFU Releases Parameter During OS Upgrades

Phase: Control plane and Compute OS upgrades
Severity: ⭐⭐ (2 - Moderate)

ffu_docs:
- Not explicitly shown

automation_code (ospdupgrade.lib.sh:655, 673):

    # Control plane OS upgrade
    --overcloud-ffu-releases "17.1,-,17.1"  # Source=Target (staying on 17.1, upgrading OS)
    # Compute OS upgrade
    --overcloud-ffu-releases "17.1,-,17.1"  # Same

Analysis:
- Different from initial FFU upgrade which uses "16.2,-,17.1"
- During OS upgrades, source and target are same (17.1) because:
  - Already upgraded OpenStack from 16.2 to 17.1
  - Now just upgrading operating system from RHEL 8 to RHEL 9
  - But staying on OSP 17.1
- Not documented but affects upgrade logic

---

DISCREPANCY #14: Compute OS Upgrade Libvirt Hack

Phase: Compute nodes RHEL 8.4 → 9.2
Severity: ⭐⭐ (2 -
Moderate)

ffu_docs (ffu_docs_part_03.txt:1003-1007):

    # Include system_upgrade.yaml
    -e /home/stack/system_upgrade.yaml

automation_code (ospdupgrade.lib.sh:668, 681):

    -e upgrade_prepare_extra_params="${adoption_files},\
    cephadm-rbd-only.yaml,\
    ~/system_upgrade.yaml,\
    ~/libvirt_hack.yaml"  # Additional file
    # Also adds:
    --ffu-multi-rhel true

automation_code (16.2-17.1_upgrade_workarounds.yaml:181-190):

    - libvirt_hack:
        command: |
          cat <<'EOF' > ~/libvirt_hack.yaml
          resource_registry:
            OS::TripleO::Services::NovaLibvirt: /usr/share/openstack-tripleo-heat-templates/deployment/nova/nova-modular-libvirt-container-puppet.yaml
          EOF

Analysis:
- Automation adds libvirt_hack.yaml (not in docs)
- Forces use of modular libvirt container for Compute nodes during upgrade
- Also adds --ffu-multi-rhel true flag
- Workaround file auto-creates libvirt_hack.yaml
- Needed for proper libvirt service handling during OS upgrade

---

DISCREPANCY #15: System Upgrade YAML Creation

Phase: All OS upgrades (Leapp configuration)
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs (ffu_docs_part_03.txt:653-687):

    # Manual creation shown in docs
    cat > /home/stack/system_upgrade.yaml < ~/system_upgrade.yaml
    parameter_defaults:
      UpgradeLeappDevelSkip: "LEAPP_UNSUPPORTED=1 LEAPP_NO_RHSM=1 LEAPP_DEVEL_SKIP_CHECK_OS_RELEASE=1 LEAPP_NO_NETWORK_RENAMING=1 LEAPP_DEVEL_TARGET_RELEASE=9.2"
      NICsPrefixesToUdev: ['en']
      UpgradeLeappEnabled: true
      LeappActorsToRemove: ['checkifcfg','persistentnetnamesdisable','checkinstalledkernels','biosdevname','checkcustomnetworkscripts']
      LeappRepoInitCommand: |
        FFU_EL8_PRODUCT_BUILD=latest-RHOS-17.1-RHEL-8.4
        RHEL_VERSION=8.4
        if $( grep -q 8.9 /etc/os-release ); then
          RHEL_VERSION=8.9
        fi
        rhos-release -x
        rhos-release 17.1 -r $RHEL_VERSION -p $FFU_EL8_PRODUCT_BUILD
      LeappInitCommand: |
        leapp answer --add --section check_vdo.confirm=True
        dnf install -y patch crudini
        rhos-release 17.1 -r 9.2 -p $FFU_EL9_PRODUCT_BUILD
        # Complex repository manipulation with crudini
        cat /etc/yum.repos.d/*.repo > leapp_upgrade_repositories.repo
        crudini --get --format=lines leapp_upgrade_repositories.repo | grep ' enabled = 0' | awk '{print $2}' | xargs -i crudini --del leapp_upgrade_repositories.repo {}
        ...
        cp leapp_upgrade_repositories.repo /etc/leapp/files/
        dnf -y remove irb
    EOF

Analysis:
- Documentation: Shows manual file creation with subscription-manager
- Automation: File auto-created by workarounds system with rhos-release
- Key differences:
  - LEAPP_NO_RHSM=1 - bypasses Red Hat Subscription Manager
  - Uses rhos-release instead of subscription-manager
  - Adds LeappInitCommand with complex repository manipulation using crudini
  - Dynamically detects RHEL version (8.4 vs 8.9)
  - Auto-removes irb package to avoid conflicts
- Completely automated vs manual creation

---

DISCREPANCY #16: Overcloud Parameters File (Heat)

Phase: Overcloud FFU prepare
Severity: ⭐⭐⭐ (3 - Major)

ffu_docs:
- Not explicitly shown as separate file in FFU procedures
- Some parameters mentioned inline

automation_code (16.2-17.1_upgrade_workarounds.yaml:402-468):

    - heat_params_env_file:
        command: |
          cat <<'EOF' > ~/overcloud-params.yaml
          resource_registry:
            OS::TripleO::Services::NovaLibvirt: /usr/share/.../nova-libvirt-container-puppet.yaml
            # FreeIPA integration (if detected)
            OS::TripleO::Services::IpaClient: /usr/share/.../ipaservices-baremetal-ansible.yaml
          parameter_defaults:
            ExtraConfig:
              nova::workarounds::disable_compute_service_check_for_ffu: true
            CephSpecFqdn: true
            DeployedCeph: true
            CephConfigPath: "/etc/ceph"
            DnsServers: ["10.0.0.36"]
            DockerInsecureRegistryAddress: ["$(hiera container_image_prepare_node_names):8787"]
            CephAnsibleRepo: "rhelosp-ceph-5-tools"
            UpgradeInitCommand: |
              # Downloaded and installed on ALL overcloud nodes
              until [ $(curl -w '%{http_code}' -sSL -O http://download.devel.redhat.com/.../rhos-release-latest.noarch.rpm) -eq 200 ]; do
                sleep 5
              done
              sudo dnf -y localinstall rhos-release-latest.noarch.rpm
              FFU_EL8_PRODUCT_BUILD=latest-RHOS-17.1-RHEL-8.4
              FFU_EL9_PRODUCT_BUILD=passed_phase2
              if $( grep -q 9.2 /etc/os-release ); then
                sudo rhos-release 17.1 -r 9.2 -p $FFU_EL9_PRODUCT_BUILD
              else
                sudo rhos-release 17.1 -r 8.4 -p $FFU_EL8_PRODUCT_BUILD
              fi
              sudo dnf -y install cephadm
          EOF

Analysis:
- Entire file auto-generated by workarounds, not mentioned in docs
- Includes critical parameters:
  - Nova FFU workaround to disable compute service checks
  - Ceph configuration for deployed clusters
  - UpgradeInitCommand that runs on ALL nodes before upgrade
    - Installs rhos-release
    - Configures repositories based on detected OS version
    - Installs cephadm
  - FreeIPA detection and integration (if /etc/ipa/ca.crt exists)
- Not documented but essential for upgrade success

---

DISCREPANCY #17: Role Modifications and Deprecated Service Removal

Phase: Pre-upgrade prepare
Severity: ⭐⭐ (2 - Moderate)

ffu_docs:
- Not mentioned

automation_code (16.2-17.1_upgrade_workarounds.yaml:342-400):

    - roles_modification:
        command: |
          # Auto-remove deprecated services from roles file
          echo 'OS::TripleO::Services::CinderBackendDellEMCXTREMIOISCSI
          OS::TripleO::Services::Ec2Api
          OS::TripleO::Services::Fluentd
          OS::TripleO::Services::MistralApi
          OS::TripleO::Services::MistralEngine
          OS::TripleO::Services::NeutronLbaasv2Agent
          OS::TripleO::Services::NovaConsoleauth
          OS::TripleO::Services::Ntp
          OS::TripleO::Services::OpenDaylightApi
          OS::TripleO::Services::PankoApi
          OS::TripleO::Services::SaharaApi
          OS::TripleO::Services::SaharaEngine
          OS::TripleO::Services::Tacker
          OS::TripleO::Services::Zaqar' | while read i; do
            sed -ie "/$i/d" ${ROLES_FILE}
          done
          # Add GlanceApiInternal if missing
          grep GlanceApiInternal ${ROLES_FILE} || \
            sed -ie 's/\(.*\)GlanceApi$/\1GlanceApi\n\1GlanceApiInternal/' ${ROLES_FILE}
          # Remove obsolete network port definitions
          sed -i -e '/OS::TripleO::Network::Ports::OVNDBsVipPort/d' ...
          sed -i -e '/OS::TripleO::Network::Ports::RedisVipPort/d' ...
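
The cleanup loop above can be tried safely against a throwaway file. What follows is a minimal sketch under stated assumptions, not the automation's code: the file `roles_demo.yaml`, its contents, and the shortened service list are hypothetical and exist only for illustration. It assumes GNU sed, and it uses `sed -i -e` rather than `sed -ie`, because GNU sed reads `-ie` as in-place editing with the backup suffix `e` and leaves a stray backup file behind.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical throwaway roles file; the real ROLES_FILE comes from the deployment.
ROLES_FILE="$(mktemp -d)/roles_demo.yaml"
cat > "${ROLES_FILE}" <<'EOF'
- name: Controller
  ServicesDefault:
    - OS::TripleO::Services::GlanceApi
    - OS::TripleO::Services::MistralApi
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::PankoApi
EOF

# Shortened, illustrative subset of the deprecated services listed above.
deprecated_services='OS::TripleO::Services::MistralApi
OS::TripleO::Services::Ntp
OS::TripleO::Services::PankoApi'

# Delete every line that ends with one of the deprecated service names.
while read -r svc; do
    sed -i -e "/${svc}\$/d" "${ROLES_FILE}"
done <<< "${deprecated_services}"

# Add GlanceApiInternal right after GlanceApi, keeping the same indentation.
grep -q GlanceApiInternal "${ROLES_FILE}" || \
    sed -i -e 's/\(.*\)GlanceApi$/\1GlanceApi\n\1GlanceApiInternal/' "${ROLES_FILE}"

cat "${ROLES_FILE}"
```

Anchoring each pattern at end of line avoids deleting a longer service name that merely starts with a deprecated one; whether that stricter match is wanted in a real roles file is a judgment call for the operator.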

Analysis:
- Automation automatically removes deprecated services from the roles file (the excerpt above lists 14)
- Includes services deprecated in OSP 17.1:
  - Mistral (workflow service) - removed
  - LBaaSv2 - removed
  - Nova Consoleauth - removed
  - NTP - replaced by chronyd
  - Panko - removed
  - Sahara - removed
  - Tacker - removed
- Adds GlanceApiInternal service if missing
- Removes obsolete VIP port definitions
- Not documented - users would need to manually identify and remove these

---

DISCREPANCY #18: OVN Workaround for Multi-Cell Deployments

Phase: Post-upgrade prepare (for cell deployments)
Severity: ⭐⭐ (2 - Moderate)

ffu_docs (ffu_docs_part_02.txt:518-574):

    # Shows OVN workaround but as manual steps
    cat <<'EOF' > ~/ovn_workaround.yaml
    # Playbook to reconfigure OVN connections
    ...
    EOF
    ansible-playbook -i ... -e @~/overcloud-export.yaml \
        -e ovn_compute_role=Compute ovn_workaround.yaml

automation_code (16.2-17.1_upgrade_workarounds.yaml:242-291):

    - ovn_hack:
        command: |
          {% raw %}
          cat <<'EOF' > ~/ovn_workaround.yaml
          - become: true
            hosts: '{{ ovn_compute_role }}'
            tasks:
              - name: Read ovn southbound port
                command: puppet lookup --facts /etc/puppet/hieradata/service_configs.json ovn::southbound::port --render-as s
              - name: Create the new connection
                set_fact:
                  new_ovn_connection: "ssl:{{ovn_dbs_vip.stdout}}:{{port}},ssl:{{node_ips[0]}}:{{port}},..."
              - name: Reconfigure ovn connection
                command: ovs-vsctl set Open_vSwitch . external_ids:ovn-remote="{{new_ovn_connection}}"
              - name: Reconfigure metadata agents
                shell: |
                  crudini --set /var/lib/config-data/.../networking-ovn-metadata-agent.ini ovn ovn_sb_connection "{{ new_ovn_connection }}"
                  systemctl restart tripleo_ovn_metadata_agent
          EOF
          # Only run for cell stacks
          if [[ $OVERCLOUD_STACK_NAME == *"ell"* ]]; then
            ansible-playbook -i ... ovn_workaround.yaml
          fi
          {% endraw %}

Analysis:
- Same playbook content but automation:
  - Auto-detects cell deployments (stack name contains "ell")
  - Auto-executes without manual intervention
  - Uses Jinja2 templating for variable substitution
- Purpose: Updates OVN controller connections after DB migration from Pacemaker to Ansible-controlled cluster
- Prevents data plane outage in DCN/multi-cell environments
- Documentation shows as optional manual step, automation makes it automatic for cells

---

DISCREPANCY #19: Stack Export and Preparation

Phase: Post-undercloud upgrade
Severity: ⭐⭐ (2 - Moderate)

ffu_docs:
- Not explicitly shown

automation_code (16.2-17.1_upgrade_workarounds.yaml:209-220, 232-240):

    - prepare_stack_export_for_tripleo_upgrade:
        command: |
          # Prepare directory structure for tripleo-upgrade
          for stack in $(ls overcloud-deploy/|grep -v undercloud); do
            mkdir -p ~/overcloud-deploy/${stack}/config-download/${stack}
            sudo cat /var/lib/mistral/${stack}/tripleo-ansible-inventory.yaml > \
              ~/overcloud-deploy/${stack}/config-download/${stack}/tripleo-ansible-inventory.yaml
            # Ensure export files are in correct locations
            ls ~/overcloud-deploy/${stack}/${stack}-export.yaml || \
              cp ${stack}-export.yaml ~/overcloud-deploy/${stack}/
          done
    - ensure_stack_export_is_correct:
        command: |
          source ~/stackrc
          # Re-run upgrade_steps_playbook to ensure correct state
          openstack overcloud upgrade run --yes \
            --limit allovercloud,undercloud \
            --stack {{ overcloud_stack_name }} \
            --playbook {{ overcloud_stack_name }}/upgrade_steps_playbook.yaml \
            --skip-tags upgrade_steps
          # Export stack and cell data
          openstack overcloud export --stack {{ overcloud_stack_name }} -f
          openstack overcloud cell export \
            --control-plane-stack {{ overcloud_stack_name }} \
            --force-overwrite \
            --output-file overcloud-deploy/{{ overcloud_stack_name }}/{{ overcloud_stack_name }}-cell-export.yaml

Analysis:
- Automation auto-prepares directory structure for tripleo-upgrade plugin
- Copies Ansible inventory from /var/lib/mistral/ to expected locations
- Re-runs upgrade_steps_playbook to ensure heat stack is in correct state
- Exports stack data for use in subsequent stages
- Exports cell data for multi-cell deployments
- Critical for tripleo-upgrade plugin to function, not documented

---

DISCREPANCY #20: Control Plane Virtual IP Fix

Phase: Post-undercloud upgrade
Severity: ⭐ (1 - Minor)

ffu_docs:
- Not mentioned

automation_code (16.2-17.1_upgrade_workarounds.yaml:193-207):

    - fix_ctlplane_virtual_ip:
        command: |
          cp ~/overcloud-deploy/{{ overcloud_stack_name }}/tripleo-{{ overcloud_stack_name }}-virtual-ips.yaml \
            ~/tripleo-{{ overcloud_stack_name }}-virtual-ips.orig.yaml
          # Add ctlplane VIP if missing
          if $(! grep -q ctlplane ~/tripleo-{{ overcloud_stack_name }}-virtual-ips.orig.yaml); then
          cat <<'EOF' >> ~/overcloud-deploy/{{ overcloud_stack_name }}/tripleo-{{ overcloud_stack_name }}-virtual-ips.yaml
          - dns_name: overcloud
            ip_address: 192.168.24.20
            name: control_virtual_ip
            network: ctlplane
            subnet: ctlplane-subnet
          EOF
          fi

Analysis:
- Bug fix: Some environments missing ctlplane VIP entry after undercloud upgrade
- Auto-detects and adds if missing
- Specific IP (192.168.24.20) may be environment-specific
- Small fix but prevents adoption failures

---

DISCREPANCY #21: FreeIPA Integration Workaround

Phase: Post-undercloud upgrade
Severity: ⭐ (1 - Minor)

ffu_docs:
- Not mentioned

automation_code (16.2-17.1_upgrade_workarounds.yaml:222-229):

    - ipa_workarounds_rhbz_2121387:
        command: |
          stat /etc/ipa/ca.crt || exit 0
          # Fix FreeIPA permission for Nova
          echo 12345678 | kinit admin && \
            ipa privilege-add-permission 'Nova Host Management' \
              --permission 'System: Modify Realm Domains' || true

Analysis:
- Bug fix for RHBZ#2121387: FreeIPA permission issue after upgrade
- Only runs if FreeIPA is detected (/etc/ipa/ca.crt exists)
- Uses hardcoded admin password "12345678" (test environment)
- Adds Nova permission to modify realm domains
- Environment-specific, not in general docs

---

DISCREPANCY
#22: Pre-Validation Glance Healthcheck Disable Phase: Before running validations Severity: ⭐ (1 - Minor) ffu_docs: - Not mentioned automation_code (16.2-17.1_upgrade_workarounds.yaml:537-545): - disable_glance_api_cron_healthcheck_service: ansible_hosts: 'glance_api:Undercloud' command: | sudo systemctl stop tripleo_glance_api_cron_healthcheck.timer \ tripleo_glance_api_cron_healthcheck.service sudo systemctl reset-failed Analysis: - Bug fix: Glance API cron healthcheck can fail during validations - Stops timer and service on undercloud and glance_api hosts - Resets failed state - Prevents validation failures due to transient healthcheck issues - Small fix but avoids validation noise --- DISCREPANCY #23: Quota Limits Removal Phase: Pre-upgrade prepare Severity: ⭐ (1 - Minor) ffu_docs: - Not mentioned automation_code (16.2-17.1_upgrade_workarounds.yaml:327-336): - unset_quota_limits: command: | stat quota || source {{ overcloud_stack_name }}rc openstack quota set --cores -1 --instances -1 --volumes -1 \ --networks -1 --subnets -1 --ports -1 --routers -1 \ --ram -1 --secgroups -1 --floating-ips -1 admin && touch quota Analysis: - Sets all quotas to unlimited (-1) for admin project - Prevents quota issues during upgrade when temporary resources are created - Creates marker file to avoid re-running - Operational workaround, not documented --- DISCREPANCY #24: Tempest Configuration Fix Phase: Post-undercloud OS upgrade Severity: ⭐ (1 - Minor) ffu_docs: - Not mentioned automation_code (16.2-17.1_upgrade_workarounds.yaml:482-492): - tempest_fix: command: | if [ -e "~/tempest_17.1/.stestr.conf" ]; then sed -i s/python3.6/python3.9/g ~/tempest_17.1/.stestr.conf fi if [ -e "~/tempest_image" ]; then rm -f ~/tempest_image fi Analysis: - Bug fix: Tempest configuration still references Python 3.6 after RHEL 9 upgrade - Updates to Python 3.9 (RHEL 9 default) - Removes outdated tempest_image marker - Testing-specific, not in production docs --- Summary Matrix | # | Discrepancy | 
Phase | Severity | Impact | |-----|--------------------------------------------------------|-----------------|----------|----------------------------------------------| | 1 | Repository tool (subscription-manager vs rhos-release) | All | ⭐⭐⭐ | Cannot use automation without internal tools | | 2 | Execution framework (CLI vs InfraRed) | All | ⭐⭐⭐ | Architectural incompatibility | | 3 | Workarounds file auto-generation system (546 lines) | All | ⭐⭐⭐ | Parallel automation layer | | 4 | Undercloud content switch stage | Pre-upgrade | ⭐⭐⭐ | Entire phase missing | | 5 | FFU releases parameter format | Upgrade | ⭐⭐ | Parameter not documented | | 6 | Multi-RHEL container preparation (3 stages) | Container prep | ⭐⭐⭐ | Complex vs simple | | 7 | Adoption file locations and execution | Adoption | ⭐ | Minor structural difference | | 8 | Ceph environment file removal | Ceph upgrade | ⭐⭐ | Comprehensive vs manual | | 9 | Ceph adoption two-phase process | Ceph upgrade | ⭐⭐ | More robust approach | | 10 | Overcloud upgrade strategy (OVN first + F&F) | OC upgrade | ⭐⭐⭐ | Major execution difference | | 11 | Undercloud OS upgrade command + VBMC | UC OS upgrade | ⭐⭐⭐ | Different commands + critical fix | | 12 | Control plane container prep automation | CP OS upgrade | ⭐⭐⭐ | Auto vs manual | | 13 | FFU releases during OS upgrades | OS upgrades | ⭐⭐ | Different parameter values | | 14 | Compute libvirt hack | Compute OS | ⭐⭐ | Additional file + flag | | 15 | System upgrade YAML auto-creation | OS upgrades | ⭐⭐⭐ | Auto-generated vs manual | | 16 | Overcloud params file auto-creation | OC prepare | ⭐⭐⭐ | Critical file not documented | | 17 | Role modifications (20+ deprecated services) | Pre-prepare | ⭐⭐ | Auto-cleanup not documented | | 18 | OVN workaround for cells | Post-prepare | ⭐⭐ | Auto for cells vs manual | | 19 | Stack export preparation | Post-UC upgrade | ⭐⭐ | Directory structure setup | | 20 | Ctlplane VIP fix | Post-UC upgrade | ⭐ | Bug fix | | 21 | FreeIPA integration | 
Post-UC upgrade | ⭐ | Environment-specific | | 22 | Glance healthcheck disable | Pre-validation | ⭐ | Validation noise reduction | | 23 | Quota limits removal | Pre-prepare | ⭐ | Operational workaround | | 24 | Tempest Python version fix | Post-UC OS | ⭐ | Testing-specific | Critical Findings Major Discrepancies (⭐⭐⭐): 10 1. Repository tool difference (subscription-manager vs rhos-release) 2. Execution framework (OpenStack CLI vs InfraRed) 3. 546-line workarounds file providing parallel automation 4. Missing undercloud content switch phase 5. Multi-RHEL container preparation (3-stage intelligent system) 6. Overcloud upgrade strategy (two-phase OVN + fast-and-furious) 7. Undercloud OS upgrade (different command + VBMC fix) 8. Control plane container prep (automated in workarounds) 9. System upgrade YAML (auto-generated vs manual) 10. Overcloud params file (auto-generated, not documented) Moderate Discrepancies (⭐⭐): 9 Minor Discrepancies (⭐): 5 Overall Assessment: The automation is fundamentally different from documentation due to the workarounds file acting as a parallel automation framework that auto-generates configuration files, handles multi-RHEL complexity, and executes pre/post hooks at every phase. The documentation describes a manual, customer-facing procedure using subscription-manager and direct OpenStack CLI, while automation uses internal Red Hat tools (rhos-release, InfraRed) with extensive automated workarounds.
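
Note on Discrepancy #24: as transcribed above, both `-e` tests in tempest_fix wrap `~` in double quotes. Shells do not perform tilde expansion inside quotes, so if the workarounds file really contains the command in that form, the conditions would never be true and the sed/rm steps would be skipped. This is worth confirming against the actual file. The sketch below only demonstrates the shell behaviour in a scratch directory; `fake_home` and `check` are hypothetical names for illustration and are not part of the automation.

```shell
# Scratch directory standing in for the stack user's home (hypothetical).
fake_home="$(mktemp -d)"
mkdir -p "$fake_home/tempest_17.1"
echo "python3.6" > "$fake_home/tempest_17.1/.stestr.conf"

# Run a file test in a child shell whose HOME and working directory are
# the scratch directory, and report whether the test succeeded.
check() {
  (cd "$fake_home" && HOME="$fake_home" sh -c "$1 && echo found || echo missing")
}

# Quoted tilde: "~" is treated as a literal directory name, so the test fails.
quoted_result="$(check '[ -e "~/tempest_17.1/.stestr.conf" ]')"

# $HOME is expanded inside double quotes, so the file is seen.
home_result="$(check '[ -e "$HOME/tempest_17.1/.stestr.conf" ]')"

# An unquoted tilde is expanded to $HOME, so the file is seen as well.
unquoted_result="$(check '[ -e ~/tempest_17.1/.stestr.conf ]')"

echo "quoted tilde:   $quoted_result"
echo "\$HOME in quotes: $home_result"
echo "unquoted tilde: $unquoted_result"

rm -rf "$fake_home"
```

If the quoting is confirmed in the source file, replacing `"~/..."` with `"$HOME/..."` in the two tests would be one way to keep the quotes while still matching the intended paths.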