-
Bug
-
Resolution: Done-Errata
-
Normal
-
None
-
4.11
-
Important
-
No
-
False
-
This is a clone of issue OCPBUGS-18703. The following is the description of the original issue:
—
Description of problem:
RHEL workers sometimes lose connectivity during an upgrade. The loss happens when the openvswitch RPM package is removed, as shown below, but it does not happen every time. In the logs, two RHEL workers were upgraded: one (10.0.176.221) succeeded, but the other (10.0.177.38) failed.
https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp4-rhel-scaleup-runner/26003/consoleFull
TASK [openshift_node : Find all downloaded rpms] *******************************
ok: [10.0.177.38] => {"changed": false, "examined": 2, "files": [{"atime": 1694159818.368708, "ctime": 1694159818.4047084, "dev": 64515, "gid": 0, "gr_name": "root", "inode": 168357075, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mode": "0644", "mtime": 1694159818.3677077, "nlink": 1, "path": "/tmp/openshift-ansible-packages/openvswitch2.17-2.17.0-106.el8fdp.x86_64.rpm", "pw_name": "root", "rgrp": true, "roth": true, "rusr": true, "size": 6866096, "uid": 0, "wgrp": false, "woth": false, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}, {"atime": 1694159822.372777, "ctime": 1694159822.377777, "dev": 64515, "gid": 0, "gr_name": "root", "inode": 168357076, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mode": "0644", "mtime": 1694159822.372777, "nlink": 1, "path": "/tmp/openshift-ansible-packages/policycoreutils-python-utils-2.9-24.el8.noarch.rpm", "pw_name": "root", "rgrp": true, "roth": true, "rusr": true, "size": 259768, "uid": 0, "wgrp": false, "woth": false, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}], "matched": 2, "msg": "All paths examined", "skipped_paths": {}}
TASK [openshift_node : Setting list of rpms] ***********************************
ok: [10.0.177.38] => {"ansible_facts": {"rpm_list": ["/tmp/openshift-ansible-packages/openvswitch2.17-2.17.0-106.el8fdp.x86_64.rpm", "/tmp/openshift-ansible-packages/policycoreutils-python-utils-2.9-24.el8.noarch.rpm"]}, "changed": false}
TASK [openshift_node : Remove known conflicts] *********************************
changed: [10.0.177.38] => (item=openvswitch) => {"ansible_loop_var": "item", "changed": true, "item": "openvswitch", "msg": "", "rc": 0, "results": ["Removed: openvswitch-selinux-extra-policy-1.0-31.el8fdp.noarch", "Removed: openvswitch2.17-2.17.0-106.el8fdp.x86_64"]}
TASK [openshift_node : Install downloaded packages] ****************************
fatal: [10.0.177.38]: FAILED! => {"changed": false, "msg": "Failed to download metadata for repo 'rhel-8-for-x86_64-appstream-rpms': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried", "rc": 1, "results": []}
Version-Release number of selected component (if applicable):
4.11
How reproducible:
not always
Steps to Reproduce:
1. Upgrade a cluster with RHEL 8 workers from 4.11.48-x86_64 to 4.11.0-0.nightly-2023-09-05-134659.
Actual results:
Expected results:
Additional info:
Because the worker cannot be accessed, no logs can be collected from it for now. We will try to find other ways to collect them if this happens again. For example, it is not yet known whether rebooting the worker would roll the change back.
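Since the failed worker itself is unreachable, the Jenkins console output is the only evidence available. The following is a minimal sketch, not part of openshift-ansible, of how that output could be scanned to find the last task each host reported and its status. The function name `last_task_per_host` and the regular expression are illustrative assumptions; the sample text is an abbreviated excerpt of the log in the description, with the JSON payloads shortened.

```python
import re

# Matches either a task header ("TASK [name]") or a per-host result line
# ("ok: [host]", "changed: [host]", "fatal: [host]", ...).
# This pattern is an assumption based on the console output quoted in this bug.
TOKEN = re.compile(
    r"TASK \[(?P<task>[^\]]+)\]"
    r"|\b(?P<status>ok|changed|fatal|failed|skipping): \[(?P<host>[^\]]+)\]"
)


def last_task_per_host(console_text):
    """Return {host: (task_name, status)} for the last result seen per host."""
    current_task = None
    last = {}
    for match in TOKEN.finditer(console_text):
        if match.group("task"):
            current_task = match.group("task")
        else:
            last[match.group("host")] = (current_task, match.group("status"))
    return last


# Abbreviated excerpt of the console output (JSON payloads shortened).
sample = (
    "TASK [openshift_node : Remove known conflicts] *** "
    'changed: [10.0.177.38] => (item=openvswitch) => {"changed": true} '
    "TASK [openshift_node : Install downloaded packages] *** "
    'fatal: [10.0.177.38]: FAILED! => {"changed": false, "rc": 1}'
)

for host, (task, status) in last_task_per_host(sample).items():
    print(f"{host}: {status} in '{task}'")
```

For the excerpt above, the host 10.0.177.38 is reported as `fatal` in the "Install downloaded packages" task, which is the task that runs right after openvswitch is removed.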
- blocks
-
OCPBUGS-20219 [RHEL]Host lost connection during upgrade for RHEL worker
- Closed
- clones
-
OCPBUGS-18703 [RHEL]Host lost connection during upgrade for RHEL worker
- Closed
- is blocked by
-
OCPBUGS-18703 [RHEL]Host lost connection during upgrade for RHEL worker
- Closed
- is cloned by
-
OCPBUGS-20219 [RHEL]Host lost connection during upgrade for RHEL worker
- Closed
- links to
-
RHSA-2023:5006 OpenShift Container Platform 4.14.z security update