  1. OpenShift Bugs
  2. OCPBUGS-16888

Fail to apply machine-config during rhel node upgrade

      Description of problem:

      CI job "amd64-nightly-4.13-upgrade-from-stable-4.12-vsphere-ipi-proxy-workers-rhel8" failed at the RHEL node upgrade stage with the following error:
      
      TASK [openshift_node : Apply machine config] ***********************************
      task path: /usr/share/ansible/openshift-ansible/roles/openshift_node/tasks/apply_machine_config.yml:68
      Using module file /opt/python-env/ansible-core/lib64/python3.8/site-packages/ansible/modules/command.py
      Pipelining is enabled.
      <192.168.233.236> ESTABLISH SSH CONNECTION FOR USER: test
      <192.168.233.236> SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="test"' -o ConnectTimeout=30 -o IdentityFile=/var/run/secrets/ci.openshift.io/cluster-profile/ssh-privatekey -o StrictHostKeyChecking=no -o 'ControlPath="/alabama/.ansible/cp/%h-%r"' 192.168.233.236 '/bin/sh -c '"'"'sudo -H -S -n  -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-vwugynewkogzaosazvikpnplnmjoluxs ; http_proxy=http://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX@192.168.221.228:3128 https_proxy=http://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX@192.168.221.228:3128 no_proxy=.cluster.local,.svc,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.233.0/25,api-int.ci-op-ssnlf4qb-1dacf.vmc-ci.devcluster.openshift.com,localhost /usr/libexec/platform-python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
      Escalation succeeded
      <192.168.233.236> (1, b'\n{"changed": XXXX, "stdout": "I0726 23:36:56.436283   27240 start.go:61] Version: v4.13.0-202307242035.p0.g7b54f1d.assembly.stream-dirty (7b54f1dcce4ea9f69f300d0e1cf2316def45bf72)\\r\\nI0726 23:36:56.437075   27240 daemon.go:478] not chrooting for source=rhel-8 target=rhel-8\\r\\nF0726 23:36:56.437240   27240 start.go:75] failed to re-exec: writing /rootfs/run/bin/machine-config-daemon: open /rootfs/run/bin/machine-config-daemon: text file busy", "stderr": "time=\\"2023-07-26T19:36:55-04:00\\" level=warning msg=\\"The input device is not a TTY. The --tty and --interactive flags might not work properly\\"", "rc": 255, "cmd": ["podman", "run", "-v", "/:/rootfs", "--pid=host", "--privileged", "--rm", "--entrypoint=/usr/bin/machine-config-daemon", "-ti", "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0110276ce82958a105cdd59028043bcdb1e5c33a77e550a13a1dc51aee08b032", "start", "--node-name", "ci-op-ssnlf4qb-1dacf-bbmqt-rhel-1", "--once-from", "/tmp/ansible.mlldlsm5/worker_ignition_config.json", "--skip-reboot"], "start": "2023-07-26 19:36:55.852527", "end": "2023-07-26 19:36:56.827081", "delta": "0:00:00.974554", "failed": XXXX, "msg": "non-zero return code", "invocation": {"module_args": {"_raw_params": "podman run -v /:/rootfs --pid=host --privileged --rm --entrypoint=/usr/bin/machine-config-daemon -ti quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0110276ce82958a105cdd59028043bcdb1e5c33a77e550a13a1dc51aee08b032 start --node-name ci-op-ssnlf4qb-1dacf-bbmqt-rhel-1 --once-from /tmp/ansible.mlldlsm5/worker_ignition_config.json --skip-reboot", "_uses_shell": false, "warn": false, "stdin_add_newline": XXXX, "strip_empty_ends": XXXX, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}}\n', b'')
      <192.168.233.236> Failed to connect to the host via ssh: 
      fatal: [192.168.233.236]: FAILED! => {
          "changed": XXXX,
          "cmd": [
              "podman",
              "run",
              "-v",
              "/:/rootfs",
              "--pid=host",
              "--privileged",
              "--rm",
              "--entrypoint=/usr/bin/machine-config-daemon",
              "-ti",
              "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0110276ce82958a105cdd59028043bcdb1e5c33a77e550a13a1dc51aee08b032",
              "start",
              "--node-name",
              "ci-op-ssnlf4qb-1dacf-bbmqt-rhel-1",
              "--once-from",
              "/tmp/ansible.mlldlsm5/worker_ignition_config.json",
              "--skip-reboot"
          ],
          "delta": "0:00:00.974554",
          "end": "2023-07-26 19:36:56.827081",
          "invocation": {
              "module_args": {
                  "_raw_params": "podman run -v /:/rootfs --pid=host --privileged --rm --entrypoint=/usr/bin/machine-config-daemon -ti quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0110276ce82958a105cdd59028043bcdb1e5c33a77e550a13a1dc51aee08b032 start --node-name ci-op-ssnlf4qb-1dacf-bbmqt-rhel-1 --once-from /tmp/ansible.mlldlsm5/worker_ignition_config.json --skip-reboot",
                  "_uses_shell": false,
                  "argv": null,
                  "chdir": null,
                  "creates": null,
                  "executable": null,
                  "removes": null,
                  "stdin": null,
                  "stdin_add_newline": XXXX,
                  "strip_empty_ends": XXXX,
                  "warn": false
              }
          },
          "msg": "non-zero return code",
          "rc": 255,
          "start": "2023-07-26 19:36:55.852527",
          "stderr": "time=\"2023-07-26T19:36:55-04:00\" level=warning msg=\"The input device is not a TTY. The --tty and --interactive flags might not work properly\"",
          "stderr_lines": [
              "time=\"2023-07-26T19:36:55-04:00\" level=warning msg=\"The input device is not a TTY. The --tty and --interactive flags might not work properly\""
          ],
          "stdout": "I0726 23:36:56.436283   27240 start.go:61] Version: v4.13.0-202307242035.p0.g7b54f1d.assembly.stream-dirty (7b54f1dcce4ea9f69f300d0e1cf2316def45bf72)\r\nI0726 23:36:56.437075   27240 daemon.go:478] not chrooting for source=rhel-8 target=rhel-8\r\nF0726 23:36:56.437240   27240 start.go:75] failed to re-exec: writing /rootfs/run/bin/machine-config-daemon: open /rootfs/run/bin/machine-config-daemon: text file busy",
          "stdout_lines": [
              "I0726 23:36:56.436283   27240 start.go:61] Version: v4.13.0-202307242035.p0.g7b54f1d.assembly.stream-dirty (7b54f1dcce4ea9f69f300d0e1cf2316def45bf72)",
              "I0726 23:36:56.437075   27240 daemon.go:478] not chrooting for source=rhel-8 target=rhel-8",
              "F0726 23:36:56.437240   27240 start.go:75] failed to re-exec: writing /rootfs/run/bin/machine-config-daemon: open /rootfs/run/bin/machine-config-daemon: text file busy"
          ]
      }

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-07-26-101700

      How reproducible:

      always

      Steps to Reproduce:

      Found in CI:
      1. Install a v4.13.6 cluster with a RHEL 8 node.
      2. Upgrade OCP; the upgrade succeeds.
      3. Upgrade the RHEL node.
      

      Actual results:

      The RHEL node upgrade fails.

      Expected results:

      The RHEL node upgrade succeeds.

      Additional info:

      job link: https://qe-private-deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gs/qe-private-deck/logs/periodic-ci-openshift-openshift-tests-private-release-4.13-amd64-nightly-4.13-upgrade-from-stable-4.12-vsphere-ipi-proxy-workers-rhel8-p2-f28/1684288836412116992

            rhn-engineering-skumari Sinny Kumari
            rhn-support-jiajliu Jia Liu
            Gaoyun Pei Gaoyun Pei
            Colin Walters, Yu Qi Zhang