OCPBUGS-48256 (OpenShift Bugs)

Agent-based install using iSCSI fails when writing image to target


      While testing the agent-based installer with iSCSI using dev-scripts (https://github.com/openshift-metal3/dev-scripts/pull/1727), the installer could not complete the installation when using multiple hosts. The same problem did not appear when using SNO.

      The iSCSI sessions from all hosts to their targets work fine until coreos-installer runs. At that point (before reboot) the connection to the target is lost and coreos-installer fails:

      Jan 09 16:12:23 master-1 kernel:  session1: session recovery timed out after 120 secs
      Jan 09 16:12:23 master-1 kernel: sd 7:0:0:0: rejecting I/O to offline device
      Jan 09 16:12:23 master-1 kernel: I/O error, dev sdb, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
      Jan 09 16:12:23 master-1 installer[2937]: time="2025-01-09T16:12:23Z" level=info msg="\nError: syncing data to disk\n\nCaused by:\n    Input/output error (os error 5)\n\nResetting partition table\n"
      Jan 09 16:12:23 master-1 installer[2937]: time="2025-01-09T16:12:23Z" level=warning msg="Retrying after error: failed executing /usr/bin/nsenter [--target 1 --cgroup --mount --ipc --pid -- coreos-installer install --insecure -i /opt/install-dir/master-f3c24588-2129-483f-9dfb-8a8fe332a4bf.ign --append-karg rd.iscsi.firmware=1 --append-karg ip=enp6s0:dhcp --copy-network /dev/sdb], Error exit status 1, LastOutput \"Error: syncing data to disk\n\nCaused by:\n    Input/output error (os error 5)\n\nResetting partition table\nError: syncing partition table to disk\n\nCaused by:\n    Input/output error (os error 5)\""
      

      On the host, the session shows as logged out:

      Iface Name: default
      		Iface Transport: tcp
      		Iface Initiatorname: iqn.2023-01.com.example:master-1
      		Iface IPaddress: [default]
      		Iface HWaddress: default
      		Iface Netdev: default
      		SID: 1
      		iSCSI Connection State: Unknown
      		iSCSI Session State: FREE
      		Internal iscsid Session State: Unknown
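
A minimal sketch of how this session check could be automated, assuming the text comes from something like `iscsiadm -m session -P 1` and that each session record starts at an `Iface Name:` line, as in the excerpt above. The function names `parse_session_states` and `unhealthy_sessions` are hypothetical, and treating `LOGGED_IN` as the only healthy session state is an assumption.

```python
def parse_session_states(iscsiadm_output: str) -> list[dict[str, str]]:
    """Collect 'Key: value' fields for each session in iscsiadm-style text.

    Assumption: a new session record begins at every 'Iface Name:' line.
    """
    sessions: list[dict[str, str]] = []
    current = None
    for raw in iscsiadm_output.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so IQNs containing ':' stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Iface Name":
            current = {}
            sessions.append(current)
        if current is not None:
            current[key] = value
    return sessions


def unhealthy_sessions(sessions: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return sessions whose 'iSCSI Session State' is not LOGGED_IN (assumed healthy value)."""
    return [s for s in sessions if s.get("iSCSI Session State") != "LOGGED_IN"]
```

Run against the output above, this would flag the single session on master-1, since its session state is `FREE`.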
      

      The problem occurs because the iscsid service is not running. If it is started by iscsiadm, coreos-installer can successfully write the image to disk.
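
As an illustration of the workaround, the sketch below checks whether iscsid is active before the image is written and starts it if not. It is not the fix adopted for this bug. It assumes the unit is named `iscsid.service` and uses `systemctl` directly rather than relying on iscsiadm to start the daemon. The helper names `run` and `ensure_iscsid_running` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def ensure_iscsid_running(runner: Runner = run, unit: str = "iscsid.service") -> bool:
    """Return True if the unit is active, starting it first when needed.

    Assumption: the unit name is 'iscsid.service'; adjust for the actual host.
    """
    # 'systemctl is-active --quiet' exits 0 only when the unit is active.
    if runner(["systemctl", "is-active", "--quiet", unit]) == 0:
        return True
    runner(["systemctl", "start", unit])
    # Re-check so the caller learns whether the start actually took effect.
    return runner(["systemctl", "is-active", "--quiet", unit]) == 0
```

The `runner` parameter exists only so the logic can be exercised without touching a real host. On a node it would be called as `ensure_iscsid_running()` before invoking coreos-installer.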

              Robert Fournier (bfournie@redhat.com)
              Manoj Hans