Bug
Resolution: Won't Do
Major
rhel-8.6.0
rhel-storage-io-1
ssg_filesystems_storage_and_HA
x86_64
Description of problem:
When an iSCSI initiator has a large number of sessions open with a target server (1024 in the example below), rebooting the target or having its firewall reject traffic for a short period (15-30 s) leaves some or all of the iSCSI sessions in a broken state. They can be neither logged out nor logged in again with 'iscsiadm': "Logging out of session …" messages are printed, but the session and block device state does not change. The only way to clear the state is to reboot the initiator. In addition, 'iscsiadm' hangs when asked to print detailed session information.
Version-Release number of selected component (if applicable):
iscsi-initiator-utils.x86_64 6.2.1.4-4.git095f59c
kernel.x86_64 4.18.0-372.26.1
How reproducible:
Easily reproducible on every attempt.
Steps to Reproduce:
Set up the target with the following script. It requires 1024 * 512M of space on /mnt, but the disk size can be reduced.
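#!/bin/bash
# Target count, backing-file directory, and IQN prefix; each can be overridden from the environment.
[[ -z $TARGETS ]] && TARGETS=1024
[[ -z $BASEDIR ]] && BASEDIR="/mnt"
[[ -z $BASENAME ]] && BASENAME="iqn.2022-10.com.example"
yum install -y targetcli
# Open the iSCSI port on the target.
firewall-cmd --permanent --add-port=3260/tcp
firewall-cmd --reload
# Build one batch of targetcli commands: per target, a 512M fileio backstore,
# the target itself, a LUN, and an ACL for the initiator.
cmds=""
for tgt in $(seq "$TARGETS"); do
    disk="disk${tgt}"
    target="${BASENAME}:tgt${tgt}"
    cmds="${cmds}cd /backstores/fileio\n"
    cmds="${cmds}create ${disk} ${BASEDIR}/${disk} 512M\n"
    cmds="${cmds}cd /iscsi\n"
    cmds="${cmds}create ${target}\n"
    cmds="${cmds}cd /iscsi/${target}/tpg1/luns\n"
    cmds="${cmds}create /backstores/fileio/${disk}\n"
    cmds="${cmds}cd /iscsi/${target}/tpg1/acls\n"
    # ACL for the initiator's IQN (host s26 in this setup).
    cmds="${cmds}create iqn.2022-10.com.example:s26\n"
done
# Feed the whole batch to targetcli, then restart the target service.
echo -e "$cmds" | targetcli
systemctl restart target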
Create sessions on the initiator:
iscsiadm -m discoverydb --type sendtargets --portal 10.1.7.25 --discover # replace 10.1.7.25 with target IP
iscsiadm -m node --login all
One way to put the sessions in a broken state is to simply reboot the target server.
Another is to reject iSCSI packets on the target for a short interval, e.g. by running 'iptables -A INPUT -p tcp --dport 3260 -j REJECT; sleep 30; iptables -D INPUT -p tcp --dport 3260 -j REJECT'.
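For repeated runs, the firewall variant can be wrapped in a small helper. This is only a sketch of the same three commands: the function name block_iscsi_temporarily and the IPTABLES override variable are hypothetical and not part of the original reproduction. It does not remove the rule if it is interrupted during the sleep.

```shell
#!/bin/bash
# Hypothetical helper: reject inbound iSCSI traffic (TCP port 3260) on the
# target for a fixed interval, then remove the rule again.
# IPTABLES is an assumed override hook, e.g. IPTABLES="echo iptables" for a dry run.
IPTABLES="${IPTABLES:-iptables}"

block_iscsi_temporarily() {
    seconds="${1:-30}"   # interval from the report; 15-30 s was enough there
    # $IPTABLES is left unquoted on purpose so it may hold a command plus arguments.
    $IPTABLES -A INPUT -p tcp --dport 3260 -j REJECT || return 1
    sleep "$seconds"
    $IPTABLES -D INPUT -p tcp --dport 3260 -j REJECT
}
```

Run as root on the target, for example 'block_iscsi_temporarily 30'.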
Actual results:
The vast majority of iSCSI block devices on the initiator go from “running” into a “blocked” state (as per ‘/sys/block/sd*/device/state’), and after a while reach “transport-offline”.
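The state transition can be watched by counting how many devices are in each state. The following is a minimal sketch; the function name summarize_scsi_states and its optional directory argument are hypothetical. Note that the 'sd*' glob also matches any local, non-iSCSI disks on the initiator.

```shell
#!/bin/bash
# Hypothetical helper: count SCSI block devices per state
# ("running", "blocked", "transport-offline", ...).
# The optional argument replaces /sys/block, which is only useful for testing.
summarize_scsi_states() {
    dir="${1:-/sys/block}"
    for f in "$dir"/sd*/device/state; do
        # Skip the literal pattern when no device matches the glob.
        [ -r "$f" ] && cat "$f"
    done | sort | uniq -c
}
```

Running 'summarize_scsi_states' every few seconds on the initiator should show the counts moving between states.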
Running 'iscsiadm -m session -P3' hangs after printing the following output:
[root@s26 ~]# iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 6.2.1.4-1
Target: iqn.2022-10.com.example:tgt1 (non-flash)
Current Portal: 10.1.7.25:3260,1
Persistent Portal: 10.1.7.25:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.2022-10.com.example:s26
Iface IPaddress: 10.1.7.26
Iface HWaddress: default
Iface Netdev: default
SID: 1
When running the above command with strace, it seems to get stuck polling for a response:
socket(AF_UNIX, SOCK_STREAM, 0) = 3
connect(3, {...}, 30) = 0
write(3, "\r\0\0\0\0\0\0\0\1\0\0\0\0[...]", 16104) = 16104
poll([{...}], 1, 1000) = 0 (Timeout)
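To check for the hang from a script without blocking the shell, the command can be run under a time limit. This sketch assumes the 'timeout' utility from GNU coreutils, which exits with status 124 when the limit expires; the function name check_for_hang is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a time limit (in seconds) and
# report whether it finished. Relies on GNU coreutils 'timeout', which
# returns 124 when the command had to be terminated.
check_for_hang() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "hung (no exit within ${limit}s): $*"
    else
        echo "exited with status $rc: $*"
    fi
}
```

For example, 'check_for_hang 30 iscsiadm -m session -P3' on the initiator.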
Increasing the 'node.session.timeo.replacement_timeout' parameter in /etc/iscsi/iscsid.conf may allow some devices to return to the 'running' state, after which they can be used normally, but the system as a whole remains in a broken state.
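One way to change the parameter is sketched below. The function name set_replacement_timeout and the example value of 300 are hypothetical, the file is assumed to already contain an uncommented line for the parameter, and 'sed -i' is assumed to be GNU sed. Existing node records may also need updating separately, e.g. with 'iscsiadm -m node -o update -n node.session.timeo.replacement_timeout -v 300', since iscsid.conf is assumed here to supply defaults only for newly discovered nodes.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the replacement_timeout line in an
# iscsid.conf-style file. Only an existing, uncommented line is changed.
set_replacement_timeout() {
    value="$1"
    conf="${2:-/etc/iscsi/iscsid.conf}"
    sed -i "s/^node\.session\.timeo\.replacement_timeout[[:space:]]*=.*/node.session.timeo.replacement_timeout = $value/" "$conf"
}
```

For example, 'set_replacement_timeout 300' as root on the initiator.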
Expected results:
The iSCSI sessions should either recover or, at the very least, be manually reconnectable with a logout followed by a login.