RHEL-8109

target stuck on iSCSI Login timeout

    • sst_storage_io
    • ssg_platform_storage

      Description of problem:

      While the initiator is running I/O on a multipath device, dropping packets on the target with iptables eventually renders the target unusable, with repeating iSCSI Login timeout messages:

      18:16:11 kernel: iSCSI Login timeout on Network Portal 172.16.1.73:3260
      18:16:11 kernel: tx_data returned -32, expecting 48.
      18:16:11 kernel: iSCSI Login negotiation failed.

      Target setup:

      o- iqn.1994-05.com.redhat:st86 ................ [TPGs: 1]
        o- tpg1 ..................................... [no-gen-acls, no-auth]
          o- acls ................................... [ACLs: 1]
            o- iqn.1994-05.com.redhat:storageqe-86 .. [Mapped LUNs: 1]
              o- mapped_lun0 ........................ [lun0 fileio/st86 (rw)]
          o- luns ................................... [LUNs: 1]
            o- lun0 ................................. [fileio/st86 (filest86) (default_tg_pt_gp)]
          o- portals ................................ [Portals: 2]
            o- 172.16.0.73:3260 ..................... [OK]
            o- 172.16.1.73:3260 ..................... [OK]

      Target logs:
      18:15:54 kernel: iSCSI Login timeout on Network Portal 172.16.0.73:3260
      18:15:57 kernel: Did not receive response to NOPIN on CID: 0, failing connection for I_T Nexus iqn.1994-05.com.redhat:storageqe-86,i,0x00023d000003,iqn.1994-05.com.redhat:st86,t,0x01
      18:16:11 kernel: iSCSI Login timeout on Network Portal 172.16.1.73:3260
      18:16:11 kernel: tx_data returned -32, expecting 48.
      18:16:11 kernel: iSCSI Login negotiation failed.
      18:16:14 kernel: Did not receive response to NOPIN on CID: 0, failing connection for I_T Nexus iqn.1994-05.com.redhat:storageqe-86,i,0x00023d000002,iqn.1994-05.com.redhat:st86,t,0x01
      18:16:28 kernel: iSCSI Login timeout on Network Portal 172.16.1.73:3260
      18:16:28 kernel: tx_data returned -32, expecting 48.
      18:16:28 kernel: iSCSI Login negotiation failed.
      18:17:55 kernel: INFO: task iscsi_np:1124 blocked for more than 120 seconds.
      18:17:55 kernel: Not tainted 4.18.0-372.9.1.el8.x86_64 #1
      18:17:55 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      18:17:55 kernel: task:iscsi_np state stack: 0 pid: 1124 ppid: 2 flags:0x80004084
      18:17:55 kernel: Call Trace:
      18:17:55 kernel: __schedule+0x2d1/0x830
      18:17:55 kernel: schedule+0x35/0xa0
      18:17:55 kernel: schedule_timeout+0x274/0x300
      18:17:55 kernel: ? signal_wake_up_state+0x15/0x30
      18:17:55 kernel: ? __send_signal+0x354/0x4b0
      18:17:55 kernel: wait_for_completion+0x96/0x100
      18:17:55 kernel: iscsit_cause_connection_reinstatement+0x95/0xf0 [iscsi_target_mod]
      18:17:55 kernel: iscsit_stop_session+0xff/0x190 [iscsi_target_mod]
      18:17:55 kernel: iscsi_check_for_session_reinstatement+0x1e4/0x280 [iscsi_target_mod]
      18:17:55 kernel: iscsi_target_do_login+0x21d/0x570 [iscsi_target_mod]
      18:17:55 kernel: iscsi_target_start_negotiation+0x51/0xc0 [iscsi_target_mod]
      18:17:55 kernel: iscsi_target_login_thread+0x820/0xe10 [iscsi_target_mod]
      18:17:55 kernel: ? iscsi_target_login_sess_out+0x150/0x150 [iscsi_target_mod]
      18:17:55 kernel: kthread+0x10a/0x120
      18:17:55 kernel: ? set_kthread_struct+0x40/0x40
      18:17:55 kernel: ret_from_fork+0x35/0x40
      18:17:55 kernel: INFO: task iscsi_ttx:2342 blocked for more than 120 seconds.
      18:17:55 kernel: Not tainted 4.18.0-372.9.1.el8.x86_64 #1
      18:17:55 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      18:17:55 kernel: task:iscsi_ttx state stack: 0 pid: 2342 ppid: 2 flags:0x80004084
      18:17:55 kernel: Call Trace:
      18:17:55 kernel: __schedule+0x2d1/0x830
      18:17:55 kernel: ? iscsit_thread_get_cpumask+0x70/0x70 [iscsi_target_mod]
      18:17:55 kernel: schedule+0x35/0xa0
      18:17:55 kernel: schedule_timeout+0x274/0x300
      18:17:56 kernel: ? check_preempt_curr+0x62/0x90
      18:17:56 kernel: ? ttwu_do_wakeup+0x19/0x160
      18:17:56 kernel: ? iscsit_thread_get_cpumask+0x70/0x70 [iscsi_target_mod]
      18:17:56 kernel: wait_for_completion+0x96/0x100
      18:17:56 kernel: kthread_stop+0x48/0x100
      18:17:56 kernel: iscsit_close_connection+0x4e0/0x910 [iscsi_target_mod]
      18:17:56 kernel: ? __schedule+0x2d9/0x830
      18:17:56 kernel: ? iscsit_thread_get_cpumask+0x70/0x70 [iscsi_target_mod]
      18:17:56 kernel: iscsit_take_action_for_connection_exit+0x7e/0x100 [iscsi_target_mod]
      18:17:56 kernel: iscsi_target_tx_thread+0x174/0x1f0 [iscsi_target_mod]
      18:17:56 kernel: ? finish_wait+0x80/0x80
      18:17:56 kernel: kthread+0x10a/0x120
      18:17:56 kernel: ? set_kthread_struct+0x40/0x40
      18:17:56 kernel: ret_from_fork+0x35/0x40
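To see at a glance which portals are affected, the login timeouts in a log like the one above can be tallied per network portal. This is only a minimal sketch: the function name `count_login_timeouts` is hypothetical, and it assumes the portal address is the last field of each "iSCSI Login timeout" line, as it is in the messages quoted here.

```shell
#!/bin/sh
# count_login_timeouts (hypothetical helper name):
# reads kernel log text on stdin and prints "<portal> <count>"
# for every "iSCSI Login timeout on Network Portal" message.
count_login_timeouts() {
    awk '/iSCSI Login timeout on Network Portal/ { n[$NF]++ }
         END { for (p in n) print p, n[p] }' | sort
}
```

On the target it could be fed from the kernel log, for example `journalctl -k | count_login_timeouts`.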

      Version-Release number of selected component (if applicable):
      RHEL-8.6.0
      targetcli-2.1.53-2.el8.noarch
      kernel-4.18.0-372.9.1.el8.x86_64

      How reproducible:
      100%

      Steps to Reproduce:
      Initiator:
      1. dnf install -y iscsi-initiator-utils device-mapper-multipath fio
      2. /sbin/mpathconf --enable && systemctl start multipathd
      3. Discover the iSCSI LUN and log in (multipath)
      4. fio --bs='4k' --direct='1' --ioengine='libaio' --iodepth='16' --verify='crc32c' --verify_fatal='1' --do_verify='1' --rw='randwrite' --runtime='5m' --filename='/dev/mapper/mpatha' --name='fio_test' --numjobs='1' --verify_backlog='1024'
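The discovery and login step is not spelled out in the report. One plausible way to do it, assuming the standard `iscsiadm` sendtargets discovery followed by a node login on each portal, is sketched below. The helper name `login_portals` is hypothetical, and the exact commands used in the original test run are not known.

```shell
#!/bin/sh
# login_portals IQN PORTAL... (hypothetical helper, not from the report)
# Runs sendtargets discovery against each portal, then logs in to the
# given target through that portal, so that multipathd sees one path
# per portal.
login_portals() {
    iqn=$1
    shift
    for portal in "$@"; do
        iscsiadm -m discovery -t sendtargets -p "$portal" || return 1
        iscsiadm -m node -T "$iqn" -p "$portal" --login || return 1
    done
}
```

With the target setup shown in this report, the call would be `login_portals iqn.1994-05.com.redhat:st86 172.16.0.73:3260 172.16.1.73:3260`, after which `multipath -ll` should list the multipath device with both paths.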

      Target:
      1. While the fio command is running on the initiator, simulate packet drops using:
      iptables -I INPUT -i ens1f4 -p tcp --destination-port 3260 -j DROP; sleep 16; iptables -D INPUT -i ens1f4 -p tcp --destination-port 3260 -j DROP
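For repeated runs, the one-liner above can be wrapped in a small function so that the interface and the drop duration are parameters. This is a sketch only: `drop_iscsi_traffic` is a hypothetical name, and the iptables rule itself is unchanged from the command above.

```shell
#!/bin/sh
# drop_iscsi_traffic IFACE SECONDS (hypothetical helper)
# Inserts a rule dropping inbound TCP traffic to port 3260 on IFACE,
# waits SECONDS, then deletes the same rule again.
drop_iscsi_traffic() {
    iface=$1
    secs=$2
    iptables -I INPUT -i "$iface" -p tcp --destination-port 3260 -j DROP || return 1
    sleep "$secs"
    iptables -D INPUT -i "$iface" -p tcp --destination-port 3260 -j DROP
}
```

The values used in this report correspond to `drop_iscsi_traffic ens1f4 16`, run as root on the target.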

      Actual results:
      The iSCSI session state becomes transport-offline.
      The target remains stuck even after session logout.
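A quick way to confirm the stuck state on the target, without waiting for the hung-task warning, is to look for iSCSI target kernel threads in uninterruptible sleep. The sketch below assumes the threads are named `iscsi_*` (as `iscsi_np` and `iscsi_ttx` are in the traces above) and that the input comes from `ps -eo pid,stat,comm`. The function name `blocked_iscsi_threads` is hypothetical.

```shell
#!/bin/sh
# blocked_iscsi_threads (hypothetical helper):
# filters "ps -eo pid,stat,comm" output on stdin and prints
# "<pid> <comm>" for iscsi_* threads whose state starts with D
# (uninterruptible sleep).
blocked_iscsi_threads() {
    awk '$2 ~ /^D/ && $3 ~ /^iscsi_/ { print $1, $3 }'
}
```

Typical use would be `ps -eo pid,stat,comm | blocked_iscsi_threads`. For each PID it reports, root can read `/proc/<pid>/stack` to get the current kernel stack.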

      Expected results:
      target operating normally

      Additional info:
      Reproduced with various initiator HW and initiator OS versions. Other target kernel/OS versions have not been tried yet.
      The target uses a Chelsio T520 adapter without cxgbit offload.

            cleech@redhat.com Chris Leech
            mhoyer@redhat.com Martin Hoyer