OpenShift Bugs / OCPBUGS-29474

Disk naming is not persistent across reboots in CoreOS, defeating conventional fixes, affecting multiple environments, and requiring a robust solution.


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Affects Version/s: 4.12.z
    • Component/s: RHCOS
    • Severity: Critical
    • Customer Escalated, Customer Facing

      Description of problem:

      Here are the complete details of what we have observed and faced:

      • A few days back, we faced a low-IOPS issue on Azure.
      • We recommended using a disk with better IOPS.
      • To achieve this, we got on a remote session with the customer.
      • Interestingly, we found that there were two disks attached to the CoreOS node, and this happens only in Azure (the temporary disk) [1]; see the lsblk sketch below.
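      For illustration, this is roughly what the layout looks like on such an Azure node (hypothetical lsblk output; names and sizes are examples only). Here sda is the OS disk and sdb is the Azure temporary (resource) disk, which Azure attaches on many VM sizes in addition to any data disks:

      lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
      NAME   SIZE TYPE MOUNTPOINT
      sda    128G disk
      sda1     1M part
      sda2   127M part /boot/efi
      sda3   384M part /boot
      sda4   127G part /sysroot
      sdb     64G disk
      sdb1    64G part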

       

      ==============================================

      We then faced another big problem while adding an additional, faster disk specifically for etcd; we ran into the issues below:

      We continue to face this challenge with Red Hat CoreOS, and despite extensive efforts, we haven't identified a viable solution yet. One of our SPTSEs created a KCS article [2] back in the 4.3 timeframe, which addressed bare-metal deployments. However, given the current reliance on cloud providers, that approach appears impractical.

      We require assistance in brainstorming options to ensure that these mounts remain persistent across reboots. Utilizing /etc/fstab on CoreOS doesn't seem suitable or practical for our needs. Additionally, relying on /dev/disk/by-path and /dev/disk/by-id values presents challenges, since they differ for each machine and disk; therefore a single MachineConfig with secondary mounts wouldn't provide a comprehensive solution, as the sketch below illustrates.
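      To make that last point concrete, here is a minimal sketch (not a proposed fix) of the systemd mount unit such a MachineConfig would have to ship, assuming the etcd disk is addressed by its by-id symlink. The What= value is taken from the /dev/disk/by-id listing further below and belongs to one specific node's disk, so this exact unit cannot be rolled out cluster-wide from a single MachineConfig:

      # var-lib-etcd.mount -- the unit name must match the escaped mount path
      [Unit]
      Before=local-fs.target

      [Mount]
      # by-id value copied from the sdb entry in the listing below; node-specific
      What=/dev/disk/by-id/scsi-3600224808fe8bd31e72955aadc4cf77d
      Where=/var/lib/etcd
      Type=xfs

      [Install]
      WantedBy=local-fs.target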

       

      Would it be advisable to generate /etc/fstab entries using UUIDs and establish distinct machine-config-pools for each machine individually, given that this would entail hard-coded entries? While our documentation [3] offers some solutions concerning secondary disks, relying on disk names isn't reliable across reboots, resulting in instability for clients.
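      For reference, a UUID-based /etc/fstab entry would look roughly like the following (the UUID shown is made up). Because the UUID is generated when each node's filesystem is created, every node would need its own hard-coded entry, which is what leads to the per-machine machine-config-pool question above:

      # hypothetical entry for a dedicated etcd filesystem; the UUID differs on every node
      UUID=0b6e9f2c-8f2d-4a51-9c7e-6d1f2a3b4c5d  /var/lib/etcd  xfs  defaults  0 0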

      Furthermore, through extensive discussions across SBRs, we have delved into this matter in detail and have learned additional perspectives, summarized as follows.

      For creating a distinct, dedicated secondary /var partition, we have a documented procedure [4] that hard-codes the disk name as /dev/nvme1n1, the absolute path of the AWS block device. The procedure itself works, but any change in the disk name makes it fail, and the same approach runs into trouble on bare metal, VMware vSphere, or Azure, where disk names typically appear as /dev/sda or /dev/sdb. A sketch of that configuration follows.
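      Roughly, the Butane config from [4] looks like the following (paraphrased from the documentation, not verbatim; the partition offset is a placeholder). The hard-coded device path under storage.disks is the fragile part:

      variant: openshift
      version: 4.14.0
      metadata:
        name: 98-var-partition
        labels:
          machineconfiguration.openshift.io/role: worker
      storage:
        disks:
          - device: /dev/nvme1n1        # AWS-specific; would be /dev/sda or /dev/sdb elsewhere
            partitions:
              - label: var
                start_mib: 25000        # placeholder offset
                size_mib: 0             # 0 = use the rest of the disk
        filesystems:
          - device: /dev/disk/by-partlabel/var
            path: /var
            format: xfs
            mount_options: [defaults, prjquota]
            with_mount_unit: true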

      A similar procedure [5] is outlined for bare-metal UPI, which relies on by-id values. However, these values vary for each node, adding complexity to the procedure. Example by-id values from one node are shown below.

       

       

      ls -ltr /dev/disk/by-id
      total 0
      lrwxrwxrwx. 1 root root  9 Feb 12 19:05 wwn-0x60022480522cf2d84b3fb8c42ef578e6 -> ../../sda
      lrwxrwxrwx. 1 root root  9 Feb 12 19:05 scsi-360022480522cf2d84b3fb8c42ef578e6 -> ../../sda
      lrwxrwxrwx. 1 root root  9 Feb 12 19:05 scsi-14d53465420202020522cf2d84b3fa94cb8c2b8c42ef578e6 -> ../../sda
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-360022480522cf2d84b3fb8c42ef578e6-part1 -> ../../sda1
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-14d53465420202020522cf2d84b3fa94cb8c2b8c42ef578e6-part1 -> ../../sda1
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 wwn-0x60022480522cf2d84b3fb8c42ef578e6-part1 -> ../../sda1
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 wwn-0x60022480522cf2d84b3fb8c42ef578e6-part4 -> ../../sda4
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-360022480522cf2d84b3fb8c42ef578e6-part4 -> ../../sda4
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-14d53465420202020522cf2d84b3fa94cb8c2b8c42ef578e6-part4 -> ../../sda4
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 wwn-0x60022480522cf2d84b3fb8c42ef578e6-part2 -> ../../sda2
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-360022480522cf2d84b3fb8c42ef578e6-part2 -> ../../sda2
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-14d53465420202020522cf2d84b3fa94cb8c2b8c42ef578e6-part2 -> ../../sda2
      lrwxrwxrwx. 1 root root  9 Feb 12 19:05 ata-Virtual_CD -> ../../sr0
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 wwn-0x60022480522cf2d84b3fb8c42ef578e6-part3 -> ../../sda3
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-360022480522cf2d84b3fb8c42ef578e6-part3 -> ../../sda3
      lrwxrwxrwx. 1 root root 10 Feb 12 19:05 scsi-14d53465420202020522cf2d84b3fa94cb8c2b8c42ef578e6-part3 -> ../../sda3
      lrwxrwxrwx. 1 root root  9 Feb 12 19:40 wwn-0x600224808fe8bd31e72955aadc4cf77d -> ../../sdb
      lrwxrwxrwx. 1 root root  9 Feb 12 19:40 scsi-SMsft_Virtual_Disk_8FE8BD31E729E641A58E55AADC4CF77D -> ../../sdb
      lrwxrwxrwx. 1 root root  9 Feb 12 19:40 scsi-3600224808fe8bd31e72955aadc4cf77d -> ../../sdb
      lrwxrwxrwx. 1 root root  9 Feb 12 19:40 scsi-14d534654202020208fe8bd31e729e641a58e55aadc4cf77d -> ../../sdb
      

      So the Butane file would need different values for each node here, since the by-id of the secondary disk differs for each worker/master. We have many clients using KCS [6] as well, and it too relies on a hard-coded disk name, i.e. /dev/sdb, which isn't consistent. Oscar, in the comments section, suggests using UUIDs, but that doesn't seem feasible either; see the sketch below.
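      To spell out the trade-off, here is a sketch of the storage section such a Butane file would contain (paraphrased; this is not the exact content of [6]). Each of the three ways of naming the device is problematic for a single fleet-wide MachineConfig, as noted in the comments:

      storage:
        filesystems:
          # kernel name (/dev/sdb): not guaranteed to point at the same disk after a
          #   reboot, which is the instability we are seeing.
          # by-id: stable on a given node, but different on every node, e.g.
          #   /dev/disk/by-id/scsi-3600224808fe8bd31e72955aadc4cf77d from the listing above.
          # UUID: only exists after the filesystem is created, so it is also per-node.
          - device: /dev/sdb
            path: /var/lib/etcd
            format: xfs
            with_mount_unit: true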

      [1] https://learn.microsoft.com/en-us/azure/virtual-machines/managed-disks-overview#temporary-disk
      [2] https://access.redhat.com/solutions/5023051
      [3] https://docs.openshift.com/container-platform/4.12/scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.html#move-etcd-different-disk_recommended-etcd-practices
      [4] https://docs.openshift.com/container-platform/4.14/post_installation_configuration/node-tasks.html#machine-node-custom-partition_post-install-node-tasks
      [5] https://docs.openshift.com/container-platform/4.14/installing/installing_bare_metal/installing-bare-metal.html#installation-user-infra-machines-advanced_vardisk_installing-bare-metal
      [6] https://access.redhat.com/solutions/4952011

      Steps to Reproduce:

      Yes, it's 100% reproducible on Azure IPI OCP.

       

            Assignee: Jonathan Lebon (jlebon1@redhat.com)
            Reporter: Bharat Babbar (rhn-support-bbabbar)
            QA Contact: Michael Nguyen