Data Foundation Bugs / DFBUGS-2324

rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod crashes with ODF v4.19.0-75 deployment with host networking enabled


      Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI:

      rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod crashes with ODF v4.19.0-75 deployment on IBM Z with host networking enabled

      The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):

      IBM Z, Bare Metal

      The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc):

      Internal Mode (Converged Provider and Internal mode)

       

      The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):

      OCP: 4.19.0-ec.4

      ODF: v4.19.0-75

       

      Does this issue impact your ability to continue to work with the product?

      Yes

       

      Is there any workaround available to the best of your knowledge?

      No

       

      Can this issue be reproduced? If so, please provide the hit rate

      Yes

       

      Can this issue be reproduced from the UI?

      Yes

      If this is a regression, please provide more details to justify this:

       

      Steps to Reproduce:

      1. Deploy OCP 4.19.0-ec.4

      2. Deploy LSO and ODF v4.19.0-75

      3. Update the odf-operator CSV .spec.provider to IBM for the converged Internal and Provider mode deployment

      4. Create the Storage System with the host networking option enabled (a CLI sketch of steps 3 and 4 follows this list)
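
      A rough CLI equivalent of steps 3 and 4 is sketched below. The openshift-storage namespace, the CSV name odf-operator.v4.19.0-75, the StorageCluster name ocs-storagecluster, and the network.provider=host setting are assumptions based on the usual ODF/Rook defaults, not values taken from this cluster. The first patch sets .spec.provider on the odf-operator CSV; the second enables host networking on the StorageCluster.

      # oc -n openshift-storage patch csv odf-operator.v4.19.0-75 --type merge -p '{"spec":{"provider":{"name":"IBM"}}}'
      # oc -n openshift-storage patch storagecluster ocs-storagecluster --type merge -p '{"spec":{"network":{"provider":"host"}}}'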

      The exact date and time when the issue was observed, including timezone details:

       

      Actual results:

      The rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod is in CrashLoopBackOff (CLB) state
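
      The restart loop can be confirmed with standard pod-status checks; the openshift-storage namespace and the app=rook-ceph-rgw label are the ODF/Rook defaults and are assumed here (the pod name is taken from the log below):

      # oc get pods -n openshift-storage -l app=rook-ceph-rgw
      # oc describe pod -n openshift-storage rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-59d874f8jhn9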

       

      Expected results:

      rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod should be up and Running

      Logs collected and log location:

       

      # oc logs rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-59d874f8jhn9 -f
      Defaulted container "rgw" out of: rgw, log-collector, chown-container-data-dir (init)
      + exec radosgw --crush-location=host=worker-1-a3e18001-lnxero1-boe --keyring=/etc/ceph/keyring-store/keyring --default-log-to-stderr=true --default-err-to-stderr=true --default-mon-cluster-log-to-stderr=true '--default-log-stderr-prefix=debug ' --default-log-to-file=false --default-mon-cluster-log-to-file=false '--mon-host=[v2:172.23.235.17:3300],[v2:172.23.235.16:3300],[v2:172.23.235.15:3300]' --mon-initial-members=a,b,c --id=rgw.ocs.storagecluster.cephobjectstore.a --setuser=ceph --setgroup=ceph --foreground '--rgw-frontends=beast port=80 ssl_port=443 ssl_certificate=/etc/ceph/private/rgw-cert.pem ssl_private_key=/etc/ceph/private/rgw-key.pem' --rgw-mime-types-file=/etc/ceph/rgw/mime.types --rgw-realm=ocs-storagecluster-cephobjectstore --rgw-zonegroup=ocs-storagecluster-cephobjectstore --rgw-zone=ocs-storagecluster-cephobjectstore --rados-replica-read-policy=localize
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 deferred set uid:gid to 167:167 (ceph:ceph)
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 ceph version 19.2.1-120.el9cp (9d9d735fbda3c9cca21e066e3d8238ee9520d682) squid (stable), process radosgw, pid 4436
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework: beast
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: port, val: 80
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_port, val: 443
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_certificate, val: /etc/ceph/private/rgw-cert.pem
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_private_key, val: /etc/ceph/private/rgw-key.pem
      debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  1 init_numa not setting numa affinity
      debug 2025-04-17T14:09:45.220+0000 3fe7a69f800  1 v1 topic migration: starting v1 topic migration..
      debug 2025-04-17T14:09:45.220+0000 3fe7a69f800  1 v1 topic migration: finished v1 topic migration
      debug 2025-04-17T14:09:45.280+0000 3ffb1a5ab00 -1 LDAP not started since no server URIs were provided in the configuration.
      debug 2025-04-17T14:09:45.280+0000 3ffb1a5ab00  1 rgw main: Lua ERROR: failed to find luarocks
      debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework: beast
      debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework conf key: ssl_certificate, val: config://rgw/cert/$realm/$zone.crt
      debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework conf key: ssl_private_key, val: config://rgw/cert/$realm/$zone.key
      debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00  0 starting handler: beast
      debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 failed to bind address 0.0.0.0:443: Address already in use
      debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 ERROR: failed initializing frontend
      debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 ERROR:  initialize frontend fail, r = 98
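
      The last three log lines point to a port conflict: with host networking enabled, the beast frontend binds ports 80/443 directly on the worker node, and 0.0.0.0:443 is already in use there. A node-level check along the following lines (node name taken from the --crush-location argument above; ss -ltnp lists listening TCP sockets together with the owning process) should show what currently holds the port:

      # oc debug node/worker-1-a3e18001-lnxero1-boe -- chroot /host ss -ltnp | grep ':443'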

      Additional info:

       Unable to provide must-gather logs due to the following error, although I was able to collect the logs previously. Please let me know if you need any specific logs.
       

      Error running must-gather collection: gather did not start for pod must-gather-jjbcn: unable to pull image: ImagePullBackOff: Back-off pulling image "quay.io/rhceph-dev/ocs-must-gather:latest-4.19": ErrImagePull: [rpc error: code = Canceled desc = copying system image from manifest list: copying config: context canceled, initializing source docker://quay.io/rhceph-dev/ocs-must-gather:latest-4.19: reading manifest latest-4.19 in quay.io/rhceph-dev/ocs-must-gather: unauthorized: access to the requested resource is not authorized]
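
      Until the must-gather image issue is resolved, individual logs can still be pulled directly; a minimal sketch, assuming the default openshift-storage namespace:

      # oc logs -n openshift-storage deploy/rook-ceph-operator > rook-ceph-operator.log
      # oc logs -n openshift-storage rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-59d874f8jhn9 --previous > rgw-previous.log
      # oc get cephobjectstore,cephcluster -n openshift-storage -o yaml > ceph-crs.yaml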
