Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version: odf-4.19
Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI:
rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod crashes with ODF v4.19.0-75 deployment on IBM Z with host networking enabled
The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):
IBM Z, Bare Metal
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc):
Internal Mode (Converged Provider and Internal mode)
The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
OCP: 4.19.0-ec.4
ODF: v4.19.0-75
Does this issue impact your ability to continue to work with the product?
Yes
Is there any workaround available to the best of your knowledge?
No
Can this issue be reproduced? If so, please provide the hit rate
Yes
Can this issue be reproduced from the UI?
Yes
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Deploy OCP 4.19.0-ec.4
2. Deploy LSO and ODF v4.19.0-75
3. Update the odf-operator CSV .spec.provider to IBM for the converged Internal and Provider mode deployment
4. Create the StorageSystem with the host networking option enabled
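Step 3 can be sketched as a CSV patch. This is a hedged sketch, not the verified procedure: the CSV name is a placeholder to look up first, and it assumes the operator is installed in openshift-storage and that .spec.provider is the standard OLM provider object (with a name field):

```shell
# Find the installed odf-operator CSV name first; <odf-operator-csv-name>
# below is a placeholder, not an actual resource name.
oc get csv -n openshift-storage

# Merge-patch spec.provider.name to IBM, leaving the rest of the CSV intact.
oc patch csv <odf-operator-csv-name> -n openshift-storage \
  --type merge -p '{"spec":{"provider":{"name":"IBM"}}}'
```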
The exact date and time when the issue was observed, including timezone details:
Actual results:
rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod is in CrashLoopBackOff (CLB) state
Expected results:
rook-ceph-rgw-ocs-storagecluster-cephobjectstore pod should be up and running
Logs collected and log location:
# oc logs rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-59d874f8jhn9 -f
Defaulted container "rgw" out of: rgw, log-collector, chown-container-data-dir (init)
+ exec radosgw --crush-location=host=worker-1-a3e18001-lnxero1-boe --keyring=/etc/ceph/keyring-store/keyring --default-log-to-stderr=true --default-err-to-stderr=true --default-mon-cluster-log-to-stderr=true '--default-log-stderr-prefix=debug ' --default-log-to-file=false --default-mon-cluster-log-to-file=false '--mon-host=[v2:172.23.235.17:3300],[v2:172.23.235.16:3300],[v2:172.23.235.15:3300]' --mon-initial-members=a,b,c --id=rgw.ocs.storagecluster.cephobjectstore.a --setuser=ceph --setgroup=ceph --foreground '--rgw-frontends=beast port=80 ssl_port=443 ssl_certificate=/etc/ceph/private/rgw-cert.pem ssl_private_key=/etc/ceph/private/rgw-key.pem' --rgw-mime-types-file=/etc/ceph/rgw/mime.types --rgw-realm=ocs-storagecluster-cephobjectstore --rgw-zonegroup=ocs-storagecluster-cephobjectstore --rgw-zone=ocs-storagecluster-cephobjectstore --rados-replica-read-policy=localize
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 deferred set uid:gid to 167:167 (ceph:ceph)
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 ceph version 19.2.1-120.el9cp (9d9d735fbda3c9cca21e066e3d8238ee9520d682) squid (stable), process radosgw, pid 4436
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework: beast
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: port, val: 80
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_port, val: 443
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_certificate, val: /etc/ceph/private/rgw-cert.pem
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  0 framework conf key: ssl_private_key, val: /etc/ceph/private/rgw-key.pem
debug 2025-04-17T14:09:43.690+0000 3ffb1a5ab00  1 init_numa not setting numa affinity
debug 2025-04-17T14:09:45.220+0000 3fe7a69f800  1 v1 topic migration: starting v1 topic migration..
debug 2025-04-17T14:09:45.220+0000 3fe7a69f800  1 v1 topic migration: finished v1 topic migration
debug 2025-04-17T14:09:45.280+0000 3ffb1a5ab00 -1 LDAP not started since no server URIs were provided in the configuration.
debug 2025-04-17T14:09:45.280+0000 3ffb1a5ab00  1 rgw main: Lua ERROR: failed to find luarocks
debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework: beast
debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework conf key: ssl_certificate, val: config://rgw/cert/$realm/$zone.crt
debug 2025-04-17T14:09:45.320+0000 3ffb1a5ab00  0 framework conf key: ssl_private_key, val: config://rgw/cert/$realm/$zone.key
debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00  0 starting handler: beast
debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 failed to bind address 0.0.0.0:443: Address already in use
debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 ERROR: failed initializing frontend
debug 2025-04-17T14:09:45.340+0000 3ffb1a5ab00 -1 ERROR: initialize frontend fail, r = 98
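The failure in the log is a bind conflict: r = 98 is errno 98 (EADDRINUSE), and with host networking the RGW beast frontend binds ports 80/443 directly on the node, so anything already listening on 443 there crashes the pod. A quick way to confirm this on the affected node (a sketch only; the node name is taken from the --crush-location in the log above, and it assumes ss is available in the node image):

```shell
# Decode the return code from "ERROR: initialize frontend fail, r = 98":
python3 -c 'import errno, os; print(errno.errorcode[98], "->", os.strerror(98))'
# -> EADDRINUSE -> Address already in use

# Check which process on the node already owns port 443
# (host networking means the RGW pod binds on the node itself):
oc debug node/worker-1-a3e18001-lnxero1-boe -- chroot /host ss -tlnp 'sport = :443'
```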
Additional info:
Unable to provide must-gather logs due to the following error (I was able to collect the logs previously). Please let me know if you need any specific logs.
Error running must-gather collection: gather did not start for pod must-gather-jjbcn: unable to pull image: ImagePullBackOff: Back-off pulling image "quay.io/rhceph-dev/ocs-must-gather:latest-4.19": ErrImagePull: [rpc error: code = Canceled desc = copying system image from manifest list: copying config: context canceled, initializing source docker://quay.io/rhceph-dev/ocs-must-gather:latest-4.19: reading manifest latest-4.19 in quay.io/rhceph-dev/ocs-must-gather: unauthorized: access to the requested resource is not authorized]