When using JBOD storage, each volume must have a unique ID. We currently do not validate this, so when the same ID is used multiple times it produces an invalid Pod definition: the Pod is deleted and never recreated:
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1:443/api/v1/namespaces/myproject/pods. Message: Pod "my-cluster-bodymoor-3000" is invalid: [spec.volumes[2].name: Duplicate value: "data-1", spec.containers[0].volumeMounts[2].mountPath: Invalid value: "/var/lib/kafka/data-1": must be unique]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.volumes[2].name, message=Duplicate value: "data-1", reason=FieldValueDuplicate, additionalProperties={}), StatusCause(field=spec.containers[0].volumeMounts[2].mountPath, message=Invalid value: "/var/lib/kafka/data-1": must be unique, reason=FieldValueInvalid, additionalProperties={})], group=null, kind=Pod, name=my-cluster-bodymoor-3000, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Pod "my-cluster-bodymoor-3000" is invalid: [spec.volumes[2].name: Duplicate value: "data-1", spec.containers[0].volumeMounts[2].mountPath: Invalid value: "/var/lib/kafka/data-1": must be unique], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
    at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238) ~[io.fabric8.kubernetes-client-api-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:507) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:754) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:98) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1155) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:98) ~[io.fabric8.kubernetes-client-6.12.0.jar:?]
    at io.strimzi.operator.cluster.operator.assembly.StrimziPodSetController.maybeCreateOrPatchPod(StrimziPodSetController.java:470) ~[io.strimzi.cluster-operator-0.41.0-SNAPSHOT.jar:0.41.0-SNAPSHOT]
    at io.strimzi.operator.cluster.operator.assembly.StrimziPodSetController.reconcile(StrimziPodSetController.java:398) ~[io.strimzi.cluster-operator-0.41.0-SNAPSHOT.jar:0.41.0-SNAPSHOT]
    at io.strimzi.operator.cluster.operator.assembly.StrimziPodSetController.run(StrimziPodSetController.java:566) ~[io.strimzi.cluster-operator-0.41.0-SNAPSHOT.jar:0.41.0-SNAPSHOT]
    at java.lang.Thread.run(Thread.java:840) ~[?:?]
The storage configuration should be validated so that duplicate volume IDs are rejected before any Pod definition is generated.
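A minimal sketch of such a check, assuming the validator receives the list of configured volume IDs. The class and method names here are illustrative, not the actual Strimzi API:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: reject a JBOD storage configuration whose
// volume IDs are not unique, before a Pod definition is generated.
public class JbodStorageValidator {

    /**
     * Checks that every volume ID in a JBOD storage configuration is
     * unique. Throws IllegalArgumentException listing the duplicated
     * IDs so the custom resource can be rejected up front, instead of
     * the API server rejecting the resulting Pod with a 422 error.
     */
    public static void validateVolumeIds(List<Integer> volumeIds) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> duplicates = new TreeSet<>();
        for (Integer id : volumeIds) {
            // Set.add returns false when the ID was already present
            if (!seen.add(id)) {
                duplicates.add(id);
            }
        }
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                "JBOD storage contains duplicate volume IDs: " + duplicates);
        }
    }
}
```

Failing fast in the operator turns the silent delete-and-never-recreate behavior into an explicit, actionable error on the Kafka resource.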
Links to:
- RHSA-2024:142550 Streams for Apache Kafka 2.8.0 release and security update
The problem described in this issue should be resolved by a recent advisory, so it has been closed.
For information on the advisory (Moderate: Streams for Apache Kafka 2.8.0 release and security update) and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2024:9571