Hi guys, as you may know, the Kogito Operator uses the Infinispan Operator to create a predefined Infinispan instance running on OpenShift. Quite often we encounter an issue that is reproducible with these steps:
1. Install the Kogito Operator (this will also install Infinispan Operator).
2. Create a KogitoApp custom resource (CR) with this YAML:
    apiVersion: app.kiegroup.org/v1alpha1
    kind: KogitoApp
    metadata:
      name: example-quarkus
    spec:
      enablePersistence: true
      build:
        envs:
          # enable persistence
          - name: MAVEN_ARGS_APPEND
            value: "-Ppersistence"
        gitSource:
          contextDir: process-quarkus-example
          uri: 'https://github.com/kiegroup/kogito-examples'
          reference: master
This creates a KogitoApp CR and tells the Kogito Operator to provision Infinispan with one replica. The Kogito application runs on Quarkus, which uses the RemoteCacheManager of the Quarkus Infinispan Client extension. Up to this point everything works and the application is deployed.
3. Try changing the Infinispan configuration by editing the Infinispan CR a few times, and Infinispan won't be able to start properly. By editing I mean changing one of these 3 parameters:
    ...
    spec:
      container:
        cpu: ''
        extraJvmOpts: ''
        memory: ''
    ...
I generally want to change cpu and memory because the defaults are too low, and I also add `-Xmx2G` to extraJvmOpts so that Infinispan has more heap than the default 200 MB.
Anyway, if you make this change a couple of times, waiting after each change until the Infinispan pod has restarted, after ~5 times you will see java.nio.channels.OverlappingFileLockException in the Infinispan pod log.
There is also another issue attached at the bottom of the Gist, which was observed in the OpenShift event logs.
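For context on the OverlappingFileLockException mentioned above: in plain java.nio, this exception is thrown when one JVM tries to lock a file region that the same JVM already holds a lock on. The sketch below only demonstrates that general behaviour with a temporary file. It makes no claim about what the Infinispan server does internally, and the class name `OverlappingLockDemo`, the method `secondLockOverlaps` and the temp file name are made up for illustration.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Infinispan or Kogito.
public class OverlappingLockDemo {

    // Returns true if a second lock attempt on the same file, made from the
    // same JVM while the first lock is still held, is rejected with
    // OverlappingFileLockException.
    static boolean secondLockOverlaps(Path file) throws IOException {
        try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE)) {
            FileLock held = first.tryLock();
            if (held == null) {
                return false; // another process already holds the lock
            }
            try {
                // Same file, same JVM, lock still held: expected to throw.
                FileLock again = second.tryLock();
                if (again != null) {
                    again.release();
                }
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            } finally {
                held.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".lck");
        try {
            System.out.println("second lock rejected: " + secondLockOverlaps(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the failure in the pod follows this general pattern, it would suggest a lock file being acquired again before an earlier lock in the same server process was released, but that is an assumption to verify against the server code rather than a diagnosis.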
What I have found is that if I create only a KogitoInfra CR, which creates the Infinispan CR without running any KogitoApp, so that nothing is connected to Infinispan, I can restart it as many times as I want and it works without any issues. I even tried storing something in Infinispan via the Infinispan REST API from the pod and then changing the Infinispan configuration, and it worked like a charm after each restart.
However, as soon as I deploy a KogitoApp, so that it is connected to Infinispan using the HotRod client, and change the Infinispan CR a few times (waiting after each change for the Infinispan pod to restart), it breaks with the linked exception present in the logs.
To me it seems that this stops working once there is an actual connection to Infinispan using the HotRod client. I am not sure how this client works internally, but I would think that, in addition to real user data, the protocol also exchanges some sort of "control data" between the client and Infinispan, which might break if Infinispan is suddenly restarted? Not sure, but when pushing the data using the REST API (so without the HotRod client), where the connection is maintained only for the duration of the request, the exception didn't occur.
- is caused by
ISPN-12572 OverlappingFileLockException in OverlayLocalConfigurationStorage
- relates to
ISPN-12571 jcache/tck-runner-remote random failures starting server