
Type: Bug
Resolution: Not a Bug
Priority: Critical
Fix Version/s: None
Affects Version/s: 8.1.0.GA-CR4, 8.1.0.Beta
Component/s: Clustering, OpenShift
Labels:
- intersmash

Blocked: False
Blocked Reason: None
Ready: False
Affects: User Experience
Target Release: 8.1.z.GA
Test Coverage: +
An EAP 8.1.0 Beta + Red Hat Data Grid (RHDG) 8.5.3.GA interoperability test on OpenShift, which validates EAP behavior during a remote RHDG failover, fails intermittently, indicating cache inconsistencies:

java.lang.AssertionError: 
1 expectation failed.
JSON path value doesn't match.
Expected: is "10"
  Actual: null

	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:73)
	at org.codehaus.groovy.reflection.CachedConstructor.doConstructorInvoke(CachedConstructor.java:60)
	at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrap.callConstructor(ConstructorSite.java:86)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:57)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:263)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:277)
	at io.restassured.internal.ResponseSpecificationImpl$HamcrestAssertionClosure.validate(ResponseSpecificationImpl.groovy:512)
	at io.restassured.internal.ResponseSpecificationImpl$HamcrestAssertionClosure$validate$1.call(Unknown Source)
	at io.restassured.internal.ResponseSpecificationImpl.validateResponseIfRequired(ResponseSpecificationImpl.groovy:696)
	at io.restassured.internal.ResponseSpecificationImpl.this$2$validateResponseIfRequired(ResponseSpecificationImpl.groovy)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:198)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:62)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:185)
	at io.restassured.internal.ResponseSpecificationImpl.body(ResponseSpecificationImpl.groovy:270)
	at io.restassured.specification.ResponseSpecification$body$1.callCurrent(Unknown Source)
	at io.restassured.internal.ResponseSpecificationImpl.body(ResponseSpecificationImpl.groovy:117)
	at io.restassured.internal.ValidatableResponseOptionsImpl.body(ValidatableResponseOptionsImpl.java:244)
	at org.jboss.qa.appsint.tests.eap.rhdg.eap8.session.offload.Eap8WebCacheOffloadedToOperatorRhdgTests.testValue(Eap8WebCacheOffloadedToOperatorRhdgTests.java:262)
	at org.jboss.qa.appsint.tests.eap.rhdg.eap8.session.offload.Eap8WebCacheOffloadedToOperatorRhdgTests.rhdgFailover(Eap8WebCacheOffloadedToOperatorRhdgTests.java:188)
...

The deployment is built via the EAP Maven plugin with the cloud-default-config layer plus the web-clustering, ejb, and ejb-dist-cache layers, and with the ejb-local-cache layer excluded.
The infinispan subsystem is configured to connect via HotRod:

/socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=rhdg:add(host=${env.JDG_HOST}, port=${env.JDG_PORT})
/subsystem=infinispan/remote-cache-container=rhdg-container:add(default-remote-cluster=data-grid-cluster)
/subsystem=infinispan/remote-cache-container=rhdg-container/remote-cluster=data-grid-cluster:add(socket-bindings=[rhdg])
/subsystem=infinispan/cache-container=web/invalidation-cache=rhdg-cache:add()
/subsystem=infinispan/cache-container=web/invalidation-cache=rhdg-cache/store=hotrod:add(remote-cache-container=rhdg-container,fetch-state=false,purge=false,passivation=false,shared=true)
/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache,value=rhdg-cache)
/subsystem=infinispan/remote-cache-container=rhdg-container:write-attribute(name=properties, value={infinispan.client.hotrod.auth_realm=default,infinispan.client.hotrod.use_auth=true,infinispan.client.hotrod.auth_username=${env.CACHE_USERNAME},infinispan.client.hotrod.auth_password=${env.CACHE_PASSWORD},infinispan.client.hotrod.auth_server_name=rhdg-host,infinispan.client.hotrod.sasl_properties.javax.security.sasl.qop=auth,infinispan.client.hotrod.sasl_mechanism=SCRAM-SHA-512,infinispan.client.hotrod.sni_host_name=rhdg-host,infinispan.client.hotrod.ssl_hostname_validation=false,infinispan.client.hotrod.trust_store_path=/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt,})
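
For readers who want to reproduce the connection outside EAP, the same Hot Rod client settings can be collected in a `java.util.Properties` object, which a standalone Hot Rod client can typically consume. The following is only a sketch under assumptions: the class name `HotRodClientProps` and the `build` helper are hypothetical, and the `server_list` entry is an assumed stand-in for the outbound socket binding used in the CLI script above. All other keys and values are copied from that script.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the reported test suite.
class HotRodClientProps {

    private static final String PREFIX = "infinispan.client.hotrod.";

    // Builds the client properties from an environment map
    // (JDG_HOST, JDG_PORT, CACHE_USERNAME, CACHE_PASSWORD).
    public static Properties build(Map<String, String> env) {
        Properties p = new Properties();
        // Assumption: a standalone client needs the server address explicitly;
        // in EAP this comes from the "rhdg" outbound socket binding.
        p.setProperty(PREFIX + "server_list",
                env.getOrDefault("JDG_HOST", "") + ":" + env.getOrDefault("JDG_PORT", ""));
        // Values below mirror the write-attribute(name=properties, ...) call.
        p.setProperty(PREFIX + "auth_realm", "default");
        p.setProperty(PREFIX + "use_auth", "true");
        p.setProperty(PREFIX + "auth_username", env.getOrDefault("CACHE_USERNAME", ""));
        p.setProperty(PREFIX + "auth_password", env.getOrDefault("CACHE_PASSWORD", ""));
        p.setProperty(PREFIX + "auth_server_name", "rhdg-host");
        p.setProperty(PREFIX + "sasl_properties.javax.security.sasl.qop", "auth");
        p.setProperty(PREFIX + "sasl_mechanism", "SCRAM-SHA-512");
        p.setProperty(PREFIX + "sni_host_name", "rhdg-host");
        p.setProperty(PREFIX + "ssl_hostname_validation", "false");
        p.setProperty(PREFIX + "trust_store_path",
                "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt");
        return p;
    }

    public static void main(String[] args) {
        Properties p = build(System.getenv());
        // Print the keys only, so that credentials never reach the console.
        p.stringPropertyNames().stream().sorted().forEach(System.out::println);
    }
}
```

Keeping the keys identical to the CLI script makes it easier to compare a standalone client's behavior with the EAP-managed one during the failover.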

The test creates an EAP cluster that offloads its web session cache to an RHDG cluster, then checks that the expected values are still in the cache after an RHDG instance is stopped ungracefully and the 3-replica RHDG cluster is scaled down to 2 immediately afterwards.

This is similar to ~~JBEAP-29870~~, but about an RHDG failover scenario, rather than an EAP one.
The overall configuration (layers + infinispan subsystem) has been validated already by developers, so we're setting this as a blocker for 8.1.0 GA.

Regarding the test logic, here's a source code fragment, enriched with numbered comments to emphasize the most relevant steps:

		// 1. start a 2-replica RHDG cluster, then, once it is well-formed, start a 2-replica EAP cluster
		setInitialClustersReplicas();
		List<Pod> pods = rhdgOpenShiftProvisioner.getPods();
		// 2. get a reference to the RHDG pod that will be deleted
		Pod podToFail = pods.get(0);
		log.debug("The \"{}\" pod will be terminated ungracefully to simulate Infinispan/RHDG failover",
				podToFail.getMetadata().getName());
		// 3. store a web session value, which is persisted to the remote Infinispan cache
		RequestSpecification session = RestAssured.given().accept(ContentType.JSON)
				.filter(new SessionFilter());
		putValue(session, 10);

		// 4. as noted in https://issues.redhat.com/browse/JBEAP-29870, a pause is needed before the pod
		// deletion: there is no guarantee that the data was replicated/persisted before an abrupt pod
		// deletion, and without the pause the test would fail intermittently.
		Thread.sleep(PAUSE_TO_ALLOW_DATA_REPLICATION_IN_SECONDS * 1000);
		testValue(session, 10);

		// 5. scale the RHDG cluster up to 3 replicas
		log.debug("Scaling Infinispan/RHDG cluster up to 3 replicas...");
		rhdgOpenShiftProvisioner.scale(3, true);

		// 6. delete the first RHDG pod;
		//	killing it will cause the RHDG Operator to try and redeploy it
		rhdgOpenShiftProvisioner.getOpenShift().deletePod(podToFail);

		// 7. scale the RHDG cluster down to 2 replicas immediately after the pod deletion,
		//	so the operator should:
		//	a. react to the #0-pod deletion by spinning it up again
		//	b. once it's ready, react to the scale-down request by deleting the #1-pod
		log.debug("Scaling Infinispan/RHDG cluster down to 2 replicas...");
		rhdgOpenShiftProvisioner.scale(2, true);

		// 8. read the value; this is where the test fails intermittently
		testValue(session, 10);
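
When investigating the final read, it can help to tell a value that is temporarily unavailable while the RHDG cluster rebalances from one that is permanently lost. A minimal sketch of a bounded polling check follows. It is not part of the reported test: `PollingCheck` and `awaitEquals` are hypothetical names, and in the real test the supplier would have to wrap the REST call that reads the session value.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical diagnostic helper, not part of the reported test suite.
class PollingCheck {

    /**
     * Re-reads a value until it equals the expected one or the timeout elapses.
     * Returns true if the expected value was observed in time, false otherwise
     * (including when the waiting thread is interrupted).
     */
    public static <T> boolean awaitEquals(Supplier<T> reader, T expected,
                                          long timeoutMillis, long intervalMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (Objects.equals(reader.get(), expected)) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the session read: returns null twice, then "10".
        int[] calls = {0};
        Supplier<String> flakyRead = () -> ++calls[0] < 3 ? null : "10";
        boolean seen = awaitEquals(flakyRead, "10", 2_000, 50);
        System.out.println("value observed: " + seen + " after " + calls[0] + " reads");
    }
}
```

If the value shows up within the timeout, the failure points to a timing window during the failover. If it never shows up, the session data was actually lost.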

As a final note, both EAP pods have clean logs at the end of the test execution, as do the 2 remaining RHDG pods.
Feel free to reach out for any additional details.


Attachments:
- eap-1-52lnj.log (35 kB, 2025/07/01 1:47 PM)
- eap-1-r78qs.log (42 kB, 2025/07/01 1:47 PM)
- rhdg-0.log (29 kB, 2025/07/02 4:14 PM)
- rhdg-1.log (22 kB, 2025/07/02 4:14 PM)
