jBPM / JBPM-8588

Gracefully handle the generic KubernetesClientException on OpenShiftStartUpStrategy

    Details

    • Type: Enhancement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Cloud, KieServer
    • Labels:
    • Affects:
      User Experience
    • Docs QE Status:
      NEW
    • QE Status:
      NEW

      Description

      Scenario 1:
      When two or more KieServer pods are bootstrapped in a multi-KieServer-pod environment, there can be a race condition in which multiple pods try to create the same ConfigMap, and the following error can show up in the logs:

      19:01:27,997 ERROR [org.kie.server.services.openshift.impl.storage.cloud.KieServerStateOpenShiftRepository] (ServerService Thread Pool -- 76) Processing KieServerState failed.: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://172.30.0.1/api/v1/namespaces/bsig-cloud/configmaps. Message: configmaps "authoring-ha-kieserver" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=authoring-ha-kieserver, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=configmaps "authoring-ha-kieserver" already exists, metadata=ListMeta(_continue=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
      

      This error could be safely ignored since the ConfigMap will be created and the pods will work normally.

      The proposed change is to add a new catch block in this part of the code that handles this exception as a warning which can be safely ignored.
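A minimal sketch of what such a catch block could look like. This is not the actual jBPM code: `ClientException` below stands in for `io.fabric8.kubernetes.client.KubernetesClientException` (which exposes the HTTP status code via `getCode()`, visible as `code=409` in the log above), and the helper name is hypothetical:

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

public class ConfigMapCreator {

    private static final Logger LOG = Logger.getLogger(ConfigMapCreator.class.getName());
    static final int HTTP_CONFLICT = 409; // Kubernetes reason: AlreadyExists

    /** Stand-in for io.fabric8.kubernetes.client.KubernetesClientException. */
    public static class ClientException extends RuntimeException {
        private final int code;
        public ClientException(int code, String msg) { super(msg); this.code = code; }
        public int getCode() { return code; }
    }

    /**
     * Runs the ConfigMap create action. A 409 (AlreadyExists) is downgraded to a
     * warning and ignored, since another pod has already created the ConfigMap.
     * Returns true if this pod created it, false if it already existed;
     * any other failure is rethrown unchanged.
     */
    public static boolean createIgnoringConflict(Runnable createAction) {
        try {
            createAction.run();
            return true;
        } catch (ClientException e) {
            if (e.getCode() == HTTP_CONFLICT) {
                LOG.warning("ConfigMap already exists; created by another pod, safe to ignore: "
                        + e.getMessage());
                return false;
            }
            throw e; // anything other than AlreadyExists is still an error
        }
    }
}
```

The key point is that only the 409/AlreadyExists case is swallowed; all other `KubernetesClientException`s keep their current ERROR behavior.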

      In 7.4.0, a "Known Issue" should be documented alerting users that this error message can be safely ignored, explaining that it is a simple race condition between KieServer pods creating the ConfigMap used at runtime. The ConfigMap will be created and the pods will work as expected.

      Scenario 2:
      Intermittently, the Watcher is closed due to a seemingly random KubernetesClientException, such as this 'too old resource version' one:

      12:20:15,553 INFO  [org.kie.server.services.openshift.impl.OpenShiftStartupStrategy] (OkHttp https://172.30.0.1/...) Watcher closed.
      12:20:15,554 INFO  [org.kie.server.services.openshift.impl.OpenShiftStartupStrategy] (OkHttp https://172.30.0.1/...) too old resource version: 750726 (779798)
      

      This could be related to known issues in Kubernetes or the fabric8 kubernetes-client. While waiting for the lower-level library to address the issue, potential options from the upper-level API client perspective are:
      Option 1 (Short Term):
      Escalate the log message level, gracefully terminate the Watcher thread, and recommend a Pod recycle.

      Option 2 (Long Term):
      Refactor the Watcher logic out of OpenShiftStartupStrategy into a dedicated component with enhanced resiliency, such as the ability to restart the Watcher should it exit abnormally.
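Option 2 could take the shape of a small supervisor loop around the watch. The sketch below is a hypothetical `ResilientWatcher`, not existing jBPM code; the class name, the restart cap, and modeling the watch as a blocking `Runnable` are all assumptions for illustration:

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Supervises a blocking watch loop and restarts it when it exits abnormally
 * (e.g. a 'too old resource version' KubernetesClientException closing the
 * Watcher), up to a bounded number of restarts.
 */
public class ResilientWatcher {

    private final Runnable watchLoop;   // blocks until the underlying Watcher closes
    private final int maxRestarts;
    private final AtomicInteger restarts = new AtomicInteger();
    private volatile boolean stopped;

    public ResilientWatcher(Runnable watchLoop, int maxRestarts) {
        this.watchLoop = watchLoop;
        this.maxRestarts = maxRestarts;
    }

    /** Runs the watch loop, restarting it on abnormal exit up to maxRestarts times. */
    public void run() {
        while (!stopped) {
            try {
                watchLoop.run();
                return; // normal completion: Watcher closed cleanly
            } catch (RuntimeException e) {
                if (restarts.incrementAndGet() > maxRestarts) {
                    // Give up and surface the failure; at this point a Pod
                    // recycle (Option 1) would be the recommendation.
                    throw e;
                }
                // Otherwise fall through and start a fresh watch.
            }
        }
    }

    public int restartCount() { return restarts.get(); }

    public void stop() { stopped = true; }
}
```

A real implementation would also need to re-list and re-establish the watch from a fresh resourceVersion before each restart, since the stale version is exactly what caused the failure.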

                People

                • Assignee:
                  rhtevan Evan Zhang
                • Reporter:
                  zanini Ricardo Zanini Fernandes
                • Tester:
                  Jakub Schwan
                • Votes: 0
                • Watchers: 2
