Type: Enhancement
Resolution: Done
Priority: Major
Fix Version/s: 7.11.0.GA
Affects Version/s: 7.9.0.GA
Component/s: jBPM Core
Labels: None
Blocked: False
Ready: False
Affects: Documentation (Ref Guide, User Guide, etc.), Release Notes
Fix Build: CR1
Release Note Text: Undefined
Target Release: 7.11.0.GA
Git Pull Request: https://github.com/kiegroup/jbpm-work-items/pull/173
[QE] How to address?: ---
[QE] Why QE missed?: ---
Sprint: 2021 Week 04-06 (from Jan 25)

I just tested the Kafka Producer WIH and noticed that if the Kafka broker goes offline, the kie-server hangs and cannot process any new requests. It stays completely inoperable until it manages to reconnect to the Kafka broker (that is, until the broker comes back online).

This is the error I see in server.log.

This WARN keeps showing up until the producer reaches its default timeout (60s):

10:13:57,315 WARN  [org.apache.kafka.clients.NetworkClient] (kafka-producer-network-thread | pam) [Producer clientId=pam] Connection to node 0 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.

After the Kafka producer times out, this exception is thrown:

10:13:57,396 ERROR [org.jbpm.process.workitem.kafka.KafkaWorkItemHandler] (default task-58) Handler error: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for PAM_Events-0:120000 ms has passed since batch creation
	at deployment.kie-server.war//org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98)
	at deployment.kie-server.war//org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67)
	at deployment.kie-server.war//org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
	at deployment.kie-server.war//org.jbpm.process.workitem.kafka.KafkaWorkItemHandler.executeWorkItem(KafkaWorkItemHandler.java:124)
...

It seems the Kafka WIH sends the event in sync mode, blocking on the future returned by send():

producer.send(new ProducerRecord(topic,key, value)).get();

That's why the process hangs until the producer client reaches its default timeout. Maybe we should provide a way to configure the Kafka producer through parameters, or an option to use this WIH in an async or fire-and-forget way.
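
To make the difference concrete, here is a minimal Java sketch of the two calling styles. It does not use the Kafka client: a `CompletableFuture` that is never completed stands in for the future returned by `producer.send(...)` while the broker is offline, and the names `SendModesSketch`, `sendToOfflineBroker`, `sendWithBoundedWait` and `sendAsync` are hypothetical. With the real client, the async variant would typically pass a `Callback` to `producer.send(record, callback)` rather than calling `.get()`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

class SendModesSketch {

    // Hypothetical stand-in for producer.send(...) with the broker offline:
    // the returned future is never completed, so a plain get() would block.
    static CompletableFuture<String> sendToOfflineBroker() {
        return new CompletableFuture<>();
    }

    // Sync style with an upper bound: wait for the result, but give up
    // after timeoutMs instead of holding the calling thread indefinitely.
    static String sendWithBoundedWait(Future<String> pending, long timeoutMs) {
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        }
    }

    // Async (fire-and-forget) style: register a completion callback and
    // return immediately; the outcome is recorded whenever it arrives.
    static void sendAsync(CompletableFuture<String> pending,
                          AtomicReference<String> outcome) {
        pending.whenComplete((result, error) ->
                outcome.set(error == null ? result : "failed: " + error.getMessage()));
    }

    public static void main(String[] args) {
        // The caller is blocked for at most 100 ms.
        System.out.println(sendWithBoundedWait(sendToOfflineBroker(), 100));

        // The caller is not blocked at all; the callback fires later.
        CompletableFuture<String> pending = sendToOfflineBroker();
        AtomicReference<String> outcome = new AtomicReference<>("pending");
        sendAsync(pending, outcome);
        System.out.println(outcome.get());
        pending.complete("acknowledged"); // simulate the broker coming back
        System.out.println(outcome.get());
    }
}
```

In either variant the request thread is released quickly, which is the behavior this issue asks for; the trade-off of the async style is that a failed send has to be handled in the callback rather than by the caller.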

Solution

In the deployment descriptor, runtimeManager should be passed to the handler so that the Kafka send is executed through an async command:

new org.jbpm.process.workitem.kafka.KafkaWorkItemHandler(env['kafka_server'], "jbpm_id", "org.apache.kafka.common.serialization.StringSerializer", "org.apache.kafka.common.serialization.StringSerializer", classLoader, runtimeManager);

Also added global settings for the following properties, which apply to all producers created:

org.jbpm.process.workitem.kafka.reconnect.backoff.max.ms "The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms."

org.jbpm.process.workitem.kafka.reconnect.backoff.ms "The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker."

org.jbpm.process.workitem.kafka.request.timeout.ms "The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted."

org.jbpm.process.workitem.kafka.retries "Setting a value greater than zero will cause the client to resend any request that fails with a potentially transient error. It is recommended to set the value to either zero or `MAX_VALUE` and use corresponding timeout parameters to control how long a client should retry a request."

org.jbpm.process.workitem.kafka.retry.backoff.ms "The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios."

org.jbpm.process.workitem.kafka.enable.idempotence "When set to 'true', the producer will ensure that exactly one copy of each message is written in the stream. If 'false', producer retries due to broker failures, etc., may write duplicates of the retried message in the stream. Note that enabling idempotence requires MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to be less than or equal to 5, RETRIES_CONFIG to be greater than 0 and ACKS_CONFIG must be 'all'. If these values are not explicitly set by the user, suitable values will be chosen. If incompatible values are set, a ConfigException will be thrown."
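
One way such global settings could be mapped onto the producer configuration is sketched below. This is an assumption about the mechanism, not the handler's actual code: it supposes the prefixed names are read as JVM system properties and copied, with the prefix stripped, into the `Properties` handed to the producer. `KafkaProducerConfigSketch` and `producerOverrides` are hypothetical names.

```java
import java.util.Properties;

class KafkaProducerConfigSketch {

    // Prefix shared by the property names listed above.
    static final String PREFIX = "org.jbpm.process.workitem.kafka.";

    // Kafka producer settings that the issue says can be set globally.
    static final String[] TUNABLE_KEYS = {
        "reconnect.backoff.max.ms",
        "reconnect.backoff.ms",
        "request.timeout.ms",
        "retries",
        "retry.backoff.ms",
        "enable.idempotence"
    };

    // Copies every tunable that is present in 'source' under the prefixed
    // name into a new Properties object under the plain Kafka key.
    // Keys that are not set are left out so Kafka's defaults still apply.
    static Properties producerOverrides(Properties source) {
        Properties overrides = new Properties();
        for (String key : TUNABLE_KEYS) {
            String value = source.getProperty(PREFIX + key);
            if (value != null) {
                overrides.setProperty(key, value);
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        // Assumed usage: values supplied as JVM system properties.
        System.setProperty(PREFIX + "request.timeout.ms", "5000");
        System.setProperty(PREFIX + "retries", "0");

        Properties overrides = producerOverrides(System.getProperties());
        System.out.println("request.timeout.ms = " + overrides.getProperty("request.timeout.ms"));
        System.out.println("retries = " + overrides.getProperty("retries"));
    }
}
```

Under that assumption, a flag such as `-Dorg.jbpm.process.workitem.kafka.request.timeout.ms=5000` on the kie-server JVM would shorten how long each request waits for an unreachable broker.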

Attachments

Tests_kafka-event-emitter_PerProcessRuntimeMode.zip (31 kB, 2020/12/30 10:39 AM)
Tests_kafka-event-emitter.zip (27 kB, 2020/12/29 6:47 PM)

is related to

JBPM-9585 Kafka WIH puts kie-server hanging if Kafka Broker is offline. (allow async)

Resolved

Assignee: Enrique González Martínez (Inactive)
Reporter: Rafael Soares
Tester: Gonzalo Muñoz Fernández
QA Contact: Gonzalo Muñoz Fernández
Votes: 1
Watchers: 6
Created: 2020/12/29 6:46 PM
Updated: 2024/02/12 9:18 AM
Resolved: 2021/02/11 6:26 AM
