Type: Bug
Resolution: Unresolved
Priority: Minor
Affects Version: 1.9.6.Final
Fix Version: None
In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.
Bug report
For bug reports, provide this information, please:
What Debezium connector do you use and what version?
1.9.6
What is the connector configuration?
{ "name": "mongo-connector", "config": { "connector.class": "io.debezium.connector.mongodb.MongoDbConnector", "tasks.max": "1", "mongodb.hosts": "rs0/mongo:27017", "mongodb.name": "mongodbserver1", "mongodb.user": "root", "mongodb.password": "root", "mongodb.server.selection.timeout.ms": "60000000", "transforms": "unwrap", "transforms.unwrap.type": "io.debezium.connector.mongodb.transforms.ExtractNewDocumentState", "transforms.unwrap.sanitize.field.names": true, "transforms.unwrap.drop.tombstones": "false", "transforms.unwrap.delete.handling.mode": "drop", "transforms.unwrap.add.headers": "op" } }
What is the captured database version and mode of deployment?
(E.g. on-premises, with a specific cloud provider, etc.)
Local MongoDB 4.4, deployed as a replica set.
What behaviour do you expect?
Ingestion should not fail on documents containing empty arrays or empty sub-documents.
What behaviour do you see?
When a document contains an empty array (`[]`) or an empty sub-document (`{}`), the ExtractNewDocumentState transform fails with a parsing error.
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
Didn't try it
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
For the `{}` case I have a small snippet (I could not capture the full log):
Caused by: org.apache.kafka.connect.errors.DataException: Failed to find field 'attribute_values' in schema mongodbserver1.mydata.release.media.tracks.recording.relations
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:144)
    ....
and a more complete one for the `[]` case:
2022-12-08 11:30:43,265 WARN || [Producer clientId=connector-producer-mongo-connector-0] Error while fetching metadata with correlation id 57 : {mongodbserver1.music_brainz_api.release=LEADER_NOT_AVAILABLE} [org.apache.kafka.clients.NetworkClient]
2022-12-08 11:30:49,394 WARN || Field 'offset-count' name potentially not safe for serialization, replaced with 'offset_count' [io.debezium.schema.FieldNameSelector$FieldNameSanitizer]
2022-12-08 11:30:49,395 WARN || Field 'catalog-number' name potentially not safe for serialization, replaced with 'catalog_number' [io.debezium.schema.FieldNameSelector$FieldNameSanitizer]
2022-12-08 11:30:49,395 WARN || Field 'label-code' name potentially not safe for serialization, replaced with 'label_code' [io.debezium.schema.FieldNameSelector$FieldNameSanitizer]
2022-12-08 11:31:00,493 INFO || 53 records sent during previous 00:01:18.626, last recorded offset: {sec=1670499060, ord=1, transaction_id=null, resume_token=826391CAF4000000012B022C0100296E5A100463BEA335932C4C6D83991DE596D21920463C5F6964003C34386663303665312D663861342D343963342D613736322D323438353139623238383939000004, h=null} [io.debezium.connector.common.BaseSourceTask]
2022-12-08 11:31:07,338 INFO MongoDB|mongodbserver1|disc Checking current members of replica set at rs0/mongo:27017 [io.debezium.connector.mongodb.ReplicaSetDiscovery]
2022-12-08 11:31:16,406 ERROR || WorkerSourceTask{id=mongo-connector-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:223)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:149)
    at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:50)
    at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:355)
    at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:258)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.connect.errors.DataException: isrcs is not a valid field name
    at org.apache.kafka.connect.data.Struct.lookupField(Struct.java:254)
    at org.apache.kafka.connect.data.Struct.put(Struct.java:202)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:214)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:151)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:265)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.lambda$convertFieldValue$0(MongoDataConverter.java:189)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:181)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:265)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.lambda$convertFieldValue$0(MongoDataConverter.java:189)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertFieldValue(MongoDataConverter.java:181)
    at io.debezium.connector.mongodb.transforms.MongoDataConverter.convertRecord(MongoDataConverter.java:60)
    at io.debezium.connector.mongodb.transforms.ExtractNewDocumentState.newRecord(ExtractNewDocumentState.java:324)
    at io.debezium.connector.mongodb.transforms.ExtractNewDocumentState.apply(ExtractNewDocumentState.java:264)
    at org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:173)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:207)
    ... 11 more
2022-12-08 11:31:16,406 INFO || Stopping down connector [io.debezium.connector.common.BaseSourceTask]
2022-12-08 11:31:17,119 INFO MongoDB|mongodbserver1|streaming Closing all connections to rs0/mongo:27017 [io.debezium.connector.mongodb.ConnectionContext]
2022-12-08 11:31:17,126 INFO MongoDB|mongodbserver1|streaming Finished streaming [io.debezium.pipeline.ChangeEventSourceCoordinator]
2022-12-08 11:31:17,127 INFO MongoDB|mongodbserver1|streaming Connected metrics set to 'false' [io.debezium.pipeline.ChangeEventSourceCoordinator]
2022-12-08 11:31:17,128 INFO || [Producer clientId=connector-producer-mongo-connector-0] Closing the Kafka producer with timeoutMillis = 30000 ms. [org.apache.kafka.clients.producer.KafkaProducer]
2022-12-08 11:31:17,131 INFO || Metrics scheduler closed [org.apache.kafka.common.metrics.Metrics]
2022-12-08 11:31:17,131 INFO || Closing reporter org.apache.kafka.common.metrics.JmxReporter [org.apache.kafka.common.metrics.Metrics]
2022-12-08 11:31:17,131 INFO || Metrics reporters closed [org.apache.kafka.common.metrics.Metrics]
2022-12-08 11:31:17,131 INFO || App info kafka.producer for connector-producer-mongo-connector-0 unregistered [org.apache.kafka.common.utils.AppInfoParser]
How to reproduce the issue using our tutorial deployment?
A simple unit/integration test should be enough to reproduce it, using the following JSON for the `{}` case (snipped at the end), where the issue is related to the empty `attribute-values` near the bottom:
{ "_id" : "987f3e2d-22a6-4a4f-b840-c80c26b8b91a", "quality" : "normal", "date" : "2019-06-14", "asin" : "null", "status-id" : "4e304316-386d-3409-af2e-78857eec5cfe", "status" : "Official", "disambiguation" : "", "text-representation" : { "script" : "Latn", "language" : "eng" }, "relations" : [ ], "release-events" : [ { "date" : "2019-06-14", "area" : { "name" : "[Worldwide]", "disambiguation" : "", "id" : "525d4e18-3d00-31b9-a58b-a146a916de8f", "type" : "null", "type-id" : "null", "sort-name" : "[Worldwide]", "iso-3166-1-codes" : [ "XW" ] } } ], "packaging-id" : "119eba76-b343-3e02-a292-f0f00644bb9b", "country" : "XW", "media" : [ { "title" : "", "tracks" : [ { "id" : "33781879-1dae-422c-a634-b26f89705e48", "position" : "1", "title" : "Become Desert", "length" : "2422450", "number" : "1", "recording" : { "first-release-date" : "2019-06-14", "title" : "Become Desert", "length" : "2422450", "disambiguation" : "", "id" : "d90bc0ff-c7d9-4c09-a12b-d46f46f7281d", "artist-credit" : [ { "artist" : { "sort-name" : "Seattle Symphony", "type" : "Orchestra", "type-id" : "a0b36c92-3eb1-3839-a4f9-4799823f54a5", "name" : "Seattle Symphony", "disambiguation" : "", "id" : "0b51c328-1f2b-464c-9e2c-0c2a8cce20ae" }, "joinphrase" : ", ", "name" : "Seattle Symphony" } ], "video" : "false", "relations" : [ { "target-type" : "artist", "type-id" : "234670ce-5f22-4fd0-921b-ef1662695c5d", "type" : "conductor", "target-credit" : "", "attribute-values" : { }, ...etc...
and a JSON snippet for the `[]` case, where the issue is the empty `isrcs` at the end:
{ "_id" : "86d82194-ade8-4822-ba20-1c37703ed19f", "status-id" : "4e304316-386d-3409-af2e-78857eec5cfe", "quality" : "normal", "release-group" : { "first-release-date" : "", "primary-type-id" : "null", "primary-type" : "null", "secondary-types" : [ ], "title" : "The Treatment", "disambiguation" : "", "id" : "4634be1c-553b-3b2b-8340-c8d83f4879f9", "secondary-type-ids" : [ ], "artist-credit" : [ { "artist" : { "name" : "The Treatment", "id" : "694c8123-cae2-432b-ae4b-5c8d9c409c41", "disambiguation" : "unknown, album 'The Treatment'", "type-id" : "null", "type" : "null", "sort-name" : "The Treatment" }, "joinphrase" : "", "name" : "The Treatment" } ] }, "asin" : "null", "status" : "Official", "text-representation" : { "language" : "eng", "script" : "Latn" }, "disambiguation" : "", "media" : [ { "track-count" : "12", "position" : "1", "format" : "null", "title" : "", "tracks" : [ { "artist-credit" : [ { "artist" : { "name" : "The Treatment", "disambiguation" : "unknown, album 'The Treatment'", "id" : "694c8123-cae2-432b-ae4b-5c8d9c409c41", "sort-name" : "The Treatment", "type-id" : "null", "type" : "null" }, "joinphrase" : "", "name" : "The Treatment" } ], "recording" : { "title" : "GI Blues", "length" : "119920", "disambiguation" : "", "id" : "8c51f247-33a6-42b4-ac76-59e51ef45ffe", "video" : "false", "artist-credit" : [ { "artist" : { "sort-name" : "The Treatment", "type-id" : "null", "type" : "null", "id" : "694c8123-cae2-432b-ae4b-5c8d9c409c41", "disambiguation" : "unknown, album 'The Treatment'", "name" : "The Treatment" }, "joinphrase" : "", "name" : "The Treatment" } ], "isrcs" : [ ] }, ....etc
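The two large documents above can likely be reduced to minimal reproducers, since the failures point at the empty sub-document (`attribute-values`) and the empty array (`isrcs`). A hedged sketch of such minimal documents (the field names come from the snippets above; the reduction itself is an assumption and has not been verified against the connector):

```python
import json

# Minimal candidate for the `{}` case: an empty sub-document nested
# inside an array element ("attribute-values" as in the snippet above).
empty_doc_case = {
    "_id": "doc-1",
    "relations": [
        {"type": "conductor", "attribute-values": {}}
    ],
}

# Minimal candidate for the `[]` case: an empty array nested inside
# a sub-document ("isrcs" as in the snippet above).
empty_array_case = {
    "_id": "doc-2",
    "recording": {"title": "GI Blues", "isrcs": []},
}

# Inserting either document into a captured collection should, if the
# reduction holds, trigger the ExtractNewDocumentState error shown above.
print(json.dumps(empty_doc_case))
print(json.dumps(empty_array_case))
```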
Feature request or enhancement
For feature requests or enhancements, provide this information, please:
Which use case/requirement will be addressed by the proposed feature?
Ingestion of a public API's data via MongoDB/Debezium.
Implementation ideas (optional)
A unit/integration test to reproduce and then fix the issue might be enough.