This addresses a potential design flaw in how the _id field of MongoDB CDC events is handled in the key struct of SourceRecords. Currently, _id fields are always represented as flat strings and do not expose type-specific information to consumers (e.g. String vs. ObjectId, String vs. Integer, full document, ...).
At the moment, one cannot even distinguish simple cases, e.g. a numeric _id field holding the integer 1234 vs. a string _id field holding "1234". For CREATE/READ events, consumers can re-create the correct (original) _id type from the "after" field found in the value struct, but this workaround is not available for idempotent change or delete events. For these, we need to rely on a correct _id type in order to refer to the corresponding data records and apply the changes or deletions at the consuming sink.
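The ambiguity can be illustrated with a small, self-contained Python sketch. It does not use Debezium or Kafka Connect code; `flat_key` and `typed_key` are hypothetical helpers, and the JSON-encoded key is only one possible way to keep the type information, not necessarily the approach that will be chosen.

```python
import json


def flat_key(doc_id):
    """Mimic the behaviour described above: every _id becomes a plain string."""
    return {"_id": str(doc_id)}


def typed_key(doc_id):
    """Hypothetical alternative: JSON-encode the _id so its type survives."""
    return {"_id": json.dumps(doc_id)}


int_id = 1234    # numeric _id
str_id = "1234"  # string _id with the same digits

# With flat strings, both documents end up with an identical key,
# so a sink cannot tell which record a delete event refers to.
print(flat_key(int_id) == flat_key(str_id))

# With a type-preserving encoding the keys differ, and the consumer
# can decode the original value including its type.
print(typed_key(int_id) == typed_key(str_id))
print(type(json.loads(typed_key(int_id)["_id"])).__name__)
print(type(json.loads(typed_key(str_id)["_id"])).__name__)
```

In this sketch the key stays a single string field, so it would not by itself require a different schema per _id type; whether that property matters depends on the serialization format in use.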
There are different solutions to tackle this enhancement, some of which would not work with Avro but only with JSON serialization of Kafka Connect records. There is already a pull request discussion dealing with this; see https://github.com/debezium/debezium/pull/258