• Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 1.0.0.CR1
    • Affects Version: 0.10.0.Beta3
    • Component: postgresql-connector
    • Labels: None

      If a new type is introduced as an alias of a built-in type, e.g. CREATE DOMAIN money2 AS NUMERIC(12,2);, then it is not correctly handled in io.debezium.connector.postgresql.PostgresValueConverter.convertBinary(Column, Field, Object).
      The issue is visible during snapshots, as the value arrives as the logical Java class and thus cannot be converted as a binary value.

      See https://issues.jboss.org/browse/DBZ-1009?focusedCommentId=13757501&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13757501 for reference.

      Key problems

      • streaming mode delivers the byte representation, but snapshot mode delivers the logical Java class
      • how to handle default values?
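The first bullet above can be illustrated with a sketch. All names here are hypothetical, not Debezium's actual converter API; the point is that a converter for a NUMERIC-based domain column must accept both the raw byte representation seen in streaming mode and the already-decoded BigDecimal seen during a snapshot:

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: normalize a domain-typed NUMERIC column value that may
// arrive either as raw bytes (streaming) or as a decoded BigDecimal (snapshot).
public class DomainValueNormalizer {
    public static BigDecimal toDecimal(Object value) {
        if (value instanceof BigDecimal) {
            // snapshot path: JDBC has already decoded the value
            return (BigDecimal) value;
        }
        if (value instanceof byte[]) {
            // streaming path: the decoder ships the textual representation as bytes
            return new BigDecimal(new String((byte[]) value, StandardCharsets.UTF_8));
        }
        throw new IllegalArgumentException("Unsupported representation: " + value);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(new BigDecimal("12.34")));                       // snapshot shape
        System.out.println(toDecimal("12.34".getBytes(StandardCharsets.UTF_8)));      // streaming shape
    }
}
```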

            [DBZ-1413] Incorrect handling of type alias

            Jiri Pechanec added a comment - Released

            jpechane directed me here from DBZ-920. Although I see that this issue is related, I have trouble assessing whether it touches the same problem. If you consider an enum to be an alias of string, with the additional constraint that its value matches one of the pre-defined values, then I guess I have the same problem as the one described in this ticket.

            I have a problem with getting Postgres enum value out. I create an enum type similar to:

            CREATE TYPE test.test_type AS ENUM (
               'Value1',
               'Value2'
            );
            

            When I stream changes from a table that uses the type, the column is ignored by default. After setting "include.unknown.datatypes": true and
            "column.propagate.source.type": ".*", I can get the value out, but it is returned as bytes (the bytes converted to a string match the enum value) and the column type is
            UNKNOWN:

            "__debezium.source.column.type": "UNKNOWN",
            "__debezium.source.column.scale": "-1"
            

            If I understood correctly, when this issue is solved, the enum columns' values would be returned as strings and __debezium.source.column.type would be
            TEST_TYPE. Did I get that right?

            Krzysztof Lesniewski (Inactive) added a comment
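As a stop-gap while the enum column arrives as raw bytes (the behaviour described in the comment above), the textual label can be recovered on the consumer side. A minimal sketch, assuming the bytes are the UTF-8 text of the enum value (which matches the observation that "the bytes converted to string match the enum value"):

```java
import java.nio.charset.StandardCharsets;

// Sketch: recover the textual enum label from the raw bytes emitted when
// include.unknown.datatypes=true passes the column through undecoded.
public class EnumBytesDecoder {
    public static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = "Value1".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(raw)); // prints "Value1"
    }
}
```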

            I'd expect to get BigDecimal? Is it because the custom type does not provide the correct scale/length?

            Actually no. The PgResultSetMetaData actually reports the OID as 1700 (numeric) rather than the OID for the domain type. So during PostgresSnapshotChangeEventSource#getColumnValue, we actually end up building a SpecialValueDecimal because of that. I wonder if instead we should consider basing the switch logic off Column#getNativeType(), which actually is the OID that represents the domain type?

            Chris Cranford added a comment
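The idea in the comment above — keying the conversion switch off the column's native Postgres OID instead of the JDBC-reported type — can be sketched as follows. The OID 1700 for numeric is real; the OID 16391 for the MONEY2 domain is invented for illustration (user-defined types get installation-specific OIDs):

```java
// Sketch: dispatch on the native Postgres OID (as Column#getNativeType() would
// return it), so a domain type keeps its own identity instead of collapsing
// into the base type reported by JDBC metadata.
public class OidDispatch {
    static final int NUMERIC_OID = 1700;   // real pg_type OID for numeric
    static final int MONEY2_OID  = 16391;  // hypothetical OID for the MONEY2 domain

    public static String categorize(int nativeOid) {
        switch (nativeOid) {
            case NUMERIC_OID: return "numeric";
            case MONEY2_OID:  return "domain:money2";
            default:          return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(categorize(1700));   // prints "numeric"
        System.out.println(categorize(16391));  // prints "domain:money2"
    }
}
```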

            ccranfor@redhat.com
            I think the concurrency question has been quite alleviated by the switch to the new framework. Given the current characteristics I'd:

            • assume, require, and document non-thread-safety
            • if multithreaded access becomes necessary, I think copy-on-write semantics would be good enough, as the updates are really only very rare

            snapshot - why is that? I'd expect to get BigDecimal? Is it because the custom type does not provide the correct scale/length? If it is the case we might need to update/hook the common snapshotting code.

            Generally I believe all is solvable now.

            Jiri Pechanec added a comment
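The copy-on-write semantics suggested above can be sketched as follows (a simplified stand-in for TypeRegistry, not its real API): reads are lock-free against an immutable snapshot, and the rare addition of a newly discovered domain type replaces the whole map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch: copy-on-write type registry. Lookups never synchronize; the rare
// registration of a new type copies the map and swaps the volatile reference.
public class CopyOnWriteTypeRegistry {
    private volatile Map<Integer, String> oidToName = Collections.emptyMap();

    public String lookup(int oid) {
        return oidToName.get(oid);  // lock-free read path
    }

    public synchronized void register(int oid, String name) {
        Map<Integer, String> copy = new HashMap<>(oidToName);
        copy.put(oid, name);
        oidToName = Collections.unmodifiableMap(copy);  // publish the new snapshot
    }

    public static void main(String[] args) {
        CopyOnWriteTypeRegistry registry = new CopyOnWriteTypeRegistry();
        registry.register(16391, "money2");  // hypothetical OID for the MONEY2 domain
        System.out.println(registry.lookup(16391)); // prints "money2"
    }
}
```

This trades O(n) copies on write for zero read-side coordination, which fits the stated access pattern (updates only when a new domain type appears at runtime).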

            Chris Cranford added a comment - edited

            Hi jpechane

            The only problem is - is there concurrent access to the registry? I'd like to avoid adding synchronization to it if possible.

            I am not aware of any concurrent access. My hope is that for the most part building the TypeRegistry up-front with these domain types solves 99% of the use cases for users. The only time that we'd mutate the TypeRegistry is when a new table is added with a new domain type at runtime. If we'd rather not make it mutable, could we add that as a limitation that the connector should be restarted after adding new domain types and before adding new objects that use said domain types?

            have you checked the snapshot, e.g. JDBC? How does it behave in this case?

            In this case we get a SpecialValueDecimal with a decimalValue.

            could you please check the behaviour of having column.propagate.source.type set on one of the columns with a custom type?

            Without any changes, what I observe is that the schema parameters for the Field are populated with:

            __debezium.source.column.type=MONEY2
            __debezium.source.column.length=2147483647
            __debezium.source.column.scale=0
            

            This should be resolved automatically by making sure that the MONEY2 domain type is in the TypeRegistry and that the right attribute type modifiers are being applied to the type so that the length, precision, and scale are being resolved and therefore can be propagated downstream when this is enabled.

            Given that MONEY2 is defined as NUMERIC(12,2) in the above use case, I would expect it to then have these parameters defined as

            __debezium.source.column.type=MONEY2
            __debezium.source.column.length=12
            __debezium.source.column.scale=2
            

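The length/scale resolution described above can be made concrete: for NUMERIC-based domain types, Postgres stores precision and scale in the type modifier (pg_type.typtypmod / pg_attribute.atttypmod). A minimal sketch of decoding it, assuming the standard numeric typmod encoding of ((precision << 16) | scale) + 4, where 4 is the VARHDRSZ offset:

```java
// Sketch: decode precision and scale for NUMERIC(12,2) from a Postgres
// numeric type modifier, encoded as ((precision << 16) | scale) + 4.
public class NumericTypmod {
    public static int precision(int typmod) {
        return ((typmod - 4) >> 16) & 0xFFFF;
    }

    public static int scale(int typmod) {
        return (typmod - 4) & 0xFFFF;
    }

    public static void main(String[] args) {
        int typmod = ((12 << 16) | 2) + 4;      // what the catalog stores for NUMERIC(12,2)
        System.out.println(precision(typmod));  // prints 12
        System.out.println(scale(typmod));      // prints 2
    }
}
```

Resolving the typmod this way would yield exactly the __debezium.source.column.length=12 / __debezium.source.column.scale=2 parameters expected in the comment above.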

            Jiri Pechanec added a comment - ccranfor@redhat.com I guess Gunnar meant DBZ-808

            Chris Cranford added a comment - gunnar.morling , was your prior comment meant for something else? I'm not following exactly what you were trying to convey.

            Gunnar Morling added a comment - Please see the linked DBZ-1413 for something that'd probably be a good test case.

            ccranfor@redhat.com Cool, thanks for the investigation. The TypeRegistry plan is exactly what I had in mind: add another method that would try to find the type in case it is missing from the internal map. The only problem is - is there concurrent access to the registry? I'd like to avoid adding synchronization to it if possible.

            Also, before following this path - have you checked the snapshot, e.g. JDBC? How does it behave in this case?

            Also, last but not least, could you please check the behaviour of having column.propagate.source.type set on one of the columns with a custom type?

            Jiri Pechanec added a comment

            jpechane gunnar.morling:

            What do we get as the type name in decoder event?

            We get the domain type, money2 in all 3 decoder events.

            If the latter, can we use any Postgres metadata query to get the source type and do the correct transformation?

            We certainly can, but should we consider making TypeRegistry mutable? While we can fetch the base type for a newly added domain type as needed, I think it would make sense to cache it in TypeRegistry so that we avoid doing that lookup unnecessarily. What are your thoughts?

            Chris Cranford added a comment

              ccranfor@redhat.com Chris Cranford
              jpechane Jiri Pechanec
              Votes: 1
              Watchers: 5
