Debezium / DBZ-8910

FieldToEmbedding SMT crashes when source field name is substring of embedding name

      If the existing unit test is modified as follows (note the change in embeddings.field.embedding):

          @Test
          public void testNestedFieldIsEmbeddedNested() {
              FieldToEmbedding<SourceRecord> embeddingSmt = new FieldToEmbedding<>();
              embeddingSmt.configure(Map.of(
                      "embeddings.field.source", "after.product",
                      "embeddings.field.embedding", "after.product_embedding"));
              SourceRecord transformedRecord = embeddingSmt.apply(SOURCE_RECORD);
      
              Struct payloadStruct = (Struct) transformedRecord.value();
              assertThat(payloadStruct.getStruct("after").getString("product")).contains("a product");
              assertThat(payloadStruct.getStruct("after").getArray("product_embedding")).contains(0.0f, 1.0f, 2.0f, 3.0f);
          }
      

      then it fails with

      org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [Nested field], found: java.lang.String
      	at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
      	at io.debezium.transforms.ConnectRecordUtil.buildUpdatedValue(ConnectRecordUtil.java:109)
      	at io.debezium.transforms.ConnectRecordUtil.buildUpdatedValue(ConnectRecordUtil.java:111)
      	at io.debezium.transforms.ConnectRecordUtil.makeUpdatedValue(ConnectRecordUtil.java:100)
      	at io.debezium.ai.embeddings.FieldToEmbedding.buildUpdatedRecord(FieldToEmbedding.java:180)
      	at io.debezium.ai.embeddings.FieldToEmbedding.apply(FieldToEmbedding.java:109)
      	at io.debezium.ai.embeddings.FieldToEmbeddingTest.testNestedFieldIsEmbeddedNested(FieldToEmbeddingTest.java:70)
      	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
      	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
      	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
      	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
      	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
      	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
      	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:520)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:748)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:443)
      	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:211)
      

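      The root cause is probably the use of contains in io.debezium.transforms.ConnectRecordUtil.isContainedIn(String, List<String>): the source path after.product is a substring of the embedding path after.product_embedding, so the String-valued product field appears to be treated as a parent struct of the embedding field, which would explain the requireStruct failure in the trace.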
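
      To illustrate the suspected cause, here is a minimal sketch that contrasts a substring check with a check that respects the dot-separated path segments. It is not the actual Debezium code: the class PathMatchSketch and both method names are hypothetical, and the body of substringMatch is only an assumption about what isContainedIn does.

```java
import java.util.List;

// Hypothetical sketch; not the actual ConnectRecordUtil implementation.
class PathMatchSketch {

    // Assumed current behaviour: a field path counts as "contained" when it
    // is a plain substring of any configured nested field path.
    static boolean substringMatch(String fieldPath, List<String> nestedFields) {
        return nestedFields.stream().anyMatch(nested -> nested.contains(fieldPath));
    }

    // Segment-aware alternative: the field path must equal the nested path
    // or be followed by a "." separator, i.e. be a true ancestor of it.
    static boolean segmentMatch(String fieldPath, List<String> nestedFields) {
        return nestedFields.stream()
                .anyMatch(nested -> nested.equals(fieldPath)
                        || nested.startsWith(fieldPath + "."));
    }

    public static void main(String[] args) {
        List<String> nested = List.of("after.product_embedding");

        // "after.product" is a substring of "after.product_embedding".
        System.out.println(substringMatch("after.product", nested)); // true

        // It is not an ancestor path, so the segment-aware check rejects it.
        System.out.println(segmentMatch("after.product", nested));   // false

        // "after" is a genuine parent struct and is still matched.
        System.out.println(segmentMatch("after", nested));           // true
    }
}
```

      With a check of the second kind, the String field after.product would no longer be mistaken for a parent of after.product_embedding, while the real parent struct after would still be traversed. Whether this matches the eventual fix is not stated in this issue.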

              vjuranek@redhat.com Vojtech Juranek
              jpechane Jiri Pechanec