  1. JBeret
  2. JBERET-461

Checkpoints must be set after skipped item

Details

    Description

      Cloned from https://github.com/jberet/jsr352/issues/116

      As discussed in WASdev/standards.jsr352.batch-spec#15, checkpoints must also be updated after an item is skipped. Otherwise the skipped item may be read again when the next chunk is processed.

      You can easily test this behavior by adding a test to ChunkSkipRetryIT:

          @Test
          public void retrySkipWrite0to10() throws Exception {
              params.setProperty("writer.fail.on.values", "0,1,2,3,4,5,6,7,8,9,10");
              params.setProperty("repeat.failure", "true");
              final ArrayList<List<Integer>> expected = new ArrayList<List<Integer>>();
      
              // 0 - 10 failed, re-written and failed and skipped
              expected.add(asList(11));
              expected.add(asList(12));
              expected.add(asList(13));
              expected.add(asList(14));
              expected.add(asList(15));
              expected.add(asList(16));
              expected.add(asList(17));
              expected.add(asList(18));
              expected.add(asList(19));
              expected.add(asList(20, 21, 22, 23, 24, 25, 26, 27, 28, 29));
      
              runTest(chunkSkipRetryXml, expected);
              // 20: read first 10 in first try, read again in retry 
              // +20: read second 10 in first try, read again in retry 
              // +10: read third 10 in first try, no retry needed
              verifyMetric(Metric.MetricType.READ_COUNT, 20 + 20 + 10);
              verifyMetric(Metric.MetricType.READ_SKIP_COUNT, 0);
              verifyMetric(Metric.MetricType.PROCESS_SKIP_COUNT, 0);
              verifyMetric(Metric.MetricType.WRITE_SKIP_COUNT, 11);
              verifyMetric(Metric.MetricType.ROLLBACK_COUNT, 2);
          }
      

      In this test, verifyMetric for READ_COUNT fails because the read count is 60 instead of 50. That's because all 10 items of the first chunk failed and were skipped. Since the checkpoint is not updated after a skip, no checkpoint is taken while the first 10 items are processed. When an exception occurs in the next chunk, the reader is rolled back to the previous checkpoint. Because no checkpoint exists yet, it reads the first items again.

      With many failing items, the same items would be read again and again.
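
      The effect can be shown with a small standalone model. This is only a sketch and not JBeret code: CountingReader and simulate are hypothetical names, and the model ignores the real retry semantics. It only shows how a rollback target that was not advanced after a skip causes skipped items to be read again.

```java
/** Simplified model of a chunk reader; not JBeret's implementation. */
final class CheckpointAfterSkipSketch {

    /** Hypothetical reader over 0..size-1 that counts reads and supports checkpoint/rollback. */
    static final class CountingReader {
        private final int size;
        private int position;    // index of the next item to read
        private int checkpoint;  // position restored on rollback
        int readCount;

        CountingReader(int size) {
            this.size = size;
        }

        /** Returns the next item, or null when the input is exhausted. */
        Integer read() {
            if (position >= size) {
                return null;
            }
            readCount++;
            return position++;
        }

        void checkpoint() {
            checkpoint = position;
        }

        void rollback() {
            position = checkpoint;
        }
    }

    /** Returns the total number of reads for 10 items when items 0..4 are skipped. */
    static int simulate(boolean checkpointAfterSkip) {
        CountingReader reader = new CountingReader(10);

        // First chunk: items 0..4 are read, fail, and are skipped.
        for (int i = 0; i < 5; i++) {
            reader.read();
        }
        if (checkpointAfterSkip) {
            reader.checkpoint();
        }

        // Next chunk: item 5 is read, then a failure rolls the reader back.
        reader.read();
        reader.rollback();

        // After the rollback, read until the input is exhausted.
        while (reader.read() != null) {
            // items are consumed; only the read count matters here
        }
        return reader.readCount;
    }

    public static void main(String[] args) {
        System.out.println("without checkpoint after skip: " + simulate(false));
        System.out.println("with checkpoint after skip:    " + simulate(true));
    }
}
```

      In this model the skipped items 0..4 are read a second time only when no checkpoint was taken after the skip, which mirrors the extra reads reported above.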

            People

              cfang@redhat.com Cheng Fang