Details

      Description

      Cloned from https://github.com/jberet/jsr352/issues/116

      As discussed in WASdev/standards.jsr352.batch-spec#15, checkpoints must also be updated after an item is skipped. Otherwise the skipped item might be read again when the next chunk is processed.

      You can easily test this behavior by adding a test to ChunkSkipRetryIT:

          @Test
          public void retrySkipWrite0to10() throws Exception {
              params.setProperty("writer.fail.on.values", "0,1,2,3,4,5,6,7,8,9,10");
              params.setProperty("repeat.failure", "true");
              final ArrayList<List<Integer>> expected = new ArrayList<List<Integer>>();
      
              // 0 - 10 failed, re-written and failed and skipped
              expected.add(asList(11));
              expected.add(asList(12));
              expected.add(asList(13));
              expected.add(asList(14));
              expected.add(asList(15));
              expected.add(asList(16));
              expected.add(asList(17));
              expected.add(asList(18));
              expected.add(asList(19));
              expected.add(asList(20, 21, 22, 23, 24, 25, 26, 27, 28, 29));
      
              runTest(chunkSkipRetryXml, expected);
              // 20: read first 10 in first try, read again in retry 
              // +20: read second 10 in first try, read again in retry 
              // +10: read third 10 in first try, no retry needed
              verifyMetric(Metric.MetricType.READ_COUNT, 20 + 20 + 10);
              verifyMetric(Metric.MetricType.READ_SKIP_COUNT, 0);
              verifyMetric(Metric.MetricType.PROCESS_SKIP_COUNT, 0);
              verifyMetric(Metric.MetricType.WRITE_SKIP_COUNT, 11);
              verifyMetric(Metric.MetricType.ROLLBACK_COUNT, 2);
          }
      

      In this test the verifyMetric call for READ_COUNT fails because the read count is 60 instead of 50. That is because all 10 items of the first chunk failed and were skipped. Since the checkpoint is not updated after a skip, no checkpoint is taken while the first 10 items are processed. When the next chunk starts, an exception occurs again and the reader is rolled back to the previous checkpoint. Because there is no checkpoint yet, it reads the first 10 items again.

      With many failing items, a lot of items would be read again and again.
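The difference between 60 and 50 reads can be reproduced with a small stand-alone model. This is only a sketch under simplifying assumptions, not JBeret's implementation: the class ChunkReplaySimulation and its method countReads are hypothetical names, the chunk size of 10 is assumed from the test comments, and retry is modeled as re-reading items one at a time from the last checkpoint up to the end of the failed chunk.

```java
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical, simplified model of chunk processing with retry and skip.
// It is not JBeret code; it only mirrors the scenario described in this issue.
class ChunkReplaySimulation {

    // Returns the total number of reads for items 0..29 when the writer
    // fails on the values 0..10 and every failing item is skipped.
    static int countReads(boolean checkpointAfterSkip) {
        final int totalItems = 30;   // items 0..29, as in the test
        final int chunkSize = 10;    // assumed item-count of the chunk
        final Set<Integer> failing =
                IntStream.rangeClosed(0, 10).boxed().collect(Collectors.toSet());

        int readCount = 0;
        int checkpoint = 0;          // reader position stored at the last checkpoint
        int pos = 0;                 // start of the next chunk

        while (pos < totalItems) {
            final int end = Math.min(pos + chunkSize, totalItems);
            readCount += end - pos;  // first attempt reads the whole chunk

            boolean writeFailed = false;
            for (int item = pos; item < end; item++) {
                if (failing.contains(item)) {
                    writeFailed = true;
                    break;
                }
            }

            if (!writeFailed) {
                checkpoint = end;    // normal checkpoint at the end of the chunk
            } else {
                // Rollback: re-read one item at a time from the last checkpoint.
                for (int item = checkpoint; item < end; item++) {
                    readCount++;
                    final boolean skipped = failing.contains(item);
                    // A written item always advances the checkpoint; a skipped
                    // item advances it only when the proposed behavior is enabled.
                    if (!skipped || checkpointAfterSkip) {
                        checkpoint = item + 1;
                    }
                }
            }
            pos = end;
        }
        return readCount;
    }

    public static void main(String[] args) {
        System.out.println("without checkpoint after skip: " + countReads(false)); // 60
        System.out.println("with checkpoint after skip:    " + countReads(true));  // 50
    }
}
```

In this model the 10 extra reads come from the second rollback, which returns to position 0 because the first chunk never produced a checkpoint.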

                People

                • Assignee:
                  cfang Cheng Fang
                  Reporter:
                  cfang Cheng Fang