Uploaded image for project: 'JBRULES'
  1. JBRULES
  2. JBRULES-3284

Duplicate job scheduling in org.drools.time.impl.DefaultTimerJobInstance.call()

This issue belongs to an archived project. You can view it, but you can't modify it. Learn more

    XMLWordPrintable

Details

    Description

      See the forum reference above for context. As requested, this is the bug report for issue F1 in the post. The intro and description are copied from the original post:

      I am making extensive use of the event processing features of the Drools rule engine. Upgrading from Drools 5.2.0.Final to Drools 5.3.0.Final broke 47 of my unit tests and also broke my functional tests. There seem to be multiple changes in Drools 5.3.0 that cause incorrect behavior and/or break backward compatibility. Here are the results of my investigation so far.

      Issue F1:
      While tracking down the failures of my unit tests, I experimented with the Broker example, and while probably not finding the root cause of the broken unit tests, I nonetheless came across what clearly seems to be incorrect behavior in the DefaultTimerJobInstance.call() method. The bug only reveals itself after all input has been processed, so it is masked by the fact that running the Broker demo through the entire sequence in stocktickstream.dat (1100 lines) takes a long time. In order to reveal the problem more quickly, run the demo with only the first 14 lines in stocktickstream.dat. Do so for both the 5.2.0 Broker demo (against Drools 5.2.0) and the 5.3.0 Broker demo (against Drools 5.3.0). The Broker example code in both versions is identical, so only the Drools-internal code changes matter.

      When running the 5.2.0 Broker demo to the end, you get one java.text.ParseException (given the structure of the example code, that's expected, albeit not elegant, and not the focus of our investigation). In particular, no matter how many lines stocktickstream.dat contains, you always get exactly one ParseException at the end.

      Contrast this with running the 5.3.0 Broker demo: At the end you get N occurrences of java.text.ParseException, where N is the number of lines in stocktickstream.dat. So for 14 lines you get 14 occurrences of ParseException. Looking at two specific methods shows us why:

      Method org.drools.examples.broker.events.EventFeeder.FeedJob.execute(JobContext):

              public void execute(JobContext context) {
                  this.sink.receive( ((FeedContext) context).event );
                  if ( this.source.hasNext() ) {
                      ((FeedContext) context).setEvent( this.source.getNext() );
                      this.trigger.setNextFireTime( ((FeedContext) context).getEvent().getDate() );
                      clock.scheduleJob( this,
                                         context,
                                         trigger );
                  }
              }
      

      Note in particular how this method already takes care of scheduling the next job execution by updating the next fire time of the job's existing FeedTrigger instance. Unfortunately, in Drools 5.3.0, DefaultTimerJobInstance.call() does a duplicate scheduling of the same job:

      Method org.drools.time.impl.DefaultTimerJobInstance.call():

          public Void call() throws Exception {
              this.trigger.nextFireTime(); // need to pop
              if ( handle.isCancel() ) {
                  return null;
              }
              this.job.execute( this.ctx );
              if ( handle.isCancel() ) {
                  return null;
              }
      
              // our triggers allow for flexible rescheduling
              Date date = this.trigger.hasNextFireTime();
              if ( date != null ) {
                  scheduler.internalSchedule( this );
              }
      
              return null;
          }
      

      So, every job is duplicated and that's why there are 2*N calls to org.drools.examples.broker.events.StockTickPersister.load() instead of N.

      I think the rescheduling inside DefaultTimerJobInstance.call() is incorrect. For one, it breaks backward compatibility, and it is unexpected. The job should be in charge of deciding whether there is another job to schedule or what to do. Implicitly scheduling the next job just by updating the trigger time is a little too much magic.

      Attachments

        Issue Links

          Activity

            People

              etirelli@redhat.com Edson Tirelli
              rcalmbac Richard Calmbach (Inactive)
              Archiver:
              rhn-support-ceverson Clark Everson

              Dates

                Created:
                Updated:
                Archived:

                PagerDuty