  1. Subscription Watch
  2. SWATCH-3269

Update stage deployment tests to check the status of cronjobs & clowdjobinvocations


    • Type: Story
    • Resolution: Unresolved
    • Priority: Major

      If a cronjob fails in stage, it may indicate a regression. Automated testing in stage should include checks of the status of jobs, so that we can quickly detect regressions in scheduled logic.

      Acceptance criteria:

      • stage health check job updated to include test
      • test verified to fail when a job is returned having status "Failed"

      Using the Kubernetes API, the tests should:

      1. Loop through every job in the stage namespace, keeping only jobs started in the last 24 hours.
      2. Record the status of each job.
      3. Fail if any job has "Failed" status.
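The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual test: it assumes the jobs have already been fetched and decoded into plain dictionaries shaped like Kubernetes Job objects, and the function names (`parse_k8s_time`, `recent_jobs`, `job_status`, `check_jobs`) are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    # Kubernetes serializes timestamps as RFC 3339 in UTC, e.g. "2024-05-01T12:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def recent_jobs(jobs, now, window=timedelta(hours=24)):
    """Step 1: keep jobs whose .status.startTime is within `window` of `now`."""
    kept = []
    for job in jobs:
        start = job.get("status", {}).get("startTime")
        if start and now - parse_k8s_time(start) <= window:
            kept.append(job)
    return kept


def job_status(job):
    """Step 2: return "Failed" or "Complete", or None if the job is not terminal."""
    for cond in job.get("status", {}).get("conditions") or []:
        if cond.get("status") == "True" and cond.get("type") in ("Failed", "Complete"):
            return cond["type"]
    return None


def check_jobs(jobs, now):
    """Step 3: record statuses by job name and fail if any recent job failed."""
    statuses = {j["metadata"]["name"]: job_status(j) for j in recent_jobs(jobs, now)}
    failed = sorted(name for name, status in statuses.items() if status == "Failed")
    if failed:
        raise AssertionError(f"Failed jobs: {failed}")
    return statuses
```

Passing `now` in explicitly (rather than reading the clock inside the function) keeps the 24-hour window easy to exercise with fixed timestamps.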

      The easiest way to verify that the latest job run was successful is to inspect the `status` section of each job returned by the Kubernetes API, e.g.

      GET /apis/batch/v1/namespaces/rhsm-stage/jobs

      (equivalent to `oc get jobs`).
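As a rough sketch, the request above could be built with only the Python standard library. The helper name `build_jobs_request` and its parameters are hypothetical; the API server URL and the bearer token are assumed to be supplied by the test environment, which this ticket does not specify.

```python
import urllib.request


def build_jobs_request(api_server, namespace, token):
    """Build a GET request for the batch/v1 Job list in one namespace.

    `api_server` and `token` are assumptions: they must come from wherever
    the stage tests keep their cluster credentials.
    """
    url = f"{api_server.rstrip('/')}/apis/batch/v1/namespaces/{namespace}/jobs"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)
```

The resulting request can be passed to `urllib.request.urlopen`; the decoded JSON response is a JobList whose `items` array holds the individual Job objects.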

      For each job whose .status.startTime is within the last 24 hours, look at the most recent entry in .status.conditions and check whether its "type" is "Failed". The cronjob/clowdjobinvocation behind any failure can be identified from the .metadata.ownerReferences entries with kind "CronJob" or "ClowdJobInvocation": the reference's "name" value is the cronjob/clowdjobinvocation that failed.
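One way to read those two fields from a decoded Job object is sketched below. The helper names `terminal_status` and `owner_label` are hypothetical. The sketch assumes a condition only counts when its own `status` field is "True", and it uses `lastTransitionTime` to pick the most recent terminal condition.

```python
TERMINAL_TYPES = ("Complete", "Failed")
OWNER_KINDS = ("CronJob", "ClowdJobInvocation")


def terminal_status(job):
    """Return the type of the most recent terminal condition, or None."""
    conditions = job.get("status", {}).get("conditions") or []
    terminal = [
        c for c in conditions
        if c.get("type") in TERMINAL_TYPES and c.get("status") == "True"
    ]
    if not terminal:
        return None  # job is still running or suspended
    # RFC 3339 UTC timestamps in the same format sort correctly as strings.
    latest = max(terminal, key=lambda c: c.get("lastTransitionTime", ""))
    return latest["type"]


def owner_label(job):
    """Return "Kind/name" for the owning CronJob or ClowdJobInvocation, or None."""
    for ref in job.get("metadata", {}).get("ownerReferences") or []:
        if ref.get("kind") in OWNER_KINDS:
            return f"{ref['kind']}/{ref['name']}"
    return None
```

A job for which `terminal_status` returns None has no terminal condition yet and would simply be left out of the results.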

      Any job without a terminal status should be skipped.

      The test should log all recent jobs' owner objects and their statuses, and fail if any of them failed.

      Example output:

      Complete jobs:
      CronJob/floorist-swatch-tally-exporter
      CronJob/rhsm-subscriptions-egress
      CronJob/swatch-billable-usage-purge-remittances
      CronJob/swatch-billable-usage-retry-remittances
      CronJob/swatch-billable-usage-sync
      CronJob/swatch-contracts-offering-sync
      CronJob/swatch-contracts-subscription-sync
      CronJob/swatch-metrics-rhel-sync
      CronJob/swatch-metrics-sync
      CronJob/swatch-system-conduit-sync
      CronJob/swatch-tally-hourly
      CronJob/swatch-tally-purge
      CronJob/swatch-tally-purge-events 
      CronJob/swatch-tally-tally
      
      Failed jobs:
      ClowdJobInvocation/db-changelog-cleanup-2
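A report in the shape shown above could be produced along these lines. This is only a sketch: `format_report` is a hypothetical helper, and it assumes the test has already collected one `(owner, status)` pair per recent terminal job.

```python
def format_report(results):
    """Render (owner, status) pairs as "Complete jobs" and "Failed jobs" sections."""
    # Sets collapse repeated runs of the same owner (e.g. an hourly cronjob).
    complete = sorted({owner for owner, status in results if status == "Complete"})
    failed = sorted({owner for owner, status in results if status == "Failed"})
    lines = ["Complete jobs:", *complete, "", "Failed jobs:", *failed]
    return "\n".join(lines)
```

An owner whose runs both succeeded and failed within the window would appear in both sections, which keeps an intermittent failure visible in the log.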

      References:

      https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/job-v1/#JobStatus

              Assignee: Unassigned
              Reporter: Kevin Howell (khowell@redhat.com)