Subscription Watch / SWATCH-4062

Disable rhsm-subscriptions-egress job


    • Type: Task
    • Resolution: Unresolved
    • Priority: Normal
    • swatch-database
    • 5
    • False
    • False

      Summary/Problem Statement

      The rhsm-subscriptions-egress job has been failing in production for an extended period (SWATCH-4024) and is no longer used upstream. Its original use case has been replaced by Floorist queries for Telesense, and our Snowflake integration will take a different implementation path that doesn't require reading from the S3 buckets this job exports to. Each failure is reported to the #alerts-swatch Slack channel, which creates noise and buries real critical alerts.

      Acceptance Criteria

      • Add delete: true to rhsm-subscriptions-egress resource template targets in deploy.yml for both stage and prod namespaces
      • Add delete: true to rhsm-subscriptions-egress-post-deploy-tests target in post-deploy-tests.yaml
      • Add delete: true to any rhsm-subscriptions-egress references in deploy-perf.yml if they exist
      • Verify that standalone egress deployments are removed from stage and prod environments
      • Confirm that swatch-tally service continues to function normally (including database cleanup jobs)

      Done Criteria

      • All standalone egress job deployments are disabled across all environments
      • No standalone egress job containers are running in stage or production
      • Main swatch-tally service continues to function normally (including database cleanup jobs)
      • No S3 upload failures are occurring from the disabled egress job
      • Evaluate whether SWATCH-4024 is resolved by these changes and, if so, close it out

      Implementation Notes

      • Use delete: true on resource template targets in app-interface files to safely disable deployments
      • Main deployment: data/services/insights/rhsm/deploy.yml
      • Post-deploy tests: data/services/insights/rhsm/post-deploy-tests.yaml
      • The swatch-tally ClowdApp references EGRESS_IMAGE parameters, but this is only for reusing the egress container image as a PostgreSQL utility container for database cleanup jobs - do not modify swatch-tally
      • Deploy the changes to the stage environment first; after confirming them in the next day's deployment cycle, open a separate MR to apply the same changes to production
      • The actual egress functionality that exports data to S3 is deployed as separate standalone deployments, not as part of swatch-tally
      • The egress post-deploy tests (rhsm-subscriptions-egress-post-deploy-tests) run automatically after successful egress deployments and publish to rhsm-subscriptions-egress-stage-post-deploy-tests-success-channel - these will be disabled along with the main deployment
      • This is a safe disable operation that preserves all configurations for potential rollback
      • This is Phase 1 of the egress decommissioning - subsequent cards will be created for complete cleanup (repository cleanup, documentation updates, etc.)
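
As a rough aid for reviewing the MR, the check below sketches how one might confirm that every egress target carries delete: true while swatch-tally is left untouched. It is only a sketch under assumptions: the key names (resourceTemplates, name, targets, namespace, $ref) are a guess at the app-interface file layout and are not confirmed by this ticket, the namespace paths are made up, and the function name targets_missing_delete is hypothetical. The input is an already-parsed file, so loading the YAML is left out.

```python
from typing import Any

# Resource template names taken from this ticket.
EGRESS_TEMPLATES = {
    "rhsm-subscriptions-egress",
    "rhsm-subscriptions-egress-post-deploy-tests",
}


def targets_missing_delete(saas_file: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (template name, namespace ref) pairs for egress targets without delete: true.

    Assumes a top-level 'resourceTemplates' list whose entries carry 'name' and
    'targets', and that each target has a 'namespace' mapping with a '$ref' key.
    These key names are assumptions, not something the ticket confirms.
    """
    missing = []
    for template in saas_file.get("resourceTemplates", []):
        if template.get("name") not in EGRESS_TEMPLATES:
            continue  # leave swatch-tally and every other template alone
        for target in template.get("targets", []):
            if target.get("delete") is not True:
                namespace = target.get("namespace", {}).get("$ref", "<unknown>")
                missing.append((template["name"], namespace))
    return missing


# Hypothetical, already-parsed excerpt of deploy.yml; the namespace paths are made up.
example = {
    "resourceTemplates": [
        {
            "name": "rhsm-subscriptions-egress",
            "targets": [
                {"namespace": {"$ref": "/example/stage.yml"}, "delete": True},
                {"namespace": {"$ref": "/example/prod.yml"}},
            ],
        },
        {
            "name": "swatch-tally",
            "targets": [{"namespace": {"$ref": "/example/prod.yml"}}],
        },
    ]
}

for name, namespace in targets_missing_delete(example):
    print(f"{name}: target {namespace} is missing delete: true")
```

In this made-up excerpt only the prod target of rhsm-subscriptions-egress is flagged, which matches the stage-first rollout described above: the stage MR would leave prod unflagged until the follow-up MR lands.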

              Assignee: Unassigned
              Reporter: Lindsey Burnett (lburnett0)