Task
Resolution: Unresolved
Summary/Problem Statement
The rhsm-subscriptions-egress job has been failing in production for an extended period (SWATCH-4024) and is no longer used upstream. Its original use case has been replaced by Floorist queries for Telesense, and our Snowflake integration will take a different implementation path that doesn't require reading from these S3 buckets. Each failure is reported to the #alerts-swatch Slack channel, which creates noise and buries real critical alerts.
Acceptance Criteria
- Add delete: true to rhsm-subscriptions-egress resource template targets in deploy.yml for both stage and prod namespaces
- Add delete: true to rhsm-subscriptions-egress-post-deploy-tests target in post-deploy-tests.yaml
- Add delete: true to any rhsm-subscriptions-egress references in deploy-perf.yml if they exist
- Verify that standalone egress deployments are removed from stage and prod environments
- Confirm that swatch-tally service continues to function normally (including database cleanup jobs)
Done Criteria
- All standalone egress job deployments are disabled across all environments
- No standalone egress job containers are running in stage or production
- Main swatch-tally service continues to function normally (including database cleanup jobs)
- No S3 upload failures are occurring from the disabled egress job
- Evaluate whether SWATCH-4024 is resolved by these changes; if so, close it out
Implementation Notes
- Use delete: true on resource template targets in app-interface files to safely disable deployments
- Main deployment: data/services/insights/rhsm/deploy.yml
- Post-deploy tests: data/services/insights/rhsm/post-deploy-tests.yaml
- The swatch-tally ClowdApp references EGRESS_IMAGE parameters, but this is only for reusing the egress container image as a PostgreSQL utility container for database cleanup jobs - do not modify swatch-tally
- Deploy the changes to the stage environment first; after confirming the result in the next day's deployment cycle, open a separate MR to apply the same changes to production
- The actual egress functionality that exports data to S3 is deployed as separate standalone deployments, not as part of swatch-tally
- The egress post-deploy tests (rhsm-subscriptions-egress-post-deploy-tests) run automatically after successful egress deployments and publish to rhsm-subscriptions-egress-stage-post-deploy-tests-success-channel - these will be disabled along with the main deployment
- This is a safe disable operation that preserves all configurations for potential rollback
- This is Phase 1 of the egress decommissioning - subsequent cards will be created for complete cleanup (repository cleanup, documentation updates, etc.)
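
As a rough illustration of the delete: true change described in the notes above, a target entry might look like the sketch below. This is only a sketch: the surrounding keys follow the general shape of an app-interface saas file, and the namespace path, ref value, and bracketed placeholders are hypothetical rather than copied from data/services/insights/rhsm/deploy.yml.

```yaml
# Hypothetical excerpt; only the resource template name and the
# "delete: true" line come from this card. Everything else is a placeholder.
resourceTemplates:
- name: rhsm-subscriptions-egress
  url: <repository URL>          # placeholder, leave the real value unchanged
  path: <template path>          # placeholder, leave the real value unchanged
  targets:
  - namespace:
      $ref: <path to the stage namespace file>   # hypothetical reference
    ref: <existing ref>          # placeholder, leave the real value unchanged
    delete: true                 # added by this card to disable the deployment
```

The same one-line addition would apply to the rhsm-subscriptions-egress-post-deploy-tests target in post-deploy-tests.yaml. The swatch-tally entries stay untouched.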
Relates to
- SWATCH-4063 Decommission rhsm-subscriptions-egress (New)