Operator Ecosystem / OPECO-1895

Auto-pruning of Objects created by layered products via Operators


    • Type: Epic
    • Resolution: Done
    • Priority: Critical
    • 2021Q3 Plan, 2021Q4 Plan
    • auto-pruning of Operator created objects
    • Done
    • OCPPLAN-6823 - Make OpenShift Operators more mature
    • Impediment
    • 0% To Do, 0% In Progress, 100% Done
    • Undefined
    • M

      Epic Goal

      The SDK lets Operator authors easily define strategies, based on time duration or resource limits (e.g. storage space, overall etcd object count, or user-defined settings), to clean up, auto-prune, or harvest the objects created by their Operators.

      • This harvester/cleanup process is part of a larger set of solutions to control bloat, which is key on very large clusters.
      • This is an ideal use case for the SDK: giving Operator authors a prewritten solution supports the SDK value proposition.
      • Good use of abstraction would create a ‘just works’ feature that also promotes best practices.

      Why is this important?

      Operators are widely used to control the deployment, configuration, and life cycle management of OpenShift layered products and products from our ecosystem.

      In many cases, Operators generate resources/objects on the cluster so that users can later trace the work history (e.g. Build history, k8s Jobs, etc.). This becomes a problem when Operators do not actively harvest/clean/prune those objects and they keep taking up cluster storage.

      Currently, it’s either up to Operator authors to come up with their own way to deal with this or, in the worst case, up to cluster admins to figure out when and which Operator-created objects can be safely removed instead of left hogging storage space on the cluster.

      Scenarios

      • Use Case 1:
        There is an ETL (extract/transform/load) Operator that manages a CR called DataSource. At some frequency, files that match that DataSource are created and pushed to a location that the Operator processes. Processing a file creates an ETL Job for that particular file, and the Job runs to completion. The ETL Operator might also want to capture the processing log for future reference, but the Job itself needs to be harvested/cleaned up at some point.
      • Use Case 2:
        There is a database Operator that manages a CR called BackupSchedule. At some frequency, the database creates a backup Job that runs to completion. The Job logs are archived into the database itself for future reference. Depending on the customer, the backup Jobs need to be harvested or cleaned from the system.

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • An optional solution that Operator SDK users can easily opt-in for their projects:
        • SDK generates predefined strategy code that runs as a background cron execution to clean up/prune/harvest Operator-created objects
        • SDK users could specify a cron expression to control how often Operator-created objects are cleaned up/pruned/harvested
        • SDK users could customize a strategy to set the cleanup/pruning/harvesting criteria in terms of resource limitation (e.g. storage space, or overall etcd object count)
        • SDK users could specify a preDelete hook function to support archival or log processing before the resource and its log are deleted from the cluster
        • SDK users could specify Pod/Job selection criteria, using selector patterns or similar, that determine which Jobs/Pods/etc. get harvested
        • SDK could provide a log of harvested resources for auditing purposes (i.e. to easily see which deletions were executed by the Operators)
      • New upstream documentation that introduces this feature and explains how to use it in Operator projects.

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. POC in operator-lib: https://github.com/operator-framework/operator-lib/pull/68

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>


            ryking@redhat.com Ryan King (Inactive)
            rhn-coreos-tunwu Tony Wu