OCPBUGS-64717

OLMv1: No mechanism for operators to dynamically configure their deployment

    • Bug
    • Resolution: Duplicate
    • Major
    • 4.21
    • OLM
    • Quality / Stability / Reliability
    • False
    • Important
    • Rejected

      Description of problem:

      With OLMv0, operators can dynamically modify their deployment configuration by updating the CSV, and the CSV controller applies those changes to the deployment. OLMv1's ClusterExtension controller reverts any deployment modifications to match the bundle spec, preventing operators from dynamically configuring their deployments at runtime.

      Background context for Cost Management Metrics Operator:

      Our operator collects Prometheus metrics and ships them to console.redhat.com for cost analysis.

      We wanted the operator to be as simple as possible to maintain and for a user to install, and we needed a way to persist data between operator restarts. The operator doesn't do much: it gathers Prometheus metrics and ships them somewhere else. We didn't want to build an operator that has to create a PVC and then create a second deployment that mounts that PVC.

      Once the operator is installed and configured, it modifies its own CSV to add a PVC to the deployment. This gave us what we needed without having to maintain many different components, and it is what we have been doing since OCP 4.5.
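
      For illustration, the CSV change described above might look roughly like the fragment below. This is a sketch, not the operator's actual manifest: the CSV name, deployment name, container name, volume name, claim name, and mount path are all hypothetical placeholders. Under OLMv0 the pod template under spec.install.spec.deployments is what gets applied to the managed Deployment, so adding the volume there is how it would reach the pod.

      ```yaml
      # Hypothetical CSV fragment; every name and path here is a placeholder.
      apiVersion: operators.coreos.com/v1alpha1
      kind: ClusterServiceVersion
      metadata:
        name: example-operator-csv              # placeholder
        namespace: costmanagement-metrics-operator
      spec:
        install:
          strategy: deployment
          spec:
            deployments:
            - name: example-controller-manager  # placeholder
              spec:
                # selector, replicas, image, etc. omitted; unchanged from the bundle
                template:
                  spec:
                    containers:
                    - name: manager             # placeholder
                      volumeMounts:
                      - name: operator-data     # must match volumes[].name below
                        mountPath: /data        # placeholder path
                    volumes:
                    - name: operator-data
                      persistentVolumeClaim:
                        claimName: example-operator-data  # PVC created by the operator
      ```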

      Maybe today there is a different way to bring a PVC to an operator deployment. But at the time we started work on this operator, there weren't any good options that didn't require the customer to configure persistent storage.

       
      This pattern has worked successfully since OCP 4.5 with OLMv0.
       
      Current Behavior with OLMv1
      What Works

      • Operator installs via ClusterExtension
      • Operator pod starts and runs
      • CR detection
      • PVC creation

      What Doesn't Work

      • Operator cannot mount the created PVC
      • Infinite restart loop (deployment constantly updated/reverted)
      • CR never fully reconciles
      • Persistent storage feature lost
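
      The update/revert cycle can be pictured on the Deployment's pod template. The fragment below is a sketch with hypothetical container, volume, and claim names: the commented stanzas are what the operator needs on the Deployment, and since they are not part of the bundle's Deployment spec, they are what gets removed when the ClusterExtension controller restores the bundle spec. Each add or remove changes the pod template, which triggers a new rollout and would be consistent with the restarts reported here.

      ```yaml
      # Hypothetical Deployment fragment; names and paths are placeholders.
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: example-controller-manager        # placeholder
        namespace: costmanagement-metrics-operator
      spec:
        template:
          spec:
            containers:
            - name: manager                     # placeholder
              volumeMounts:                     # needed by the operator, absent from the bundle
              - name: operator-data
                mountPath: /data
            volumes:                            # needed by the operator, absent from the bundle
            - name: operator-data
              persistentVolumeClaim:
                claimName: example-operator-data
      ```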
         

      Version-Release number of selected component (if applicable):

      4.21 nightly, techpreview

       

      How reproducible:

      always

       

      Steps to Reproduce:

      Deploy costmanagement-metrics-operator using OLMv1:

      1. Create the cluster catalog
      
      ---
      apiVersion: olm.operatorframework.io/v1
      kind: ClusterCatalog
      metadata:  
        name: costmanagement-metrics-operator-catalog
      spec:  
        source:
          type: Image
          image:
            ref: quay.io/dnakabaa/test-catalog:cmmo-olmv1
      
      
      2. Apply manifest
      
      ---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: costmanagement-metrics-operator
      ---
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: costmanagement-metrics-operator-installer
        namespace: costmanagement-metrics-operator
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: costmanagement-metrics-operator-installer-binding
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: cluster-admin
      subjects:
      - kind: ServiceAccount
        name: costmanagement-metrics-operator-installer
        namespace: costmanagement-metrics-operator
      ---
      apiVersion: olm.operatorframework.io/v1
      kind: ClusterExtension
      metadata:
        name: costmanagement-metrics-operator
      spec:
        namespace: costmanagement-metrics-operator
        serviceAccount:
          name: costmanagement-metrics-operator-installer
        config:
          configType: Inline
          inline:
            watchNamespace: "costmanagement-metrics-operator"
        source:
          sourceType: Catalog
          catalog:
            packageName: costmanagement-metrics-operator
            selector:
              matchLabels:
                olm.operatorframework.io/metadata.name: costmanagement-metrics-operator-catalog
      
      
      3. Create a CR
      
      ---
      apiVersion: costmanagement-metrics-cfg.openshift.io/v1beta1
      kind: CostManagementMetricsConfig
      metadata:
        name: costmanagementmetricscfg-sample-v1beta1
        namespace: costmanagement-metrics-operator
      spec:
        upload:
          ingress_path: /api/ingress/v1/upload
          upload_cycle: 60
          upload_toggle: true
          validate_cert: true
        packaging:
          max_reports_to_store: 30
          max_size_MB: 100
        api_url: 'https://console.redhat.com'
        prometheus_config:
          collect_previous_data: true
          context_timeout: 120
          disable_metrics_collection_cost_management: false
        authentication:
          type: token
        source:
          check_cycle: 1440
          create_source: false
          sources_path: /api/sources/v1.0/
          name: ''
      

       

      Actual results:

      The operator pod restarts constantly.

       

      Expected results:

      The operator should be able to dynamically configure its deployment based on CR requirements. For example, when the operator creates a PVC and needs to mount it, the change should persist.

       

      Additional info:

              rh-ee-cchantse Catherine Chan-Tse
              rh-ee-dnakabaa David Nakabaale
              Kui Wang Kui Wang