Network Observability / NETOBSERV-1747

Operator should trigger SVM when old versions are found

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • netobserv-1.8
    • netobserv-1.6.1
    • Operator
    • Previously, when a managed API version was removed by an operator upgrade, the upgrade would fail if that API version had ever been used in the past, and manual recovery steps were required. With this change, the operator proactively avoids this scenario by migrating old or deprecated versions to the latest stored version.
    • NetObserv - Sprint 256, NetObserv - Sprint 257, NetObserv - Sprint 258, NetObserv - Sprint 259, NetObserv - Sprint 260, NetObserv - Sprint 261

      Since v1alpha1 was removed in 1.6, some customers have had issues upgrading. The upgrade process gets stuck in a pending state, with this error:

      > risk of data loss updating "flowcollectors.flows.netobserv.io": new CRD removes version v1alpha1 that is listed as a stored version on the existing CRD

      In our case, this seems to be more of a safety precaution from OLM than an actual risk, given that the version lifecycle was correctly followed (i.e., v1alpha1 was not abruptly removed; it was first replaced by v1beta1 as the storage version, then deprecated, and so on).

      Despite this error message, v1alpha1 is most probably no longer used in etcd. Stored objects would have been rewritten as v1beta1, if only during previous operator upgrades: the operator updates the FlowCollector status while deployments are being reinstalled, even if the user makes no manual change.

      Nevertheless, the FlowCollector CRD still lists both versions in its status:

       

        storedVersions:
        - v1alpha1
        - v1beta1
      

      Workarounds have been documented.

       

      To keep this issue from recurring when future versions are removed, we need to implement a fix in the operator that sends a Storage Version Migration (SVM) request and amends the CRD status, as described here: https://dev.to/jotak/kubernetes-crd-the-versioning-joy-6g0

      At startup (or during reconciles), the operator should check whether the CRD lists any deprecated storage version in its status and, if so, trigger the SVM and then update the CRD status.
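
      The check described above could look roughly like the following Go sketch. It is only an illustration of the decision logic, not the operator's actual code: the function names are hypothetical, the `migrate` callback stands in for "send the SVM request and wait for it to complete", and no Kubernetes client is involved. It assumes the current storage version is already known (in a real CRD it is the entry of `spec.versions` marked `storage: true`).

```go
package main

import "fmt"

// staleStoredVersions returns the entries of storedVersions that differ from
// the current storage version, i.e. the ones that still need to be migrated.
func staleStoredVersions(storedVersions []string, storageVersion string) []string {
	var stale []string
	for _, v := range storedVersions {
		if v != storageVersion {
			stale = append(stale, v)
		}
	}
	return stale
}

// reconcileStoredVersions is a hypothetical reconcile step. If the CRD status
// lists any version other than the storage version, it runs migrate (a
// placeholder for the SVM request) and, only on success, returns the amended
// storedVersions list to write back to the CRD status.
func reconcileStoredVersions(storedVersions []string, storageVersion string, migrate func() error) ([]string, error) {
	if len(staleStoredVersions(storedVersions, storageVersion)) == 0 {
		// Nothing to do: the status is already clean.
		return storedVersions, nil
	}
	if err := migrate(); err != nil {
		// Leave the status untouched so the next reconcile retries.
		return storedVersions, fmt.Errorf("storage version migration failed: %w", err)
	}
	return []string{storageVersion}, nil
}

func main() {
	stored := []string{"v1alpha1", "v1beta1"}
	updated, err := reconcileStoredVersions(stored, "v1beta1", func() error { return nil })
	fmt.Println(updated, err) // prints: [v1beta1] <nil>
}
```

      The status is amended only after the migration succeeds, so a failed or interrupted migration leaves `storedVersions` as it was and the check runs again on the next reconcile.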

      The implementation could also take inspiration from TektonCD, which has a similar migration process: https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go (they don't use SVM; they run a blank patch themselves, which is equivalent)
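
      To illustrate why a blank patch is equivalent to an SVM request, here is a toy Go model. Everything in it is an assumption made for illustration: `fakeAPIServer` is an in-memory stand-in for the API server and etcd, `emptyPatch` and `migrateAll` are hypothetical names, and the object name `cluster` is just sample data. It is not taken from the Tekton code linked above.

```go
package main

import "fmt"

// fakeAPIServer is a toy stand-in for the API server plus etcd. It records,
// for each object, the API version it was last persisted at.
type fakeAPIServer struct {
	storageVersion string
	persistedAt    map[string]string // object name -> persisted version
}

// emptyPatch models a patch that changes nothing in the object. The content
// stays the same, but the write re-encodes the object at the current storage
// version, which is the whole point of the migration.
func (s *fakeAPIServer) emptyPatch(name string) error {
	if _, ok := s.persistedAt[name]; !ok {
		return fmt.Errorf("object %q not found", name)
	}
	s.persistedAt[name] = s.storageVersion
	return nil
}

// migrateAll patches every object. A real client cannot see which version an
// object is persisted at, so it has to touch all of them.
func migrateAll(s *fakeAPIServer) (int, error) {
	patched := 0
	for name := range s.persistedAt {
		if err := s.emptyPatch(name); err != nil {
			return patched, err
		}
		patched++
	}
	return patched, nil
}

func main() {
	s := &fakeAPIServer{
		storageVersion: "v1beta1",
		persistedAt:    map[string]string{"cluster": "v1alpha1"},
	}
	n, err := migrateAll(s)
	fmt.Println(n, err, s.persistedAt["cluster"]) // prints: 1 <nil> v1beta1
}
```

      Once every object has been rewritten this way, no data remains at the old version, and the old entry can be dropped from the CRD's `storedVersions`.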

            jtakvori Joel Takvorian
            jtakvori Joel Takvorian
            Amogh Rameshappa Devapura Amogh Rameshappa Devapura