OpenShift API for Data Protection / OADP-1054

CloudStorage: openshift-adp-controller-manager crashloop seg fault with Restic enabled


    • Type: Bug
    • Resolution: Can't Do
    • Priority: Normal
    • OADP 1.1.1, OADP 1.0.6
    • Documentation

      Description of problem: The openshift-adp-controller-manager pod goes into CrashLoopBackOff when a DPA is set up with a CloudStorage reference and Restic enabled.

       

      Version-Release number of selected component (if applicable): Tested so far on the OADP 1.1.1 stage build.
      Also checked on oadp-operator-bundle-container-1.0.6-3.

      How reproducible: 100%

       

      Steps to Reproduce:

      1. Create AWS role-based credentials.
      2. Create a CloudStorage resource that points to these AWS credentials:
      [mperetz@fedora oadp-e2e-qe]$ oc get cloudstorages.oadp.openshift.io -A -o yaml
      apiVersion: v1
      items:
      - apiVersion: oadp.openshift.io/v1alpha1
        kind: CloudStorage
        metadata:
          creationTimestamp: "2022-11-22T15:57:01Z"
          finalizers:
          - oadp.openshift.io/bucket-protection
          generation: 1
          name: temp-oadpbucket157861
          namespace: openshift-adp
          resourceVersion: "268931"
          uid: 5c0bd7e3-de98-4743-af5f-1a6beb4afb1f
        spec:
          creationSecret:
            key: cloud
            name: cloud-credentials
          enableSharedConfig: true
          name: temp-oadpbucket157861
          provider: aws
          region: us-east-2
        status:
          lastSyncTimestamp: "2022-11-22T15:57:14Z"
          name: temp-oadpbucket157861
      kind: List
      metadata:
        resourceVersion: ""
      

      3. Create a DPA with a cloudStorageRef backup location and Restic enabled:

      [mperetz@fedora oadp-e2e-qe]$ oc get dpa -n openshift-adp -o yaml
      apiVersion: v1
      items:
      - apiVersion: oadp.openshift.io/v1alpha1
        kind: DataProtectionApplication
        metadata:
          creationTimestamp: "2022-11-22T16:05:24Z"
          generation: 1
          name: ts-dpa
          namespace: openshift-adp
          resourceVersion: "272699"
          uid: 9bc6df1a-5ccc-43f0-b808-c6ed55910717
        spec:
          backupLocations:
          - bucket:
              cloudStorageRef:
                name: temp-oadpbucket157861
              config:
                enableSharedConfig: "true"
                region: us-east-2
              credential:
                key: cloud
                name: cloud-credentials
              default: true
          configuration:
            restic:
              enable: true
              podConfig:
                resourceAllocations: {}
            velero:
              defaultPlugins:
              - openshift
              - aws
              - kubevirt
          podDnsConfig: {}
          snapshotLocations: []
      kind: List
      metadata:
        resourceVersion: ""
       

       

      Actual results:

      The controller-manager crashes with a segmentation fault (nil pointer dereference) while reconciling the Restic DaemonSet:

      [mperetz@fedora oadp-e2e-qe]$ oc get pods -n openshift-adp
      NAME                                                READY   STATUS             RESTARTS         AGE
      openshift-adp-controller-manager-59495db548-cdppj   0/1     CrashLoopBackOff   10 (2m25s ago)   40m
      velero-5794bc5bbb-djclj                             1/1     Running            0                32m
       
      [mperetz@fedora oadp-e2e-qe]$ oc logs openshift-adp-controller-manager-59495db548-cdppj -n openshift-adp
      1.6691332079769745e+09    INFO    setup    patching operator namespace with PSA labels
      I1122 16:06:49.045551       1 request.go:665] Waited for 1.043087477s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/machineconfiguration.openshift.io/v1?timeout=32s
      1.669133210750874e+09    INFO    controller-runtime.metrics    Metrics server is starting to listen    {"addr": ":8080"}
      1.6691332107522833e+09    INFO    setup    starting manager
      1.6691332107526248e+09    INFO    Starting server    {"kind": "health probe", "addr": "[::]:8081"}
      I1122 16:06:50.752652       1 leaderelection.go:248] attempting to acquire leader lease openshift-adp/8b4defce.openshift.io...
      1.6691332107526777e+09    INFO    Starting server    {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
      [mperetz@fedora oadp-e2e-qe]$ oc logs openshift-adp-controller-manager-59495db548-cdppj -n openshift-adp --previous 
      1.6691331660032907e+09    INFO    setup    patching operator namespace with PSA labels
      I1122 16:06:07.071723       1 request.go:665] Waited for 1.044397738s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/autoscaling.openshift.io/v1beta1?timeout=32s
      1.6691331687751675e+09    INFO    controller-runtime.metrics    Metrics server is starting to listen    {"addr": ":8080"}
      1.6691331687758849e+09    INFO    setup    starting manager
      1.6691331687761607e+09    INFO    Starting server    {"kind": "health probe", "addr": "[::]:8081"}
      1.669133168776194e+09    INFO    Starting server    {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
      I1122 16:06:08.776174       1 leaderelection.go:248] attempting to acquire leader lease openshift-adp/8b4defce.openshift.io...
      I1122 16:06:24.999644       1 leaderelection.go:258] successfully acquired lease openshift-adp/8b4defce.openshift.io
      1.669133184999681e+09    DEBUG    events    Normal    {"object": {"kind":"ConfigMap","namespace":"openshift-adp","name":"8b4defce.openshift.io","uid":"673fedec-52db-4b54-8101-25c1a63da547","apiVersion":"v1","resourceVersion":"273578"}, "reason": "LeaderElection", "message": "openshift-adp-controller-manager-59495db548-cdppj_7afbb7c7-5a00-40d9-9a04-52189ab9f9a0 became leader"}
      1.669133184999797e+09    DEBUG    events    Normal    {"object": {"kind":"Lease","namespace":"openshift-adp","name":"8b4defce.openshift.io","uid":"9261f8d2-96c7-48ba-bc57-8b0051603901","apiVersion":"coordination.k8s.io/v1","resourceVersion":"273579"}, "reason": "LeaderElection", "message": "openshift-adp-controller-manager-59495db548-cdppj_7afbb7c7-5a00-40d9-9a04-52189ab9f9a0 became leader"}
      1.6691331849999173e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1alpha1.DataProtectionApplication"}
      1.6691331849999518e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.Deployment"}
      1.669133184999959e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.BackupStorageLocation"}
      1.6691331849999661e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.VolumeSnapshotLocation"}
      1.669133184999973e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.DaemonSet"}
      1.6691331849999814e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.SecurityContextConstraints"}
      1.6691331849999893e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.Service"}
      1.6691331849999804e+09    INFO    controller.cloudstorage    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "CloudStorage", "source": "kind source: *v1alpha1.CloudStorage"}
      1.6691331850000026e+09    INFO    controller.cloudstorage    Starting Controller    {"reconciler group": "oadp.openshift.io", "reconciler kind": "CloudStorage"}
      1.6691331849999957e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.Route"}
      1.669133185000023e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.ConfigMap"}
      1.6691331850000286e+09    INFO    controller.dataprotectionapplication    Starting EventSource    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "source": "kind source: *v1.Secret"}
      1.6691331850000334e+09    INFO    controller.dataprotectionapplication    Starting Controller    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication"}
      1.6691331851037712e+09    INFO    controller.cloudstorage    Starting workers    {"reconciler group": "oadp.openshift.io", "reconciler kind": "CloudStorage", "worker count": 1}
      1.6691331851037946e+09    INFO    controller.dataprotectionapplication    Starting workers    {"reconciler group": "oadp.openshift.io", "reconciler kind": "DataProtectionApplication", "worker count": 1}
      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x1572894]

      goroutine 528 [running]:
      github.com/openshift/oadp-operator/pkg/credentials.AppendCloudProviderVolumes(0xc0009d8000?, 0xc00099bb00, 0xc0007000e0?, 0x1?)
          /remote-source/pkg/credentials/credentials.go:240 +0x134
      github.com/openshift/oadp-operator/controllers.(*DPAReconciler).customizeResticDaemonset(0xc000823110, 0xc0009d8000, 0xc00099bb00)
          /remote-source/controllers/restic.go:280 +0x985
      github.com/openshift/oadp-operator/controllers.(*DPAReconciler).buildResticDaemonset(0xc0009154f8?, 0xc0009d8000, 0xc00099bb00)
          /remote-source/controllers/restic.go:192 +0x2bc
      github.com/openshift/oadp-operator/controllers.(*DPAReconciler).ReconcileResticDaemonset.func1()
          /remote-source/controllers/restic.go:123 +0x245
      sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.mutate(0x1dcb860?, {{0xc00083f440?, 0xc00067a3f0?}, {0x1abfc66?, 0xc000595650?}}, {0x1df92a8, 0xc00099bb00})
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/controller/controllerutil/controllerutil.go:339 +0x4f
      sigs.k8s.io/controller-runtime/pkg/controller/controllerutil.CreateOrPatch({0x1de4138, 0xc00067a3f0}, {0x1debb60, 0xc0003d8eb0}, {0x1df92a8?, 0xc00099bb00}, 0xc000915770)
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/controller/controllerutil/controllerutil.go:239 +0x145
      github.com/openshift/oadp-operator/controllers.(*DPAReconciler).ReconcileResticDaemonset(0xc000823110, {{0x1de6d08?, 0xc00067a420?}, 0x127399d?})
          /remote-source/controllers/restic.go:101 +0x2b4
      github.com/openshift/oadp-operator/controllers.ReconcileBatch({{0x1de6d08?, 0xc00067a420?}, 0xc00067a3f0?}, {0xc000915bf0, 0x13, 0xc00083f450?})
          /remote-source/controllers/dpa_controller.go:216 +0x77
      github.com/openshift/oadp-operator/controllers.(*DPAReconciler).Reconcile(0xc000823110, {0x1de4138?, 0xc00067a3f0}, {{{0xc00083f440, 0xd}, {0xc00083f450, 0x6}}})
          /remote-source/controllers/dpa_controller.go:87 +0x66c
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc000866420, {0x1de4138, 0xc00067a330}, {{{0xc00083f440?, 0x19e56c0?}, {0xc00083f450?, 0x4045d4?}}})
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:114 +0x28b
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000866420, {0x1de4090, 0xc000821c40}, {0x18f25c0?, 0xc00021e060?})
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:311 +0x352
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000866420, {0x1de4090, 0xc000821c40})
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266 +0x1d9
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227 +0x85
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
          /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:223 +0x31c
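
The trace ends in AppendCloudProviderVolumes (credentials.go:240) with addr=0x10, which is consistent with reading a struct field through a nil pointer. The source at that line is not included in this report, so the following Go sketch only illustrates one way such a crash could arise and be guarded against. It assumes, hypothetically, that the code reads a Velero-style field on a backup location that was defined with bucket/cloudStorageRef instead. All type and function names below are simplified stand-ins, not the operator's actual API.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the DPA backup-location types.
type VeleroLocation struct {
	Provider string
	Config   map[string]string
}

type CloudStorageLocation struct {
	CloudStorageRef string
	Config          map[string]string
}

// BackupLocation holds either a Velero location or a CloudStorage
// (bucket) location; the unused pointer is nil.
type BackupLocation struct {
	Velero       *VeleroLocation
	CloudStorage *CloudStorageLocation
}

// sharedConfigEnabled reports whether any location sets
// enableSharedConfig to "true". Each pointer is checked before it is
// dereferenced, so a bucket-only location cannot cause a nil pointer panic.
func sharedConfigEnabled(locations []BackupLocation) bool {
	for _, loc := range locations {
		var cfg map[string]string
		switch {
		case loc.Velero != nil:
			cfg = loc.Velero.Config
		case loc.CloudStorage != nil:
			cfg = loc.CloudStorage.Config
		default:
			continue // neither form is set; nothing to inspect
		}
		// Reading from a nil map is safe in Go and yields the zero value.
		if cfg["enableSharedConfig"] == "true" {
			return true
		}
	}
	return false
}

func main() {
	// Mirrors the shape of the DPA in this report: a single bucket location.
	locations := []BackupLocation{{
		CloudStorage: &CloudStorageLocation{
			CloudStorageRef: "temp-oadpbucket157861",
			Config:          map[string]string{"enableSharedConfig": "true", "region": "us-east-2"},
		},
	}}
	fmt.Println(sharedConfigEnabled(locations))
}
```

An unguarded version that read loc.Velero.Config directly would panic on the location above, because loc.Velero is nil there.
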
      

      Expected results:

       

      Additional info:

            Assignee: Richard Hoch
            Reporter: Maya Peretz (mperetz@redhat.com)