  1. OpenShift Hive
  2. HIVE-1587

Hive stops fulfilling ALL clusterclaims if given a very long lifetime for one claim


    • Type: Bug
    • Resolution: Done
    • Priority: Minor
    • Quality / Stability / Reliability

      Description of the problem

      If a user creates a clusterclaim against a clusterpool and gives the claim an especially long lifetime, such as:

      spec:
        clusterPoolName: eternal-claim
        lifetime: 177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s
      

      Hive fails to reconcile any clusterclaim and stops fulfilling both new and existing claims. The hive-controller logs the following error:

      E0713 02:17:19.678591       1 reflector.go:127] k8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:156: Failed to watch *v1.ClusterClaim: failed to list *v1.ClusterClaim: v1.ClusterClaimList.Items: []v1.ClusterClaim: v1.ClusterClaim.Spec: v1.ClusterClaimSpec.Subjects: []v1.Subject: Namespace: Lifetime: unmarshalerDecoder: time: invalid duration "177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s", error found in #10 byte of ...|7777h0m0s","namespac|..., bigger context ...|77777777777777777777777777777777777777777777h0m0s","namespace":"policy-grc-cp-autoclaims-l629l","sub|...
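
The "time: invalid duration" text in this log matches the error that Go's time.ParseDuration returns when a value overflows. The following is a minimal sketch of why one such value can fail an entire list decode. It assumes the lifetime is stored as a string and parsed with time.ParseDuration during JSON decoding. The Duration, claim, and claimList types and the decodeClaims helper are hypothetical stand-ins, not the Hive API types, and the sketch uses encoding/json rather than the decoder Hive's client actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Duration is a hypothetical stand-in for a duration field that is
// serialized as a string and parsed with time.ParseDuration.
type Duration struct {
	time.Duration
}

// UnmarshalJSON decodes a JSON string such as "11h" into a Duration.
func (d *Duration) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	parsed, err := time.ParseDuration(s)
	if err != nil {
		return err // an overflowing value ends up here
	}
	d.Duration = parsed
	return nil
}

// claim and claimList are simplified, hypothetical shapes.
type claim struct {
	Name     string    `json:"name"`
	Lifetime *Duration `json:"lifetime,omitempty"`
}

type claimList struct {
	Items []claim `json:"items"`
}

// decodeClaims decodes the whole list in one call, so a single item
// that cannot be decoded makes the call return an error.
func decodeClaims(data []byte) (claimList, error) {
	var list claimList
	err := json.Unmarshal(data, &list)
	return list, err
}

// longLifetime is the lifetime value from this report.
const longLifetime = "177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s"

func main() {
	data := []byte(`{"items":[{"name":"good","lifetime":"11h"},{"name":"bad","lifetime":"` + longLifetime + `"}]}`)
	if _, err := decodeClaims(data); err != nil {
		fmt.Println("list failed:", err)
	}
}
```

In this sketch the valid "good" item is decoded first, but the caller still receives an error for the list as a whole, which mirrors the "failed to list" behavior in the log.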
      

      At the very least, a duration this long should only produce an error in the hive controller, which should then continue to reconcile and fulfill the remaining clusterclaims. Ideally, Hive should reject the claim as malformed.
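
The "log an error and keep going" behavior can be sketched as decoding each item on its own and collecting per-item errors. This is only an illustration of the idea under stated assumptions: the claim type and the decodeTolerant helper are hypothetical, the lifetime is modeled as a plain string, and nothing here reflects how the client-go reflector or Hive actually decode lists.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// claim is a simplified, hypothetical shape; lifetime stays a raw string
// so that a bad value cannot abort decoding of the item itself.
type claim struct {
	Name     string `json:"name"`
	Lifetime string `json:"lifetime,omitempty"`
}

// decodeTolerant decodes every item separately. Items with an unusable
// lifetime are reported in the error slice and skipped; the rest are returned.
func decodeTolerant(data []byte) ([]claim, []error) {
	var raw struct {
		Items []json.RawMessage `json:"items"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, []error{err}
	}
	var valid []claim
	var errs []error
	for i, item := range raw.Items {
		var c claim
		if err := json.Unmarshal(item, &c); err != nil {
			errs = append(errs, fmt.Errorf("item %d: %w", i, err))
			continue
		}
		if c.Lifetime != "" {
			if _, err := time.ParseDuration(c.Lifetime); err != nil {
				errs = append(errs, fmt.Errorf("claim %q: %w", c.Name, err))
				continue
			}
		}
		valid = append(valid, c)
	}
	return valid, errs
}

func main() {
	long := "177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s"
	data := []byte(`{"items":[{"name":"first","lifetime":"11h"},{"name":"eternal","lifetime":"` + long + `"},{"name":"second"}]}`)
	valid, errs := decodeTolerant(data)
	fmt.Println("usable claims:", len(valid))
	for _, err := range errs {
		fmt.Println("skipped:", err)
	}
}
```

The design choice in this sketch is to defer duration parsing until after each item has been decoded, so one malformed claim costs only that claim.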

      Cluster Details

      • OpenShift Container Platform 4.7.18 (all versions suspected)

      Steps

      1. Create a clusterclaim with a very long lifetime, as shown above. Note: it has no status items and is neither picked up nor fulfilled by hive.
      2. Create another clusterclaim on the same cluster. Notice that the new claim and all subsequent claims have no status items and are not reconciled by hive.

      Expected behavior

      Ideally, the invalid clusterclaim is rejected at creation. At a minimum, it must not prevent other claims from being reconciled and fulfilled.
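
Rejecting the claim at creation could look roughly like the check below. It is a minimal sketch, not Hive's actual validation: validateLifetime is a hypothetical helper, and treating an empty lifetime as allowed and a non-positive lifetime as invalid are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// validateLifetime is a hypothetical create-time check for spec.lifetime.
// It returns an error for values that time.ParseDuration cannot represent.
func validateLifetime(lifetime string) error {
	if lifetime == "" {
		return nil // assumption: the lifetime is optional
	}
	d, err := time.ParseDuration(lifetime)
	if err != nil {
		return fmt.Errorf("spec.lifetime: %w", err)
	}
	if d <= 0 {
		// assumption: a zero or negative lifetime is not meaningful
		return fmt.Errorf("spec.lifetime: must be positive, got %q", lifetime)
	}
	return nil
}

func main() {
	candidates := []string{
		"11h",
		"177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s",
	}
	for _, lifetime := range candidates {
		if err := validateLifetime(lifetime); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", lifetime)
	}
}
```

With a check like this in front of the API, the malformed value would never be stored, so later list and watch calls would not encounter it.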

      Screenshots & Logs

      Breaking Clusterclaim:

      apiVersion: hive.openshift.io/v1
      kind: ClusterClaim
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"hive.openshift.io/v1","kind":"ClusterClaim","metadata":{"annotations":{},"name":"rhacmstackem-policy-grc-cp-autoclaims","namespace":"acm-grc-security"},"spec":{"clusterPoolName":"policy-grc-cp-autoclaims","lifetime":"11h","subjects":[{"kind":"ServiceAccount","name":"policy-grc-sa","namespace":"acm-grc-security"},{"apiGroup":"rbac.authorization.k8s.io","kind":"Group","name":"policy-grc"}]}}
        creationTimestamp: "2021-07-12T11:00:19Z"
        deletionGracePeriodSeconds: 0
        deletionTimestamp: "2021-07-12T22:31:43Z"
        finalizers:
        - hive.openshift.io/claim
        generation: 5
        managedFields:
        - apiVersion: hive.openshift.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:annotations:
                .: {}
                f:kubectl.kubernetes.io/last-applied-configuration: {}
            f:spec:
              .: {}
              f:clusterPoolName: {}
              f:subjects: {}
          manager: kubectl-client-side-apply
          operation: Update
          time: "2021-07-12T11:00:19Z"
        - apiVersion: hive.openshift.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            f:metadata:
              f:finalizers:
                .: {}
                v:"hive.openshift.io/claim": {}
            f:spec:
              f:namespace: {}
            f:status:
              .: {}
              f:conditions: {}
          manager: manager
          operation: Update
          time: "2021-07-12T11:31:43Z"
        - apiVersion: hive.openshift.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            f:spec:
              f:lifetime: {}
          manager: oc
          operation: Update
          time: "2021-07-12T21:39:39Z"
        name: rhacmstackem-policy-grc-cp-autoclaims
        namespace: acm-grc-security
        resourceVersion: "278422289"
        selfLink: /apis/hive.openshift.io/v1/namespaces/acm-grc-security/clusterclaims/rhacmstackem-policy-grc-cp-autoclaims
        uid: 34eb8a1a-f5ce-4d20-b883-5a6f2a7178cf
      spec:
        clusterPoolName: policy-grc-cp-autoclaims
        lifetime: 177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s
        namespace: policy-grc-cp-autoclaims-l629l
        subjects:
        - kind: ServiceAccount
          name: policy-grc-sa
          namespace: acm-grc-security
        - apiGroup: rbac.authorization.k8s.io
          kind: Group
          name: policy-grc
      status:
        conditions:
        - lastProbeTime: "2021-07-12T11:31:43Z"
          lastTransitionTime: "2021-07-12T11:31:43Z"
          message: Cluster claimed
          reason: ClusterClaimed
          status: "False"
          type: Pending
        - lastProbeTime: "2021-07-12T11:31:43Z"
          lastTransitionTime: "2021-07-12T11:31:43Z"
          message: Cluster is running
          reason: Running
          status: "True"
          type: ClusterRunning
      

      Relevant hive logs:

      E0713 02:17:19.678591       1 reflector.go:127] k8s.io/client-go@v12.0.0+incompatible/tools/cache/reflector.go:156: Failed to watch *v1.ClusterClaim: failed to list *v1.ClusterClaim: v1.ClusterClaimList.Items: []v1.ClusterClaim: v1.ClusterClaim.Spec: v1.ClusterClaimSpec.Subjects: []v1.Subject: Namespace: Lifetime: unmarshalerDecoder: time: invalid duration "177777777777777777777777777777777777777777777777777777777777777777777777777777h0m0s", error found in #10 byte of ...|7777h0m0s","namespac|..., bigger context ...|77777777777777777777777777777777777777777777h0m0s","namespace":"policy-grc-cp-autoclaims-l629l","sub|...
      
      

              efried.openshift Eric Fried
              Gurney.Buchanan@ibm.com Gurney Buchanan
              Lin Wang Lin Wang