OpenShift Bugs / OCPBUGS-69882

[hcp][azurefile-nfs] PVC Restore from VolumeSnapshot Fails with 403 AuthorizationFailure


    • Bug
    • Resolution: Unresolved
    • 4.22.0
    • Storage / Operators
    • Important
    • Proposed

      Description of problem:

      When attempting to restore a PVC (mypvc-res) from a VolumeSnapshot (mysnapshot) in a HyperShift (Hosted Control Plane) environment, the provisioning fails with the following error:
      
        failed to perform copy command due to error: cannot start job due to error
        GET https://fc9160ad19b084eb8b5ce84.file.core.windows.net/pvcn-xxx/
        RESPONSE 403: 403 This request is not authorized to perform this operation.
        ERROR CODE: AuthorizationFailure

      Version-Release number of selected component (if applicable):

      4.22 pre-merge testing with openshift/cluster-storage-operator#643, openshift/csi-operator#461, and openshift/hypershift#7157

      How reproducible:

      Always    

      Steps to Reproduce:

          1. Create an Azure self-managed HCP cluster (I used the Prow CI job periodic-ci-openshift-openshift-tests-private-release-4.21-amd64-nightly-azure-ipi-ovn-hypershift-guest-f7).
          2. Create a StorageClass for azurefile-nfs (the issue occurs even when matchTags is set so that a new storage account created by the CSI driver is used):
               parameters:
                 protocol: nfs
                 skuName: Premium_LRS
                 matchTags: "true"
                 tags: storageClassName=azurefile-csi-nfs
          3. Create an original PVC and a pod.
          4. Create a VolumeSnapshot from the original PVC.
          5. Restore a PVC from the VolumeSnapshot.
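
      For reference, the objects in steps 2 to 5 can be sketched as follows. This is a minimal sketch, not the exact manifests used in the test: the names mypvc-res and mysnapshot and the StorageClass parameters come from this report, while the StorageClass name, the VolumeSnapshotClass name (csi-azurefile-vsc), the original PVC name (mypvc-ori), the access mode, and the size are assumptions. The pod from step 3 is omitted.

```python
import json

STORAGE_CLASS = "azurefile-csi-nfs"   # assumed; inferred from the tag value in step 2
SNAPSHOT_CLASS = "csi-azurefile-vsc"  # assumed VolumeSnapshotClass name
SOURCE_PVC = "mypvc-ori"              # hypothetical name for the original PVC
SIZE = "100Gi"                        # assumed request size


def storage_class():
    """StorageClass carrying the parameters quoted in step 2."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": STORAGE_CLASS},
        "provisioner": "file.csi.azure.com",
        "parameters": {
            "protocol": "nfs",
            "skuName": "Premium_LRS",
            "matchTags": "true",
            "tags": "storageClassName=azurefile-csi-nfs",
        },
    }


def pvc(name, data_source=None):
    """PVC on the NFS StorageClass; passing data_source makes it a restore."""
    spec = {
        "storageClassName": STORAGE_CLASS,
        "accessModes": ["ReadWriteMany"],  # assumed access mode
        "resources": {"requests": {"storage": SIZE}},
    }
    if data_source is not None:
        spec["dataSource"] = data_source
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


def volume_snapshot(name, source_pvc):
    """VolumeSnapshot taken from an existing PVC (step 4)."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": SNAPSHOT_CLASS,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


# Step 5: the restore PVC points at the snapshot through spec.dataSource.
snapshot_ref = {
    "name": "mysnapshot",
    "kind": "VolumeSnapshot",
    "apiGroup": "snapshot.storage.k8s.io",
}
restore = pvc("mypvc-res", data_source=snapshot_ref)

bundle = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        storage_class(),
        pvc(SOURCE_PVC),
        volume_snapshot("mysnapshot", SOURCE_PVC),
        restore,
    ],
}
print(json.dumps(bundle, indent=2))
```

      The printed JSON should be accepted by `oc apply -f -`. In practice the objects would be applied one step at a time so that the snapshot is ready before the restore PVC is created.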

      Actual results:

      PVC provisioning fails with AuthorizationFailure.

      Expected results:

      PVC provisioning should succeed.

      Root Cause Analysis:

      1. The storage account's network configuration has "defaultAction": "Deny" and only allows the subnet of the hosted cluster's virtual network.
      I examined the storage account's network firewall rules:
        az storage account show \
          --name fc9160ad19b084eb8b5ce84 \
          --resource-group ci-op-t1k2xqn3-0cde2-rg \
          --query "networkRuleSet"  
      Initial Configuration:
        {
          "bypass": "AzureServices",
          "defaultAction": "Deny",
          "virtualNetworkRules": [
            {
              "subnet": "/subscriptions/.../virtualNetworks/ci-op-t1k2xqn3-0cde2-vnet/subnets/ci-op-t1k2xqn3-0cde2-subnet"
            }
          ]
        }
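      2. The AuthorizationFailure occurs when azcopy tries to restore the PVC. The workaround verified below suggests that the copy request originates from the management cluster subnet, which is not in the allow list.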
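
      The effect of the initial rule set can be illustrated with a small model. This is a simplified sketch, not how Azure evaluates requests internally: it only looks at defaultAction and virtualNetworkRules, ignores IP rules, the AzureServices bypass, and private endpoints, and uses the "subnet" key and the elided subnet IDs exactly as they are quoted in this report. The function name is_subnet_allowed is hypothetical.

```python
# Rule set as quoted in the report (the subscription part of the ID is elided there).
WORKER_SUBNET = (
    "/subscriptions/.../virtualNetworks/ci-op-t1k2xqn3-0cde2-vnet"
    "/subnets/ci-op-t1k2xqn3-0cde2-subnet"
)
# Placeholder copied from the workaround section; the real ID is not given.
MGMT_SUBNET = ".../ci-op-t1k2xqn3-0cde2-xk4mr-vnet/subnets/<management-subnet>"

initial_rule_set = {
    "bypass": "AzureServices",
    "defaultAction": "Deny",
    "virtualNetworkRules": [{"subnet": WORKER_SUBNET}],
}


def is_subnet_allowed(rule_set, subnet_id):
    """Simplified check: is traffic from subnet_id admitted by the rule set?"""
    if rule_set.get("defaultAction", "Allow") == "Allow":
        return True
    # Azure resource IDs are case-insensitive, so compare in lower case.
    allowed = {r["subnet"].lower() for r in rule_set.get("virtualNetworkRules", [])}
    return subnet_id.lower() in allowed


print("worker subnet allowed:    ", is_subnet_allowed(initial_rule_set, WORKER_SUBNET))
print("management subnet allowed:", is_subnet_allowed(initial_rule_set, MGMT_SUBNET))
```

      Under this model the worker subnet is admitted and the management subnet is denied, which matches the 403 seen when the copy job starts, assuming the job really does run from the management cluster side.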
      
           

      Verified with workaround:

      1. After adding the management cluster subnet to the storage account's allow list, the issue no longer reproduces.
        {
          "bypass": "AzureServices",
          "defaultAction": "Deny",
          "virtualNetworkRules": [
            {
              "subnet": ".../ci-op-t1k2xqn3-0cde2-vnet/subnets/ci-op-t1k2xqn3-0cde2-subnet"
              // Worker nodes subnet (original)
            },
            {
              "subnet": ".../ci-op-t1k2xqn3-0cde2-xk4mr-vnet/subnets/<management-subnet>"
              // Management cluster subnet (newly added)
            }
          ]
        }
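
      One way to apply this workaround is the `az storage account network-rule add` command. The report does not say how the rule was actually added, so the sketch below is an assumption: it only builds and prints the command line, using the storage account and resource group names quoted above and a placeholder for the management subnet resource ID, which is not given in full here. The helper name network_rule_add_cmd is hypothetical.

```python
import shlex


def network_rule_add_cmd(account, resource_group, subnet_id):
    """Build the az CLI argv that adds a subnet to a storage account's allow list."""
    return [
        "az", "storage", "account", "network-rule", "add",
        "--account-name", account,
        "--resource-group", resource_group,
        "--subnet", subnet_id,  # full subnet resource ID
    ]


cmd = network_rule_add_cmd(
    account="fc9160ad19b084eb8b5ce84",         # storage account from the report
    resource_group="ci-op-t1k2xqn3-0cde2-rg",  # resource group from the report
    subnet_id="<management-subnet-resource-id>",  # placeholder, fill in the real ID
)

# Print a shell-quoted command for review instead of executing it.
print(shlex.join(cmd))
```

      Printing rather than running the command keeps the sketch side-effect free; the output can be reviewed and then pasted into a shell that is logged in to the right subscription.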
      

              Unassigned
              Wei Duan (wduan@redhat.com)