OpenShift Bugs / OCPBUGS-927

Azure install fails in CI: Error: error creating/updating Private DNS Zone Virtual network link


Details

    Description

      Description of problem:

      Over the past two days we have seen frequent private DNS zone creation failures in Azure CI jobs, and the Azure CI jobs have been greatly affected.
      https://search.ci.openshift.org/?search=error+creating%2Fupdating+Private+DNS+Zone+Virtual+network&maxAge=48h&context=1&type=build-log&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
      
      For example, the following error is from https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.12-upgrade-from-stable-4.11-e2e-azure-sdn-upgrade/1566852244215697408:
      
      level=info msg=Consuming Openshift Manifests from target directory
      level=info msg=Consuming Common Manifests from target directory
      level=info msg=Credentials loaded from file "/var/run/secrets/ci.openshift.io/cluster-profile/osServicePrincipal.json"
      level=info msg=Creating infrastructure resources...
      level=error
      level=error msg=Error: error creating/updating Private DNS Zone Virtual network link "ci-op-1w80vs6f-7f65d-t2zlz-network-link" (Resource Group "ci-op-1w80vs6f-7f65d-t2zlz-rg"): privatedns.VirtualNetworkLinksClient#CreateOrUpdate: Failure sending request: StatusCode=404 -- Original Error: Code="ParentResourceNotFound" Message="Can not perform requested operation on nested resource. Parent resource 'ci-op-1w80vs6f-7f65d.ci2.azure.devcluster.openshift.com' not found."
      level=error
      level=error msg=  with module.dns.azureprivatedns_zone_virtual_network_link.network,
      level=error msg=  on dns/dns.tf line 13, in resource "azureprivatedns_zone_virtual_network_link" "network":
      level=error msg=  13: resource "azureprivatedns_zone_virtual_network_link" "network" 
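
      For triaging many build logs, the fields of interest in the error line above (HTTP status, Azure error code, and the missing parent resource) can be pulled out with a regular expression. The following is only a minimal sketch: the pattern and the helper name `parse_azure_error` are illustrative assumptions based on this one log line, not part of the installer or any CI tooling.

```python
import re

# Illustrative pattern, derived only from the single log line quoted in this issue.
AZURE_ERROR = re.compile(
    r'StatusCode=(?P<status>\d+)'
    r'.*?Code="(?P<code>[^"]+)"'
    r".*?Parent resource '(?P<parent>[^']+)' not found"
)


def parse_azure_error(line):
    """Return status, error code and parent resource, or None if the line does not match.

    Hypothetical helper for log triage; not an installer API.
    """
    match = AZURE_ERROR.search(line)
    if match is None:
        return None
    return {
        "status": int(match.group("status")),
        "code": match.group("code"),
        "parent": match.group("parent"),
    }


# The error line from the job log above, split across string literals for readability.
SAMPLE = (
    'level=error msg=Error: error creating/updating Private DNS Zone Virtual network link '
    '"ci-op-1w80vs6f-7f65d-t2zlz-network-link" (Resource Group "ci-op-1w80vs6f-7f65d-t2zlz-rg"): '
    'privatedns.VirtualNetworkLinksClient#CreateOrUpdate: Failure sending request: StatusCode=404 '
    '-- Original Error: Code="ParentResourceNotFound" Message="Can not perform requested operation '
    "on nested resource. Parent resource 'ci-op-1w80vs6f-7f65d.ci2.azure.devcluster.openshift.com' "
    'not found."'
)

if __name__ == "__main__":
    print(parse_azure_error(SAMPLE))
```

      Here the parent resource reported as missing is the private DNS zone name, while the failing operation is the creation of the virtual network link nested under it.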
      
      

      Version-Release number of selected component (if applicable):

      All OCP versions
      
      

      How reproducible:

      https://search.ci.openshift.org/chart?name=e2e-azure&search=error+creating%2Fupdating+Private+DNS+Zone&maxAge=24h&type=build-log
      shows that 26% of the failed Azure jobs in the past day are related to "error creating/updating Private DNS Zone".
      In QE's CI, 3 of the 5 failed Azure jobs today were caused by this error.
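
      To re-check the failure rate for a different time window or job name, the chart query above can be rebuilt from its parameters. This is a small sketch that only reuses the query parameters visible in the URL (`name`, `search`, `maxAge`, `type`); the helper name `build_chart_url` is a hypothetical convenience, and no other parameters of the search service are assumed.

```python
from urllib.parse import urlencode

# Base of the chart URL quoted in this issue.
SEARCH_BASE = "https://search.ci.openshift.org/chart"


def build_chart_url(job_name, search, max_age="24h"):
    """Build a chart URL from the parameters seen in the issue's link.

    Hypothetical helper; parameter names are copied from the URL above.
    """
    params = {
        "name": job_name,
        "search": search,
        "maxAge": max_age,
        "type": "build-log",
    }
    # urlencode escapes "/" as %2F and spaces as "+", matching the original link.
    return SEARCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_chart_url("e2e-azure", "error creating/updating Private DNS Zone"))
```

      Passing a different `max_age` value, such as the `48h` used in the first search link, widens the window.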
      
      


      Additional info:

       
      No Azure outage was reported at https://status.azure.com/en-us/status.
      No private DNS zone or DNS record quota was observed to be exceeded.
      

            People

              jhixson_redhat John Hixson
              gpei@redhat.com Gaoyun Pei
              Gaoyun Pei Gaoyun Pei