OpenShift Bugs / OCPBUGS-14013

[Regression] cert-manager does not work with route53 (works with azureDNS though)


    • Bug
    • Resolution: Can't Do
    • Undefined
    • 4.13.z, 4.12.z
    • cert-manager
    • Quality / Stability / Reliability
    • False
    • Important
    • No
    • Rejected

      Description of problem:

      The cert-manager Operator (either v1.10.2 or v1.11.1) was installed on several OCP clusters. On each of these clusters, AWS key/secret pairs from 3 different AWS users were tested against the AWS route53 solver, using a different DNS name for the certificate in each test. All tests failed with:
      
      $ oc get challenge -o wide
      NAME                                                             STATE     DOMAIN                                     REASON                                                                                                                                                                                                                                                                          AGE
      cert-from-clusterissuer-dns01-zone-qe1-mjrrv-3966567-503983807   invalid   auth-xxia-1.qe1.devcluster.openshift.com   Error accepting authorization: acme: authorization error for auth-xxia-1.qe1.devcluster.openshift.com: 400 urn:ietf:params:acme:error:dns: DNS problem: NXDOMAIN looking up CAA for auth-xxia-1.qe1.devcluster.openshift.com - check that a DNS record exists for this domain   9m18s

      Note, however, that the key/secret pairs of all 3 AWS users worked before.

       

      Version-Release number of selected component (if applicable):

      v1.11.1-6 in 4.13.0-0.nightly-2023-05-23-145816
      v1.10.2 in 4.12.0-0.nightly-2023-05-23-221822

      How reproducible:

      Always

      Steps to Reproduce:

      1. Follow test case https://polarion.engineering.redhat.com/polarion/#/project/OSE/workitem?id=OCP-62494 "OCP-62494 Use explicit credential in ACME dns01 solver with route53 to generate certificate" steps.
      2. Then check the challenge:
      $ oc get challenge -o wide -n xxia-test
      NAME                                                             STATE     DOMAIN                                     REASON                                                                                                                                                                                                                                                                          AGE
      cert-from-clusterissuer-dns01-zone-qe1-mjrrv-3966567-503983807   invalid   auth-xxia-1.qe1.devcluster.openshift.com   Error accepting authorization: acme: authorization error for auth-xxia-1.qe1.devcluster.openshift.com: 400 urn:ietf:params:acme:error:dns: DNS problem: NXDOMAIN looking up CAA for auth-xxia-1.qe1.devcluster.openshift.com - check that a DNS record exists for this domain   9m18s
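When many challenges are listed, it can help to reduce the wide table to just the failing entries. The following is a minimal sketch, not part of the original test case: `invalid_challenges` is a hypothetical helper name, and it assumes the single-namespace column layout shown above (NAME, STATE, DOMAIN first, no NAMESPACE column).

```shell
#!/bin/sh
# invalid_challenges: read `oc get challenge -o wide` output on stdin and
# print "NAME DOMAIN" for every challenge whose STATE column is "invalid".
# Assumes NAME, STATE and DOMAIN are the first three columns (hypothetical
# helper; adjust the field numbers if a NAMESPACE column is present).
invalid_challenges() {
  awk 'NR > 1 && $2 == "invalid" { print $1, $3 }'
}
```

It would be used as `oc get challenge -o wide -n xxia-test | invalid_challenges`.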
      

      Actual results:

      As above.
      The AWS route53 console showed that the TXT record existed and then disappeared.
      After increasing the log level:
      $ oc --context=admin patch certmanager/cluster --type=merge -p='{"spec":{"controllerConfig":{"overrideArgs":["--v=8"]}}}'
      the pod logs showed "ACME DNS01 validation record propagated":

      2023-05-24T09:01:14.155718000Z I0524 09:01:14.155676       1 dns.go:130] cert-manager/challenges/Check "msg"="ACME DNS01 validation record propagated" "dnsName"="auth-xxia-1.qe1.devcluster.openshift.com" "domain"="auth-xxia-1.qe1.devcluster.openshift.com" "fqdn"="_acme-challenge.auth-xxia-1.qe1.devcluster.openshift.com." "resource_kind"="Challenge" "resource_name"="cert-from-clusterissuer-dns01-zone-qe1-75ssp-3966567-3088233583" "resource_namespace"="xxia-test" "resource_version"="v1" "type"="DNS-01"
      2023-05-24T09:01:14.155753650Z I0524 09:01:14.155714       1 sync.go:359] cert-manager/challenges/acceptChallenge "msg"="accepting challenge with ACME server" "dnsName"="auth-xxia-1.qe1.devcluster.openshift.com" "resource_kind"="Challenge" "resource_name"="cert-from-clusterissuer-dns01-zone-qe1-75ssp-3966567-3088233583" "resource_namespace"="xxia-test" "resource_version"="v1" "type"="DNS-01"
      

      And:

      2023-05-24T09:01:15.187742816Z E0524 09:01:15.187703       1 sync.go:379] cert-manager/challenges/acceptChallenge "msg"="error waiting for authorization" "error"="acme: authorization error for auth-xxia-1.qe1.devcluster.openshift.com: 400 urn:ietf:params:acme:error:dns: DNS problem: NXDOMAIN looking up CAA for auth-xxia-1.qe1.devcluster.openshift.com - check that a DNS record exists for this domain" "dnsName"="auth-xxia-1.qe1.devcluster.openshift.com" "resource_kind"="Challenge" "resource_name"="cert-from-clusterissuer-dns01-zone-qe1-75ssp-3966567-3088233583" "resource_namespace"="xxia-test" "resource_version"="v1" "type"="DNS-01"
      

      Full logs are uploaded to: https://drive.google.com/file/d/1Iw_zHg1qgHlPKG0WsSm08hmxPXKfnZKj/view?usp=sharing
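At `--v=8` the controller log is very verbose, so narrowing it to one Challenge makes the sequence above easier to follow. This is a rough sketch under stated assumptions: `challenge_msgs` is a hypothetical helper, and it assumes the log line layout seen in the excerpts above (an RFC 3339 timestamp, then a klog tag such as `I0524 09:01:14.155676`, with `"msg"` and `"resource_name"` key/value pairs).

```shell
#!/bin/sh
# challenge_msgs NAME: read cert-manager controller log lines on stdin and,
# for lines that mention Challenge NAME, print the klog severity/time tag
# followed by the value of the "msg" field.
# Hypothetical helper; the line layout is assumed from the excerpts above.
challenge_msgs() {
  grep -F "\"resource_name\"=\"$1\"" |
    sed -n 's/^ *[^ ]* \([IWEF][0-9]* [0-9:.]*\).*"msg"="\([^"]*\)".*$/\1 \2/p'
}
```

For example, saved pod logs could be piped in as `challenge_msgs cert-from-clusterissuer-dns01-zone-qe1-75ssp-3966567-3088233583 < controller.log`, where `controller.log` is a placeholder file name.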

      Expected results:

      Success

      Additional info:

      azureDNS works well in the same environment https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/205292/ :

      $ oc get certificate -n xxia-proj
      NAME                             READY   SECRET                           AGE
      cert-from-issuer-with-azuredns   True    cert-from-issuer-with-azuredns   19m

              thn@redhat.com Thejas N (Inactive)
              xxia-1 Xingxing Xia
              None
              Swarup Ghosh, Thejas N (Inactive)
              Xingxing Xia Xingxing Xia
              None