Knative Serving / SRVKS-1301

transient knative-operator-webhook 'failed calling webhook "webhook.serving.knative.dev"' errors after KnativeServing creation

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 1.36.0
    • 1.35.0

      The following transient error appears in the knative-operator-webhook logs while a newly created KnativeServing is being reconciled:

      failed to apply non rbac manifest: Internal error occurred: failed calling webhook "webhook.serving.knative.dev": failed to call webhook: Post "https://webhook.knative-serving.svc:443/?timeout=10s": no endpoints available for service "webhook"

      Full log entry:

      {
        "level": "error",
        "ts": "2024-12-11T16:52:36.258Z",
        "logger": "knative-operator",
        "caller": "controller/controller.go:564",
        "msg": "Reconcile error",
        "commit": "8750a8b",
        "knative.dev/pod": "knative-operator-webhook-785b4bc7bf-dknvg",
        "knative.dev/controller": "knative.dev.operator.pkg.reconciler.knativeserving.Reconciler",
        "knative.dev/kind": "operator.knative.dev.KnativeServing",
        "knative.dev/traceid": "11967dfd-ad6d-46f4-b495-7a7830a57933",
        "knative.dev/key": "knative-serving/knative-serving",
        "duration": 4.711107858,
        "error": "failed to apply non rbac manifest: Internal error occurred: failed calling webhook \"webhook.serving.knative.dev\": failed to call webhook: Post \"https://webhook.knative-serving.svc:443/?timeout=10s\": no endpoints available for service \"webhook\"",
        "stacktrace": "knative.dev/pkg/controller.(*Impl).handleErr\n\t/workspace/vendor/knative.dev/pkg/controller/controller.go:564\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\t/workspace/vendor/knative.dev/pkg/controller/controller.go:541\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\t/workspace/vendor/knative.dev/pkg/controller/controller.go:489"
      } 

      The problem seems to be that, as part of the KnativeServing reconciliation, we apply the routing-serving-certs Certificate (https://github.com/openshift-knative/serverless-operator/blob/release-1.35/openshift-knative-operator/cmd/openshift-knative-operator/kodata/knative-serving/latest/2-serving-core.yaml#L558-L571). This is a certificate.networking.internal.knative.dev resource, which is intercepted by the webhook.serving.knative.dev webhook. The webhook's Deployment is reconciled together with that resource in the same KnativeServing reconciler, so the webhook may not have any ready endpoints yet when the Certificate is applied.
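
To illustrate the ordering problem, here is a minimal Go sketch of splitting a manifest into resources that can be applied right away and resources that have to wait for the webhook. It is not the operator's actual code: the `Resource` type, the `guardedGroups` table, and `splitByWebhook` are all hypothetical, and the only guarded API group listed is the one named in this report.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified stand-in for one manifest entry.
type Resource struct {
	APIGroup string
	Kind     string
	Name     string
}

// guardedGroups lists API groups assumed to be intercepted by
// webhook.serving.knative.dev. Only the group from this report is included;
// the real webhook configuration may cover more.
var guardedGroups = map[string]bool{
	"networking.internal.knative.dev": true,
}

// splitByWebhook partitions resources into those that can be applied
// immediately and those that should wait until the webhook has endpoints.
func splitByWebhook(all []Resource) (first, deferred []Resource) {
	for _, r := range all {
		if guardedGroups[r.APIGroup] {
			deferred = append(deferred, r)
		} else {
			first = append(first, r)
		}
	}
	return first, deferred
}

func main() {
	manifest := []Resource{
		{APIGroup: "apps", Kind: "Deployment", Name: "webhook"},
		{APIGroup: "networking.internal.knative.dev", Kind: "Certificate", Name: "routing-serving-certs"},
	}
	first, deferred := splitByWebhook(manifest)
	fmt.Println("apply first:", first)
	fmt.Println("apply once the webhook is ready:", deferred)
}
```

With the current behaviour both groups are applied in one pass, which is why the Certificate can reach the API server before the webhook Deployment has a ready pod.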

      The webhook does eventually start, the reconciliation is retried, and the routing-serving-certs Certificate is created, so KnativeServing eventually becomes Ready.

              Assignee: Unassigned
              Reporter: Marek Schmidt (maschmid@redhat.com)