RHEL-45403

unbound service stops responding - EAGAIN (Resource temporarily unavailable)

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • CentOS Stream 9
    • unbound
    • None
    • None
    • Important
    • rhel-sst-cs-net-perf-services
    • ssg_core_services
    • None
    • False
    • None
    • None
    • None
    • None
    • x86_64
    • None

      What were you trying to do that didn't work?

      unbound service stops returning results after being functional for some time.

      This is the strace output. unbound waits in epoll_wait, receives and answers requests, and then the next recvfrom returns EAGAIN (Resource temporarily unavailable):

       

      epoll_wait(8, [{events=EPOLLIN, data={u32=3, u64=3}}], 32, -1) = 1                                                                                                                                                                                                              
      recvfrom(3, "\347\2\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(53595), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\347\2\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(53595), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39                                                        
      recvfrom(3, "\271\227\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(51424), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\271\227\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(51424), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39
      recvfrom(3, "\347\2\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(53595), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\347\2\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(53595), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39                                                        
      recvfrom(3, "\271\227\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(51424), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\271\227\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(51424), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39
      recvfrom(3, "\344\327\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(56440), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\344\327\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(56440), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39
      recvfrom(3, "\274\311\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(53242), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\274\311\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(53242), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39                                                      
      recvfrom(3, "\344\327\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(56440), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\344\327\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(56440), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39
      recvfrom(3, "\274\311\1\0\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 65552, 0, {sa_family=AF_INET6, sin6_port=htons(53242), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, [128 => 28]) = 39
      sendto(3, "\274\311\201\202\0\1\0\0\0\0\0\0\0012\6centos\4pool\3ntp\3o"..., 39, 0, {sa_family=AF_INET6, sin6_port=htons(53242), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 39                                                      
      recvfrom(3, 0x5641a194bcf0, 65552, 0, 0x7ffcbbc34438, [128]) = -1 EAGAIN (Resource temporarily unavailable)
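
      To make the trace above easier to read: the first 12 bytes of every datagram are the DNS header. The following is a minimal Python sketch that decodes them, using the payload bytes of the first sendto copied from the strace output. The function name decode_header and the small RCODES table are only illustrative. Note also that on a non-blocking socket, EAGAIN from recvfrom normally just means the receive queue is currently empty.

```python
import struct

# Subset of DNS response codes, enough for this trace.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def decode_header(data: bytes) -> dict:
    """Decode the fixed 12-byte DNS header (hypothetical helper)."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", data[:12])
    rcode = flags & 0xF
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),      # 1 = response
        "opcode": (flags >> 11) & 0xF,   # 0 = QUERY
        "rd": bool(flags & 0x0100),      # recursion desired
        "ra": bool(flags & 0x0080),      # recursion available
        "rcode": RCODES.get(rcode, str(rcode)),
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }


# Header bytes of the first sendto() in the strace output above.
reply = b"\347\2\201\202\0\1\0\0\0\0\0\0"
header = decode_header(reply)
print(header)
```

      For this payload the sketch reports a response (qr) with rd and ra set, zero answers, and rcode SERVFAIL, which matches the dig output under "Actual results": unbound is still answering on the socket, but with SERVFAIL.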
      

      This is the service configuration:

      [cloud-user@nat64-appliance ~]$ cat /etc/unbound/unbound.conf
      include: "/etc/unbound/conf.d/*.conf"
      server:
        use-systemd: yes
        do-daemonize: no
        do-not-query-localhost: no
        module-config: "dns64 validator iterator"
        interface: ::1
        access-control: ::0/0 refuse
        access-control: 2620:cf:cf:fc00::/64 allow
        dns64-prefix: 2620:cf:cf:fcff::/96
      
      [cloud-user@nat64-appliance ~]$ cat /etc/unbound/conf.d/example.com.conf 
      # Example of an override of the "public DNS tree" with an "internal view"
      # override, for example to add an internal-only corporate DNS zone.
      #
      # The stub-zone/stub-addr must point to AUTHORITATIVE servers. If you want to
      # point to an internal RECURSIVE server, use forward-zone/forward-addr instead.
      #stub-zone:
      #       name: example.com
      #       stub-prime: no
      #       # if you could trust a lookup, use:
      #       stub-host: a.iana-servers.net.
      #       stub-host: b.iana-servers.net.
      #       # else specify the IP's using:
      #       stub-addr: 199.43.132.53
      #       stub-addr: 2001:500:8c::53
      #       stub-addr: 199.43.133.53
      #       stub-addr: 2001:500:8d::53
      
      [cloud-user@nat64-appliance ~]$ cat /etc/unbound/conf.d/
      example.com.conf     forward-zones.conf   remote-control.conf
      
      [cloud-user@nat64-appliance ~]$ cat /etc/unbound/conf.d/remote-control.conf
      # Remote control config section update.
      # Previous defaults allowed any process to change settings, CVE-2024-1488
      remote-control:
          # set to an absolute path to use a unix local name pipe, certificates
          # are not used for that, so key and cert files need not be present.
          control-interface: "/run/unbound/control"
          # For local sockets this option is ignored, and TLS is not used.
          control-use-cert: "yes"
      
      [cloud-user@nat64-appliance ~]$ cat /etc/unbound/conf.d/forward-zones.conf
      forward-zone:
          name: "."
          forward-addr: 127.0.2.254
      

      Please provide the package NVR for which bug is seen:

      unbound-1.16.2-8.el9.x86_64

      How reproducible:

      The issue first appears after about 1 hour, and then recurs more frequently after the service is restarted.

      Steps to reproduce

      1. This can be reproduced using this ansible role: https://github.com/openstack-k8s-operators/ci-framework/tree/main/roles/nat64_appliance
      2. The role builds an appliance-like image and creates networks and a VM in libvirt. Once set up, the unbound service listens on `2620:cf:cf:fc00::2`.
      3. We run this as part of CI that deploys OpenShift and a number of VMs. I see the issue after about 1 hour, so a script that generates some load may be needed to reproduce it. Probably not, though, since a DNS server in the lab uses this appliance as a forwarder and would be caching requests.
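
      If a script turns out to be needed, a minimal probe could look like the Python sketch below. It sends the same kind of AAAA query as the dig commands in this report and returns the response code, so it could be run in a loop to generate load or to catch the moment the service starts returning SERVFAIL. The names build_query and probe, the fixed query ID, and the 2-second timeout are assumptions made for illustration; the server address is the one from step 2.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}
QTYPE_AAAA = 28
QCLASS_IN = 1


def build_query(name: str, qtype: int = QTYPE_AAAA, query_id: int = 0x1234) -> bytes:
    """Build a plain DNS query (no EDNS) with the RD flag set."""
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


def probe(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one query over UDP/IPv6 and return the RCODE name of the reply."""
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        reply, _ = sock.recvfrom(4096)
    rcode = struct.unpack("!H", reply[2:4])[0] & 0xF
    return RCODES.get(rcode, str(rcode))


# Example call (needs network access to the appliance):
#   probe("2620:cf:cf:fc00::2", "github.com")
```

      Calling probe periodically and logging the result with a timestamp would show when the answers change from NOERROR to SERVFAIL.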

      Expected results

      [zuul@host26 ~]$ dig AAAA @2620:cf:cf:fc00::2 github.com

      ; <<>> DiG 9.16.23-RH <<>> AAAA @2620:cf:cf:fc00::2 github.com
      ; (1 server found)
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48276
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1232
      ;; QUESTION SECTION:
      ;github.com.                    IN      AAAA

      ;; ANSWER SECTION:
      github.com.             40      IN      AAAA    2620:cf:cf:fcff::8c52:7204

      ;; Query time: 2 msec
      ;; SERVER: 2620:cf:cf:fc00::2#53(2620:cf:cf:fc00::2)
      ;; WHEN: Thu Jun 27 23:17:25 EDT 2024
      ;; MSG SIZE  rcvd: 67
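
      For reference, the AAAA record in the expected answer is a DNS64-synthesized address: with a /96 prefix, the IPv4 address is placed in the low 32 bits of the configured dns64-prefix. The Python sketch below shows that mapping with the standard ipaddress module. The function names are illustrative, and only the /96 case used in this configuration is handled.

```python
import ipaddress


def synthesize(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 DNS64 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    return ipaddress.IPv6Address(int(net.network_address) | int(ipaddress.IPv4Address(ipv4)))


def extract(ipv6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 address from the low 32 bits of a synthesized address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)


# Address from the expected dig answer and the dns64-prefix from unbound.conf.
embedded = extract("2620:cf:cf:fcff::8c52:7204")
rebuilt = synthesize("2620:cf:cf:fcff::/96", str(embedded))
print(embedded, rebuilt)
```

      Here the low 32 bits 8c52:7204 correspond to 140.82.114.4, so a NOERROR answer of this form indicates that both the upstream A lookup and the DNS64 synthesis worked.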

      Actual results

       

      [zuul@host26 ~]$ dig AAAA @2620:cf:cf:fc00::2 github.com

      ; <<>> DiG 9.16.23-RH <<>> AAAA @2620:cf:cf:fc00::2 github.com
      ; (1 server found)
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43419
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1232
      ;; QUESTION SECTION:
      ;github.com.                    IN      AAAA

      ;; Query time: 0 msec
      ;; SERVER: 2620:cf:cf:fc00::2#53(2620:cf:cf:fc00::2)
      ;; WHEN: Thu Jun 27 23:16:56 EDT 2024
      ;; MSG SIZE  rcvd: 39
       

       

       

              Petr Mensik (pemensik@redhat.com)
              Harald Jensas (rhn-gps-hjensas)
              rhel-cs-infra-services-qe