Bugzilla – Full Text Bug Listing
|Summary:||Unbound could not be stopped correctly with enabled DNS TAP socket|
|Product:||unbound||Reporter:||Pavel Odintsov <pavel.odintsov>|
|Component:||server||Assignee:||unbound team <unbound-team>|
|Severity:||major||CC:||cathya, edmonds, wouter|
Description Pavel Odintsov 2016-10-27 15:58:24 CEST
Hello! I'm testing DNS Tap capability with following configuration: server: interface: 0.0.0.0 port: 1153 pidfile: "/var/run/unbound/unbound.pid" chroot: "" log-queries: yes username: root do-ip6: no logfile: "/var/log/unbound.log" verbosity: 10 dnstap: dnstap-enable: yes dnstap-socket-path: "/tmp/dnstap.sock" dnstap-send-identity: no dnstap-send-version: no dnstap-log-client-query-messages: yes dnstap-log-client-response-messages: yes dnstap-log-resolver-query-messages: yes dnstap-log-resolver-response-messages: yes dnstap-log-forwarder-query-messages: yes dnstap-log-forwarder-response-messages: yes And I'm using this dnstap client: https://github.com/dnstap/dnstap-ldns Reproduce list: 1) sudo dnstap -u /tmp/dnstap.sock 2) Start unbound in foreground: sudo /usr/sbin/unbound -c /home/vagrant/unbound_dnstap.conf -d 3) Execute some query to this Unbound instance 4) Try to shutdown Unbound instance correctly with CTRL+C. It handled up until you stopped DNS TAP client. Thanks!
Comment 1 Wouter Wijngaards 2016-10-27 16:57:33 CEST
Hi Pavel, That exactly went wrong? If you start unbound with -dd it logs to stderr on the console and with -vvvvv it enables maximum verbosity. What does unbound printout when it goes wrong, and what do you mean with "It handled up until you stopped DNS TAP client.". What handled and what happened when dns tap was stopped? Best regards, Wouter
Comment 2 Pavel Odintsov 2016-10-27 17:27:57 CEST
Hello, Wouter! Expected behaviour is "correct Unbound shutdown". But I see following. I started up DNS TAP client tool: sudo dnstap -u /tmp/dnstap.sock dnstap: opened input socket /tmp/dnstap.sock dnstap.FrameStreamSockInput: accepted a socket connection Then I started Unbound: sudo /usr/sbin/unbound -c /home/vagrant/unbound_dnstap.conf -d -vvvvv Then I tried to shutdown it with CTRL+C:  unbound[5983:0] notice: Start of unbound 1.5.11.  unbound[5983:0] debug: creating udp4 socket 0.0.0.0 1153  unbound[5983:0] debug: creating tcp4 socket 0.0.0.0 1153  unbound[5983:0] error: cannot open pidfile /var/run/unbound/unbound.pid: No such file or directory  unbound[5983:0] debug: cannot chown 0.0 /var/run/unbound/unbound.pid: No such file or directory  unbound[5983:0] debug: chdir to /etc/unbound  unbound[5983:0] debug: drop user privileges, run as root  unbound[5983:0] debug: switching log to /var/log/unbound.log ^C^C^C^C^C Here you could find log content for this time:  unbound[5983:0] info: start of service (unbound 1.5.11).  unbound[5983:0] info: service stopped (unbound 1.5.11).  unbound[5983:0] debug: stop threads  unbound[5983:0] debug: cleanup.  unbound[5983:0] info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch  unbound[5983:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0  unbound[5983:0] info: mesh has 0 recursion states (0 with reply, 0 detached), 0 waiting replies, 0 recursion replies sent, 0 replies dropped, 0 states jostled out  unbound[5983:0] debug: cache memory msg=66072 rrset=66072 infra=2632 val=66344  unbound[5983:0] notice: closing dnstap socket But Unbound daemon is not stopped. There is only one way to stop it correctly: shutdown DNS TAP client tool. Then I could see:  unbound[5983:0] debug: Exit cleanup.  unbound[5983:0] debug: switching log to stderr
Comment 3 Wouter Wijngaards 2016-10-27 17:47:35 CEST
Hi Pavel, At this point unbound simply calls: fstrm_iothr_destroy(&env->iothr); And this is taking time. It seems to be inside the fstrm library (from dnstap). Not sure how to fix that? Best regards, Wouter
Comment 4 Wouter Wijngaards 2016-10-27 17:49:52 CEST
Hi Pavel, It's documentation says: * Destroy an `fstrm_iothr` object. This signals the background I/O thread to * flush or discard any queued data frames and deallocates any resources used * internally. This function blocks until the I/O thread has terminated. So I guess the I/O thread has not terminated. Not sure why? Are the contents not properly drained and it is waiting for some last packets to be sent? Otherwise, it doesn't seem to be an unbound bug. But clearly, undesired behaviour. Not sure what to do about it from unbound's code. Best regards, Wouter
Comment 5 Robert Edmonds 2016-10-27 20:29:53 CEST
Hi, Pavel: This is definitely a problem in fstrm, not in Unbound. I would recommend opening an issue in the fstrm issue tracker: https://github.com/farsightsec/fstrm/issues.
Comment 6 Pavel Odintsov 2016-11-02 15:10:34 CET
Thanks for attention! I'm going to create issue for fstrm project.