Bug 1439

Summary: ub_result::bogus not set when authoritative server returns SERVFAIL for signed zone
Product: unbound Reporter: Andrew Ayer <agwa-bugs>
Component: server Assignee: unbound team <unbound-team>
Status: ASSIGNED ---    
Severity: critical CC: cathya, wouter
Priority: P5    
Version: 1.6.5   
Hardware: All   
OS: All   

Description Andrew Ayer 2017-09-10 00:14:12 CEST
[This bug is present in version 1.6.5, but that's not an option in the version list]

servfail.caatestsuite-dnssec.com has a DNSSEC authentication chain all the way to the root zone.  Its authoritative name server is configured to return SERVFAIL to all queries.  When I query the A record for servfail.caatestsuite-dnssec.com, unbound-host states that the DNSSEC validation status is "insecure":

    $ ./unbound-host -D -v servfail.caatestsuite-dnssec.com. -t A
    Host servfail.caatestsuite-dnssec.com. not found: 2(SERVFAIL). (insecure)

This implies that both ub_result::secure and ub_result::bogus are 0.

Labeling this result "insecure" is a violation of RFC 4033 section 5, which defines "insecure" as:

    "The validating resolver has a trust anchor, a chain of
    trust, and, at some delegation point, signed proof of the
    non-existence of a DS record.  This indicates that subsequent
    branches in the tree are provably insecure.  A validating resolver
    may have a local policy to mark parts of the domain space as
    insecure." [https://tools.ietf.org/html/rfc4033#section-5]

Since servfail.caatestsuite-dnssec.com has a DNSSEC authentication chain to the root, there is no "signed proof of the non-existence of a DS record".  To the contrary, caatestsuite-dnssec.com is signed and has DS records for servfail.caatestsuite-dnssec.com:

    servfail.caatestsuite-dnssec.com. has DS record 58627 5 2 516C31C125B99494F8AFA3959F12A2C2D566FFF96AF714E13EE77413CC3BD9FF
    servfail.caatestsuite-dnssec.com. has DS record 58627 5 1 15257A91D2A271D4FFAEBDCBBC9D11226F284161

This is also contrary to Unbound's documentation, which states:

    "If !secure and !bogus, this can happen if the data is not secure because security is disabled for that domain name. This means the data is from a domain where data is not signed." [https://www.unbound.net/documentation/doxygen/structub__result.html#a43a077bcb37724c47648f3cf5a676a6f]

I'm marking this "critical" since it could lead to security vulnerabilities in applications that use libunbound.  For example, if a mail server falls back to unauthenticated TLS if looking up the TLSA record fails and ub_result::bogus is not set, then an active attacker could force it to downgrade to unauthenticated TLS by returning a SERVFAIL response for the TLSA query.
Comment 1 Wouter Wijngaards 2017-09-13 09:51:38 CEST
Hi Andrew,

Servfail should not be classified as dnssec bogus.  That is not correct.

What you are right about is that for TLSA, servfail answers (and also timeouts during lookups) should not trigger the fallback to insecure.  The difference is that the servfail represents a failure to look up the data.  And bogus represents that the crypto validation failed.

Best regards, Wouter
Comment 2 Andrew Ayer 2017-09-13 17:46:00 CEST
Thank you for looking at this, Wouter.

Here is the definition of bogus:

    The validating resolver has a trust anchor and a secure
    delegation indicating that subsidiary data is signed, but the
    response fails to validate for some reason: missing signatures,
    expired signatures, signatures with unsupported algorithms, data
    missing that the relevant NSEC RR says should be present, and so
    forth. [https://tools.ietf.org/html/rfc4033#section-5]

Unbound has a trust anchor.  Unbound has a secure delegation indicating that servfail.caatestsuite-dnssec.com is signed.  The response doesn't validate, because the response is an error and contains no signatures.  Therefore, the response is bogus.

I'm not saying *all* SERVFAILs are bogus, only those where you know the zone is signed.  If the zone isn't signed, because you have "signed proof of the non-existence of a DS record", then the response is insecure.

If the servfail.caatestsuite-dnssec.com. response isn't bogus, what is it?  It definitely can't be insecure, because Unbound does not have "signed proof of the non-existence of a DS record".
Comment 3 Wouter Wijngaards 2017-09-14 11:04:22 CEST
Hi Andrew,

That definition of bogus supports that a failure to look up data is not bogus, but SERVFAIL.  SERVFAIL is a failure to look up data and not the absence of the data itself.

It is not that the response lacks signatures, but that there is no response to validate at all.  Because there is a lookup failure.

The response is a failure to get a response.  Unbound could not fetch a response, it got only errors or timeouts.  So there is no response.  This turns into SERVFAIL.  Not bogus, because there is nothing there.

For comparison, if you get a memory allocation failure, the crypto routines fail, but that is not a signature verification failure, but a memory allocation failure.  You failed to be able to check.

The line in the RFC about checking for missing signatures is there because older servers don't include signatures with data, and then the missing signatures should not create a downgrade, but be classified as bogus.  But in this case, the server is simply not answering the queries (it responds only with an error).

SERVFAIL does not signify the absence of data.  That is NXDOMAIN.  NXDOMAIN does get protection with DNSSEC, and that error code would become bogus in unbound.

Likely that server responds with servfail because it is not configured to answer for the zone (the zone is not loaded), and so all queries are turned away with an error.

Best regards, Wouter
Comment 4 Andrew Ayer 2017-09-14 16:24:55 CEST
Hi Wouter,

unbound-host labels this response as "insecure."  libunbound's documentation suggests that !secure and !bogus together mean "insecure."  How is this response "insecure" when there is no "signed proof of the non-existence of a DS record"?
Comment 5 Wouter Wijngaards 2017-09-14 16:46:28 CEST
Hi Andrew,

Yes, you are correct that it is not insecure.  It is also not secure, and it is also not bogus.  It is another proof state 'unchecked' (this is what it is called inside unbound's code).  This state means that the DNSSEC status has not been checked because that was not possible.

For the TLSA processing, you should check the rcode, and see if it is neither 0 (NOERROR) nor 3 (NXDOMAIN), and if that is so, always perform the 'lookup is bad, do not downgrade' step, just like you do for bogus results.  And in fact, this is something you should also do when the lookup produces other errors, e.g. ub_ctx_lookup returns an error because it was not possible to make the query (out of memory, network down, that sort of stuff); those should also be treated like that (i.e. like you suggest for bogus, no downgrade steps).
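
The check described above could be sketched roughly as follows.  This is a minimal standalone model in Python, not libunbound code: LookupResult is a hypothetical stand-in whose fields mirror the rcode, secure and bogus members of ub_result, and lookup_error is an assumed flag meaning the resolve call itself returned an error (out of memory, network down, and so on).

```python
from dataclasses import dataclass
from typing import Optional

RCODE_NOERROR = 0
RCODE_NXDOMAIN = 3


@dataclass
class LookupResult:
    """Hypothetical stand-in for the ub_result members used in this sketch."""
    rcode: int    # DNS response code of the answer (2 is SERVFAIL)
    secure: bool  # DNSSEC validation succeeded
    bogus: bool   # DNSSEC validation failed


def must_not_downgrade(lookup_error: bool, result: Optional[LookupResult]) -> bool:
    """Return True when the caller must take the 'lookup is bad' path."""
    if lookup_error or result is None:
        # The query could not be made at all: treat like bogus.
        return True
    if result.bogus:
        return True
    # Any rcode other than NOERROR or NXDOMAIN is a lookup failure,
    # even though both secure and bogus are unset in that case.
    return result.rcode not in (RCODE_NOERROR, RCODE_NXDOMAIN)
```

Under these assumptions, the result from the original report (rcode 2, secure and bogus both unset) lands on the no-downgrade path, while an answer with rcode 0 or 3 and bogus unset is left to the normal TLSA handling.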

It would be nice if the interface of libunbound made TLSA (or TLSA-style) processing easier.  But labelling it bogus is simply not true, even if it conveniently leads to the same action.  The easiest option right now would be some sort of note in the documentation of the bogus result value (e.g. referencing TLSA processing and the need to check the rcode)?

Best regards, Wouter
Comment 6 Andrew Ayer 2017-09-14 17:32:12 CEST
Hi Wouter,

Yes, adding a note in the documentation (and fixing the output of unbound-host) would help.

That said, as a practical matter, it would be very useful to be able to tell when a zone is not signed when there is a lookup failure.  An unfortunate number of DNS servers are broken and either return the wrong code (SERVFAIL, NOTIMP) or return nothing at all when queried for an unsupported record type such as TLSA.  Such servers are not standards-compliant, but they exist.  A nice approach is to downgrade on lookup failure if and only if you have proof the zone is not signed.  I agree you should not downgrade when there is an out-of-memory condition or the network is totally down, but if you have signed proof of DS record non-existence, it's safe to downgrade on lookup failure because the zone administrator has opted out of security anyways.

Another application is the checking of CAA records.  Certificate authorities are required to check CAA records before issuing TLS certificates.  CAs are allowed to treat a lookup failure as permission to issue only if the zone is not signed.  Unfortunately, there are a lot of DNS servers returning SERVFAIL or NOTIMP or ignoring CAA queries. 

Calling it "bogus" might not be correct, but if libunbound provided some way to report whether there is proof of DS record non-existence when there's a lookup failure, it would help the above use cases a lot.
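
The policy requested here could be sketched as below.  This is only an illustration in Python: the Zone enum and the function name are hypothetical, and the PROVEN_UNSIGNED input is exactly the piece of information that, per this thread, libunbound does not currently report for a failed lookup.

```python
from enum import Enum


class Zone(Enum):
    """Hypothetical summary of what the validator knows about the zone."""
    SIGNED = "signed"             # secure delegation: DS records exist
    PROVEN_UNSIGNED = "unsigned"  # signed proof that no DS record exists
    UNKNOWN = "unknown"           # neither could be established


def failure_counts_as_no_record(lookup_failed: bool, zone: Zone) -> bool:
    """Return True when a failed TLSA or CAA lookup may be treated as 'no record'."""
    if not lookup_failed:
        # There is a real answer, so there is nothing to downgrade from.
        return False
    # Downgrade if and only if the zone is provably unsigned.
    return zone is Zone.PROVEN_UNSIGNED
```

With this rule, a SERVFAIL from a broken server in an unsigned zone permits the fallback, while the same SERVFAIL under a signed delegation, or with the network down and no proof either way, does not.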