Bug 806

Summary: infra cache doesn't seem to be reused on reload
Product: unbound Reporter: JINMEI Tatuya <jtatuya>
Component: serverAssignee: unbound team <unbound-team>
Status: REOPENED ---    
Severity: normal CC: cathya, wouter
Priority: P5    
Version: unspecified   
Hardware: Other   
OS: All   

Description JINMEI Tatuya 2016-07-28 01:01:19 CEST
Code comment of daemon_cleanup() states that the infra cache is kept in the cleanup (e.g, as part of reloading the configuration):

	 * The infra cache is kept, the timing and edns info is still valid */

Is this really true?  From a quick code inspection, this cache is implemented using a hash table, and the global hash seed is always updated after reload by calling hash_set_raninit() from daemon_create_workers().  So, unless the random seed happens to be set to the same value, the next lookup of the infra cache will fail and a new entry will be created, effectively flushing it (it's slightly even worse as the cache capacity is reduced due to the now-redundant entries).

In fact, from my quick experiment using a single forwarder, infra_lookup_nottl() call from infra_host() returns non-NULL except for the first call, but it returns NULL again after a reload.
Comment 1 Wouter Wijngaards 2016-07-28 09:14:54 CEST
Hi Jinmei,

The comment is wrong.  The infra is removed, and this is a feature.  People want to flush the infra cache, because it gets full of server-unreachable after a disconnect and they want a reload to refresh caches.

I removed the comment, fixing the code.

Best regards, Wouter
Comment 2 JINMEI Tatuya 2016-07-28 17:06:06 CEST
Thanks for the explanation, but I'm not sure if we can simply close it by removing this particular comment.

First, as far as I can see the infra cache is not literally "removed" on reload; it's just that existing entries of the cache are effectively invalidated by the change of random seed for the underlying hash.  In fact, if I suppress the call to hash_set_raninit() from daemon_create_workers() on reload, infra_lookup_nottl() called from infra_host() returns the same non-NULL entry.  So, although it would be quite unlikely in practice, the existing entries could actually be reused after a reload if the new random seed happens to be the same as the old one.

Secondly, especially if you want to keep this "removal emulation", I think the intent and its rationale should be explicitly commented just like other caches.  In its current form code readers would wonder whether the infra cache is to be flushed actually, and if and when they know it is, they would be confused by finding there's no call to infra_delete() (unless its parameters such as # of slabs are changed).

So, IMO, the correct way to handle this issue is:

1. explicitly delete and re-create the infra cache rather than relying on the emulation
2. explicitly document the infra cache is also removed and why (and also how if you still keep the emulated remove)

I'm reopening the issue in case this is worth considering.