[nsd-users] nsd3 fills disk

Fri Jul 11 09:29:32 UTC 2014

Hi!

My test zone is also served by an nsd3 3.2.15 slave. It seems that the
server sometimes has too little memory to update to a newer zone file:

Jul 11 04:23:04 nsd[857]: memory recyclebin holds 113128 bytes
Jul 11 04:23:27 nsd[857]: nsec3-prepare took 23 seconds for 1 zones.
Jul 11 04:23:27 nsd[857]: fork failed: Cannot allocate memory
Jul 11 04:23:27 nsd[382]: handle_reload_cmd: reload closed cmd channel
Jul 11 04:23:27 nsd[382]: Reload process 857 failed with status 256,
continuing with old database
Jul 11 04:25:12 nsd[411]: Notify received and accepted, forward to xfrd
Jul 11 04:26:36 nsd[411]: Notify received and accepted, forward to xfrd
Jul 11 04:26:36 nsd[26991]: Handle incoming notify for zone test
Jul 11 04:26:36 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071106 to disk
Jul 11 04:26:36 nsd[26991]: last message repeated 54 times
Jul 11 04:26:36 nsd[26991]: xfrd: zone test committed "xfrd: zone test
received update to serial 2014071106 at time 1405052796 from 83.136.34.4
in 55 parts TSIG verified with key rcode0-distribution"
Jul 11 04:26:36 nsd[382]: signal received, reloading...
Jul 11 04:26:57 nsd[965]: memory recyclebin holds 112696 bytes
Jul 11 04:27:21 nsd[965]: nsec3-prepare took 24 seconds for 1 zones.
Jul 11 04:27:21 nsd[965]: fork failed: Cannot allocate memory
Jul 11 04:27:21 nsd[382]: handle_reload_cmd: reload closed cmd channel
Jul 11 04:27:21 nsd[382]: Reload process 965 failed with status 256,
continuing with old database
Jul 11 04:27:21 nsd[26991]: xfrd: zone test: soa serial 2014071106
update failed, restarting transfer (notified zone)
Jul 11 04:27:21 nsd[26991]: Handle incoming notify for zone test

OK, I probably gave the VM too little memory. But a few hours later the
disk suddenly fills up:

Jul 11 06:55:31 nsd[382]: signal received, reloading...
Jul 11 06:56:21 nsd[2245]: memory recyclebin holds 112696 bytes
Jul 11 06:56:44 nsd[2245]: nsec3-prepare took 23 seconds for 1 zones.
Jul 11 06:56:44 nsd[2245]: fork failed: Cannot allocate memory
Jul 11 06:56:44 nsd[382]: handle_reload_cmd: reload closed cmd channel
Jul 11 06:56:44 nsd[382]: Reload process 2245 failed with status 256,
continuing with old database
Jul 11 06:56:44 nsd[26991]: xfrd: zone test: soa serial 2014071108
update failed, restarting transfer (notified zone)
Jul 11 06:56:44 nsd[26991]: Handle incoming notify for zone test
Jul 11 07:00:13 nsd[411]: Notify received and accepted, forward to xfrd
Jul 11 07:00:13 nsd[26991]: Handle incoming notify for zone test
Jul 11 07:00:13 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:20 nsd[26991]: last message repeated 2116 times
Jul 11 07:00:20 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:30 nsd[26991]: last message repeated 5853 times
Jul 11 07:00:30 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:31 nsd[26991]: last message repeated 669 times
Jul 11 07:00:31 nsd[26991]: short write (disk full?)
Jul 11 07:00:31 nsd[26991]: could not write to file
/var/lib/nsd3/ixfr.db: No space left on device
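
To get a feeling for how much was written before the disk filled up, the
"written received XFR" messages can be counted, including the syslog
"last message repeated N times" lines. Below is a small Python sketch
that does this for the excerpt above. The function name count_xfr_parts
is just made up for illustration, and the parsing assumes the wrapped
syslog format exactly as pasted here. Applied to the earlier 04:26
transfer (1 message + 54 repeats) the same method gives the 55 parts
that xfrd itself reports.

```python
import re

# Excerpt of the log above, with the mail line wrapping left as pasted.
LOG = """\
Jul 11 07:00:13 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:20 nsd[26991]: last message repeated 2116 times
Jul 11 07:00:20 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:30 nsd[26991]: last message repeated 5853 times
Jul 11 07:00:30 nsd[26991]: xfrd: zone test written received XFR from
83.136.34.4 with serial 2014071108 to disk
Jul 11 07:00:31 nsd[26991]: last message repeated 669 times
Jul 11 07:00:31 nsd[26991]: short write (disk full?)
"""

# A real log line starts with a syslog timestamp such as "Jul 11 07:00:13 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
REPEATED = re.compile(r"last message repeated (\d+) times")


def count_xfr_parts(log_text):
    """Count 'written received XFR' messages, including syslog repeats."""
    total = 0
    last_was_xfr = False
    for line in log_text.splitlines():
        if not TIMESTAMP.match(line):
            continue  # wrapped continuation of the previous line
        repeated = REPEATED.search(line)
        if repeated:
            # Repeats only count if the repeated message was an XFR write.
            if last_was_xfr:
                total += int(repeated.group(1))
            continue
        last_was_xfr = "written received XFR" in line
        if last_was_xfr:
            total += 1
    return total


parts = count_xfr_parts(LOG)
print(parts)
```

So this single transfer attempt logged far more part writes than the 55
parts of the 04:26 transfer. Whether these are parts of one full zone
transfer or of several restarted ones cannot be seen from the log alone.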

# ll /var/lib/nsd3/
-rw-r--r-- 1 nsd nsd 3255480320 2014-07-11 09:25 ixfr.db
-rw-r--r-- 1 nsd nsd  546008330 2014-07-11 04:21 nsd.db
-r--r--r-- 1 nsd nsd          0 2014-07-11 09:16 nsd.db.lock
-rw-r--r-- 1 nsd nsd  666578670 2014-07-11 04:20 test.zone
-rw-r--r-- 1 nsd nsd       1493 2014-07-10 14:32 xfrd.state

Why is the ixfr.db file so huge? ixfr.db is about 3 GB, although the
zone file is only about 600 MB.
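
As a rough check, the sizes from the listing above can be put in
relation to each other. This is only arithmetic on the byte counts shown
by ll; the variable names are made up for illustration, and it says
nothing about how nsd actually lays out ixfr.db.

```python
# Byte counts copied from the directory listing above.
ixfr_db = 3255480320
nsd_db = 546008330
zone_file = 666578670

# How many times the zone file / the compiled database fit into ixfr.db.
ratio_zone = ixfr_db / zone_file
ratio_db = ixfr_db / nsd_db

print(f"ixfr.db is {ratio_zone:.2f} times the size of test.zone")
print(f"ixfr.db is {ratio_db:.2f} times the size of nsd.db")
```

So ixfr.db holds nearly five times the size of the text zone file, which
would fit several complete copies of the zone having been written to it.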

Could it be that nsd just retransfers the zone again and again, and does
not delete ixfr.db but appends every transfer to the file?

I think the "disk full" condition is caused by some strange nsd3
behavior triggered by the "out of memory" failures, and could be fixed.

Thanks,
Klaus