DNS Magic

The latest version of this page can be found at:
http://www.crufty.net/help/dns/dnsmagic.html

This document outlines how we set up our DNS data. We use cvs(1) and make(1) to ensure reliable operation while allowing for multiple hostmasters and automated tools.

The scheme relies on rather rigid rules for naming zone files and the like, but thanks to a simple script, converting from your old DNS setup to this method is quite painless.

The advantages of this setup are:

CVS

The Concurrent Versions System (CVS) is a layer on top of the Revision Control System (RCS). CVS allows multiple concurrent edits. The tree that named loads from is never used for editing; editing is done in a separate tree (or trees), so there is never any risk of named loading a partially edited zone file.

Once set up, CVS is very simple to use. The hostmaster might use

$ cd 
$ cvs checkout named
to obtain a private copy of the DNS data for editing. Making changes thereafter is simply a case of:

$ cd ~/named
$ cvs update			# pick up changes made by others
$ vi hosts/crufty.net.db	# make changes as desired
$ cvs diff -cb			# see what we have changed
$ cvs commit -m"log comment"	# check in the changes.
If the log comment is omitted then the user is dropped into an editor ($EDITOR or vi) to make a log entry. If multiple files have been changed, they can be committed separately or as a group by adding file names to the commit command.

See here for more detail on using CVS.

make

Make(1) is a tool used to keep files up to date with their dependencies. We use it to ensure that the zone files loaded by named are up to date and that the zone file serial numbers are updated when any of the data within the zone is.

bmake

The makefile we provide is configured for the BSD make(1) (we refer to it as bmake when we want to distinguish) which provides greater flexibility. The makefile has alternate rules (commented out) for GNU make, so bmake is recommended but not strictly required.

For those on systems other than NetBSD or similar, http://www.crufty.net/pub/sjg/bmake.tar.gz is an autoconf'd version of the latest NetBSD make and will compile on many different UNIX systems.

gmake

We also provide a GNUmakefile tailored to GNU make, which looks for that file in preference to makefile or Makefile.

zone files

To achieve our goal, the zone files referenced by named.boot or BIND-8's named.conf contain nothing but the SOA record (where the serial number lives) and an appropriate $INCLUDE directive.

Since make(1) is most conveniently driven by filename suffixes, we use the convention that the SOA file has an extension of .soa and that the included zone file has an extension of .db.
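
To make the .soa/.db split concrete, here is a minimal sketch of a shell function that prints what such a .soa file might contain. The function name emit_soa is hypothetical and is not part of DNS Magic; the name server, contact, and timer values are simply borrowed from the test zone in the Example section.

```shell
# Hypothetical helper (not part of DNS Magic): print a minimal .soa file.
# $1 is the .db file to include, $2 is the serial number.
emit_soa() {
    db=$1
    serial=$2
    # SOA values below are copied from the document's test zone; adjust to taste.
    cat <<EOF
@       IN      SOA     ns.crufty.net. hostmaster.crufty.net.  (
                                $serial ; Serial
                                7200    ; Refresh 2 hour
                                1800    ; Retry 1/2 hour
                                3600000 ; Expire 1000 hours
                                14400 ) ; Minimum
\$INCLUDE $db
EOF
}
```

For instance, emit_soa hosts/crufty.net.db 1997072100 would write a complete SOA record followed by a single $INCLUDE line, leaving all real data in the .db file.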

An example always helps.

# make depend
dnsdeps -N named.boot
# touch ns.list
# make
updsoa hosts/crufty.net.soa
updsoa rev/203.12.250.soa
bouncedns
In the above, we ran make depend which uses dnsdeps(1) to ensure that all the dependencies of all the (primary) zone files referenced by named.boot are recorded. We then simply touch a file that some zones are dependent on and run make, which runs updsoa to update the serial number of zones that were dependent on ns.list.

Note, we used to have make(1) perform a cvs update automagically, but since any update will not be reflected in the dependency graph, we don't do this any more. See upddns below.

# rm hosts/crufty.net.soa
# make
updsoa hosts/crufty.net.soa
bouncedns
In the above example, we remove one of the .soa files - to simulate an accident or perhaps a new .db file. When we then run make the .soa file is created automagically.
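
The serial number update itself can be sketched as follows. This is not the real updsoa; it assumes date-based serials of the form YYYYMMDDnn (as in the serial 1997072100 used later in this document), and the function name next_serial is made up for illustration.

```shell
# Hypothetical sketch of a serial bump, assuming YYYYMMDDnn serials.
# $1 is the current serial, $2 is today's date as YYYYMMDD
# (for example from: date +%Y%m%d).
next_serial() {
    old=$1
    today=$2
    base=${today}00
    if [ "$old" -lt "$base" ]; then
        # First change today: start a fresh sequence for this date.
        echo "$base"
    else
        # Already changed today (or serial is ahead): just add one.
        echo $((old + 1))
    fi
}
```

With these assumptions, next_serial 1997072100 19970721 prints 1997072101, and next_serial 1997072100 19970722 prints 1997072200. Note that more than 100 changes in one day would run into the next day's range; the sketch makes no attempt to handle that.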

On systems that do not have perl(1) and cannot install it, we have shell scripts that provide the same functionality, though at lower speed. Performance only really matters when generating PTR records for a large domain.

dependencies

The makefile runs dnsdeps(1) whenever the named.boot file (or the named.conf or primary.zones file) is updated. The purpose is to ensure that make(1) knows about all the files that a zone file depends on. The .depend file produced looks like:

hosts/crufty.net.soa: \
	hosts/crufty.net.db \
	mx/crufty.net \
	ns.list

rev/203.12.250.soa: \
	rev/203.12.250.db \
	ns.list

.zones: \
	hosts/crufty.net.soa \
	named.local \
	rev/203.12.250.soa
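
A rough idea of how such a rule could be derived is sketched below. This is not dnsdeps(1) itself: the function names are hypothetical, only one level of $INCLUDE is followed, and file names are assumed to contain no whitespace.

```shell
# Hypothetical sketch: list the files a zone file pulls in via $INCLUDE.
zone_includes() {
    awk 'toupper($1) == "$INCLUDE" { print $2 }' "$1"
}

# Print a .depend style rule for one zone.
# $1 is the .soa file, $2 is the .db file it includes.
depend_rule() {
    soa=$1
    db=$2
    printf '%s: \\\n\t%s' "$soa" "$db"
    # Word splitting is intentional here; names with spaces are not handled.
    for inc in $(zone_includes "$db"); do
        printf ' \\\n\t%s' "$inc"
    done
    printf '\n'
}
```

Run from named's data directory, depend_rule hosts/crufty.net.soa hosts/crufty.net.db would print a rule in the same shape as the first entry above, provided the .db file contains the corresponding $INCLUDE lines.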

Reverse maps

Small sites can easily keep their in-addr.arpa zones in sync with the rest of their DNS data. For large nets or for bootstrapping, updrev(8) can be used to build in-addr .db files for all the A records found in the zone files.

Updrev works with data gleaned from a named_dump.db file and respects any existing PTR records provided that a matching A record exists. This makes updrev(8) more suitable for interaction with human hostmasters than our old mkrev.pl(8) tool. That is, updrev can be used to initially generate the reverse maps, and a human can then edit them to override the tool's choices, and this intervention need not be repeated (unlike mkrev.pl). For sites that object to using perl(1), updrev.sh is a shell version of the same logic.

The tool is reasonably efficient: updrev.pl(8) can generate or update reverse maps at about 10,000 A records per minute.

Note that updrev only supports the DNS arrangement described in this document.
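
The core transformation behind any reverse map is turning an IPv4 address into its in-addr.arpa name. The sketch below shows only that step; it is not taken from updrev, and the function name ptr_name is an assumption made for illustration.

```shell
# Hypothetical sketch: map a dotted-quad IPv4 address to the owner name
# of its PTR record. Prints nothing if the input is not four numeric octets.
ptr_name() {
    printf '%s\n' "$1" |
    awk -F. '/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
        printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1
    }'
}
```

For example, ptr_name 192.168.168.42 prints 42.168.168.192.in-addr.arpa. while a truncated address such as 192.168.168. produces no output.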

bouncedns

On a nameserver for a large network, it is not practical to reload/restart named every time a change is made. Even on a small nameserver, we want to reload named when any .soa file is updated but not as each .soa file is updated.

For this reason, the bouncedns command above simply touches a flag file to indicate that a DNS restart is needed. The same command is then run regularly from cron(8) so that, if the flag file exists, named is restarted.

Note that it is worthwhile coordinating the cron jobs on secondary servers such that the bouncedns jobs do not all run at the same time.
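
The flag-file idea can be sketched in a few lines of shell. This is not the real bouncedns: the flag path, the function names, and the way the restart command is passed in are all assumptions made for the example.

```shell
# Hypothetical flag location; the real bouncedns may use a different path.
FLAG=${FLAG:-/tmp/bouncedns.flag}

# Called from make: record that a restart is wanted, but do not restart.
request_bounce() {
    touch "$FLAG"
}

# Called from cron: run the given restart command only if a restart
# was requested, and clear the request first so it is not repeated.
bounce_if_requested() {
    [ -f "$FLAG" ] || return 0
    rm -f "$FLAG"
    "$@"
}
```

A cron job might then call something like bounce_if_requested /etc/rc_d/named restart (the exact restart command is site specific). However many .soa files make updates in one run, named is restarted at most once per cron interval.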

upddns

To update the tree that named(8) loads from we have cron(8) run dnsmagic on a regular basis. This script:
  1. runs upddns which does a cvs update in named's data directory, and if anything has been updated runs make depend. It then runs make to ensure serial numbers are up to date and to set the bouncedns flag if needed. Finally, if a .rdistrc file exists, upddns will source it - this is handy for SSLrdisting the slave.zones file to secondary servers when needed.
  2. runs bouncedns so as to action the bounce flag if present.
  3. runs /etc/rc_d/named check to ensure that named(8) is running - if for some reason it failed to restart.
The astute reader will note that doing an automated CVS update in the live tree risks updating that tree between two related commits, possibly introducing just the sort of problem we are trying to avoid. For this reason, if a file named .nocvs is present, the CVS update step is skipped.

As long as administrators are aware of the issue, the .nocvs file can be removed and automated updates allowed. When an extensive set of changes is to be performed, .nocvs should be created in the live tree to ensure no automated updates will occur until the commits are complete.

Truly rigid sites might only allow updates of the live tree to be done manually and under change management.

For many sites, the cronjob modules included with DNS Magic should prove quite useful.
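
The .nocvs convention amounts to a simple guard around the update step. The following is a sketch only; the helper names are hypothetical and the real upddns is not reproduced here.

```shell
# Hypothetical helpers around the .nocvs marker.
# In each case $1 is the live tree, e.g. /var/named.

# Succeeds only if automated CVS updates are currently allowed.
cvs_update_allowed() {
    [ ! -e "$1/.nocvs" ]
}

# Block automated updates before starting a large set of commits.
hold_updates() {
    touch "$1/.nocvs"
}

# Allow automated updates again once the commits are complete.
release_updates() {
    rm -f "$1/.nocvs"
}
```

An update script could then wrap its CVS step along the lines of: if cvs_update_allowed /var/named; then (cd /var/named && cvs update); fi.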

Installation

The instructions here assume you are using our configs tool. You can download a suitable archive of DNSMagic for unpacking within the /configs tree from http://www.crufty.net/ftp/pub/unix/DNSMagic.cpio.gz

Eventually we may provide a self contained DNSMagic archive.

Setup

Setup is quite simple thanks to dns_convert.sh. An example will probably suffice...

$ mkdir /tmp/named
$ cd /tmp/named
$ dns_convert.sh
$ ls
makefile hosts/ mx/ ns.list db.auth named.boot named.ca rev/ secondary/
$ cvs import -m"original data" named NAMED NAMED_0
$ su
# cd /var
# mv named named.old
# cvs checkout named
...
# cd named
# make
dnsdeps -N named.boot
updsoa hosts/crufty.net.soa
bouncedns
...
# cd /etc
# mv named.boot named.boot.old
# ln -s /var/named/named.boot .
# /etc/rc_d/bouncedns -f
Stopping named
Restarting named
# exit
$ cd
$ cvs checkout named
Thereafter, changes you make in ~/named can be cvs committed, and a cvs update and make in /var/named will sort it out.

Regression Testing

A corrupted primary DNS zone can bring a company to its knees. For this reason, regression testing is a must for all but trivial setups.

CVS pre-commit checks

Fortunately, CVS makes it simple to enforce regression testing before changes can be committed to the repository.

Simply add a line like:

^named/		/usr/local/share/dns/regress
to $CVSROOT/CVSROOT/commitinfo, and that command will be run whenever a commit is made to $CVSROOT/named. Most systems support starting named(8) with an alternate port and bootfile. This allows named to be started and given a chance to verify its input, without interfering with normal DNS service.

Note that if a large number of files have been updated, CVS may fail to invoke the regression suite due to too many args, or rather too long a command line. This then causes the commit to fail. The only workaround is to commit the files in several batches. The exact number of files which is too many is system dependent.

An alternative is to modify CVS such that the pre-commit filter is fed its args via stdin rather than the command line. We have a patch which does this if the filter command begins with xargs or its full path. For sites with more than 200 in-addr zone files this is a good option.

dns/regress

dns/regress is a symlink to rc.sh, so it will look for the directory dns/regress.d and perform all the checks found there (that start with an S, see rc.sh(8) for details). If all of the checks pass, then the commit proceeds.
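
The general shape of such a runner is sketched below. This is not rc.sh: it assumes each check is a standalone shell script run in a child shell, and the function name run_checks is made up for the example.

```shell
# Hypothetical sketch of an rc-style runner: execute every file in the
# given directory whose name starts with S, in the order the shell's
# glob returns them, and stop at the first one that fails.
run_checks() {
    dir=$1
    for check in "$dir"/S*; do
        [ -f "$check" ] || continue     # the glob matched nothing
        sh "$check" || {
            echo "FAILED: $check" >&2
            return 1
        }
    done
    return 0
}
```

Because only names beginning with S are considered, a module can be kept in the directory without being run, and the numeric part of the name controls ordering.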

The basic modules are (most of these do nothing if NO_NAMED is set in the environment):

dns/regress.d/S10regress.sh
See regress.sh(1) for details. It sets up the environment and, if this is the first call for the current cvs commit, starts named on a different port with a trimmed named.boot (produced by dns/Makefile) that does not contain any secondary entries. The named process is killed when dns/regress terminates. For subsequent calls by the same cvs process, we skip the above by setting NO_NAMED (which subsequent tests check), and if the original tests failed we bail out immediately. Since we rely on scanning the syslog output from named(8), we take great pains to verify that syslog is actually working before starting. Syslog can fail to log due to lack of space or simply due to bugs (at least one major UNIX vendor has a very unreliable syslogd).
dns/regress.d/S20checklog
This module simply checks the syslog output from named for problems. It is deliberately pedantic (ok, fascist), but that's what we want for regression testing. If it sees anything it is looking for, the game is over.
dns/regress.d/S20chkorigin
With the DNS setup we are advocating, there is no need for $ORIGIN records in the zone files. Used incorrectly they can cause data to disappear mysteriously (mysterious to the victim anyway). This module complains bitterly if it finds any $ORIGIN records and suggests an alternative.
dns/regress.d/S40getdb
This module causes named to dump its cache to named_dump.db and then runs getdata which produces a format which is easily searchable using grep(1). This saves us having to support a DNS client which can query named on a non-standard port. It can be omitted if no subsequent tests need to look at the data.
dns/regress.d/local.sh
This module looks for a regress.d directory within the tree being committed and if found runs the tests therein. This is a simple means for providing tests specific to a portion of your DNS data.
dns/regress.d/chkwildmx
Wild card MX's are evil. The only excuse for using them is in an external DNS which basically only provides some MX records. Note that this module is not run by default. Link it to, say, dns/regress.d/S45chkwildmx or into named/regress.d, as it needs S40getdb to have run first. It simply checks that there is at least one wildcard MX record for each domain in $WILD_MX; if not, it complains.
dns/regress.d/S70chkcvs
This module runs cvs update to see which files have not been added or committed to CVS. It then runs make .depend to get the list of files that named(8) will need when it reloads. If any of the needed files have not been added to CVS, it generates an error. If any needed files have been added but not yet committed, it issues a warning to that effect. The goal is to avoid committing files that rely on others which have not been committed and thus will not be available to the live named(8).
dns/regress.d/S90cleanup
Just as the name implies.
The simple process of feeding the DNS config into named(8) will pick up the majority of errors. Sites with complex requirements may well find it necessary to add specific tests. Note that the numbering above is quite sparse so it is simple to instantiate new tests.
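
A site-specific test can be very small. The sketch below shows two checks in the spirit of the log and $ORIGIN modules described above; they are not the shipped modules, the function names are hypothetical, and the list of log patterns is purely illustrative.

```shell
# Hypothetical log check: fail if the captured named log contains
# anything that looks like a complaint. Matching lines are printed.
check_log() {
    if grep -E -i 'error|rejected|invalid|No such file|unexpected' "$1"; then
        return 1
    fi
    return 0
}

# Hypothetical $ORIGIN check: fail if any of the given zone files
# contains an $ORIGIN directive. /dev/null forces grep to print file names.
check_origin() {
    if grep -n -i '^\$ORIGIN' "$@" /dev/null; then
        return 1
    fi
    return 0
}
```

Saved as, say, S50something in a regress.d directory and wrapped so that a non-zero return becomes the script's exit status, either function would stop the commit on failure.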

As mentioned above, if the variable NO_NAMED is set in the environment, then the above tests do very little. Presumably other tests will check the validity of the data in this case. Note that if a group of changes is to be committed individually, then loading up named each time is overkill. This is the main reason for the variable NO_NAMED: it is set by regress.sh if it detects that it is not the first child of a CVS process and that the original did not fail.

Forcing a commit

If the variable FORCE_COMMIT is set in the environment, then dns/regress.d/regress.sh terminates dns/regress immediately and no checking is done. This should be used with caution.
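
The interaction of the two variables can be summarized in a short sketch. It assumes that a non-empty value counts as "set" and that FORCE_COMMIT takes precedence, since it ends the run before anything else happens; the function name regress_mode is hypothetical.

```shell
# Hypothetical sketch: decide how much checking a regress run would do.
# Prints one of: force, no-named, full.
regress_mode() {
    if [ -n "${FORCE_COMMIT:-}" ]; then
        echo force          # skip every check
    elif [ -n "${NO_NAMED:-}" ]; then
        echo no-named       # run the modules, but without starting named
    else
        echo full           # start a test named and run everything
    fi
}
```

A hostmaster who really must bypass the checks might run something like FORCE_COMMIT=1 cvs commit, which is exactly why this should be used with caution.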

BIND Versions

If the variable BINDVERS is not set, but named.conf exists in the working directory, then it will be set to 9 if /etc/rndc.* also exists, otherwise 8. If named.conf does not exist, BINDVERS will be set to 4.

Note that BIND-9 versions prior to 9.3 are not supported.

For BIND-8 or BIND-9 dns/regress.d/S10regress.sh knows to use named.conf.test and to look for the named_dump.db in named's working directory.
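
The BINDVERS defaulting rule described at the start of this section can be sketched as below. The function name and its two directory parameters are assumptions added so the logic can be tried outside a real installation; this is not the code used by the regression suite.

```shell
# Hypothetical sketch of the BINDVERS default.
# $1 is named's working directory (default: current directory).
# $2 is the directory searched for rndc.* (default: /etc).
detect_bindvers() {
    dir=${1:-.}
    etc=${2:-/etc}
    if [ -n "${BINDVERS:-}" ]; then
        echo "$BINDVERS"                    # an explicit setting wins
    elif [ -f "$dir/named.conf" ]; then
        if ls "$etc"/rndc.* >/dev/null 2>&1; then
            echo 9                          # named.conf plus rndc config
        else
            echo 8                          # named.conf only
        fi
    else
        echo 4                              # no named.conf: named.boot era
    fi
}
```

Setting BINDVERS explicitly in the environment remains the way to override the guess, for example on a host that has rndc files installed but still runs BIND-8.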

Example

;; this zone file has a number of problems that dns/regress
;; will vomit on.  See how many tries it takes to fix them all
;; --sjg
@       IN      SOA     ns.crufty.net. hostmaster.crufty.net.  (
                                1.2 	; Last changed by - sjg
                                7200    ; Refresh 2 hour
                                1800	; Retry 1/2 hour
                                14400 )	; Minimum
$INCLUDE n.list

cool	IN	A	192.168.168.42
	IN	MX	100 cool

foo	IN	A	192.168.168.1
	IN	A	192.168.168.2
	IN	A	192.168.168.
	IN	MX	foo

fool	IN	CNAME	foo
	IN	MX	foo
A first run through regress produces:

Jul 21 12:36:27 gate named[10296]: /tmp/tnamed/named.boot.test: line 16: zone "test." has trailing dot
Jul 21 12:36:27 gate named[10648]: n.list: No such file or directory
Jul 21 12:36:27 gate named[10296]: test.db:5: decimal serial number interpreted as 10002
Jul 21 12:36:27 gate named[10296]: test.db: line 21: unexpected EOF
Jul 21 12:36:27 gate named[10296]: test.db: line 21: database format error (ns.crufty.net.)
Jul 21 12:36:27 gate named[10296]: primary zone "test" rejected due to errors (serial 10002)
After fixing those...
;; this zone file has a number of problems that dns/regress
;; will vomit on.  See how many tries it takes to fix them all
;; --sjg
@       IN      SOA     ns.crufty.net. hostmaster.crufty.net.  (
                                1997072100 ; Last changed by - sjg
                                7200	; Refresh 2 hour
                                1800	; Retry 1/2 hour
				3600000	; Expire 1000 hours
                                14400 )	; Minimum
$INCLUDE ns.list

cool	IN	A	192.168.168.42
	IN	MX	100 cool

foo	IN	A	192.168.168.1
	IN	A	192.168.168.2
	IN	A	192.168.168.
	IN	MX	foo

fool	IN	CNAME	foo
	IN	MX	foo
Regress says:
Jul 21 12:43:29 gate named[10472]: test.db: line 17: database format error (192.168.168.)
Jul 21 12:43:29 gate named[10472]: test.db: line 18: database format error (foo)
Jul 21 12:43:29 gate named[10472]: test.db: line 21: database format error (foo)
Jul 21 12:43:29 gate named[10472]: primary zone "test" rejected due to errors (serial 1997072100)
Ok, this time for sure...
;; this zone file has a number of problems that dns/regress
;; will vomit on.  See how many tries it takes to fix them all
;; --sjg
@       IN      SOA     ns.crufty.net. hostmaster.crufty.net.  (
                                1997072100 ; Last changed by - sjg
                                7200	; Refresh 2 hour
                                1800	; Retry 1/2 hour
				3600000	; Expire 1000 hours
                                14400 )	; Minimum
$INCLUDE ns.list

cool	IN	A	192.168.168.42
	IN	MX	100 cool

foo	IN	A	192.168.168.1
	IN	A	192.168.168.2
	IN	A	192.168.168.3
	IN	MX	10 foo

fool	IN	CNAME	foo
	IN	MX	10 foo
Oops:
Jul 21 12:45:32 gate named[10529]: fool.test has CNAME and other data (invalid)
Well at least now named is happy.

BIND-8

BIND-8 incorporates many improvements over earlier releases. All of the tools described here know how to handle BIND-8's named.conf file (cf. named.boot from earlier versions) but a few points are worth noting:

named.conf
BIND-8's config is much more complex than the old named.boot. BIND-8 allows the config to be split up by use of include files - much like the $INCLUDE records in the zone files. Thus we use a number of include files to handle our primary and secondary zone files.
primary.zones
Lists our primary zone definitions.
slave.zones
Lists our slave zone definitions. This file is automagically generated from primary.zones and is for loading on our secondary servers, which include it rather than primary.zones.
secondary.zones
This file contains the zones for which we are a slave (secondary in previous releases), other than the ones that originated in primary.zones. It is expected to be manually maintained and by default is empty.
pid-file
There is a bug in bind-8.1.1 such that it frobbs the default pid-file (/var/run/named.pid) before reading the config. This causes the following problems with our regression test model: Naturally we have a patch which solves this problem.

logging

Logging in BIND-8 is also much more configurable, and some things that we want the regression suite to pick up are no longer logged by default.

We put the following into named.conf:

logging {
	// we want to know about all problems
	// so that the regression suite will pick them up
	// we only need this on the master.
	category cname { default_syslog; };
	category lame-servers { default_syslog; };
	category insist { default_syslog; };
	// we may also want some of these
	category xfer-out { default_syslog; };
	category statistics { default_syslog; };
	category update { default_syslog; };
	category security { default_syslog; };
	category os { default_syslog; };
	category notify { default_syslog; };
	category response-checks { default_syslog; };
	category maintenance { default_syslog; };
};

BIND-9

BIND-9 is a complete re-write of BIND and is incompatible with earlier versions in several ways.

Cannot listen on port 0
BIND-8 allows us to set the listen port to 0 (gives us a random high numbered port) when running the regression suite, this is not allowed with BIND-9 so we have to revert to picking a port and hoping it is unused. This is far from ideal.
named exits on SIGINT
To a large extent BIND-9 abandons use of signals for controlling named, so we have to detect BIND-9 and use rndc(8) instead for many operations. We use 'rndc dumpdb -all', and rndc blocks until the dump is complete, so this is actually an improvement.
In 9.2 and earlier named_dump.db does not contain any authoritative data
This was the show stopper for supporting BIND-9. In BIND-9 (9.2 and earlier) named_dump.db contains only cached data, which is not what we want for generating/validating PTR RR's and other purposes. This is fixed in BIND 9.3 though.
BIND-9 has different logging categories
This makes it necessary to generate a different named.conf for BIND-9 and BIND-8. This is an issue for sites trying to use both versions.
The net result is that we can only support BIND-9.3 or later.

It is possible to run both BIND-8 and BIND-9. We used to run BIND-9 on our gateway NS, and BIND-8 on internal NS's. In particular the NS where cvs commits are done ran BIND-8. On the gateway NS, we use two views:

// master for our external view
view external {
	match-clients { ! internals; any; };

	zone "." {
		type hint;
		file "named.ca";
	};

	zone "127.in-addr.arpa" {
		type master;
		file "named.local";
	};

	include "primary.zones";
	include "secondary.zones";
};

// a slave server for our internal view
view internal {
	match-clients { internals; };

	zone "127.in-addr.arpa" {
		type master;
		file "named.local";
	};

	include "internal-slave.zones";
};
It requires some kludges in Makefile.inc to convert the BIND-9 named.conf into a named.conf.test that is acceptable to BIND-8 so that the regression suite can still be used, but it works.

Upgrading to BIND 9.3 is recommended.


$Id: dnsmagic.html,v 1.40 2006/03/20 22:38:02 sjg Exp $
Copyright © 1997-2001 Simon J. Gerraty