Droms R. The DHCP handbook


CHAPTER 18 Configuring a Failover Server

Example 18.1 Continued

    pool {
        deny dynamic bootp clients;
        range 10.0.2.10 10.0.2.209;
    }
}

The configuration file for Server B is shown in Example 18.2. As you can see, there are four differences between the two configuration files:

• The domain-name-servers option is different.
• The allocation range for subnet 10.0.1.0 is different.
• Server A serves subnet 10.0.2.0, and Server B does not.
• Server B serves subnet 10.0.3.0, and Server A does not.

Example 18.2

option domain-name "example.org";
option domain-name-servers 10.0.1.18;

subnet 10.0.1.0 netmask 255.255.255.0 {
    option routers 10.0.1.1;
    pool {
        deny dynamic bootp clients;
        range 10.0.1.110 10.0.1.209;
    }
}

subnet 10.0.3.0 netmask 255.255.255.0 {
    option routers 10.0.3.1;
    pool {
        deny dynamic bootp clients;
        range 10.0.3.10 10.0.3.209;
    }
}

In order to merge these two configuration files into one, you must resolve the differences between them. The first difference is actually the most difficult to resolve. Each DHCP server sends a different domain-name-servers option. This ensures that each name server serves half of the DHCP clients on the network. There is no way to preserve the exact behavior of the two disjoint servers in this case, but what you can do is use different name servers for different pools. Example 18.3 shows one way to do this.

Example 18.3

option domain-name "example.org";

subnet 10.0.1.0 netmask 255.255.255.0 {
    option routers 10.0.1.1;
    pool {
        option domain-name-servers 10.0.1.17;
        deny dynamic bootp clients;
        range 10.0.1.10 10.0.1.109;
    }
    pool {
        option domain-name-servers 10.0.1.18;
        deny dynamic bootp clients;
        range 10.0.1.110 10.0.1.209;
    }
}

subnet 10.0.2.0 netmask 255.255.255.0 {
    option routers 10.0.2.1;
    option domain-name-servers 10.0.1.17;
    pool {
        deny dynamic bootp clients;
        range 10.0.2.10 10.0.2.209;
    }
}

subnet 10.0.3.0 netmask 255.255.255.0 {
    option routers 10.0.3.1;
    option domain-name-servers 10.0.1.18;
    pool {
        deny dynamic bootp clients;
        range 10.0.3.10 10.0.3.209;
    }
}

The second difference is that each server is serving a different pool on subnet 10.0.1.0. There are two ways to solve this problem. The first is to copy the pool declarations into the master configuration file unchanged, as shown in Example 18.3. The second is to merge them. In this example, we have chosen to keep the two pool declarations separate so that we can send a different domain-name-servers option, depending on the pool from which a client’s address comes.

The third and fourth differences are easily solved. We simply copy the two subnet declarations, along with their pools, into the master configuration. In order to preserve the domain-name-servers behavior from the disjoint configuration, we move the option domain-name-servers statement into each subnet declaration. The final result is shown in Example 18.4.

Example 18.4

option domain-name "example.org";

subnet 10.0.1.0 netmask 255.255.255.0 {
    option routers 10.0.1.1;
    pool {
        option domain-name-servers 10.0.1.17;
        deny dynamic bootp clients;
        range 10.0.1.10 10.0.1.109;
    }
    pool {
        option domain-name-servers 10.0.1.18;
        deny dynamic bootp clients;
        range 10.0.1.110 10.0.1.209;
    }
}

subnet 10.0.2.0 netmask 255.255.255.0 {
    option routers 10.0.2.1;
    option domain-name-servers 10.0.1.17;
    pool {
        deny dynamic bootp clients;
        range 10.0.2.10 10.0.2.209;
    }
}

subnet 10.0.3.0 netmask 255.255.255.0 {
    option routers 10.0.3.1;
    option domain-name-servers 10.0.1.18;
    pool {
        deny dynamic bootp clients;
        range 10.0.3.10 10.0.3.209;
    }
}
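When two disjoint configurations are merged by hand, it is easy to end up with pools that share addresses. The following Python sketch is one way to check for that before deploying the master file. It is not part of the ISC distribution; the function name and the hard-coded list of ranges are illustrative assumptions, with the ranges copied from the merged configuration.

```python
from ipaddress import IPv4Address

# Inclusive pool ranges copied from the merged master configuration.
POOLS = [
    ("10.0.1.10", "10.0.1.109"),
    ("10.0.1.110", "10.0.1.209"),
    ("10.0.2.10", "10.0.2.209"),
    ("10.0.3.10", "10.0.3.209"),
]


def overlapping_pools(pools):
    """Return every pair of ranges that share at least one address."""
    spans = sorted(
        (int(IPv4Address(lo)), int(IPv4Address(hi)), (lo, hi)) for lo, hi in pools
    )
    clashes = []
    for i, (_, hi1, first) in enumerate(spans):
        for lo2, _, second in spans[i + 1:]:
            if lo2 > hi1:
                break  # sorted by start address, so no later range can overlap
            clashes.append((first, second))
    return clashes


if __name__ == "__main__":
    print(overlapping_pools(POOLS) or "no overlapping pools")
```

An empty result means that no address appears in more than one pool, which is what the merged configuration requires.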


After you have created your master DHCP configuration file, you must establish some kind of discipline for how you are going to maintain it. If you do not, the copies of the configuration on the two servers will drift apart over time, and that makes them difficult to maintain. Many ISC DHCP users keep their DHCP configurations in an open-source product called Concurrent Versions System (CVS; see www.cvshome.org).

Configuring the Cooperating Partners Relationship

The configuration for each member of a simple DHCP failover pair is almost identical, but there is one major difference: the portion of each configuration that describes the failover relationship. The failover configuration differs on the two servers for two reasons. First, each server’s configuration has to describe how to contact the other server, and second, one server is primary and one is secondary. Other than this, the two configurations should not be different from one another.

Failover Configuration Parameters

To configure a failover relationship on the ISC server, you need to write a failover peer declaration. If two DHCP servers have a failover relationship, a failover peer declaration for that relationship must appear in the configuration file of each server. Each failover peer declaration has a name and a sequence of data defining the relationship. For our configuration example, we use the values shown in Table 18.2, which are consistent with the cooperating partners relationship.

TABLE 18.2 Failover Peer Declaration Information

Setting                            Value on Primary    Value on Secondary
Contact address                    10.0.1.1            10.0.1.2
Contact port                       847                 647
Partner address                    10.0.1.2            10.0.1.1
Partner port                       647                 847
Contact timeout (seconds)          180                 180
Maximum pending updates            100                 100
MCLT (seconds)                     1800                Not applicable
Free address balance               50%                 50%
Load balance split                 50%                 Not applicable
Load balance override (seconds)    3                   3

The following sections describe these settings.

Setting Server Roles

The role setting defines the role that the server plays in the failover pair. One server is always defined as the primary server, and one server is always defined as the secondary server. The designation of primary or secondary is used to determine which server goes first in certain protocol negotiations. In the ISC DHCP server, there is never any operational difference between how the primary and secondary servers act, so it doesn’t matter which server is primary and which is secondary. The role setting in the ISC server is specified with the primary or secondary keyword.

Setting Contact and Partner Address and Port

Each server has a contact port and contact IP address: these are the port number and IP address on which the server listens for connections from its partner. The partner address and port are the IP address and port to which each server tries to connect when it is out of communication with its peer. The contact address and port of the primary server are always the partner address and port of the secondary, and vice versa. This is a requirement, even for a server that has more than one IP address, because the contact address is used as an identifier for the connection; if either server sends a contact address that is not the one the other server is expecting to receive, the other server refuses the connection. You should use the port numbers shown in Table 18.2 unless you have some specific reason to use different port numbers.

In the ISC server, the contact address is specified with the address statement, and the contact port is specified with the port statement. The partner address is specified with the peer address statement, and the partner port is specified with the peer port statement.
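Because each server's contact address and port must match the other server's partner address and port, a quick consistency check can catch typing mistakes before the servers refuse each other's connections. The following Python sketch illustrates the rule using the values from Table 18.2. The dictionary keys and the function name are assumptions made for this illustration; they are not ISC configuration syntax.

```python
# Values from Table 18.2; the key names are illustrative only.
PRIMARY = {"address": "10.0.1.1", "port": 847,
           "peer_address": "10.0.1.2", "peer_port": 647}
SECONDARY = {"address": "10.0.1.2", "port": 647,
             "peer_address": "10.0.1.1", "peer_port": 847}


def mirror_errors(first, second):
    """List every place where one server's contact setting does not
    equal the other server's matching partner setting."""
    errors = []
    for contact, partner in (("address", "peer_address"), ("port", "peer_port")):
        if first[contact] != second[partner]:
            errors.append(
                f"first {contact} {first[contact]} != second {partner} {second[partner]}"
            )
        if second[contact] != first[partner]:
            errors.append(
                f"second {contact} {second[contact]} != first {partner} {first[partner]}"
            )
    return errors


if __name__ == "__main__":
    print(mirror_errors(PRIMARY, SECONDARY) or "settings mirror each other")
```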

Contact Timeout

The contact timeout determines how long a server will wait without receiving any messages from its partner before it assumes that the connection to its partner has failed. In this example, we have chosen a timeout of 180 seconds, or 3 minutes. This allows either server to quickly notice a connection failure with its partner, but it prevents a temporary network outage (for example, a wire being unplugged from one port and plugged into another) from breaking the connection between the servers prematurely. In the ISC server, the contact timeout is specified with the max-response-delay statement.

Maximum Number of Pending Updates

The maximum number of pending updates defines the number of updates that the server can accept without blocking input. This parameter might be a hard limit configured by the server, but the ISC DHCP server always processes updates as they arrive, so the only reason to choose a particular value for this parameter with the ISC server is to avoid having the partner send too few updates at a time. In general there is no need to configure this parameter, but we mention it here for completeness. If your DHCP server does not insist that you choose a value for this parameter, you shouldn’t try to configure it. You do not need to configure it on the ISC server; 100 is the default. But if you want to configure this in the ISC server, use the max-pending-updates keyword.


MCLT

The MCLT is the maximum amount of time by which either server can extend a lease without contacting the other server. This value has to be a compromise between client lease time and recovery time. You should choose a value that is reasonably long so that clients that get a lease that is MCLT seconds long have a useful lease that won’t lead to instability for them. You must not choose a value that is too long because the MCLT is also the recovery interval for the server. That is, the longer the MCLT is, the longer it takes to return to normal failover operations after a server failure.

One other point to consider is that if clients generally get short leases, they need to renew more often than if they had longer leases. If the normal lease interval for a client is 5 hours and the MCLT is 30 minutes, when the servers are operating in the COMMUNICATIONS-INTERRUPTED state, the load on each server is 10 times as great. If the load on the servers is very light, as is the case with most DHCP servers, this really isn’t a problem. However, if your DHCP server is serving a very large network, it might not be able to gracefully handle a tenfold increase in load.
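The tenfold figure follows from dividing the normal lease interval by the MCLT. The short Python sketch below makes that arithmetic explicit so that you can try other values. It rests on a simplifying assumption that is not stated in the text: renewal traffic is taken to be inversely proportional to lease length. The function name is hypothetical.

```python
def renewal_load_factor(normal_lease_seconds, mclt_seconds):
    """Estimate how many times more renewals a server sees when leases
    shrink from the normal interval to the MCLT.

    Assumes renewal traffic is inversely proportional to lease length.
    """
    if normal_lease_seconds <= 0 or mclt_seconds <= 0:
        raise ValueError("lease times must be positive")
    # Leases are never longer than normal, so the factor is at least 1.
    return max(1.0, normal_lease_seconds / mclt_seconds)


if __name__ == "__main__":
    # 5-hour normal lease, 30-minute MCLT, as in the example above.
    print(renewal_load_factor(5 * 3600, 30 * 60))
```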

NOTE

To better understand the implications of using short leases, refer to the section in Chapter 10 titled “Operation in the PARTNER-DOWN State,” particularly the part that talks about the start time of state. You should also read the section of Chapter 10 titled “Operation in the RECOVER State.” The section titled “Operation in the COMMUNICATIONS-INTERRUPTED State” is helpful for understanding why a really short value for MCLT could be a problem.

The MCLT is configured only on the primary server, in order to avoid disagreements between the primary and secondary servers about its value. On the ISC server, MCLT is configured by using the mclt statement.

Free Address Balance

The free address balance is the balance that the primary server tries to strike between IP addresses in the FREE state and in the BACKUP state. Addresses in the FREE state are available for allocation on the primary server, and addresses in the BACKUP state are available for allocation on the secondary server. The ISC DHCP server does not support configuring the free address balance; it only supports a balance of 50% free/50% backup.

Some other DHCP servers that implement the failover protocol support other free address balances. The 50% free/50% backup balance works both for the cooperating partners relationship and the primary/backup relationship. For the backing store relationship, the balance should be 100% free/0% backup, assuming that the backing store server is configured as the secondary server.


Load Balance Split

The load balance split tells the primary server what portion of all clients it should serve. Each client is assigned to one of 256 different groups, according to the identification information it sends. Using the load balance split, the primary server constructs an array of 256 values and sets some elements of the array to one and some to zero. If a client’s entry in the array is one, the primary server serves that client. If it is zero, the secondary server serves the client. This allows the two failover peers to split the workload. The ISC DHCP server allows you to specify either the load balance split or the individual values in the 256-element array. In general, there’s no reason to specify the bitmask directly, and in this example we specify the split.

You can use the load balance split to put two DHCP servers into a primary/backup relationship. In a primary/backup relationship, the primary server serves all clients and the backup server serves none, unless the primary server doesn’t respond for some reason. This corresponds to a split value of 256; that is, all 256 elements in the array are set to one.

On the ISC server, the load balance split is configured with the split keyword, and if you want to specify the load balance array directly, you use the hba keyword.
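To make the 256-element array concrete, the Python sketch below builds an array from a split value and looks up which server would answer a given client. Two things in it are assumptions made for illustration and are not taken from the ISC source: the ones are placed in the first split entries of the array, and the grouping function is a simple stand-in rather than the hash that the failover protocol actually uses.

```python
def build_array(split):
    """Build the 256-element load balance array for a split value.

    Assumption: the first `split` entries are set to one (primary)
    and the remaining entries to zero (secondary).
    """
    if not 0 <= split <= 256:
        raise ValueError("split must be between 0 and 256")
    return [1] * split + [0] * (256 - split)


def stand_in_group(client_id: bytes) -> int:
    """Map a client identifier to one of 256 groups.

    This is NOT the real failover hash; it only sums the bytes so
    that the example stays self-contained.
    """
    return sum(client_id) % 256


def served_by(client_id: bytes, array) -> str:
    """Return which server answers the client under normal operation."""
    return "primary" if array[stand_in_group(client_id)] else "secondary"


if __name__ == "__main__":
    array = build_array(128)            # the 50% split used in this chapter
    print(sum(array), "of 256 groups go to the primary")
    print(served_by(bytes([0x00, 0x01]), array))
```

With a split of 256, every entry is one and the primary serves all clients, which is the primary/backup arrangement described above.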

Load Balance Override

The load balance override parameter determines when the primary or secondary server will bypass load balancing and respond to the client even if the client is supposed to be served by the other server. Every message from a DHCP client includes a secs field, which indicates for how many seconds the DHCP client has been trying to contact a DHCP server. If the value of the secs field in a DHCP message is greater than the load balance override parameter, the DHCP server always attempts to respond to the client, regardless of the load balance split. On the ISC server, the load balance override is specified by using the load balance max seconds keyword.
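The override rule reduces to a single comparison. The following Python sketch expresses it, using the override value of 3 from Table 18.2 as the default. The function and parameter names are hypothetical and chosen only for this illustration.

```python
def should_respond(assigned_to_me: bool, secs: int, override: int = 3) -> bool:
    """Decide whether this server answers a client message.

    assigned_to_me: True if the load balance array gives this client
                    to this server.
    secs:           value of the secs field in the client's message.
    override:       the load balance override parameter.
    """
    # Answer our own clients, and answer anyone who has been
    # waiting longer than the override allows.
    return assigned_to_me or secs > override


if __name__ == "__main__":
    for secs in (0, 3, 4):
        print(secs, should_respond(False, secs))
```

Note that the comparison is strict: a client whose secs field equals the override value is still left to the other server.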

The ISC Failover Configurations

Given the parameters defined in the preceding sections, we now need to write two failover declarations. The first, shown in Example 18.5, is the configuration for the primary server. In order to combine the primary failover configuration with the master DHCP server configuration, the primary configuration file needs to include the master configuration. In this example, the primary configuration is in the file /etc/dhcpd.conf, and the master configuration is in the file /etc/dhcpd.master.


Example 18.5

failover peer "example" {
    primary;
    address 10.0.1.1;
    port 847;
    peer address 10.0.1.2;
    peer port 647;
    max-response-delay 180;
    mclt 1800;
    split 128;
    load balance max seconds 3;
}

include "/etc/dhcpd.master";

The failover configuration for the secondary server is shown in Example 18.6.

Example 18.6

failover peer "example" {
    secondary;
    address 10.0.1.2;
    port 647;
    peer address 10.0.1.1;
    peer port 847;
    max-response-delay 180;
    load balance max seconds 3;
}

include "/etc/dhcpd.master";
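Because the two declarations differ only in role, in the direction of the address and port settings, and in the primary-only mclt and split statements, one way to keep them consistent is to generate both from a single set of values. The Python sketch below does this. It is an illustration only: the function name and its parameters are assumptions, and it emits just the statements that appear in Examples 18.5 and 18.6.

```python
def render_failover_peer(name, role, local, remote,
                         max_response_delay, override_secs,
                         mclt=None, split=None):
    """Render one failover peer declaration as text.

    local and remote are (address, port) tuples for this server and
    its partner. mclt and split are written only for the primary.
    """
    if role not in ("primary", "secondary"):
        raise ValueError("role must be 'primary' or 'secondary'")
    if role == "primary" and (mclt is None or split is None):
        raise ValueError("the primary needs both mclt and split")
    lines = [
        f'failover peer "{name}" {{',
        f"    {role};",
        f"    address {local[0]};",
        f"    port {local[1]};",
        f"    peer address {remote[0]};",
        f"    peer port {remote[1]};",
        f"    max-response-delay {max_response_delay};",
    ]
    if role == "primary":
        lines.append(f"    mclt {mclt};")
        lines.append(f"    split {split};")
    lines.append(f"    load balance max seconds {override_secs};")
    lines.append("}")
    return "\n".join(lines)


SERVER_A = ("10.0.1.1", 847)
SERVER_B = ("10.0.1.2", 647)

primary_conf = render_failover_peer(
    "example", "primary", SERVER_A, SERVER_B, 180, 3, mclt=1800, split=128)
secondary_conf = render_failover_peer(
    "example", "secondary", SERVER_B, SERVER_A, 180, 3)

if __name__ == "__main__":
    print(primary_conf)
    print()
    print(secondary_conf)
```

Swapping the local and remote arguments for the secondary is what guarantees that the address and port settings mirror each other.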

Final Failover Configuration Details

The example configuration files presented so far are very nearly complete, but there are two last details that we haven’t talked about yet. First, because the ISC server supports arbitrarily complex failover configurations, it does not assume that every address pool mentioned in the configuration file is part of a single failover relationship defined at the top of the file. You must explicitly define the failover relationship for each pool. You do this by writing a failover reference within the pool statement. Example 18.7 shows an example of a failover reference statement in one of the pools from Example 18.3.


Example 18.7

pool {
    failover peer "example";
    option domain-name-servers 10.0.1.17;
    deny dynamic bootp clients;
    range 10.0.1.10 10.0.1.109;
}

The second detail is that the ISC DHCP server does not support failover on address allocation pools that contain addresses allocated to BOOTP clients. So if you try to configure a pool for failover but leave out the deny dynamic bootp clients statement, the DHCP server reports an error and refuses to run. To correct this error, you simply add a deny dynamic bootp clients statement to the pool declaration. Be careful not to confuse the deny dynamic bootp clients permit statement with the deny bootp statement. The deny bootp statement does not work in a pool declaration, and it does not correct this error.
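A small script can find pools that would trigger this error before you restart the server. The Python sketch below scans configuration text for pool declarations that reference a failover peer but lack the deny dynamic bootp clients statement. It is a rough check rather than a real parser: it assumes that pool declarations contain no nested braces, as in the examples in this chapter, and the function name is hypothetical.

```python
import re

# Matches a pool declaration whose body contains no nested braces.
POOL_RE = re.compile(r"\bpool\s*\{([^{}]*)\}")


def pools_missing_bootp_deny(conf_text):
    """Return the bodies of failover pools that lack the required
    'deny dynamic bootp clients' statement."""
    missing = []
    for match in POOL_RE.finditer(conf_text):
        body = match.group(1)
        if "failover peer" in body and "deny dynamic bootp clients" not in body:
            # Collapse whitespace so the report fits on one line.
            missing.append(" ".join(body.split()))
    return missing


SAMPLE = '''
pool {
    failover peer "example";
    deny dynamic bootp clients;
    range 10.0.1.10 10.0.1.109;
}
pool {
    failover peer "example";
    range 10.0.1.110 10.0.1.209;
}
'''

if __name__ == "__main__":
    for body in pools_missing_bootp_deny(SAMPLE):
        print("missing deny statement:", body)
```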

Operating a Failover Pair

After you have configured your failover servers, you can begin to use them. The first step in using them is to get them to communicate with one another. After you do this, they are operational.

Starting the Servers for the First Time

When you start a failover pair for the first time, the two servers generally refuse to do anything until they have synchronized with each other. So the first order of business is to get them to talk to each other. If you have configured the servers correctly, this should work without any trouble. If the two servers don’t seem to be able to communicate, you should check the address and port settings carefully.

The ISC DHCP server does not trust the timekeeping protocol described in the failover protocol specification. Instead, it requires that you keep the system clocks on both failover partners synchronized. If the system clocks on the two servers are not synchronized, the servers refuse to talk to each other, and you see messages in the system log telling you that the clocks aren’t synchronized. The ISC server doesn’t require any particular synchronization mechanism; it is fine to synchronize them by hand. However, it’s much easier to synchronize them by using the Network Time Protocol (NTP). NTP clients are available for most operating systems, so this should not be a serious problem.

When the two servers start for the first time, they start in the RECOVER state. After they have established communications, each server sends the other server a complete list of all the leases it has. Through this process, the two servers synchronize their lease databases. This is why it’s a mistake to copy the lease database from a standalone server to its partner when you convert it to a failover pair; if you do that, both servers have identical lease files, and they take twice as long to synchronize.

When the servers are synchronized, they might both wait out the MCLT before beginning to serve clients. This is the behavior required by the failover protocol when a server is in the RECOVER state. However, if you are starting up for the first time, both servers are in the RECOVER state, which isn’t a desirable situation. Some DHCP servers, including the ISC DHCP server, bypass the waiting period if they detect that both servers are in the RECOVER state because this can usually only happen the first time two servers are configured to do failover.

Normal Operations

After the servers have synchronized, they begin normal operations. This doesn’t mean only the NORMAL failover state; normal operations refers to all the failover states described in Chapter 10. During normal operations, two sorts of failover log messages are worth watching for: lease update messages and failover state messages.

When the state of either failover partner changes, you see a message in the log for that state change. The most usual state changes are from NORMAL to COMMUNICATIONS-INTERRUPTED and from COMMUNICATIONS-INTERRUPTED to NORMAL. You see a message about this on one server whenever the other server is stopped. More rarely, you see this message when the network connection between the two servers has failed.

The second sort of log message is a binding update message. The ISC DHCP server is usually quiet about binding update messages. The only time you hear about them in the log is when they fail. The only real reason a binding update would fail is if the server is buggy or the two servers have lease databases that have gone out of sync.

Operational Problems

During operations, a variety of problems can come up. Some of them have to do with the fact that the failover protocol is very new, and existing implementations might still have bugs to work out. Others are just normal operational problems that can come up even if the DHCP servers are not at all buggy.

Server Down

When one server in a failover pair goes down, the other server continues to provide service, but in a limited mode called the COMMUNICATIONS-INTERRUPTED state. To learn more about this state, see Chapter 10. Because of the limitations of COMMUNICATIONS-INTERRUPTED, if the server that has gone down isn’t expected to come back up quickly, it’s good to put the other server into the PARTNER-DOWN state. In the PARTNER-DOWN state, the remaining DHCP server can, after waiting for the MCLT, completely take over DHCP service on the network, including reclaiming all of the down server’s IP addresses.
