Communication Between Failover Peers
Failover peers communicate with each other by using a persistent TCP connection.
The failover protocol is asynchronous—that is, either peer can send a message to the
other peer at any time, and there is no restriction placed on the order of the
responses.
Either failover peer can connect to the other; this allows a failover connection to be
established as soon as the second failover peer starts, whether the primary or the
secondary peer starts second. When a connection is established, whether the
secondary or the primary peer initiated the connection, the primary peer sends a
CONNECT message. This message contains identification and authentication information,
as well as some information about how the primary peer is configured, in
particular, what the MCLT is. If the secondary peer recognizes the primary peer and
is able to authenticate it, it sends a CONNECTACK message. This message contains
authentication information that is similar to that in the CONNECT message, as well as
configuration information from the secondary peer. After these two messages have
been successfully exchanged, the peers can communicate normally.
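
The exchange described above can be sketched in a few lines of Python. This is an illustrative model only, not the wire format of the protocol: the class names, the field names (peer_name, auth_token, mclt_seconds, config), and the simple shared-token check are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Connect:
    peer_name: str     # identification (hypothetical field name)
    auth_token: str    # authentication information (hypothetical field name)
    mclt_seconds: int  # the primary's configured MCLT


@dataclass
class ConnectAck:
    peer_name: str
    auth_token: str
    config: dict       # configuration information from the secondary


class Secondary:
    def __init__(self, name, auth_token, known_primaries, config):
        self.name = name
        self.auth_token = auth_token
        self.known_primaries = known_primaries  # primary name -> expected token
        self.config = config

    def handle_connect(self, msg):
        """Return a ConnectAck if the primary is recognized and authenticated."""
        expected = self.known_primaries.get(msg.peer_name)
        if expected is None or expected != msg.auth_token:
            return None  # unknown primary or failed authentication
        return ConnectAck(self.name, self.auth_token, dict(self.config))


def establish(connect_msg, secondary):
    """The primary sends CONNECT regardless of which side opened the TCP connection."""
    ack = secondary.handle_connect(connect_msg)
    return ack is not None, ack
```

Calling establish(Connect("dhcp-a", "s3cret-a", 3600), secondary) returns a success flag together with the acknowledgement, or (False, None) when the secondary does not recognize the primary.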
After the failover peers have established a connection, they tell each other what state
they are in, and if necessary, the two peers synchronize their IP address databases.
This process is described in more detail in the section “Operation in the RECOVER
State,” later in this chapter. When the servers initially connect, after any
synchronization has been done, the two failover peers balance each address allocation pool,
making sure that each peer starts out with roughly the same number of IP addresses.
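
As a rough illustration of the initial balancing step, the sketch below splits the free addresses of one pool into two nearly equal halves. The function name balance_pool and the rule that the primary receives the extra address when the count is odd are assumptions for the example; the text above says only that each peer starts with roughly the same number of addresses.

```python
import ipaddress


def balance_pool(free_addresses):
    """Split a pool's free addresses so that each peer gets roughly half.

    Returns (primary_share, secondary_share). Giving the primary the
    extra address for an odd count is an arbitrary choice for this sketch.
    """
    ordered = sorted(free_addresses, key=ipaddress.ip_address)
    half = (len(ordered) + 1) // 2
    return ordered[:half], ordered[half:]
```

For a pool of five free addresses, this yields three for the primary and two for the secondary.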
During normal communication, when a DHCP server receives a DHCPREQUEST
message from a client, it responds with a DHCPACK and then sends a binding update
(BNDUPD) message to its failover peer. When the peer receives the update, it puts the
update on a queue to be processed. After the update has been processed, the peer
sends a binding acknowledgement (BNDACK) message in response. BNDUPD and
BNDACK messages are also used during the synchronization process.
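
The receiving side of this update flow might be modeled as follows. This is a minimal sketch, not the protocol's message layout: the Peer class, the xid field used to pair each acknowledgement with its update, and the dictionary standing in for the binding database are assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BndUpd:
    xid: int        # hypothetical identifier pairing an update with its ack
    ip: str
    client_id: str


@dataclass
class BndAck:
    xid: int


class Peer:
    def __init__(self):
        self.bindings = {}      # ip -> client_id (stand-in for the lease database)
        self.pending = deque()  # updates received but not yet processed

    def receive_update(self, upd):
        # Receiving an update only queues it; no acknowledgement is sent yet.
        self.pending.append(upd)

    def process_pending(self):
        # Apply queued updates in arrival order and acknowledge each one.
        acks = []
        while self.pending:
            upd = self.pending.popleft()
            self.bindings[upd.ip] = upd.client_id
            acks.append(BndAck(upd.xid))
        return acks
```

Separating receive_update from process_pending mirrors the queueing described above: the acknowledgement is produced only after the update has been applied, not when it arrives.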
As each failover peer assigns IP addresses to clients, the pool of free addresses may
become unbalanced, with one peer having significantly more free addresses than the
other. In this case, the peer that has fewer addresses performs the appropriate
pool-rebalancing action, as described later in this chapter, in the section “Pool
Rebalancing.”
During periods of inactivity, each peer sends periodic CONTACT messages to the other
to probe for network outages. If a peer receives no message from its partner for a certain
period of time, it assumes that the connection has broken and begins operating
independently. The connection between peers can also be terminated because
one peer is being shut down; in that case, the server being shut down sends a
DISCONNECT message to its peer, and then both peers close the connection.
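
The liveness check described above amounts to two timers per connection. The sketch below is one way to model them; the class name ContactMonitor and the send_interval and timeout parameters are assumptions, since the text does not give specific values. Timestamps are passed in explicitly so that the logic can be exercised without a real clock.

```python
class ContactMonitor:
    """Tracks when to send CONTACT and when to consider the peer unreachable."""

    def __init__(self, send_interval, timeout, now):
        self.send_interval = send_interval  # idle seconds before sending CONTACT
        self.timeout = timeout              # silent seconds before giving up
        self.last_sent = now
        self.last_received = now

    def on_message_sent(self, now):
        # Any outgoing message resets the idle timer, not only CONTACT.
        self.last_sent = now

    def on_message_received(self, now):
        # Any incoming message proves that the peer is still reachable.
        self.last_received = now

    def should_send_contact(self, now):
        return now - self.last_sent >= self.send_interval

    def peer_lost(self, now):
        return now - self.last_received >= self.timeout
```

A server loop would call should_send_contact and peer_lost on each tick, sending a CONTACT message in the first case and switching to independent operation in the second.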