failover pair sees that its peer has gone into the SHUTDOWN state, it automatically goes
into the PARTNER-DOWN state. The peer that is being shut down then completes the
shutdown process and exits.
A member of a failover pair could also fail unexpectedly. In that case, its peer quickly
goes into the COMMUNICATIONS-INTERRUPTED state. As mentioned in the previous
section, this is not very desirable, even though a server can run indefinitely in this
state. If one failover peer is actually down and not just out of communication, the
server administrator can place the other peer into the PARTNER-DOWN state.
It is also possible to configure a failover pair with a safe period. The safe period is the
period between the time that a server enters the COMMUNICATIONS-INTERRUPTED state
and the time that it concludes that the other server is no longer operating. If a safe
period has been set, then when either peer goes into the COMMUNICATIONS-INTERRUPTED
state, it sets a safe period timer. When this timer expires, the peer assumes
that the other peer is in fact not operating, and it therefore makes the transition into
the PARTNER-DOWN state. If the safe period is used, then it is possible that during a
communications failure between the failover peers, the same IP address could be
allocated to two different clients.
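
The transitions described above can be sketched as a small state machine. This is a
minimal illustration, not a real server implementation: the class name FailoverPeer, its
method names, and the use of plain seconds for time are all assumptions made for the
example, and only the states mentioned in this passage are modeled.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    NORMAL = auto()
    COMMUNICATIONS_INTERRUPTED = auto()
    PARTNER_DOWN = auto()


@dataclass
class FailoverPeer:
    """Hypothetical model of one failover peer (names are illustrative)."""

    # Safe period in seconds; None means no safe period is configured.
    safe_period: Optional[float] = None
    state: State = State.NORMAL
    # Time at which the safe period timer expires, if one is running.
    safe_deadline: Optional[float] = None

    def peer_shutdown_seen(self) -> None:
        # The peer announced SHUTDOWN while both servers were communicating.
        self.state = State.PARTNER_DOWN
        self.safe_deadline = None

    def contact_lost(self, now: float) -> None:
        # The peer stopped responding without announcing a shutdown.
        if self.state is State.NORMAL:
            self.state = State.COMMUNICATIONS_INTERRUPTED
            if self.safe_period is not None:
                self.safe_deadline = now + self.safe_period

    def admin_partner_down(self) -> None:
        # The administrator asserts that the peer really is down.
        self.state = State.PARTNER_DOWN
        self.safe_deadline = None

    def tick(self, now: float) -> None:
        # Called periodically; applies the safe period timer if one is set.
        if (self.state is State.COMMUNICATIONS_INTERRUPTED
                and self.safe_deadline is not None
                and now >= self.safe_deadline):
            self.state = State.PARTNER_DOWN
            self.safe_deadline = None
```

With no safe period configured, tick() never leaves COMMUNICATIONS_INTERRUPTED by
itself, which mirrors the point that only an administrator (or an observed shutdown)
moves the server into PARTNER-DOWN in that configuration.
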
Even when peers are not in communication with one another, they can still extend
leases, as long as they follow the rules described earlier in this chapter for determining
the length of a lease. Because the failover peer that is running in the PARTNER-DOWN
state knows that its peer has followed these rules, and because (as long as the
safe period is not used) it cannot enter the PARTNER-DOWN state when the partner is
running, it can reliably know when to reclaim IP addresses for which the peer may
have extended the lease.
Whether a server got into the PARTNER-DOWN state because its peer went into the
SHUTDOWN state while both partners were communicating, or whether it did so
because the peer failed and the administrator directed it to enter the PARTNER-DOWN
state, the server cannot be sure that it has received a complete set of updates from its
peer. Because of this, the remaining server must treat any lease that its peer could
have extended as if the peer did extend it. The remaining server does know, however,
that the lease could not have been extended for more than the MCLT beyond the latest
lease time that it has recorded.
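
As a small worked illustration of that bound, the remaining server can compute the latest
time to which its peer could have extended a lease. The function name and the one-hour
MCLT below are assumptions chosen for the example, not values taken from the text.

```python
from datetime import datetime, timedelta


def latest_possible_expiry(latest_recorded_expiry: datetime,
                           mclt: timedelta) -> datetime:
    """Upper bound on a lease the peer may have extended without telling us.

    The peer cannot have extended the lease by more than the MCLT beyond
    the latest lease time that the remaining server has recorded.
    """
    return latest_recorded_expiry + mclt


# Example: the last recorded expiry is 12:00 and the MCLT is assumed to be 1 hour.
recorded = datetime(2002, 10, 3, 12, 0)
bound = latest_possible_expiry(recorded, timedelta(hours=1))
```

Here bound is 13:00 on the same day, so the remaining server must assume the lease may
still be in use until then.
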
When a server changes state, it remembers the start time of state (STOS). When a
server enters the PARTNER-DOWN state, it can reclaim any available IP address (any
address that is in the FREE or BACKUP state) that belongs to its peer once the time
STOS plus MCLT has passed. If an address is in the ACTIVE, EXPIRED, RELEASED, or RESET
state and the acknowledged potential expiry time is later than STOS, the server can free
the IP address after the acknowledged potential expiry time plus MCLT, or after
STOS plus MCLT, whichever is later. This is because the failed peer may have
extended the lease to the acknowledged potential expiry time plus MCLT without