272 ■ Chapter Four Multiprocessors and Thread-Level Parallelism
represent error conditions. “z” means the requested event cannot currently be
processed, and “—” means no action or state change is required.
The following example illustrates the basic operation of this protocol.
Assume that P0 attempts a read to a block that is in state I (Invalid) in all caches.
The cache controller’s action—determined by the table entry that corresponds to
state I and event “read”—is “send GetS/IS^AD,” which means that the cache controller
should issue a GetS (i.e., GetShared) request to the address network and
transition to transient state IS^AD to wait for the address and data messages. In the
absence of contention, P0’s cache controller will normally receive its own GetS
message first, indicated by the OwnReq column, causing a transition to state IS^D.
Other cache controllers will handle this request as “Other GetS” in state I. When
the memory controller sees the request on its ADDR_IN queue, it reads the block
from memory and sends a data message to P0. When the data message arrives at
P0’s DATA_IN queue, indicated by the Data column, the cache controller saves
the block in the cache, performs the read, and sets the state to S (i.e., Shared).
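The load-miss path just described—I to IS^AD on the read, IS^AD to IS^D on seeing the controller’s own GetS, and IS^D to S on the data message—can be sketched as a small state machine. This is an illustrative sketch only; the class, method, and queue names are assumptions, not the protocol tables of Figure 4.40.

```python
# Sketch of the cache controller's load-miss transitions described above.
# State and method names are illustrative assumptions.

# Stable and transient cache states
I, S, IS_AD, IS_D = "I", "S", "IS^AD", "IS^D"

class CacheController:
    def __init__(self, addr_out):
        self.state = I
        self.addr_out = addr_out   # outgoing address-network queue
        self.data = None

    def load(self, block):
        # Table entry (state I, event "read"): "send GetS/IS^AD"
        if self.state == I:
            self.addr_out.append(("GetS", block))
            self.state = IS_AD     # wait for address and data messages

    def on_own_gets(self):
        # Our own GetS arrived on ADDR_IN (OwnReq column):
        # the request now has a place in the total order
        if self.state == IS_AD:
            self.state = IS_D      # still waiting for the data message

    def on_data(self, value):
        # Data message arrived on DATA_IN: fill the block, finish the read
        if self.state == IS_D:
            self.data = value
            self.state = S         # block is now Shared
```

For example, a load by P0 to an Invalid block enqueues a GetS and walks the controller through IS^AD and IS^D before settling in S once the data arrives.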
A somewhat more complex case arises if node P1 holds the block in state M.
In this case, P1’s action for “Other GetS” causes it to send the data both to P0 and
to memory, and then transition to state S. P0 behaves exactly as before, but the
memory must maintain enough logic or state to (1) not respond to P0’s request
(because P1 will respond) and (2) wait to respond to any future requests for this
block until it receives the data from P1. This requires the memory controller to
implement its own transient states (not shown). Exercise 4.11 explores alternative
ways to implement this functionality.
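The memory-side bookkeeping for the M-owner case can also be sketched: memory must (1) suppress its response when a cache owns the block in M, and (2) defer later requests for that block until the owner’s data arrives. The sketch below is one possible implementation under assumed names; the book’s memory controller transient states are not shown in its tables, and this is not its actual design.

```python
# Sketch of the memory controller logic described above, with a transient
# state per block that is awaiting data from a former M-state owner.
# Class, method, and message names are illustrative assumptions.

class MemoryController:
    def __init__(self):
        self.mem = {}          # block -> value held in memory
        self.owner = {}        # block -> node caching it in state M
        self.transient = set() # blocks awaiting data from a former owner
        self.deferred = {}     # block -> requesters queued during the wait

    def on_gets(self, block, requester):
        if block in self.transient:
            # (2) wait: queue the request until the owner's data arrives
            self.deferred.setdefault(block, []).append(requester)
            return None
        if block in self.owner:
            # (1) don't respond: the M-state owner will supply the data,
            # but enter a transient state until memory is up to date
            self.transient.add(block)
            del self.owner[block]
            return None
        # Ordinary case: memory responds with the block
        return ("Data", block, self.mem.get(block))

    def on_owner_data(self, block, value):
        # The former owner's data arrived: update memory, then service
        # any requests that were deferred during the transient state
        self.mem[block] = value
        self.transient.discard(block)
        return [("Data", block, value, r)
                for r in self.deferred.pop(block, [])]
```

Here a GetS for a block owned in M returns no response; a second request arriving before the owner’s data is queued and answered only from `on_owner_data`.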
More complex transitions occur when other requests intervene or cause
address and data messages to arrive out of order. For example, suppose the cache
controller in node P0 initiates a writeback of a block in state Modified. As Figure
4.40 shows, the controller does this by issuing a PutModified coherence request
to the ADDR_OUT queue. Because of the pipelined nature of the address net-
work, node P0 cannot send the data until it sees its own request on the ADDR_IN
queue and determines its place in the total order. This creates an interval, called a
window of vulnerability, where another node’s request may change the action that
should be taken by a cache controller. For example, suppose that node P1 has
issued a GetModified request (i.e., requesting an exclusive copy) for the same
block that arrives during P0’s window of vulnerability for the PutModified
request. In this case, P1’s GetModified request logically occurs before P0’s Put-
Modified request, making it incorrect for P0 to complete the writeback. P0’s
cache controller must respond to P1’s GetModified request by sending the block
to P1 and invalidating its copy. However, P0’s PutModified request remains pend-
ing in the address network, and both P0 and P1 must ignore the request when it
eventually arrives (node P0 ignores the request since its copy has already been
invalidated; node P1 ignores the request since the PutModified was sent by a dif-
ferent node).
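The window-of-vulnerability behavior can likewise be sketched as a controller that enters a transient state between issuing PutModified and seeing it ordered on ADDR_IN. The state name MI^A and the message handling below are illustrative assumptions patterned on the transient-state notation above, not the exact tables of Figure 4.40.

```python
# Sketch of the writeback window of vulnerability described above: between
# issuing PutModified and seeing it on ADDR_IN, an intervening GetModified
# is ordered first, so the block must be handed over and the stale
# PutModified later ignored. Names are illustrative assumptions.

M, MI_A, I = "M", "MI^A", "I"

class WritebackController:
    def __init__(self, node_id):
        self.node_id = node_id
        self.state = M
        self.sent_data = []    # (block, destination) data messages sent

    def start_writeback(self, block):
        # Issue PutModified to ADDR_OUT; the data cannot be sent until we
        # see our own request on ADDR_IN and learn its total-order slot.
        self.state = MI_A
        return ("PutModified", block, self.node_id)

    def on_addr_in(self, msg):
        kind, block, sender = msg
        if kind == "GetModified" and sender != self.node_id:
            if self.state == MI_A:
                # Another node's GetM is ordered before our PutM: send it
                # the block and invalidate; our PutM is now stale.
                self.sent_data.append((block, sender))
                self.state = I
        elif kind == "PutModified":
            if sender == self.node_id and self.state == MI_A:
                # Normal case: our PutM reached the ordering point, so
                # the data can now be written back to memory.
                self.state = I
            # Otherwise ignore: either our copy was already invalidated
            # by an intervening GetM, or the PutM came from another node.
```

Running the contended case, P0 starts a writeback, P1’s GetModified arrives first and forces P0 to supply the data and invalidate, and P0’s own PutModified is then ignored when it finally appears on ADDR_IN.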
4.8 [10/10/10/10/10/10/10] <4.2> Consider the switched network snooping protocol
described above and the cache contents from Figure 4.37. What is the sequence
of transient states that the affected cache blocks move through in each of the fol-