# Exercise 1.

Consider a shared-memory platform with two processors connected through a bus to a single memory bank. Each processor has a cache. There are two possible implementations of this architecture:

- Through an MSI invalidation protocol. The traffic associated with an invalidation operation is 16 bits.
- Through an update protocol. The cost of an update is 64 bits.

A cache miss costs 32 bits to notify the miss and 64 bits to bring the data into the cache.

    lock:   t&s .R1, 0xFFFF0001
            bnez .R1, $lock
            sleep(10)
            ret
    unlock: st 0 0xFFFF0001
            ret

[Figure: MSI state diagram. Labels recoverable from the original: state S; transitions PrWr/BusRdX, BusRdX/Flush, BusRdX/, PrRd/BusRd, PrRd/BusRd/-.]

- Describe the sequence of the implemented lock/unlock solution.
- If the processors operate at 200 MIPS (million instructions per second) and all instructions have the same execution time, calculate for each implementation the bus traffic (in bits transferred) associated with the execution of the lock and unlock.
- For the update protocol: if the bus has a bandwidth of 3 Gbit/s, is the bus saturated during the lock wait?
- What effect does bus saturation have on program performance?

# Exercise 2.

Consider a platform with the following features: two processors connected through a bus to a single memory bank. Each processor has a local cache. There are two possible implementations of this architecture:

- Through an invalidation protocol. The traffic associated with an invalidation operation is 16 bits.
- Through an update protocol. The cost of an update is 64 bits.

A read cache miss costs 32 bits, plus 64 bits to bring the data into the cache.

The two processors run the code at the same instant. That is, both "while" loops are reached simultaneously.

Calculate for each implementation the bus traffic (in bits transferred) associated with the execution of the lock and unlock for the two processors.

# Exercise 3.
Consider the following two program patterns:

### Pattern 1

```
Repeat k times:
    Processor P1 writes a new value in variable V
    Processors P2 to PN read the value of variable V
```

### Pattern 2

```
Repeat k times:
    Processor P1 writes variable V M times
    Processor P2 reads variable V
```

These programs run on a shared-memory multiprocessor. Assume that an update requires 14 bytes (6 bytes for the address and command, plus 8 bytes of data to update the word), and that a cache miss requires 70 bytes (6 bytes for the address and command, plus 64 bytes of data for the cache line). Assume further that N = 16, M = 10 and k = 10, and that all caches are initially empty.

What is the traffic, in bytes, generated by each pattern when an update protocol is used? Explain your answer.

# Exercise 4.

The following program is executed on 3 processors in a shared-memory architecture with sequential consistency. The variables x and y are initialized to zero (x=0, y=0). Reason about the possible printed values, assuming that the statements of each processor are executed in order and are atomic.

| Processor 1  | Processor 2  | Processor 3  |
|--------------|--------------|--------------|
| (1a) print x | Lock (l)     | Lock (l)     |
|              | (2a) y=1     | (3a) x=1     |
|              | (2b) print x | (3b) print y |
|              | Unlock (l)   | Unlock (l)   |
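
One way to check hand-derived reasoning for Exercise 4 is to enumerate every interleaving that sequential consistency and the lock allow. The sketch below is only an illustration under stated assumptions: the lock is modelled by keeping the critical sections of Processor 2 and Processor 3 contiguous with respect to each other, Processor 1 takes no lock and may run at any point, and the helper names (`run`, `legal_schedules`, `possible_outputs`) are hypothetical, not part of the exercise.

```python
# Statements of the two critical sections, labelled as in the table.
P2_SECTION = ["2a", "2b"]  # y = 1 ; print x
P3_SECTION = ["3a", "3b"]  # x = 1 ; print y


def run(schedule):
    """Execute one interleaving; return the values printed by (P1, P2, P3)."""
    mem = {"x": 0, "y": 0}  # x and y start at zero, as stated in the exercise
    printed = {}
    for stmt in schedule:
        if stmt == "1a":
            printed["P1"] = mem["x"]
        elif stmt == "2a":
            mem["y"] = 1
        elif stmt == "2b":
            printed["P2"] = mem["x"]
        elif stmt == "3a":
            mem["x"] = 1
        elif stmt == "3b":
            printed["P3"] = mem["y"]
    return printed["P1"], printed["P2"], printed["P3"]


def legal_schedules():
    """Yield every interleaving allowed by program order and the lock."""
    # Assumption: mutual exclusion means one critical section runs entirely
    # before the other; statement 1a is unprotected and can land anywhere.
    for first, second in ((P2_SECTION, P3_SECTION), (P3_SECTION, P2_SECTION)):
        locked = first + second
        for pos in range(len(locked) + 1):
            yield locked[:pos] + ["1a"] + locked[pos:]


def possible_outputs():
    """Return the distinct (P1, P2, P3) printed values, sorted."""
    return sorted({run(s) for s in legal_schedules()})


if __name__ == "__main__":
    for p1, p2, p3 in possible_outputs():
        print(f"P1 prints x={p1}, P2 prints x={p2}, P3 prints y={p3}")
```

The enumeration reports which value each processor prints, not the order in which the three prints reach a shared output; extending `run` to record the print order would be a small change if that is what the exercise intends.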