## **Chapter Exam**

# **Chapter 4 – A Simple Implementation Scheme (4.1~4.4)**

### 2011/05/17

1. According to Fig 1, Fig 2, and Fig 3, fill in the blank cells (1)~(36) of Table 1. (You must draw the table on your answer paper and fill in the results.) (36 points)

| Instruction | 31:26   | 25:21 | 20:16 | 15:11          | 10:6  | 5:0   |
|-------------|---------|-------|-------|----------------|-------|-------|
| R-type      | 0       | rs    | rt    | rd             | shamt | funct |
| Load/Store  | 35 / 43 | rs    | rt    | address (15:0) |       |       |
| Branch      | 4       | rs    | rt    | address (15:0) |       |       |

Fig 1

| opcode | ALUOp | Operation        | funct  | ALU function     | ALU control |
|--------|-------|------------------|--------|------------------|-------------|
| lw     | 00    | load word        | XXXXXX | add              | 0010        |
| sw     | 00    | store word       | XXXXXX | add              | 0010        |
| beq    | 01    | branch equal     | XXXXXX | subtract         | 0110        |
| R-type | 10    | add              | 100000 | add              | 0010        |
|        |       | subtract         | 100010 | subtract         | 0110        |
|        |       | AND              | 100100 | AND              | 0000        |
|        |       | OR               | 100101 | OR               | 0001        |
|        |       | set-on-less-than | 101010 | set-on-less-than | 0111        |

Fig 2
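The mapping in Fig 2 can be read as a small lookup from ALUOp and funct to the ALU control lines. The following is a minimal Python sketch of that reading; the names `FUNCT_TO_ALU_CONTROL` and `alu_control` are illustrative assumptions and do not come from the exam or the textbook.

```python
# ALU control values as listed in Fig 2, keyed by the R-type funct field.
# The dictionary and function names are hypothetical, chosen for illustration.
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op: int, funct: int = 0) -> int:
    """Return the 4-bit ALU control value for a 2-bit ALUOp and a 6-bit funct."""
    if alu_op == 0b00:   # lw / sw: funct is a don't-care, the ALU adds
        return 0b0010
    if alu_op == 0b01:   # beq: funct is a don't-care, the ALU subtracts
        return 0b0110
    if alu_op == 0b10:   # R-type: the funct field selects the operation
        return FUNCT_TO_ALU_CONTROL[funct]
    raise ValueError(f"ALUOp {alu_op:02b} is not listed in Fig 2")


if __name__ == "__main__":
    # R-type set-on-less-than (funct 101010) selects ALU control 0111.
    print(format(alu_control(0b10, 0b101010), "04b"))
```

For lw, sw, and beq the funct argument is ignored, which mirrors the XXXXXX entries in Fig 2.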



Fig 3

| Instruction | RegDst | ALUSrc | MemtoReg | RegWrite | MemRead | MemWrite | Branch | ALUOp1 | ALUOp0 |
|-------------|--------|--------|----------|----------|---------|----------|--------|--------|--------|
| R-format    | (1)    | (2)    | (3)      | (4)      | (5)     | (6)      | (7)    | (8)    | (9)    |
| lw          | (10)   | (11)   | (12)     | (13)     | (14)    | (15)     | (16)   | (17)   | (18)   |
| sw          | (19)   | (20)   | (21)     | (22)     | (23)    | (24)     | (25)   | (26)   | (27)   |
| beq         | (28)   | (29)   | (30)     | (31)     | (32)    | (33)     | (34)   | (35)   | (36)   |

#### Table 1

2. According to the opcodes in Fig 1 and your answer to question 1, write the logic equations for the control signals ALUSrc and RegWrite. (24 points) For example, the logic equation for MemRead is "I<sub>31</sub>(~I<sub>30</sub>)(~I<sub>29</sub>)(~I<sub>28</sub>)I<sub>27</sub>I<sub>26</sub>".
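The MemRead example is the product term of the lw opcode (35 = 100011 in binary) over instruction bits 31:26. The sketch below shows how such a term can be written out mechanically for a single opcode; the function name `opcode_minterm` is a hypothetical helper, and the sketch assumes the opcode occupies bits 31:26 as in Fig 1.

```python
def opcode_minterm(opcode: int) -> str:
    """Build the product term over I31..I26 that is 1 only for this 6-bit opcode."""
    literals = []
    for k in range(5, -1, -1):           # opcode bit k sits at instruction bit 26 + k
        name = f"I{26 + k}"
        bit = (opcode >> k) & 1
        # A 1 bit contributes the plain literal, a 0 bit the complemented literal.
        literals.append(name if bit else f"(~{name})")
    return "".join(literals)


if __name__ == "__main__":
    # lw has opcode 35, which reproduces the MemRead example in the question.
    print(opcode_minterm(35))  # I31(~I30)(~I29)(~I28)I27I26
```

A control signal asserted by more than one instruction is the OR of the corresponding product terms.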

## **Chapter Exam**

## Chapter 4 – The Processor (4.5 - 4.10)

2011/06/08

1. Given Fig.1, what are the control signals in the 5th clock cycle when the following instructions are executed?
   - lw \$10, 20(\$1)
   - sub \$11, \$2, \$3
   - add \$12, \$3, \$4
   - lw \$13, 24(\$1)
   - add \$14, \$5, \$6


Fig.1

| RegWrite | RegDst | ALUSrc | MemtoReg | PCSrc |
|----------|--------|--------|----------|-------|
| (1)      | (2)    | (3)    | (4)      | (5)   |

2. Given Fig.2, which shows the datapath with forwarding and hazard detection, complete the following conditions.

#### Course: Computer Organization, Department of Computer Science and Engineering, National Sun Yat-sen University; Instructor: 黃英哲





```
Forwarding of EX hazard

if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = (6))) ForwardA = 10
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = (7))) ForwardB = 10

Forwarding of MEM hazard

if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = (8))) ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = (9))) ForwardB = 01

Hazard detection

if ((10) and
    ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
     (ID/EX.RegisterRt = IF/ID.RegisterRt))) stall the pipeline
```

3. Given Fig.3, which shows a pipelined branch, what are the contents of the blanks when the branch is taken?

- 36: sub \$10, \$4, \$8
- 40: beq \$1, \$3, 7
- 44: and \$12, \$2, \$5
- 48: or \$13, \$2, \$6
- 52: add \$14, \$4, \$2
- 56: slt \$15, \$6, \$7
- ...
- 72: lw \$4, 50(\$7)






4. Given Fig.4, what are the arc labels of the 2-bit prediction scheme? (Taken or Not taken)



5. Given Fig.5, which shows a pipelined exception, what are the contents of the blanks?

| Address (hex) | Instruction           |
|---------------|-----------------------|
| 40            | sub \$11, \$2, \$4    |
| 44            | and \$12, \$2, \$5    |
| 48            | or \$13, \$2, \$6     |
| 4C            | add \$1, \$2, \$1     |
| 50            | slt \$15, \$6, \$7    |
| 54            | lw \$16, 50(\$7)      |
| ...           |                       |
| 80000180      | sw \$25, 1000(\$0)    |
| 80000184      | sw \$26, 1004(\$0)    |




6. How well do loop unrolling and scheduling work on a static two-issue pipeline for MIPS? Assume that the loop index is a multiple of four.

| Label | Instruction            |
|-------|------------------------|
| Loop: | lw \$t0, 0(\$s1)       |
|       | addu \$t0, \$t0, \$s2  |
|       | sw \$t0, 0(\$s1)       |
|       | addi \$s1, \$s1, -4    |
|       | bne \$s1, \$zero, Loop |

|       | ALU or branch instruction | Data transfer instruction | Clock cycle |
|-------|---------------------------|---------------------------|-------------|
| Loop: | addi \$s1, \$s1, -16      | (29)                      | 1           |
|       |                           | lw \$t1, 12(\$s1)         | 2           |
|       | addu \$t0, \$t0, \$s2     | lw \$t2, 8(\$s1)          | 3           |
|       | (26)                      | lw \$t3, 4(\$s1)          | 4           |
|       | (27)                      | (30)                      | 5           |
|       | (28)                      | sw \$t1, 12(\$s1)         | 6           |
|       |                           | sw \$t2, 8(\$s1)          | 7           |
|       | bne \$s1, \$zero, Loop    | sw \$t3, 4(\$s1)          | 8           |

### **Chapter Exam**

## <u>Chapter 5 – Exploiting Memory Hierarchy</u>

### 2011/06/21

7. For a direct-mapped cache design with a 32-bit address, the following bits of the address are used to access the cache.

| Tag   | Index | Offset |
|-------|-------|--------|
| 31-10 | 9-4   | 3-0    |

(1) What is the cache line size? (10%)

(2) How many entries does the cache have? (10%)
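As a general reminder of how an address breakdown relates to cache geometry, the sketch below derives the line size and entry count from the bit positions of the index and offset fields. It assumes a byte-addressed cache whose offset field starts at bit 0; the function name `direct_mapped_geometry` and the bit ranges in the usage example are illustrative and deliberately differ from the ones in this question.

```python
def direct_mapped_geometry(index_hi: int, index_lo: int, offset_hi: int):
    """Return (bytes per line, number of entries) for a direct-mapped cache.

    Assumes byte addressing and an offset field that occupies bits offset_hi..0.
    """
    offset_bits = offset_hi + 1               # width of the offset field
    index_bits = index_hi - index_lo + 1      # width of the index field
    return 2 ** offset_bits, 2 ** index_bits


if __name__ == "__main__":
    # Hypothetical layout: tag 31-12, index 11-5, offset 4-0.
    line_bytes, entries = direct_mapped_geometry(index_hi=11, index_lo=5, offset_hi=4)
    print(line_bytes, entries)  # 32 128
```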

8. How many total bits are required for a direct-mapped cache with 16 KB of data and 4-word blocks, assuming a 32-bit address? (20%)

9. Consider a cache with 64 blocks and a block size of 16 bytes. To what block number does byte address 1200 map? (20%)

10. Given a cache block of four words and a one-word-wide memory organization, assume 1 memory bus clock cycle to send the address, 15 memory bus clock cycles for each DRAM access initiated, and 1 memory bus clock cycle to send a word of data.

(1) What is the miss penalty? (10%)

(2) What is the number of bytes transferred per bus clock cycle for a single miss? (10%)

11. Find the AMAT for a processor with a 1 ns clock cycle time, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time of 1 clock cycle. Assume that the read and write miss penalties are the same and ignore other write stalls. (20%)
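The quantities asked for in these questions follow standard direct-mapped cache formulas. The sketch below collects them as small Python helpers under stated assumptions: 4-byte words, power-of-two sizes, one valid bit per entry, and a one-word-wide memory in which every word needs its own DRAM access. All function names are hypothetical, and the numbers in the usage example are illustrative rather than the exam's values.

```python
def block_index(byte_address: int, num_blocks: int, block_bytes: int) -> int:
    """Cache block that a byte address maps to in a direct-mapped cache."""
    return (byte_address // block_bytes) % num_blocks


def total_cache_bits(data_bytes: int, words_per_block: int, address_bits: int = 32) -> int:
    """Total storage bits: data + tag + one valid bit per entry (4-byte words assumed)."""
    block_bytes = words_per_block * 4
    num_blocks = data_bytes // block_bytes
    index_bits = num_blocks.bit_length() - 1     # log2, sizes assumed powers of two
    offset_bits = block_bytes.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return num_blocks * (block_bytes * 8 + tag_bits + 1)


def miss_penalty_one_word_wide(words_per_block: int, address_cycles: int,
                               access_cycles: int, transfer_cycles: int) -> int:
    """Miss penalty when each word needs its own DRAM access and bus transfer."""
    return address_cycles + words_per_block * (access_cycles + transfer_cycles)


def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    print(block_index(2000, num_blocks=32, block_bytes=8))
    print(total_cache_bits(64 * 1024, words_per_block=1))
    print(miss_penalty_one_word_wide(2, address_cycles=1, access_cycles=10, transfer_cycles=1))
    print(amat(hit_time=1, miss_rate=0.125, miss_penalty=16))
```

The AMAT helper returns its result in whatever unit the hit time and miss penalty are given in, so clock cycles can be converted to nanoseconds by multiplying by the clock cycle time.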