Download ENGR 5863 COMPUTER ARCHITECTURE

ENGR6859 COMPUTER ENGINEERING FUNDAMENTALS – COMPUTER ARCHITECTURE
Problem Set #0; RV; Issued: Mon. Sep. 11, 2006
Problems to be submitted on Oct. 6, 2006: 0, 7, 9, 14 & 16

0. Problem 1.2 in the textbook.
1. Problem 1.7 in the textbook.
2. Problem 1.17 in the textbook.
3. Discuss the issues related to the choice of entity sizes for memory access in a 64-bit processor. Recall that this issue is different from that of choosing operand sizes.
4. Read Appendix D (on the web). Give your detailed comments on the choice the Intel 80x86 architects made on the issue raised in the previous problem. Of course, the Intel microprocessors do not follow the load-store architecture, so one may not agree that this comparison is fair.
5. Problem 2.5 in the textbook.
6. Problem 2.6 in the textbook.
7. Problem 2.11 in the textbook.
8. Problem 2.12 in the textbook.
9. We are contemplating adding a floating-point divider hardware unit to a processor to improve its performance on the application for which the processor is going to be used. In this application, 40% of the executed instructions are floating-point instructions, and 5% of these operations are expected to be divisions. All other floating-point instructions execute as fast as the integer instructions, i.e., in one clock cycle, whereas division is currently implemented in microcode that takes 10 clock cycles to execute. If we go ahead with the plan, the division operation can also be performed in one clock cycle. However, the designers tell us that an inexpensive addition of this floating-point unit stretches the clock cycle by 20%. Determine whether it would be beneficial to incorporate this enhancement, ignoring the additional cost involved.
10. Discuss why processors based on load-store architectures facilitate access to information in memory in various sizes, but limit the operands of ALU operations to the word size. Also discuss why many high-performance architectures require aligned memory access.
11. Assume that we make an enhancement to a computer that improves some mode of execution by a factor of 10. Enhanced mode is used 50% of the time, measured as a percentage of the execution time when the enhanced mode is in use. Recall that Amdahl’s law depends on the fraction of the original, unenhanced, execution time that could make use of the enhanced mode.
    a. What is the speedup we have obtained from fast mode?
    b. What percentage of the original execution time has been converted to fast mode?
12. You are considering an enhancement to the implementation of the divide operation in the processor your company is designing for a particular application. Assume that divide instruction takes 40 cycles before enhancement. It has been estimated that divide instructions account for 3% of all instructions, and that the average execution time of all other instructions is 2 clock cycles.
    a. Calculate the percentage of the total time spent for executing divide instructions.
    b. You have determined that it is possible to reduce the number of cycles required for division to 8, but that this would require a 10% increase in the clock cycle time. Nothing else will be affected. Would you proceed with this enhancement? Why?
    c. Calculate the maximum percentage decrease in the clock frequency that would still make the above enhancement (reducing divide time to 6 clock cycles) attractive.
    d. State Amdahl’s law – any form is fine.
    e. Suppose you are considering another modification which would cut down the number of clock cycles needed for division to 10 clock cycles, while not imposing any penalty on the clock cycle. Calculate the speedup in this case.
13. Measurements have shown that a certain load-store machine uses 45% ALU operations, 20% load operations, 10% store operations, and 25% branch operations. The execution times of these operations are 1, 2, 2, and 2 cycles, respectively.
    Assume that an optimizing compiler for this machine discards 40% of the arithmetic logic unit (ALU) instructions, although it cannot reduce loads, stores, or branches. Ignore system issues, and assume a 1 ns clock cycle time.
    a. Calculate the CPI and MIPS rating of the unoptimized code.
    b. Calculate the CPI and MIPS rating of the optimized code.
    c. What are the execution times with and without optimization?
    d. Considering the MIPS ratings and execution times computed above, comment on whether optimization improves performance.
14. Measurements have shown that a certain load-store machine uses 40% ALU operations, 25% load operations, 10% store operations, and 25% branch operations. The processor takes 1 clock cycle to execute each ALU instruction, but 2 clock cycles to run each of the other instructions. Assume that an optimizing compiler for this machine discards 40% of the arithmetic logic unit (ALU) instructions, although it cannot reduce loads, stores, or branches. Ignore system issues, and assume a 1 ns clock cycle time.
    a. Calculate the CPI and MIPS rating of the unoptimized code.
    b. Calculate the CPI and MIPS rating of the optimized code.
    c. What are the execution times with and without optimization?
    d. Considering the MIPS ratings and execution times computed above, comment on whether optimization improves performance.
    e. Discuss how optimizing compilers, in general, improve performance.
15. Discuss the advantages and limitations of using a fixed instruction length in a processor.
16. Consider the following code written in MIPS64 assembly language.

              DADDI R2, R0, #2000
              DADD  R3, R0, R0
        Here: LB    R1, 10000(R3)
              SB    20000(R3), R1
              DADDI R3, R3, #1
              BNE   R3, R2, Here

    a. Write clearly, in one sentence, what the above code accomplishes.
    b. Rewrite the above code so that the 64-bit load and store instructions, LD and SD, are used in lieu of the byte operations above.
    c. Because MIPS64 allocates only a 16-bit signed number for the displacement field of an instruction, the load and store instructions cannot be used as shown above if the array starts at a large address, say 100000. Rewrite the code so that the same task is accomplished when the array address is large, without changing the number of instructions within the loop.
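
Problems 9, 11, and 12 turn on Amdahl's law and on the trade-off between fewer cycles and a longer clock period. As a way of checking hand calculations, a minimal Python sketch of both formulas follows. The function names are hypothetical, and the figures in the usage lines are made up for illustration; they are not taken from any problem above.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the ORIGINAL execution
    time is accelerated by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)


def speedup_with_clock_penalty(cpi_old, cpi_new, clock_stretch):
    """Speedup when the average CPI changes and the clock period grows by
    `clock_stretch` (0.10 means a 10% longer cycle).

    Assumes the instruction count is unchanged, so execution time is
    proportional to CPI times the clock period."""
    time_old = cpi_old * 1.0
    time_new = cpi_new * (1.0 + clock_stretch)
    return time_old / time_new


if __name__ == "__main__":
    # Made-up figures: 30% of the original time is sped up 5x.
    print(amdahl_speedup(0.30, 5.0))
    # Made-up figures: CPI drops from 2.0 to 1.6, clock period 10% longer.
    print(speedup_with_clock_penalty(2.0, 1.6, 0.10))
```

A result above 1 from `speedup_with_clock_penalty` means the change pays off despite the slower clock; a result below 1 means it does not.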

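Problems 13 and 14 ask for CPI, MIPS rating, and execution time from an instruction mix. The Python sketch below shows one way to organize that arithmetic. The helper names and the sample mix are hypothetical and deliberately differ from the figures in the problems. Instruction counts are kept relative to the original program, so that execution times before and after a change can be compared directly.

```python
def cpi(mix):
    """Average cycles per instruction.

    `mix` maps an instruction class to (relative count, cycles).
    Counts need not sum to 1; they are normalized here."""
    total = sum(count for count, _ in mix.values())
    cycles = sum(count * cyc for count, cyc in mix.values())
    return cycles / total


def mips(cpi_value, cycle_time_ns):
    """MIPS rating: instructions per second divided by 10**6."""
    return 1e3 / (cpi_value * cycle_time_ns)


def relative_time(mix, cycle_time_ns):
    """Execution time in ns per ORIGINAL instruction (counts are relative
    to the unmodified program, so this is comparable across variants)."""
    return sum(count * cyc for count, cyc in mix.values()) * cycle_time_ns


def drop_fraction(mix, cls, fraction):
    """Return a copy of `mix` with `fraction` of class `cls` removed."""
    new_mix = dict(mix)
    count, cyc = new_mix[cls]
    new_mix[cls] = (count * (1.0 - fraction), cyc)
    return new_mix


if __name__ == "__main__":
    # Hypothetical mix, not the one from the problems above.
    base = {"alu": (0.50, 1), "load": (0.20, 2),
            "store": (0.10, 2), "branch": (0.20, 2)}
    trimmed = drop_fraction(base, "alu", 0.20)
    for name, m in (("base", base), ("trimmed", trimmed)):
        c = cpi(m)
        print(name, round(c, 4), round(mips(c, 1.0), 2),
              round(relative_time(m, 1.0), 4))
```

Printing CPI, MIPS, and relative time side by side for both variants makes it easy to see whether the three metrics move in the same direction.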