Solution of Homework Regarding 3D IC Technology and Emerging 3D Processors

Q1)

(i) Pros of monolithic 3D IC:
• Significant reduction of interconnect RC compared with TSV-based 3D IC.
• Uses small-dimension vias to build fine-grained 3D ICs.
• The small-dimension via has negligible resistance and capacitance.

Cons of monolithic 3D IC:
• In the TSV-based 3D IC process, the two dies are processed in parallel and then stacked vertically for 3D implementation. In the monolithic 3D IC process, the second die is processed only after the first die is finished, so monolithic 3D IC has lower product throughput.
• TSV-based 3D IC uses a big-dimension via, which is easier to fabricate than the small-dimension via used in monolithic 3D IC. Thus, monolithic 3D IC may have worse product yield and higher cost.

(ii) Gate-level monolithic 3D IC builds a 3D circuit by stacking two silicon tiers, with standard cells built in each tier. A small-dimension via called the Monolithic Inter-layer Via (MIV) provides the cell-to-cell vertical connections between the two tiers. The cell-to-cell interconnects are therefore much shorter than in a 2D implementation.

Transistor-level monolithic 3D IC also uses two silicon tiers, but each tier holds a single type of transistor. The p-type transistors are placed in the top tier to implement the pull-up network of each standard cell, and the n-type transistors are placed in the bottom tier to implement the pull-down network. MIVs connect the pull-up and pull-down networks inside each standard cell. In this way, the pull-up and pull-down networks of each standard cell are separated into two tiers and overlap each other.
Because each standard cell in transistor-level monolithic 3D IC is built from overlapped pull-up and pull-down networks, the cells have a significantly smaller footprint, which in turn shortens the cell-to-cell interconnect. In addition, 3D routing inside each cell reduces the intra-cell RC. Therefore, compared with the gate-level approach, transistor-level monolithic 3D IC offers not only reduced cell-to-cell interconnect but also reduced intra-cell RC in the standard cell design.

(iii) Although transistor-level monolithic 3D IC is the most fine-grained monolithic 3D IC, it still relies on conventional CMOS technology and thus inherits its device scaling challenges and manufacturing complexities. In the Skybridge fabric, the core aspects, from device to circuit style, connectivity, thermal management and manufacturing pathway, are co-architected with a 3D integration mindset. It breaks away completely from conventional CMOS technology and builds true 3D integration from the device level up to the system level. In this technology, transistors are stacked on vertical nanowires to form compact elementary logic gates in the fabric's circuit style. Truly fine-grained 3D interconnect is achieved by using bridge and coaxial routing structures for signal propagation between nanowires.

Q2)

In Skybridge, each elementary gate can have high fan-in. Thus, the function $f = \overline{ABCDEF}$ can be implemented using one NAND gate with 6 inputs, and that NAND gate occupies the footprint of one nanowire. Since the nanowire pitch is 90 nm, each nanowire in a uniform nanowire array occupies an average footprint of 90 nm × 90 nm. So the footprint is 90 nm × 90 nm = 0.0081 μm².

In the 2D CMOS based implementation (see Figure 1), three NAND gates, two NOR gates and two inverters are needed, and each NAND and NOR gate has two inputs.
Each NAND gate has 4 transistors, each NOR gate has 4 transistors, and each inverter has 2 transistors. So the total transistor count in the 2D CMOS based implementation is 3 × 4 + 2 × 4 + 2 × 2 = 24. At 192 nm × 82 nm per transistor, the footprint is 24 × 192 nm × 82 nm ≈ 0.378 μm². Note that the design is flexible: any other combination of logic gates is acceptable as long as the output function is the same.

Figure 1: Gate-level schematic of 2D CMOS implementation

Q3)

(i) In the CMOS based processor, the stage with the largest delay is the ALU, which has a 300 ps delay. The clock-to-Q delay of each flip-flop is 20 ps and the setup time is 0 ps. So the minimum clock period is 300 ps + 20 ps + 0 ps = 320 ps, and the throughput of the CMOS based processor is 1/320 ps ≈ 3.1 GHz.

In the Skybridge based processor, no latches or flip-flops are used between stages because it uses a dynamic micro-pipeline scheme. The worst-case delay of its stages is 100 ps, so the minimum clock period is 100 ps and the throughput is 1/100 ps = 10 GHz.

(ii) The instruction execution delay is calculated as (number of stages) × (clock period). For the CMOS based processor, the clock period is the worst-case stage delay plus the delay of one flip-flop; for the Skybridge based processor, which has no flip-flops, it is the worst-case stage delay alone. For the 2D CMOS based processor with 5 stages, the instruction execution delay is 5 × (300 ps + 20 ps) = 1600 ps. For the Skybridge based processor with 16 stages, the instruction execution delay is 16 × 100 ps = 1600 ps.
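The arithmetic in Q2 and Q3 can be checked with a short script. The following is a minimal sketch in Python that uses only the numbers quoted in the solution. The helper names and the particular gate wiring in `cmos_network` are assumptions made for illustration: Figure 1 is not reproduced here, so the wiring is just one arrangement consistent with the stated gate counts (three NAND, two NOR, two inverters), not necessarily the one drawn in the figure.

```python
from itertools import product

NM2_PER_UM2 = 1_000_000  # 1 um^2 = 10^6 nm^2

# --- Q2: footprint comparison (numbers taken from the solution text) ---
skybridge_um2 = 90 * 90 / NM2_PER_UM2  # one nanowire at 90 nm pitch

gate_counts = {"NAND2": 3, "NOR2": 2, "INV": 2}
transistors_per_gate = {"NAND2": 4, "NOR2": 4, "INV": 2}
cmos_transistors = sum(n * transistors_per_gate[g] for g, n in gate_counts.items())
cmos_um2 = cmos_transistors * 192 * 82 / NM2_PER_UM2  # 192 nm x 82 nm per transistor


# --- Q2: assumed wiring of the 2-input gate network (hypothetical, see lead-in) ---
def nand2(a, b):
    return 1 - (a & b)


def nor2(a, b):
    return 1 - (a | b)


def inv(a):
    return 1 - a


def cmos_network(a, b, c, d, e, f):
    """Three NAND2, two NOR2 and two inverters computing NOT(ABCDEF)."""
    abcd = nor2(nand2(a, b), nand2(c, d))   # NOR of two NANDs gives A.B.C.D
    abcdef = nor2(inv(abcd), nand2(e, f))   # NOR of complements gives A.B.C.D.E.F
    return inv(abcdef)


def nand6(*bits):
    """Reference 6-input NAND, as realized by a single Skybridge gate."""
    return 1 - int(all(bits))


wiring_matches = all(
    cmos_network(*v) == nand6(*v) for v in product((0, 1), repeat=6)
)


# --- Q3: clock period, throughput and instruction execution delay ---
def throughput_ghz(period_ps):
    """Convert a clock period in picoseconds to a rate in GHz."""
    return 1000.0 / period_ps


cmos_period_ps = 300 + 20 + 0   # ALU delay + clock-to-Q + setup time
skybridge_period_ps = 100       # worst-case stage delay, no flip-flops

cmos_latency_ps = 5 * cmos_period_ps            # 5 pipeline stages
skybridge_latency_ps = 16 * skybridge_period_ps  # 16 pipeline stages

if __name__ == "__main__":
    print("Skybridge footprint (um^2):", skybridge_um2)
    print("CMOS transistors:", cmos_transistors, "footprint (um^2):", cmos_um2)
    print("Assumed wiring equals 6-input NAND:", wiring_matches)
    print("CMOS throughput (GHz):", throughput_ghz(cmos_period_ps))
    print("Skybridge throughput (GHz):", throughput_ghz(skybridge_period_ps))
    print("Latency (ps):", cmos_latency_ps, skybridge_latency_ps)
```

The exhaustive check over all 64 input combinations only confirms that this assumed wiring produces the same output as a 6-input NAND. It says nothing about whether it is the most compact 2-input realization.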
