COMP541
Memories - I
Montek Singh
Feb 25, 2010

Topics
  Midterm Test: Thursday after Spring Break
  Lab Preview: VGA character terminal
  Overview of Memory Types
    ROMs: PROMs, FLASH, etc.
    RAMs
  Random-Access Memory (RAM)
    Static today
    Dynamic next

Lab: VGA Display Driver Architecture
  No frame buffer
  Character terminal
  [Figure: block diagram with From/To CPU interface, Screen Character Memory, Bitmap Memory (bitmaps by rows), VGA Driver (RGB, HSync, VSync), and Timing Generator (Valid, VSync, HSync)]

Character Memory
  Dual ported
  Memory mapped
  CPU writes
    Could read also
  How many characters?
  [Figure: same block diagram, with the Screen Character Memory path to the CPU]

Bitmap Memory
  What bitmap size?
    5x7 at least
  Codes
    http://www.piclist.com/techref/datafile/charsets.htm
    http://www.piclist.com/techref/datafile/charset/8x8.htm
  Indexed by character memory
  So what code to store in character memory?
  What size should memory be?

VGA driver
  Just sends hsync, vsync
  Track current row/column
    Something the Timing Generator should provide the VGA Driver
  Generates color
    When valid
    Maybe smaller than VGA
  What character code? ASCII?
  How many rows and columns?
  [Figure: same block diagram]

Possibilities
  Code color into some bits of character?
  Other possibilities
    Sprites for games? Your own Nintendo
  Ideas?

RAM on FPGA
  Ours has 28 blocks, each 18Kb (bits, not bytes!)
  They call it block RAM
  Block RAM: one or two ports, and several possible layouts
  Often you’ll use it as a 16Kb RAM module

Using from Verilog
  It’s a primitive
  Instantiate a block (here called R1)

    RAMB16_S1 R1(
        .DO(out),     // 1-bit Data Output
        .ADDR(addr),  // 14-bit Address Input
        .CLK(clk),    // Clock
        .DI(in),      // 1-bit Data Input
        .EN(ena),     // RAM Enable Input
        .SSR(1'b0),   // Synchronous Set/Reset Input
        .WE(we)       // Write Enable Input
    );

4-Wide Block

    RAMB16_S4 RAMB16_S4_inst (
        .DO(DO),      // 4-bit Data Output
        .ADDR(ADDR),  // 12-bit Address Input
        .CLK(CLK),    // Clock
        .DI(DI),      // 4-bit Data Input
        .EN(EN),      // RAM Enable Input
        .SSR(SSR),    // Synchronous Set/Reset Input
        .WE(WE)       // Write Enable Input
    );

Wider Have Parity

    RAMB16_S18 RAMB16_S18_inst (
        .DO(DO),      // 16-bit Data Output
        .DOP(DOP),    // 2-bit parity Output
        .ADDR(ADDR),  // 10-bit Address Input
        .CLK(CLK),    // Clock
        .DI(DI),      // 16-bit Data Input
        .DIP(DIP),    // 2-bit parity Input
        .EN(EN),      // RAM Enable Input
        .SSR(SSR),    // Synchronous Set/Reset Input
        .WE(WE)       // Write Enable Input
    );

Can Initialize Block RAM

    RAMB16_S1 #(
        .INIT(1'b0),                // Value of output RAM registers at startup
        .SRVAL(1'b0),               // Output value upon SSR assertion
        .WRITE_MODE("WRITE_FIRST"), // WRITE_FIRST, READ_FIRST or NO_CHANGE
        // The following INIT_xx declarations specify the initial contents of the RAM
        // Address 0 to 4095
        .INIT_00(256'h0000000000000000000000000000000000000000000000000000000000000F1F),
        .INIT_01(256'h0000000000000000000000000000000000000000000000000000000000000000),
        …
        .INIT_3E(256'h0000000000000000000000000000000000000000000000000000000000000000),
        .INIT_3F(256'h0000000000000000000000000000000000000000000000000000000000000000)
    ) RAMB16_S1_inst (
        .DO(data),    // 1-bit Data Output
        .ADDR(addr),  // 14-bit Address Input
        .CLK(clk),    // Clock
        .DI(DI),      // 1-bit Data Input
        .EN(EN),      // RAM Enable Input
        .SSR(SSR),    // Synchronous Set/Reset Input
        .WE(WE)       // Write Enable Input
    );

  Note that addresses go right to left, top to bottom

Synthesizer Can Also Infer
  Careful how you specify (see XST manual).

    module inferRAM(clk, addr, din, data, we);
        input        clk;
        input  [8:0] addr;  // 512 locations
        input  [7:0] din;   // write data in (separate from the read output)
        output [7:0] data;  // by 8 bits
        input        we;

        reg [7:0] mem [511:0];
        reg [8:0] ra;       // registered read address

        always @(posedge clk) begin
            if (we) mem[addr] <= din;
            ra <= addr;
        end

        assign data = mem[ra];
    endmodule

Look at Test Code
  RAM testing example
  I’ll post online for tomorrow’s lab
  Note how memory values are specified
    Addresses go right-to-left, top-to-bottom
  See the Constraints Guide and Library manuals in Xilinx docs

Today’s lecture

Types of Memory
  Many dimensions
    Read Only vs Read/Write (or write seldom)
    Volatile vs Non-Volatile
    Requires refresh or not
  Look at ROM first to examine interface

Non-Volatile Memory Technologies
  Mask (old)
  Fuses (old)
  Electrically erasable

Details of ROM
  Memory that is permanent
  k address lines
  2^k items
  n bits

Notional View of Internals

Programmed Truth Table

Resulting Programming
  In truth, they’re laid out in 2D (row, col)

Mask ROMs
  Oldest technology
  Originally “mask” used as last step in manufacturing
    Specify metal layer (connections)
  Used for volume applications
  Long turnaround
  Used for applications such as embedded systems and, in the old days, boot ROM

Programmable ROM (PROM)
  First ones had fusible links
  High voltage would blow out links
  Fast to program
  Single use

UV EPROM
  Erasable PROM
  Common technologies used UV light to erase complete device
    Took about 10 minutes
  Holds state as charge in very well insulated areas of the chip
  Nonvolatile for several (10?) years

EEPROM
  Electrically Erasable PROM
  Similar technology to UV EPROM
    Erased in blocks by higher voltage
  Programming is slower than reading
  Some called flash memory
    Digital cameras, MP3 players, BIOS
  Limited life
  Some support individual word write, some block
  One on Xess board has 5 blocks
    Has a boot block that is carefully protected

How Flash Works
  Special transistor with floating gate
  This is part of device surrounded by insulation
    So charge placed there can stay for years
  Aside: some newer devices store multiple bits of info in a cell
  Interested in this? If so, we can cover in more detail w/ transistors

Read/Write Memories
  Flash is obviously writeable
    But not meant to be written rapidly (say at CPU rates)
    And often by blocks (disk replacement)
  On to RAM

Random Access Memories
  So called because it takes the same amount of time to address any particular location
    Not quite true for modern DRAMs
  First look at asynchronous static RAM
    Ones on Xilinx chip are synchronous
      Data available at clock edges, like registers
    One on board can be both

Simple View of RAM
  Of some word size n
  Some capacity 2^k
    k bits of address line
  Maybe have read line
    Strictly speaking may not need
  Have a write line

1K x 16 memory
  Variety of sizes
    From 1-bit wide
  Issue is no. of pins
  Memory size often specified in bytes
    This would be 2KB memory
  10 address lines and 16 data lines

Writing
  Sequence of steps
    Setup address lines
    Setup data lines
    Activate write line (maybe a pos edge)

Reading
  Steps
    Setup address lines
    Activate read line
    Data available after specified amt of time
  For async
    Synchronous memories use a clock

Chip Select
  Usually a line to enable the chip
  Why?
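The write sequence, read steps, and chip select described above can be pictured with a small behavioral model. This is only a simulation sketch of a generic 1K x 16 asynchronous SRAM, not a model of any real part: the module name `sram_1kx16`, the signal names, and the active-low polarities are all assumptions made for illustration.

```verilog
// Hypothetical behavioral model of a 1K x 16 asynchronous SRAM.
// Names and active-low polarities are assumptions, not from the slides.
module sram_1kx16 (
    input  wire [9:0]  addr,  // 10 address lines -> 2^10 = 1024 words
    inout  wire [15:0] data,  // 16 bidirectional data lines
    input  wire        cs_n,  // chip select (assumed active low)
    input  wire        we_n,  // write line (assumed active low)
    input  wire        oe_n   // read / output-enable line (assumed active low)
);
    reg [15:0] mem [0:1023];  // 1K words x 16 bits = 2KB

    // Reading: with address set up and the read line active, drive the bus.
    // When the chip is not selected (or is being written), float the bus.
    assign data = (!cs_n && !oe_n && we_n) ? mem[addr] : 16'bz;

    // Writing: address and data are set up first, then the write line is
    // pulsed; the word is captured on the rising edge that ends the pulse,
    // and only if this chip is selected.
    always @(posedge we_n)
        if (!cs_n)
            mem[addr] <= data;
endmodule
```

Because a deselected chip floats its data pins, several such chips can share one data bus, which is one plausible answer to the "Why?" on the Chip Select slide.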
Writing

Reading

Static vs Dynamic RAM
  SRAM vs DRAM
  DRAM stores charge in capacitor
    Disappears after short period of time
    Must be refreshed
  SRAM easier to use
    Uses transistors (think of it as latch)
    Faster
    More expensive per bit
    Smaller sizes

Structure of SRAM
  Control logic
  One memory cell per bit
    Cell consists of one or more transistors
    Not really a latch made of NANDs/NORs, but logically equivalent

Simple Organization
  In reality, more complex
  Note that only one wordline H at a time
  [Figure: array of stored bits. A 2:4 Decoder turns the 2-bit Address (00, 01, 10, 11) into wordline0 to wordline3; bitline0 to bitline2 carry Data0 to Data2]

Bit Slice
  Cells connected to form 1 bit position
  Word Select gates one latch from address lines
  Note it selects Reads also
  B (and B’) set by R/W, Data In and BitSelect
  Funny thing here when you write. What is it?

Bit Cells
  [Figure: bit cell with bitline, wordline, and stored bit. Example: with wordline = 1 the bitline carries the stored bit (0 or 1); with wordline = 0 the bitline is Z]

Bit Slice can Become Module
  Basically bit slice is a X1 memory
  Next

SRAM Bit Cell
  [Figure: SRAM bit cell showing bitline, wordline, and stored bit]

16 X 1 RAM “Chip”
  Now shows decoder

Row/Column
  If RAM gets large, there is a large decoder
  Also run into chip layout issues
  Larger memories usually “2D” in a matrix layout
  Next Slide

16 X 1 RAM as 4 X 4 Array
  Two decoders
    Row
    Column
  Address just broken up
  Not visible from outside on SRAMs

Change to 8 X 2 RAM
  Minor change in logic
  Also pinouts
  What’s different?

Realistic Sizes
  Imagine 256K memory as 32K X 8
  One column layout would need 15-bit decoder with 32K outputs!
  Can make a square layout with 9-bit row and 6-bit column decoders

SRAM Performance
  Current ones have cycle times in low nanoseconds (say 2.5ns)
  Used as cache (typically on-chip or off-chip secondary cache)
  Sizes up to 8Mbit or so for fast chips
  SRAMs also common for low power

Wider Memory
  What if you don’t have enough bit width?

Larger/Wider Memories
  Made up from sets of chips
  Consider a 64K by 8 RAM

Larger
  256K X 8
  Decoder for high-order 2 bits
    Selects chip
  Look at selection logic
  Address ranges
  Tri-state outputs

Deeper Memory
  Adding chips to increase storage, but keep same width
  Need decoder

Today
  Fast look at non-volatile memory
  Static RAM
  Next: Dynamic RAM
    Complex, largest, cheap
    Much more design effort to use
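As a rough illustration of the "Larger" slide, the sketch below builds a 256K x 8 memory from four 64K x 8 chips, using a 2:4 decoder on the high-order 2 address bits as chip select and tri-state outputs on a shared data-out bus. The module names `ram_64kx8` and `ram_256kx8`, the port names, and the clocked-write, asynchronous-read behavior are assumptions for illustration. It is written for simulation; FPGA tools may not map internal tri-states directly.

```verilog
// Hypothetical 64K x 8 chip: clocked write, asynchronous tri-stated read.
module ram_64kx8 (
    input  wire        clk,
    input  wire        cs,    // chip select (assumed active high)
    input  wire        we,    // write enable
    input  wire [15:0] addr,  // 16 address bits -> 64K locations
    input  wire [7:0]  din,
    output wire [7:0]  dout   // floats unless this chip is selected for a read
);
    reg [7:0] mem [0:65535];

    always @(posedge clk)
        if (cs && we)
            mem[addr] <= din;

    assign dout = (cs && !we) ? mem[addr] : 8'bz;
endmodule

// Hypothetical 256K x 8 memory made from four of the chips above.
module ram_256kx8 (
    input  wire        clk,
    input  wire        we,
    input  wire [17:0] addr,  // 18 address bits -> 256K locations
    input  wire [7:0]  din,
    output wire [7:0]  dout
);
    // 2:4 decoder on the high-order 2 bits: exactly one select is high.
    //   addr[17:16] = 00 -> chip0, addresses 0x00000 to 0x0FFFF
    //   addr[17:16] = 01 -> chip1, addresses 0x10000 to 0x1FFFF
    //   addr[17:16] = 10 -> chip2, addresses 0x20000 to 0x2FFFF
    //   addr[17:16] = 11 -> chip3, addresses 0x30000 to 0x3FFFF
    wire [3:0] cs = 4'b0001 << addr[17:16];

    // All chips see the low 16 address bits and share the dout bus;
    // only the selected chip drives it, the others stay at Z.
    ram_64kx8 chip0 (.clk(clk), .cs(cs[0]), .we(we), .addr(addr[15:0]), .din(din), .dout(dout));
    ram_64kx8 chip1 (.clk(clk), .cs(cs[1]), .we(we), .addr(addr[15:0]), .din(din), .dout(dout));
    ram_64kx8 chip2 (.clk(clk), .cs(cs[2]), .we(we), .addr(addr[15:0]), .din(din), .dout(dout));
    ram_64kx8 chip3 (.clk(clk), .cs(cs[3]), .we(we), .addr(addr[15:0]), .din(din), .dout(dout));
endmodule
```

The same pattern covers the "Deeper Memory" case: the width stays at 8 bits, and each added pair of chips costs one more decoded address bit.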