f11 lec 12 misc topics

Upload: ning2012

Post on 06-Apr-2018


  • 8/3/2019 F11 Lec 12 Misc Topics

    1/19

    Miscellaneous Topics

    EE M216A .:. Fall 2011

    Lecture 12

    Alireza Tarighat


    RAM Memory


    D. Markovic / Slide 3

    Types of Memory

    There are many types of memory

    Usually distinguished by storage type and access method

    Access methods

    Random access memory (RAM): any memory location can be accessed at the same speed

    Most common type of memory

    Content-addressable memory (CAM): memory is accessed by a search on its contents

    E.g. find the location where the upper byte is 250
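    The difference between the two access methods can be sketched in a few lines of Python. This is only a behavioural illustration, not a hardware description: the function name `cam_search`, the 16-bit word width and the memory contents are assumptions made for the example.

```python
from typing import List

def cam_search(memory: List[int], pattern: int, mask: int) -> List[int]:
    """Return every address whose word matches `pattern` on the bits set in `mask`."""
    return [addr for addr, word in enumerate(memory)
            if (word & mask) == (pattern & mask)]

# Hypothetical 16-bit memory contents, chosen only for illustration.
memory = [0x1234, 0xFA00, 0x00FA, 0xFA7C]

# RAM-style access: the address is given, the content comes back.
word_at_2 = memory[2]

# CAM-style access: the content is given, the matching addresses come back.
# 250 == 0xFA, placed in the upper byte of a 16-bit word.
hits = cam_search(memory, pattern=250 << 8, mask=0xFF00)
print(hits)  # [1, 3]
```

    A real CAM compares all words in parallel in one access; the list comprehension here only mimics the result of that search.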

    Memory Types

    Static (SRAM): read/write memory

    Dynamic (DRAM): read/write/refresh memory

    Read only (ROM) and read mostly (PROM = Programmable ROM, EEPROM = Electrically Erasable PROM)

    EE216A - Fall 2011 Misc. Topics | 3


    Semiconductor Memory Classification

    Read-Write Memory

      Random Access: SRAM, DRAM

      Non-Random Access: FIFO, LIFO, Shift Register, CAM

    Non-Volatile Read-Write Memory: EPROM, E²PROM, FLASH

    Read-Only Memory: Mask-Programmed, Programmable (PROM)


    Memory Architecture: Decoders

    Intuitive architecture for N x M memory: N words of M bits each, one select signal per word (S0 ... SN-1)

    Too many select signals: N words == N select signals

    Decoder reduces the number of select signals: K = log2(N) address bits (A0 ... AK-1) are decoded into S0 ... SN-1

    [Figure: two arrays of N words x M bits (Word 0 ... Word N-1, built from storage cells) with an M-bit input-output bus; one is driven directly by select signals S0 ... SN-1, the other through a decoder fed by address bits A0 ... AK-1]
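    The relation K = log2(N) and the one-hot behaviour of the decoder can be checked with a short Python sketch. The helper names `address_bits` and `decode` are hypothetical, and rounding K up for sizes that are not a power of two is an assumption of this example.

```python
from typing import List

def address_bits(n_words: int) -> int:
    """K = log2(N), rounded up when N is not a power of two."""
    return max(1, (n_words - 1).bit_length())

def decode(addr: int, n_words: int) -> List[int]:
    """One-hot decoder: drive select line S[addr] high and all others low."""
    if not 0 <= addr < n_words:
        raise ValueError("address out of range")
    return [1 if i == addr else 0 for i in range(n_words)]

K = address_bits(1024)     # 1024 select lines shrink to 10 address pins
selects = decode(5, 8)     # address 5 of an 8-word memory
print(K, selects)          # 10 [0, 0, 0, 0, 0, 1, 0, 0]
```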


    Array-Structured Memory Architecture

    Problem with one word per row: ASPECT RATIO, HEIGHT >> WIDTH

    [Figure labels: sense amplifiers "amplify swing to rail-to-rail amplitude"; column decoder "selects appropriate word"]


    Hierarchical Memory Architecture

    Advantages:

    1. Shorter wires within blocks

    2. Block address activates only 1 block => power savings
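    A minimal Python sketch of how a flat address could be sliced into block, row and column fields, and how only one block ends up enabled. The field widths, the MSB-to-LSB ordering and the names `split_address` and `block_enables` are assumptions for illustration; an actual memory may order the fields differently.

```python
from typing import List, NamedTuple

class SplitAddress(NamedTuple):
    block: int
    row: int
    column: int

def split_address(addr: int, block_bits: int, row_bits: int, col_bits: int) -> SplitAddress:
    """Slice a flat word address into block, row and column fields (MSB to LSB)."""
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = (addr >> (col_bits + row_bits)) & ((1 << block_bits) - 1)
    return SplitAddress(block, row, column)

def block_enables(block: int, n_blocks: int) -> List[bool]:
    """Only the addressed block is enabled; the other blocks stay idle."""
    return [i == block for i in range(n_blocks)]

# Hypothetical 4096-word memory: 4 blocks x 128 rows x 8 columns.
fields = split_address(0xABC, block_bits=2, row_bits=7, col_bits=3)
enables = block_enables(fields.block, n_blocks=4)
print(fields, enables)
```

    With these widths, address 0xABC maps to block 2, row 87, column 4, and only the third enable is true.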


    Read-Write Memories (RAM)

    STATIC (SRAM)

    Data stored as long as supply is applied

    Large (6 transistors/cell)

    Fast

    Differential

    DYNAMIC (DRAM)

    Periodic refresh required

    Small (1-3 transistors/cell)

    Slower

    Single Ended


    6-transistor CMOS SRAM Cell

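    [Figure: 6-transistor SRAM cell schematic with word line WL, bit line BL and its complement, supply VDD, transistors M1 to M6, and storage node Q with its complement]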


    3-Transistor DRAM Cell

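    No constraints on device ratios

    Reads are non-destructive

    Value stored at node X when writing a 1: $V_X = V_{WWL} - V_{Tn}$

    [Figure: 3T DRAM cell schematic (transistors M1, M2, M3; storage node X with capacitance CS; write word line WWL, read word line RWL; bit lines BL1 and BL2) and waveforms of WWL, RWL, X, BL1 and BL2, with levels VDD and VDD - VT and a swing ΔV marked]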


    RAM: Single-Port Access

    Typical RAM IO list

    CLK (common read/write clock)

    DIN (input)

    DOUT (output)

    ADDR (read/write address)

    EN (enable/disable)

    WR (write/read)

    Usually, memory compilers are available that generate a RAM of any size in any process. Once created, the RAM can be instantiated as an HDL module in the system.

    At any clock cycle, only one address is readable in a RAM block

    Single read address (ADDR); single output data (DOUT)

    At any clock cycle, either a write or a read can be performed, not both
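    The IO list above can be turned into a small cycle-level model in Python. This is a behavioural sketch under stated assumptions, not the output of any memory compiler: WR = 1 is taken to mean write, and DOUT is assumed to hold its previous value during write and disabled cycles (real RAM macros differ on this point).

```python
class SinglePortRAM:
    """One ADDR input, so each enabled clock edge is either a read or a write."""

    def __init__(self, n_words: int, width: int):
        self.mem = [0] * n_words
        self.mask = (1 << width) - 1
        self.dout = 0  # registered output, updated on read cycles only (assumption)

    def clock(self, en: int, wr: int, addr: int, din: int = 0) -> int:
        """Model one rising edge of CLK and return DOUT after the edge."""
        if en:
            if wr:
                self.mem[addr] = din & self.mask   # write cycle
            else:
                self.dout = self.mem[addr]         # read cycle
        return self.dout

ram = SinglePortRAM(n_words=32, width=32)
ram.clock(en=1, wr=1, addr=3, din=0xDEADBEEF)   # cycle 1: write
ram.clock(en=0, wr=1, addr=3, din=0)            # cycle 2: disabled, nothing changes
value = ram.clock(en=1, wr=0, addr=3)           # cycle 3: read back
print(hex(value))  # 0xdeadbeef
```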


    SRAM: Using standard DFF

    A RAM block can be built from standard DFFs, a decoder, and a MUX

    Typically larger than compact optimized SRAM implementations

    No need for memory compiler

    Implemented using standard cells

    Inefficient for large sizes


    DFF-Based RAM (All Std Cells): 32 words x 32 bits


    Different RAM Variations

    Single Port

    One Address (either READ or WRITE at a time)

    Smallest, most efficient with array-structured memory cells

    Dual Ports (RD/WR)

    RD_ADDR & WR_ADDR

    Read and write different locations at any cycle

    Dual-Read Ports

    WR_ADDR, RD_ADDR1, RD_ADDR2

    Read two locations at a time

    Not possible with array-structured memory cells

    Easily implementable with register-file structures

    There is no standard RAM IO definition

    Even single-port RAMs could have different IO variations


    Dual-Read-Port Register-File

    Common register-bank

    Duplicate muxes

    More routing
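    A behavioural Python sketch of the structure named above: one shared register bank, one write port, and two read muxes selecting from the same registers. The class and signal names (`RegisterFile`, `wr_en`, `rd_addr1`, `rd_addr2`) are assumptions for illustration.

```python
from typing import List, Tuple

class RegisterFile:
    """Common register bank with one write port and two read muxes."""

    def __init__(self, n_words: int):
        self.regs: List[int] = [0] * n_words

    def read(self, rd_addr1: int, rd_addr2: int) -> Tuple[int, int]:
        # Two independent muxes tap the same bank, so both reads happen together.
        return self.regs[rd_addr1], self.regs[rd_addr2]

    def clock(self, wr_en: int, wr_addr: int, din: int) -> None:
        # On the clock edge, the write decoder enables exactly one register.
        if wr_en:
            self.regs[wr_addr] = din

rf = RegisterFile(8)
rf.clock(1, 2, 11)
rf.clock(1, 5, 22)
a, b = rf.read(2, 5)
print(a, b)  # 11 22
```

    The second read port costs a second N-to-1 mux and its routing, not a second copy of the storage.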


    Dual Read/Write RAM

    Read and write two locations (different or the same) in one cycle

    A commonly required feature

    A dual-port RAM is almost twice as big as a single-port RAM when implemented with memory cells (inevitable for large sizes)

    Architectural solutions to avoid dual-port RAMs

    Memory partitioning and address management

    If an N-word dual-port RAM is required, implement the N addresses as two separate N/2-word single-port RAMs.

    As part of memory address management, make sure that simultaneous read and write addresses never fall in the same N/2-word block.

    This is possible in most applications, since read/write addresses aren't totally arbitrary and random!
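    The partitioning idea can be sketched as a Python model of two N/2-word single-port banks behind a one-read, one-write interface. The address-to-bank mapping (lower half in bank 0, upper half in bank 1) and the class name `BankedDualPort` are assumptions; the model raises an error when the address management rule is broken.

```python
from typing import Tuple

class BankedDualPort:
    """Two N/2-word single-port banks that together look like a dual-port RAM."""

    def __init__(self, n_words: int):
        if n_words % 2:
            raise ValueError("n_words must be even")
        self.half = n_words // 2
        self.banks = [[0] * self.half, [0] * self.half]

    def _locate(self, addr: int) -> Tuple[int, int]:
        # Assumed mapping: lower half of the address space in bank 0, upper half in bank 1.
        return addr // self.half, addr % self.half

    def cycle(self, wr_addr: int, din: int, rd_addr: int) -> int:
        """One clock cycle with a simultaneous write and read."""
        wr_bank, wr_off = self._locate(wr_addr)
        rd_bank, rd_off = self._locate(rd_addr)
        if wr_bank == rd_bank:
            # Each bank has a single port, so this access pattern is illegal.
            raise ValueError("read and write hit the same bank in one cycle")
        dout = self.banks[rd_bank][rd_off]
        self.banks[wr_bank][wr_off] = din
        return dout

mem = BankedDualPort(16)
mem.cycle(wr_addr=0, din=7, rd_addr=8)          # fill the lower half, read the upper half
out = mem.cycle(wr_addr=8, din=9, rd_addr=0)    # roles swapped: read back the 7
print(out)  # 7
```

    A ping-pong buffer is one access pattern that satisfies the rule naturally: one half is filled while the other half is drained, then the roles swap.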


    Dual Read/Write RAM

    Architectural solutions to avoid dual-port RAMs

    Memory partitioning and address management

    If read/write addresses aren't totally arbitrary!

    Implement a dual-port RAM by running a single-port RAM at twice the speed!

    Assume that at every clock cycle (period Tclk), ADDR_WR is to be updated and ADDR_RD is to be read.

    Run a single-port RAM at twice the clock frequency: in the first Tclk/2, update ADDR_WR; in the second Tclk/2, read ADDR_RD.

    It looks and feels like a dual-port RAM!

    Whenever possible, avoid dual-port RAMs

    There is a factor of 2 saving in area!
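    The double-speed trick can be modelled in Python as two single-port accesses per system clock cycle. This is a behavioural sketch; the class name `PseudoDualPortRAM` and the counter `fast_accesses` are hypothetical, and the write-then-read order follows the description above.

```python
from typing import Optional

class PseudoDualPortRAM:
    """Single-port array clocked at 2x, presented as one write plus one read per Tclk."""

    def __init__(self, n_words: int):
        self.mem = [0] * n_words
        self.fast_accesses = 0   # number of fast-clock (Tclk/2) accesses performed

    def _access(self, wr: bool, addr: int, din: int = 0) -> Optional[int]:
        """One access on the fast clock: a read or a write, never both."""
        self.fast_accesses += 1
        if wr:
            self.mem[addr] = din
            return None
        return self.mem[addr]

    def cycle(self, addr_wr: int, din: int, addr_rd: int) -> int:
        """One system clock cycle of period Tclk."""
        self._access(wr=True, addr=addr_wr, din=din)   # first Tclk/2: update ADDR_WR
        return self._access(wr=False, addr=addr_rd)    # second Tclk/2: read ADDR_RD

ram = PseudoDualPortRAM(16)
result = ram.cycle(addr_wr=3, din=42, addr_rd=3)
print(result, ram.fast_accesses)  # 42 2
```

    Because the write lands in the first half-cycle, a read of the same address in the same cycle returns the new data in this model.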


    Hardware Reuse


    If the fastest clock achievable in a technology process is much faster than the desired throughput:

    This can be exploited to aggressively reduce logic area

    Large physical modules such as multipliers can be reused multiple times

    Several logical multipliers are implemented using the same physical multiplier

    Example: FIR Filter


    Hardware Reuse: FIR Filter

    $$\mathrm{dout}(n) = \sum_{k=0}^{L-1} h_k \, \mathrm{din}(n-k)$$

    [Figure: 5-tap FIR filter with inputs din(n) ... din(n-4) and coefficients h0 ... h4, built with a state-machine sequencer and FFs; timing diagram of clk, din (din[0], din[1]) and dout (dout[0], dout[1])]
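    A Python sketch of the reuse idea for the FIR sum dout(n) = sum over k of h_k * din(n-k): one physical multiplier is stepped through the L taps by a sequencer, using L fast clock cycles per output sample. The function names, the zero initial state of the delay line and the sample values are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

def fir_direct(x: Sequence[int], h: Sequence[int]) -> List[int]:
    """Reference with L logical multipliers; samples before n = 0 are taken as zero."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def fir_reuse(x: Sequence[int], h: Sequence[int]) -> Tuple[List[int], int]:
    """Same filter with ONE multiplier, used once per fast clock cycle."""
    taps = len(h)
    delay = [0] * taps                # delay line holding din(n) ... din(n-L+1)
    out, fast_cycles = [], 0
    for sample in x:                  # one iteration per output sample period
        delay = [sample] + delay[:-1]
        acc = 0                       # accumulator, cleared by the sequencer
        for k in range(taps):         # sequencer state k selects h[k] and din(n-k)
            acc += h[k] * delay[k]    # the single shared multiplier
            fast_cycles += 1
        out.append(acc)
    return out, fast_cycles

h = [1, 2, 3, 2, 1]
x = [1, 0, 0, 0, 0, 0, 4]
y_ref = fir_direct(x, h)
y, cycles = fir_reuse(x, h)
print(y, cycles)  # [1, 2, 3, 2, 1, 0, 4] 35
```

    The outputs match the direct form, at the cost of L = 5 fast cycles per sample, so the approach applies when the achievable clock is at least L times the sample rate.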

    Clock Domain Crossing


    CDC: Clock Domains

    [Figure panels: Single Clock Domain; Multiple Clock Domain]


    CDC: Metastability


    CDC: Guaranteed Setup/Hold Violation

    When 2 or more designs run on disparate clocks:

    The clocks will continually skew, guaranteeing setup/hold violations

    Signals from one design to another are Clock Domain Crossings (CDCs)

    Signals that cross asynchronous clock domains (CDC signals) WILL violate setup and hold conditions

    [Figure: Sensor System and Guidance System blocks on separate clocks (Clock A, Clock B), connected by a clock domain crossing signal (Tx) into a flip-flop (D, CLK, Q); the waveform marks the setup/hold window]


    CDC: Guaranteed Setup/Hold Violation

    Simulation Does NOT Reflect Silicon Behavior

    Propagation from D to Q has an ambiguity of 1 clock cycle!

    Setup Violation: simulation captures a 1 while silicon produces either a 1 or 0

    Hold Violation: simulation captures a 0 while silicon produces either a 1 or 0

    [Figure: D, CLK and Q waveforms for each case, comparing Q in simulation with Q in silicon]


    CDC: Data Uncertainty


    CDC: Divergent Paths

    FSM1_EN and FSM2_EN have different profiles although they are both derived from the same input signal.


    CDC: Metastability

    A synchronization FF is used when going from the CLKA domain to the CLKB domain.

    Double sampling can lower the probability of metastability

    DB2 can then be used in downstream logic on clock domain CLKB

    Although metastability can be solved by double FFs, other problems with CDC still persist!
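    A Python sketch of double sampling, modelling a 0 to 1 step that crosses into CLKB through two flip-flops. Metastability is not simulated electrically: as a simplifying assumption, a first flip-flop sampled inside its setup/hold window resolves randomly to 0 or 1 before the next CLKB edge. The names `synchronize`, `window` and the timing values are hypothetical, and `window` is assumed to be smaller than half the CLKB period.

```python
import random
from typing import List

def synchronize(change_time: float, t_clkb: float, window: float,
                n_edges: int, rng: random.Random) -> List[int]:
    """Return the second flip-flop output (DB2) after each CLKB edge."""
    db1 = db2 = 0
    trace = []
    for i in range(n_edges):
        edge = i * t_clkb
        db2 = db1                        # second FF samples the settled first FF
        if abs(edge - change_time) < window:
            db1 = rng.choice([0, 1])     # setup/hold violated: either value may be captured
        else:
            db1 = 1 if edge > change_time else 0
        trace.append(db2)
    return trace

rng = random.Random(0)
clean = synchronize(change_time=10.3, t_clkb=4.0, window=0.5, n_edges=6, rng=rng)
racy = synchronize(change_time=12.1, t_clkb=4.0, window=0.5, n_edges=6, rng=rng)
print(clean)  # [0, 0, 0, 0, 1, 1]
print(racy)   # the value at index 4 depends on how the first FF resolved
```

    DB2 always carries a clean logic level in this model, but the cycle in which the step appears can differ by one CLKB period, which is the kind of CDC problem that double FFs alone do not remove.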


    CDC: Timing Closure Across Two Clock Domains

    Enforce and guarantee a timing condition between the two clock domains.

    Example: Same C1/C2 Frequency

    If tskew and setup/hold for A with respect to C1 can be constrained, the opposite edge of C2 can be used to safely sample A and transfer it to the C2 domain.

    [Figure: waveforms of c1, A and c2, with tskew marked between c1 and c2]


    CDC: Timing Closure Across Two Clock Domains

    Opposite-edge sampling is viable only if tskew and setup/hold are less than Tclk/2.

    Very effective and robust.

    Once implemented, it works for any clock period larger than the original design spec (independent of clock period).

    If timings are not met, increasing the clock period can eventually make the system work!

    No double sampling required!
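    The Tclk/2 condition can be made concrete with a small margin calculation in Python. The timing model here is an assumption for illustration, not taken from the slides: A is launched on the rising edge of C1 with a clock-to-Q delay between `t_cq_min` and `t_cq_max`, C2 may lead or lag C1 by up to `t_skew_max`, and A is sampled on the falling edge of C2. All numbers are hypothetical and in nanoseconds.

```python
from typing import Tuple

def opposite_edge_margins(t_clk: float, t_skew_max: float, t_cq_min: float,
                          t_cq_max: float, t_setup: float, t_hold: float) -> Tuple[float, float]:
    """Setup and hold margins for sampling A on the falling edge of C2."""
    half = t_clk / 2.0
    # Earliest sampling edge: Tclk/2 - skew after launch; A must be stable t_setup before it.
    setup_margin = (half - t_skew_max) - (t_cq_max + t_setup)
    # Latest sampling edge: Tclk/2 + skew; A changes again no earlier than Tclk + t_cq_min.
    hold_margin = (t_clk + t_cq_min) - (half + t_skew_max + t_hold)
    return setup_margin, hold_margin

def opposite_edge_ok(**timing: float) -> bool:
    """True when both margins are non-negative."""
    return min(opposite_edge_margins(**timing)) >= 0.0

timing = dict(t_skew_max=0.6, t_cq_min=0.1, t_cq_max=0.3, t_setup=0.2, t_hold=0.1)
ok_at_2ns = opposite_edge_ok(t_clk=2.0, **timing)   # setup margin is negative
ok_at_4ns = opposite_edge_ok(t_clk=4.0, **timing)   # both margins are positive
print(ok_at_2ns, ok_at_4ns)  # False True
```

    Both margins grow by half of any increase in Tclk, which is why slowing the clock can eventually make a failing transfer work under this model.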


    CDC: Asynchronous Clocks; EN Transfer

    If clock synchronization is not possible, design a system/architecture that is robust to a 1-2 clock cycle uncertainty in data transfer

    Example:

    Assume signal EN is passed from CLK1 to CLK2. EN is used in the CLK2 domain to start a counter. There are two counters (one in the CLK1 domain and the other in the CLK2 domain) that would be fully synchronized in the ideal case.

    Use double sampling to eliminate metastability

    Design your system such that a mismatch of a few clock cycles between the two domains wouldn't cause a malfunction in the overall operation.

    [Figure: waveforms of CLK1, EN and CLK2, with tskew marked]


    CDC: Asynchronous Clocks; Data Transfer

    For data transfer scenarios, the CDC scheme should guarantee the following:

    Correct sampling of the first data sample

    No data value can be dropped or repeated

    Using a faster clock frequency in the destination domain can generally help with data transfer.

    A factor of 3 or 4 is generally sufficient

    Example:

    Use both EN and DATA to transfer DATA from CLK1 to CLK2
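    One possible reading of the EN plus DATA scheme, sketched as a Python time-step simulation. Several details are assumptions rather than the lecture's circuit: EN toggles each time a new word is launched on CLK1, the CLK2 side double-samples EN into EN1 and EN2, and a change on EN2 makes the state machine capture DATA into DATA2. Clock periods are integer time units and metastability is not modelled.

```python
from typing import List

def transfer(words: List[int], t1: int, t2: int, phase: int) -> List[int]:
    """Move `words` from CLK1 (period t1) to CLK2 (period t2, first edge at `phase`)."""
    last = len(words) - 1
    en1 = en2 = 0                        # synchronizer FFs; EN idles at 0
    captured = []
    t = phase
    end = (len(words) + 1) * t1          # run one extra CLK1 period to flush the last word
    while t < end:                       # one iteration per CLK2 rising edge
        k = min(t // t1, last)           # word currently on DATA (held after the last one)
        en_now = (k + 1) % 2             # EN toggles with every new word
        prev_en2 = en2
        en2, en1 = en1, en_now           # shift EN -> EN1 -> EN2
        if en2 != prev_en2:              # state machine: EN2 changed, DATA is new and stable
            captured.append(words[k])    # DATA2 <= DATA
        t += t2
    return captured

words = [10, 20, 30, 40]
fast = transfer(words, t1=12, t2=4, phase=1)    # destination clock 3x faster
slow = transfer(words, t1=12, t2=10, phase=1)   # destination clock barely faster
print(fast)  # [10, 20, 30, 40]
print(slow)  # words are dropped or repeated
```

    With the destination clock three times faster, every word is captured exactly once while DATA is stable; when the two periods are close, the toggles on EN2 no longer line up with the words and values are dropped or repeated.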


    CDC: Asynchronous Clocks; Data Transfer

    [Figure: block diagram. Clk1 domain: DFF1 registers EN and DFF2 registers DATA. Clk2 domain: DFF1 and DFF2 sample EN to produce EN1 and EN2; a state machine and an enabled register (D, Q, EN) on Clk2 produce DATA2]


    CDC: Asynchronous Clocks; Data Transfer

    [Figure: timing diagram of clk1, DATA (values D1, D2), EN, clk2, EN1, EN2 and DATA2 (values D1, D2)]