Wafer Acceptance Test (WAT)

Wen Detong. Integrated Circuit Manufacturing Process and Engineering Applications (集成电路制造工艺与工程应用). China Machine Press, 2018

Wafer acceptance testing (WAT) is also known as process control monitoring (PCM).

image-20250802091601281


image-20250802101539555

Short Lg Stackgate

TSMC. VLSI2025 JFS2-1: Analog Cells DTCO (Design and Technology Co-Optimization) and Their Impact on Advanced Node CMOS Analog/MixedSignal Circuits

image-20250719221634273

A short-Lg stacked gate achieves the same mismatch with a smaller W*L*M and a smaller X*Y footprint.


image-20250719221918689

N7/N5 4-fin Grid Rule

Same Fin1/Fin3 or Fin2/Fin4 Fin Position


image-20250719222047276

image-20250719222146669

Note that W/L differs: \(12/(135 \times 2) \lt 6/(8 \times 8)\)

Current Density (EM)

image-20250712144939547

image-20250712145414052

Interconnect Resistance Evolution

image-20250703232709089

White Paper: Microelectronics/Semiconductor Research Community Virtual Workshop 2022 [https://nnci.net/sites/default/files/inline-files/Microelectronics%202022%20Workshop%20Report%20with%20Slides.pdf]

Copper Pillar Bump vs Solder bump

Cu-pillar bumping is a next-generation flip-chip interconnection between chip and package, especially suited to fine-pitch applications.

img

img

  • On the wafer end, compared with solder bumps, Cu-pillar bumps offer a finer pitch; the die size can be reduced by about 5 to 10%.

  • On the package end, the substrate can be reduced from 6 layers to 4 layers by using fine pitch, a bump-on-trace process, and a simplified substrate process.

image-20250613233806417

Why Your Symmetric Layouts Are Showing Mismatches in SPICE Simulations

[https://www.ansys.com/blog/symmetric-layouts-showing-mismatches-spice-simulations]

figure-2

The root cause of the delay mismatch is related to how parasitic extraction tools distribute coupling capacitances over the nodes of the resistive networks

The most likely reason for such asymmetry is the anisotropy of computational geometry algorithms used by extraction tools.

figure-4

STRAP

A "strap" refers to a low-impedance connection; in the rules below it is the tap that ties a well or the substrate to a defined potential.

image-20230518001007350

NWDMY = NWDMY1, NWDMY2

STRAP = NWSTRAP or PWSTRAP

NWSTRAP = {NP & OD} & {NW not {NW INTERACT NWDMY}}

PWSTRAP = {PP & OD} not NW

| cell | pin PLUS | pin MINUS |
|---|---|---|
| N diode | PWSTRAP | \ |
| P diode | \ | NWSTRAP |
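The NWSTRAP derivation above combines polygon selection with Boolean layer operations. As a rough illustration only, the Python sketch below mimics the well-selection part, NW not {NW INTERACT NWDMY}, on axis-aligned rectangles. The rectangle coordinates, the function names, and the rule that touching or overlapping counts as interaction are assumptions made for this sketch; it is not Calibre SVRF and it does not model the {NP & OD} intersection.

```python
# Rectangles are (x1, y1, x2, y2). All layer contents below are made-up examples.

def interacts(a, b):
    """True if rectangles a and b overlap or touch (assumed meaning of 'interact')."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def select_interact(layer_a, layer_b):
    """Shapes of layer_a that interact with at least one shape of layer_b."""
    return [a for a in layer_a if any(interacts(a, b) for b in layer_b)]

def select_not_interact(layer_a, layer_b):
    """Shapes of layer_a that interact with no shape of layer_b."""
    return [a for a in layer_a if not any(interacts(a, b) for b in layer_b)]

NW = [(0, 0, 10, 10), (20, 0, 30, 10)]   # two hypothetical N-wells
NWDMY = [(22, 2, 28, 8)]                 # dummy marker over the second well

# NW not {NW INTERACT NWDMY}: keep only wells that carry no dummy marker
real_nw = select_not_interact(NW, NWDMY)
print(real_nw)
```

Only the first well survives the selection, so only N+ OD inside that well would be recognized as NWSTRAP in this toy model.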

Calibre Rule::NOT

image-20230518005758993

Calibre Rule::INTERACT

image-20230518010124496

image-20230518010758342

Antenna Effect

The antenna effect is a common name for the effects of charge accumulation in isolated nodes of an integrated circuit during its processing

This effect is also sometimes called "Plasma Induced Damage", "Process Induced Damage" (PID) or "charging effect"

This accumulation of charge is usually, and misleadingly, called the antenna effect.

antenna ratio

During manufacture, if part of the metal wiring is connected to the gate but not to a diffusion contact, this "floating" metal collects charge from the plasma.

Manufacturing rules for the antenna effect are usually expressed as the ratio of the area of floating metal (i.e. charge collection area) to the area of the gate.

image-20250714203610809

To prevent the antenna effect from destroying your circuit, you need to reduce the ratio of floating metal area to gate area, or give the charge a safe way to dissipate to ground before it can build up and cause damage.
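The ratio check described above is simple arithmetic. The following Python sketch illustrates it; the limit of 300 and all areas are hypothetical numbers chosen for the example, not values from any foundry rule deck.

```python
def antenna_ratio(floating_metal_area_um2, gate_area_um2):
    """Ratio of charge-collecting metal area to the gate area it is connected to."""
    return floating_metal_area_um2 / gate_area_um2

def violates(floating_metal_area_um2, gate_area_um2, max_ratio):
    """True if the ratio exceeds the (process-specific) limit."""
    return antenna_ratio(floating_metal_area_um2, gate_area_um2) > max_ratio

MAX_RATIO = 300.0        # hypothetical limit
GATE_AREA = 0.125        # um^2, hypothetical gate

long_route = 50.0        # um^2 of metal tied to the gate before any diffusion contact
after_jumper = 5.0       # um^2 left on the gate side after a jumper near the gate

print(antenna_ratio(long_route, GATE_AREA), violates(long_route, GATE_AREA, MAX_RATIO))
print(antenna_ratio(after_jumper, GATE_AREA), violates(after_jumper, GATE_AREA, MAX_RATIO))
```

Shrinking the metal that is attached to the gate while it is still floating brings the ratio back under the limit, which is the effect that metal jumping aims for.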

metal jumping (bridging, metal hopping)

A long metal run can be moved up to a higher metal routing layer, which is known as metal jumping.

The jump is usually made near the gate, so that there is a full connection to the diffusion contact before the area of floating metal becomes too large.

The jumper is constructed so that the long track is only connected to the gate once it has also been connected to a diffusion contact, which then allows the charge to dissipate through diffusion to the substrate

Diode Insertion

A diode helps dissipate the charge accumulated on the metal. It should be placed as close as possible to the gate of the device and connected on a low metal level.

image-20250714204033328

main-qimg-c3fe57dfac5fd5e5b5616ddf4f89f08a-pjlq

In the reverse-bias region, the reverse saturation current of Si and Ge diodes approximately doubles for every \(10\,^\circ\text{C}\) rise in temperature.
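The doubling rule of thumb above can be written as a one-line scaling law. The sketch below assumes an ideal doubling every 10 °C and a hypothetical reference current; real diodes only follow this approximately.

```python
def reverse_saturation_current(i_s_ref, t_ref_c, t_c, doubling_step_c=10.0):
    """Scale a reference reverse saturation current, doubling every doubling_step_c degrees."""
    return i_s_ref * 2.0 ** ((t_c - t_ref_c) / doubling_step_c)

I_S_25C = 1e-12   # A, hypothetical value at 25 degC
i_s_125c = reverse_saturation_current(I_S_25C, 25.0, 125.0)

# A 100 degC rise corresponds to ten doublings
print(i_s_125c / I_S_25C)
```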

image-20250719083520735


pulsic.com, Analog layout – Stop the antenna effect from destroying your circuit [link]

Prof. Adam Teman, Digital VLSI Design. Lecture-10-The-Manufacturing-Process [pdf]

Zongjian Chen, Processing and Reliability Issues That Impact Design Practice. [https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/Old/lect_15_2up.pdf]

Shallow Trench Isolation (STI)

image-20241121211242335

image-20241121211348053

Voltage-Dependent DRC

In the T* DRC deck, the voltage difference between two neighboring nets is calculated from the voltage recognition CAD layers and the net connectivity, using the following formula:

\[ \Delta V = \max(V_H(\text{net1})-V_L(\text{net2}), V_H(\text{net2})-V_L(\text{net1})) \]

where \(V_H(\text{netx}) = \max(V(\text{netx}))\) and \(V_L(\text{netx}) = \min(V(\text{netx}))\)

  • \(\Delta V\) is 0 if the two nets are connected and therefore at the same potential
  • If \(V_L \gt V_H\) on a net, DRC reports a warning on that net
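As a small worked example of the formula above, the Python sketch below evaluates \(\Delta V\) for two nets. Representing a net as a `(v_low, v_high)` tuple, the function names, and the example voltages are assumptions for illustration; the actual deck derives these values from the voltage recognition layers.

```python
def delta_v(net1, net2, same_potential=False):
    """Voltage difference between two nets; each net is (V_L, V_H)."""
    if same_potential:
        return 0.0            # connected nets are at the same potential
    v_l1, v_h1 = net1
    v_l2, v_h2 = net2
    return max(v_h1 - v_l2, v_h2 - v_l1)

def has_range_warning(net):
    """True if the recognized low voltage exceeds the high voltage on a net."""
    v_low, v_high = net
    return v_low > v_high

core_net = (0.0, 0.75)   # hypothetical core-domain net
io_net = (0.0, 1.5)      # hypothetical IO-domain net

print(delta_v(core_net, io_net))
print(delta_v(core_net, core_net, same_potential=True))
print(has_range_warning((1.5, 0.75)))
```

The spacing rule is then selected from the larger of the two cross differences, here driven by the IO net's high level against the core net's low level.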

Voltage recognition CAD Layer

Automate those voltage-dependent DRC checks! - siemens

Two methods:

  1. voltage text layer

    Place specific voltage text on a specific drawing layer.

  2. voltage marker layer

    Each voltage marker layer represents a different voltage for a specific drawing layer.

The voltage text layer has higher priority than the voltage marker layer and is the recommended method.

voltage text layer

For example, M3:

| Process Layer | CAD Layer# | Voltage High | Voltage High Top (highest priority) | Voltage Low | Voltage Low Top (highest priority) |
|---|---|---|---|---|---|
| M3 | 63 | 110 | 112 | 111 | 113 |

where 63 is the layer number and 110 to 113 are the datatypes

voltage marker layer

Different datatypes represent different voltages, for example:

| DataType | 100 | 101 | 102 | ... | 109 |
|---|---|---|---|---|---|
| Voltage (V) | 0.0 | 0.1 | 0.2 | ... | 0.9 |

Example

image-20220503171006936

drain & source sharing

Planar process vs. FinFet process

local_Interconnect.drawio

Standard Cell Tapcell

tapcell.drawio

Guard Ring in Custom block

Place well ties and substrate ties where they are needed. Redundant guard rings consume area and lengthen the routing of critical signal nets.

guardring_stypes.drawio

Continuous OD

Performance & Matching

image-20220219223723289

current mirror

split diffusion with dummy transistors

mirror_continuous_OD_split_with_dummy.drawio

cascode structure

off transistor split diffusion

cascode_continuous_OD_split_with_dummy.drawio

sharing source & drain

sharing_SD.drawio

Stacked MOSFETs

LDE (Layout Dependent Effects)

Vladimír Stejskal, Jiří Slezák March, 2016. LOD Effect: Modeling and Implementation [https://www.mos-ak.org/dresden_2016/presentations/T5_Stejskal_MOS-AK_Dresden_2016.pdf]

John Faricelli – April 16, 2009. Layout-Dependent Proximity Effects in Deep Nanoscale CMOS [https://ewh.ieee.org/r5/denver/sscs/Presentations/2009_04_Faricelli.pdf]

吉富貞幸. 2021年7月29日. 高周波RFCMOS回路を実現する半導体素子のコンパクトモデリング技術 [https://kobaweb.ei.st.gunma-u.ac.jp/lecture/20210729_analog_KIOXIA_Yoshitomi.pdf]

Kanamoto, Toshiki, Yasuhiro Ogasahara, Keiko Natsume, Kenji Yamaguchi, Hiroyuki Amishiro, Tetsuya Watanabe and Masanori Hashimoto. “Impact of well edge proximity effect on timing.” ESSDERC 2007 - 37th European Solid State Device Research Conference (2007)

J. V. Faricelli, "Layout-dependent proximity effects in deep nanoscale CMOS," IEEE Custom Integrated Circuits Conference 2010, San Jose, CA, USA, 2010 [https://sci-hub.se/10.1109/CICC.2010.5617407]

Aleksandr Sidun, Layout-dependent effects (LOD, WPE, Latch-up, Electromigration, Antenna) [https://analoghub.ie/category/Layout/article/layoutDependentEffects]

image-20251009231152210

Length of Diffusion (LOD)

Shallow Trench Isolation Stress

LOD key points:

  • The LOD effect results from shallow trench isolation (STI) formation;
  • The STI becomes compressive as the wafer cools down;
  • The width of the STI (active-to-active spacing) has a strong impact on the stress;
  • The resulting stress improves hole mobility and decreases electron mobility.

image-20251010201258948

Stress has been more effective for PMOS

  • This has caused beta (N/P) ratio to fall to about unity at 7nm

image-20251009233811877

image-20251009234239279

LOD effect can be prevented by distancing devices away from the WELL edge (guard ring). This is usually done by placing dummy devices around the circuit devices, in which case your circuit devices will also benefit from the equal edge effects (each device will have the same neighbours).

image-20251009235236169

Well Proximity Effect (WPE)

Since the well implant dopant (acceptor or donor) is the same type as the channel implant dopant, the additional doping increases the absolute value of the threshold voltage (VT) of both NMOS and PMOS devices

image-20251009234116191

img

Gate Cut Stress LDE

image-20251010202621882

Metal Boundary Effect (MBE)

M. Hamaguchi et al., "New layout dependency in high-k/Metal Gate MOSFETs," 2011 International Electron Devices Meeting, Washington, DC, USA, 2011 [https://sci-hub.st/10.1109/IEDM.2011.6131614]

Alvin Loke. 2016 VLSI Circuits Short Courses – 2.2 Migrating Analog/Mixed-Signal Designs to FinFET Alvin Loke / Qualcomm [pdf]

Gate = (ALD MG stack to set \(\Phi_M\)) + (metal fill to reduce \(R_G\))

image-20251010201413941

image-20251010201527553

image-20251010202132434

Matching

Aleksandr Sidun. Matching patterns in layout [https://analoghub.ie/category/Layout/article/layoutMatchingPatterns]

—. Matching in layout [https://analoghub.ie/category/Layout/article/layoutMatching]

image-20251011205334957

Interdigitation

Interdigitation provides good matching against 1D gradients and is suitable for simple circuits.

The main concept is to draw an imaginary center line and place the devices symmetrically about it. The simplest example is the so-called "ABBA" pattern.

image-20251011205815529

image-20251011205608993


Interdigitation reduces device mismatch because the matched devices suffer equally from process variations in the X dimension. This technique was used to lay out current mirrors and resistors in PTAT and BGR circuits.
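To see why a symmetric pattern cancels a 1D gradient, the sketch below applies a linear gradient along X to a few finger orderings and compares the average parameter seen by devices A and B. The gradient slope, the nominal value, and the unit finger pitch are arbitrary assumptions; only the relative result matters.

```python
def mean_parameter(pattern, device, p0=1.0, gradient=0.01):
    """Average parameter value of one device's fingers under a linear 1D gradient."""
    positions = [x for x, d in enumerate(pattern) if d == device]
    return sum(p0 + gradient * x for x in positions) / len(positions)

def mismatch(pattern, p0=1.0, gradient=0.01):
    """Difference between the average parameter of device A and device B."""
    return mean_parameter(pattern, "A", p0, gradient) - mean_parameter(pattern, "B", p0, gradient)

for pattern in ("AABB", "ABAB", "ABBA"):
    print(pattern, f"{abs(mismatch(pattern)):.4f}")
```

Under these assumptions the side-by-side placement (AABB) shows the largest mismatch, simple alternation (ABAB) halves it, and ABBA cancels the linear term because both devices share the same center of mass.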

HTML5 Icon

Common Centroid

Common centroid provides better matching against 2D gradients, which is critical for large arrays and advanced (below 28 nm) nodes.

The main idea behind common centroid is to make the array symmetrical about a common centre. In other words, the array should be symmetrical in both the X and Y axes.

image-20251011210113327


In the common centroid technique, the n blocks to be matched are arranged symmetrically around a common centre, at equal distances from it. This technique offers the best device matching because it helps cancel cross-chip gradients.
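A quick way to check a candidate array is to compute the centroid of each device's unit cells and confirm that they coincide. The sketch below does this for two toy 2x2 placements; the grid encoding (one character per unit cell) is an assumption for illustration.

```python
def centroid(grid, device):
    """Centroid (x, y) of all unit cells belonging to one device."""
    cells = [(x, y) for y, row in enumerate(grid) for x, d in enumerate(row) if d == device]
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)

def is_common_centroid(grid, devices=("A", "B")):
    """True if every listed device has the same centroid."""
    first = centroid(grid, devices[0])
    return all(centroid(grid, d) == first for d in devices[1:])

cross_coupled = ["AB",
                 "BA"]
stacked_rows = ["AA",
                "BB"]

print(centroid(cross_coupled, "A"), centroid(cross_coupled, "B"), is_common_centroid(cross_coupled))
print(centroid(stacked_rows, "A"), centroid(stacked_rows, "B"), is_common_centroid(stacked_rows))
```

The cross-coupled placement shares one centroid in both axes, while the stacked-rows placement matches only in X and would still see a gradient along Y.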

HTML5 Icon

Design with FinFETs

image-20221210165644336

image-20221210165916985

Mark Williams. Stacked MOSFETs in Analog Layout [https://community.cadence.com/cadence_blogs_8/b/cic/posts/stacked-mosfets-in-analog-layout]

Modeling Consideration

image-20221217152830191

image-20221210170042233

mos_pro

\[\begin{align} R_{d1} &\propto \frac{1}{N_{fins}} \\ R_{s1} &\propto \frac{1}{N_{fins}} \\ R_{g1} &\propto N_{fins} \\ C_{gd} &\propto N_{fins} \cdot N_{fingers} \cdot N_{multiplier} \\ C_{gs} &= C_{gd} \\ C_{g1d} &\propto N_{fins} \\ C_{g1s} &= C_{g1d} \\ C_{g1d1} &\propto N_{fins} \\ C_{g1s1} &= C_{g1d1} \\ C_{g1d1} &\simeq 2\times C_{g1d} \end{align}\]

image-20230708221056420

PODE & CPODE

PODE devices are extracted as parasitic devices in the post-layout netlist.

image-20220213172653116

DDB is the PODE (Poly on OD/Diffusion Edge) in TSMC 16FFC process.

SDB is the CPODE (Common Poly on Diffusion Edge) in TSMC 16FFC process.

PO on OD edge (PODE) is mandatory; it defines a GATE that abuts the vertical OD edge.

CPODE is used to connect two PODE cells together. It isolates the OD by means of STI and saves one poly pitch; an additional mask (12N) is required for manufacturing.

| | PODE | CPODE |
|---|---|---|
| Pros | simple | density |
| Cons | density | LDE (LOD/OSE) |
| Edge device | 3T PODE (with single-side OD): no ERC; 4T M-PODE (with S/D): ERC (gate tied to power/ground) | won't form a device; no ERC; OD under CPODE is cut off |

image-20221210145232826

image-20221210150847737

image-20240509205506112


Leading Edge Logic Comparison March 9, 2018 [https://semiwiki.com/wp-content/uploads/2018/03/Leading-Edge-Logic.pdf]

What is CPODE, and why do we use it in VLSI layout? [https://semiconwiki.com/what-is-cpode-and-why-do-we-use-it-in-vlsi-layout/]


3T PODE device

image-20250708001318109

US9053283B2: Methods for layout verification for polysilicon cell edge structures in finFET standard cells using filters [https://patentimages.storage.googleapis.com/36/2c/ff/ad3d4c232ecc8d/US9053283.pdf]

US8943455B2: Methods for layout verification for polysilicon cell edge structures in FinFET standard cells [https://patentimages.storage.googleapis.com/19/12/64/f2badfdc09a4a4/US8943455.pdf]

CNOD

continuous oxide diffusion (CNOD) design

img

In CNOD, the diffusion is not broken at all. The fabrication process continues normally, but where standard cells need to be separated, the gate between them is designated as a dummy gate. This dummy gate is then tied to the power rail through a gate tie-down via.

This dummy gate tie-down method of CNOD achieves the same horizontal width savings as SDB, and has the advantage of keeping the transistor diffusion unbroken and thus can achieve more uniform strain and performance characteristics

The TRUTH of TSMC 5nm [https://www.angstronomics.com/p/the-truth-of-tsmc-5nm]

S. Badel et al., "Chip Variability Mitigation through Continuous Diffusion Enabled by EUV and Self-Aligned Gate Contact," 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Qingdao, China, 2018 [https://sci-hub.st/10.1109/ICSICT.2018.8565694]


image-20250707210444362


4T MPODE (with source/drain) may be formed in CNOD design layout

potential leakage: channel leakage (S to D); junction leakage (S/D to bulk)

image-20250708001207301


A CNOD (MPODE) device uses the same primitive MOS model; a PODE is also the primitive MOS, just with S/D shorted together.

image-20250725215821959

Contacted-Poly-Pitch (CPP)

A wider contacted poly pitch allows larger MD and VD sizes, which helps reduce MEOL IR drop.

Schematic representation of a logic standard cell layout (CPP = contacted poly pitch, FP = fin pitch, MP = metal pitch; cell height = number of metal lines per cell × MP).

Naoto Horiguchi. Entering the Nanosheet Transistor Era [link]

SAC & SAGC

self-aligned diffusion contacts (SACs)

As shown in Fig. 35(a), in older planar technology nodes the gate pitch is relaxed enough that S/D contacts and gate contacts can easily be placed next to each other without any shorting risk.

As the gate pitch scales, there is no room to put gate contacts next to S/D contacts, so gate contacts have been pushed away from the active region and are placed only on the STI region.

image-20230708221916716

In addition, at tight gate pitch, even forming S/D contact without shorting to gate metal becomes very challenging.

The idea of self-aligned contacts (SAC) has been introduced to mitigate the issue of S/D contact to gate shorts.

As shown in Fig. 35(b), the gate metal is fully encapsulated by a dielectric spacer and gate cap, which protects the gate from shorting to the S/D contact.

image-20230708230238362

A dielectric cap is added on top of the gate so that if the contact overlaps the gate, no short occurs.

The MD layer represents SACs in the PDK.

image-20230709005334372

self-aligned gate contacts (SAGCs)

Self-aligned gate contacts (SAGCs) have also been implemented. Denser standard cells can be achieved by eliminating the need to land gate contacts outside the active area.

SAGCs require the source/drain contacts to be capped with an insulator that is different from both contact and gate cap dielectrics to protect the source/drain contacts against a misaligned gate contact etch.

image-20230708233009568

image-20230708232429240

According to the DRC of the T foundry, the poly extension is > 0 µm and the space between MP and OD is > 0 µm, which indicates that self-aligned gate contacts are not introduced.

Gate Resistance

image-20230709000326683

image-20230709004432013

image-20230709000637817

image-20230709003917922

Native NMOS Blocked Implant (NT_N)

Principles of VLSI Design CMOS Processing CMPE 413 [https://redirect.cs.umbc.edu/~cpatel2/links/315/lectures/chap3_lect09_processing2.pdf]

CMOS processing [http://users.ece.utexas.edu/~athomsen/cmos_processing.pdf]

The Fabrication Process of CMOS Transistor [https://www.elprocus.com/the-fabrication-process-of-cmos-transistor/#:~:text=latch%2Dup%20susceptibility.-,N%2D%20well%2F%20P%2D%20well%20Technology,well%20it%20is%20vice%2D%20verse.]

CMOS Processing Technology [link1, link2]

A native layer (NT_N) is usually added under inductors or transformers in the nanoscale CMOS to define the non-doped high-resistance region of substrate, which decreases eddy currents in the substrate thus maintaining high Q of the coils.

For the inductor offered in the T* PDK, a native substrate region is created under the inductor coil to minimize eddy currents.

image-20230810000702597

OD inside NT_N can be used only for NT_N potential pickup, such as the guard ring of a MOM capacitor or an inductor.

Derived Geometries

| Term | Definition |
|---|---|
| PW | {NOT NW} |
| N+OD | {NP AND OD} |
| P+OD | {PP AND OD} |
| GATE | {PO AND OD} |
| TrGATE | {GATE NOT PODE_GATE} |

NP: N+ Source/Drain Ion Implantation

PP: P+ Source/Drain Ion Implantation

OD: Gate Oxide and Diffusion

NW: N-WELL

PW: P-WELL

CMOS Processing Technology

Four main CMOS technologies:

  • n-well process
  • p-well process
  • twin-tub process
  • silicon on insulator

Triple well, Deep N-Well (optional):

  • NWell: PMOS svt, lvt, ulvt ...
  • PWell: NMOS svt, lvt, ulvt ...
  • DNW: isolates the P-Well from the substrate

The NT_N drawn layer adds no process cost and no extra mask

In N-well / P-well technology, n-type diffusion is done over a p-type substrate, or p-type diffusion is done over an n-type substrate, respectively.

In twin-well technology, the NMOS and PMOS transistors are formed on the wafer by simultaneous diffusion into an epitaxially grown base layer rather than into the substrate.

Deep N-well

Chew, K.W., Zhang, J., Shao, K., Loh, W., & Chu, S.F. (2002). Impact of Deep N-well Implantation on Substrate Noise Coupling and RF Transistor Performance for Systems-on-a-Chip Integration. 32nd European Solid-State Device Research Conference, 251-254. URL:[slides, paper]

Mark Waller, Analog layout: Why wells, taps, and guard rings are crucial

Keith Sabine, Using Deep N Wells in Analog Design

Faricelli, J. (2010). Layout-dependent proximity effects in deep nanoscale CMOS. IEEE Custom Integrated Circuits Conference 2010, 1-8.

cmos_processing, URL:http://users.ece.utexas.edu/~athomsen/cmos_processing.pdf

Kuo-Tsai Li, Paul Chang, Andy Chang, TSMC, US20120053923A1, "Methods of designing integrated circuits and systems thereof"

Substrate noise

A variety of techniques can be used to minimize this noise, for example by keeping analog devices surrounded by guard rings, or using a separate supply for the substrate/well taps.

However guard rings alone cannot prevent noise coupling deep in the substrate, only surface currents.

PMOS devices are less noisy than NMOS devices because each PMOS sits in an N-well that isolates it from substrate noise; this does not hold for NMOS devices built directly in the substrate.

DNW

The N-channel devices built directly into the P-type substrate are not as effectively isolated as P-channel devices in their N-wells. This is because despite creating a P+ guard ring around the devices, there remains an electrical path below the guard ring for charge to flow.

To overcome this issue, a deep N-well can be used to more effectively isolate these N-channel devices.

image-20230529001556060

image-20230529010836003

BM_SS_Together at Last_Fig1

pwdnw: PW/DNW diode

dnwpsub: DNW/PSUB diode

Together At Last – Combining Netlist and Layout Data for Power-Aware Verification

image-20240708221831791

image-20240708222327376

image-20230529002733114

  • the P-well is separated, allowing the voltage to be controlled
  • because the circuit within the deep N-well is separated from the p-substrate in this structure, there is the benefit that this circuitry is less susceptible to noise that propagates through the p-substrate.

Decap

img


img

Kevin Zheng. The Unsung Heroes – Dummies, Decaps, and More [https://circuit-artists.com/the-unsung-heroes-dummies-decaps-and-more/]

The Difference Between MOM, MIM, and MOS Capacitors [https://www.ansys.com/blog/difference-between-mom-mim-mos-capacitor]

MIM/MOM capacitor extraction boosts analog and RF designs [https://www.eeworldonline.com/mim-mom-capacitor-extraction-boosts-analog-and-rf-designs/]

Metal Resistors In Wire Management

img

img

Kevin Zheng. Metal Resistors – Your Unexpected Friend In Wire Management [https://circuit-artists.com/metal-resistors-your-unexpected-friend-in-wire-management/]

reference

Mikael Sahrling, Layout Techniques for Integrated Circuit Designers 1st Edition , Artech House 2022

LAYOUT, EE6350 VLSI Design Lab SMART TEMPERATURE SENSOR URL: https://www.ee.columbia.edu/~kinget/EE6350_S16/06_TEMPSENS_Sukanya_Vani/layout.html

Stacked MOSFETs in analog layout https://pulsic.com/stacked-mosfets-in-analog-layout/

JED Hurwitz, ISSCC2011 "T4: Layout: The other half of Nanometer CMOS Analog Design" [slides, transcript]

Tom Quan, TSMC, Bob Lefferts, Fred Sendig, Synopsys, Custom Design with FinFETs - Best practices designing mixed-signal IP

Jacob, Ajey & Xie, Ruilong & Sung, Min & Liebmann, Lars & Lee, Rinus & Taylor, Bill. (2017). Scaling Challenges for Advanced CMOS Devices. International Journal of High Speed Electronics and Systems. 26. 1740001. 10.1142/S0129156417400018.

Joddy Wang, Synopsys "FinFET SPICE Modeling" Modeling of Systems and Parameter Extraction Working Group 8th International MOS-AK Workshop (co-located with the IEDM Conference and CMC Meeting) Washington DC, December 9 2015

A. L. S. Loke et al., "Analog/mixed-signal design challenges in 7-nm CMOS and beyond," 2018 IEEE Custom Integrated Circuits Conference (CICC), San Diego, CA, USA, 2018, pp. 1-8, doi: 10.1109/CICC.2018.8357060.[slides]

Prof. Adam Teman, Advanced Process Technologies, [pdf]

Luke Collins. FinFET variability issues challenge advantages of new process [link]

Loke, Alvin. (2020). FinFET technology considerations for circuit design (invited short course). BCICTS 2020 Monterey, CA

Alvin Leng Sun Loke, TSMC. Device and Physical Design Considerations for Circuits in FinFET Technology", ISSCC 2020

A. L. S. Loke, C. K. Lee and B. M. Leary, "Nanoscale CMOS Implications on Analog/Mixed-Signal Design," 2019 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, USA, 2019, pp. 1-57, doi: 10.1109/CICC.2019.8780267.

A. L. S. Loke, Migrating Analog/Mixed-Signal Designs to FinFET Alvin Loke / Qualcomm. 2016 Symposia on VLSI Technology and Circuits

Lattice Semiconductor, 16FFC Process Technology Introduction December 9th, 2021[pdf]

image-20250729004456566


Subthreshold Conduction

By the square-law relation \(g_m = \sqrt{2\mu C_{ox}\frac{W}{L}I_D}\), a higher transconductance can be obtained by increasing \(W\) while keeping \(I_D\) constant. However, if \(W\) increases while \(I_D\) remains constant, then \(V_{GS} \to V_{TH}\) and the device enters the subthreshold region, in which the drain current follows \[ I_D = I_0\exp \frac{V_{GS}}{\xi V_T} \]

where \(I_0\) is proportional to \(W/L\), \(\xi \gt 1\) is a nonideality factor, and \(V_T = kT/q\)

As a result, the transconductance in the subthreshold region is \[ g_m = \frac{I_D}{\xi V_T} \]

that is, \(g_m \propto I_D\)
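The two expressions for \(g_m\) can be compared numerically. In the sketch below, the values of \(\mu C_{ox}\), \(I_D\), \(\xi\) and the temperature are hypothetical, and taking the smaller of the two results is only a crude way to show that the square-law prediction stops being valid once it exceeds the subthreshold value at the same current.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def gm_square_law(mu_cox, w_over_l, i_d):
    """Square-law transconductance: sqrt(2 * mu*Cox * W/L * I_D)."""
    return math.sqrt(2.0 * mu_cox * w_over_l * i_d)

def gm_subthreshold(i_d, xi=1.5, temp_k=300.0):
    """Subthreshold transconductance: I_D / (xi * V_T), with V_T = kT/q."""
    v_t = K_B * temp_k / Q_E
    return i_d / (xi * v_t)

MU_COX = 200e-6   # A/V^2, hypothetical
I_D = 10e-6       # A, held constant while W/L is swept

for w_over_l in (1, 10, 100, 1000):
    gm_sq = gm_square_law(MU_COX, w_over_l, I_D)
    gm_sub = gm_subthreshold(I_D)
    print(w_over_l, f"{gm_sq:.3e}", f"{gm_sub:.3e}", f"{min(gm_sq, gm_sub):.3e}")
```

With these numbers the square-law value overtakes the subthreshold value between W/L = 10 and W/L = 100, after which further widening no longer buys transconductance at fixed current.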

image-20240627230726326

image-20240627230744044

PTAT with subthreshold MOS

MOS devices working in the weak inversion region ("subthreshold conduction") have characteristics similar to those of BJTs and diodes, since diffusion current becomes more significant than drift current.

image-20240803193343915

image-20240803195500321

image-20240803200129592

Hongprasit, Saweth, Worawat Sa-ngiamvibool and Apinan Aurasopon. "Design of Bandgap Core and Startup Circuits for All CMOS Bandgap Voltage Reference." Przegląd Elektrotechniczny (2012): 277-280.

Curvature Compensation

VBE

image-20250728233542026

In an advanced node (N4P), the temperature coefficient of \(V_{BE}\) is about -1.45 mV/K.

Assuming \(I_C\) is constant

image-20250728233112550

image-20250728233350355

image-20250728233839563

Assuming \(I_C\) is PTAT, \(I_C = (V_T \ln n) / R_3\)

image-20250728233317599

image-20250729002704253

The first-order linear temperature-dependent term of \(V_{BE}\) can be eliminated with IPTAT. \(V_T(\eta - \theta)\ln(T/T_r)\) is the high-order nonlinear temperature-dependent term of \(V_{BE}\), which requires high-order curvature compensation.

G. Zhu, Y. Yang and Q. Zhang, "A 4.6-ppm/°C High-Order Curvature Compensated Bandgap Reference for BMIC," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 9, pp. 1492-1496, Sept. 2019 [https://sci-hub.se/10.1109/TCSII.2018.2889808]

X. Fu, D. M. Colombo, Y. Yin and K. El-Sankary, "Low Noise, High PSRR, High-Order Piecewise Curvature Compensated CMOS Bandgap Reference," in IEEE Access, vol. 10, pp. 110970-110982, 2022 [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9923910]


image-20240903234720200


image-20250728225624247

Tutorials | 08012023 | 1.2.1 Bandgap Voltage Regular [https://youtu.be/dz067SOX0XQ&t=6362]

temperature coefficient

The dependence of the reference voltage on temperature is quantified by the temperature coefficient, defined as: \[ TC_F=\frac{1}{V_{\text{REF}}}\left[ \frac{V_{\text{max}}-V_{\text{min}}}{T_{\text{max}}-T_{\text{min}}} \right]\times10^6\;\text{ppm}/^\circ\text{C} \]
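The box-method definition above translates directly into code. The sketch below evaluates it for a made-up reference whose output moves between 1.1994 V and 1.2012 V over -40 °C to 125 °C; all numbers are hypothetical.

```python
def temperature_coefficient_ppm(v_ref, v_max, v_min, t_max, t_min):
    """Temperature coefficient in ppm/degC: (Vmax - Vmin) / (Tmax - Tmin) / Vref * 1e6."""
    return (v_max - v_min) / (t_max - t_min) / v_ref * 1e6

tc = temperature_coefficient_ppm(v_ref=1.2, v_max=1.2012, v_min=1.1994,
                                 t_max=125.0, t_min=-40.0)
print(f"{tc:.2f} ppm/degC")
```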

Choice of n

image-20221117002714125

classic bandgap reference

bg.drawio

\[ V_{bg} = \frac{\Delta V_{be}}{R_1} (R_1+R_2) + V_{be2} = \frac{\Delta V_{be}}{R_1} R_2 + V_{be1} \]

\[ V_{bg} = \left(\frac{\Delta V_{be}}{R_1} + \frac{V_{be1}}{R_2}\right)R_3 = \left(\frac{\Delta V_{be}}{R_1} R_2 + V_{be1}\right)\frac{R_3}{R_2} \]
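As a first-order sizing sketch for the voltage-mode expression \(V_{bg} = \frac{\Delta V_{be}}{R_1}R_2 + V_{be1}\), the resistor ratio can be chosen so that the PTAT slope of \(\frac{R_2}{R_1}\Delta V_{be}\) cancels the CTAT slope of \(V_{be1}\). The code below assumes \(\Delta V_{be} = \frac{kT}{q}\ln n\) (equal emitter currents), a \(V_{be}\) slope of -1.45 mV/K, and a hypothetical area ratio n = 8; curvature and resistor temperature coefficients are ignored.

```python
import math

K_OVER_Q = 8.617333262e-5   # Boltzmann constant over elementary charge, V/K

def ptat_slope(n):
    """d(Delta V_be)/dT = (k/q) * ln(n), in V/K."""
    return K_OVER_Q * math.log(n)

def resistor_ratio_for_zero_tc(dvbe_dt, n):
    """R2/R1 that makes (R2/R1) * d(Delta V_be)/dT + dV_be/dT equal to zero."""
    return -dvbe_dt / ptat_slope(n)

def vbg_slope(ratio, dvbe_dt, n):
    """First-order temperature slope of V_bg for a given R2/R1."""
    return ratio * ptat_slope(n) + dvbe_dt

DVBE_DT = -1.45e-3   # V/K, first-order V_be slope (assumed)
N = 8                # emitter area ratio (hypothetical)

ratio = resistor_ratio_for_zero_tc(DVBE_DT, N)
print(f"R2/R1 = {ratio:.2f}, residual slope = {vbg_slope(ratio, DVBE_DT, N):.1e} V/K")
```

A larger n reduces the required resistor ratio, which is one of the trade-offs behind the choice of n.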

OTA offset effect

bg_ota_vos.drawio

\[\begin{align} V_{be1} &= \frac{kT}{q}\ln(\frac{I_{e1}}{I_{ss}}) \\ V_{be2} &= \frac{kT}{q}\ln(\frac{I_{e2}}{nI_{ss}}) \end{align}\]

Here, we assume \(I_e = I_c\)

Hence,

\[\begin{align} \Delta V_{be} &= \frac{kT}{q}\ln(n\frac{I_{e1}}{I_{e2}}) \\ &= \frac{kT}{q}\ln(n) + \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}}) \\ &= \Delta V_{be,0} + \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}}) \end{align}\]

Therefore,

\[\begin{align} V_{bg} &= \frac{\Delta V_{be}+V_{os}}{R_1}(R_1+R_2) + V_{be2} \\ &= \alpha \Delta V_{be,0} + \alpha \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}}) + \alpha V_{os} + \frac{kT}{q}\ln(\frac{I_{e2}}{nI_{ss}}) \\ &= \alpha \Delta V_{be,0} + \alpha \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}}) + \alpha V_{os} + \frac{kT}{q}\ln(\frac{I_{e2,0}}{nI_{ss}})+\frac{kT}{q}\ln(\frac{I_{e2}}{I_{e2,0}}) \end{align}\] where \(\alpha = \frac{R_1+R_2}{R_1}\)

We omit the last part \[\begin{align} V_{bg} &\approx \alpha \Delta V_{be,0} + \alpha \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}}) + \alpha V_{os} + \frac{kT}{q}\ln(\frac{I_{e2,0}}{nI_{ss}}) \\ &= \alpha \Delta V_{be,0} + V_{be2,0} + \alpha \left(V_{os} + \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}})\right) \\ &= V_{bg,0} + \alpha \left(V_{os} + \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}})\right) \end{align}\]

i.e. the bg variation due to OTA offset \[ \Delta V_{bg} \approx \alpha \left(V_{os} + \frac{kT}{q}\ln(\frac{I_{e1}}{I_{e2}})\right) \]

  • \(V_{os} \gt 0\)

    \(I_{e1} \gt I_{e2}\): \(\Delta V_{bg} \gt \alpha V_{os}\)

  • \(V_{os} \lt 0\)

    \(I_{e1} \lt I_{e2}\): \(\Delta V_{bg} \lt \alpha V_{os}\)

OTA with chopper

bg_chop.drawio

bg_chop_shift.drawio

\(I_{e1}\), \(I_{e2}\)

\[\begin{align} V_{ip} &= V_{im} + V_{os} \\ \frac{V_{bg}-V_{ip}}{R_2} &= I_{e2} \\ \frac{V_{bg}-V_{im}}{R_2} &= I_{e1} \\ V_{ip} &= I_{e2}R_1 + V_T\ln\frac{I_{e2}}{nI_S} \\ V_{im} &= V_T\ln\frac{I_{e1}}{I_S} \end{align}\] where \(V_T = \frac{kT}{q}\)

we obtain \[ I_{e1} = \frac{V_T\ln n}{R_1} + V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right) - \frac{1}{R_1}\cdot V_T\ln\left(1- \frac{V_{os}}{R_2I_{e1}} \right) \]

we omit the last part \[\begin{align} I_{e1} &= I_{e0} + V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right) \\ I_{e2} &= I_{e1} - \frac{V_{os}}{R_2} = I_{e0} + \frac{V_{os}}{R_1} \end{align}\] where \(I_{e0} = \frac{\Delta V_{be}}{R_1}\), \(\Delta V_{be}=V_T\ln n\)

That is, both \(I_{e1}\) and \(I_{e2}\) shift linearly with \(V_{os}\)

\(I_{e1}\) and \(I_{e2}\) can be expressed as \[\begin{align} I_{e1} &= I_{e0} + V_{os}\left(\frac{1}{R_1} + \frac{1}{2R_2} \right) + \frac{V_{os}}{2R_2} \\ I_{e2} &= I_{e0} + V_{os}\left(\frac{1}{R_1} + \frac{1}{2R_2} \right) - \frac{V_{os}}{2R_2} \end{align}\] i.e., \(\Delta I_{e,cm} = V_{os}\left(\frac{1}{R_1} + \frac{1}{2R_2} \right)\) and \(\Delta I_{e,dif} =\frac{V_{os}}{2R_2}\)

bandgap output voltage is

\[\begin{align} V_{bg} &= V_T \ln \frac{I_{e1}}{I_s} + I_{e1}R_2 \\ &= V_T \ln \frac{I_{e0} + V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right)}{I_s} + I_{e1}R_2 \\ &= V_T \ln \frac{I_{e0} + V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right)}{I_s} + I_{e0}R_2 + V_{os}\frac{R_1+R_2}{R_1} \\ &= I_{e0}R_2 + V_T \ln \frac{I_{e0}}{I_s} + V_T\ln\left(1+\frac{V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right)}{I_{e0}} \right) + V_{os}\frac{R_1+R_2}{R_1} \\ &= V_{bg0} + V_T\ln\left(1+\frac{V_{os}\left(\frac{1}{R_1} + \frac{1}{R_2} \right)}{I_{e0}} \right) + V_{os}\frac{R_1+R_2}{R_1} \end{align}\]

Therefore, the averaged output of bandgap

\[ V_{bg,avg} = V_{bg0} +\frac{1}{2}V_T\ln\left(1-\frac{V_{os}^2\left(\frac{1}{R_1} + \frac{1}{R_2} \right)^2}{I_{e0}^2} \right) \lt V_{bg0} \]

\(V_{bg,avg} \lt V_{bg0}\) due to nonlinearity of BJT
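The residual error after chopping can be checked numerically by averaging the output shift for \(+V_{os}\) and \(-V_{os}\) and comparing it with the closed-form logarithmic term. The resistor values, the area ratio n, the temperature, and the 2 mV offset below are hypothetical; the sketch uses the same linearized currents as the derivation above.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def bandgap_shift(v_os, r1, r2, n=8, temp_k=300.0):
    """V_bg - V_bg0 for one chopper phase, using the linearized I_e1."""
    v_t = K_B * temp_k / Q_E
    i_e0 = v_t * math.log(n) / r1
    x = v_os * (1.0 / r1 + 1.0 / r2) / i_e0
    return v_t * math.log(1.0 + x) + v_os * (r1 + r2) / r1

def averaged_shift(v_os, r1, r2, n=8, temp_k=300.0):
    """Average of the two chopper phases (+v_os and -v_os)."""
    return 0.5 * (bandgap_shift(v_os, r1, r2, n, temp_k) +
                  bandgap_shift(-v_os, r1, r2, n, temp_k))

def closed_form_shift(v_os, r1, r2, n=8, temp_k=300.0):
    """0.5 * V_T * ln(1 - x^2), the term in the averaged-output expression."""
    v_t = K_B * temp_k / Q_E
    i_e0 = v_t * math.log(n) / r1
    x = v_os * (1.0 / r1 + 1.0 / r2) / i_e0
    return 0.5 * v_t * math.log(1.0 - x * x)

R1, R2, V_OS = 10e3, 80e3, 2e-3   # hypothetical values
print(f"single phase: {bandgap_shift(V_OS, R1, R2) * 1e3:.2f} mV")
print(f"chopped avg : {averaged_shift(V_OS, R1, R2) * 1e6:.2f} uV")
```

The linear offset term cancels between the two phases, leaving only a small negative residue from the logarithmic nonlinearity of the BJT.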

reference

ECEN 607 (ESS) Bandgap Reference: Basics URL:https://people.engr.tamu.edu/s-sanchez/607%20Lect%204%20Bandgap-2009.pdf

CICC 2023 Session 12: Forum: Recent Progress in LDOs and Voltage, Current, and Timing References

  • Jae-Yoon Sim, POSTECH. 12-2: Design of Ultra-low-power Bandgap Reference Circuits
  • Inhee Lee, University of Pittsburgh. 12-3: Sub-μW Non-Bandgap Voltage References

image-20250917184927874


MOS capacitances

  • oxide capacitance (aka gate-channel capacitance) between the gate and the channel \(C_1=WLC_{ox}\)
    • divided between \(C_{GS}\) and \(C_{GD}\)
  • depletion capacitance between the channel and the substrate \(C_2\)
  • overlap capacitance: direct overlap and fringing field
  • junction capacitance between the source/drain areas and the substrate
    • The value of \(C_{SB}\) and \(C_{DB}\) is a function of the source and drain voltages with respect to the substrate

image-20240727134110758

image-20240727134150216

The gate-bulk capacitance is usually neglected in the triode and saturation regions because the inversion layer acts as a "shield" between the gate and the bulk.


classification with Intrinsic and Extrinsic MOS capacitor

[Circuit Insights - 11-CI: Fundamentals 4 Tsinghua Nan Sun]

image-20250917185006503

image-20250917185041255

image-20250917185106474

image-20250917185146601

image-20250621113254648

image-20250621122921814

image-20250917185223939

FinFET Parasitic Fringing Capacitance

image-20241120201725441

image-20241120201739690

Temperature Dependence of Junction Diode CV

image-20240901234200243

where TCJ and TCJSW are positive

https://cmosedu.com/cmos1/BSIM4_manual.pdf

image-20240901235359149

image-20240901235425992

image-20240901235543033

Integrated varactors

D=S=B varactor

image-20220924003223575

image-20250622205317309

Inversion-mode (I-MOS)

image-20220924003314979


image-20250622211213169

Accumulation-mode (A-MOS)

image-20250622211513994

image-20250622212138953

NMOS in NWELL, aka NMOS in N-Well varactor

Notice: S/D and NWELL are connected together in the layout

image-20230504221234639

image-20230504221313785

image-20220924004206116

I-MOS vs. A-MOS

P. Andreani and S. Mattisson, "On the use of MOS varactors in RF VCOs," in IEEE Journal of Solid-State Circuits, vol. 35, no. 6, pp. 905-910, June 2000 [https://sci-hub.se/10.1109/4.845194]

image-20250917193743547

varactor losses

channel resistance & gate resistance

image-20251010204657625

PDK varactor

nmoscap: NMOS in N-Well varactor

image-20240703224101060

  • Base Band MOSCAP model (nmoscap) is built without effective series resistance (ESR) and effective series inductance (ESL) calibrations, which is for capacitance simulation only
  • LC-Tank MOSCAP model (moscap_rf) is for frequency-dependent Q factor and capacitance simulations

MOS Device as Capacitor

image-20240115225644183

image-20240115225928617

image-20240115225853721


Voltage dependence

image-20240115230113523

image-20231103213004806

  • capacitance of MOS gate varies nonmonotonically with \(V_{GS}\)

  • "accumulation-mode" varactor varies monotonically with \(V_{GS}\)

reference

Aditya Varma Muppala. MOS Varactors | Oscillators 15 | MMIC 27 [https://youtu.be/LYCLZPQvIz0?si=yoSBZSD2j_wEx0zZ]

R. L. Bunch and S. Raman, "Large-signal analysis of MOS varactors in CMOS -Gm LC VCOs," in IEEE Journal of Solid-State Circuits, vol. 38, no. 8, pp. 1325-1332, Aug. 2003, doi: 10.1109/JSSC.2003.814416.

T. Soorapanth, C. P. Yue, D. K. Shaeffer, T. I. Lee and S. S. Wong, "Analysis and optimization of accumulation-mode varactor for RF ICs," 1998 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.98CH36215), 1998, pp. 32-33, doi: 10.1109/VLSIC.1998.687993. URL: http://www-smirc.stanford.edu/papers/VLSI98s-chet.pdf

R. Jacob Baker, 6.1 MOSFET Capacitance Overview/Review, CMOS Circuit Design, Layout, and Simulation, Fourth Edition

B. Razavi, Design of Analog CMOS Integrated Circuits 2nd

Bing Sheu, TSMC. "Circuit Design using FinFETs" [https://www.nishanchettri.com/isscc-slides/2013%20ISSCC/TUTORIALS/ISSCC2013Visuals-T4.pdf]

PNP BJTs have traditionally been used for temperature sensor design in CMOS, because the long-term drift of temperature sensors and bandgap references caused by package-induced stress is lower with PNP BJTs than with NPN BJTs.

Calibration

TODO 📅

[https://ww1.microchip.com/downloads/en/Appnotes/Atmel-8108-Calibration-of-the-AVRs-Internal-Temperature-Reference_ApplicationNote_AVR122.pdf]

\(V_{BE}\) curvature

Curvature results in non-linearity.

For first-order analysis, \(V_{BE}\) is assumed to be a linear function of temperature. In practice, \(V_{BE}\) is slightly nonlinear; the magnitude of this nonlinearity is referred to as curvature.

Curvature depends on the temperature dependence of the saturation current (\(I_s\)) and on that of the collector current (\(I_c\)). It can be written as \[ V_{curv}(T)=\frac{k}{q}(\eta-\delta)\left(T-T_r-T\cdot \ln\left(\frac{T}{T_r}\right)\right) \] where

\(\eta\) = a constant depending on the doping level; CMOS substrate pnp transistors have a typical value of \(\eta \cong 4\)

\(\delta\) = order of the temperature dependence of the collector current (\(I_c\))

\(T_r\) = reference temperature

A PTAT \(I_c\) (\(\delta=1\)) helps reduce \(V_{curv}(T)\)

Although the temperature dependence of the bias current \(I_b\) doesn’t impact the accuracy of \(V_{BE}\), it does impact the systematic nonlinearity or curvature of \(V_{BE}\), and hence the sensor's systematic error. The curvature in \(V_{BE}\) can be reduced by using a PTAT bias current.
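To get a feel for the size of the curvature term, the sketch below evaluates the \(V_{curv}(T)\) expression above for a temperature-independent bias (\(\delta=0\)) and a PTAT bias (\(\delta=1\)). The reference temperature of 300 K, the temperature points and the function name `v_curv` are assumptions chosen for illustration.

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant divided by electron charge, in V/K


def v_curv(t, t_ref=300.0, eta=4.0, delta=0.0):
    """Curvature term of V_BE in volts at absolute temperature t (kelvin)."""
    return K_OVER_Q * (eta - delta) * (t - t_ref - t * math.log(t / t_ref))


# Compare a constant bias current (delta = 0) with a PTAT bias current (delta = 1)
for temp_c in (-40.0, 27.0, 125.0):
    t = temp_c + 273.15
    const_bias = v_curv(t, delta=0.0)
    ptat_bias = v_curv(t, delta=1.0)
    print(f"{temp_c:6.1f} C: delta=0 -> {const_bias * 1e3:7.3f} mV, "
          f"delta=1 -> {ptat_bias * 1e3:7.3f} mV")
```

Because the curvature scales with \((\eta-\delta)\), moving from \(\delta=0\) to \(\delta=1\) with \(\eta\cong 4\) shrinks it by a quarter; it does not remove it.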

image-20221106010909644

PTAT bias current

image-20221023150817411 \[ I_{bias} = \frac{0.7}{\beta \cdot R^2} \] in which \(\beta=\frac{\mu_{n}\cdot C_{ox}\cdot W}{L}\), where:

\(\mu_n\)=mobility,

\(C_{ox}\) = oxide capacitance density,

\(\frac{W}{L}\) = dimension ratio of unit NMOS used for \(M_1\) and \(M_2\)

\(\mu_n\) is complementary to the absolute temperature, and resistor R is implemented with the high-R flow in FinFET, which has a low temperature dependency. The net temperature dependency of \(I_{bias}\) is therefore proportional to the absolute temperature \[ I_{bias}\propto T \]

Kamath, Umanath Ramachandra. "BJT Based Precision Voltage Reference in FinFET Technology." (2021).

Errors due to V-I Finite Gain

Finite gain introduces errors in both V-I converters: finite loop gain results in errors in the closed-loop transconductances.

image-20221106153613505 \[\begin{align} (V_{i1} - V_{o1})\cdot A_{OL1} &= V_{o1} \\ V_{o1} &= \frac{A_{OL1}}{1+A_{OL1}}V_{i1} \\ I_{o1} &= \frac{A_{OL1}}{1+A_{OL1}}\frac{1}{R_1}V_{i1} \end{align}\] similarly, \[ I_{o2} = \frac{A_{OL2}}{1+A_{OL2}}\frac{1}{R_2}V_{i2} \]

Then, \(\alpha\) is obtained \[ \alpha = \frac{(1+A_{OL2})A_{OL1}}{A_{OL2}(1+A_{OL1})}\cdot\frac{R_2}{R_1} \] Since the loop gains in the two V-I converters cannot be expected to match, the resulting errors in both converters should be reduced to negligible levels.

First, assume \(A_{OL2}=\infty\) \[\begin{align} \Delta \alpha &= (1-\frac{A_{OL1}}{1+A_{OL1}})\cdot\frac{R_2}{R_1}\\ &=\frac{1}{1+A_{OL1}}\cdot\frac{R_2}{R_1}\\ &\cong \frac{1}{A_{OL1}}\cdot\frac{R_2}{R_1} \end{align}\]

We get \[ \frac{\Delta \alpha}{\alpha}=\frac{1}{A_{OL1}} \] Following the same procedure with \(A_{OL1}=\infty\), \[ \frac{\Delta \alpha}{\alpha}=\frac{1}{A_{OL2}} \] The finite gain thus introduces an error inversely proportional to the loop gains \(A_{OL1}\) and \(A_{OL2}\), so both loop gains must be high enough to make these errors negligible.
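The size of this error can be checked numerically. The sketch below evaluates the expression for \(\alpha\) given above and compares it with the ideal ratio \(R_2/R_1\); the gain values (80 dB and 60 dB) and the function names are assumptions for illustration.

```python
def alpha(a_ol1, a_ol2, r1, r2):
    """Ratio alpha for finite loop gains a_ol1 and a_ol2 of the two V-I converters."""
    return (1.0 + a_ol2) * a_ol1 / (a_ol2 * (1.0 + a_ol1)) * (r2 / r1)


def relative_error(a_ol1, a_ol2):
    """Relative deviation of alpha from its ideal value R2/R1 (normalised to 1)."""
    return alpha(a_ol1, a_ol2, 1.0, 1.0) - 1.0


# Only converter 1 has finite gain (80 dB): error is close to -1/A_OL1
print(relative_error(1e4, 1e12))
# Only converter 2 has finite gain (60 dB): error is close to +1/A_OL2
print(relative_error(1e12, 1e3))
# Perfectly matched gains would cancel, but matching cannot be relied upon
print(relative_error(1e3, 1e3))
```

The two single-converter errors have opposite signs, which is why they would cancel if the loop gains matched.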

Why is it named as "bandgap reference"

Let us write the output voltage as \[ V_{REF} = V_{BE} + V_T\cdot \ln n \] and hence \[ \frac{\partial V_{REF}}{\partial T} = \frac{\partial V_{BE}}{\partial T} + \frac{V_T}{T}\ln n \] Setting this to zero and substituting for \(\frac{\partial V_{BE}}{\partial T}\), we have \[ \frac{V_{BE}-(4+m)V_T-E_g/q}{T}=-\frac{V_T}{T}\ln n \] If \(V_T\ln n\) is found from this equation and inserted in \(V_{REF}\), we obtain \[ V_{REF}=\frac{E_g}{q} + (4+m)V_T \]

The term bandgap is used here because as \(T\to 0\), \(V_{REF} \to E_g/q\)
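As a rough numerical check of the last expression, take \(m\) to be the temperature exponent of the mobility (\(\mu \propto T^m\)) and assume room temperature (\(T=300\,\text{K}\), so \(V_T=kT/q\approx 25.9\,\text{mV}\)), \(E_g/q\approx 1.12\,\text{V}\) for silicon and \(m\approx -3/2\). These are typical textbook values and are assumptions here, not part of the derivation above: \[ V_{REF}\approx 1.12\,\text{V} + (4-1.5)\times 25.9\,\text{mV} \approx 1.18\,\text{V} \] so the zero-temperature-coefficient output sits only a few \(V_T\) above \(E_g/q\).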

sinking PTAT-current generator without current mirrors

image-20240824110909314

why without current mirror?

image-20240824110641427

image-20240824110958282

Bakker, Anton. (2000). High-Accuracy CMOS Smart Temperature Sensors. 10.1007/978-1-4757-3190-3. [https://repository.tudelft.nl/record/uuid:fd398056-48dd-4d84-8ae8-27a1b011d2c3]

Readout Circuit

ADC dynamic range

Take \(V_{PTAT}=\alpha \cdot \Delta V_{BE}\) as input and \(V_{REF}\) as reference. The output \(\mu\) of the ADC will then be \[ \mu =\frac{V_{PTAT}}{V_{REF}}=\frac{\alpha \cdot \Delta V_{BE}}{V_{BE}+\alpha \cdot \Delta V_{BE}} \] A final digital output \(D_{out}\) in degrees Celsius can be obtained by linear scaling: \[ D_{out}=A\cdot \mu + B \] where \(A\simeq 600K\) and \(B\simeq -273K\)

While the transfer is simple, it only uses about 30% of the dynamic range of the ADC (the extremes of the operating range correspond to \(\mu \simeq 1/3\) and \(\mu \simeq 2/3\)), which is a rather inefficient use of the modulator's dynamic range.

For a first-order \(\Sigma\Delta\) modulator, this means that about 1.5 bits of resolution are lost

A more efficient transfer is \[ \mu '=\frac{2\alpha \cdot \Delta V_{BE}-V_{BE}}{V_{BE}+\alpha \cdot \Delta V_{BE}} \] With this more efficient combination, 90% of the dynamic range is used rather than 30%. Thus, the required resolution of the ADC is reduced by a factor of three.
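Since \(V_{REF}=V_{BE}+\alpha\cdot\Delta V_{BE}\), the two transfers are related by \(\mu' = 3\mu - 1\). The sketch below checks this identity and estimates how much of the ADC range each transfer occupies, using \(A=600\) and \(B=-273\) from above. The operating range of -55 °C to 125 °C and the example voltages are assumptions for illustration; they are not stated in these notes.

```python
A = 600.0    # scaling factor from the linear fit D_out = A*mu + B
B = -273.0   # offset of the linear fit


def mu(v_be, alpha_dvbe):
    """Simple transfer: PTAT voltage over reference voltage."""
    return alpha_dvbe / (v_be + alpha_dvbe)


def mu_prime(v_be, alpha_dvbe):
    """More efficient transfer using 2*alpha*dVBE - VBE as the input."""
    return (2.0 * alpha_dvbe - v_be) / (v_be + alpha_dvbe)


def mu_from_temp(temp_c):
    """Invert D_out = A*mu + B to get mu at a given temperature in Celsius."""
    return (temp_c - B) / A


# Identity check with hypothetical voltages (V_BE = 0.7 V, alpha*dVBE = 0.5 V)
print(mu_prime(0.7, 0.5), 3.0 * mu(0.7, 0.5) - 1.0)

# Fraction of the ADC range used over the assumed operating range
t_min, t_max = -55.0, 125.0
span_mu = mu_from_temp(t_max) - mu_from_temp(t_min)
span_mu_prime = (3.0 * mu_from_temp(t_max) - 1.0) - (3.0 * mu_from_temp(t_min) - 1.0)
print(f"mu uses {span_mu:.0%} of the range, mu' uses {span_mu_prime:.0%}")
```

Under these assumptions the spans come out at about 30% and 90%, in line with the figures quoted above.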

image-20230204220522392

Integrator Output Swing

\[ \mu =\frac{\alpha \cdot \Delta V_{BE}}{V_{BE}+\alpha \cdot \Delta V_{BE}} \]

image-20230207002324363

\[ \mu '=\frac{2\alpha \cdot \Delta V_{BE}-V_{BE}}{V_{BE}+\alpha \cdot \Delta V_{BE}} \]

image-20230206230202755

In advanced processes such as 16nm and 7nm FinFET, high-resistance resistors have +/-15% variation and MOM capacitors have +/-30% variation.

\(R_1\) and \(R_2\) determine not only \(\alpha\) but also the integrator's output swing, which also depends on \(V_{BE}\), \(\Delta V_{BE}\) and \(C_{int}\).

The integrator's output change per period

image-20230206231010121

example

image-20230430112230224

integrator, comparator offset

integrator offset

image-20230430114429118

image-20230430114520336

comparator offset

image-20230501223512686

integrator design

application in sensor

image-20221106142157115

Offset Errors

The offset of opamp \(A_3\) is much less critical. It affects the integrated currents via the finite output impedances \(R_{out1,2}\) of the V-I converters, and is therefore attenuated by a factor \(R_{out1}/R_1\) when referred back to the input of the sinking V-I converter, or by a factor \(R_{out2}/R_2\) when referred back to the input of the sourcing V-I converter.

Therefore, no special offset cancellation is needed for opamp \(A_3\).

The current change due to the offset of \(A_3\): \[\begin{align} \frac{V_{BE,os}}{R_1} &= \frac{V_{ota,os}}{R_{out1}} \\ \frac{\Delta V_{BE,os}}{R_2} &= \frac{V_{ota,os}}{R_{out2}} \end{align}\] Then, the input-referred offset is: \[\begin{align} V_{BE,os} &=\frac{ V_{ota,os}}{R_{out1}/R_1} \\ \Delta V_{BE,os} &= \frac{ V_{ota,os}}{R_{out2}/R_2} \end{align}\]

Errors due to Finite Gain

Finite gain of opamp \(A_3\) results in a non-zero overdrive voltage at its input, which modulates the current \(I_{int}\) due to the finite output impedances of the V-I converters.

Assuming the opamp is implemented as a transconductance amplifier, there are two main causes of this non-zero overdrive voltage:

  1. The finite transconductance \(g_{m3}\) of the opamp, which implies that an overdrive voltage is required to provide the feedback current.

    The change in the integrated current is

    \[\begin{align} \Delta I_{int} &= \frac{V_{i,ota}}{R_{out}}\\ &= \frac{I_{int}}{g_{m3}}\cdot \frac{1}{R_{out}} \end{align}\]

  2. The finite DC gain \(A_{0,3}\), which implies that an overdrive voltage is required to produce the output voltage \(V_{int}\)

reference

Michiel A. P. Pertijs and Johan H. Huijsing. (2006). Precision Temperature Sensors in CMOS Technology.

C. -H. Chang, J. -J. Horng, A. Kundu, C. -C. Chang and Y. -C. Peng, "An ultra-compact, untrimmed CMOS bandgap reference with 3σ inaccuracy of +0.64% in 16nm FinFET," 2014 IEEE Asian Solid-State Circuits Conference (A-SSCC), 2014, pp. 165-168, doi: 10.1109/ASSCC.2014.7008886.

EE247 - Analog Digital Interface Integrated Circuits - Fall 2009 Lecture 24- Oversampled ADCs

Hecht, Bruce. (2010). SSCS DL Kofi Makinwa Talks About Smart Sensor Design at SSCS-Boston [People]. Solid-State Circuits Magazine, IEEE. 2. 54 - 56. 10.1109/MSSC.2009.935278.

image-20241109171759694

Linear Time-varying System Theory

We define the ISF of the sampler as the sensitivity of its final output voltage to the impulse arriving at its input at different times, the ISF essentially describes the aperture of the sampler.

An ideal sampler would have the perfect aperture, i.e. sampling the input voltage at exactly one point in time; thus, its ISF would be a Dirac delta function, \(\delta(t-t_s)\) where \(t_s\) is when sampling occurs.

A realistic sampler would rather capture a weighted-average of the input voltage over a certain time window. This weighting function is called the sampling aperture and is equivalent to the ISF

image-20220610235211500

A time-varying impulse response \(h(t, \tau)\) is defined as the circuit response at time \(t\) responding to an impulse arriving at time \(\tau\).

In general, the ISF can be regarded as the time-varying impulse response evaluated at one particular observation time \(t=t_0\).

The system output \(y(t)\) is related to the input \(x(t)\) as: \[ y(t) = \int_{-\infty}^{\infty}h(t, \tau)\cdot x(\tau)d\tau \] Note that in a linear time-invariant (LTI) system, \(h(t,\tau)=h(t-\tau)\) and the above equation reduces to a convolution.

If \(X(j\omega)\) is the Fourier transform of the input signal \(x(t)\), i.e. \[ x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}X(j\omega)\cdot e^{j\omega t}d\omega \] Then \[\begin{align} y(t) &= \int_{-\infty}^{\infty}h(t,\tau)\left[\frac{1}{2\pi}\int_{-\infty}^{\infty}X(j\omega)\cdot e^{j\omega\tau }d\omega \right]\cdot d\tau \\ &=\frac{1}{2\pi}\int_{-\infty}^{\infty}X(j\omega)\left[\int_{-\infty}^{\infty}h(t,\tau)\cdot e^{j\omega\tau}d\tau\right]\cdot d\omega \\ &=\frac{1}{2\pi}\int_{-\infty}^{\infty}X(j\omega)\left[\int_{-\infty}^{\infty}h(t,\tau)\cdot e^{-j\omega(t-\tau)}d\tau\right]\cdot e^{j\omega t}\cdot d\omega \\ &=\frac{1}{2\pi}\int_{-\infty}^{\infty}X(j\omega)\cdot H(j\omega;t)\cdot e^{j\omega t}\cdot d\omega \end{align}\]

where \(H(j\omega;t)\) is time-varying transfer function, defined as the Fourier transform of the time-varying impulse response. \[ H(j\omega;t)=\int_{-\infty}^{\infty}h(t,\tau)\cdot e^{-j\omega(t-\tau)}d\tau \] And it follows that: \[ Y(j\omega)=H(j\omega;t)\cdot X(j\omega) \] And

\[\begin{align} x(\tau) & \overset{FT}{\longrightarrow} X(j\omega) \\ h(t,\tau) & \overset{FT}{\longrightarrow} H(j\omega;t) \end{align}\]

For linear, periodically time-varying (LPTV) systems, \(h(t, \tau) = h(t+T, \tau+T)\) and \(H(j\omega; t) = H(j\omega; t+T)\) where \(T\) is the period of the time-varying dynamics of the system.

We prove \(H(j\omega; t) = H(j\omega; t+T)\):

\[\begin{align} \because H(j\omega;t)&=\int_{-\infty}^{\infty}h(t,\tau)\cdot e^{-j\omega(t-\tau)}d\tau \\ \therefore H(j\omega;t+T) &= \int_{-\infty}^{\infty}h(t+T,\tau)\cdot e^{-j\omega(t+T-\tau)}d\tau \\ &= \int_{-\infty}^{\infty}h(t+T,\tau+T)\cdot e^{-j\omega(t+T-(\tau+T))}d(\tau+T) \\ &= \int_{-\infty}^{\infty}h(t+T,\tau+T)\cdot e^{-j\omega(t-\tau)}d\tau \\ &= \int_{-\infty}^{\infty}h(t,\tau)\cdot e^{-j\omega(t-\tau)}d\tau \\ &= H(j\omega;t) \end{align}\]

PSS + PAC Method

Since \(H(j\omega;t)\) is periodic in \(T\), the time-varying transfer function \(H(j\omega;t)\) can be expressed as a Fourier series: \[ H(j\omega;t)=\sum_{m=-\infty}^{\infty}H_m(j\omega) \cdot e^{jm\omega_c t} \] where \(\omega_c\) is the fundamental frequency of the periodic system. \(H_m(j\omega)\) represents the frequency response of the system at the m-th harmonic output sideband to a unit \(j\omega\) sinusoid.

The above equation links the time-varying transfer function \(H(j\omega;t)\) with the PAC simulation output

Consider the response to a periodic impulse train, that is: \[ x(t)=\sum_{n=-\infty}^{\infty}\delta(t-\tau-nkT) \] The idea is that if the impulse response of the system settles to zero long before the next impulse arrives, then the system response to this impulse train would be approximately equal to the periodic repetition of the true impulse response, i.e.: \[ y(t) \cong \sum_{n=-\infty}^{\infty}h(t,\tau+nkT) \] and \(y(t)\) would be approximately equal to \(h(t,\tau)\) for \(\tau \leq t \lt \tau+kT\)

yt.drawio

Without loss of generality and for computation convenience, we set \(k=1\) thereafter.

The Fourier transform \(X(j\omega)\) of the T-periodic impulse train is: \[ X(j\omega)=\omega_c\sum_{n=-\infty}^{\infty}\delta(\omega-n\omega_c)\cdot e^{-j\omega\tau} \] Then the response \(y(t)\) is: \[ y(t)=\frac{1}{T}\sum_{n=-\infty}^{\infty}H(jn\omega_c;t)\cdot e^{jn\omega_c\cdot(t-\tau)} \] The expression for the approximate time-varying impulse response: \[ h(t,\tau) = \left\{ \begin{array}{cl} \frac{1}{T}\sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}H_m(jn\omega_c)\cdot e^{jm\omega_ct+jn\omega_c\cdot (t-\tau)} & : \ \tau \leq t \lt \tau+T \\ 0 & : \ \text{elsewhere} \end{array} \right. \] Finally, the ISF \(\Gamma(\tau)\) is equal to \(h(t,\tau)\) when \(t=t_0\) and \(t_0 \gt \tau\) \[ \Gamma(\tau)\cong \frac{1}{T}\sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}H_m(jn\omega_c)\cdot e^{jm\omega_ct_0+jn\omega_c\cdot (t_0-\tau)} \] In practice, the summations are carried out over finite ranges of n and m, for example, -50~50.

For each combination of n and m, the PAC analysis needs to be performed to compute \(H_m(jn\omega_c)\), the m-th harmonic response to the excitation at \(n\omega_c\)

The detailed procedure for characterizing the ISF of this sampler is outlined as follows:

  • First, apply the proper input voltages that place the sampler in a metastable state and perform the periodic steady-state (PSS) analysis.

  • Second, perform the PAC analysis.

  • Third, based on the simulated PAC response, pick a time point \(t_0\) at which the ISF is to be computed and derive the ISF

One possible candidate for the ISF measurement point \(t_0\) is the time at which the output voltage is amplified to the largest value. PAC response of the sampler to a small signal DC input, that is, the time-varying transfer function evaluated at \(\omega=0\) \[ H(0;t)=\sum_{m=-\infty}^{\infty}H_m(0) \cdot e^{jm\omega_c t} \] image-20220614214446328


The total area under the ISF is the sampling gain, which is equal to the time-varying gain measured at \(t_0\) to a small signal DC input (\(\omega=0\))

Because \(H(j\omega;t)=\int_{-\infty}^{\infty}h(t,\tau)\cdot e^{-j\omega(t-\tau)}d\tau\), evaluating it at \(\omega=0\) and \(t=t_0\) gives \[ H(0;t_0)=\int_{-\infty}^{\infty}h(t_0,\tau)d\tau = \int_{-\infty}^{\infty}\Gamma(\tau)d\tau \]

time-varying gain at t0 H(0;t0): 19.486305
The total area under the ISF: 19.990230
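The double sum for \(\Gamma(\tau)\) and the area check can be sketched in a few lines. The example below does not read simulator data: as a stand-in for the PAC results it uses a hypothetical single-pole, time-invariant response (so only the \(m=0\) sideband is non-zero). The dictionary layout `h_pac[(m, n)]`, the pole frequency and the grid size are all assumptions made for illustration.

```python
import cmath
import math


def isf_from_pac(h_pac, period, t0, taus):
    """Evaluate Gamma(tau) = (1/T) * sum_n sum_m H_m(j*n*wc) * exp(j*m*wc*t0 + j*n*wc*(t0 - tau)).

    h_pac maps (m, n) to the complex value H_m(j*n*wc).
    """
    wc = 2.0 * math.pi / period
    gamma = []
    for tau in taus:
        acc = 0j
        for (m, n), h in h_pac.items():
            acc += h * cmath.exp(1j * (m * wc * t0 + n * wc * (t0 - tau)))
        gamma.append(acc / period)
    return gamma


# Hypothetical stand-in for PAC data: one real pole at 4x the clock frequency
PERIOD = 1e-9
WC = 2.0 * math.pi / PERIOD
WP = 4.0 * WC
N_MAX = 50   # finite summation range, as in the -50..50 example
h_pac = {(0, n): 1.0 / (1.0 + 1j * n * WC / WP) for n in range(-N_MAX, N_MAX + 1)}

M = 256                                   # tau grid points over one period
taus = [k * PERIOD / M for k in range(M)]
t0 = 0.9 * PERIOD                         # chosen observation time
gamma = isf_from_pac(h_pac, PERIOD, t0, taus)

# Area under the ISF versus the time-varying DC gain H(0; t0)
area = sum(g.real for g in gamma) * PERIOD / M
gain_dc = sum(h * cmath.exp(1j * m * WC * t0)
              for (m, n), h in h_pac.items() if n == 0).real
print(area, gain_dc)
```

For this toy case the two numbers should agree to within numerical rounding, because summing over a uniform grid covering exactly one period cancels every \(n \neq 0\) term.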

Align pss_td.pss with ISF

****************************************************
Periodic Steady-State Analysis `pss': fund = 500 MHz
****************************************************
Trying `homotopy = gmin' for initial conditions.
DC simulation time: CPU = 4.237 ms, elapsed = 4.27389 ms.

===============================
`pss': time = (0 s -> 102.6 ns)
===============================

Opening the PSF file ../psf/pss.tran.pss ...
...
Important parameter values in tstab integration:
start = 0 s
outputstart = 0 s
stop = 102.6 ns
period = 2 ns
maxperiods = 20
step = 102.6 ps
...

tstab = 102.6 ns can be observed in pss simulation log

image-20220614214537033

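tstab = 102.6e-9;                            % actual tstab reported in the pss log
tshift = mod(tstab, Tc);                     % phase offset of tstab within one clock period Tc
tt_shift = tt - tshift;                      % ISF time axis referred to the start of pss_td
tt_shift_start_indx = find(tt_shift>=0, 1);  % first sample at or after the pss_td start
isf_shift = circshift(isf_re, -(tt_shift_start_indx-1));  % rotate so that this sample comes first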

Align pss_fd.pss with ISF

Since both originate from frequency-domain data, a time shift is NOT needed

image-20220614214613574

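function wv = wv_fd(fname,tt)
% Rebuild a time-domain waveform from exported PSS frequency-domain data.
% fname: CSV file with one header row; columns are frequency (Hz), real part, imaginary part
% tt:    row vector of time points (s)
fd = csvread(fname, 1, 0);                     % skip the header row
DC = fd(1, 2);                                 % first data row is the DC term
w = 2*pi*fd(2:end, 1);                         % harmonic angular frequencies (rad/s)
coef = fd(2:end, 2) + 1i*fd(2:end, 3);         % complex Fourier coefficients
exp_sup = 1i*w.*tt;                            % harmonics-by-time matrix of exponents
wv = sum(real(coef .* exp(exp_sup)), 1) + DC;  % sum the harmonics at each time point
end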

PSS + PAC Setup

  • clock frequency should be low enough to ensure that the system response settles to zero.
  • Beat Frequency of PSS should be the clock frequency
  • For PAC setup,
    • the Sweeptype is absolute
    • Input Frequency Sweep Range(Hz) should be large enough.
    • Sweep Type should be Linear and Step Size should equal PSS Beat Frequency(Hz)
    • SideBands should be large enough, like 50 (i.e. 50*2 +1, positive, negative and 0)
    • Specialized Analyses should be None

One example: the clock (i.e. beat) frequency is 8G. For PAC, the input frequency is swept from -400G to 400G with a step of 8G, which is the beat frequency; here K=1 in Eq.(9) of the paper.
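The size of the resulting data set follows directly from this setup. The sketch below enumerates the input-frequency indices \(n\) and the sideband indices \(m\) for the example above; the function name `pac_grid` is hypothetical and is not a simulator API.

```python
def pac_grid(beat_hz, sweep_start_hz, sweep_stop_hz, sidebands):
    """Return the input-harmonic indices n and the sideband indices m of a PAC run."""
    n_min = round(sweep_start_hz / beat_hz)
    n_max = round(sweep_stop_hz / beat_hz)
    ns = list(range(n_min, n_max + 1))             # linear sweep with step = beat frequency
    ms = list(range(-sidebands, sidebands + 1))    # negative, zero and positive sidebands
    return ns, ms


ns, ms = pac_grid(8e9, -400e9, 400e9, 50)
print(len(ns), "input frequencies, n =", ns[0], "...", ns[-1])
print(len(ms), "sidebands, m =", ms[0], "...", ms[-1])
print(len(ns) * len(ms), "values of H_m(j*n*wc) in total")
```

Each pair \((m, n)\) corresponds to one term \(H_m(jn\omega_c)\) in the double sum for \(\Gamma(\tau)\).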

freqaxis=out: the freqaxis setting of PAC affects not only the output of "Direct Plot" but also the simulation data, i.e. the phase shift (imaginary part).

matlab matrix nonconjugate transpose:

transpose, .' cf. https://www.mathworks.com/help/matlab/ref/transpose.html

tstab in PSS

Using shooting PSS, the steady waveform starts from tstab+n*tperiod.

  • pss_td.pss is a one-period waveform starting from tstab+n*tperiod
  • pss_fd.pss contains the complex Fourier series coefficients of the pss_td.pss waveform (tstab+n*tperiod : tstab+(n+1)*tperiod) extended periodically to the left and right

We have to left-shift the waveform rebuilt from pss_fd.pss by mod(tstab, tperiod) in order to align it with pss_td.pss
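The shift can be illustrated with a single harmonic. The sketch below uses the fundamental coefficient and the tstab and period values that appear in the script and logs in this section; the helper names are hypothetical. It checks that a waveform starting at tstab equals the frequency-domain waveform advanced by mod(tstab, tperiod).

```python
import cmath
import math


def tstab_shift(tstab, period):
    """Phase offset of tstab within one period, i.e. mod(tstab, tperiod)."""
    return math.fmod(tstab, period)


def harmonic(t, coef, freq):
    """Real waveform of one harmonic: real(coef * exp(j*2*pi*freq*t))."""
    return (coef * cmath.exp(2j * math.pi * freq * t)).real


FREQ = 1e9
COEF = complex(-0.155222, -0.0247045)   # fundamental coefficient used in the script below
TSTAB = 1.3e-9
TSHIFT = tstab_shift(TSTAB, 1.0 / FREQ)

# pss_td sample at offset t' corresponds to absolute time tstab + t',
# which by periodicity equals the pss_fd waveform evaluated at tshift + t'
for k in range(5):
    t_rel = k * 0.2e-9
    print(harmonic(TSTAB + t_rel, COEF, FREQ), harmonic(TSHIFT + t_rel, COEF, FREQ))

# The longer run in the first log: tstab = 102.6 ns with a 2 ns period
print(tstab_shift(102.6e-9, 2e-9))
```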

image-20220610222535614

simulation log

The stop = 1.3 ns below is the actual tstab time, even though the Stop Time(tstab) field of the pss form is filled with 0.3n

**************************************************
Periodic Steady-State Analysis `pss': fund = 1 GHz
**************************************************
DC simulation time: CPU = 208 us, elapsed = 211.954 us.

=============================
`pss': time = (0 s -> 1.3 ns)
=============================

Opening the PSF file ../psf/pss.tran.pss ...

Output and IC/nodeset summary:
save 1 (current)
save 2 (voltage)

Important parameter values in tstab integration:
start = 0 s
outputstart = 0 s
stop = 1.3 ns
period = 1 ns
maxperiods = 20
step = 1.3 ps
maxstep = 40 ps
ic = all
useprevic = no
...

pss: time = 64.01 ps (4.92 %), step = 31.63 ps (2.43 %)
...
pss: time = 1.224 ns (94.2 %), step = 40 ps (3.08 %)
pss: time = 1.3 ns (100 %), step = 35.99 ps (2.77 %)
...

PSS simulation result

image-20220610224100135

Align pss_td and pss_fd

image-20220610225310243

clear;
clc;

freq = 1e9;
tstab = 1.3e-9;
Tp = 1e-9;

load('pss_td.matlab')
t = pss_td(:, 1);
ytd = pss_td(:, 2);
plot(t*1e9, ytd, 'k', 'LineWidth',6)
hold on;

% time domain waveform from pss frequency domain information
coff_real = -0.155222;
coff_imag = -0.0247045;
wc = 2*pi*freq;
tfd = (0:1e-11:2e-9);
yfd = coff_real*cos(wc*tfd) - coff_imag*sin(wc*tfd);
plot(tfd*1e9, yfd, 'b')

% actual pss_td.pss one-period waveform
tfd_td = (tstab:1e-11:2e-9);
yfd_td = coff_real*cos(wc*tfd_td) - coff_imag*sin(wc*tfd_td);
plot(tfd_td*1e9, yfd_td, '--b', 'LineWidth', 4)

% align pss_fd with pss_td by left-shifting pss_fd by mod(tstab, Tp)
tshift = mod(tstab, Tp);
tfd_shift = tfd - tshift;
tfd_shift_start_indx = find(tfd_shift>=0, 1);
tfd_shift = tfd_shift(1, tfd_shift_start_indx:end);
yfd_shift = yfd(1, tfd_shift_start_indx:end);
plot(tfd_shift*1e9, yfd_shift, '-magenta', 'LineWidth', 2)
grid on;

xlabel('t (ns)');
ylabel('V(t)');
legend('Using pss\_td', 'Using pss\_fd', 'pss\_td one period clip', 'Using pss\_fd with time shift', 'location', 'east');

Transient Method

TODO 📅

reference

J. Kim, B. S. Leibowitz and M. Jeeradit, "Impulse sensitivity function analysis of periodic circuits," 2008 IEEE/ACM International Conference on Computer-Aided Design, 2008, pp. 386-391, doi: 10.1109/ICCAD.2008.4681602. [https://websrv.cecs.uci.edu/~papers/iccad08/PDFs/Papers/05C.2.pdf]

M. Jeeradit et al., "Characterizing sampling aperture of clocked comparators," 2008 IEEE Symposium on VLSI Circuits, Honolulu, HI, USA, 2008, pp. 68-69 [https://people.engr.tamu.edu/spalermo/ecen689/sampling_aperature_comparators_vlsi_2008.pdf]

T. Toifl et al., "A 22-gb/s PAM-4 receiver in 90-nm CMOS SOI technology," in IEEE Journal of Solid-State Circuits, vol. 41, no. 4, pp. 954-965, April 2006 [https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=4d1f0442be77425ed34b9dcfd48fbfff954a707b]

Sam Palermo, ECEN 720 High-Speed Links: Circuits and Systems [Lecture 6: RX Circuits], [Lab4 - Receiver Circuits]

CDM (Charged Device Model)

Jordan Davis, Samsung Electronics. Full-Chip CDM Analysis: Is Static Simulation Enough? [https://www.synopsys.com/content/dam/synopsys/implementation&signoff/electrical-layout-verification-documents/esd-workshop-2021-pres.pdf]

M. Etherton et al., "A new full-chip verification methodology to prevent CDM oxide failures," 2015 37th Electrical Overstress/Electrostatic Discharge Symposium (EOS/ESD), Reno, NV, USA, 2015 [pdf]

P.E. Allen 2021. Lesson 4 – ESD Input Circuit Protection [https://aicdesign.org/wp-content/uploads/2021/05/Lesson04_ESD_Input_Ckt_Protection210323.pdf]

M. Di, H. Wang, F. Zhang, C. Li, Z. Pan and A. Wang, "Does CDM ESD Protection Really Work?," 2019 IEEE Workshop on Microelectronics and Electron Devices (WMED), Boise, ID, USA, 2019 [https://sci-hub.se/10.1109/WMED.2019.8714145]

On-Chip Decoupling Capacitors

Y. -C. Huang and M. -D. Ker, "Study on CDM ESD Robustness Among On-Chip Decoupling Capacitors in CMOS Integrated Circuits," in IEEE Journal of the Electron Devices Society, vol. 9, pp. 881-890, 2021 [pdf]

Y. -C. Huang and M. -D. Ker, "Investigation of CDM ESD Protection Capability Among Power-Rail ESD Clamp Circuits in CMOS ICs With Decoupling Capacitors," in IEEE Journal of the Electron Devices Society, vol. 11, pp. 84-94, 2023

image-20251023234029891

image-20251026210502523

image-20251026205645844

The NMOS capacitor with DNW benefits from the parasitic junction formed between the P-substrate and the DNW, which reduces the probability of ESD damage to the thin gate oxide layer of the NMOS capacitor.

It therefore shows higher CDM ESD robustness than the other two designs, whose decoupling capacitors are realized with a varactor and an NMOS capacitor.

Circuit-Level CDM Model

H. Wang, F. Zhang, C. Li, M. Di and A. Wang, "Chip-Level CDM Circuit Modeling and Simulation for ESD Protection Design in 28nm CMOS," 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Qingdao, China, 2018

—. (2018). A Chip-Level CDM ESD Protection Circuit Modeling and Simulation Method and Experimental Verification. UC Riverside [https://escholarship.org/content/qt1355v6vs/qt1355v6vs.pdf]

Today's understanding: on-die CDM charge is stored in the substrate


The circuit model is divided into three parts:

  • IC package

  • substrate resistance & capacitance

  • protection devices & circuit elements

image-20251023224827861

All charges are considered to be distributed on the surface of the IC die, i.e., the Si substrate

The surface-stored charges are modeled using the capacitors at the surfaces of the IC substrate

image-20251030233044387

image-20251030233715548

image-20251030232957818


A Vulnerable Circuit Topology — cascode topology

image-20251023232145919

image-20251023232931664

Parasitic Capacitance Path

Lin, Chun-Yu, Tang-Long Chang and Ming-Dou Ker. "Investigation on CDM ESD events at core circuits in a 65-nm CMOS process." Microelectron. Reliab. 52 (2012) [pdf]

CDM ESD issue due to the coupled current when I/O circuit is stressed by CDM ESD

image-20251022230235648

negative CDM ESD event

image-20251022224704366

image-20251022224753852

positive CDM ESD event

image-20251022224839422

CDM Failure Mechanisms

  • reverse S/D junctions
  • capacitively coupled through the gate

image-20251022212114852

For a bare Si die, the charges induced by whatever procedures, are stored inside the IC die randomly, unpredictably and anywhere, e.g., in the substrate, along the metal rails or locally to transistors

image-20251022233233120


M. Etherton et al., "A new full-chip verification methodology to prevent CDM oxide failures," 2015 37th Electrical Overstress/Electrostatic Discharge Symposium (EOS/ESD), Reno, NV, USA, 2015 [https://www.synopsys.com/content/dam/synopsys/implementation&signoff/electrical-layout-verification-documents/cdm-esd-paper.pdf]

Note that there is no notable CDM current flow in the signal route

image-20251022220920245


Yorgos Christoforou. Why negative polarity CDM ESD leads more often to failure [https://ycindustrial.wordpress.com/2014/01/15/why-negative-polarity-cdm-esd-leads-more-often-to-to-failure/]

?? suppose that charged package and substrate are same electric potential

image-20251022222016944

Misconception in CDM ESD Protection

Two factors affect the internal CDM discharge routing:

  • the amount of electrostatic charge stored inside the IC
  • more importantly, their internal distribution within a chip

image-20251019001052851


Wang, Han, Feilong Zhang, Cheng Li, Mengfu Di and Albert Z. H. Wang. “Chip-Level CDM Circuit Modeling and Simulation for ESD Protection Design in 28nm CMOS.” 2018 14th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT) (2018) [pdf]

It is generally believed that the induced electrostatic charges are stored on the package frame and/or on the supply buses in a lumped way

image-20251019003516926

image-20251019004248674


induced electrostatic charges are randomly distributed throughout a bare die of mixed-signal IC, anywhere and everywhere

image-20251019122233711

Field Induced CDM (FICDM)

Field Induced CDM Explained [https://certus-semi.com/field-induced-cdm-explained/]

image-20251018095202194

Confusion about the test procedure is understandable because the actual process is opposite from what is expected

  • field induction does not place any charge on the device

  • the "discharge" when the pogo pin first touches the DUT is when the DUT is actually charged

\(C_{DF}\) is the capacitance of the DUT to the field plate

\(C_{DG}\) is the capacitance of the DUT to the ground plane

\(C_{FG}\) is the capacitance of the field plate to the ground plane

\(C_{DF}\gg C_{DG}\) — the separation of the DUT from the field plate is always much less than the separation of the DUT from the ground plane

Assuming no initial charge on the DUT, with the switch S open the DC voltage between the DUT and the Field Plate is \[ V_{DF} = \frac{C_{DG}}{C_{DG} + C_{DF}}\cdot V_{HV} \approx 0 \]

  • DUT potential will therefore closely track the power supply voltage
  • The potential of the DUT relative to the ground plane can therefore be controlled without actually putting any net charge on the DUT
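The capacitive divider can be evaluated with a few lines of code. In the sketch below the capacitance values (\(C_{DF}\) ten times \(C_{DG}\)) and the 500 V supply are assumptions chosen only to illustrate the tracking; they are not measured tester values.

```python
def ficdm_dut_potential(v_hv, c_df, c_dg):
    """Return (V_DF, V_DG) for an uncharged DUT with the switch S open.

    V_DF is the voltage between field plate and DUT,
    V_DG is the DUT potential relative to the ground plane.
    """
    v_df = c_dg / (c_dg + c_df) * v_hv
    v_dg = v_hv - v_df
    return v_df, v_dg


# Hypothetical values: C_DF = 2 pF, C_DG = 0.2 pF, field plate at +500 V
v_df, v_dg = ficdm_dut_potential(500.0, 2e-12, 0.2e-12)
print(f"V_DF = {v_df:.1f} V, DUT potential = {v_dg:.1f} V")
```

With this ratio the DUT sits within about 10% of the field-plate voltage even though its net charge is still zero.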

CDM Test Sequence

  1. With the field plate at zero volts an uncharged DUT is placed on the field plate in the dead bug position and the ground plane is positioned with the pogo pin above the pin to be tested

  2. The field plate is raised to a high potential, for example +500 V. The high value resistor ensures that the field plate changes potential relatively slowly. The slow change in potential ensures that the DUT is not damaged before the CDM event.

    The potential of the DUT will closely track the field plate, reaching in excess of 450 V, although there will be no net charge on the DUT

    Capacitive coupling elevates the potential of the integrated circuit to a voltage close to that of the field plate

  3. After the voltage has stabilized the separation between the field plate and the ground plane is reduced until an arc forms between the pogo pin and the DUT pin and eventually the two pins touch.

    This is equivalent to closing the switch S in Figure 3

  4. Closing S in the circuit diagram produces a very rapid grounding of the DUT and a redistribution of charge between the three capacitors

    At this point the DUT is charged and the potential between the field plate and the ground plane has fallen as the capacitor \(C_{FG}\) provides charge to the DUT

    During this redistribution of charge, which usually lasts under 2 ns, the high voltage power supply and the high value resistor can be ignored because of their slow response time

  5. After the initial redistribution of charge the field plate will slowly return to the voltage on the high voltage power supply, while the DUT remains at zero potential, but in a charged state

  6. With the pogo pin still touching the DUT pin the HV power supply voltage is set to zero. The field plate will slowly return to zero volts and the charge on the DUT will slowly bleed off through the pogo pin.
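Steps 3 to 5 can be sketched as a charge-conservation calculation on the field-plate node, which is isolated during the fast event because the high value resistor cannot respond in time. The lumped three-capacitor model follows the description above; the capacitance values and the function name are assumptions for illustration, and the arc and pogo-pin impedance are ignored.

```python
def ficdm_discharge(v_hv, c_df, c_dg, c_fg):
    """Return (field-plate voltage just after S closes, DUT charge just after, DUT charge after recovery)."""
    # Before closing S: uncharged DUT floats on the C_DF / C_DG divider
    v_df = c_dg / (c_dg + c_df) * v_hv
    # Charge held on the field-plate node by C_FG and C_DF
    q_field_node = c_fg * v_hv + c_df * v_df
    # After closing S the DUT is grounded, so both capacitors see the field-plate voltage
    v_fp_after = q_field_node / (c_fg + c_df)
    # Net charge left on the DUT (C_DG is shorted and holds no charge)
    q_dut_after = -c_df * v_fp_after
    # Once the supply has slowly recharged the field plate to v_hv
    q_dut_final = -c_df * v_hv
    return v_fp_after, q_dut_after, q_dut_final


# Hypothetical values: C_DF = 2 pF, C_DG = 0.2 pF, C_FG = 10 pF, +500 V
v_fp, q_fast, q_final = ficdm_discharge(500.0, 2e-12, 0.2e-12, 10e-12)
print(f"field plate dips to {v_fp:.1f} V")
print(f"DUT charge right after the event: {q_fast * 1e12:.1f} pC")
print(f"DUT charge after recovery:        {q_final * 1e12:.1f} pC")
```

In this model the field-plate voltage dips because \(C_{FG}\) supplies part of the charge, and the DUT ends at zero potential but charged, as steps 4 and 5 describe.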

single & dual discharge method

single discharge procedure:

single positive or single negative CDM ESD pulse is applied to DUT for individual CDM discharge

image-20251018171927965

the single discharge procedure involves only one CDM discharge to stress the DUT device


dual discharge procedure:

single positive and single negative CDM ESD pulses are applied to produce one pair of alternating polarity CDM discharges to zap the DUT

image-20251018171825654

Static Induction

ESD Static Induction & Double Jeopardy Demonstration [https://youtu.be/RGN-PvAE-OI]

image-20251018164948453

Electrostatic Equilibrium State

image-20251031003543399

image-20251031002004910

CDM Tester Model Using Spice

Robert Ashton. Simulating Small Device CDM Using Spice [https://incompliancemag.com/simulating-small-device-cdm-using-spice/]

TODO 📅

The value of CFG is also based on a parallel plate capacitor model with a peripheral capacitance term minus a capacitance representing a shielding of the Field Plate to ground plane capacitance due to the size of the device under test

image-20251101000506994

image-20251101003112971


Challenges of CDM Modeling for High-Speed Interface Devices [https://incompliancemag.com/challenges-of-cdm-modeling-for-high-speed-interface-devices/]

image-20251031225516239

Problem Statement of a lumped capacitor CDUT

image-20251031225558484

Distributed DUT model

To assess the design solutions, a distributed DUT model, as presented in Figure 3, can be plugged into the CDM tester model, replacing the lumped DUT capacitor

  • The maximum voltage difference between Vdut and Vss (Vdd) should not exceed the breakdown voltage of the gates.
  • On-die parasitics of Vss and Vdd nets strongly influence the actual voltage waveform at the input gate oxide. In particular, oscillations and spikes in the voltage waveform are sensed by the gate oxide and can lead to damage

image-20251031232430782

image-20251031231621824

ficdm.drawio

Diode capacitance vs. Vpn

TODO 📅

anti-parallel ESD diode

M. Etherton et al., "A new full-chip verification methodology to prevent CDM oxide failures," 2015 37th Electrical Overstress/Electrostatic Discharge Symposium (EOS/ESD), Reno, NV, USA, 2015 [https://www.synopsys.com/content/dam/synopsys/implementation&signoff/electrical-layout-verification-documents/cdm-esd-paper.pdf]

image-20251022220718331

PERC

  • CD: current density checks

  • P2P: point to point resistance checks

  • LDL: logic driven layout checks, latch up related

  • TOPO: topology, circuit connection and device size checks

database

  • CD, P2P, LDL : dfmdb

  • TOPO: svdb

Frank Feng. New Approach For Full Chip Electrical Reliability Verification [pdf]

Calibre PERC Catalog Test-Cases & Common Examples Version 2.0

Latchup

Latch-up in CMOS circuits: threat or opportunity (part 1) [https://monthly-pulse.com/2021/01/05/latch-up-in-cmos-circuits-threat-or-opportunity-part-1/]

Latch-up in CMOS circuits: threat or opportunity (part 2) [https://monthly-pulse.com/2021/01/05/latch-up-in-cmos-circuits-threat-or-opportunity-part-2/]

image-20250615105811120

image-20250615090308047

This can happen when a parasitic thyristor, which is essentially a pair of interconnected transistors, is triggered into a latched state, leading to sustained current flow and potential device failure.

Necessary Conditions

image-20250615111234769

image-20250802083418387

Trigger Modes

image-20250802083545596

latchup-prevention technique

image-20250615073909333

image-20250615085647213

Technical Paper Ensuring latch-up guard rings ESDA rules using Calibre PERC [https://resources.sw.siemens.com/en-US/technical-paper-ensuring-latch-up-guard-rings-esda-rules-using-calibre-perc/]

Protect against ESD by ensuring latch-up guard rings [https://semiwiki.com/eda/362551-protect-against-esd-by-ensuring-latch-up-guard-rings/]

Guard Rings

One important technique is the use of guard rings, the heavily doped regions surrounding sensitive components on the IC to divert excess current away from vulnerable areas, thereby reducing the likelihood of latch-up occurrence

These guard rings not only function as barriers against parasitic thyristor (SCR) formation but also serve to isolate different regions of the IC, minimizing unwanted electrical interactions and maintaining pathway integrity

image-20250615085930327

image-20250615115154079

image-20250615115306640

P.E. Allen - 2016. CMOS Analog Circuit Design: Lecture 08 – Latchup and ESD (4/25/16) [https://aicdesign.org/wp-content/uploads/2018/08/lecture08-160425.pdf]


[https://analoghub.ie/category/Layout/article/layoutDependentEffects#Latchup]

Latch-Up triggers:

  • Power up
  • Overshoot voltages and currents
  • Substrate noise
  • ESD occurrences

image-20251009223259947

Latch-up key points:

  • State of an IC when it is made inoperable by a parasitic shorting of VDD and VSS;
  • Triggering of a Low Impedance High Current state between supplies;
  • High Current State remains even when trigger signal is removed.

image-20251009223343419

Latch-up prevention:

  • Guard rings have a lot of contacts, providing a strong VDD/Ground potential;
  • Guard rings add parallel resistive paths to the NWELL/substrate, thereby reducing the effective parasitic resistance;
  • NWELL/substrate potentials are held around VDD/Ground, no positive feedback is formed.

Transient-Induced Latchup

image-20250615101046612

image-20250615103508641

image-20250615105240076

image-20250615105309789

OD injector

image-20250615104009056

Diode in ESD Protection

A diode can operate in both forward and reverse modes for ESD protection.

\(R_{ON}\) for a forward-biased diode is lower than that for a reverse-biased diode

One major disadvantage of a forward diode-string for ESD protection is that the leakage current (Ileak) may be enlarged due to the Darlington effect in the diode-string

Silicon Controlled Rectifiers (SCR)

A thyristor (also known as a Silicon Controlled Rectifier or SCR) is a three-terminal semiconductor device used as an electronic switch or rectifier

thyristor_construction-1

To turn the thyristor on, a positive voltage pulse is applied to the gate (G) terminal. This voltage pulse needs to be of sufficient magnitude to trigger the device. When the gate is triggered, it allows a small current to flow into the base of the N-P-N transistor within the thyristor structure

image-20250615102217116

[https://ec2-44-207-46-173.compute-1.amazonaws.com/thyristor/]

image-20250615111100951

ESD design window

[https://monthly-pulse.com/2021/06/02/the-esd-design-window-concept/]

[https://www.researching.cn/ArticlePdf/m00098/2020/41/12/122403.pdf]

image-20241124163116072

  • Transparency
    • Trigger voltage Vt1
    • Holding/clamping voltage Vh
  • Robustness
    • failure current level It2
  • Effectiveness
    • maximum voltage of the clamp device: Vmax

image-20250712182622220

image-20250712182907493

You Li. CICC2020: ESD Protection Design Overview in Advanced SOI and Bulk FinFET Technologies


[https://picture.iczhiku.com/weixin/message1640668908028.html]

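image

The ESD operating region is called the "design window"

The trigger voltage (Vt1) of the protection device defines the level at which it is designed to turn on. The holding voltage (VHold) after triggering is the clamping level, which should be higher than the applied operating voltage. Finally, It2 is the ESD failure current level.

As shown by the blue curves (1A or 1B), the NMOS transistor enters bipolar (npn) breakdown at the trigger point Vt1, snaps back quickly to the holding voltage VHold, and protects up to the failure current. I_ESD corresponds to the ESD target level. (It2, Vt2) is the thermal failure point at which the protection device may burn out, so It2 must be larger than the I_ESD target current (for example, a 1.5 kV HBM target corresponds to about 1 A). If the on-resistance (Ron) of the protection device is too high, Vt2 may also reach the reliability voltage limit. The clamp must trigger effectively so that the voltage build-up does not exceed the gate-oxide breakdown voltage (BVox) or the transistor breakdown voltage. The transistor's VHold is designed with some margin above the operating voltage, as shown in curve 1A. Conversely, a snapback device whose VHold is below the operating voltage (curve 1B) carries a risk of EOS damage.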
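The design-window conditions above can be checked numerically. The sketch below is only an illustration: it assumes the standard 1.5 kΩ HBM series resistor to convert the HBM target voltage into a peak current (which reproduces the 1.5 kV to 1 A example), and it models the on-state of the clamp as a straight line `vhold + ron * i`. The function names and all device numbers are hypothetical.

```python
def hbm_peak_current(v_hbm, r_hbm=1500.0):
    """Approximate HBM peak current: precharge voltage over the 1.5 kOhm resistor."""
    return v_hbm / r_hbm


def check_design_window(vt1, vhold, ron, it2, vdd_max, bv_limit, v_hbm):
    """Evaluate the design-window conditions for a snapback clamp (linear on-state model)."""
    i_esd = hbm_peak_current(v_hbm)
    v_at_i_esd = vhold + ron * i_esd  # clamp voltage at the target ESD current
    return {
        "i_esd": i_esd,
        "v_at_i_esd": v_at_i_esd,
        "hold_above_supply": vhold > vdd_max,        # otherwise EOS/latch risk (curve 1B)
        "trigger_below_breakdown": vt1 < bv_limit,   # clamp turns on before oxide/junction fails
        "clamp_below_breakdown": v_at_i_esd < bv_limit,
        "robust": it2 > i_esd,                       # It2 must exceed the target current
    }


# Hypothetical numbers for a 3.3 V I/O with a 1.5 kV HBM target.
curve_1a = check_design_window(vt1=8.0, vhold=5.0, ron=2.0, it2=1.8,
                               vdd_max=3.6, bv_limit=10.0, v_hbm=1500.0)
curve_1b = check_design_window(vt1=8.0, vhold=2.5, ron=2.0, it2=1.8,
                               vdd_max=3.6, bv_limit=10.0, v_hbm=1500.0)
print(curve_1a)
print(curve_1b)
```

With these made-up values the curve-1A device passes every check, while the curve-1B device fails only the holding-voltage condition.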

Two-Stage ESD Protection

two-stage primary–secondary ESD protection

a primary ESD protection structure (ESD1), a secondary ESD protection unit (ESD2), and an isolation resistor (\(R\))

The desired specs for ESD2 are low \(V_\text{t1}\) and short \(t_1\), while those for ESD1 include low \(R_{ON}\), low \(V_\text{h}\) and high \(I_\text{t2}\)

  • The primary ESD1 structure is typically optimized for high ESD protection level, which however may feature a high ESD \(V_\text{t1}\), not suitable for low-voltage (LV) ICs

  • The secondary ESD2 unit serves as a trigger-assisting device that features a lower ESD \(V_\text{t1}\) and fast ESD triggering, which is typically weak in handling large ESD discharge currents

The isolation \(R\) has another role, which is to prevent an ESD pulse from getting into IC core (i.e., stressing the input device) directly, hence avoid possible CMOS gate breakdown

\(R\) involves a design trade-off too: large enough for fast voltage build up, but not too large to avoid adverse impact on signal propagation

The two-stage ESD protection method is re-gaining attention for CDM ESD protection because it can handle large ESD surges without overheating, while preventing CMOS gate breakdown due to the isolation R (i.e., no direct zapping on the input gate)

img

  1. Adding a (small) clamp behind the isolation resistance can extend the ESD design window, e.g. enabling dual diode protection for thin oxide transistors.
  2. ESD current through this clamp will build-up voltage across the isolation resistance, while protecting the circuit.
  3. The higher voltage at the IN pad will then trigger the primary protection (red current path)


img

Extended ESD design window example. The failure voltage of a thin gate oxide in advanced CMOS is about 4V. The primary ESD solution (red IV curve) introduces too much voltage. Thanks to an isolation resistance between primary and secondary local clamp device (green IV curve) additional margin is created.
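The hand-off between the secondary and primary clamp can be estimated with a simple DC model. The sketch assumes the secondary clamp behaves as `vclamp_secondary + ron_secondary * i` once it conducts, and that the primary device triggers as soon as the pad voltage reaches its \(V_\text{t1}\). The function names and numbers are hypothetical; only the roughly 4 V thin-oxide failure voltage comes from the text above.

```python
def secondary_current_to_trigger_primary(vt1_primary, vclamp_secondary,
                                         ron_secondary, r_iso):
    """Current through the secondary clamp at which the pad reaches Vt1 of the primary."""
    # Pad voltage = secondary clamp voltage + drop across the isolation resistor.
    return (vt1_primary - vclamp_secondary) / (r_iso + ron_secondary)


def gate_voltage(i_secondary, vclamp_secondary, ron_secondary):
    """Voltage seen by the protected gate, behind the isolation resistor."""
    return vclamp_secondary + ron_secondary * i_secondary


# Hypothetical values: 8 V primary trigger, 2 V secondary clamp, 5 ohm, 50 ohm isolation.
i_trig = secondary_current_to_trigger_primary(8.0, 2.0, 5.0, 50.0)
v_gate = gate_voltage(i_trig, 2.0, 5.0)
print(f"Secondary clamp current at hand-off: {i_trig * 1e3:.1f} mA")
print(f"Gate voltage at hand-off: {v_gate:.2f} V")
```

A larger isolation resistance lowers the current the small secondary clamp must carry before the primary device takes over, which is the trade-off against signal propagation mentioned earlier.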

[https://monthly-pulse.com/2022/03/29/introduction-esd-protection-concepts-for-i-os/]


image-20250712100248384

Okushima, M. and Tsuruta, J., "Secondary ESD clamp circuit for CDM protection of over 6Gbit/s SerDes application in 40nm CMOS", Microelectronics Reliability, vol. 53, no. 2, pp. 215–220, 2013 [https://sci-hub.se/https://doi.org/10.1016/j.microrel.2012.04.010]

Gated diode & STI diode

"gated diode" aka. "poly bound" diode

image-20241120212904118

image-20250712085956491

STI bound diodes typically have lower capacitance

M. Simicic, G. Hellings, S. -H. Chen, N. Horiguchi and D. Linten, "ESD diodes with Si/SiGe superlattice I/O finFET architecture in a vertically stacked horizontal nanowire technology," 2018 48th European Solid-State Device Research Conference (ESSDERC), Dresden, Germany, 2018

US9653448B2. Electrostatic Discharge (ESD) Diode in FinFET Technology


image-20241120211301296

image-20241120211426247


image-20250712181904914

?? Rotated STI Diode

image-20250712183318811

image-20250712183347973

Loke, Alvin & Yang, (2018). Analog/mixed-signal design challenges in 7-nm CMOS and beyond. 10.1109/CICC.2018.8357060.

Shih-Hung Chen. CICC 2019: Designing Diode Based ESD Protection in Advanced State of the Art Technologies

TLP/vf-TLP

TRANSMISSION LINE PULSE TESTING: THE INDISPENSABLE TOOL FOR ESD CHARACTERIZATION OF DEVICES, CIRCUITS AND SYSTEMS [https://www.esda.org/assets/News/1708-ESD-firstDraft.pdf]

[https://monthly-pulse.com/2021/06/08/transmission-line-pulse-tlp-test-system/]

Jon Barth "TLP and VFTLP Testing of Integrated Circuit ESD Protection" [https://barthelectronics.com/wp-content/uploads/2016/09/TLP-and-VFTLP-Test-of-Integrated-Circuit-ESD-Protection.pdf]

Horst A. Gieser(IZM), "ESD- Testing: HBM to very fast TLP" [https://www.thierry-lequeu.fr/data/ESREF/2004/Tut5.pdf]

image-20241124184848034

Example I-V characteristics measured with TLP

Vt1: trigger voltage

Vhold: holding voltage

soft failure current: Isoft

hard failure current: It2

TLP vs ESD

  • ESD tests simulate real world events (HBM, MM, CDM)
  • TLP does not simulate any real-world event
  • ESD tests record failure level (Qualification)
  • TLP tests record failure level and device behavior (Characterization)

TLP is not a qualification test, but a characterization method, which describes the resistance of a device for a given stimulus, aka. Device Characterization

Unlike ESD waveforms, TLP does not mimic any real world event

image-20220609234548431

TLP and Curve Tracing

  • Curve Tracing is DC; TLP is a short pulse
    • Shorter pulse - Reduced duty cycle, less heating, which means higher voltage before failure
    • Controlled Impedance - Allows device behavior to be observed
  • Both measure resistance of device with increasing voltage

image-20220609235252444

Device Characterization with TLP

  • Turn-on time
  • Snapback voltage
  • Performance changes with rise time

image-20220609235427204

VF-TLP and CDM differences

Question:

How well will VF-TLP results predict CDM testing performance?

Answer:

VF-TLP can be a guide to CDM failure levels, and provide a lot of understanding of a circuit's operation during CDM stressing, but simple correlations between VF-TLP failure current level and CDM withstand voltage levels are difficult to establish.

I-V and Leakage Evolution Plots

DC leakage current data combined with the I-V data provides electrical indications of where damage begins, and how rapidly it can evolve from soft to hard failures

Henry, Leo & Barth, Jon & Richner, John & Verhaege, Koen. (2000). Transmission Line Pulse Testing of the ESD Protection Structures in ICs - A Failure Analyst's Perspective. 203-213. 10.31399/asm.cp.istfa2000p0203. [https://barthelectronics.com/pdf_files/2000%20ISTFA%20TLP%20Testing%20of%20the%20ESD%20Protection%20Structure.pdf]

Henry, L.G. & Barth, Jon & Verhaege, K. & Richner, J.. (2001). Transmission-line pulse ESD testing of ICs: A new beginning. Compliance Engineering. 18. 46+53. [https://barthelectronics.com/pdf_files/CE%20TLP%20Article%20March-April%202001.pdf]

Snapback devices

Lesson 2 - ESD Clamps [https://aicdesign.org/wp-content/uploads/2021/05/Lesson02_ESD_Clamps210315.pdf]

Introduction of Transmission Line Pulse (TLP) Testing for ESD Analysis - Device Level [https://www.esdemc.com/public/docs/TechnicalSlides/ESDEMC_TS001.pdf]

snapback

img

BJT

image-20250726102945232

image-20250726103744211


image-20250729215703772

image-20250729220237239

Grounded-gate NMOS (ggNMOS)

[https://monthly-pulse.com/2022/02/02/time-to-say-farewell-to-the-snapback-ggnmos-for-esd-protection/]

[https://monthly-pulse.com/2023/01/26/ggnmos-grounded-gated-nmos/]

snapback ggNMOS for ESD protection

img

Influence of the pulse rise time on ggNMOS. (left side) A fast ESD pulse can couple the bulk of the NMOS to a higher potential for a short period, reducing the trigger voltage. (right side) A clear Vt1 reduction is visible, while the remaining part of the IV curve remains the same.

image-20240723213214708


image-20250729230619882

image-20250729230837254


[https://picture.iczhiku.com/weixin/message1588643699565.html]

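Typically the gate, source, and bulk are shorted together and the drain is tied to the I/O pad to take the ESD surge voltage. The NMOS version is called GGNMOS (Gate-Grounded NMOS) and the PMOS version is called GDPMOS (Gate-to-Drain PMOS). Taking the NMOS as an example: the gate is off and the source/bulk PN junction is shorted (zero bias). When a large voltage appears at the I/O, the drain/bulk PN junction goes into avalanche breakdown. The resulting bulk current develops a voltage drop across the substrate resistance, which forward-biases the bulk/source PN junction. The parasitic lateral NPN of the MOS then enters the active region (emitter junction forward-biased, collector junction reverse-biased), so the device shows a snapback characteristic and provides protection. The PMOS case follows by the same reasoning.

img

Trigger voltage / hold voltage: the trigger voltage is the first knee point mentioned earlier, i.e. the breakdown voltage of the parasitic BJT, and it must lie between BVCEO and BVCBO. The hold voltage must keep the device ON without entering latch-up; otherwise the device goes into second (thermal) breakdown and is damaged. A related concept is the second breakdown current: once in latch-up, the I^2*R heating rises sharply and melts the silicon, so the current must be limited. This can be done by controlling W/L or by adding a current-limiting high resistance. The simplest and most common method is to enlarge the drain spacing or the SAB spacing (the usual practice in ESD rules).

PN junction breakdown comes in two kinds: electrical breakdown and thermal breakdown. Electrical breakdown means avalanche breakdown (low doping) or Zener breakdown (high doping). It mainly arises from carriers creating new electron-hole pairs through impact ionization, so it is recoverable. Thermal breakdown is not recoverable, because the accumulated heat melts and destroys the silicon (Si). The current therefore has to be limited at the moment of turn-on, typically by placing a high resistance in series with the protection diode.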


img

Gate-coupled NMOS (gcNMOS)

Ming-Dou Ker, Chung-Yu Wu, Tao Cheng and Hun-Hsien Chang, "Capacitor-couple ESD protection circuit for deep-submicron low-voltage CMOS ASIC," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 4, no. 3, pp. 307-321, Sept. 1996 [https://ir.lib.nycu.edu.tw/bitstream/11536/1053/1/A1996VE01800002.pdf]

Gate-coupled NMOS (gcNMOS) was proposed to effectively reduce the \(V_\text{t1}\)

image-20250726111621772

image-20250726112517289

[https://bbs.eetop.cn/forum.php?mod=redirect&goto=findpost&ptid=353178&pid=7305079]


image-20241124161901252


image-20250730194612367

SCR (thyristor)

Guang Chen, Haigang Feng and A. Wang, "A systematic study of ESD protection structures for RF ICs," IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, 2003, Philadelphia, PA, USA, 2003 [https://sci-hub.se/10.1109/RFIC.2003.1213959]

image-20250726111753314

image-20250726104632417

[https://www.sharecourse.net/sharecourse/upload/course/180/c574580760de44d2c6fb66d8be4c6d4a.pdf]


img

Safe operating area (SOA)

image-20241120210746211

power clamp

Thanks to the device scaling the area is actually reasonable. However, the leakage becomes the main bottleneck.

bigfet-concept

high current diode (HIA)

image-20250815202404198

Both diodes are reverse-biased in normal operation. The PN junction capacitance increases with forward-bias voltage and decreases as the reverse bias grows
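The bias dependence can be illustrated with the standard depletion-capacitance expression \(C_j = C_{j0}/(1 - V_{pn}/V_{bi})^m\). This is a textbook model, not something taken from the PDK discussed here; the zero-bias capacitance, built-in potential `v_bi` and grading coefficient `m` below are hypothetical, and the expression is only meaningful for \(V_{pn}\) well below \(V_{bi}\).

```python
def junction_capacitance(v_pn, cj0, v_bi=0.7, m=0.5):
    """Depletion capacitance of a PN junction; v_pn > 0 means forward bias."""
    if v_pn >= v_bi:
        raise ValueError("model is only valid for v_pn below the built-in potential")
    return cj0 / (1.0 - v_pn / v_bi) ** m


cj0 = 100e-15  # hypothetical zero-bias capacitance, 100 fF
for v in (-2.1, -1.0, 0.0, 0.3):
    print(f"Vpn = {v:+.1f} V -> Cj = {junction_capacitance(v, cj0) * 1e15:.1f} fF")
```

For an ESD diode pair on an I/O pad this means the capacitive loading changes with the pad voltage, since one diode becomes more reverse-biased while the other becomes less so.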


image-20220618123654830

image-20220618123821117

image-20220618124644879

| Device | Description |
| --- | --- |
| ndio_mac | N+/P-well Diode |
| pdio_mac | P+/N-well Diode |
| ndio_18_mac | 1.8V N+/P-well Diode |
| pdio_18_mac | 1.8V P+/N-well Diode |
| ndio_hia18_mac | N-HIA Diode |
| pdio_hia18_mac | P-HIA Diode |
| ndio_gated18_mac | Thick Oxide N-Gated Diode |
| pdio_gated18_mac | Thick Oxide P-Gated Diode |

HIA_DIO can be used for logic or high speed circuits ESD protection

HIA: high current application purpose (High Amp)

There is no process difference between HIA_DIO and regular diode

image-20220618191312489

image-20220618183241535

image-20220618191405428

| Parameter | Value |
| --- | --- |
| Width (W) | 2.020E-07 |
| Length (L) | 1.922E-06 |
| ArrayY (Ny) | 2 |
| Perimeter (Ny*2*(W+L)) | 8.496E-06 |
| Area (Ny*W*L) | 7.76488E-13 |
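The perimeter and area entries follow directly from W, L and Ny with the formulas given in the table. A quick check, using the table values as they are (the unit is not stated there and is assumed to be metres):

```python
w = 2.020e-07   # width W from the table
l = 1.922e-06   # length L from the table
ny = 2          # ArrayY

perimeter = ny * 2 * (w + l)   # Ny*2*(W+L)
area = ny * w * l              # Ny*W*L

print(f"perimeter = {perimeter:.4e}")
print(f"area      = {area:.5e}")
```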
  • diode is drain/source originated, which is different from MOS (Gate originated)

  • The perimeter of the diode in DRC is different from that in the PERC deck, where PERC excludes the left and right edges of OD

g after the rule numbers: DFM recommendations and guidelines

U: the rule is not checked by the DRC

I-V curve

image-20250712134532834

MOS

image-20220618191906210

image-20220618192253726

image-20220618192325486

l in the netlist has different definitions for MOS and diode.

MOS: length of channel

diode: Gate space


image-20230517233753530

HIA = High Amp

lateral diode: perimeter is key DRC rule for ESD diode

The HIA diode uses the same process as a regular junction diode

Dual Stacked Diodes

image-20230518012456390

PS: I/O to GND positively

NS: I/O to GND negatively

PD: I/O to VDD positively

ND: I/O to VDD negatively

Dual diodes should be used together with a power clamp for the PS and ND paths

PMOS power clamp

power_clamp_pmos.drawio

EOS

[https://picture.iczhiku.com/weixin/message1640668908028.html]

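image

Although ESD protection is generally not designed to guard against EOS events, the design style of the ESD protection in the devices above can indeed affect the failure rate caused by EOS damage, depending on the specific application and operating environment. Figure 2 illustrates two different snapback devices, where device 1 is relatively safe compared with the design of device 2. The increased EOS risk of device 2 comes from its VHold being below the maximum allowed VDD.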

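CMOS集成电路闩锁效应 (Latch-up in CMOS Integrated Circuits) - Excerpts

Development of the CMOS latch-up effect

Latch-up is a phenomenon specific to integrated circuits built on bulk CMOS processes. It can occur in conventional bulk CMOS ICs as well as in processes derived from CMOS, such as BiCMOS, BCD and HV-CMOS.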

image-20250731221753812

image-20250731222149977

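  • Reduce the current gain of the parasitic BJTs
  • Reduce the equivalent substrate resistance

Bipolar transistor

Bias voltages applied to the collector junction and the emitter junction in the four operating modes of a bipolar transistor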

image-20250802080156521

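1) Forward active: the emitter junction is forward-biased and the collector junction is reverse-biased. A bipolar transistor in the forward-active region provides current gain. Its gain is \(\beta\), the ratio of collector current to base current. \(\beta\) is a key parameter, and changes to bipolar transistor design and process parameters usually aim at a sufficiently large \(\beta\). Forward active is the commonly used operating region.

2) Saturation: both the emitter junction and the collector junction are forward-biased, so the transistor behaves like two diodes in parallel.

3) Inverted: the emitter junction is reverse-biased and the collector junction is forward-biased. Compared with forward active, the two junctions swap roles. A bipolar transistor in the inverted region also provides current gain, but the gain is several times smaller than in forward active. In practice, bipolar transistors are rarely biased in the inverted region.

4) Cutoff: both the emitter junction and the collector junction are reverse-biased. The leakage current is very small, like an open switch.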
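The four operating modes map directly onto the bias of the two junctions. A minimal sketch for an NPN device, assuming a junction counts as forward-biased whenever its voltage is positive (a real device needs roughly a diode drop before it conducts noticeably); the function name is hypothetical:

```python
def npn_operating_mode(v_be, v_bc):
    """Classify the NPN operating mode from the base-emitter and base-collector voltages."""
    emitter_forward = v_be > 0.0    # emitter junction forward-biased
    collector_forward = v_bc > 0.0  # collector junction forward-biased
    if emitter_forward and not collector_forward:
        return "forward active"
    if emitter_forward and collector_forward:
        return "saturation"
    if not emitter_forward and collector_forward:
        return "inverted"
    return "cutoff"


for v_be, v_bc in [(0.7, -2.0), (0.7, 0.5), (-1.0, 0.6), (-1.0, -2.0)]:
    print(f"VBE={v_be:+.1f} V, VBC={v_bc:+.1f} V -> {npn_operating_mode(v_be, v_bc)}")
```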

Depending on which terminal is shared between input and output, a bipolar transistor can be connected in three circuit configurations

image-20250802080348332

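Breakdown voltages of a bipolar transistor

The two PN junctions of a bipolar transistor have three reverse breakdown voltages:

The first is BVCBO, with the emitter open; the second is BVEBO, with the collector open; the third is BVCEO, with the base open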

image-20250802081425147

The three breakdown voltages are related as follows: BVCBO > BVCEO > BVEBO

NPN latch-up

In CMOS integrated circuits, latch-up can occur not only in the parasitic PNPN structure but also in the parasitic NPN of a single NMOS

image-20250802110535053

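Similar to the PNPN case, the I-V curve of the parasitic NPN shows two ways to drive the parasitic NPN into the latched state on segment BC:

  • The first is a transient stimulus voltage greater than or equal to Vt1, which produces avalanche breakdown current and drives the parasitic NPN into the latched state. This is called voltage triggering;
  • The second is a transient stimulus current greater than or equal to the current Ih at point B, which drives the parasitic NPN into the latched state. This is called current triggering.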

Reference

Wang, Albert. Practical ESD Protection Design. John Wiley & Sons, 2021.

温德通. CMOS集成电路闩锁效应. 机械工业出版社, 2020

ANSI/ESDA/JEDEC JS-002-2022: ESDA/JEDEC Joint Standard For Electrostatic Discharge Sensitivity Testing Charged Device Model (CDM) Device Level

ESDA/JEDEC JTR002-01-22: For the User Guide of ANSI/ESDA/JEDEC JS-002 Charged Device Model Testing of Integrated Circuits

JESD22-C101E: Field-Induced Charged-Device Model Test Method for Electrostatic Discharge-Withstand Thresholds of Microelectronic Components


M. Di, H. Wang, F. Zhang, C. Li, Z. Pan and A. Wang, "Does CDM ESD Protection Really Work?," 2019 IEEE Workshop on Microelectronics and Electron Devices (WMED), Boise, ID, USA, 2019 [https://sci-hub.se/10.1109/WMED.2019.8714145]

M. Di, C. Li, Z. Pan and A. Wang, "Pad-Based CDM ESD Protection Methods Are Faulty," in IEEE Journal of the Electron Devices Society, vol. 8, pp. 1297-1304, 2020 [pdf]


Introduction to Transmission Line Pulse (TLP), URL: https://tools.thermofisher.com/content/sfs/brochures/TLP%20Presentation%20May%202009.pdf

VF-TLP and CDM differences, URL: https://www.grundtech.com/app-note-vf-tlp-cdm-differences

ESD-Testing: HBM to very fast TLP URL: https://www.thierry-lequeu.fr/data/ESREF/2004/Tut5.pdf

S. Kim et al., "Technology Scaling of ESD Devices in State of the Art FinFET Technologies," 2020 IEEE Custom Integrated Circuits Conference (CICC), 2020, pp. 1-6, doi: 10.1109/CICC48029.2020.9075899.

KOEN DECOCK IEEE-SSCSLEUVEN "ON-CHIP ESD PROTECTION: BASIC CONCEPTS AND ADVANCED APPLICATIONS" [https://monthly-pulse.com/wp-content/uploads/2021/11/2021-11-sofics_presentation_ieee_final.pdf]

Yuanzhong Zhou, D. Connerney, R. Carroll and T. Luk, "Modeling MOS snapback for circuit-level ESD simulation using BSIM3 and VBIC models," Sixth international symposium on quality electronic design (isqed'05), 2005, pp. 476-481, doi: 10.1109/ISQED.2005.81.

Charged Device Model (CDM) Qualification Issues - Expanded [https://www.jedec.org/sites/default/files/IndustryCouncil_CDM_October2021_JEDECversion_September2022_rev1.pdf]


Wang, Albert ZH. On-chip ESD protection for integrated circuits: an IC design perspective. Vol. 663. Springer Science & Business Media, 2002.

Ker, Ming-Dou, and Sheng-Fu Hsu. Transient-induced latchup in CMOS integrated circuits. John Wiley & Sons, 2009. [https://picture.iczhiku.com/resource/eetop/wyiGjQaHOgrYFcxB.pdf]

Milin Zhang, "Low Power Circuit Design Using Advanced CMOS Technology" River Publishers 2018

Barry Fernelius, Evans Analytical Group. Latch-up Testing [https://site.ieee.org/ocs-cpmt/files/2013/06/Latch-up_at_EAG_IEEE_September_2013.pdf]

M. -D. Ker and Z. -H. Jiang, "Overview on Latch-Up Prevention in CMOS Integrated Circuits by Circuit Solutions," in IEEE Journal of the Electron Devices Society, vol. 11, pp. 141-152, 2023 [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9998049]


Shih-Hung Chen. CICC 2019. ES2-4 "ESD Challenges in Advanced FinFET & GAA Nanowire CMOS technologies"

Y. Li, M. Miao and R. Gauthier, "ESD Protection Design Overview in Advanced SOI and Bulk FinFET Technologies," 2020 IEEE Custom Integrated Circuits Conference (CICC), Boston, MA, USA, 2020


Jitter separation lets you learn if the components of jitter are random or deterministic. That is, if they are caused by crosstalk, channel loss, or some other phenomenon. The identification of jitter and noise sources is critical when debugging failure sources in the transmission of high-speed serial signals

  • Tail Fit Method
  • Spectral method

| RJ Extraction Method | Rationale |
| --- | --- |
| Spectral | Speed/Consistency to Past Measurements; Accuracy in low Crosstalk or Aperiodic Bounded Uncorrelated Jitter (ABUJ) conditions |
| Tail Fit | General Purpose; Accuracy in high Crosstalk or ABUJ conditions |

Jitter Components

image-20220521190326201

dual-Dirac model

image-20220521181604467

Figure-1


image-20250816100336592

image-20250816101651481

Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-scale [https://people.engr.tamu.edu/spalermo/ecen689/jitter_dual_dirac_agilent.pdf]

Spectral method

power spectral density (PSD) represents jitter spectrum and peaks in the spectrum can be interpreted as PJ or DDJ, while the average noise floor is the power of RJ

image-20220521182929127

S1 = sum(win);
S2 = sum(win.^2);
N = length(win);
spec_nospur2 = (spec_nospur*S1).^2/N/S2; % To obtain linear spectrum for rj
rj_utj = sqrt(sum(spec_nospur2))*1e12;

spec = 1*ones(length(spec_nospur), 1)*1e-21;
spec(index) = specx(index);
% insert fft nyquist frequency component between positive frequency and
% negative frequency component
% DC;posFreq;nyqFreq;negFreq
spec_ifft = [spec;specnyq;conj(spec(end:-1:2))]';
sfactor = sum(win)/sqrt(2);
spec_ifft = spec_ifft*sfactor;
sig_rec = real(ifft(spec_ifft));
sig_rec = sig_rec(:);
sig_rec_utj = sig_rec./win(1:end);

Tail Fit Method

Tail fitting algorithm based on the Gaussian tail model by using probability distribution of collected jitter value

image-20220521191029433

bin_sig = bin_sig*1e12;

x = qfuncinv(cdf_sig);

% coef(1)*bin_sig + coef(2) = x
% which x is norm(0, 1)
% bin_sig = (x - coef(2))/coef(1)
% Then bin is norm(-coef(2)/coef(1), 1/coef(1))
coef = polyfit(bin_sig, x, 1);
sigma = 1/coef(1);
mu = -coef(2)*sigma;

fprintf('sigma=%.3fps, mu=%.3fps\n', sigma, mu);
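The same Q-scale idea can be sketched in Python with the standard library only. The sketch is an illustration, not a port of the MATLAB snippet: it assumes the input is the left-tail CDF of the jitter histogram and uses the inverse normal CDF (which equals minus `qfuncinv`), so the fitted slope is positive. The helper names and the synthetic test data are hypothetical.

```python
from statistics import NormalDist


def fit_line(xs, ys):
    """Ordinary least-squares straight line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tail_fit(bins_ps, cdf_values):
    """Fit a Gaussian tail on the Q-scale and return (sigma, mu) in ps."""
    # Map each CDF value to standard-normal quantiles: q = (t - mu) / sigma.
    q = [NormalDist().inv_cdf(p) for p in cdf_values]
    slope, intercept = fit_line(bins_ps, q)
    sigma = 1.0 / slope
    mu = -intercept * sigma
    return sigma, mu


# Synthetic left tail from a known Gaussian (mu = 0.5 ps, sigma = 1.2 ps).
true_dist = NormalDist(mu=0.5, sigma=1.2)
bins = [-5.0 + 0.25 * k for k in range(9)]
cdf = [true_dist.cdf(t) for t in bins]
sigma_fit, mu_fit = tail_fit(bins, cdf)
print(f"sigma={sigma_fit:.3f}ps, mu={mu_fit:.3f}ps")
```

On measured data only the bins far enough into the tail should be used, since deterministic jitter distorts the centre of the histogram.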

Least Squares (LS) method

image-20220524005848719

TIE jitter can be written as a linear combination of its components, as in the formula below \[ x[n] = d_n \times \left[ \Delta t_{PJ}[n]+\Delta t_{DCD}[n] +\Delta t_{ISI}[n]+\Delta t_{RJ}[n]\right] \] LS can then be used to estimate the PJ, DCD, RJ, and ISI parameters \([a,b,J_{DCD},J_0, J_1...J_{(2^k-1)}]\)

image-20220524185332637

image-20220524185351383

image-20220524010446422

Jitter modeling

Periodic Jitter (PJ)

PJ is a repeating jitter \[ \Delta t_{PJ}[n]=A\sin(2\pi f_0\cdot nT_s + \theta)=a \sin(2\pi f_0 \cdot nT_s)+b\cos(2\pi f_0 \cdot nT_s) \] where \(f_0\) represents the fundamental frequency of PJ; \(A\) is the amplitude of PJ; \(T_s\) is the data stream period, and \(\theta\) is the initial phase of PJ

In the jitter spectrum, the frequency at which the magnitude peaks is the PJ frequency \(f_0\).
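Because the PJ model is linear in \(a\) and \(b\) once \(f_0\) is known, the two coefficients can be found by least squares, with \(A=\sqrt{a^2+b^2}\) and \(\theta=\operatorname{atan2}(b,a)\). The sketch below fits only the PJ term on noise-free synthetic data and solves the 2x2 normal equations by hand; the function name and the signal parameters are hypothetical.

```python
import math


def fit_periodic_jitter(samples, f0, ts):
    """Least-squares fit of a*sin(2*pi*f0*n*ts) + b*cos(2*pi*f0*n*ts) to the samples."""
    s = [math.sin(2 * math.pi * f0 * n * ts) for n in range(len(samples))]
    c = [math.cos(2 * math.pi * f0 * n * ts) for n in range(len(samples))]
    # Normal equations: [[ss, sc], [sc, cc]] @ [a, b] = [sy, cy]
    ss = sum(x * x for x in s)
    cc = sum(x * x for x in c)
    sc = sum(x * y for x, y in zip(s, c))
    sy = sum(x * y for x, y in zip(s, samples))
    cy = sum(x * y for x, y in zip(c, samples))
    det = ss * cc - sc * sc
    a = (sy * cc - cy * sc) / det
    b = (cy * ss - sy * sc) / det
    return a, b, math.hypot(a, b), math.atan2(b, a)


# Synthetic PJ: 2 ps amplitude, 1 MHz, 0.3 rad phase, 1 ns symbol period.
amp_true, f0, ts, theta_true = 2e-12, 1e6, 1e-9, 0.3
x = [amp_true * math.sin(2 * math.pi * f0 * n * ts + theta_true) for n in range(2000)]
a, b, amp, theta = fit_periodic_jitter(x, f0, ts)
print(f"A = {amp * 1e12:.3f} ps, theta = {theta:.3f} rad")
```

The full LS problem in the text stacks further columns for DCD and the ISI patterns next to these two sine and cosine columns.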

Duty Cycle Distortion (DCD)

DCD is viewed as a series of adjacent positive and negative impulses \[ \Delta t_{DCD}[n] = J_{DCD}\times (-1)^n = [-J_{DCD},J_{DCD},-J_{DCD},J_{DCD},...] \] Where \(J_{DCD}\) is the DCD amplitude.

Random Jitter (RJ)

RJ is created by unbounded jitter sources, such as Gaussian white noise. The statistical PDF for RJ is generally treated as a Gaussian distribution \[ f_{RJ}(\Delta t) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(\Delta t)^2}{2\sigma^2}\right) \]

Remarks

Periodic Jitter Generator and Insertion

Analysis and Estimation of Jitter Sub-Components: Classification and Segregation of Jitter Components

image-20220521212129098

image-20220521212142719

Reference

Mike Li. 2007. Jitter, noise, and signal integrity at high-speed (First. ed.). Prentice Hall Press, USA.

余宥浚 Jacky Yu, Keysight Taiwan AEO, Advanced Jitter and Eye-Diagram Analysis

Y. Duan and D. Chen, "Accurate jitter decomposition in high-speed links," 2017 IEEE 35th VLSI Test Symposium (VTS), 2017, pp. 1-6, doi: 10.1109/VTS.2017.7928918.

Y. Duan's phd thesis URL: https://dr.lib.iastate.edu/handle/20.500.12876/30459

Y. Duan and D. Chen, "Fast and Accurate Decomposition of Deterministic Jitter Components in High-Speed Links," in IEEE Transactions on Electromagnetic Compatibility, vol. 61, no. 1, pp. 217-225, Feb. 2019, doi: 10.1109/TEMC.2018.2797122.

"Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-Scale", Whitepaper: Keysight Technologies, U.S.A., Dec. 2017

Sharma, Vijender Kumar and Sujay Deb. "Analysis and Estimation of Jitter Sub-Components." (2014).

Qingqi Dou and J. A. Abraham, "Jitter decomposition in ring oscillators," Asia and South Pacific Conference on Design Automation, 2006

E. Balestrieri, L. De Vito, F. Lamonaca, F. Picariello, S. Rapuano and I. Tudosa, "The jitter measurement ways: The jitter decomposition," in IEEE Instrumentation & Measurement Magazine, vol. 23, no. 7, pp. 3-12, Oct. 2020, doi: 10.1109/MIM.2020.9234759.

McClure, Mark Scott. "Digital jitter measurement and separation." PhD diss., 2005.

Ren, Nan, Zaiming Fu, Shengcu Lei, Hanglin Liu, and Shulin Tian. "Jitter generation model based on timing modulation and cross point calibration for jitter decomposition." Metrology and Measurement Systems 28, no. 1 (2021).

M. P. Li, J. Wilstrup, R. Jessen and D. Petrich, "A new method for jitter decomposition through its distribution tail fitting," International Test Conference 1999. Proceedings (IEEE Cat. No.99CH37034), 1999, pp. 788-794, doi: 10.1109/TEST.1999.805809.

K. Bidaj, J. -B. Begueret and J. Deroo, "RJ/DJ jitter decomposition technique for high speed links," 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS) [https://sci-hub.se/10.1109/ICECS.2016.7841269]

divide-by-1.5 circuit

TODO 📅

Phase Interpolator

A phase interpolator (PI) is normally used as a phase shifter (or phase rotator) to generate an output clock whose phase is precisely controlled

TODO 📅

B. Razavi, "The Design of a Phase Interpolator [The Analog Mind]," IEEE Solid-State Circuits Magazine, Volume. 15, Issue. 4, pp. 6-10, Fall 2023. [http://www.seas.ucla.edu/brweb/papers/Journals/BR_SSCM_4_2023.pdf]

Deterministic Jitter

image-20220516004008878


image-20220516004058916

image-20220516004206118

The peak-to-peak DJ (\(J_{DJ,pp}\)) can also be calculated from the PSD

image-20220516004615033

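fck = 38.4e6;                     % clock frequency, Hz
Nfft = 15000;                     % FFT length
fres = fck/Nfft;                  % frequency resolution (bin width), Hz
psddBc = -99.3343;                % spur level read from the PSD, dBc/Hz
psBc = psddBc + 10*log10(fres);   % psd -> ps (power in one bin), dBc
phrad2 = 10^(psBc/10);            % phase power in linear units, rad^2
phrms = sqrt(phrad2);             % rms phase, rad
Jrms = phrms/2/pi*1/fck;          % rms jitter, s
Jpp = 2*sqrt(2)*Jrms;             % peak-to-peak of a sinusoid = 2*sqrt(2)*rms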

Output:

Jpp = 6.4038e-12

For DJ, we usually use peak to peak value

BTW, the PSD spur at half of the fundamental frequency (\(f_s/2\)) is duty cycle distortion caused by NMOS/PMOS imbalance; it appears there because only rising-edge data is used

Random Jitter

RJ can be accurately and efficiently measured using PSS/Pnoise or HB/HBnoise.

Note that the transient noise can also be used to compute RJ;

However, the computation cost is typically very high, and the accuracy is lower than that of PSS/Pnoise and HB/HBnoise.

Since RJ follows a Gaussian distribution, it can be fully characterized using its Root-Mean-Squared value (RMS) or the standard deviation value (\(\sigma\))

The Peak-to-Peak value of RJ (\(\text{RJ}_{\text{p-p}}\)) can be calculated under certain observation conditions \[ \text{RJ}_{\text{p-p}}\equiv K \ast \text{RJ}_{\text{RMS}} \] Here, \(K\) is a constant determined by the BER specification of the system given in the following Table

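| BER | Crest factor (K) |
| --- | --- |
| \(10^{-3}\) | 6.18 |
| \(10^{-4}\) | 7.438 |
| \(10^{-5}\) | 8.53 |
| \(10^{-6}\) | 9.507 |
| \(10^{-7}\) | 10.399 |
| \(10^{-8}\) | 11.224 |
| \(10^{-9}\) | 11.996 |
| \(10^{-10}\) | 12.723 |
| \(10^{-11}\) | 13.412 |
| \(10^{-12}\) | 14.069 |
| \(10^{-13}\) | 14.698 |

K = 14.698;              % crest factor from the 1e-13 row of the table
Ks = K/2;                % half of the peak-to-peak span, in units of sigma
p = normcdf([-Ks Ks]);   % Gaussian CDF at -K/2 and +K/2
BER = 1 - (p(2)-p(1));   % probability outside +/-K/2 sigma (both tails)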

Output:

BER = 1.9962e-13
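The table values are consistent with \(K = 2\,Q^{-1}(\text{BER})\), where \(Q\) is the single-sided Gaussian tail probability. That relation is inferred from the numbers rather than stated in the source, and it also explains why the two-sided check above returns about twice the table BER. A minimal sketch with a hypothetical function name:

```python
from statistics import NormalDist


def crest_factor(ber):
    """Peak-to-peak multiplier K such that each Gaussian tail beyond K/2 sigma has probability ber."""
    # Q^-1(ber) = -Phi^-1(ber); using the lower tail avoids computing 1 - ber.
    return -2.0 * NormalDist().inv_cdf(ber)


for exponent in (3, 6, 9, 12):
    ber = 10.0 ** (-exponent)
    print(f"BER = 1e-{exponent}: K = {crest_factor(ber):.3f}")
```

For example, an RMS random jitter of 1 ps at a BER target of \(10^{-12}\) gives \(\text{RJ}_{\text{p-p}}\) of roughly 14.07 ps.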

image-20220516160050961

image-20220516193125490

Total Jitter

\[ \text{TJ}_{\text{p-p}}\equiv \text{DJ}_{\text{p-p}} + \text{RJ}_{\text{p-p}}(\text{BER}) \]

tj.drawio

image-20220516160006909

image-20220516012200383

In the psd of TJ, the spur is DJ and floor is RJ

Phase Noise to Jitter

The phase noise is traditionally defined as the ratio of the signal power in a 1 Hz bandwidth at offset \(f\) from the carrier to the power of the carrier \(P\) \[ \ell (f) = \frac {S_v'(f_0+f)}{P} \] where \(S_v'\) is the one-sided voltage PSD and \(f \geqslant 0\)

Under narrow angle assumption \[ S_{\varphi}(f)= \frac {S_v'(f_0+f)}{P} \] where \(\forall f\in \left[-\infty, +\infty\right]\)

Using the Wiener-Khinchin theorem, it is possible to easily derive the variance of the absolute jitter(\(J_{ee}\))via integration of the corresponding PSD \[ J_{ee,rms}^2 = \int S_{J_{ee}}(f)df \]

And we know the relationship between absolute jitter and excess phase is \[ J_{ee}=\frac {\varphi}{\omega_0} \] Since phase noise is normally symmetrical about zero frequency, the integral over positive frequencies is multiplied by two \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}\ell(f)df}}{\omega_0} \] where phase noise is in linear units, not logarithmic ones.

Because the unit of phase noise in Spectre-RF is logarithmic unit (dBc), we have to convert the unit before applying the above equation \[ \ell[linear] = 10^{\frac {\ell [dBc/Hz]}{10}} \] The complete equation using the simulation result of Spectre-RF Pnoise is \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}10^{\frac {\ell [dBc/Hz]}{10}}df}}{\omega_0} \]

The above equation has been verified for sampled pnoise, i.e. Jee and Edge Phase Noise.

  • For pnoise-sampled(jitter), Direct Plot Form - Function: Jee:Integration Limits can calculate it conveniently
  • But for pnoise-timeaverage, you have to use the equation above to get the RMS jitter.

One example, integrate to \(\frac{f_{osc}}{2}\) and \(f_{osc} = 16GHz\)

image-20220415100034220

Of course, it also applies to conventional pnoise simulation.

On the other hand, the output rms voltage noise \(V_{out,rms}\) divided by the edge slope should be close to \(J_{ee,rms}\) \[ J_{ee,rms} = \frac {V_{out,rms}}{slope} \]

Pulse Width Jitter (PWJ)

TODO 📅

[Spectre Tech Tips: Measuring Noise in Digital Circuits]

Pnoise sampled: Edge Delay mode measures the noise defined by two edges. Each edge is defined by a threshold voltage and a rising or falling direction. The mode measures the noise of the pulse itself, and Direct Plot calculates the variation of the pulse width

Power supply induced jitter (PSIJ)

A sampled pxf analysis can be used to simulate the deterministic jitter of a circuit due to power supply ripple

TODO 📅

DCC & AC-coupled buffer

The amount of correction can be set by intentionally injecting an offset current into the summing input node of INV, a threshold-adjustable inverter

Note that the change to the threshold is opposite in direction to the change at the INV input

Increasing the DC level of the input signal is equivalent to lowering the threshold of INV

image-20241215233057176


image-20241216205525818

The voltage at INV1 changes by \(\Delta {INV1}\), which follows from the current balance at that node: \[ \frac{\Delta V_{DAC} - \Delta {INV1}}{R_{DAC}} = \frac{\Delta {INV1} +A_0 \Delta {INV1}}{R_{F}} \] therefore \[ \Delta {INV1} = \Delta V_{DAC} \cdot \frac{R_F}{R_F+(A_0+1)R_{DAC}} \approx \Delta V_{DAC} \cdot \frac{R_F}{A_0R_{DAC}} \]

A variable \(R_{DAC}\) can be used to adjust the tuning resolution and range

If \(R_{DAC} = R_F\) \[ \Delta {INV1}\approx \frac{\Delta V_{DAC}}{A_0} \]
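A quick numeric check of the expression above, as a hedged Python sketch. The component values (10 kΩ resistors, inverter gain magnitude \(A_0 = 20\), 100 mV DAC step) are hypothetical and chosen only to show how far the \(1/A_0\) approximation is from the exact result.

```python
def delta_inv1(dv_dac, r_f, r_dac, a0):
    """Exact shift at the INV1 input for a DAC step dv_dac."""
    return dv_dac * r_f / (r_f + (a0 + 1.0) * r_dac)

def delta_inv1_approx(dv_dac, r_f, r_dac, a0):
    """Large-gain approximation: dv_dac * R_F / (A0 * R_DAC)."""
    return dv_dac * r_f / (a0 * r_dac)

# hypothetical values: R_F = R_DAC = 10 kOhm, A0 = 20, 100 mV DAC step
exact = delta_inv1(0.1, 10e3, 10e3, 20.0)
approx = delta_inv1_approx(0.1, 10e3, 10e3, 20.0)
print(f"exact = {exact * 1e3:.3f} mV, approx = {approx * 1e3:.3f} mV")
```

With a modest gain the approximation overestimates the step by roughly \(2/A_0\), so the exact form is the safer one when sizing the DAC range.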


image-20251014215409535

image-20251014220640238

C. Menolfi et al., "A 112Gb/S 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS," 2018 IEEE International Solid-State Circuits Conference - (ISSCC) [https://sci-hub.se/https://doi.org/10.1109/ISSCC.2018.8310205],[visual]

M. A. Kossel et al., "8.3 An 8b DAC-Based SST TX Using Metal Gate Resistors with 1.4pJ/b Efficiency at 112Gb/s PAM-4 and 8-Tap FFE in 7nm CMOS," 2021 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021[https://sci-hub.se/10.1109/ISSCC42613.2021.9365784]

C. Menolfi et al., "A 28Gb/s source-series terminated TX in 32nm CMOS SOI," 2012 IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 2012

Bob Lefferts, Navraj Nandra. SNUG Israel 2007 [https://picture.iczhiku.com/resource/eetop/whKYwQorwYoPUVbm.pdf]


image-20240720073616597

Since duty-cycle error appears as a low-frequency (DC) component, the high-pass filter formed by the AC coupling keeps the duty-cycle error from propagating to the output

image-20240720005226736

  • The AC-coupling capacitor blocks the low-frequency component of the input
  • The feedback resistor sets common mode voltage to the crossover voltage

Bae, Woorham; Jeong, Deog-Kyoon: 'Analysis and Design of CMOS Clocking Circuits for Low Phase Noise' (Materials, Circuits and Devices, 2020)

Casper B, O'Mahony F. Clocking analysis, implementation and measurement techniques for high-speed data links: A tutorial. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(1):17-39

reference

Article (20500632) Title: How to simulate Random and Deterministic Jitters URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1O3w000009fiXeEAI

Spectre Tech Tips: Measuring Noise in Digital Circuits - Analog/Custom Design - Cadence Blogs - Cadence Community https://community.cadence.com/cadence_blogs_8/b/cic/posts/s . . .

Cadence RAK: Deterministic Jitter Measurement using SpectreRF

Frank Wiedmann. Using sampled pxf analysis to simulate deterministic jitter [https://community.cadence.com/cadence_technology_forums/f/custom-ic-design/51605/using-sampled-pxf-analysis-to-simulate-deterministic-jitter]

supply noise sensitivity: PSS+PAC or PSS+PX [https://designers-guide.org/forum/YaBB.pl?num=1376500816]


J. Kim et al., "A 112 Gb/s PAM-4 56 Gb/s NRZ Reconfigurable Transmitter With Three-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 54, no. 1, pp. 29-42, Jan. 2019, doi: 10.1109/JSSC.2018.2874040

— et al., "A 224-Gb/s DAC-Based PAM-4 Quarter-Rate Transmitter With 8-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 57, no. 1, pp. 6-20, Jan. 2022, doi: 10.1109/JSSC.2021.3108969


J. N. Tripathi, V. K. Sharma and H. Shrimali, "A Review on Power Supply Induced Jitter," in IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 9, no. 3, pp. 511-524, March 2019 [https://sci-hub.st/10.1109/TCPMT.2018.2872608]

H. Kim, J. Fan and C. Hwang, "Modeling of power supply induced jitter (PSIJ) transfer function at inverter chains," 2017 IEEE International Symposium on Electromagnetic Compatibility & Signal/Power Integrity (EMCSI), Washington, DC, USA, 2017 [https://sci-hub.st/10.1109/ISEMC.2017.8077937]

Yin Sun, Chulsoon Hwang EMC Laboratory. Improving Power Supply Induced Jitter Simulation Accuracy for IBIS Model [https://ibis.org/summits/aug20/sun.pdf]

High Speed Communications Part 8 – On Die CMOS Clock Distribution. [https://youtu.be/nx5CiHcwrF0?si=-eSO-LaaaFrVuIA1]

Low-Jitter CMOS Clock Distribution [https://youtu.be/LMT-T41Y64U?si=y8IpWCtU90zpe4Ob]

Mo, Xunjun & Wu, Jiaqi & Wary, Nijwm & Carusone, Tony. (2021). Design Methodologies for Low-Jitter CMOS Clock Distribution. IEEE Open Journal of the Solid-State Circuits Society. 1. 94-103. 10.1109/OJSSCS.2021.3117930. [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9559395]


Mozhgan Mansuri. ISSCC2021 SC3: Clocking, Clock Distribution, and Clock Management in Wireline/Wireless Subsystems [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC3.pdf]

Phillip Restle. ISSCC2021 SC4: Processor Clock Generation, Distribution, and Clock Sensor/Management Loops [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC4.pdf]

Sam Palermo. Spring 2025 ECEN720 : High-Speed Links Circuits and Systems [Lecture 14: Clock Distribution Techniques]

DC offset

Performing an FFT on a signal with a large DC offset often produces a large impulse at 0 Hz, which masks the signals of interest that have relatively small amplitude.

Remove_DC_Offset_Blog_10

One method to remove the DC offset from the original signal before performing the FFT:

  • Subtract the mean of the original signal

Alternatively, leave the input unfiltered and set the zero-frequency bin of the FFT result to zero.
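Both options can be compared with a small pure-Python DFT. This is a sketch under assumed numbers (a 2.0 offset plus a 0.05-amplitude tone at bin 4, 64 samples); the helper `dft` is a plain textbook sum, not a library FFT.

```python
import cmath
import math

def dft(x):
    """Textbook DFT, normalized by N so bin 0 equals the mean of x."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]

N = 64
# hypothetical signal: large DC offset plus a small tone at bin 4
x = [2.0 + 0.05 * math.cos(2 * math.pi * 4 * m / N) for m in range(N)]

# option 1: subtract the mean before the transform
mean = sum(x) / N
X_ac = dft([v - mean for v in x])

# option 2: transform first, then zero the DC bin
X_raw = dft(x)
X_zeroed = [0.0] + X_raw[1:]

print(abs(X_raw[0]), abs(X_ac[0]), abs(X_ac[4]))
```

For a plain DFT without windowing the two options give the same spectrum; they differ once a window is applied, because the window spreads the DC term into neighbouring bins before it can be zeroed.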

Nyquist component

If we go back to the definition of the DFT \[ X(N/2)=\sum_{n=0}^{N-1}x[n]e^{-j2\pi (N/2)n/N}=\sum_{n=0}^{N-1}x[n]e^{-j\pi n}=\sum_{n=0}^{N-1}x[n](-1)^n \] which is a real number for real \(x[n]\).

The discrete function \[ x[n]=\cos(\pi n) \] is always \((-1)^n\) for integer \(n\)

Consider a general sinusoid at the Nyquist frequency with phase shift \(\theta\), i.e. period \(T=2\) and sampling period \(T_s=1\)

\[\begin{align} x[n] &= A \cos(\pi n + \theta) \\ &= A \big( \cos(\pi n) \cos(\theta) - \sin(\pi n) \sin(\theta) \big) \\ &= \big(A\cos(\theta)\big) \cos(\pi n) + \big(-A\sin(\theta)\big) \sin(\pi n) \\ &= \big(A\cos(\theta)\big) (-1)^n + \big(-A\sin(\theta)\big) \cdot 0 \\ &= B \cdot (-1)^n \\ \end{align}\]

Where \(A\cos(\theta)=B\).

Moreover \(B \cdot (-1)^n = B\cdot \cos(\pi n)\), then \[ B\cdot \cos(\pi n) = A \cdot \cos(\pi n + \theta) \] We can NOT distinguish one from another.

In other words, you CAN'T infer the signal from \(X(\frac{N}{2})\) \[\begin{align} X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}} &= \frac{X\left(\frac{N}{2} \right)}{N}(-1)^n \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\cos(\pi n) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\left( \cos(\pi n) - \beta \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\sqrt{1+\beta^2}\left(\frac{1}{\sqrt{1+\beta^2}} \cos(\pi n) - \frac{\beta}{\sqrt{1+\beta^2}} \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)}\left(\cos(\theta) \cos(\pi n) - \sin(\theta) \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)} \cos(\pi n+\theta) \end{align}\]

where \(\beta = \tan\theta \in \mathbb{R}\) (taking \(\cos\theta > 0\)) is arbitrary, and you wouldn't know it because \(\sin(\pi n)=0 \quad \forall n \in \mathbb{Z}\)

For example, if \(\theta=0\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N} \cos(\pi n) \] However, if \(\theta=\frac{\pi}{3}\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N}\cdot 2 \cos(\pi n+\frac{\pi}{3}) \]

That sort of ambiguity is the reason for the strict inequality of the sampling theorem's condition.
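The ambiguity is easy to confirm numerically. The sketch below (Python, with an arbitrary choice of 16 samples) evaluates the two candidate waveforms from the example, \(\cos(\pi n)\) and \(2\cos(\pi n + \pi/3)\), at integer sample instants and checks that they coincide.

```python
import math

N = 16  # number of sample instants to compare (arbitrary)

# theta = 0: amplitude 1
x_theta0 = [math.cos(math.pi * n) for n in range(N)]
# theta = pi/3: amplitude 1/cos(pi/3) = 2
x_theta60 = [2.0 * math.cos(math.pi * n + math.pi / 3) for n in range(N)]

max_diff = max(abs(a - b) for a, b in zip(x_theta0, x_theta60))
print(f"largest sample difference: {max_diff:.2e}")
```

Between the sample instants the two waveforms are clearly different, which is exactly the information that sampling at twice the signal frequency cannot recover.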

Duty Cycle Distortion

image-20250522203602328

[Prof. Tony Chan Carusone, Low-Jitter CMOS Clock Distribution]

Both edges of the clock waveform are used to evaluate the duty cycle distortion.

Assuming the TIE is [0 0.2 0 0.2 0 0.2 ...], subtracting the DC offset gives [-0.1 0.1 -0.1 0.1 ...], as shown below

image-20220516224616908

The amplitude shows up in the FFT amplitude spectrum as the Nyquist component, which is 0.1 in the following figure

image-20220516233618663

N = 32;
n = (1:N);
x = 0.1*(-1).^n;
figure(1)
stem(n-1, x);
X = fft(x)/N;
Xshift = fftshift(X);
fprintf("nyquist component: %.2f\n", real(Xshift(1))); % the Nyquist bin is real
magXshift = abs(Xshift);
ph = angle(Xshift)/pi*180; % angle() is available in base MATLAB
figure(2)
subplot(3, 1, 1)
fx = (-N/2:N/2-1);
stem(fx, magXshift);
xlabel('Freq');
ylabel('|X(k)|');
title('mag of DFT');
grid on;

subplot(3, 1, 2)
stem(fx, ph);
xlabel('Freq');
ylabel('\angle X(k) (deg)');
title('phase of DFT');
grid on;

%% inverse dft
ninv = (0:32-1);
xinv = Xshift(1)*cos(pi*ninv);
subplot(3, 1, 3);
hold on;
stem(ninv, x, "filled", 'r');
stem(ninv, xinv,'bd-.');
ninfer = (0:0.1:32+1);
xinfer1 = Xshift(1)*cos(pi*ninfer); % theta = 0
xinfer2 = Xshift(1)*2*cos(pi*ninfer+pi/3); % theta = pi/3
plot(ninfer, xinfer1, 'm--');
plot(ninfer, xinfer2, 'c--');
hold off;
legend('original', 'IDFT', '\theta=0', '\theta=\pi/3');
xlabel('time');
ylabel('V');
title('sample points');
grid on;

Single-Sided Amplitude Spectrum

The DC and Nyquist bins of the FFT are left as they are; only the bins in between are doubled

P1(2:end-1) = 2*P1(2:end-1);

Fs = 1000;            % Sampling frequency
T = 1/Fs; % Sampling period
L = 1500; % Length of signal
t = (0:L-1)*T; % Time vector
S = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
X = S + 2*randn(size(t));
figure(1)
plot(1000*t(1:50),X(1:50))
title('Signal Corrupted with Zero-Mean Random Noise')
xlabel('t (milliseconds)')
ylabel('X(t)')

figure(2)
Y = fft(X);
P2 = abs(Y/L); %!!! two-sided spectrum P2.
P1 = P2(1:L/2+1); %!!! single-sided spectrum P1
P1(2:end-1) = 2*P1(2:end-1); % double every bin except DC and Nyquist frequency
f = Fs*(0:(L/2))/L;
figure(2)
plot(f,P1)
title('Single-Sided Amplitude Spectrum of X(t)')
xlabel('f (Hz)')
ylabel('|P1(f)|')

image-20220514221609734

image-20220514221642170

Alternative View

The direct current (DC) bin (\(k=0\)) and the bin at \(k=N/2\), i.e., the bin that corresponds to the Nyquist frequency are purely real and unique.

A sinusoidal waveform with frequency \(10Hz\) and amplitude 1 is \(cos(2\pi f_c t)\). It is plotted below with a sampling frequency of \(20Hz\)

image-20220427172526711

Amplitude and Phase spectrum, sampled with \(f_s=20\) Hz

image-20220427174557915

The FFT magnitude at \(10Hz\) is 1 and its phase is 0, as shown above, which is consistent with the DFT and IDFT definitions.

Caution: the power of the FFT is related to the samples (DFT Parseval's theorem), which may not be the power of the continuous signal. The average power of the samples ([1 -1 1 -1 1 -1 ...]) is 1, while that of the corresponding continuous signal is \(\frac{1}{2}\).

The power spectrum derived from the FFT provides information about the samples, i.e. 1

Moreover, the average power of the samples [1 -1 1 -1 1 ...] is the same as that of DC [1 1 1 1 ...].
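The caution above can be checked with a few lines of Python. This is a sketch with assumed sizes (64 samples, a 10000-point numerical average for the continuous waveform): it compares the average power of the Nyquist-rate samples, the sum of squared DFT magnitudes (Parseval), and the average power of the underlying continuous cosine.

```python
import cmath
import math

N = 64
samples = [math.cos(math.pi * n) for n in range(N)]  # [1, -1, 1, -1, ...]

# average power of the samples
p_samples = sum(s * s for s in samples) / N

# Parseval: with X[k] normalized by N, sum |X[k]|^2 equals the average sample power
X = [sum(samples[m] * cmath.exp(-2j * math.pi * k * m / N) for m in range(N)) / N
     for k in range(N)]
p_spectrum = sum(abs(v) ** 2 for v in X)

# average power of the continuous cosine over one period (numerical average)
M = 10000
p_continuous = sum(math.cos(2 * math.pi * i / M) ** 2 for i in range(M)) / M

print(p_samples, p_spectrum, p_continuous)
```

The samples land on the peaks of the cosine every time, which is why their average power is twice that of the continuous waveform.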

code

fc = 10;
fs = 2*fc;
fov = 64*fc;

ts = (0:1/fs:26);
tov = (0:1/fov:26);

ys = cos(2*pi*fc*ts);
yov = cos(2*pi*fc*tov);
%% waveform
stem(ts, ys)
hold on;
plot(tov, yov);
legend('sample', 'waveform')
ylim([-1.5 1.5])
grid on;
xlabel('time(s)');
ylabel('mag(V)')

nfft = 256;
X = fftshift(fft(ys, nfft))/nfft;
f = (-nfft/2:nfft/2-1)*fs/nfft;
magX = abs(X);
phsX = atan2(imag(X),real(X));
%% fft spectrum
figure(2)
subplot(2, 1, 1);
stem(f, magX);
xlabel('Frequency(Hz)');
ylabel('mag')
xlim([min(f)-1 max(f)+1])
title('Amplitude spectrum')
subplot(2, 1, 2)
plot(f, phsX);
xlabel('Frequency(Hz)');
ylabel('phase (rad)')
xlim([min(f)-1 max(f)+1])
title('Phase spectrum')

%% power spectrum
yssq_sum_avg = sum(ys(1:nfft).^2)/nfft;
specsq_sum_avg = sum(abs(X).^2);

reference

OriginLab, How to Remove DC Offset before Performing FFT URL: http://blog.originlab.com/how-to-remove-dc-offset-before-performing-fft

How to remove DC component in FFT? URL: https://www.mathworks.com/matlabcentral/answers/712808-how-to-remove-dc-component-in-fft#answer_594373

Analyzing a signal that contains frequency content at Fs/2 doesn't seem to work unless there is a phase shift URL: https://dsp.stackexchange.com/a/59807/59253

Nyquist–Shannon sampling theorem, Critical frequency URL: https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem#Critical_frequency

Why remove energy at Nyquist before ifft? URL: https://dsp.stackexchange.com/a/22851/59253

questasim sim flow

A Short Intro to ModelSim Verilog Simulator [https://users.ece.cmu.edu/~jhoe/doku/doku.php?id=a_short_intro_to_modelsim_verilog_simulator]

vlib work
vlog -f filelist tb.sv
# "-c": command line mode
vsim -voptargs=+acc -c -do "run 100ns; exit" work.topmodule

-voptargs=+acc: adding this option to the vsim command enables full visibility into every aspect of the design.

module topmodule;
...
endmodule

uvm:

> vlog test_pkg.sv tb_top.sv -L $QUESTA_HOME/uvm-1.2
> vsim -c -do "run -all;exit" +UVM_TESTNAME=my_test work.tb_top -L $QUESTA_HOME/uvm-1.2

verilog-mode.el

Emacs Online Documentation https://doc.endlessparentheses.com/

Using Emacs verilog-mode (article in Chinese) URL: https://www.wenhui.space/docs/02-emacs/verilog_mode_useguide/

emacs --no-site-file --load path/to/verilog-mode.el --batch filename.v -f verilog-auto-save-compile

CAUTION: filename.v is overwritten by this command

verilog-mode.el

/*AUTOINPUT*/
/*AUTOWIRE*/
/*AUTOINST*/
/*AUTO_TEMPLATE*/

-f verilog-batch-auto

For use with --batch, perform automatic expansions as a stand-alone tool. This sets up the appropriate Verilog mode environment, updates automatics with M-x verilog-auto on all command-line files, and saves the buffers. For proper results, multiple filenames need to be passed on the command line in bottom-up order.

-f verilog-auto-save-compile

Update automatics with M-x verilog-auto, save the buffer, and compile

Emacs

--no-site-file

Another file for site-customization is site-start.el. Emacs loads this before the user's init file (.emacs, .emacs.el or .emacs.d/init.el). You can inhibit the loading of this file with the option --no-site-file

--batch

The command-line option --batch causes Emacs to run noninteractively. The idea is that you specify Lisp programs to run; when they are finished, Emacs should exit.

-f FUNC

--funcall, -f FUNC, call Emacs Lisp function FUNC with no arguments

--load, -l FILE

--load, -l FILE, load Emacs Lisp FILE using the load function

Verilog-mode is a standard part of GNU Emacs as of 22.2.

multiple directories

By default, AUTOINST only searches the file's own directory.

To search multiple directories, append the verilog-library-directories setting below to the file

// Local Variables:
// verilog-library-directories:("." "subdir" "subdir2")
// End:

plusargs in Verilog

systemverilog-command-line-input URL: https://www.chipverify.com/systemverilog/systemverilog-command-line-input

PLUSARGS IN SYSTEMVERILOG URL:https://www.theartofverification.com/plusargs-in-systemverilog/

plusargs are command-line switches supported by the simulator. Per the SystemVerilog LRM, arguments beginning with the + character are available through the $test$plusargs and $value$plusargs system functions.

$test$plusargs (user_string)

$value$plusargs (user_string, variable)

// tb.v
module tb;
int a;
initial begin
if($test$plusargs("RUNSIM")) begin
$display("There is RUNSIM plusargs");
end else begin
$display("There is NO $test$plusargs");
end
if($value$plusargs("SEED=%d",a)) begin
$display("SEED=%d",a);
end else begin
$display("There is NO $value$plusargs");
end
end
endmodule
  • compile

    $ vlib work
    $ vlog -sv tb.v
  • simulate (QuestaSim)

    • without plusargs

      $ vsim work.tb -c -do "run; exit"
      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is NO $test$plusargs
      # There is NO $value$plusargs
      # exit
      # End time: 13:04:23 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0
    • with plusargs

      $ vsim work.tb -c -do "run; exit" +SEED=31 +RUNSIM

      +SEED=31 +RUNSIM

      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is RUNSIM plusargs
      # SEED= 31
      # exit
      # End time: 13:04:55 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0

Inertial & transport delays

Verilog Nonblocking Assignments With Delays, Myths & Mysteries

Correct Methods For Adding Delays To Verilog Behavioral Models

Article (20488135) Title: Selecting Different Delay Modes in GLS (RAK) URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1O3w000009bdLyEAI

Article (20447759) Title: Gate Level Simulation (GLS): A Quick Guide for Beginners URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1Od0000005xEorEAE

image-20230414232256309

image-20230414232439556

Inertial delay

Inertial delay models are simulation delay models that filter pulses that are shorter than the propagation delay of Verilog gate primitives or continuous assignments (assign #5 y = ~a;)

COMBINATIONAL LOGIC ONLY !!!

  • Inertial delays swallow glitches
  • sequential logic implemented with procedural assignments does NOT follow this rule

continuous assignments

`timescale 1ns/100ps
module tb;

reg in;
/////////////////////////////////////////////////////
wire out;
assign #2.5 out = in;
/////////////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317000509716

procedural assignment - combinational logic

`timescale 1ns/100ps
module tb;

reg in;
reg out;

//////////// combination logic ////////////////////////
always @(*)
#2.5 out = in;
///////////////////////////////////////////////////////
/* the above code is same with following code
always @(*) begin
#2.5;
out = in;
end
*/
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235257361

procedural assignment - sequential logic

`timescale 1ns/100ps
module tb;
reg clk;

reg in;
reg out;

always begin
clk = 0;
#5;
forever begin
clk = ~clk;
#5;
end
end
//////////// sequential logic //////////////////
always @(posedge clk)
#2.5 out <= in;
///////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235620168

As shown above, sequential logic does NOT follow inertial delay

Transport delay

Transport delay models are simulation delay models that pass all pulses, including pulses that are shorter than the propagation delay of corresponding Verilog procedural assignments

  • Transport delays pass glitches, delayed in time
  • Verilog can model RTL transport delays by adding explicit delays to the right-hand-side (RHS) of a nonblocking assignment
always @(*)
y <= #5 ~a;
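The difference between the two delay models can be sketched with a small event-list model in Python. This is a simplified illustration, not an LRM-accurate scheduler: the function names are made up, the output is assumed to start at 0, and the stimulus copies the testbenches in this section (a 2 ns pulse at 16 ns and a 4 ns pulse at 28 ns, with a 2.5 ns delay).

```python
def transport_delay(events, delay):
    """Transport model: every input event reaches the output, shifted by delay."""
    return [(t + delay, v) for t, v in events]

def inertial_delay(events, delay, initial=0):
    """Inertial model: a new input event cancels a still-pending output event."""
    out, current, pending = [], initial, None
    for t, v in events:
        # a pending output event that matured before this input change takes effect
        if pending is not None and pending[0] <= t:
            out.append(pending)
            current = pending[1]
            pending = None
        # otherwise the pending event is dropped and replaced (or cancelled)
        pending = (t + delay, v) if v != current else None
    if pending is not None:
        out.append(pending)
    return out

# (time in ns, value) pairs matching the testbench stimulus
stimulus = [(0, 0), (16, 1), (18, 0), (28, 1), (32, 0)]
inertial_out = inertial_delay(stimulus, 2.5)
transport_out = transport_delay(stimulus, 2.5)
print(inertial_out)
print(transport_out)
```

In this model the 2 ns pulse never reaches the inertial output because it is shorter than the 2.5 ns delay, while the transport output reproduces both pulses 2.5 ns later.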

nonblocking assignment

`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// nonblocking assignment ///
always @(*) begin
out <= #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003146825

blocking assignment

`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// blocking assignment ///
always @(*) begin
out = #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003819457

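It seems that a new input event arriving before the previous assignment completes is discarded: the always block is suspended during the blocking intra-assignment delay and does not respond to input changes in that interval.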

clocking block in SystemVerilog

An assignment to <interface>.<clocking block>.<output signal> (i.e. a synchronous drive) does NOT change <interface>.<output signal> until the active clock edge.

// router_io.sv

interface router_io(input bit clock);
logic reset_n;
logic [15:0] din;
logic [15:0] frame_n;
logic [15:0] valid_n;
logic [15:0] dout;
logic [15:0] valido_n;
logic [15:0] busy_n;
logic [15:0] frameo_n;

clocking cb @(posedge clock);
default input #1ns output #1ns;
output reset_n;
output din;
output frame_n;
output valid_n;
input dout;
input valido_n;
input frameo_n;
input busy_n;
endclocking: cb

// `reset_n` can be either a synchronous or an asynchronous signal
modport TB(clocking cb, output reset_n);

endinterface: router_io

All interface signals are asynchronous and have no direction specification (i.e. input, output, inout).

  • The direction can only be specified in clocking block for synchronous signals
  • or a modport for asynchronous signals

All directions for the signals in the clocking block must be with respect to the test program;

// test.sv

program automatic test(router_io.TB rtr_io);

initial begin
reset();
end

task reset();
rtr_io.reset_n = 1'b0;
rtr_io.cb.frame_n <= '1;
rtr_io.cb.valid_n <= '1;
repeat(2) @rtr_io.cb;
rtr_io.cb.reset_n <= 1'b1;
repeat(15) @(rtr_io.cb);
endtask: reset

endprogram: test
// router_test_top.sv

`timescale 1ns/100ps

module router_test_top;
parameter simulation_cycle = 100;

bit SystemClock = 0;

router_io top_io(SystemClock);
test t(top_io);

router dut(
.reset_n (top_io.reset_n),
.clock (top_io.clock),
.din (top_io.din),
.frame_n (top_io.frame_n),
.valid_n (top_io.valid_n),
.dout (top_io.dout),
.valido_n (top_io.valido_n),
.busy_n (top_io.busy_n),
.frameo_n (top_io.frameo_n)
);

initial begin
$timeformat(-9, 1, "ns", 10);
$fsdbDumpvars;
end

always begin
#(simulation_cycle/2) SystemClock = ~SystemClock;
end

endmodule

compile:

$ vcs -sverilog -full64 -kdb -debug_access+all router_test_top.sv test.sv router_io.sv ../../rtl/router.v

The file containing the `timescale directive must be listed first, which is router_test_top.sv in the above example

clocking.output

image-20220621005749074

SystemVerilog does not pass clocking.output to the interface signal until the current or next active edge, plus the output skew

clocking.input

image-20220621010546293

SystemVerilog automatically updates the clocking.input signal from the interface signal's value, sampled one input skew before the active edge

Gotcha

An interface must be compiled separately like a module and CANNOT be `included inside a package or another module

Cadence EE_pkg 101

What Is Real Number Modeling?

  • model analog blocks operation as signal flow model
  • only the digital solver is used for high-speed simulation
    • Event-driven
    • No convergence issues, because no analog solver is used
  • Five different language standards support real number modeling:
    • wreal (wired-real) ports in Verilog-AMS
    • real data type in VHDL
    • real data type in Verilog
    • real variables and nettypes in SystemVerilog (SV)
    • real types in e

Benefits of RNM

  • Most analog circuits that need to be modeled for MS verification at the SoC level can be described in terms of real-valued voltages or currents
  • RNM is a mixed approach, borrowing concepts from both continuous and discrete domains
    • The values are floating-point (real) number.
    • Time is discrete; the real signals change values based on discrete events
  • Applicability of RNM is bounded primarily by signal-flow model style
  • Migrating analog behavior from the analog domain to the event or pseudo-analog domain can bring huge benefits without sacrificing too much accuracy
  • Simulation is executed by a digital simulation engine without need for the analog solver
  • Hence real-number modeling enables very high performance simulation of mixed-signal systems

Limitations of RNM

  • connecting real or wreal signals to electrical signals requires careful consideration
    • Too conservative an approach can lead to large numbers of timepoints
    • Too liberal an approach can lead to losing signal accuracy
  • Time accuracy limited by the discrete sampling approach and the `timescale setting - no continuous signals anymore
  • Limited capability for combination of signals by wiring outputs together
    • Requires assumptions about impedances to do simple merging

constant part-select and indexed part-select in Verilog

Verilog scalar and vector [link]

What is the "+:" operator called in Verilog? [link]

A range of contiguous bits can be selected and is known as part-select. There are two types of part-selects, one with a constant part-select and another with an indexed part-select

reg [31:0] addr;
addr [23:16] = 8'h23; //bits 23 to 16 will be replaced by the new value 'h23 -> constant part-select

Having a variable part-select allows it to be used effectively in loops to select parts of the vector. Although the starting bit can be varied, the width has to be constant.

[<start_bit> +: <width>] // part-select increments from start-bit

[<start_bit> -: <width>] // part-select decrements from start-bit

Example

logic [31: 0] a_vect;
logic [0 :31] b_vect;

logic [63: 0] dword;
integer sel;

a_vect[ 0 +: 8] // == a_vect[ 7 : 0]
a_vect[15 -: 8] // == a_vect[15 : 8]
b_vect[ 0 +: 8] // == b_vect[0 : 7]
b_vect[15 -: 8] // == b_vect[8 :15]

dword[8*sel +: 8] // variable part-select with fixed width
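The mapping from an indexed part-select to the equivalent constant range can be written as a small Python helper. The function name and the `descending` flag (true for a `[31:0]`-style declaration, false for `[0:31]`) are illustrative; the four cases reproduce the comments in the example above.

```python
def indexed_part_select(base, width, op, descending):
    """Return the (left, right) bounds equivalent to vec[base op width]."""
    if op == "+:":
        lo, hi = base, base + width - 1
    elif op == "-:":
        lo, hi = base - width + 1, base
    else:
        raise ValueError("op must be '+:' or '-:'")
    # bounds are written in the same order as the vector declaration
    return (hi, lo) if descending else (lo, hi)

print(indexed_part_select(0, 8, "+:", descending=True))    # a_vect[ 0 +: 8]
print(indexed_part_select(15, 8, "-:", descending=True))   # a_vect[15 -: 8]
print(indexed_part_select(0, 8, "+:", descending=False))   # b_vect[ 0 +: 8]
print(indexed_part_select(15, 8, "-:", descending=False))  # b_vect[15 -: 8]
```

For the variable case, `dword[8*sel +: 8]` with `sel = 3` corresponds to `indexed_part_select(24, 8, "+:", True)`, i.e. bits 31 down to 24.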

Mixing Signed and Unsigned in Verilog

Bevan Baas, VLSI Digital Signal Processing, EEC 281 - VLSI Digital Signal Processing [https://www.ece.ucdavis.edu/~bbaas/281/]

Sign Extension

  1. Calculate the necessary minimum width of the sum so that it contains all input possibilities
  2. Extend the inputs' sign bits to the width of the answer
  3. Add as usual
  4. Ignore bits that ripple to the left of the answer's MSB
  1. signed

|             | inA (signed) | inB (signed) | outSum (signed/unsigned) |
|-------------|--------------|--------------|--------------------------|
| value       | 0101 (5)     | 1111 (-1)    |                          |
| extend sign | 00101        | 11111        |                          |
| sum result  |              |              | 00100                    |

module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg; // signed result
reg [4:0] outSumUs; // unsigned result

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule
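  2. mixed

|                    | inA (signed) | inB (unsigned) | outSum (signed/unsigned) |
|--------------------|--------------|----------------|--------------------------|
| value              | 0101 (5)     | 1111 (15)      |                          |
| extend (zero-fill) | 00101        | 01111          |                          |
| sum result         |              |                | 10100                    |
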
reg signed [3:0] inA;
reg [3:0] inB;

|                    | inA (unsigned) | inB (signed) | outSum (signed/unsigned) |
|--------------------|----------------|--------------|--------------------------|
| value              | 0101 (5)       | 1111 (-1)    |                          |
| extend (zero-fill) | 00101          | 01111        |                          |
| sum result         |                |              | 10100                    |

module tb;
reg [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg;
reg [4:0] outSumUs;

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule

xcelium

xcelium> run
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

vcs

Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:24 2022
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
V C S S i m u l a t i o n R e p o r t

observation

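When signed and unsigned operands are mixed, the expression is evaluated as unsigned by default.

The operands are zero-extended (0s are prepended) instead of sign-extended, even for the operands that are declared signed.

The signedness of the LHS does NOT affect how the simulator operates on the operands, only how the result is interpreted, as signed or unsigned.

Therefore, although outSumSg is declared as signed, its bit pattern comes from unsigned arithmetic (10100, which reads as -12 when interpreted as signed).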
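The rule in this observation can be modeled in Python to predict the simulator printouts. This is a simplified sketch of the evaluation rule only (an expression is signed when every operand is signed, otherwise all operands are zero-extended); the function names are made up, and the output width is assumed to be at least as wide as the operands.

```python
def verilog_addsub(a, a_w, a_signed, b, b_w, b_signed, out_w, sub=False):
    """Bit pattern of (a + b) or (a - b) assigned to an out_w-bit variable."""
    expr_signed = a_signed and b_signed  # one unsigned operand makes it unsigned

    def extend(v, w):
        v &= (1 << w) - 1
        if expr_signed and (v >> (w - 1)) & 1:
            v -= 1 << w  # sign extension (negative value)
        return v         # otherwise zero extension

    ea, eb = extend(a, a_w), extend(b, b_w)
    result = ea - eb if sub else ea + eb
    return result & ((1 << out_w) - 1)

def as_signed(bits, w):
    """Interpret a w-bit pattern as two's complement."""
    return bits - (1 << w) if (bits >> (w - 1)) & 1 else bits

signed_sum = verilog_addsub(0b0101, 4, True, 0b1111, 4, True, 5)
mixed_sum = verilog_addsub(0b0101, 4, True, 0b1111, 4, False, 5)
print(f"{signed_sum:05b} {mixed_sum:05b} {as_signed(mixed_sum, 5)}")
```

The mixed case gives the pattern 10100, which prints as 20 for the unsigned result and -12 for the signed one, matching the xcelium and vcs logs above.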

subtraction example

In logic arithmetic, addition and subtraction are commonly used for digital design. Subtraction is similar to addition except that the subtrahend is replaced by its 2's complement. By using the 2's complement of the subtrahend, both addition and subtraction can be unified into addition only.

operands are signed

|             | inA (signed) | inB (signed) | outSub (signed/unsigned) |
|-------------|--------------|--------------|--------------------------|
| value       | 1111 (-1)    | 0010 (2)     |                          |
| extend sign | 11111        | 00010        |                          |
| sub result  |              |              | 11101                    |

module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule

vcs

Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:46 2022
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
V C S S i m u l a t i o n R e p o r t

xcelium

xcelium> run
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
xmsim: *W,RNQUIE: Simulation is complete.

operands are mixed

|                    | inA (signed) | inB (unsigned) | outSub (signed/unsigned) |
|--------------------|--------------|----------------|--------------------------|
| value              | 1111 (-1)    | 0010 (2)       |                          |
| extend (zero-fill) | 01111        | 00010          |                          |
| sub result         |              |                | 01101                    |

module tb;
reg signed [3:0] inA;
reg [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule

vcs

Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:50 2022
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
V C S S i m u l a t i o n R e p o r t

xcelium

xcelium> run
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

Danger Sign

https://projectf.io/posts/numbers-in-verilog/

Verilog has a nasty habit of treating everything as unsigned unless all variables in an expression are signed. To add insult to injury, most tools won’t warn you if signed values are being ignored.

If you take one thing away from this post:

Never mix signed and unsigned variables in one expression!

module signed_tb ();
logic [7:0] x; // 'x' is unsigned
logic signed [7:0] y; // 'y' is signed
logic signed [7:0] x1, y1;
logic signed [3:0] move;

always_comb begin
x1 = x + move; // !? DANGER: 'x' is unsigned but 'move' is signed
y1 = y + move;
end

initial begin
#10
$display("Coordinates (7,7):");
x = 8'd7;
y = 8'd7;
#10
$display("x : %b %d", x, x);
$display("y : %b %d", y, y);

#10
$display("Move +4:");
move = 4'sd4; // signed positive value
#10
$display("x1: %b %d *LOOKS OK*", x1, x1);
$display("y1: %b %d", y1, y1);

#10
$display("Move -4:");
move = -4'sd4; // signed negative value
#10
$display("x1: %b %d *SURPRISE*", x1, x1); // 0000_0111 + {0000}_1100 = 0001_0011
$display("y1: %b %d", y1, y1); // 0000_0111 + {1111}_1100 = 0000_0011
end
endmodule
Chronologic VCS simulator copyright 1991-2021
Contains Synopsys proprietary information.
Compiler version S-2021.09-SP2-2_Full64; Runtime version S-2021.09-SP2-2_Full64; Nov 19 11:02 2022
Coordinates (7,7):
x : 00000111 7
y : 00000111 7
Move +4:
x1: 00001011 11 *LOOKS OK*
y1: 00001011 11
Move -4:
x1: 00010011 19 *SURPRISE*
y1: 00000011 3
V C S S i m u l a t i o n R e p o r t
Time: 60
CPU Time: 0.260 seconds; Data structure size: 0.0Mb
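
The *SURPRISE* line comes from how the 4-bit `move` is widened to 8 bits. The following Python sketch reproduces both results under the assumption that a mixed expression is evaluated as unsigned (zero extension) and an all-signed expression uses sign extension; `widen` is a hypothetical helper, not part of any tool:

```python
def widen(value, width, to_width, signed):
    """Return `value` (a `width`-bit pattern) widened to `to_width` bits."""
    value &= (1 << width) - 1
    if signed and value >> (width - 1):
        value -= 1 << width                # reinterpret as a negative number
    return value & ((1 << to_width) - 1)

move = -4 & 0xF                            # 4'b1100
x1 = (7 + widen(move, 4, 8, signed=False)) & 0xFF   # x is unsigned: zero-extend
y1 = (7 + widen(move, 4, 8, signed=True)) & 0xFF    # all operands signed: sign-extend
print(format(x1, "08b"), x1)               # 00010011 19
print(format(y1, "08b"), y1)               # 00000011 3
```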

time format in Verilog

https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94062 https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94096

realtime vs time

  • $realtime returns the current time rounded to timeprecision

  • $time returns the current time rounded to an integer number of time units

  • %t scales the rounded value so that it is expressed in units of timeprecision,

    i.e. \([\$\text{realtime}, \$\text{time}]\cdot \$\text{timeunit} / \$\text{timeprecision}\)

module tb;
timeunit 10ns;
timeprecision 1ps;

initial begin
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);
// $realtime round to timeprecision
// $time round to integer
#1.10016
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);

// %t format will scale the rounded value to represent timeprecision
// 1.1002*10e-9/1e-12 = 11002
$display("$realtime %%t = %t", $realtime);
$display("$time %%t = %t", $time);
end
endmodule

output

$realtime = 0
$time = 0
$realtime = 1.1002
$time = 1
$realtime %t = 11002
$time %t = 10000
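
The numbers in this output can be checked by hand. The Python sketch below models the rounding with exact fractions; it assumes a single module, so the %t scaling uses that module's own timeprecision, and the function name `sim_times` is made up for illustration:

```python
from fractions import Fraction

def sim_times(delay, timeunit_ps, timeprecision_ps):
    """Return ($realtime, $time, %t of $realtime, %t of $time) after `#delay`.

    `delay` is the literal written in the source, in time units. Ties in
    round() are not handled the way a simulator would; the example has none.
    """
    ticks = round(Fraction(delay) * timeunit_ps / timeprecision_ps)
    realtime = Fraction(ticks * timeprecision_ps, timeunit_ps)
    time = round(realtime)
    scale = Fraction(timeunit_ps, timeprecision_ps)
    return realtime, time, int(realtime * scale), int(time * scale)

realtime, time, t_real, t_time = sim_times("1.10016", timeunit_ps=10_000, timeprecision_ps=1)
print(float(realtime), time, t_real, t_time)   # 1.1002 1 11002 10000
```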

timeunit, timeprecision

The time unit and time precision can be specified in the following two ways:

  • Using the compiler directive `timescale
  • Using the keywords timeunit and timeprecision
module D (...);
timeunit 100ps;
timeprecision 10fs;
...
endmodule

module E (...);
timeunit 100ps / 10fs; // timeunit with optional second argument
...
endmodule

The smallest timeprecision in the design determines the %t output, while the timeunit and timeprecision of the enclosing module determine how $realtime and $time are rounded. The simulator itself advances by the time shown by $realtime.

`timescale 1ns/1ns

// Due to minimum of timeprecision is 10ps
module rawplant;
timeunit 1ns;
timeprecision 100ps;

task print;
/*
1ns: 1
100ps: 6
10ps: 6
1ps: 6
*/
#1.666
$display("raw: $realtime = %g", $realtime); // 1.7
$display("raw: $time = %g", $time); // 2
$display("raw: $realtime %%t = %t", $realtime); // 1.7*1ns/10ps=170
$display("raw: $time %%t = %t", $time); // 2*1ns/10ps = 200
endtask

endmodule

module fineplant;
timeunit 100ps;
timeprecision 10ps;

task print;
/*
100ps: 2
10ps: 6
1ps: 6
*/
#2.66
$display("fine: $realtime = %g", $realtime); // 2.7
$display("fine: $time = %g", $time); // 3
$display("fine: $realtime %%t = %t", $realtime); // 2.7*100ps/10ps = 27
$display("fine: $time %%t = %t", $time); // 3*100ps/10ps = 30
endtask

endmodule

module tb;
rawplant rawblock();
fineplant fineblock();

initial begin
fork
rawblock.print();
fineblock.print();
join
end
endmodule

output

fine: $realtime = 2.7
fine: $time = 3
fine: $realtime %t = 27
fine: $time %t = 30
raw: $realtime = 1.7
raw: $time = 2
raw: $realtime %t = 170
raw: $time %t = 200
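
A small Python sketch of this two-module case, assuming that rounding is local to each module while %t is scaled by the finest timeprecision in the whole design (10 ps here, since the in-module declarations override the 1ns/1ns `timescale). The table `MODULES` and the function `report` are illustrative names only:

```python
from fractions import Fraction

# (timeunit, timeprecision) in ps for each module, as declared in the example.
MODULES = {"raw": (1000, 100), "fine": (100, 10)}
GLOBAL_PRECISION_PS = min(p for _, p in MODULES.values())   # 10 ps

def report(name, delay):
    """Model $realtime, $time and their default %t values inside module `name`."""
    unit, precision = MODULES[name]
    ticks = round(Fraction(delay) * unit / precision)       # local rounding
    realtime = Fraction(ticks * precision, unit)
    time = round(realtime)
    to_t = Fraction(unit, GLOBAL_PRECISION_PS)              # %t uses the finest precision
    return float(realtime), time, int(realtime * to_t), int(time * to_t)

print(report("raw", "1.666"))    # (1.7, 2, 170, 200)
print(report("fine", "2.66"))    # (2.7, 3, 27, 30)
```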

questasim cmd

vlib work
vlog tb.v
vsim -c -do "run -all;exit" work.tb

data dumping of simulation

$dumpvars and $dumpfile Verilog, [http://www.referencedesigner.com/tutorials/verilog/verilog_62.php]

FSDB

$fsdbDumpfile

It specifies the FSDB file name created by the Novas object files for FSDB dumping. If it is not specified, then the default FSDB file name is "novas.fsdb".

This command is valid only before executing $fsdbDumpvars and is ignored if specified after $fsdbDumpvars

$fsdbSuppress

The $fsdbSuppress utility is used to skip dumping of a few instances, scopes, modules and signals. $fsdbSuppress is a system task like the other FSDB tasks.

For $fsdbSuppress() to be effective, it needs to be specified/called before $fsdbDumpvars

$fsdbAutoSwitchDumpfile

Automatically switch to a new dump file when the working FSDB file reaches the specified size or the specified wall time period.

After the dumping is finished, a virtual FSDB file (*.vf) is automatically created that lists all of the generated FSDB files in the correct sequence. Only the virtual FSDB file, rather than all of the FSDB files, needs to be loaded to view the simulation results.

When specified in the design to switch based on file size:

$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"]);

When specified in the design to switch based on time period

$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"], "+by_period");

"+by_period"

$fsdbDumpvars

This command dumps the change in signal value to the FSDB file.

$fsdbDumpvars([ depth, | "level=",depth_var, ],[instance | "instance=",instance_var])

For VCS users, to include memory, MDA, packed array and structure information in the generated FSDB file, the -debug_access option must be included when VCS is invoked to compile the design

  • depth

    Specify how many sub-scope levels under the given scope you want to dump.

    • Specify this argument as 1 to dump the signals under the given scope
    • Specify this argument as 0 to dump all signals under the given scope and its descendant scopes.

    0: all signals in all scopes.

    1: all signals in current scope.

    2: all signals in the current scope and all scopes one level below.

    n: all signals in the current scope and all scopes n-1 levels below.

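    Signals expected in the dump for each call, following the depth rules above:

    call                        | tb.clk | tb.u_div2.div2 | tb.u_div2.u_div2neg.div2neg
    $fsdbDumpvars(0)            | yes    | yes            | yes
    $fsdbDumpvars(1)            | yes    | no             | no
    $fsdbDumpvars(2)            | yes    | yes            | no
    $fsdbDumpvars(1, tb.u_div2) | no     | yes            | no
    $fsdbDumpvars(0, tb.u_div2) | no     | yes            | yes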
    module tb;
    reg clk;

    divider2 u_div2(clk);

    initial begin
    clk = 1'b0;
    forever #5 clk = ~clk;
    end

    initial begin
    #100;
    $finish();
    end

    initial begin
    #10;
    $fsdbDumpfile("tb.fsdb");
    //$fsdbDumpvars(0); // same with $fsdbDumpvars(0, tb)
    //$fsdbDumpvars(1); // same with $fsdbDumpvars(1, tb)
    //$fsdbDumpvars(2); // same with $fsdbDumpvars(2, tb)
    //$fsdbDumpvars(1, tb.u_div2);
    $fsdbDumpvars(0, tb.u_div2);
    #80 $finish();
    end

    endmodule


    module divider2 (
    input clk
    );
    reg div2;

    divider2neg u_div2neg(div2);

    always@(posedge clk) begin
    div2 = ~div2;
    end

    initial begin
    div2 = 1'b0;
    end

    endmodule

    module divider2neg (
    input clk
    );
    reg div2neg;

    always@(negedge clk) begin
    div2neg = ~div2neg;
    end

    initial begin
    div2neg= 1'b0;
    end

    endmodule

    compile

    vcs -full64 -kdb -debug_access+all tb.v

    simulate

    ./simv

    load fsdb

    verdi -ssf tb.fsdb

    image-20220604192421888

$fsdbDumpon, $fsdbDumpoff

$fsdbDumpon(["+fsdbfile+filename"])

$fsdbDumpoff(["+fsdbfile+filename"])

These FSDB dumping commands turn dumping on and off. fsdbDumpon/fsdbDumpoff has the highest priority and overrides all other FSDB dumping commands.

fsdbDumpon/fsdbDumpoff is not restricted to only fsdbDumpvars. If there is more than one FSDB file open for dumping at one simulation run, fsdbDumpon/fsdbDumpoff may only affect a specific FSDB file by specifying the specific file name.

  • +fsdbfile+filename: Specify the FSDB file name. If not specified, the default FSDB file name is "novas.fsdb"

$fsdbDumpFinish

This command closes all FSDB files in the current simulation and stops dumping of signals. Although all FSDB files are closed automatically at the end of simulation, this dumping command can be invoked to explicitly close the FSDB files during the simulation

VCD

$dumpfile

The $dumpfile call must come before $dumpvars or any other system task that specifies a dump.

$dumpfile("test.vcd");

argument is necessary, there is no default value

$dumpvars

The $dumpvars is used to specify which variables are to be dumped ( in the file mentioned by $dumpfile). The simplest way to use it is without any argument.

$dumpvars(<levels> <, <module_or_variable>>* );

$dumplimit

It is possible to inadvertently generate a huge file of several gigabytes (for example, while dumping a gigahertz clock for one second). To reduce such occurrences, we may use $dumplimit. Its usage is

$dumplimit(<filesize>);

$dumpoff and $dumpon

If you only care about the waveform during certain intervals of the simulation, you can use $dumpoff and $dumpon. The following example shows their usage. Because each # delay is relative to the previous statement, it dumps the changes for the first 100 units of time and then between 10300 and 20700 units of time.

initial
$monitor($time, " reset=%b,clk_out=%b",reset,clk_out);
initial begin
$dumpfile("clkdiv2n_tb.vcd");
$dumpvars(0,clkdiv2n_tb);
#100;
$dumpoff;
#10200;
$dumpon;
#10400;
$dumpoff;
end
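
Since # delays accumulate, the absolute dump intervals are easy to misread. A minimal Python sketch that converts the delay/action sequence of this initial block into absolute intervals, assuming dumping is enabled from time 0 by $dumpvars; `dump_windows` and the `end_time` value are illustrative:

```python
def dump_windows(steps, end_time):
    """Turn a list of (delay, action) pairs into (start, stop) dump intervals.

    Dumping starts enabled at time 0, as it does after $dumpvars.
    """
    now, dumping, start, windows = 0, True, 0, []
    for delay, action in steps:
        now += delay                       # '#' delays accumulate
        if action == "off" and dumping:
            windows.append((start, now))
            dumping = False
        elif action == "on" and not dumping:
            start, dumping = now, True
    if dumping:
        windows.append((start, end_time))
    return windows

steps = [(100, "off"), (10200, "on"), (10400, "off")]
print(dump_windows(steps, end_time=30000))   # [(0, 100), (10300, 20700)]
```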

demo

stimulus.v

`timescale 1ns / 1ps
module stimulus;
// Inputs
reg x;
reg y;
// Outputs
wire z;
// Instantiate the Unit Under Test (UUT)
comparator uut (
.x(x),
.y(y),
.z(z)
);

initial begin
$dumpfile("test.vcd");
$dumpvars(0);
// Initialize Inputs
x = 0;
y = 0;

#20 x = 1;
#20 y = 1;
#20 y = 0;
#20 x = 1;
#40 ;

end

initial begin
$monitor("t=%3d x=%d,y=%d,z=%d \n",$time,x,y,z, );
end

endmodule

comparator.v

module comparator(
input x,
input y,
output z
);

assign z = (~x & ~y) |(x & y);

endmodule
$ xrun stimulus.v comparator.v -access +rwc
$ simvision test.vcd

readmemb & readmemh in Verilog

Initialize Memory in Verilog [https://projectf.io/posts/initialize-memory-in-verilog/]

$readmemh("hex_memory_file.mem", memory_array, [start_address], [end_address])
$readmemb("bin_memory_file.mem", memory_array, [start_address], [end_address])

There are no versions of these system tasks for octal or decimal data.

  • The 1st argument is the data file name.
  • The 2nd argument is the array to receive the data.
  • The 3rd argument is an optional start address.
  • The 4th argument is an optional end address, which can only be given together with the start address.

Note that the 3rd and 4th arguments are addresses in the array, not positions in the data file.

If the memory addresses are not specified anywhere, the system tasks load file data sequentially from the lowest address toward the highest address.

Standards before 2005 specified that the system tasks load file data sequentially from the left memory address bound to the right memory address bound.

readtest.v

module readfile;
reg [7:0] array4 [0:3];
reg [7:0] array7 [6:0];
reg [7:0] array12 [11:0];

integer i;

initial begin
$readmemb("data.txt", array4);
$readmemb("data.txt", array7, 2, 5);
$readmemb("data.txt", array12);

for (i = 0; i < 4; i = i+1)
$display("array4[%0d] = %b", i, array4[i]);

$display("=========================");

for (i = 0; i < 7; i = i+1)
$display("array7[%0d] = %b", i, array7[i]);

$display("=========================");

for (i = 0; i < 12; i = i+1)
$display("array12[%0d] = %b", i, array12[i]);
end
endmodule

data.txt

00000000 
00000001
00000010
00000011
00000100
00000101
00000110
00001000

result

array4[0] = 00000000
array4[1] = 00000001
array4[2] = 00000010
array4[3] = 00000011
=========================
array7[0] = xxxxxxxx
array7[1] = xxxxxxxx
array7[2] = 00000000
array7[3] = 00000001
array7[4] = 00000010
array7[5] = 00000011
array7[6] = xxxxxxxx
=========================
array12[0] = 00000000
array12[1] = 00000001
array12[2] = 00000010
array12[3] = 00000011
array12[4] = 00000100
array12[5] = 00000101
array12[6] = 00000110
array12[7] = 00001000
array12[8] = xxxxxxxx
array12[9] = xxxxxxxx
array12[10] = xxxxxxxx
array12[11] = xxxxxxxx
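
The pattern in these results can be modelled in a few lines of Python. This is only a sketch of the addressing rule, assuming an ascending load order and a data file without @address markers; `readmem` is a hypothetical stand-in, not the simulator's implementation:

```python
def readmem(words, low, high, start=None, end=None):
    """Model $readmemb/$readmemh filling an array declared over low..high.

    Entries never written stay 'x'. Assumes ascending load order and a data
    file without @address markers.
    """
    memory = {addr: "x" * 8 for addr in range(low, high + 1)}
    addr = low if start is None else start
    last = high if end is None else end
    for word in words:
        if addr > last:
            break                          # extra file data is ignored
        memory[addr] = word
        addr += 1
    return memory

data = ["00000000", "00000001", "00000010", "00000011",
        "00000100", "00000101", "00000110", "00001000"]
array7 = readmem(data, 0, 6, start=2, end=5)
for addr in sorted(array7):
    print(f"array7[{addr}] = {array7[addr]}")
```

The start and end arguments select array addresses 2 to 5, so the first four file entries land there and the remaining array entries stay undefined.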

iff in SystemVerilog

iff in SystemVerilog, [https://www.francisz.cn/2019/07/18/sv-iff]

@(posedge clk iff(vld));
do_something;

is equivalent to

forever begin
@(posedge clk);
if(vld) break;
end
do_something;

iff can be more efficient than the explicit if loop because the waiting process is only resumed on clock edges where vld is true. Note that the guard is evaluated when clk changes, not when vld changes.

One example, detecting the negative edge of rtr_io.cb.frameo_n[da]

wait(rtr_io.cb.frameo_n[da] !== 0);
@(rtr_io.cb iff(rtr_io.cb.frameo_n[da] === 0 ));
$display("[DEBUG HGUO] %0t, rtr_io.cb.frameo_n[da] negedge", $realtime);

image-20220621182019927

[DEBUG HGUO] 6887250.0ns, rtr_io.cb.frameo_n[da] negedge

signed and unsigned arithmetic in Verilog

With implicit sign extension, the implementation of signed arithmetic is DIFFERENT from that of unsigned arithmetic. Otherwise, the two implementations are the same.

The synthesized implementations reflect the RTL's behaviour correctly.

add without implicit sign extension

unsigned

rtl
module TOP (
input wire [2:0] data0
,input wire [2:0] data1
,output wire [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114215200

synthesized netlist

image-20220507114307439

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:43:27 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

vcs compile with -v /path/to/lib.v

signed

rtl
module TOP (
input wire signed [2:0] data0
,input wire signed [2:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114654777

synthesized netlist

image-20220507114844111

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:48:54 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

add WITH implicit sign extension

unsigned with 0 extension

rtl
module TOP (
input wire [2:0] data0 // 3 bit unsigned
,input wire [1:0] data1 // 2 bit unsigned
,output wire [2:0] result // 3 bit unsigned
);
assign result = data0 + data1;
endmodule

image-20220507121521303

synthesized netlist

image-20220507121622001

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:15:58 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data1[0]), .A2(data0[0]), .Z(n6) );
ad01d0 U7 ( .A(data1[1]), .B(data0[1]), .CI(n6), .CO(n4), .S(result[1]) );
xr02d1 U8 ( .A1(data0[2]), .A2(n4), .Z(result[2]) );
nr02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n5) );
nr02d0 U10 ( .A1(n6), .A2(n5), .ZN(result[0]) );
endmodule

signed with implicit sign extension

rtl
module TOP (
input wire signed [2:0] data0
,input wire signed [1:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507122053948

synthesized netlist

image-20220507122217830

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:21:51 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n6, n7, n8, n9, n10;

nd02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n10) );
inv0d0 U10 ( .I(n10), .ZN(n9) );
nr02d0 U11 ( .A1(data0[1]), .A2(data1[1]), .ZN(n7) );
aor221d1 U12 ( .B1(n9), .B2(data1[1]), .C1(n10), .C2(data0[1]), .A(n7), .Z(
n6) );
xn02d1 U13 ( .A1(data0[2]), .A2(n6), .ZN(result[2]) );
ora21d1 U14 ( .B1(data1[0]), .B2(data0[0]), .A(n10), .Z(result[0]) );
aor21d1 U15 ( .B1(data1[1]), .B2(data0[1]), .A(n7), .Z(n8) );
mx02d0 U16 ( .I0(n10), .I1(n9), .S(n8), .Z(result[1]) );
endmodule
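
Why the netlists match for equal widths but differ for mixed widths can be checked exhaustively. The Python sketch below is a behavioural model only (it says nothing about the gates chosen by synthesis); `to_signed` and `add3` are illustrative helpers:

```python
def to_signed(bits, width):
    """Interpret a `width`-bit pattern as two's complement."""
    return bits - (1 << width) if bits >> (width - 1) else bits

def add3(a, b, b_width, signed):
    """3-bit result of a + b, where a is 3 bits and b is `b_width` bits wide."""
    if signed:
        a, b = to_signed(a, 3), to_signed(b, b_width)
    return (a + b) & 0b111

# Equal widths: signed and unsigned addition give identical result bits.
same_width_equal = all(
    add3(a, b, 3, True) == add3(a, b, 3, False)
    for a in range(8) for b in range(8)
)
# 3-bit + 2-bit: results differ whenever the 2-bit operand has its MSB set.
mixed_width_diff = [
    (a, b) for a in range(8) for b in range(4)
    if add3(a, b, 2, True) != add3(a, b, 2, False)
]
print(same_width_equal)                              # True
print(sorted({b for _, b in mixed_width_diff}))      # [2, 3]
```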

Latch Inference in Verilog

UC Berkeley CS150 Lec #20: Finite State Machines [slides]

always@( * )

always@( * ) blocks are used to describe Combinational Logic, or Logic Gates. Only = (blocking) assignments should be used in an always@( * ) block.

Latch Inference

If you DON'T assign every element that can be assigned inside an always@( * ) block every time that always@( * ) block is executed, a latch will be inferred for that element

The approaches to avoid latch generation:

  • set default values
  • proper use of the else statement, and other flow constructs

without default values

latch is generated

RTL
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
endmodule
synthesized netlist

image-20220509170640006

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:09:18 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


lanhq1 C_reg ( .E(Trigger), .D(Pass), .Q(C) );
an02d0 U3 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule

add default value

Default values are an easy way to avoid latch generation

RTL
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
C = 1'b1;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
endmodule
synthesized netlist

image-20220509171319204

/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:12:47 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


nd12d0 U5 ( .A1(Pass), .A2(Trigger), .ZN(C) );
an02d0 U6 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule
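
The difference between the two netlists can be phrased as "does C depend on its own previous value?". A minimal Python sketch of both versions, with hypothetical function names, assuming single-bit 0/1 values:

```python
def c_without_default(trigger, pass_, previous_c):
    """C is only assigned when Trigger is 1, so it must remember its old value."""
    return pass_ if trigger else previous_c

def c_with_default(trigger, pass_):
    """C = 1'b1 by default, then overridden when Trigger is 1."""
    c = 1
    if trigger:
        c = pass_
    return c

# With a default, C is a pure function of the inputs: Pass | ~Trigger.
for trigger in (0, 1):
    for pass_ in (0, 1):
        assert c_with_default(trigger, pass_) == (pass_ | (1 - trigger))

# Without a default, the same inputs can give different outputs (a latch).
print(c_without_default(0, 1, previous_c=0), c_without_default(0, 1, previous_c=1))  # 0 1
```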

if evaluation

An if condition only tests whether the value is nonzero, as if a signed number were cast to unsigned before evaluation, so a negative signed value is true.

// tb.v
module tb;
reg signed [1:0] datasg;
reg [1:0] dataug;

initial begin
datasg = 2'b11;
dataug = 2'b11;

$display("datasg(%%d): %d", datasg);
$display("dataug(%%d): %d", dataug);

if (datasg)
$display("datasg is OK");
if (dataug)
$display("dataug is OK");

$finish();
end
endmodule
$ vlog tb.v
$ vsim -c -do "run;exit" work.tb
# Loading work.tb(fast)
# run
# datasg(%d): -1
# dataug(%d): 3
# datasg is OK
# dataug is OK
# ** Note: $finish : tb.v(16)

Arithmetic in Verilog

unsigned + unsigned = unsigned

function [7:0] satadd_uuu8b;   // unsigned + unsigned = unsigned
input [7:0] a;
input [7:0] b;

reg [8:0] t; // extend 1b
begin
t = {1'b0, a} + {1'b0, b};
satadd_uuu8b = t[8] ? {8{1'b1}} : t[7:0]; // assign to the function's own name
end
endfunction

1'b1: overflow
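
As a cross-check of the carry-bit test, here is a Python model of the same 9-bit intermediate. It is a sketch that mirrors the function name for readability and compares against a plain clamp:

```python
def satadd_uuu8b(a, b):
    """8-bit unsigned saturating add using a 9-bit intermediate."""
    t = (a & 0xFF) + (b & 0xFF)            # always fits in 9 bits
    return 0xFF if (t >> 8) & 1 else t & 0xFF

print(satadd_uuu8b(200, 100), satadd_uuu8b(100, 27))   # 255 127
```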

signed + signed = signed

function signed [7:0] satadd_sss8b;    // signed + signed = signed
input signed [7:0] a;
input signed [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = a + b; // extend sign bit automatically
// a replication inside a concatenation needs its own braces
satadd_sss8b = (t[8:7] == 2'b01) ? {1'b0, {7{1'b1}}} : // up sat
(t[8:7] == 2'b10) ? {1'b1, {7{1'b0}}} : // dn sat
t[7:0];
end
endfunction

2'b01: overflow

2'b10: underflow
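
The t[8:7] test can be verified exhaustively against a plain clamp. A Python sketch, assuming a and b are given as ordinary integers in the range -128..127:

```python
def satadd_sss8b(a, b):
    """8-bit signed saturating add; a and b are Python ints in -128..127."""
    t = (a + b) & 0x1FF                    # 9-bit two's complement sum
    top2 = (t >> 7) & 0b11                 # t[8:7]
    if top2 == 0b01:
        return 127                         # overflow: clamp to 0111_1111
    if top2 == 0b10:
        return -128                        # underflow: clamp to 1000_0000
    low8 = t & 0xFF
    return low8 - 256 if low8 & 0x80 else low8

print(satadd_sss8b(100, 100), satadd_sss8b(-100, -100), satadd_sss8b(100, -27))  # 127 -128 73
```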

signed + unsigned = unsigned

function [7:0] satadd_suu8b;    // signed + unsigned = unsigned
input signed [7:0] a;
input [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = {a[7], a} + {1'b0, b};
satadd_suu8b = (t[8:7] == 2'b10) ? {8{1'b1}} : // up saturate for unsigned
(t[8:7] == 2'b11) ? {8{1'b0}} : // dn saturate for unsigned
t[7:0];
end
endfunction
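
Here the true sum ranges from -128 to 382, which no longer fits in a 9-bit signed value, yet the two top bits still separate the cases: 2'b10 can only come from sums of 256 or more, and 2'b11 only from negative sums. A Python sketch that checks this against a clamp to 0..255 (a behavioural model, with a as a signed and b as an unsigned integer):

```python
def satadd_suu8b(a, b):
    """a: signed -128..127, b: unsigned 0..255, result: unsigned 0..255."""
    t = (a + b) & 0x1FF                    # 9-bit sum, true range -128..382
    top2 = (t >> 7) & 0b11                 # t[8:7]
    if top2 == 0b10:
        return 255                         # true sum 256..382
    if top2 == 0b11:
        return 0                           # true sum -128..-1
    return t & 0xFF

print(satadd_suu8b(127, 255), satadd_suu8b(-128, 5), satadd_suu8b(-1, 1))   # 255 0 0
```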

signed + unsigned = signed

function signed [7:0] satop_sus8b;    //signed +/- unsigned = signed
input signed [7:0] a;
input [7:0] b;
input plus;

reg signed [8:0] t; // extend 1b
begin
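if(plus) begin
t = {a[7], a} + {1'b0, b};
// true sum is -128..382; t[8] != t[7] covers 128..255 (2'b01) and 256..382 (2'b10, wrapped)
satop_sus8b = (t[8] != t[7]) ? {1'b0, {7{1'b1}}} // up saturate for signed
: t[7:0];
end else begin
t = {a[7], a} - {1'b0, b};
// true difference is -383..127; t[8] != t[7] covers -256..-129 (2'b10) and -383..-257 (2'b01, wrapped)
satop_sus8b = (t[8] != t[7]) ? {1'b1, {7{1'b0}}} // dn saturate for signed
: t[7:0];
end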
end
endfunction

Overflow Detection in Verilog

Overflow Detection: [http://www.c-jump.com/CIS77/CPU/Overflow/lecture.html]

  • Arithmetic operations have a potential to run into a condition known as overflow.
  • Overflow occurs with respect to the size of the data type that must accommodate the result.
  • Overflow indicates that the result was too large or too small to fit in the original data type.

Overflow when adding unsigned number

When two unsigned numbers are added, overflow occurs if

  • there is a carry out of the leftmost bit.

Overflow when adding signed numbers

When two signed 2's complement numbers are added, overflow is detected if:

  1. both operands are positive and the result is negative, or
  2. both operands are negative and the result is positive.

Notice that when operands have opposite signs, their sum will never overflow. Therefore, overflow can only occur when the operands have the same sign.

A        | B        | carryout_sum | overflow
011 (3)  | 011 (3)  | 0_110 (6)    | overflow
100 (-4) | 100 (-4) | 1_000 (-8)   | underflow
111 (-1) | 110 (-2) | 1_101 (-3)   | -

carryout information ISN'T needed to detect overflow/underflow for signed number addition

EXTBIT:MSB

extended 1bit and msb bit can be used to detect overflow or underflow

reg signed  [1:0]      acc_inc;
reg signed [10-1:0] acc;
wire signed [10 :0] acc_w; // extend 1b for saturation
wire signed [10-1:0] acc_stat;

assign acc_w = acc + acc_inc; // signed arithmetic

assign acc_stat = (acc_w[10-1 +: 2] == 2'b01) ? {1'b0, {(10-1){1'b1}}} : // up saturation
(acc_w[10-1 +: 2] == 2'b10) ? {1'b1, {(10-1){1'b0}}} : // down saturation
acc_w[10-1:0];

2'b01 : overflow, up saturation

2'b10: underflow, down saturation
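
The sign rule and the EXTBIT:MSB rule should always agree. The Python sketch below checks the three rows of the 3-bit table with both rules; the function names are illustrative and the operands are given as raw 3-bit patterns:

```python
def to_signed(bits, width):
    """Interpret a `width`-bit pattern as two's complement."""
    return bits - (1 << width) if bits >> (width - 1) else bits

def overflow_by_signs(a, b, width=3):
    """Sign rule: same-sign operands whose truncated sum has the other sign."""
    msb = width - 1
    s = (a + b) & ((1 << width) - 1)
    sa, sb, ss = a >> msb, b >> msb, s >> msb
    if sa == sb and ss != sa:
        return "underflow" if sa else "overflow"
    return "-"

def overflow_by_extbit(a, b, width=3):
    """EXTBIT:MSB rule on the sum of the sign-extended operands."""
    t = (to_signed(a, width) + to_signed(b, width)) & ((1 << (width + 1)) - 1)
    top2 = t >> (width - 1)
    return {0b01: "overflow", 0b10: "underflow"}.get(top2, "-")

for a, b in [(0b011, 0b011), (0b100, 0b100), (0b111, 0b110)]:
    print(format(a, "03b"), format(b, "03b"),
          overflow_by_signs(a, b), overflow_by_extbit(a, b))
```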

2's complement negative number

  1. Flip all bits
  2. Add 1.

N-bit signed number

\[ A = -M_{N-1}2^{N-1}+\sum_{k=0}^{N-2}M_k2^k \]

Flip all bits

\[\begin{align} A_{flip} &= -(1-M_{N-1})2^{N-1} +\sum_{k=0}^{N-2}(1-M_k)2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -2^{N-1}+\sum_{k=0}^{N-2}2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -1 \end{align}\]

Add 1

\[\begin{align} A_- &= A_{flip}+1 \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k \\ &= -A \end{align}\]
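
A quick numerical check of the flip-and-add-1 rule in Python, as a sketch for 4-bit values. It also shows the one case where the result wraps: the most negative number has no positive counterpart in N bits, so negating it returns the same pattern.

```python
def negate(bits, width):
    """Two's complement negation: flip every bit, then add 1 (modulo 2**width)."""
    mask = (1 << width) - 1
    return ((bits ^ mask) + 1) & mask

def to_signed(bits, width):
    """Interpret a `width`-bit pattern as two's complement."""
    return bits - (1 << width) if bits >> (width - 1) else bits

for value in (0b0011, 0b1111, 0b1000):
    print(to_signed(value, 4), "->", to_signed(negate(value, 4), 4))
# 3 -> -3
# -1 -> 1
# -8 -> -8
```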

reference

Lee, Weng Fook, Weng Fook Lee, and Glaser. Learning from VLSI design experience. Springer International Publishing, 2019.

Bevan Baas, EEC281 VLSI Digital Signal Processing, [https://www.ece.ucdavis.edu/~bbaas/281/]
