Jitter separation lets you learn if the components of jitter are random or deterministic. That is, if they are caused by crosstalk, channel loss, or some other phenomenon. The identification of jitter and noise sources is critical when debugging failure sources in the transmission of high-speed serial signals

  • Tail Fit Method
  • Spectral method
RJ Extraction Methods Rationale
Spectral Speed/Consistency to Past Measurements;
Accuracy in low Crosstalk or Aperiodic Bounded Uncorrelated Jitter (ABUJ) conditions
Tail Fit General Purpose;
Accuracy in high Crosstalk or ABUJ conditions

Jitter Components

image-20220521190326201

dual-Dirac model

image-20220521181604467

Figure-1


image-20250816100336592

image-20250816101651481

Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-scale [https://people.engr.tamu.edu/spalermo/ecen689/jitter_dual_dirac_agilent.pdf]

Q-scale

TODO 📅

Spectral method

power spectral density (PSD) represents jitter spectrum and peaks in the spectrum can be interpreted as PJ or DDJ, while the average noise floor is the power of RJ

image-20220521182929127

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
S1 = sum(win);
S2 = sum(win.^2);
N = length(win);
spec_nospur2 = (spec_nospur*S1).^2/N/S2; % To obtain linear spectrum for rj
rj_utj = sqrt(sum(spec_nospur2))*1e12;

spec = 1*ones(length(spec_nospur), 1)*1e-21;
spec(index) = specx(index);
% insert fft nyquist frequency component between positive frequency and
% negative frequency component
% DC;posFreq;nyqFreq;negFreq
spec_ifft = [spec;specnyq;conj(spec(end:-1:2))]';
sfactor = sum(win)/sqrt(2);
spec_ifft = spec_ifft*sfactor;
sig_rec = real(ifft(spec_ifft));
sig_rec = sig_rec(:);
sig_rec_utj = sig_rec./win(1:end);

Tail Fit Method

Tail fitting algorithm based on the Gaussian tail model by using probability distribution of collected jitter value

image-20220521191029433

1
2
3
4
5
6
7
8
9
10
11
12
13
bin_sig = bin_sig*1e12;

x = qfuncinv(cdf_sig);

% coef(1)*bin_sig + coef(2) = x
% which x is norm(0, 1)
% bin_sig = (x - coef(2))/coef(1)
% Then bin is norm(-coef(2)/coef(1), 1/coef(1))
coef = polyfit(bin_sig, x, 1);
sigma = 1/coef(1);
mu = -coef(2)*sigma;

fprintf('sigma=%.3fps, mu=%.3fps\n', sigma, mu);

Least Squares (LS) method

image-20220524005848719

It is known that TIE jitter is a linear equation, shown in below formula \[ x[n] = d_n \times \left[ \Delta t_{pj}[n]+\Delta t_{DCD}[n] +\Delta t_{ISI}[n]+\Delta t_{RJ}[n]\right] \] LS can be used to estimate the PJ, DCD, RJ , and ISI parameters \([a,b,J_{DCD},J_0, J_1...J_{(2^k-1)}]\)

image-20220524185332637

image-20220524185351383

image-20220524010446422

Jitter modeling

Periodic Jitter (PJ)

PJ is a repeating jitter \[ \Delta t_{PJ}[n]=A\sin(2\pi f_0\cdot nT_s + \theta)=a \sin(2\pi f_0 \cdot nT_s)+b\cos(2\pi f_0 \cdot nT_s) \] where \(f_0\) represents the fundamental frequency of PJ; \(A\) is the amplitude of PJ; \(T_s\) is the data stream period, and \(\theta\) is the initial phase of PJ

In the spectrum, the frequency of maximum amount of the jitter is PJ frequency \(f_0\).

Duty Cycle Distortion (DCD)

DCD is viewed as a series of adjacent positive and negative impulses \[ \Delta t_{DCD}[n] = J_{DCD}\times (-1)^n = [-J_{DCD},J_{DCD},-J_{DCD},J_{DCD},...] \] Where \(J_{DCD}\) is the DCD amplitude.

Random Jitter (RJ)

RJ is created by unbounded jitter sources, such as Gaussian white noise. The statistical PDF for RJ is enerally treated as a Gaussian distribution \[ f_{RJ}(\Delta t) = \frac{1}{\sqrt{2\pi\sigma}}\exp(-\frac{(\Delta t)^2}{2\sigma^2}) \]

DJ/RJ

K. Bidaj, J. -B. Begueret, N. Houdali, J. Deroo and S. Rieubon, "Time-domain PLL modeling and RJ/DJ jitter decomposition," 2016 IEEE International Symposium on Circuits and Systems (ISCAS), Montreal, QC, Canada, 2016 [https://sci-hub.se/10.1109/ISCAS.2016.7527201]

image-20251121235231154

Periodic Jitter Generator & Insertion

Analysis and Estimation of Jitter Sub-Components: Classification and Segregation of Jitter Components

image-20220521212129098

image-20220521212142719

Reference

Li, Mike Peng. Jitter, Noise, and Signal Integrity at High- Speed. Prentice Hall, 2008.

—,DesignCon2007, Simultaneous Jitter Analysis in Time, Frequency, and Statistical Domains and Their Interrelationships [pdf]

—,DesignCon2023, Accuracy and Challenges of PAM4 jitter and noise measurements for >100Gbps Serial Links [pdf]

—, "A new method for jitter decomposition through its distribution tail fitting," International Test Conference 1999. Proceedings (IEEE Cat. No.99CH37034), 1999, pp. 788-794, [https://sci-hub.jp/10.1109/TEST.1999.805809]


余宥浚 Jacky Yu, Keysight Taiwan AEO, Advanced Jitter and Eye-Diagram Analysis

Y. Duan and D. Chen, "Accurate jitter decomposition in high-speed links," 2017 IEEE 35th VLSI Test Symposium (VTS), 2017, pp. 1-6, doi: 10.1109/VTS.2017.7928918.

Y. Duan's phd thesis URL: https://dr.lib.iastate.edu/handle/20.500.12876/30459

Y. Duan and D. Chen, "Fast and Accurate Decomposition of Deterministic Jitter Components in High-Speed Links," in IEEE Transactions on Electromagnetic Compatibility, vol. 61, no. 1, pp. 217-225, Feb. 2019, doi: 10.1109/TEMC.2018.2797122.

"Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-Scale", Whitepaper: Keysight Technologies, U.S.A., Dec. 2017

Sharma, Vijender Kumar and Sujay Deb. "Analysis and Estimation of Jitter Sub-Components." (2014).

Qingqi Dou and J. A. Abraham, "Jitter decomposition in ring oscillators," Asia and South Pacific Conference on Design Automation, 2006

E. Balestrieri, L. De Vito, F. Lamonaca, F. Picariello, S. Rapuano and I. Tudosa, "The jitter measurement ways: The jitter decomposition," in IEEE Instrumentation & Measurement Magazine, vol. 23, no. 7, pp. 3-12, Oct. 2020, doi: 10.1109/MIM.2020.9234759.

McClure, Mark Scott. "Digital jitter measurement and separation." PhD diss., 2005.

Ren, Nan, Zaiming Fu, Shengcu Lei, Hanglin Liu, and Shulin Tian. "Jitter generation model based on timing modulation and cross point calibration for jitter decomposition." Metrology and Measurement Systems 28, no. 1 (2021).

K. Bidaj, J. -B. Begueret and J. Deroo, "RJ/DJ jitter decomposition technique for high speed links," 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS) [https://sci-hub.se/10.1109/ICECS.2016.7841269]

Clocking Structures

Mark Horowitz, Lecture 6 Clocked Elements [https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_06.pdf]

Ran Ginosar, ISCAS2008 Tutorial-4: Synchronization Circuits for Multiple Clock Domain SoCs

Synchronous, Mesochronous, Plesiochronous

image-20250815221238051

image-20260304220804525

image-20260403213719657

image-20260403213819780

QEC (Quadrature Error Corrector )

Shaokang ZHAO, 2025, "Multi-Phase Clock Generator for High-Speed Wireline Systems," [paper, slides]

TODO 📅

Standing Wave Based Clock Distribution Technique

G. Li, W. Lee, D. Cui, B. Zhang, A. Momtaz and J. Cao, "Standing wave based clock distribution technique with application to a 10 × 11 Gbps transceiver in 28 nm CMOS," 2015 IEEE Asian Solid-State Circuits Conference (A-SSCC), Xiamen, China, 2015, pp. 1-4 [https://sci-hub.se/10.1109/ASSCC.2015.7387451]

T. Ali et al., "6.4 A 180mW 56Gb/s DSP-Based Transceiver for High Density IOs in Data Center Switches in 7nm FinFET Technology," 2019 IEEE International Solid-State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2019, pp. 118-120 [https://sci-hub.se/10.1109/ISSCC.2019.8662523]

TODO 📅

image-20251217232140562

Inductive-Loaded Clock Distribution Technique

T. Shibasaki et al., "3.5 A 56Gb/s NRZ-electrical 247mW/lane serial-link transceiver in 28nm CMOS," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2016, pp. 64-65 [https://sci-hub.se/10.1109/ISSCC.2016.7417908]

TODO 📅

Cascaded PLLs

Chembiyan T, A General Theory of Cascaded PLL Design [link]

Nicola Da Dalt, ISSCC 2012 T5: JITTER basic and advanced concepts, statistics and applications [https://www.nishanchettri.com/isscc-slides/2012%20ISSCC/TUTORIALS/ISSCC2012Visuals-T5.pdf]

—. ESSCIRC 2019 Tutorials: Jitter in Wireline and Data Converter Applications [https://youtu.be/aapkfCeHTrQ]

To understand the impact of the clock jitter on the performance of a wireline system, the transfer functions of the PLL in the transmitter side and the CDR loop in the receiver should be taken into consideration

image-20250524111908032

image-20250524091655525

image-20250524222625524

the minimum jitter occurs at the point where the transmit PLL UGB is minimum and the CDR UGB is maximized

  • the net rms jitter that impacts the performance of a wireline transceiver is much lower than the rms jitter of the transmit PLL
  • the jitter requirements of the transmit PLL on the wireline system is much more relaxed compared to the wireless transceiver

AC-coupled buffer & DCC

The amount of correction can be set by intentional injection of an offset current into the summing input node of INV, threshold-adjustable inverter

Note that the change to the threshold is opposite in direction to the change to INV

increasing DC of input signal is equivalent to lower down the threshold of INV

image-20241215233057176


image-20241216205525818

voltage at INV1 will increased by: \[ \frac{\Delta V_{DAC} - \Delta {INV1}}{R_{DAC}} = \frac{\Delta {INV1} +A_0 \Delta {INV1}}{R_{F}} \] therefore \[ \Delta {INV1} = \Delta V_{DAC} \cdot \frac{R_F}{R_F+(A_0+1)R_{DAC}} \approx \Delta V_{DAC} \cdot \frac{R_F}{A_0R_{DAC}} \]

variable \(R_{DAC}\) can be used to tweak tuning resolution & range

If \(R_{DAC} = R_F\) \[ \Delta {INV1}\approx \frac{\Delta V_{DAC}}{A_0} \]


image-20251014215409535

image-20251014220640238

C. Menolfi et al., "A 112Gb/S 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS," 2018 IEEE International Solid-State Circuits Conference - (ISSCC) [https://sci-hub.se/https://doi.org/10.1109/ISSCC.2018.8310205],[visual]

M. A. Kossel et al., "8.3 An 8b DAC-Based SST TX Using Metal Gate Resistors with 1.4pJ/b Efficiency at 112Gb/s PAM-4 and 8-Tap FFE in 7nm CMOS," 2021 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021[https://sci-hub.se/10.1109/ISSCC42613.2021.9365784]

C. Menolfi et al., "A 28Gb/s source-series terminated TX in 32nm CMOS SOI," 2012 IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 2012

Bob Lefferts, Navraj Nandra. SNUG Israel 2007 [https://picture.iczhiku.com/resource/eetop/whKYwQorwYoPUVbm.pdf]


image-20240720073616597

Since duty-cycle error is high frequency component, the high-pass filter suppresses the duty-cycle error propagating to the output

image-20240720005226736

  • The AC-coupling capacitor blocks the low-frequency component of the input
  • The feedback resistor sets common mode voltage to the crossover voltage

Bae, Woorham; Jeong, Deog-Kyoon: 'Analysis and Design of CMOS Clocking Circuits for Low Phase Noise' (Materials, Circuits and Devices, 2020)

Casper B, O'Mahony F. Clocking analysis, implementation and measurement techniques for high-speed data links: A tutorial. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(1):17-39

divide-by-1.5 circuit

TODO 📅

Deterministic Jitter

image-20220516004008878


image-20220516004058916

image-20220516004206118

j_Djpp can be calculated by PSD,too

image-20220516004615033

1
2
3
4
5
6
7
8
9
fck = 38.4e6;
Nfft = 15000;
fres = fck/Nfft;
psddBc = -99.3343;
psBc = psddBc + 10*log10(fres); % psd -> ps;
phrad2 = 10^(psBc/10);
phrms = sqrt(phrad2);
Jrms = phrms/2/pi*1/fck;
Jpp = 2*sqrt(2)*Jrms;
1
2
3
Jpp =

6.4038e-12

For DJ, we usually use peak to peak value

BTW, the psd value at half of fundamental frequency (\(f_s/2\)) is duty cycle distortion due to the NMOS/PMOS imbalance, because of rising only data

Random Jitter

RJ can be accurately and efficiently measured using PSS/Pnoise or HB/HBnoise.

Note that the transient noise can also be used to compute RJ;

However, the computation cost is typically very high, and the accuracy is lesser as compared to PSS/Pnoise and HB/HBnoise.

Since RJ follows a Gaussian distribution, it can be fully characterized using its Root-Mean-Squared value (RMS) or the standard deviation value (\(\sigma\))

The Peak-to-Peak value of RJ (\(\text{RJ}_{\text{p-p}}\)) can be calculated under certain observation conditions \[ \text{RJ}_{\text{p-p}}\equiv K \ast \text{RJ}_{\text{RMS}} \] Here, \(K\) is a constant determined by the BER specification of the system given in the following Table

BER Crest factor (K)
\(10^{-3}\) 6.18
\(10^{-4}\) 7.438
\(10^{-5}\) 8.53
\(10^{-6}\) 9.507
\(10^{-7}\) 10.399
\(10^{-8}\) 11.224
\(10^{-9}\) 11.996
\(10^{-10}\) 12.723
\(10^{-11}\) 13.412
\(10^{-12}\) 14.069
\(10^{-13}\) 14.698
1
2
3
4
K = 14.698;
Ks = K/2;
p = normcdf([-Ks Ks]);
BER = 1 - (p(2)-p(1));
1
2
3
BER =

1.9962e-13

image-20220516160050961

image-20220516193125490

Total Jitter

\[ \text{TJ}_{\text{p-p}}\equiv \text{DJ}_{\text{p-p}} + \text{RJ}_{\text{p-p}}(\text{BER}) \]

tj.drawio

image-20220516160006909

image-20220516012200383

In the psd of TJ, the spur is DJ and floor is RJ

Phase Noise to Jitter

The phase noise is traditionally defined as the ratio of the power of the signal in 1Hz bandwidth at offset \(f\) from the carrier \(P\), divided by the power of the carrier \[ \ell (f) = \frac {S_v'(f_0+f)}{P} \] where \(S_v'\) is is one-sided voltage PSD and \(f \geqslant 0\)

Under narrow angle assumption \[ S_{\varphi}(f)= \frac {S_v'(f_0+f)}{P} \] where \(\forall f\in \left[-\infty +\infty\right]\)

Using the Wiener-Khinchin theorem, it is possible to easily derive the variance of the absolute jitter(\(J_{ee}\))via integration of the corresponding PSD \[ J_{ee,rms}^2 = \int S_{J_{ee}}(f)df \]

And we know the relationship between absolute jitter and excess phase is \[ J_{ee}=\frac {\varphi}{\omega_0} \] Considering that phase noise is normally symmetrical about the zero frequency, multiplied by two is shown as below \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}\ell(f)df}}{\omega_0} \] where phase noise is in linear units not in logarithmic ones.

Because the unit of phase noise in Spectre-RF is logarithmic unit (dBc), we have to convert the unit before applying the above equation \[ \ell[linear] = 10^{\frac {\ell [dBc/Hz]}{10}} \] The complete equation using the simulation result of Spectre-RF Pnoise is \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}10^{\frac {\ell [dBc/Hz]}{10}}df}}{\omega_0} \]

The above equation has been verified for sampled pnoise, i.e. Jee and Edge Phase Noise.

  • For pnoise-sampled(jitter), Direct Plot Form - Function: Jee:Integration Limits can calculate it conveniently
  • But for pnoise-timeaveage, you have to use the below equation to get RMS jitter.

One example, integrate to \(\frac{f_{osc}}{2}\) and \(f_{osc} = 16GHz\)

image-20220415100034220

Of course, it apply to conventional pnoise simulation.

On the other hand, output rms voltage noise, \(V_{out,rms}\) divied by slope should be close to \(J_{ee,rms}\) \[ J_{ee,rms} = \frac {V_{out,rms}}{slope} \]

Pulse Width Jitter (PWJ)

[Spectre Tech Tips: Measuring Noise in Digital Circuits]

Pnoise sampled: Edge Delay mode measures the noise defined by two edges. Both edges are defined by a threshold voltage and rising or falling edges, which measures the noise of the pulse itself and direct plot calculate the variation of the pulse width

TODO 📅

Power supply induced jitter (PSIJ)

Chulsoon Hwang. New Way to Improve Power Supply Induced Jitter Simulation Accuracy for IBIS Model [https://ibis.org/summits/aug21b/ding.pdf]

—, "A Generalized Power Supply Induced Jitter Model Based on Power Supply Rejection Ratio Response," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 6, pp. 1052-1060, June 2021

—, DesignCon 2021. A Generalized Power Supply Induced Jitter Model Based on Power Supply Rejection Ratio Response [paper]

—, "Power Supply Induced Jitter: Introduction and Recent Advances," in IEEE Transactions on Signal and Power Integrity, vol. 5, pp. 22-32, 2026

X. Mo, J. Wu, N. Wary and T. C. Carusone, "Design Methodologies for Low-Jitter CMOS Clock Distribution," in IEEE Open Journal of the Solid-State Circuits Society, vol. 1, pp. 94-103, 2021. [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9559395]

A sampled pxf analysis can be used to simulate the deterministic jitter of a circuit due to power supply ripple

image-20260203230643833

TODO 📅

reference

Article (20500632) Title: How to simulate Random and Deterministic Jitters

Spectre Tech Tips: Measuring Noise in Digital Circuits - Analog/Custom Design - Cadence Blogs - Cadence Community https://community.cadence.com/cadence_blogs_8/b/cic/posts/s . . .

Cadence RAK: Deterministic Jitter Measurement using SpectreRF

Frank Wiedmann. Using sampled pxf analysis to simulate deterministic jitter [https://community.cadence.com/cadence_technology_forums/f/custom-ic-design/51605/using-sampled-pxf-analysis-to-simulate-deterministic-jitter]

supply noise sensitivity: PSS+PAC or PSS+PX [https://designers-guide.org/forum/YaBB.pl?num=1376500816]


J. Kim et al., "A 112 Gb/s PAM-4 56 Gb/s NRZ Reconfigurable Transmitter With Three-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 54, no. 1, pp. 29-42, Jan. 2019, doi: 10.1109/JSSC.2018.2874040

— et al., "A 224-Gb/s DAC-Based PAM-4 Quarter-Rate Transmitter With 8-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 57, no. 1, pp. 6-20, Jan. 2022, doi: 10.1109/JSSC.2021.3108969


J. N. Tripathi, V. K. Sharma and H. Shrimali, "A Review on Power Supply Induced Jitter," in IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 9, no. 3, pp. 511-524, March 2019 [https://sci-hub.st/10.1109/TCPMT.2018.2872608]

H. Kim, J. Fan and C. Hwang, "Modeling of power supply induced jitter (PSIJ) transfer function at inverter chains," 2017 IEEE International Symposium on Electromagnetic Compatibility & Signal/Power Integrity (EMCSI), Washington, DC, USA, 2017 [https://sci-hub.st/10.1109/ISEMC.2017.8077937]

High Speed Communications Part 8 – On Die CMOS Clock Distribution. [https://youtu.be/nx5CiHcwrF0]

Low-Jitter CMOS Clock Distribution [https://youtu.be/LMT-T41Y64U]


Mozhgan Mansuri. ISSCC2021 SC3: Clocking, Clock Distribution, and Clock Management in Wireline/Wireless Subsystems [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC3.pdf]

Phillip Restle. ISSCC2021 SC4: Processor Clock Generation, Distribution, and Clock Sensor/Management Loops [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC4.pdf]

Jihwan Kim,Intel, ISSCC 2023 Forum F1.5: Circuit Designs for 200+Gb/s Electrical Transceivers

Sam Palermo. Spring 2025 ECEN720 : High-Speed Links Circuits and Systems [Lecture 14: Clock Distribution Techniques]

Muhammad Aldacher. Analog Design of Clock Distribution Network using Standing-Waves [https://github.com/muhammadaldacher/Analog-Design-of-Clock-Distribution-Network-using-Standing-Waves]

Rhee, W. (2020). Phase-locked frequency generation and clocking : architectures and circuits for modern wireless and wireline systems. The Institution of Engineering and Technology

DC offset

Performing FFT to a signal with a large DC offset would often result in a big impulse around frequency 0 Hz, thus masking out the signals of interests with relatively small amplitude.

Remove_DC_Offset_Blog_10

One method to remove DC offset from the original signal before performing FFT

  • Subtracting the Mean of Original Signal

You can also not filter the input, but set zero to the zero frequency point for FFT result.

Nyquist component

If we go back to the definition of the DFT \[ X(N/2)=\sum_{n=0}^{N-1}x[n]e^{-j2\pi (N/2)n/2}=\sum_{n=0}^{N-1}x[n]e^{-j\pi n}=\sum_{n=0}^{N-1}x[n](-1)^n \] which is a real number.

The discrete function \[ x[n]=\cos(\pi n) \] is always \((-1)^n\) for integer \(n\)

One general sinusoid at Nyquist and has phase shift \(\theta\), this is \(T=2\) and \(T_s=1\)

\[\begin{align} x[n] &= A \cos(\pi n + \theta) \\ &= A \big( \cos(\pi n) \cos(\theta) - \sin(\pi n) \sin(\theta) \big) \\ &= \big(A\cos(\theta)\big) \cos(\pi n) + \big(-A\sin(\theta)\big) \sin(\pi n) \\ &= \big(A\cos(\theta)\big) (-1)^n + \big(-A\sin(\theta)\big) \cdot 0 \\ &= B \cdot (-1)^n \\ \end{align}\]

Where \(A\cos(\theta)=B\).

Moreover \(B \cdot (-1)^n = B\cdot \cos(\pi n)\), then \[ B\cdot \cos(\pi n) = A \cdot \cos(\pi n + \theta) \] We can NOT distinguish one from another.

In other words, you CAN'T infer the signal from \(X(\frac{N}{2})\) \[\begin{align} X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}} &= \frac{X\left(\frac{N}{2} \right)}{N}(-1)^n \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\cos(\pi n) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\left( \cos(\pi n) - \beta \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\sqrt{1+\beta^2}\left(\frac{1}{\sqrt{1+\beta^2}} \cos(\pi n) - \frac{\beta}{\sqrt{1+\beta^2}} \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)}\left(\cos(\theta) \cos(\pi n) - \sin(\theta) \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)} \cos(\pi n+\theta) \end{align}\]

where \(\beta \in \mathbb{R}\) and you wouldn't know it because \(\sin(\pi n)=0 \quad \forall n \in \mathbb{Z}\)

For example, if \(\theta=0\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N} \cos(\pi n) \] However, if \(\theta=\frac{\pi}{3}\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N}\cdot 2 \cos(\pi n+\frac{\pi}{3}) \]

That sort of ambiguity is the reason for the strict inequality of the sampling theorem's condition.

Duty Cycle Distortion

image-20250522203602328

[Prof. Tony Chan Carusone, Low-Jitter CMOS Clock Distribution]

Both edges is used for clock waveform to evaluate the duty cycle distortion,

Assuming TIE is [0 0.2 0 0.2 0 0.2 ...], then subtract DC offset, we get [-0.1 0.1 -0.1 0.1 ...], shown as below

image-20220516224616908

The amplitude is manifested in FFT amplitude spectrum, i.e. Nyquist component, which is 0.1 in follow figure

image-20220516233618663

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
N = 32;
n = (1:N);
x = 0.1*(-1).^n;
figure(1)
stem(n-1, x);
X = fft(x)/N;
Xshift = fftshift(X);
fprintf("nyquist component: %.2f\n", Xshift(1));
magXshift = abs(Xshift);
ph = phase(Xshift)/pi*180;
figure(2)
subplot(3, 1, 1)
fx = (-N/2:N/2-1);
stem(fx, magXshift);
xlabel('Freq');
ylabel('|X(k)|');
title('mag of DFT');
grid on;

subplot(3, 1, 2)
stem(fx, ph);
xlabel('Freq');
ylabel('\angle X(k)(^oC)');
title('phase of DFT');
grid on;

%% inverse dft
ninv = (0:32-1);
xinv = Xshift(1)*cos(pi*ninv);
subplot(3, 1, 3);
hold on;
stem(ninv, x, "filled", 'r');
stem(ninv, xinv,'bd-.');
ninfer = (0:0.1:32+1);
xinfer1 = Xshift(1)*cos(pi*ninfer); % theta = 0
xinfer2 = Xshift(1)*2*cos(pi*ninfer+pi/3); % theta = pi/3
plot(ninfer, xinfer1, 'm--');
plot(ninfer, xinfer2, 'c--');
hold off;
legend('original', 'IDFT', '\theta=0', '\theta=\pi/3');
xlabel('time');
ylabel('V');
title('sample points');
grid on;

Single-Sided Amplitude Spectrum

DC and Nyquist frequency of FFT left over

P1(2:end-1) = 2*P1(2:end-1);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
Fs = 1000;            % Sampling frequency
T = 1/Fs; % Sampling period
L = 1500; % Length of signal
t = (0:L-1)*T; % Time vector
S = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
X = S + 2*randn(size(t));
figure(1)
plot(1000*t(1:50),X(1:50))
title('Signal Corrupted with Zero-Mean Random Noise')
xlabel('t (milliseconds)')
ylabel('X(t)')

figure(2)
Y = fft(X);
P2 = abs(Y/L); %!!! two-sided spectrum P2.
P1 = P2(1:L/2+1); %!!! single-sided spectrum P1
P1(2:end-1) = 2*P1(2:end-1); % exclude DC and Nyquist freqency
f = Fs*(0:(L/2))/L;
figure(2)
plot(f,P1)
title('Single-Sided Amplitude Spectrum of X(t)')
xlabel('f (Hz)')
ylabel('|P1(f)|')

image-20220514221609734

image-20220514221642170

Alternative View

The direct current (DC) bin (\(k=0\)) and the bin at \(k=N/2\), i.e., the bin that corresponds to the Nyquist frequency are purely real and unique.

sinusoidal waveform with \(10Hz\), amplitude 1 is \(cos(2\pi f_c t)\). The plot is shown as below with sampling frequency is \(20Hz\)

image-20220427172526711

Amplitude and Phase spectrum, sampled with \(f_s=20\) Hz

image-20220427174557915

The FFT magnitude of \(10Hz\) is 1 and its phase is 0 as shown as above, which proves the DFT and IDFT.

Caution: the power of FFT is related to samples (DFT Parseval's theorem), which may not be the power of continuous signal. The average power of samples is ([1 -1 1 -1 -1 1 ...]) is 1, that of corresponding continuous signal is \(\frac{1}{2}\).

Power spectrum derived from FFT provide information of samples, i.e. 1

Moreover, average power of sample [1 -1 1 -1 1 ...] is same with DC [1 1 1 1 ...].

code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
fc = 10;
fs = 2*fc;
fov = 64*fc;

ts = (0:1/fs:26);
tov = (0:1/fov:26);

ys = cos(2*pi*fc*ts);
yov = cos(2*pi*fc*tov);
%% waveform
stem(ts, ys)
hold on;
plot(tov, yov);
legend('sample', 'waveform')
ylim([-1.5 1.5])
grid on;
xlabel('time(s)');
ylabel('mag(V)')

nfft = 256;
X = fftshift(fft(ys, nfft))/nfft;
f = (-nfft/2:nfft/2-1)*fs/nfft;
magX = abs(X);
phsX = atan2(imag(X),real(X));
%% fft spectrum
figure(2)
subplot(2, 1, 1);
stem(f, magX);
xlabel('Frequency(Hz)');
ylabel('mag')
xlim([min(f)-1 max(f)+1])
title('Amplitude spectrum')
subplot(2, 1, 2)
plot(f, phsX);
xlabel('Frequency(Hz)');
ylabel('phase (rad)')
xlim([min(f)-1 max(f)+1])
title('Phase spectrum')

%% power spectrum
yssq_sum_avg = sum(ys(1:nfft).^2)/nfft;
specsq_sum_avg = sum(abs(X).^2);

reference

OriginLab, How to Remove DC Offset before Performing FFT URL: http://blog.originlab.com/how-to-remove-dc-offset-before-performing-fft

How to remove DC component in FFT? URL: https://www.mathworks.com/matlabcentral/answers/712808-how-to-remove-dc-component-in-fft#answer_594373

Analyzing a signal that contains frequency content at Fs/2 doesn't seem to work unless there is a phase shift URL: https://dsp.stackexchange.com/a/59807/59253

Nyquist–Shannon sampling theorem, Critical frequency URL: https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem#Critical_frequency

Why remove energy at Nyquist before ifft? URL: https://dsp.stackexchange.com/a/22851/59253

questasim sim flow

A Short Intro to ModelSim Verilog Simulator [https://users.ece.cmu.edu/~jhoe/doku/doku.php?id=a_short_intro_to_modelsim_verilog_simulator]

1
2
3
4
vlib work
vlog -f filelist tb.sv
# "-c": command line mode
vsim -voptargs=+acc -c -do "run 100ns; exit" work.topmodule

-voptargs=+acc: Add the option -voptargs=+acc to the vsim command, This enables full visibility into every aspect of the design.

1
2
3
module topmodule;
...
endmodule

uvm:

1
2
> vlog test_pkg.sv tb_top.sv -L $QUESTA_HOME/uvm-1.2
> vsim -c -do "run -all;exit" +UVM_TESTNAME=my_test work.tb_top -L $QUESTA_HOME/uvm-1.2

verilog-mode.el

Emacs Online Documentation https://doc.endlessparentheses.com/

Emacs verilog-mode 的使用 URL: https://www.wenhui.space/docs/02-emacs/verilog_mode_useguide/

1
emacs --no-site-file --load path/to/verilog-mode.el --batch filename.v -f verilog-auto-save-compile

CAUTION: filename.v is overwrite by command

verilog-mode.el

1
2
3
4
/*AUTOINPUT*/
/*AUTOWIRE*/
/*AUTOINST*/
/*AUTO_TEMPLATE*/

-f verilog-batch-auto

For use with --batch, perform automatic expansions as a stand-alone tool. This sets up the appropriate Verilog mode environment, updates automatics with M-x verilog-auto on all command-line files, and saves the buffers. For proper results, multiple filenames need to be passed on the command line in bottom-up order.

-f verilog-auto-save-compile

Update automatics with M-x verilog-auto, save the buffer, and compile

Emacs

--no-site-file

Another file for site-customization is site-start.el. Emacs loads this before the user's init file (.emacs, .emacs.el or .emacs.d/.emacs.d). You can inhibit the loading of this file with the option --no-site-file

--batch

The command-line option --batch causes Emacs to run noninteractively. The idea is that you specify Lisp programs to run; when they are finished, Emacs should exit.

--load, -l FILE, load Emacs Lisp FILE using the load function;

--funcall, -f FUNC, call Emacs Lisp function FUNC with no arguments

-f FUNC

--funcall, -f FUNC, call Emacs Lisp function FUNC with no arguments

--load, -l FILE

--load, -l FILE, load Emacs Lisp FILE using the load function

Verilog-mode is a standard part of GNU Emacs as of 22.2.

multiple directories

AUTOINST only search in the file's directory default.

You can append below verilog-library-directories for multiple directories search

1
2
3
// Local Variables:
// verilog-library-directories:("." "subdir" "subdir2")
// End:

plusargs in Verilog

systemverilog-command-line-input URL: https://www.chipverify.com/systemverilog/systemverilog-command-line-input

PLUSARGS IN SYSTEMVERILOG URL:https://www.theartofverification.com/plusargs-in-systemverilog/

plusargs are command-line switches supported by the simulator. As per SystemVerilog LRM arguments beginning with the + character will be available using the $test$plusargs and $value$plusargs PLI APIs.

1
2
3
$test$plusargs (user_string)

$value$plusargs (user_string, variable)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// tb.v
module tb;
int a;
initial begin
if($test$plusargs("RUNSIM")) begin
$display("There is RUNSIM plusargs");
end else begin
$display("There is NO $test$plusargs");
end
if($value$plusargs("SEED=%d",a)) begin
$display("SEED=%d",a);
end else begin
$display("There is NO $value$plusargs");
end
end
endmodule
  • compile

    1
    2
    $ vlib work
    $ vlog -sv tb.v
  • simulate (QuestaSim)

    • without plusargs

      1
      $ vsim work.tb -c -do "run; exit"
      1
      2
      3
      4
      5
      6
      7
      8
      9
      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is NO $test$plusargs
      # There is NO $value$plusargs
      # exit
      # End time: 13:04:23 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0
    • with plusargs

      1
      $ vsim work.tb -c -do "run; exit" +SEED=31 +RUNSIM

      +SEED=31 +RUNSIM

      1
      2
      3
      4
      5
      6
      7
      8
      9
      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is RUNSIM plusargs
      # SEED= 31
      # exit
      # End time: 13:04:55 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0

Inertial & transport delays

Verilog Nonblocking Assignments With Delays, Myths & Mysteries

Correct Methods For Adding Delays To Verilog Behavioral Models

Article (20488135) Title: Selecting Different Delay Modes in GLS (RAK)

Article (20447759) Title: Gate Level Simulation (GLS): A Quick Guide for Beginners

image-20230414232256309

image-20230414232439556

Inertial delay

Inertial delay models are simulation delay models that filter pulses that are shorted than the propagation delay of Verilog gate primitives or continuous assignments (assign #5 y = ~a;)

COMBINATIONAL LOGIC ONLY !!!

  • Inertial delays swallow glitches
  • sequential logic implemented with procedure assignments DON'T follow the rule

continuous assignments

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
`timescale 1ns/100ps
module tb;

reg in;
/////////////////////////////////////////////////////
wire out;
assign #2.5 out = in;
/////////////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317000509716

procedure assignment - combinational logic

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
`timescale 1ns/100ps
module tb;

reg in;
reg out;

//////////// combination logic ////////////////////////
always @(*)
#2.5 out = in;
///////////////////////////////////////////////////////
/* the above code is same with following code
always @(*) begin
#2.5;
out = in;
end
*/
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235257361

procedure assignment - sequential logic

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
`timescale 1ns/100ps
module tb;
reg clk;

reg in;
reg out;

always begin
clk = 0;
#5;
forever begin
clk = ~clk;
#5;
end
end
//////////// sequential logic //////////////////
always @(posedge clk)
#2.5 out <= in;
///////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235620168

As shown above, sequential logic DON'T follow inertial delay

Transport delay

Transport delay models are simulation delay models that pass all pulses, including pulses that are shorter than the propagation delay of corresponding Verilog procedural assignments

  • Transport delays pass glitches, delayed in time
  • Verilog can model RTL transport delays by adding explicit delays to the right-hand-side (RHS) of a nonblocking assignment
1
2
always @(*)
y <= #5 ~a;

nonblocking assignment

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// nonblocking assignment ///
always @(*) begin
out <= #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003146825

blocking assignment

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// blocking assignment ///
always @(*) begin
out = #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003819457

It seems that new event is discarded before previous event is realized.

clocking block in SystemVerilog

Assignment at <interface>.<clocking block>.<output signal> (i.e. synchronous) do NOT change <interface>.<output signal> until active clock edge.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// router_io.sv

interface router_io(input bit clock);
logic reset_n;
logic [15:0] din;
logic [15:0] frame_n;
logic [15:0] valid_n;
logic [15:0] dout;
logic [15:0] valido_n;
logic [15:0] busy_n;
logic [15:0] frameo_n;

clocking cb @(posedge clock);
default input #1ns output #1ns;
output reset_n;
output din;
output frame_n;
output valid_n;
input dout;
input valido_n;
input frameo_n;
input busy_n;
endclocking: cb

// `reset_n` can be either a synchronous or an asynchronous signal
modport TB(clocking cb, output reset_n);

endinterface: router_io

All interface signals are asynchronous and without a direction spection (i.e. input, output, inout).

  • The direction can only be specified in clocking block for synchronous signals
  • or a modport for asynchronous signals

All directions for the signals in the clocking block must be with respect to the test program;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// test.sv

program automatic test(router_io.TB rtr_io);

initial begin
reset();
end

task reset();
rtr_io.reset_n = 1'b0;
rtr_io.cb.frame_n <= '1;
rtr_io.cb.valid_n <= '1;
repeat(2) @rtr_io.cb;
rtr_io.cb.reset_n <= 1'b1;
repeat(15) @(rtr_io.cb);
endtask: reset

endprogram: test
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
// router_test_top.sv

`timescale 1ns/100ps

module router_test_top;
parameter simulation_cycle = 100;

bit SystemClock = 0;

router_io top_io(SystemClock);
test t(top_io);

router dut(
.reset_n (top_io.reset_n),
.clock (top_io.clock),
.din (top_io.din),
.frame_n (top_io.frame_n),
.valid_n (top_io.valid_n),
.dout (top_io.dout),
.valido_n (top_io.valido_n),
.busy_n (top_io.busy_n),
.frameo_n (top_io.frameo_n)
);

initial begin
$timeformat(-9, 1, "ns", 10);
$fsdbDumpvars;
end

always begin
#(simulation_cycle/2) SystemClock = ~SystemClock;
end

endmodule

compile:

1
2
$ vcs -sverilog -full64 -kdb -debug_access+all router_test_top.sv test.sv router_io
.sv ../../rtl/router.v

file with `timescale must be placed in the first, which is router_test_top.sv in above example

clocking.output

image-20220621005749074

systemverilog don't pass clocking.output to interface's until current or next active edge and after output-skew

clocking.input

image-20220621010546293

Systemverilog automatically update clocking.input signal from interface's value, input-skew before active edge

Gotcha

An interface must be compiled separately like a module and CANNOT `include inside a package or ohter module

Cadence EE_pkg 101

What Is Real Number Modeling?

  • model analog blocks operation as signal flow model
  • only the digital solver is used for high-speed simulation
    • Event-driven
    • No convergence issues, because no analog solver is used
  • Five different language standards support real number modeling:
    • wreal (wired-real) ports in Verilog-AMS
    • real data type in VHDL
    • real data type in Verilog
    • real variables and nettypes in SystemVerilog (SV)
    • real types in e

Benefits of RNM

  • Most analog circuits that need to be modeled for MS verification at the SoC level can be described in terms of real-valued voltages or currents
  • RNM is a mixed approach, borrowing concepts from both continuous and discrete domains
    • The values are floating-point (real) number.
    • Time is discrete; the real signals change values based on discrete events
  • Applicability of RNM is bounded primarily by signal-flow model style
  • Migrating analog behavior from the analog domain to the event or pseudo-analog domain can bring huge benefits without sacrificing too much accuracy
  • Simulation is executed by a digital simulation engine without need for the analog solver
  • Hence real-number modeling enables very high performance simulation of mixed-signal systems

Limitations of RNM

  • connecting real or wreal signals to electrical signals requires careful consideration
    • Too conservative an approach can lead to large numbers of timepoints
    • Too liberal an approach can lead to losing signal accuracy
  • Time accuracy limited by the discrete sampling approach and the `timescale setting - no continuous signals anymore
  • Limited capability for combination of signals by wiring outputs together
    • Requires assumptions about impedances to do simple merging

constant part-select and indexed part-select in Verilog

Verilog scalar and vector [link]

What is the "+:" operator called in Verilog? [link]

A range of contiguous bits can be selected and is known as part-select. There are two types of part-selects, one with a constant part-select and another with an indexed part-select

1
2
reg [31:0] addr;
addr [23:16] = 8'h23; //bits 23 to 16 will be replaced by the new value 'h23 -> constant part-select

Having a variable part-select allows it to be used effectively in loops to select parts of the vector. Although the starting bit can be varied, the width has to be constant.

[<start_bit +: ] // part-select increments from start-bit

[<start_bit -: ] // part-select decrements from start-bit

Example

1
2
3
4
5
6
7
8
9
10
11
12
logic [31: 0] a_vect;
logic [0 :31] b_vect;

logic [63: 0] dword;
integer sel;

a_vect[ 0 +: 8] // == a_vect[ 7 : 0]
a_vect[15 -: 8] // == a_vect[15 : 8]
b_vect[ 0 +: 8] // == b_vect[0 : 7]
b_vect[15 -: 8] // == b_vect[8 :15]

dword[8*sel +: 8] // variable part-select with fixed width

Mixing Signed and Unsigned in Verilog

Bevan Baas, VLSI Digital Signal Processing, EEC 281 - VLSI Digital Signal Processing [https://www.ece.ucdavis.edu/~bbaas/281/]

Sign Extension

  1. Calculate the necessary minimum width of the sum so that it contains all input possibilities
  2. Extend the inputs' sign bits to the width of the answer
  3. Add as usual
  4. Ignore bits that ripple to the left of the answer's MSB
  1. signed
inA (signed) inB (signed) outSum
(signed/unsigned)
0101 (5) 1111 (-1)
extend sign 00101 11111
sum result 00100
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg; // signed result
reg [4:0] outSumUs; // unsigned result

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule
  1. mixed
inA (signed) inB (unsigned) outSum
(signed/unsigned)
0101 (5) 1111 (15)
extend sign 00101 01111
sum result 10100
1
2
reg signed [3:0] inA;
reg [3:0] inB;
inA (unsigned) inB (signed) outSum
(signed/unsigned)
0101 (5) 1111 (-1)
extend sign 00101 01111
sum result 10100
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg;
reg [4:0] outSumUs;

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule

xcelium

1
2
3
4
5
6
7
xcelium> run
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

vcs

1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:24 2022
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
V C S S i m u l a t i o n R e p o r t

observation

When signed and unsigned is mixed, the result is by default unsigned.

Prepend to operands with 0s instead of extending sign, even though the operands is signed

LHS DONT affect how the simulator operate on the operands but what the results represent, signed or unsigned

Therefore, although outSumUs is declared as signed, its result is unsigned

subtraction example

In logic arithmetic, addition and subtraction are commonly used for digital design. Subtraction is similar to addition except that the subtracted number is 2's complement. By using 2's complement for the subtracted number, both addition and subtraction can be unified to using addition only.

operands are signed
inA (signed) inB (signed) outSub
(signed/unsigned)
1111 (-1) 0010 (2)
extend sign 11111 00010
sub result 11101
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule
1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:46 2022
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
V C S S i m u l a t i o n R e p o r t
1
2
3
4
5
6
xcelium> run
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
xmsim: *W,RNQUIE: Simulation is complete.
operands are mixed
inA (signed) inB (unsigned) outSub
(signed/unsigned)
1111 (-1) 0010 (2)
extend sign 01111 00010
sub result 01101
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule
1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:50 2022
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
V C S S i m u l a t i o n R e p o r t
1
2
3
4
5
6
7
xcelium> run
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

Danger Sign

https://projectf.io/posts/numbers-in-verilog/

Verilog has a nasty habit of treating everything as unsigned unless all variables in an expression are signed. To add insult to injury, most tools won’t warn you if signed values are being ignored.

If you take one thing away from this post:

Never mix signed and unsigned variables in one expression!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
module signed_tb ();
logic [7:0] x; // 'x' is unsigned
logic signed [7:0] y; // 'y' is signed
logic signed [7:0] x1, y1;
logic signed [3:0] move;

always_comb begin
x1 = x + move; // !? DANGER: 'x' is unsigned but 'move' is signed
y1 = y + move;
end

initial begin
#10
$display("Coordinates (7,7):");
x = 8'd7;
y = 8'd7;
#10
$display("x : %b %d", x, x);
$display("y : %b %d", y, y);

#10
$display("Move +4:");
move = 4'sd4; // signed positive value
#10
$display("x1: %b %d *LOOKS OK*", x1, x1);
$display("y1: %b %d", y1, y1);

#10
$display("Move -4:");
move = -4'sd4; // signed negative value
#10
$display("x1: %b %d *SURPRISE*", x1, x1); // 0000_0111 + {0000}_1100 = 0001_0011
$display("y1: %b %d", y1, y1); // 0000_0111 + {1111}_1100 = 0000_0011
end
endmodule
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Chronologic VCS simulator copyright 1991-2021
Contains Synopsys proprietary information.
Compiler version S-2021.09-SP2-2_Full64; Runtime version S-2021.09-SP2-2_Full64; Nov 19 11:02 2022
Coordinates (7,7):
x : 00000111 7
y : 00000111 7
Move +4:
x1: 00001011 11 *LOOKS OK*
y1: 00001011 11
Move -4:
x1: 00010011 19 *SURPRISE*
y1: 00000011 3
V C S S i m u l a t i o n R e p o r t
Time: 60
CPU Time: 0.260 seconds; Data structure size: 0.0Mb

time format in Verilog

https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94062 https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94096

realtime vs time

  • $realtime round the current time to timeprecision

  • $time round the current time to integer

  • %t will scale the rounded value to represent timeprecision,

    i.e. \([\$\text{realtime}, \$\text{time}]\cdot \$\text{timeunit} / \$\text{timeprecision}\)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
timeunit 10ns;
timeprecision 1ps;

initial begin
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);
// $realtime round to timeprecision
// $time round to integer
#1.10016
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);

// %t format will scale the rounded value to represent timeprecision
// 1.1002*10e-9/1e-12 = 11002
$display("$realtime %%t = %t", $realtime);
$display("$time %%t = %t", $time);
end
endmodule

output

1
2
3
4
5
6
$realtime = 0
$time = 0
$realtime = 1.1002
$time = 1
$realtime %t = 11002
$time %t = 10000

timeunit, timeprecision

The time unit and time precision can be specified in the following two ways:

  • Using the compiler directive `timescale
  • Using the keywords timeunit and timeprecision
1
2
3
4
5
6
7
8
9
10
module D (...);
timeunit 100ps;
timeprecision 10fs;
...
endmodule

module E (...);
timeunit 100ps / 10fs; // timeunit with optional second argument
...
endmodule

The minimum of timeprecision determine %t output, the nearest timeunit and timeprecision determine the round of $realtime and $time. Of course, the simulator follow the time tick shown by $realtime.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
`timescale 1ns/1ns

// Due to minimum of timeprecision is 10ps
module rawplant;
timeunit 1ns;
timeprecision 100ps;

task print;
/*
1ns: 1
100ps: 6
10ps: 6
1ps: 6
*/
#1.666
$display("raw: $realtime = %g", $realtime); // 1.7
$display("raw: $time = %g", $time); // 2
$display("raw: $realtime %%t = %t", $realtime); // 1.7*1ns/10ps=170
$display("raw: $time %%t = %t", $time); // 2*1ns/10ps = 200
endtask

endmodule

module fineplant;
timeunit 100ps;
timeprecision 10ps;

task print;
/*
100ps: 2
10ps: 6
1ps: 6
*/
#2.66
$display("fine: $realtime = %g", $realtime); // 2.7
$display("fine: $time = %g", $time); // 3
$display("fine: $realtime %%t = %t", $realtime); // 2.7*100ps/10ps = 27
$display("fine: $time %%t = %t", $time); // 3*100ps/10ps = 30
endtask

endmodule

module tb;
rawplant rawblock();
fineplant fineblock();

initial begin
fork
rawblock.print();
fineblock.print();
join
end
endmodule

output

1
2
3
4
5
6
7
8
fine: $realtime = 2.7
fine: $time = 3
fine: $realtime %t = 27
fine: $time %t = 30
raw: $realtime = 1.7
raw: $time = 2
raw: $realtime %t = 170
raw: $time %t = 200

questasim cmd

1
2
3
vlib work
vlog tb.v
vsim -c -do "run -all;exit" work.tb

data dumping of simulation

$dumpvars and $dumpfile Verilog, [http://www.referencedesigner.com/tutorials/verilog/verilog_62.php]

FSDB

$fsdbDumpfile

It specifies the FSDB file name created by the Novas object files for FSDB dumping. If it is not specified, then the default FSDB file name is "novas.fsdb".

This command is valid only before executing $fsdbDumpvars and is ignored if specified after $fsdbDumpvars

$fsdbSuppress

The fsdbSuppressutility is used to skip dumping of few instances, scopes, modules and signals. The fsdbSuppressutility is a system task like other fsdb tasks.

For $fsdbSuppress() to be effective, it needs to be specified/called before $fsdbDumpvars

$fsdbAutoSwitchDumpfile

Automatically switch to a new dump file when the working FSDB file reaches the specified size or the specified wall time period.

After the dumping is finished, a virtual FSDB file (*.vf) is automatically created and list all of the generated FSDB files with the correct sequence. Only the virtual FSDB file, rather than all of the FSDB files, needs to be loaded to view the simulation results

When specified in the design to switch based on file size:

1
$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"]);

When specified in the design to switch based on time period

1
$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"], “+by_period”);

“+by_period”

$fsdbDumpvars

This command dumps the change in signal value to the FSDB file.

1
$fsdbDumpvars([ depth, | "level=",depth_var, ],[instance | "instance=",instance_var])

For VCS users, to include memory, MDA, packed array and structure information in the generated FSDB file, the -debug_access option must be included when VCS is invoked to compile the design

  • depth

    Specify how many sub-scope levels under the given scope you want to dump.

    • Specify this argument as 1 to dump the signals under the given scope
    • Specify this argument as 0 to dump all signals under the given scope and its descendant scopes.

    0: all signals in all scopes.

    1: all signals in current scope.

    2: all signals in the current scope and all scopes one level below.

    n: all signals in the current scope and all scopes n-1 levels below.

    tb.clk tb.u_div2.div2 tb.u_div2.u_div2neg.div2neg
    $fsdbDumpvars(0)
    $fsdbDumpvars(1)
    $fsdbDumpvars(2)
    $fsdbDumpvars(1, tb.u_div2)
    $fsdbDumpvars(0, tb.u_div2)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    module tb;
    reg clk;

    divider2 u_div2(clk);

    initial begin
    clk = 1'b0;
    forever #5 clk = ~clk;
    end

    initial begin
    #100;
    $finish();
    end

    initial begin
    #10;
    $fsdbDumpfile("tb.fsdb");
    //$fsdbDumpvars(0); // same with $fsdbDumpvars(0, tb)
    //$fsdbDumpvars(1); // same with $fsdbDumpvars(1, tb)
    //$fsdbDumpvars(2); // same with $fsdbDumpvars(2, tb)
    //$fsdbDumpvars(1, tb.u_div2);
    $fsdbDumpvars(0, tb.u_div2);
    #80 $finish();
    end

    endmodule


    module divider2 (
    input clk
    );
    reg div2;

    divider2neg u_div2neg(div2);

    always@(posedge clk) begin
    div2 = ~div2;
    end

    initial begin
    div2 = 1'b0;
    end

    endmodule

    module divider2neg (
    input clk
    );
    reg div2neg;

    always@(negedge clk) begin
    div2neg = ~div2neg;
    end

    initial begin
    div2neg= 1'b0;
    end

    endmodule

    compile

    1
    vcs -full64 -kdb -debug_access+all tb.v

    simulate

    1
    ./simv

    load fsdb

    1
    verdi -ssf tb.fsdb

    image-20220604192421888

$fsdbDumpon, $fsdbDumpoff

1
2
3
$fsdbDumpon(["+fsdbfile+filename"])

$fsdbDumpoff(["+fsdbfile+filename"])

These FSDB dumping commands turn dumping on and off. fsdbDumpon/fsdbDumpoff has the highest priority and overrides all other FSDB dumping commands.

fsdbDumpon/fsdbDumpoff is not restricted to only fsdbDumpvars. If there is more than one FSDB file open for dumping at one simulation run, fsdbDumpon/fsdbDumpoff may only affect a specific FSDB file by specifying the specific file name.

  • +fsdbfile+filename: Specify the FSDB file name. If not specified, the default FSDB file name is "novas.fsdb"

$fsdbDumpFinish

This command closes all FSDB files in the current simulation and stops dumping of signals. Although all FSDB files are closed automatically at the end of simulation, this dumping command can be invoked to explicitly close the FSDB files during the simulation

VCD

$dumpfile

The declaration onf $dumpfile must come before the $dumpvars or any other system tasks that specifies dump.

1
$dumpfile("test.vcd");

argument is necessary, there is no default value

$dumpvars

The $dumpvars is used to specify which variables are to be dumped ( in the file mentioned by $dumpfile). The simplest way to use it is without any argument.

1
$dumpvars(<levels> <, <module_or_variable>>* );

$dumplimit

It is possible that you inadvertantly generate huge file in Gigabytes ( for examples while dumping a Gigahertz clock for one second). To reduce such occurrences, we may use $dumplimit. It usage is

1
$dumplimit(<filesize>);

$dumpoff and $dumpon

During the simulation if you are bothered about about only during a certain interval then you can use $dumpoff and $dumpon. The following example shows its usage. It will dump the changes for first 100 units of time and then between 10200 and 10400 units of time.

1
2
3
4
5
6
7
8
9
10
11
12
initial
$monitor($time, " reset=%b,clk_out=%b",reset,clk_out);
initial begin
$dumpfile("clkdiv2n_tb.vcd");
$dumpvars(0,clkdiv2n_tb);
#100;
$dumpoff;
#10200;
$dumpon;
#10400;
$dumpoff;
end

demo

stimulus.v

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
`timescale 1ns / 1ps
module stimulus;
// Inputs
reg x;
reg y;
// Outputs
wire z;
// Instantiate the Unit Under Test (UUT)
comparator uut (
.x(x),
.y(y),
.z(z)
);

initial begin
$dumpfile("test.vcd");
$dumpvars(0);
// Initialize Inputs
x = 0;
y = 0;

#20 x = 1;
#20 y = 1;
#20 y = 0;
#20 x = 1;
#40 ;

end

initial begin
$monitor("t=%3d x=%d,y=%d,z=%d \n",$time,x,y,z, );
end

endmodule

comparator.v

1
2
3
4
5
6
7
8
9
module comparator(
input x,
input y,
output z
);

assign z = (~x & ~y) |(x & y);

endmodule
1
2
$ xrun stimulus.v comparator.v -access +rwc
$ simvision test.vcd

readmemb & readmemh in Verilog

Initialize Memory in Verilog [https://projectf.io/posts/initialize-memory-in-verilog/]

$readmemh("hex_memory_file.mem", memory_array, [start_address], [end_address]) $readmemb("bin_memory_file.mem", memory_array, [start_address], [end_address])

The system task has no versions to accept octal data or decimal data.

  • The 1st argument is the data file name.
  • The 2nd argument is the array to receive the data.
  • The 3rd argument is an optional start address, and if you provide it, you can also provide
  • The 4th argument optional end address.

Note, the 3rd and 4th argument address is for array not data file.

If the memory addresses are not specified anywhere, then the system tasks load file data sequentially from the lowest address toward the highest address.

The standard before 2005 specify that the system tasks load file data sequentially from the left memory address bound to the right memory address bound.

readtest.v

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
module readfile;
reg [7:0] array4 [0:3];
reg [7:0] array7 [6:0];
reg [7:0] array12 [11:0];

integer i;

initial begin
$readmemb("data.txt", array4);
$readmemb("data.txt", array7, 2, 5);
$readmemb("data.txt", array12);

for (i = 0; i < 4; i = i+1)
$display("array4[%0d] = %b", i, array4[i]);

$display("=========================");

for (i = 0; i < 7; i = i+1)
$display("array7[%0d] = %b", i, array7[i]);

$display("=========================");

for (i = 0; i < 12; i = i+1)
$display("array12[%0d] = %b", i, array12[i]);
end
endmodule

data.txt

1
2
3
4
5
6
7
8
00000000 
00000001
00000010
00000011
00000100
00000101
00000110
00001000

result

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
array4[0] = 00000000
array4[1] = 00000001
array4[2] = 00000010
array4[3] = 00000011
=========================
array7[0] = xxxxxxxx
array7[1] = xxxxxxxx
array7[2] = 00000000
array7[3] = 00000001
array7[4] = 00000010
array7[5] = 00000011
array7[6] = xxxxxxxx
=========================
array12[0] = 00000000
array12[1] = 00000001
array12[2] = 00000010
array12[3] = 00000011
array12[4] = 00000100
array12[5] = 00000101
array12[6] = 00000110
array12[7] = 00001000
array12[8] = xxxxxxxx
array12[9] = xxxxxxxx
array12[10] = xxxxxxxx
array12[11] = xxxxxxxx

iff in SystemVerilog

system verilog中的iff, [https://www.francisz.cn/2019/07/18/sv-iff]

1
2
@(posedge clk iff(vld));
do_something;

is equivalent to

1
2
3
4
5
forever begin
@(posedge clk);
if(vld) break;
end
do_something;

iff is more efficient than if because the expression is recalculated when vld transition rather than clk.

One example, detecting the negative edge of rtr_io.cb.frameo_n[da]

1
2
3
wait(rtr_io.cb.frameo_n[da] !== 0);
@(rtr_io.cb iff(rtr_io.cb.frameo_n[da] === 0 ));
$display("[DEBUG HGUO] %0t, rtr_io.cb.frameo_n[da] negedge", $realtime);

image-20220621182019927

[DEBUG HGUO] 6887250.0ns, rtr_io.cb.frameo_n[da] negedge

signed and unsigned arithmetic in Verilog

With implict sign extension, the implementation of signed arithmetic is DIFFERENT from that of unsigned. Otherwise, their implementations are same.

The implementations manifest the RTL's behaviour correctly

add without implicit sign extension

unsigned

rtl
1
2
3
4
5
6
7
module TOP (
input wire [2:0] data0
,input wire [2:0] data1
,output wire [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114215200

synthesized netlist

image-20220507114307439

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:43:27 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

vcs compile with -v /path/to/lib.v

signed

rtl
1
2
3
4
5
6
7
module TOP (
input wire signed [2:0] data0
,input wire signed [2:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114654777

synthesized netlist

image-20220507114844111

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:48:54 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

add WITH implicit sign extension

unsigned with 0 extension

rtl
1
2
3
4
5
6
7
module TOP (
input wire [2:0] data0 // 3 bit unsigned
,input wire [1:0] data1 // 2 bit unsigned
,output wire [2:0] result // 3 bit unsigned
);
assign result = data0 + data1;
endmodule

image-20220507121521303

synthesized netlist

image-20220507121622001

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:15:58 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data1[0]), .A2(data0[0]), .Z(n6) );
ad01d0 U7 ( .A(data1[1]), .B(data0[1]), .CI(n6), .CO(n4), .S(result[1]) );
xr02d1 U8 ( .A1(data0[2]), .A2(n4), .Z(result[2]) );
nr02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n5) );
nr02d0 U10 ( .A1(n6), .A2(n5), .ZN(result[0]) );
endmodule

signed with implicit sign extension

rtl
1
2
3
4
5
6
7
module TOP (
input wire signed [2:0] data0
,input wire signed [1:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507122053948

synthesized netlist

image-20220507122217830

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:21:51 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n6, n7, n8, n9, n10;

nd02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n10) );
inv0d0 U10 ( .I(n10), .ZN(n9) );
nr02d0 U11 ( .A1(data0[1]), .A2(data1[1]), .ZN(n7) );
aor221d1 U12 ( .B1(n9), .B2(data1[1]), .C1(n10), .C2(data0[1]), .A(n7), .Z(
n6) );
xn02d1 U13 ( .A1(data0[2]), .A2(n6), .ZN(result[2]) );
ora21d1 U14 ( .B1(data1[0]), .B2(data0[0]), .A(n10), .Z(result[0]) );
aor21d1 U15 ( .B1(data1[1]), .B2(data0[1]), .A(n7), .Z(n8) );
mx02d0 U16 ( .I0(n10), .I1(n9), .S(n8), .Z(result[1]) );
endmodule

Latch Inference in Verilog

UC Berkeley CS150 Lec #20: Finite State Machines [slides]

always@( * )

always@( * ) blocks are used to describe Combinational Logic, or Logic Gates. Only = (blocking) assignments should be used in an always@( * ) block.

Latch Inference

If you DON'T assign every element that can be assigned inside an always@( * ) block every time that always@( * ) block is executed, a latch will be inferred for that element

The approaches to avoid latch generation:

  • set default values
  • proper use of the else statement, and other flow constructs

without default values

latch is generated

RTL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
endmodule
synthesized netlist

image-20220509170640006

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:09:18 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


lanhq1 C_reg ( .E(Trigger), .D(Pass), .Q(C) );
an02d0 U3 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule

add default value

Default values are an easy way to avoid latch generation

RTL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
C = 1'b1;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
synthesized netlist

image-20220509171319204

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:12:47 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


nd12d0 U5 ( .A1(Pass), .A2(Trigger), .ZN(C) );
an02d0 U6 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule

if evaluation

signed number cast to unsigned automatically before evaluating

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// tb.v
module tb;
reg signed [1:0] datasg;
reg [1:0] dataug;

initial begin
datasg = 2'b11;
dataug = 2'b11;

$display("datasg(%%d): %d", datasg);
$display("dataug(%%d): %d", dataug);

if (datasg)
$display("datasg is OK");
if (dataug)
$display("dataug is OK");

$finish();
end
endmodule
1
2
3
4
5
6
7
8
9
$ vlog tb.v
$ vsim -c -do "run;exit" work.tb
# Loading work.tb(fast)
# run
# datasg(%d): -1
# dataug(%d): 3
# datasg is OK
# dataug is OK
# ** Note: $finish : tb.v(16)

Arithmetic in Verilog

unsigned + unsigned = unsigned

1
2
3
4
5
6
7
8
9
10
function [7:0] satadd_uuu8b;   // unsigned + unsigned = unsigned
input [7:0] a;
input [7:0] b;

reg [8:0] t; // extend 1b
begin
t = {1'b0, a} + {1'b0, b};
satop_uuu16b = t[8] ? {8{1'b1}} : t[7:0];
end
endfunction

1'b1: overflow

signed + signed = signed

1
2
3
4
5
6
7
8
9
10
11
12
function [7:0] satadd_sss8b;    // signed + signed = signed
input signed [7:0] a;
input signed [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = a + b; // extend sign bit automatically
satadd_sss8b = (t[8:7] == 2'b01) ? {1'b0, 7{1'b1}} : // up sat
(t[8:7] == 2'b10) ? {1'b1, 7{1'b0}} : // dn sat
t[7:0];
end
endfunction

2'b01: overflow

2'b10: underflow

signed + unsigned = unsigned

1
2
3
4
5
6
7
8
9
10
11
12
function [7:0] satadd_suu8b;    // signed + unsigned = unsigned
input signed [7:0] a;
input [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = {a[7], a} + {1'b0, b};
satadd_ssu8b = (t[8:7] == 2'b10) ? {8{1'b1}} : // up saturate for unsigned
(t[8:7] == 2'b11) ? {8{1'b0}} : // dn saturate for unsigned
t[7:0];
end
endfunction

signed + unsigned = signed

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
function signed [7:0] satop_sus8b;    //signed +/- unsigned = signed
input signed [7:0] a;
input [7:0] b;
input plus;

reg signed [8:0] t; // extend 1b
begin
if(plus) begin
t = {a[7], a} + {1'b0, b};
satop_sus8b = (t[8:7]==2'b01) ? {1'b0, {7{1'b1}}} // up saturate for signed
: t[7:0];
end else begin
t = {a[7], a} - {1'b0, b};
satop_sus8b = (t[8:7]==2'b10) ? {1'b1, {7{1'b0}}} // dn saturate for signed
: t[7:0];
end
end
endfunction

Overflow Detection in Verilog

Overflow Detection: [http://www.c-jump.com/CIS77/CPU/Overflow/lecture.html]

  • Arithmetic operations have a potential to run into a condition known as overflow.
  • Overflow occurs with respect to the size of the data type that must accommodate the result.
  • Overflow indicates that the result was too large or too small to fit in the original data type.

Overflow when adding unsigned number

When two unsigned numbers are added, overflow occurs if

  • there is a carry out of the leftmost bit.

Overflow when adding signed numbers

When two signed 2's complement numbers are added, overflow is detected if:

  1. both operands are positive and the result is negative, or
  2. both operands are negative and the result is positive.

Notice that when operands have opposite signs, their sum will never overflow. Therefore, overflow can only occur when the operands have the same sign.

A B carryout_sum overflow
011 (3) 011 (3) 0_110 (6) overflow
100 (-4) 100 (-4) 1_000 (-8) underflow
111 (-1) 110 (-2) 1_101 (-3) -

carryout information ISN'T needed to detect overflow/underflow for signed number addition

EXTBIT:MSB

extended 1bit and msb bit can be used to detect overflow or underflow

1
2
3
4
5
6
7
8
9
10
reg signed  [1:0]      acc_inc;
reg signed [10-1:0] acc;
wire signed [10 :0] acc_w; // extend 1b for saturation
wire signed [10-1:0] acc_stat;

assign acc_w = acc + acc_inc; // signed arithmetic

assign acc_stat = (acc_w[10-1 +: 2] == 2'b01) ? {1'b0, {(10-1){1'b1}}} : // up saturation
(acc_w[10-1 +: 2] == 2'b10) ? {1'b1, {(10-1){1'b0}}} : // down saturation
acc_w[10-1:0];

2'b01 : overflow, up saturation

2'b10: underflow, down saturation

2's complement negative number

  1. Flip all bits
  2. Add 1.

N-bit signed number \[ A = -M_{N-1}2^{N-1}+\sum_{k=0}^{N-2}M_k2^k \] Flip all bits \[\begin{align} A_{flip} &= -(1-M_{N-1})2^{N-1} +\sum_{k=0}^{N-2}(1-M_k)2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -2^{N-1}+\sum_{k=0}^{N-2}2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -1 \end{align}\]

Add 1 \[\begin{align} A_- &= A_{flip}+1 \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k \\ &= -A \end{align}\]

reference

Lee, Weng Fook, Weng Fook Lee, and Glaser. Learning from VLSI design experience. Springer International Publishing, 2019.

Bevan Baas, EEC281 VLSI Digital Signal Processing, [https://www.ece.ucdavis.edu/~bbaas/281/]

One will see the following phenomena:

  1. A power spectrum is equal to the square of the absolute value of DFT.
  2. The sum of all power spectral lines in a power spectrum is equal to the power of the input signal.
  3. The integral of a PSD is equal to the power of the input signal.

power spectrum has units of \(V^2\) and power spectral density has units of \(V^2/Hz\)

Parseval's theorem is a property of the Discrete Fourier Transform (DFT) that states: \[ \sum_{n=0}^{N-1}|x(n)|^2 = \frac{1}{N}\sum_{k=0}^{N-1}|X(k)|^2 \] Multiply both sides of the above by \(1/N\): \[ \frac{1}{N}\sum_{n=0}^{N-1}|x(n)|^2 = \frac{1}{N^2}\sum_{k=0}^{N-1}|X(k)|^2 \] \(|x(n)|^2\) is instantaneous power of a sample of the time signal. So the left side of the equation is just the average power of the signal over the N samples. \[ P_{\text{av}} = \frac{1}{N^2}\sum_{k=0}^{N-1}|X(k)|^2\text{, }V^2 \] For the each DFT bin, we can say: \[ P_{\text{bin}}(k) = \frac{1}{N^2}|X(k)|^2\text{, k=0:N-1, }V^2/\text{bin} \] This is the power spectrum of the signal.

Note that \(X(k)\) is the two-sided spectrum. If \(x(n)\) is real, then \(X(k)\) is symmetric about \(fs/2\), with each side containing half of the power. In that case, we can choose to keep just the one-sided spectrum, and multiply Pbin by 2 (except DC & Nyquist):

\[ P_{\text{bin,one-sided}}(k) = \left\{ \begin{array}{cl} \frac{1}{N^2}|X(0)|^2 & : \ k = 0 \\ \frac{2}{N^2}|X(k)|^2 & : \ 1 \leq k \leq N/2-1 \\ \frac{1}{N^2}|X(N/2)|^2 & : \ k = N/2 \end{array} \right. \]

1
2
3
4
5
6
7
8
rng default
Fs = 1000;
t = 0:1/Fs:1-1/Fs;
x = cos(2*pi*100*t) + randn(size(t));
N = length(x);
xdft = fft(x);
xsq_sum_avg = sum(x.^2)/N;
specsq_sum_avg = sum(abs(xdft).^2)/N^2;

where xsq_sum_avg is same with specsq_sum_avg

For a discrete-time sequence x(n), the DFT is defined as: \[ X(k) = \sum_{n=0}^{N-1}x(n)e^{-j2\pi kn/N} \] By it definition, the DFT does NOT apply to infinite duration signals.

scaling DFT results

spectra type unit
amplitude spectrum V
power spectrum V^2
power spectrum density V^2/Hz

Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals Jens Ahrens, Carl Andersson, Patrik Höstmad, Wolfgang Kropp

[https://github.com/AppliedAcousticsChalmers/scaling-of-the-dft]

Different scaling is needed to apply for amplitude spectrum, power spectrum and power spectrum density, which shown as below

image-20220427220639646

\(f_s\) in Eq.(13) is sample rate rather than frequency resolution.

And Eq.(13) can be expressed as \[ \text{PSD}(k) =\frac{1}{f_{\text{res}}\cdot N\sum_{n}w^2(n)}\left|X_{\omega}(k)\right|^2 \] where \(f_{\text{res}}\) is frequency resolution

We define the following two sums for normalization purposes:

\[\begin{align} S_1 &= \sum_{j=0}^{N-1}w_j \\ S_2 &= \sum_{j=0}^{N-1}w_j^2 \end{align}\]

Given Eq.(12) and Eq.(13) \[\begin{align} \text{PS(k)} &= \text{PSD(k)} \cdot \frac{f_s \sum_{n}w^2(n)}{(\sum_{n}\omega(n))^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \frac{N \sum_{n}w^2(n)}{(\sum_{n}w(n))^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \frac{N S_2}{S_1^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \text{NENBW} \\ &= \text{PSD(k)} \cdot \text{ENBW} \end{align}\]

where Normalized Equivalent Noise BandWidth is defined as \[ \text{NENBW} =\frac{N S_2}{S_1^2} \] and Effective Noise BandWidth is \[ \text{ENBW} =f_{\text{res}} \cdot \frac{N S_2}{S_1^2} \]

For Rectangular window, \(\text{ENBW} =f_{\text{res}}\)

This equivalent noise bandwidth is required when the resulting spectrum is to be expressed as spectral density (such as for noise measurements).

Eq.(13) Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Fs = 1024; 
t = (0:1/Fs:1-1/Fs).';
x = sin(2*pi*t*200);
Nx = length(x);
% Window data
w = hamming(Nx); % hamming window
xw = x.*w;
% Calculate power
nfft = Nx;
X = fft(xw,nfft);

NumUniquePts = nfft/2+1;
Pxx1 = abs(X(1:NumUniquePts)).^2/sum(w.^2)/Fs; %!!! Eq.(13)
Pxx1(2:end-1) = Pxx1(2:end-1)*2; % twosided -> onesided
Fx1 = (0:NumUniquePts-1)*Fs/nfft;

[Pxx2,Fx2] = pwelch(x,w,0,nfft,Fs);

plot(Fx1,10*log10(Pxx1),Fx2,10*log10(Pxx2),'r--');
legend('PSD via Eq.(13)','PSD via pwelch')

image-20220514165525958

window effects

It is possible to correct both the amplitude and energy content of the windowed signal to equal the original signal. However, both corrections cannot be applied simultaneously

Amplitude correction

Linear (amplitude) Spectrum \[ X(k) = \frac{X_{\omega}(k)}{S_1} \] power spectrum \[ \text{PS}=\frac{\left| X_{\omega}(k) \right|^2}{S_1^2} \]

usage: tone amplitude

Power correction

power spectral density (PSD) \[ \text{PSD} =\frac{\left|X_{\omega}(k)\right|^2}{f_s\cdot S_2} \]

We have \(\text{PSD} = \frac{\text{PS}}{\text{ENBW}}\), where \(\text{ENBW}=\frac{N \cdot S_2}{S_1^2}f_{\text{res}}\)

linear power spectrum \[ \text{PS}_L=\frac{|X_{\omega}(k)|^2}{N\cdot S_2} \]

usage: RMS value, total power \[ \text{PS}_L(k)=\text{PSD(k)} \cdot f_{\text{res}} \]

Window Correction Factors

While a window helps reduce leakage (The window reduces the jumps at the ends of the repeated signal), the window itself distorts the data in two different ways:

  • Amplitude – The amplitude of the signal is reduced

    This is due to the fact that the window removes information in the signal

  • Energy – The area under the curve, or energy of the signal, is reduced

rtaImage

Window correction factors are used to try and compensate for the effects of applying a window to data. There are both amplitude and energy correction factors.

Window Type Amplitude Correction (\(K_a\)) Energy Correction (\(K_e\))
Rectangluar 1.0 1.0
hann 1.9922 1.6298
blackman 2.3903 1.8155
kaiser 1.0206 1.0204

Only the Uniform window (rectangular window), which is equivalent to no window, has the same amplitude and energy correction factors.

\[\begin{align} X(k) &= \frac{X_\omega(k)}{\sum_n w(n)} \\ &= \frac{N}{\sum_n w(n)} \frac{X_\omega(k)}{N} \\ &= K_a \cdot \frac{X_\omega(k)}{N} \end{align}\] where \(K_a = \frac{N}{\sum_n \omega(n)}\)

In literature, Coherent power gain is defined show below, which is close related to \(K_a\) \[ \text{Coherent power gain (dB)} = 20 \; log_{10} \left( \frac{\sum_n w[n]}{N} \right) \]

amplitude-correction

With amplitude correction, by multiplying by two, the peak value of both the original and corrected spectrum match. However the energy content is not the same.

The amplitude corrected signal (red) appears to have more energy, or area under the curve, than the original signal (blue).

\[\begin{align} \text{PSD} &=\frac{1}{f_{\text{res}}\cdot N\sum_{n}w^2(n)}\left|X_{\omega}(K)\right|^2 \\ &= \frac{N}{\sum_{n}w^2(n)} \frac{\left|X_{\omega}(K)\right|^2}{f_{\text{res}}\cdot N^2} \\ &=\left|K_e \cdot \frac{X_{\omega}(K)}{N}\right|^2 \cdot \frac{1}{f_{\text{res}}} \end{align}\]

wherer \(K_e = \sqrt{\frac{N}{\sum_{n}\omega^2(n)}}\)

energey-correction

Multiplying the values in the spectrum by 1.63, rather than 2, makes the area under the curve the same for both the original signal (blue) and energy corrected signal (red)

hanning's correction factors:

1
2
3
4
N = 256;
w = hanning(N);
Ka = N/sum(w)
Ke = sqrt(N/sum(w.^2))
1
2
3
4
5
6
7
8
Ka =

1.9922


Ke =

1.6298

Amplitude,Power Correction Example

image-20220518002927545

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
%% 200 Hz, amplitude 1 and 200/3 Hz amplitude 0.3
fosc = 200;
Fs = 8192;
Nx = Fs*4;
tstop = Nx/Fs;
t = (0:1/Fs:tstop-1/Fs).';
x = sin(2*pi*t*fosc) + 0.3*sin(2*pi*t*fosc/3);
Nx = length(x);

%% direct method, time domain
P0 = sum(x.^2)/Nx;
xrms0 = sqrt(P0); % RMS value caculated by time domain
fprintf("xrms0: %.4f\n", xrms0);

%% Window data
w = hamming(Nx); % hamming window
%w = ones(Nx, 1); % rectangular window

S1= sum(w);
S2 = sum(w.^2);
NENBW = Nx*S2/S1^2;

xw = x.*w;
nfft = Nx;
X = fft(xw,nfft); % windowed signal fft
NumUniquePts = nfft/2+1;
Fx1 = (0:NumUniquePts-1)*Fs/nfft;

%% fft - ps -> rms
Xmag = abs(X);
PS1 = Xmag.^2/S1^2; %!!! Eq.(12)
PS1 = PS1(1:NumUniquePts);
PS1(2:end-1) = PS1(2:end-1)*2; % twosided -> onesided
Pmax_fft = max(PS1); % !!! amplitude OK
fprintf("Pmax_fft: %.4f\n", Pmax_fft);

% linear power spectrum
PSL_fft = Xmag.^2/nfft/S2;
PSL_fft = PSL_fft(1:NumUniquePts);
PSL_fft = PSL_fft(2:end-1)*2;
Ptot_fft1 = sum(PSL_fft);
xrms_PSL = sqrt(Ptot_fft1); % !!! power OK
fprintf("xrms_PSL: %.4f\n", xrms_PSL);

% use psd*fres to calculate total power
Pxx1 = abs(X(1:NumUniquePts)).^2/sum(w.^2)/Fs; %!!! Eq.(13)
Pxx1(2:end-1) = Pxx1(2:end-1)*2; % twosided -> onesided
Ptot_fft2 = sum(Pxx1*Fs/nfft);
xrms_psd = sqrt(Ptot_fft2); % !!! power OK
fprintf("xrms_psd: %.4f\n", xrms_psd);


%% pwelch
[Pxx2,Fx2] = pwelch(x,w,0,nfft,Fs);
fres = Fx2(2)-Fx2(1);

PS2 = Pxx2*fres;
Ptot_pwelch = sum(PS2);
xrms_pwelch = sqrt(Ptot_pwelch); % !!! amplitude OK
fprintf("xrms_pwelch: %.4f\n", xrms_pwelch);

% PS = PSD * ENBW always hold
ENBW = fres * NENBW;
PS2_pwelch = Pxx2*ENBW;
Pmax_pwelch = max(PS2_pwelch); % !!! amplitude OK
fprintf("Pmax_pwelch: %.4f\n", Pmax_pwelch);

%% plot psd of two methods
plot(Fx1,10*log10(Pxx1),Fx2,10*log10(Pxx2),'r--');
legend('PSD via Eq.(13)','PSD via pwelch')
grid on;
xlabel('Freq (Hz)');
ylabel('dB; V^2/Hz')
1
2
3
4
5
6
xrms0: 0.7383
Pmax_fft: 0.5000
xrms_PSL: 0.7382
xrms_psd: 0.7382
xrms_pwelch: 0.7382
Pmax_pwelch: 0.5000

RMS

We may also want to know for example the RMS value of the signal, in order to know how much power the signal generates. This can be done using Parseval’s theorem.

For a periodic signal, which has a discrete spectrum, we obtain its total RMS value by summing the included signals using \[ x_{\text{rms}}=\sqrt{\sum R_{xk}^2} \] Where \(R_{xk}\) is the RMS value of each sinusoid for \(k=1,2,3,...\) The RMS value of a signal consisting of a number of sinusoids is consequently equal to the square root of the sum of the RMS values.

This result could also be explained by noting that sinusoids of different frequencies are orthogonal, and can therefore be summed like vectors (using Pythagoras’ theorem)

For a random signal we cannot interpret the spectrum in the same way. As we have stated earlier, the PSD of a random signal contains all frequencies in a particular frequency band, which makes it impossible to add the frequencies up. Instead, as the PSD is a density function, the correct interpretation is to sum the area under the PSD in a specific frequency range, which then is the square of the RMS, i.e., the mean-square value of the signal \[ x_{\text{rms}}=\sqrt{\int G_{xx}(f)df}=\sqrt{\text{area under the curve}} \] The linear spectrum, or RMS spectrum, defined by \[\begin{align} X_L(k) &= \sqrt{\text{PSD(k)} \cdot f_{\text{res}}}\\ &=\sqrt{\frac{\left|X_{\omega}(k)\right|^2}{f_{\text{res}}\cdot N\sum_{n}\omega^2(n)} \cdot f_{\text{res}}} \\ &= \sqrt{\frac{\left|X_{\omega}(k)\right|^2}{N\sum_{n}\omega^2(n)}} \\ &= \sqrt{\frac{|X_{\omega}(k)|^2}{N\cdot S_2}} \end{align}\]

The corresponding linear power spectrum or RMS power spectrum can be defined by \[\begin{align} \text{PS}_L(k)&=X_L(k)^2=\frac{|X_{\omega}(k)|^2}{S_1^2}\frac{S_1^2}{N\cdot S_2} \\ &=\text{PS}(k) \cdot \frac{S_1^2}{N\cdot S_2} \end{align}\]

So, RMS can be calculated as below \[\begin{align} P_{\text{tot}} &= \sum \text{PS}_L(k) \\ \text{RMS} &= \sqrt{P_{\text{tot}}} \end{align}\]

DFT averaging

we use \(N= 8*\text{nfft}\) time samples of \(x\) and set the number of overlapping samples to \(\text{nfft}/2 = 512\). pwelch takes the DFT of \(\text{Navg} = 15\) overlapping segments of \(x\), each of length \(\text{nfft}\), then averages the \(|X(k)|^2\) of the DFT’s.

In general, if there are an integer number of segments that cover all samples of N, we have \[ N = (N_{\text{avg}}-1)*D + \text{nfft} \] where \(D=\text{nfft}-\text{noverlap}\). For our case, with \(D = \text{nfft}/2\) and \(N/\text{nfft} = 8\), we have \[ N_{\text{avg}}=2*N/\text{nfft}-1=15 \] For a given number of time samples N, using overlapping segments lets us increase \(N_{avg}\) compared with no overlapping. In this case, overlapping of 50% increases \(N_{avg}\) from 8 to 15. Here is the Matlab code to compute the spectrum:

1
2
3
4
5
6
7
8
nfft= 1024;
N= nfft*8; % number of samples in signal
n= 0:N-1;
x= A*sin(2*pi*f0*n*Ts) + .1*randn(1,N); % 1 W sinewave + noise
noverlap= nfft/2; % number of overlapping time samples
window= hanning(nfft);
[pxx,f]= pwelch(x,window,noverlap,nfft,fs); % W/Hz PSD
PdB_bin= 10*log10(pxx*fs/nfft); % dBW/bin

figure 3_99153

DFT averaging reduces the variance \(\sigma^2\) of the noise spectrum by a factor of \(N_{avg}\), as long as noverlap is not greater than nfft/2

image-20220420013255660

image-20230526011628682

fftshift

The result of fft and its index is shown as below

complexDFT_Matlab_index

After fftshift

FFTshift_Matlab_index_resolved

1
2
3
4
5
>> fftshift([0 1 2 3 4 5 6 7])

ans =

4 5 6 7 0 1 2 3

image-20250825230510355

1
2
3
4
5
6
7
8
9
clear;
N = 100;
fs = 1000;
fx = 100;
x = cos(2*pi*fx/fs*[0:1:N-1]);
s = abs(fftshift(fft(x)))/N;
fx = linspace(0,N-1,N)*fs/N -fs/2;
plot(fx, s, 'o-', 'linewidth', 2);
grid on; grid minor;

dft and psd function in virtuoso

  • dft always return

    image-20230525011604685

To compensate windowing effect, \(W(n)\), the dft output should be multiplied by \(K_a\), e.g. 1.9922 for hanning window.

  • psd function has taken into account \(K_e\), postprocessing is not needed

reference

Heinzel, Gerhard, A. O. Rüdiger and Roland Schilling. "Spectrum and spectral density estimation by the Discrete Fourier transform (DFT), including a comprehensive list of window functions and some new at-top windows." (2002). URL: https://holometer.fnal.gov/GH_FFT.pdf

Demonstration-DFT-PS-PSD

mathworks, Power Spectral Density Estimates Using FFT

Lyons, Richard, “Widowing Functions Improve FFT Results”, EDN, June 1, 1998. https://www.edn.com/electronics-news/4383713/Windowing-Functions-Improve-FFT-Results-Part-I

Rapuano, Sergio, and Harris, Fred J., An Introduction to FFT and Time Domain Windows, IEEE Instrumentation and Measurement Magazine, December, 2007. https://ieeexplore.ieee.org/document/4428580

Neil Robertson, The Power Spectrum

—, Use Matlab Function pwelch to Find Power Spectral Density – or Do It Yourself

—, Evaluate Window Functions for the Discrete Fourier Transform

—. Add a Power Marker to a Power Spectral Density (PSD) Plot [https://www.dsprelated.com/showarticle/1387.php]

Mathuranathan Viswanathan, Interpret FFT, complex DFT, frequency bins & FFTShift

robert bristow-johnson, Does windowing affect Parseval's theorem?

Window Correction Factors URL: https://community.sw.siemens.com/s/article/window-correction-factors

Brandt, A, 2011. Noise and vibration analysis: signal analysis and experimental procedures. John Wiley & Sons

What should be the correct scaling for PSD calculation using fft. URL:https://dsp.stackexchange.com/a/32205

POWER SPECTRAL DENSITY CALCULATION VIA MATLAB Revision C. URL: http://www.vibrationdata.com/tutorials_alt/psd_mat.pdf

Jens Ahrens, Carl Andersson, Patrik Höstmad, Wolfgang Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals” in 148th Convention of the AES, e-Brief 56, May 2020 [ pdf, web ].

Manolakis, D., & Ingle, V. (2011). Applied Digital Signal Processing: Theory and Practice. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511835261

  • Laplace transform
    • a generalization of the continuous-time Fourier transform
    • converts integro-differential equations into algebraic equations
  • z-transforms
    • a generalization of the discrete-time Fourier transform
    • converts difference equations into algebraic equations

system function, transfer function: \(H(s)\)

frequency response: \(H(j\omega)\), if the ROC of \(H(s)\) includes the imaginary axis, i.e.\(s=j\omega \in \text{ROC}\)

Laplace Transform

To specify the Laplace transform of a signal, both the algebraic expression and the ROC are required. The ROC is the range of values of \(s\) for the integral of \(t\) converges

bilateral Laplace transform

image-20241001140721422

where \(s\) in the ROC and \(\mathfrak{Re}\{s\}=\sigma\)

The formal evaluation of the integral for a general \(X(s)\) requires the use of contour integration in the complex plane. However, for the class of rational transforms, the inverse Laplace transform can be determined without directly evaluating eq. (9.56) by using the technique of partial fraction expansion

ROC Property

The range of values of s for which the integral in converges is referred to as the region of convergence (which we abbreviate as ROC) of the Laplace transform

image-20241001133645466

image-20241001152307222

image-20241001153011746

i.e. no pole in RHP for stable LTI sytem

System Causality

For a causal LTI system, the impulse response is zero for \(t \lt 0\) and thus is right sided

image-20241001164006698

causality implies that the ROC is to the right of the rightmost pole, but the converse is not in general true, unless the system function is rational

System Stability

The system is stable, or equivalently, that \(h(t)\) is absolutely integrable and therefore has a Fourier transform, then the ROC must include the entire \(j\omega\)-axis

image-20241001223128791

image-20241001163547031

all of the poles have negative real parts

Unilateral Laplace transform

analyzing causal systems and, particularly, systems specified by linear constant-coefficient differential equations with nonzero initial conditions (i.e., systems that are not initially at rest)

image-20241001180202348

A particularly important difference between the properties of the unilateral and bilateral transforms is the differentiation property \(\frac{d}{dt}x(t)\)

Laplace Transform
Bilateral Laplace Transform \(sX(s)\)
Unilateral Laplace Transform \(sX(s)-x(0^-)\)

image-20241001183543726

Integration by parts for unilateral Laplace transform

image-20241001184124032

in Bilateral Laplace Transform

image-20241001231719320

In fact, the initial- and final-value theorems are basically unilateral transform properties, as they apply only to signals \(x(t)\) that are identically \(0\) for \(t \lt 0\).

\(z\)-Transform

The \(z\)-transform for discrete-time signals is the counterpart of the Laplace transform for continuous-time signals

image-20241001204402907

where \(z=re^{j\omega}\)

image-20241001204810918

The \(z\)-transform evaluated on the unit circle corresponds to the Fourier transform

image-20241001223944822

ROC Property

image-20241001222148012

system stability

The system is stable, or equivalently, that \(h[n]\) is absolutely summable and therefore has a Fourier transform, then the ROC must include the unit circle

\(z\)-Transform Table

image-20250622105429578

[http://www.ws.binghamton.edu/Fowler/fowler%20personal%20page/EE301_files/ZT%20Tables_rev2.pdf]

Unilateral \(z\)-Transform

image-20241001225603923

The time shifting property is different in the unilateral case because the lower limit in the unilateral transform definition is fixed at zero, \(x[n-n_0]\)

bilateral \(z\)-transform

image-20241001232749554

unilateral \(z\)-transform

image-20241001233014484

Initial rest condition

image-20241001233548788

Initial Value Theorem & Final Value Theorem

Laplace Transform

Two valuable Laplace transform theorem

  • Initial Value Theorem, which states that it is always possible to determine the initial value of the time function \(f(t)\) from its Laplace transform \[ \lim _{s\to \infty}sF(s) = f(0^+) \]

  • Final Value Theorem allows us to compute the constant steady-state value of a time function given its Laplace transform \[ \lim _{s\to 0}sF(s) = f(\infty) \]

    If \(f(t)\) is step response, then \(f(0^+) = H(\infty)\) and \(f(\infty) = H(0)\), where \(H(s)\) is transfer function

\(z\)-transform

Initial Value Theorem \[ f[0]=\lim_{z\to\infty}F(z) \] final value theorem \[ \lim_{n\to\infty}f[n]=\lim_{z\to1}(z-1)F(z) \]

Coert Vonk. Initial/final value proofs [https://coertvonk.com/physics/lfz-transforms/z/initial-final-value-proofs-31543]

\(s\)- and \(z\)-Domains Connection

discrete-time systems also can be analyzed by means of the Laplace transform — the \(z\)-transform is the Laplace transform in disguise and that discrete-time systems can be analyzed as if they were continuous-time systems.

image-20241002112611432

A continuous-time system with transfer function \(H(s)\) that is identical in structure to the discrete-time system \(H[z]\) except that the delays in \(H[z]\) are replaced by elements that delay continuous-time signals

delay element Time domain Transform
\(z\)-transform \(\delta[n-1]\) \(z^{-1}\)
Laplace transform \(\delta(t-T)\) \(e^{-sT}\)
\(z\)-transform Laplace transform
\(x[n]\) \(X[z]=\sum_{n=0}^{\infty}x[n]z^{-n}\) \(\bar{x}(t)=\sum_{n=0}^{\infty}x[n]\delta(t-nT)\) \(\overline{X}(s)=\sum_{n=0}^{\infty}x[n]e^{-snT}\)
\(y[n]\) \(Y[z]=\sum_{n=0}^{\infty}y[n]z^{-n}\) \(\bar{y}(t)=\sum_{n=0}^{\infty}y[n]\delta(t-nT)\) \(\overline{Y}(s)=\sum_{n=0}^{\infty}y[n]e^{-snT}\)
\(h[n]\) \(H[z]=\sum_{n=0}^{\infty}h[n]z^{-n}\) \(\bar{h}(t)=\sum_{n=0}^{\infty}h[n]\delta(t-nT)\) \(\overline{H}(s)=\sum_{n=0}^{\infty}h[n]e^{-snT}\)
\(y[n]=x[n]*h[n]\) \(Y[z]=X[z]H[z]\) \(\bar{y}(t)=\bar{x}(t)*\bar{h}(t)\) \(\overline{Y}(s)=\overline{X}(s)\overline{H}(s)\)

Therefore, we obtain

\[\begin{align} \sum_{n=0}^{\infty}y[n]z^{-n} &=\sum_{n=0}^{\infty}x[n]z^{-n}\cdot\sum_{n=0}^{\infty}h[n]z^{-n} \\ \sum_{n=0}^{\infty}y[n]e^{-snT} &=\sum_{n=0}^{\infty}x[n]e^{-snT}\cdot \sum_{n=0}^{\infty}h[n]e^{-snT} \end{align}\]

The \(z\)-transform of a sequence can be considered to be the Laplace transform of impulses sampled train with a change of variable \(z = e^{sT}\) or \(s = \frac{1}{T}\ln z\)

Note that the transformation \(z = e^{sT}\) transforms the imaginary axis in the \(s\) plane (\(s = j\omega\)) into a unit circle in the \(z\) plane (\(z = e^{sT} = e^{j\omega T}\), or \(|z| = 1\))

Note: \(\bar{h}(t)\) is the impulse sampled version of \(h(t)\)


Note \(\bar{h}(t)\) is impulse sampled signal, whose CTFT is scaled by \(\frac{1}{T}\) of continuous signal \(h(t)\), \(\overline{H}[e^{sT}]=\overline{H}(e^{sT})\) is the approximation of continuous time system response, for example summation \(\frac{1}{1-z^{-1}}\) \[ \frac{1}{1-z^{-1}} = \frac{1}{1-e^{-sT}} \approx \frac{1}{j\omega \cdot T} \] And we know transform of integral \(u(t)\) is \(\frac{1}{s}\), as expected there is ratio \(T\)


Staszewski, Robert Bogdan, and Poras T. Balsara. All-digital frequency synthesizer in deep-submicron CMOS. John Wiley & Sons, 2006

image-20250807205854057


D. Sundararajan. Signals and Systems A Practical Approach Second Edition, 2023

image-20250809233806280

Frequency Response of Discrete-Time LTI Systems

Eugenio Schuster, Lehigh University, Discrete-Time Signal Analysis in the Frequency-Domain [https://www.lehigh.edu/~eus204/teaching/ME450_SIRC/notes/DTS_Tutorial.pdf]

image-20250907091547262

image-20250907091735655

impulse invariance

A straightforward and useful relationship between the continuous-time impulse response \(h_c(t)\) and the discrete-time impulse response \(h[n]\)

image-20241002133303153

\[ h[n] = Th_c(nT) \]

and \(T\) is chosen such that

\[ H_c(j\omega)=0, \space\space |\hat{\omega}| \ge \frac{\pi}{T} \]

When \(h[n]\) and \(h_c(t)\) are related through the above equation, i.e., the impulse response of the discrete-time system is a scaled, sampled version of \(h_c(t)\), the discrete-time system is said to be an impulse-invariant version of the continuous-time system

we have \[ H(e^{j\hat{\omega}}) = H_c\left(j\frac{\hat{\omega}}{T}\right),\space\space |\hat{\omega}| \lt \pi \]

  • \(h[n] = Th_c(nT)\) and \(T\to 0\)

    guarantees \(y_c(nT) = y_r(nT)\), i.e. output equivalence only at the sampling instants

  • \(H_c(j\Omega)\) is bandlimited and \(T \lt 1/2f_h\)

    guarantees \(y_c(t) = y_r(t)\)

The scaling of \(T\) can alternatively be thought of as a normalization of the time domain, that is average impulse response shall be same i.e., \(h[n]\times 1 = h(nT)\times T\)

image-20250807202725879


note that the impulse-invariant transform is not appropriate for high-pass responses, because of aliasing errors

image-20250622082500575

⭐ bandlimited

image-20250621214228274

bandlimited

image-20250621214307332


image-20250809112858954


Scale the Result if impulse not multiplied with \(T_s\)

impulse_Ts.drawio

[https://www.mathworks.com/matlabcentral/answers/1582704-how-to-convolve-signal-and-transfer-function#comment_1825159]

[https://www.google.com.hk/search?q=matlab+continuous+transfer+function+in+time+domain+convolution]

1
2
3
4
5
6
7
8
9
t = linspace(0, 10);
dt = t(2) - t(1);
%u = @(t) heaviside(t); % by default, heavside(0) = 1/2, but at t = 0, we should have u(0) = 1 by definition of unit step
u = @(t) (t>=0);
Vin = @(t) u(t)-u(t-3);
h = @(t) ( exp(-2*t)/6 ) + ( 3/16*exp(-t) ) - ( exp(-5*t)/48 );
y = conv(Vin(t), h(t))*dt; % must scale by dt
plot(t,y(1:numel(t)),'-x')
legend('wt - Laplace','wt - convolution integral','wt - convolution sum')

Transfer function from sampled impulse response

continuous-time filter designs to discrete-time designs through techniques such as impulse invariance

useful functions

  • using fft

    The outputs of the DFT are samples of the DTFT

  • using freqz

    modeling as FIR filter, and the impulse response sequence of an FIR filter is the same as the sequence of filter coefficients, we can express the frequency response in terms of either the filter coefficients or the impulse response

    fft is used in freqz internally

freqz method is straightforward, which apply impulse invariance criteria. Though fft is used for signal processing mostly,


image-20241002082937903

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
clear all;
clc;

%% continuous system
s = tf('s');
h = 2*pi/(2*pi+s); % First order lowpass filter with 3-dB frequency 1Hz
[mag, phs, wout] = bode(h);
fct = wout(:)/2/pi;
Hct_dB = 20*log10(mag(:));


fstep = 0.01; % freq resolution
fnyqst = 1000;
Ts = 1/(2*fnyqst);
Fs = 1/Ts; % sampling freq
Ns = ceil(Fs/fstep); % samping points
fstep = Fs/Ns; % update fstep
t = (0:Ns-1)*Ts; % sampling time points

y = impulse(h, t); % impulse resp

%% modelling as discrete system
Y = fft(y); % dft
Hfft = Y * Ts; % !!! multiply Ts
Hfft_dB = 20*log10(abs(Hfft(1:Ns/2+1)));
ffft = (1:Ns/2+1)*fstep - fstep;


[Hfir, ffir] = freqz(y, 1, [], 1/Ts); % modelling as FIR
Hfir = Hfir * Ts; % !!! multiply Ts
Hfir_dB = 20*log10(abs(Hfir));

%% plot
semilogx(fct, Hct_dB, '-ok', ffft, Hfft_dB, 'r.-', ffir, Hfir_dB, 'b--');
legend('bode(s)', 'fft', 'FIR model')
xlabel('Freq(Hz)');
ylabel('dB');
xlim([1e-2 1e2]);
grid on;
title('frequency response of different methods');

!!! only multiply Ts

tf_impulse.drawio


Neil Robertson. The Discrete Fourier Transform as a Frequency Response [https://www.dsprelated.com/showarticle/1498.php]

H(k) by using complex exponentials is indeed identical to the DFT of h(n)

H(f) by DFT -> fft

H(k) by using complex exponential -> freqz

image-20250907105453920

bandwidth from pulse response

L.W. Couch, Digital and Analog Communication Systems, 8th Edition, Pearson, 2013. [pdf]

Generating Basic signals – Rectangular Pulse and Power Spectral Density using FFT

Convolution Property of the Fourier Transform \[ x(t)*h(t)\longleftrightarrow X(\omega)H(\omega) \] pulse response can be obtained by convolve impulse response with UI length rectangular \[ H(\omega) = \frac{Y_{\text{pulse}}(\omega)}{X_{\text{rect}}(\omega)} = \frac{Y_{\text{pulse}}(\omega)}{\text{sinc}(\omega)} \]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
% Convolution Property of the Fourier Transform
% pulse(t) = h(t) * rect(t)
% -> fourier transform
% PULSE = H * RECT
% FT(RECT) = sinc
% H = PULSE/RECT = PULSE/sinc
xx = pi*ui.*w(1:plt_num);
y_sinc = ui.*sin(xx)./xx;
y_sinc(1) = y_sinc(2);
y_sinc = y_sinc/y_sinc(1); % we dont care the absoulte gain
h_ban1 = abs(h(1:plt_num))./abs(y_sinc);

h_dB = 20*log10(abs(h_ban1));
[hmin, Index] = min(abs(h_dB +3 ));
f_3dB = w(Index);
f_3dB = f_3dB/1e9;

image-20220520223115184

image-20220520223621924

sinc function

Notice that the complete definition of \(\operatorname{sinc}\) on \(\mathbb R\) is \[ \operatorname{Sa}(x)=\operatorname{sinc}(x) = \begin{cases} \frac{\sin x}{x} & x\ne 0, \\ 1, & x = 0, \end{cases} \] which is continuous.

image-20220521112425477

image-20220521112722838

To approach to real spectrum of continuous rectangular waveform, \(\text{NFFT}\) has to be big enough.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
clear;
close all;
clc;

fs=500; %sampling frequency
Ts = 1/fs;
T=0.2; %width of the rectangule pulse in seconds

figure(1)
hold on;
t=-0.5:1/fs:0.5; %time base
x=rectpuls(t,T); %generating the square wave
sum(x>0.5)
stem(t,x,'--k');
plot(t, x, 'b.-')
xstart = T/2-Ts/2;
xend = -T/2-Ts/2;
plot([xstart, xstart], [-1, 1.02], 'r--');
plot([xend, xend], [-1, 1.02], 'r--');
plot([-0.11, 0.11], [1, 1], 'r--')
hold off;
grid on;
title(['Rectangular Pulse width=', num2str(T),'s']);
xlabel('Time(s)');
ylabel('Amplitude');
xlim([-0.12, 0.12]);
ylim([-0.05, 1.05]);

figure(2)
hold on
Titer = [1, 4, 16, 32, 64, 128];
color = {'k', 'g', 'r', 'm', 'c', 'b'};
formatSpec = 'NFFT=%d';
for k=1:length(Titer)
t=-Titer(k):Ts:Titer(k); %time base
x=rectpuls(t,T); %generating the square wave

L=length(x);
Np = nextpow2(L)-1;
NFFT = 2^Np;
X = fftshift(fft(x,NFFT)); %FFT with FFTshift for both negative & positive frequencies
f = fs*(-NFFT/2:NFFT/2-1)/NFFT; %Frequency Vector
Xc = abs(X)*Ts; % continuous signal spectrum
plot(f, Xc, "Color",color{k}, 'LineWidth',2);
legend_str{k} = [num2str(NFFT, formatSpec)];
end
hold off
legend(legend_str);
title('Magnitude of X_c (original continuous signal)');
xlabel('Frequency (Hz)')
ylabel('Magnitude |X(f)|');
grid on;

FIR Equalization

Frequency Response

image-20220322093428287 \[ z = e^{j\omega T_s} \]

Unit impulse

filter coefficients are [-0.131, 0.595, -0.274] and sampling period is 100ps

image-20220428125454912

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
%% Frequency response
w = [-0.131, 0.595, -0.274];
Ts = 100e-12;
[mag, w] = freqz(w, 1, [], 1/Ts);
subplot(2, 1, 1)
plot(w/1e9, 20*log10(abs(mag)));
xlabel('Freq(GHz)');
ylabel('dB');
grid on;
title('frequency response');

%% unit impulse response from transfer function
subplot(2, 1, 2)
z = tf('z', Ts);
h = -0.131 + 0.595*z^(-1) -0.274*z^(-2);
[y, t] = impulse(h);
stem(t*1e10, y*Ts); % !!! y*Ts is essential
grid on;
title("unit impulse response");
xlabel('Time(\times 100ps)');
ylabel('mag');

impulse:

For discrete-time systems, the impulse response is the response to a unit area pulse of length Ts and height 1/Ts, where Ts is the sample time of the system. (This pulse approaches \(\delta(t)\) as Ts approaches zero.)

Scale output:

Multiply impulse output with sample period Ts in order to correct 1/Ts height of impulse function.

PSD transformation

If we have power spectrum or power spectrum density of both edge's absolute jitter (\(x(n)\)) , \(P_{\text{xx}}\)

Then 1UI jitter is \(x_{\text{1UI}}(n)=x(n)-x(n-1)\), and Period jitter is \(x_{\text{Period}}(n)=x(n)-x(n-2)\), which can be modeled as FIR filter, \(H(\omega) = 1-z^{-k}\), i.e. \(k=1\) for 1UI jitter and \(k=2\) Period jitter \[\begin{align} P_{\text{xx}}'(\omega) &= P_{\text{xx}}(\omega) \cdot \left| 1-z^{-k} \right|^2 \\ &= P_{\text{xx}}(\omega) \cdot \left| 1-(e^{j\omega T_s})^{-k} \right|^2 \\ &= P_{\text{xx}}(\omega) \cdot \left| 1-e^{-j\omega T_s k} \right|^2 \end{align}\]

image-20220519172239916

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
clear all
close all
clc

xf_fs = 0:0.01:0.5;
k = 1;
H_1UI = 1 - exp(-1i*2*pi*xf_fs);
HH_1UI = abs(H_1UI).^2;
subplot(2, 1, 1);
plot(xf_fs, HH_1UI);
grid on;
xlabel('Freq');
ylabel('|H|^2')
title('Weight for 1UI jitter');

k = 2;
H_period = 1 - exp(-1i*2*pi*xf_fs);
HH_period = abs(H_period).^2;
subplot(2, 1, 2)
plot(xf_fs, HH_period);
grid on;
xlabel('Freq');
ylabel('|H|^2')
title('Weight for Period jitter');

image-20220709104127384 \[ x(t-\Delta T)\overset{FT}{\longrightarrow} X(s)e^{-\Delta T \cdot s} \]

Bilinear Transform

The design of IIR filters (cont.) [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/cc00ac6d468dc9dcf2238fc1d1a194d4_lecture_19.pdf]

image-20250907112138651

image-20250907112553846


[https://tttapa.github.io/Pages/Mathematics/Systems-and-Control-Theory/Digital-filters/Discretization/Bilinear-transform.html]

an algebraic transformation between the variables \(s\) and \(z\) that maps the entire imaginary \(j\Omega\)-axis in the \(s\)-plane to one revolution of the unit circle in the \(z\)-plane

\[\begin{align} z &= \frac{1+s\frac{T_s}{2}}{1-s\frac{T_s}{2}} \\ s &= \frac{2}{T_s}\cdot \frac{1-z^{-1}}{1+z^{-1}} \end{align}\]

where \(T_s\) is the sampling period

image-20241125001024136

frequency warping:

image-20241125002523844

  • The bilinear transformation avoids the problem of aliasing encountered with the use of impulse invariance, because it maps the entire imaginary axis of the \(s\)-plane onto the unit circle in the \(z\)-plane

  • impulse invariance cannot be used to map highpass continuous-time designs to high pass discrete-time designs, since highpass continuous-time filters are not bandlimited

  • Due to nonlinear warping of the frequency axis introduced by the bilinear transformation, bilinear transformation applied to a continuous-time differentiator will not result in a discrete-time differentiator. However, impulse invariance can be applied to bandlimited continuous-time differentiator

    The feature of the frequency response of discrete-time differentiators is that it is linear with frequency


The simple approximation \(z=e^{sT}\approx1+sT\), the first equal come from impulse invariance essentially

image-20241024230308374

image-20241026230904395


Stephen Roberts. Lecture 3 - Design of Digital Filters [https://www.robots.ox.ac.uk/~sjrob/Teaching/B14_SP/b14_sp_lect3.pdf]

Design of IIR Filters From Analog Filters

Modeling a Continuous-Time System with Matlab

Because the mapping between the continuous (\(s\)-plane) and discrete domains (\(z\)-plane) cannot be done exactly, the various design methods are at best approximations

image-20250623205733778

Neil Robertson. Modeling a Continuous-Time System with Matlab [https://www.dsprelated.com/showarticle/1055.php]

—. Modeling Anti-Alias Filters [https://www.dsprelated.com/showarticle/1418.php]

impinvar [https://www.mathworks.com/help/signal/ref/impinvar.html]

bilinear [https://www.mathworks.com/help/signal/ref/bilinear.html]

filter [https://www.mathworks.com/help/matlab/ref/filter.html]


The design of IIR filters [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/14f29be83ac38f11eb1a1bd5fdb6f5ea_lecture_18.pdf]

The design of IIR filters (cont.) [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/cc00ac6d468dc9dcf2238fc1d1a194d4_lecture_19.pdf]

Lecture 19: Design of IIR Filters [http://smartdata.ece.ufl.edu/eee5502/2020_fall/media/2020_eee5502_slides28.pdf]

Derivatives Approximation

Perhaps the simplest method for low-order systems is to use backward-difference approximation to continuous domain derivatives

image-20250623205828010

Note this approximation is not same with impulse invariance, e.g. \(\frac{1}{s^3} \to \frac{T^3z(z+1)}{2(-1)^3}\) employing impulse invariance


\[ 1- z^{-1} = 1-e^{-j\Omega T} = 1-\cos(\omega T) + j\sin(\Omega T) \approx 1-1+j\Omega T = s T \]

That is

\[ s \approx \frac{1-z^{-1}}{T} \]


image-20250907140610175


Suppose we wish to make a discrete-time filter based on a prototype first-order low-pass filter \[ H_p(s) = \frac{1}{s\tau + 1} \] The differential equation describing this filter is \[ \tau\frac{dy}{dt} + y = x \] then differential equation gives

\[ \tau\frac{y_n-y_{n-1}}{T} + y_n=x_n \] or \[ (\frac{\tau}{T}+1)y_n - \frac{\tau}{T}y_{n-1} = x_n \] The transfer function is \[ H(z) = \frac{\frac{T}{T+\tau}}{1-\frac{\tau}{T+\tau}z^{-1}} \]

or substitute \(s\) with \(\frac{1-z^{-1}}{T}\) into \(H_p(s)\) \[ H_p(z) = \frac{1}{\tau\frac{1-z^{-1}}{T} + 1} = \frac{\frac{T}{T+\tau}}{1-\frac{\tau}{T+\tau}z^{-1}}= \frac{\alpha}{1 +(\alpha -1)z^{-1}} \] where \(\alpha = \frac{T}{\tau+T}\)

Further, if time constants much longer than the time step \(\tau \gg T\) \[\begin{align} \frac{T}{T+\tau}& = \frac{T}{\tau}\cdot \frac{\tau}{T+\tau}\approx \frac{T}{\tau} \\ \frac{\tau}{T+\tau} &= \frac{\tau - T}{\tau}\cdot\frac{\tau^2}{\tau^2-T^2} \approx \frac{\tau - T}{\tau} = 1- \frac{T}{\tau} \end{align}\]

Then the first-order low pass filter \[ H_p(z) \approx \frac{ \frac{T}{\tau}}{1 +( \frac{T}{\tau} -1)z^{-1}} = \frac{\beta}{1 +(\beta -1)z^{-1}} \] where \(\beta = \frac{T}{\tau}\)

1
2
3
4
5
6
7
8
9
10
11
12
13
# https://www.dsprelated.com/showarticle/1517/return-of-the-delta-sigma-modulators-part-1-modulation
# https://www.embeddedrelated.com/showarticle/779.php
def show_dsmod_samplewave(t,x,dsmod,args=(1,),tau=0.05, R=1, fig=None,
return_handles=False, filter_dsmod=False):
dt = t[1]-t[0]
if filter_dsmod:
x1 = dsmod(x, *args,R=R, modulate=False)
else:
x1 = x
y = dsmod(x,*args,R=R)
a = dt/tau
yfilt = scipy.signal.lfilter([a],[1,a-1],y)
xfilt = scipy.signal.lfilter([a],[1,a-1],x1)


forward-difference approximation

模集王小桃. 连续时间滤波器到离散时间滤波器的映射 Mapping Continuous-Time Filters to Discrete-Time Filters [https://zhuanlan.zhihu.com/p/1961811157099746349]

image-20251207221104376

Matched z-Transform (Root Matching)

The matched Z-transform method, also called the pole-zero mapping or pole-zero matching method, and abbreviated MPZ or MZT

\[ z = e^{sT} \]

image-20250623213116836

image-20250623214002049


image-20250623220318339

1
2
3
4
5
6
7
8
9
10
% https://www.dsprelated.com/showarticle/1642.php

%II One-pole RC filter model
Ts= 1/fs;
Wc= 1/(R*C); % rad -3 dB frequency
fc= Wc/(2*pi); % Hz -3 dB frequency
a1= -exp(-Wc*Ts);
b0= 1 + a1; % numerator coefficient
a= [1 a1]; % denominator coeffs
y_filt= filter(b0,a,y); % filter the DAC's output signal y

Impulse Invariance (impinvar)

  • To extend the accurate frequency range, we would need to increase the sample rate
  • impulse-invariant transform is not appropriate for high-pass responses, because of aliasing errors

Note \(h[n] = Th_c(nT)\), Multiply the analog impulse response by this gain to enable meaningful comparison (other response, like step, the amplitude correction is not needed)

image-20250622091049131


image-20250622095817885

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
% https://www.dsprelated.com/showarticle/1055.php
% modified by hguo, Sun Jun 22 09:21:48 AM CST 2025

% butter_3rd_order.m 6/4/17 nr
% Starting with the butterworth transfer function in s,
% Create discrete-time filter using the impulse invariance xform and compare
% its time and frequency responses to those of the continuous time filter.
% Filter fc = 1 rad/s = 0.159 Hz


% I. Given H(s), find H(z) using the impulse-invariant transform
fs= 4; % Hz sample frequency
% 3rd order butterworth polynomial
num= 1;
den= [1 2 2 1];
[b,a]= impinvar(num,den,fs); % coeffs of H(z)
%[b,a]= bilinear(num,den,fs)

% II. Impulse Response and Step Response
% find discrete-time impulse response
Ts= 1/fs;
N= 16*fs;
n= 0:N-1;
t= n*Ts;
x= [1, zeros(1,N-1)]; % impulse
x= fs*x; % make impulse response amplitude independent of fs
y= filter(b,a,x); % filter the impulse
subplot(2,2,1),plot(t,y,'.'),grid,
xlabel('seconds')

% Continuous-time Impulse response from inverse Laplace transform
% Blinchikoff and Zverev, p116
h= exp(-t) - 2/sqrt(3)*exp(-t/2).*cos(sqrt(3)/2*t + pi/6);
subplot(2,2,2),plot(t,y,'.',t,h),grid
title('Impulse Response'),legend('discrete-time filter response', 'continuous-time filter response'),xlabel('seconds')

% find discrete-time step response
x= ones(1,N); % step
y= filter(b,a,x); % filter the step
% Continuous-time step response. Blinchikoff and Zverev, p116
t= t+Ts/2; % offset t to to align step responses
h= 1 - exp(-t) - 2/sqrt(3)*exp(-t/2).*sin(sqrt(3)/2*t);
e= h-y; % error of discrete-time response
subplot(2,2,3),plot(t,y,'.',t,h),grid
title('Step Response'),legend('discrete-time filter response', 'continuous-time filter response'),xlabel('seconds')
% discrete-time response error
subplot(2,2,4),plot(t,e),grid
title('Error of discrete-time step response'),xlabel('seconds')

image-20250622093151404


  • To use the filter function with the b coefficients from an FIR filter, use y = filter(b,1,x).

image-20250622110340687

Bilinear Transformation (bilinear)

The Filter Wizard Substack: "Countdown to s-to-z…" [https://filterwizard.substack.com/p/countdown-to-s-to-z]

image-20250928210313814

image-20250928222031796

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
a2 = 9.751405; a1 = 11.5677; a0 = 1.290113;
b2 = 0.118385; b1 = 30.27418; b0 = 0.502194;

fs = 1e3; % sampling rate 1KHz

num0 = (4*fs^2*a2 + 2*fs*a1 + a0)/(4*fs^2*b2 + 2*fs*b1 + b0);
num1 = (-8*fs^2*a2 + 2*a0)/(4*fs^2*b2 + 2*fs*b1 + b0);
num2 = (4*fs^2*a2 - 2*fs*a1 + a0)/(4*fs^2*b2 + 2*fs*b1 + b0);
den1 = (-8*fs^2*b2 + 2*b0)/(4*fs^2*b2 + 2*fs*b1 + b0);
den2 = (4*fs^2*b2 - 2*fs*b1 + b0)/(4*fs^2*b2 + 2*fs*b1 + b0);

s = tf('s');
H1_s = (a2*s^2 + a1*s + a0) / (b2*s^2 + b1*s + b0);

z = tf('z', 1/fs);
H1_z = (num0 + num1*z^(-1) + num2*z^(-2))/(1 + den1*z^(-1) + den2*z^(-2));

%% visualization
options = bodeoptions;
options.FreqUnits = 'Hz';

freq_range = logspace(log10(1e-3), log10(500), 1000);

[mag_s,phase_s,fout_s] = bode(H1_s, freq_range);
[mag_z,phase_z,fout_z] = bode(H1_z, freq_range);

subplot(2,1,1)
plot(fout_s(:), mag_s(:), 'r', LineWidth=4)
hold on
plot(fout_z(:), mag_z(:), 'b--', LineWidth=3)
title('Magnitude $H_1(s)$ \& $H_1(z)$ with $f_s=1$KHz',Interpreter='latex', FontSize=16)
legend('$|H_1(s)|$', '|$H_1(z)|$', fontsize=14, Location='southeast',Interpreter='latex')
xlabel('Hz', FontSize=12); ylabel('mag', FontSize=12); grid on;

subplot(2,1,2)
plot(fout_s(:), phase_s(:), 'r', LineWidth=4)
hold on
plot(fout_z(:), phase_z(:), 'b--', LineWidth=3)
title('Phase $H_1(s)$ \& $H_1(z)$ with $f_s=1$KHz',Interpreter='latex', FontSize=16)
legend('$\angle H_1(s)$', '$\angle H_1(z)$', fontsize=14, Location='southeast',Interpreter='latex')
xlabel('Hz', FontSize=12); ylabel('phase', FontSize=12); grid on;

disp(mag_s(1,1:10))
disp(mag_z(1,1:10))
%{
2.5644 2.5643 2.5641 2.5640 2.5639 2.5637 2.5636 2.5635 2.5633 2.5632

2.5644 2.5643 2.5641 2.5640 2.5639 2.5637 2.5636 2.5635 2.5633 2.5632
%}

Stability check in z-domain

Unit circle criterion; Jury’s stability criterion

image-20250819213545563

reference

Alan V. Oppenheim, Alan S. Willsky, and S. Hamid Nawab. 1996. Signals & systems (2nd ed.)

Alan V Oppenheim, Ronald W. Schafer. 2010. Discrete-Time Signal Processing, 3rd edition

B.P. Lathi, Roger Green. Linear Systems and Signals (The Oxford Series in Electrical and Computer Engineering) 3rd Edition

Sam Palermo, ECEN720, Lecture 7: Equalization Introduction & TX FIR Eq

Sam Palermo, ECEN720, Lab5 –Equalization Circuits

B. Razavi, "The z-Transform for Analog Designers [The Analog Mind]," IEEE Solid-State Circuits Magazine, Volume. 12, Issue. 3, pp. 8-14, Summer 2020. [https://www.seas.ucla.edu/brweb/papers/Journals/BR_SSCM_3_2020.pdf]

Jhwan Kim, CICC 2022, ES4-4: Transmitter Design for High-speed Serial Data Communications

Mathuranathan. Digital filter design – Introduction [https://www.gaussianwaves.com/2020/02/introduction-to-digital-filter-design/]

Daniel Boschen, "Fast Track to Designing FIR Filters with Python" [https://events.gnuradio.org/event/24/contributions/598/attachments/186/485/Boschen%20FIR%20Filter%20Presentation.pdf]

Daniel Boschen, "Quick Start on Control Loops with Python" [https://events.gnuradio.org/event/24/contributions/599/attachments/187/480/Boschen%20Control%20Presentation.pdf]

T-coil

coupling factor

image-20251023214723225

note CTAP is grounded

\[ L_{12} = \frac{\text{im}[Z_{diff}]}{2\pi f} \]

where \(Z_{diff} = Z_{11} - Z_{12} - Z_{21} + Z_{22}\)

image-20251023214042381

Min-Sun Keel. Design of reliable and energy-efficient high-speed interface circuits. University of Illinois Urbana-Champaign, USA, 2015 [https://files.core.ac.uk/download/pdf/158312105.pdf]

image-20251124205353311

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


df_emx = pd.read_csv('./tcoil-emx.csv')
L1sim = df_emx['L1sim Y'].to_numpy()/1e-12 # pH
L2sim = df_emx['L2sim Y'].to_numpy()/1e-12 # pH
Q1sim = df_emx['Q1sim Y'].to_numpy()
Q2sim = df_emx['Q2sim Y'].to_numpy()
frqsim = df_emx['ksim X'].to_numpy()/1e9 # GHz
Ksim = df_emx['ksim Y'].to_numpy()

df_calc = pd.read_csv('./tcoil-calc.csv')
frqcalc = df_calc.iloc[:,0].to_numpy()/1e9 # GHz
L1calc = df_calc.iloc[:,1].to_numpy()/1e-12 # pH
L2calc = df_calc.iloc[:,3].to_numpy()/1e-12 # pH
Q1calc = df_calc.iloc[:,5].to_numpy()
Q2calc = df_calc.iloc[:,7].to_numpy()
L12calc = df_calc.iloc[:,9].to_numpy()/1e-12 # pH

kcalc = []
for l1i, l2i, l12i in zip(L1calc, L2calc, L12calc):
l1il2i = l1i * l2i
kk = (l12i - l1i -l2i)/2/np.sqrt(l1il2i) if l1il2i > 0.0 else 0.0 # L1 * L2 < 0 is meaningless
kk = kk if np.abs(kk) < 2.0 else 0.0 # constrain |k| < 2
kcalc.append(-kk) # opposite due to kk assuming differential stimulus

plt.figure(figsize=(20,8))
plt.subplot(3, 2, 1)
plt.plot(frqsim, L1sim, frqcalc, L1calc, 'r--', linewidth=3); plt.legend(['L1sim', 'L1calc'], loc='upper right'); plt.grid()
plt.subplot(3, 2, 2)
plt.plot(frqsim, Q1sim, frqcalc, Q1calc, 'r--', linewidth=3); plt.legend(['Q1sim', 'Q1calc'], loc='upper right'); plt.grid()
plt.subplot(3, 2, 3)
plt.plot(frqsim, L2sim, frqcalc, L2calc, 'r--', linewidth=3); plt.legend(['L2sim', 'L2calc'], loc='upper right'); plt.grid()
plt.subplot(3, 2, 4)
plt.plot(frqsim, Q2sim, frqcalc, Q2calc, 'r--', linewidth=3); plt.legend(['Q2sim', 'Q2calc'], loc='upper right'); plt.grid()
plt.subplot(3, 1, 3)
plt.plot(frqsim, Ksim, frqcalc, kcalc, 'r--', linewidth=3); plt.legend(['Ksim', 'kcalc'], loc='upper right'); plt.grid()

plt.show()

image-20251023213942643

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
  (get_k
(lambda (l12 l1122)
(letseq ((xvec (drGetWaveformXVec l12))
(n (drVectorLength xvec))
(l12v (drGetWaveformYVec l12))
(l1122v (drGetWaveformYVec l1122))
(resultv (drCreateVec 'double n)))
(do ((i 0 i+1))
((i >= n))
(letseq ((l12i (drGetElem l12v i))
(l1122i (drGetElem l1122v i))
(kk (if (l1122i > 0.0)
l12i/(sqrt l1122i)
0.0))
(k (if ((abs kk) < 2.0) kk 0.0)))
(drSetElem resultv i k)))
(drCreateWaveform xvec resultv)))))
(EMX_plot_aux bgui wid what 3
'("Inductance" "Q" "k")
'("Henry" "" "")
(lambda (ys)
(letseq ((zs (reduce ys))
(z11 (nth 0 zs))
(z12 (nth 1 zs))
(z22 (nth 2 zs))
(pi 3.14159265358979)
(f (xval z11))
(l11 (imag z11)/(2*pi*f))
(q11 (imag z11)/(real z11))
(l12 (imag z12)/(2*pi*f))
(l22 (imag z22)/(2*pi*f))
(q22 (imag z22)/(real z22))
(k (get_k l12 l11*l22)))
`((,l11 ,l22) (,q11 ,q22) (,k))))
'(("L1" "L2") ("Q1" "Q2") ("k")))))

[https://wiki.icprophet.com/doku.php?id=wiki#t-coil]

image-20251206101447859

\[ k = \frac{L_{tot}-L_1-L_2}{2M} = \frac{\text{im}[Z_{diff}]/2\pi f - \text{im}[Z_{11}]/2\pi f-\text{im}[Z_{22}]/2\pi f}{2\sqrt{\text{im}[Z_{11}]/2\pi f\times \text{im}[Z_{22}]/2\pi f}} = \frac{- \text{im}[Z_{12}] - \text{im}[Z_{21}]}{2\sqrt{\text{im}[Z_{11}]\times \text{im}[Z_{22}]}} \] if \(\text{im}[Z_{12}] = \text{im}[Z_{21}]\), then \[ \color{red}k =- \frac{\text{im}[Z_{21}]}{\sqrt{\text{im}[Z_{11}]\times \text{im}[Z_{22}]}} \] image-20251206101226005

\(C_B\) from Nport

Chapter 4.5. High Frequency Passive Devices [https://www.cambridge.org/il/files/7713/6698/2369/HFIC_chapter_4_passives.pdf]

image-20251121000517142


Min-Sun Keel. Design of reliable and energy-efficient high-speed interface circuits. University of Illinois Urbana-Champaign, USA, 2015 [https://files.core.ac.uk/download/pdf/158312105.pdf]

Measuring Self Resonant Frequency [https://www.coilcraft.com/getmedia/8ef1bd18-d092-40e8-a3c8-929bec6adfc9/doc363_measuringsrf.pdf?srsltid=AfmBOoqdBJ_CTB-N_wOVp2_7zIDXPEwOYLm7S4RLuws1CEcEWZUijblK]

image-20251121002136639

image-20251121002228554

T-coil vs tapped inductor

tcoil and tapped inductor share same EM simulation result, and use modelgen with different model formula.

The relationship is \[ L_{\text{sim}} = L1_{\text{sim}}+L2_{\text{sim}}+2\times k_{\text{sim}} \times \sqrt{L1_{\text{sim}}\cdot L2_{\text{sim}}} \] where \(L1_{\text{sim}}\), \(L2_{\text{sim}}\) and \(k_{\text{sim}}\) come from tcoil model result, \(L_{\text{sim}}\) comes from tapped inductor model result

\(k_{\text{sim}}\) in EMX have assumption, induce current from P1 and P2 Given Dot Convention:

Same direction : k > 0

Opposite direction : k < 0

So, the \(k_{\text{sim}}\) is negative if routing coil in same direction

image-20220623013225554

image-20220623013923263

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
% EMX - shield tcoil model
L1 = csvread('./L1sim.csv', 1, 0);
L2 = csvread('./L2sim.csv', 1, 0);
k = csvread('./ksim.csv', 1, 0);

% EMX - Tapped shield inductor
L = csvread('./Lsim.csv', 1, 0);

freq = L1(:, 1)/1e9; % GHz
L1 = L1(:, 2);
L2 = L2(:, 2);
k = -k(:, 2); % Caution: minus of EMX ksim due to same current direction

L = L(:, 2);

Lcalc = L1 + L2 + 2*k.*(L1.*L2).^0.5;

plot(freq, L*1e9, 'r', 'LineWidth', 3);
hold on;
plot(freq, Lcalc*1e9, '--b', 'LineWidth', 3);
grid on;
xlabel('Freq (GHz)');
ylabel('Inductance (nH)');
legend('Tapped inductor model', 'tcoil model calc');

Transformer

J. R. Long, "On-chip transformer design and application to RF and mm-wave front-ends," 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, USA, 2017 [pdf]

A. Bevilacqua, "Tutorial: Fundamentals of Integrated Transformers: from Principles to Applications," 2020 IEEE International Solid-State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020 [pdf]

—, "Fundamentals of Integrated Transformers: From Principles to Applications," in IEEE Solid-State Circuits Magazine, vol. 12, no. 4, pp. 86-100, Fall 2020

image-20251206083550560

image-20251206090819816


[https://wiki.icprophet.com/doku.php?id=wiki#%E5%8F%98%E5%8E%8B%E5%99%A8]

变压器差分输入、差分输出 image-20251206101920561
变压器单端输入、差分输出 image-20251206102009723
变压器差分输入、单端输出 image-20251206102047381
变压器单端输入、单端输出 image-20251206102150510

image-20251206102240524

the formula is same with T-coil's

PGS (Patterned ground Shields)

Ring Pattern, Star Pattern

image-20251206005014571

Common-Mode Rejection

TODO 📅

image-20251206005603725

Transmission Zero

image-20251206103623404

like shunt-peaking, the impedance of \(C_o/n\), \(L_s\), \(r_s\) have resonant peak at resonant frequency, which block signal transmission to S

Mixed-Mode Z/Y-Parameters

Mixed-Mode Y/Z-Parameters [http://zeptoblog.com/2024/03/09/mixed-mode-yz-parameters.html]

image-20260109001415365

DE & SE excitation

Min-Sun Keel. Design of reliable and energy-efficient high-speed interface circuits. University of Illinois Urbana-Champaign, USA, 2015 [https://files.core.ac.uk/download/pdf/158312105.pdf]

image-20251120230342210

differential impedance

Y parameters to Z parameters

\[\begin{align} |Y| &= Y_{11}*Y_{22} - Y_{12}*Y_{22} \\ \begin{bmatrix} Z_{11} & Z_{12}\\ Z_{21} & Z_{22} \end{bmatrix} &= \begin{bmatrix} \frac{Y_{22}}{|Y|} & \frac{-Y_{12}}{|Y|}\\ \frac{-Y_{21}}{|Y|} & \frac{Y_{11}}{|Y|} \end{bmatrix} \end{align}\]

Then differential impedance is \[ Z_{diff} = Z_{11} - Z_{12} - Z_{21} + Z_{22} \]

image-20220330234833756

1
2
3
4
5
6
7
(define (EMX_differential y11 y12 y21 y22)
(letseq ((det y11*y22-y12*y21)
(z11 y22/det)
(z12 -y12/det)
(z21 -y21/det)
(z22 y11/det))
z11+z22-z12-z21))

similarly, Z parameters to Y parameters \[ \begin{bmatrix} Y_{11} & Y_{12}\\ Y_{21} & Y_{22} \end{bmatrix} = \begin{bmatrix} \frac{Z_{22}}{|Z|} & \frac{-Z_{12}}{|Z|}\\ \frac{-Z_{21}}{|Z|} & \frac{Z_{11}}{|Z|} \end{bmatrix} \] where \[ |Z| = Z_{11}Z_{22} - Z_{12}Z_{21} \]


Inductor EM simulation: 1-port or 2-port? [https://muehlhaus.com/support/ads-application-notes/inductor-em-ports]

image-20251220084731053

image-20251220083030590

image-20251220084025043

image-20251220085006044

image-20251220084559347

EMX Ports Setup

plain labels

  • pin layer
  • uncheck Cadence pins in Advanced options

rectangle pins

  • drawing layer rectangle pin and specify Access Direction as intended
  • check Cadence pins in Advanced options

The rectangle pins are always selected as driven port while there are only rectangle pin whether Cadence pins checked or not.


check ports used for simulation use GDS view - EMX


EMX Synthesis Kits

Synthesis is a capability of the EMX Pcell library and uses scalable model data pre-generated by Continuum for a specific process and metal scheme combination.

Synthesis is supported by the Pcells that are suffixed _scalable, and these Pcells have the additional fields and buttons needed for synthesis.

port order (signals)

/path/to/EMX/share/emx/virtuoso_ui/emxinterface/emxskill/emxform.ils

type Port order
inductor P1 P2
shield inductor P1 P2 SHIELD
tapped inductor P1 P2 CT
tapped shield inductor P1 P2 CT SHIELD
mom/mim capacitor P1 P2
tcoil P1 P2 TAP
shield tcoil P1 P2 TAP SHIELD
tline P1 P2
differential tline P1 P2 P3 P4

EMX device info

name menu_selection (split with _ ) num_ports modelgen_type generic_model_type plot_fn
Single-ended inductor inductor_no tap_no shield_single-ended 2 inductor inductor EMX_plot_se_ind
Differential inductor inductor_no tap_no shield_differential 2 inductor inductor EMX_plot_diff_ind
Single-ended shield inductor inductor_no tap_with shield_single-ended 3 shield_inductor shield_inductor EMX_plot_se_ind
Differential shield inductor inductor_no tap_with shield_differential 3 shield_inductor shield_inductor EMX_plot_diff_ind
Tapped inductor (diff mode only) inductor_with tap_no shield_differential mode only 3 center_tapped_inductor tapped_inductor EMX_plot_ct_ind
Tapped inductor (common mode too) inductor_with tap_no shield_also fit common mode 3 center_tapped_inductor_common_mode tapped_inductor EMX_plot_ct_ind
Tapped shield inductor (diff only) inductor_with tap_with shield_differential mode only 4 center_tapped_well_inductor_common_mode tapped_shield_inductor EMX_plot_ct_ind
Single-ended cap (symm) capacitor_symmetric single-ended 2 complex_mom_capacitor mom_capacitor EMX_plot_se_cap
Differential cap (symm) capacitor_symmetric differential 2 complex_mom_capacitor mom_capacitor EMX_plot_diff_cap
Single-ended cap (asymm) capacitor_asymmetric single-ended 2 complex_asymmetric_mom_capacitor mom_capacitor EMX_plot_se_cap
Differential cap (asymm) capacitor_asymmetric differential 2 complex_asymmetric_mom_capacitor mom_capacitor EMX_plot_diff_cap
MiM capacitor capacitor_MiM 2 mim_capacitor mim_capacitor EMX_plot_se_cap
Tcoil (simple model) tcoil_simple model 3 tcoil tcoil EMX_plot_tcoil
Tcoil (complex model) tcoil_complex model 3 complex_tcoil complex_tcoil EMX_plot_tcoil
Shield tcoil tcoil_with shield 4 shield_complex_tcoil shield_tcoil EMX_plot_shield_tcoil
Transmission line transmission line_single 2 xline xline EMX_plot_xline
Diff transmission line transmission line_coupled (differential) 4 coupled_xline diff_xline EMX_plot_diff_xline

EMX plot function

EMX's formulation is defined in

/path/to/EMX/share/emx/virtuoso_ui/emxinterface/emxskill/emxform.ils

EMX import this file at Virtuoso startup, you have to relaunch Virtuoso if you change this file

Single-ended inductor

Both with and without shield apply

  • port-1 impedance when port-2 short

\[ Z_1 = \frac{1}{Y_{11}} \]

  • port-2 impedance when port-1 short

\[ Z_2 = \frac{1}{Y_{22}} \] Then \[\begin{align} L1 &= \frac{Im(Z_1)}{2\pi f} \qquad Q1 = \frac{Im(Z_1)}{Re(Z_1)} \\ L2 &= \frac{Im(Z_2)}{2\pi f} \qquad Q2 = \frac{Im(Z_2)}{Re(Z_2)} \end{align}\]

EMX only plot L1 and Q1

Differential inductor

Both with and without shield apply

\[\begin{align} L_{diff} = \frac{Im(Z_{diff})}{2\pi f} \qquad Q_{diff} = \frac{Im(Z_{diff})}{Re(Z_{diff})} \end{align}\]

Center-tapped inductor

\[ Y = \begin{bmatrix} Y_{11} & Y_{12} & Y_{13}\\ Y_{21} & Y_{22} & Y_{23}\\ Y_{31} & Y_{32} & Y_{33} \end{bmatrix} \]

where port order is P1 P2 CT.

1
2
3
4
5
6
7
8
9
10
11
(define (EMX_plot_ct_ind bgui wid what)
(EMX_plot_aux bgui wid what 3
'("Differential inductance" "Differential Q")
'("Henry" "")
(lambda (ys)
(letseq ((z (EMX_differential (nth 0 ys) (nth 1 ys) (nth 3 ys) (nth 4 ys)))
(f (xval z))
(L (imag z)/(2*3.14159265358979*f))
(Q (imag z)/(real z)))
`((,L) (,Q))))
'(("L") ("Q"))))

Assume CT i.e. port 3 in S-parameter is grounded, (z (EMX_differential (nth 0 ys) (nth 1 ys) (nth 3 ys) (nth 4 ys))) obtain differential impedance with \(Y_{11}\), \(Y_{12}\), \(Y_{21}\) and \(Y_{22}\). \[ Y = \begin{bmatrix} Y_{11} & Y_{12}\\ Y_{21} & Y_{22} \end{bmatrix} \] Finally, differential inductance and Q are obtained, shown as below

\[\begin{align} L_{diff} = \frac{Im(Z_{diff})}{2\pi f} \qquad Q_{diff} = \frac{Im(Z_{diff})}{Re(Z_{diff})} \end{align}\]

image-20220331013735370

Single-ended cap

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
(define (EMX_plot_se_cap bgui wid what)
(EMX_plot_aux bgui wid what 2
'("Capacitance" "Q" "Capacitance" "Q")
'("Farad" "" "Farad" "")
(lambda (ys)
(letseq ((z1 1.0/(nth 0 ys))
(y12 (nth 1 ys))
(z2 1.0/(nth 3 ys))
(f (xval z1))
(C1 (-1.0/(imag z1))/(2*3.14159265358979*f))
(C12 -(imag y12)/(2*3.14159265358979*f))
(C2 (-1.0/(imag z2))/(2*3.14159265358979*f))
(Q1 -(imag z1)/(real z1))
(Q12 (imag y12)/(real y12))
(Q2 -(imag z2)/(real z2)))
`((,C1) (,Q1) (,C12))))
'(("Cse") ("Qse") ("C12"))))

We define Port-1 impedance \(Z_1\), Port-2 impedance \(Z_2\)

\[\begin{align} Z_1 = \frac {1}{Y_{11}} \qquad Z_2 = \frac {1}{Y_{22}} \end{align}\]

Caution above \(\color{red}z_1 \neq Z_{11}\), but \(z_1=\frac{1}{Y_{11}}\)

Then single-ended cap and Q \[\begin{align} C_1 &= -\frac{1/Im(Z_1)}{2\pi f} \qquad Q_1 = -\frac{Im(Z_1)}{Re(Z_1)} \\ C_2 &= -\frac{1/Im(Z_2)}{2\pi f} \qquad Q_2 = -\frac{Im(Z_2)}{Re(Z_2)} \\ C_{12} &= -\frac{Im(Y_{12})}{2\pi f} \qquad Q_{12} = \frac{Im(Y_{12})}{Re(Y_{12})} \end{align}\]

  • Series equivalent model is used in \(C_1\), \(Q_1\), \(C_2\) and \(Q_2\)
    • \(Z_1 = R + \frac{1}{sC_1}\) and \(Z_2 = R + \frac{1}{sC_2}\)
  • Parallel model is used in \(C_{12}\) and \(Q_{12}\)
    • \(Y_{12} = \frac{1}{R} + sC_{12}\)

EMX plot \(C_{se}\), \(Q_{se}\) and \(C_{12}\), i.e. \(C_1\), \(Q_1\) and \(C_{12}\)

image-20220331020334023

Differential cap

1
2
3
4
5
6
7
8
9
10
11
(define (EMX_plot_diff_cap bgui wid what)
(EMX_plot_aux bgui wid what 2
'("Differential capacitance" "Differential Q")
'("Farad" "")
(lambda (ys)
(letseq ((z (apply EMX_differential ys))
(f (xval z))
(C (-1.0/(imag z))/(2*3.14159265358979*f))
(Q -(imag z)/(real z)))
`((,C) (,Q))))
'(("C") ("Q"))))

First obtain differential impedance, \(Z_{diff}\) then apply series equivalent model \[\begin{align} C_{diff} = -\frac{1/Im(Z_{diff})}{2\pi f} \qquad Q_{diff} = -\frac{Im(Z_{diff})}{Re(Z_{diff})} \end{align}\]

image-20220331022224865

Tline

Open circuit impedance \(Z_o\), short circuit impedance \(Z_s\) and characteristic impedance \(Z_0\)

\[\begin{align} Z_o = Z_{11} \qquad Z_s = \frac{1}{Y_{11}} \qquad Z_0 = \sqrt{Z_o*Z_s} \end{align}\]

propagation constant is given as

\[\gamma \cdot l = \frac{1}{2}\log\left( \frac{Z_0+Z_s}{Z_0-Z_s} \right)\]

\(Z_s\) depends on the length of tline [Google AI Mode]

The relationship between these parameter and geometry of the transmission line \[\begin{align} Z_0 = \sqrt{\frac{R+j\omega L}{G+j\omega C}} \qquad \gamma = \sqrt{(G+j\omega C)(R+j\omega L)} \end{align}\] EMX plot the real and imaginary part of \(Z_0\), \(\alpha\) and \(\beta\) of \(\gamma\)

Note EMX plot the absolute value of \(\alpha\) and \(\beta\)

image-20220630215343377image-20220630215418372

image-20220630215630849

EMX autoplot

using AC simulation, and inductor's parallel model or series model

That is to say: both sp (network parameter) and ac (impedance) can be used to plot inductance, Q value.

usually EMX choose ac method

image-20220501173856442

image-20220501173930035

left 2 figures are used for AC simulation, \(Y_{nn}\) can be obtained conveniently

Model Parameter Extraction

Chapter 4.5. High Frequency Passive Devices [https://www.cambridge.org/il/files/7713/6698/2369/HFIC_chapter_4_passives.pdf]

image-20221217141519947

for single-end capicator \[\begin{align} Q_1 &= -\frac{Im(Z_1)}{Re(Z_1)} = -\frac{Im(1/Y_{11})}{Re(1/Y_{11})} = -\frac{Im(Y_{11}^*)/|Y_{11}|^2}{Re(Y_{11}^*)/|Y_{11}|^2} = \frac{Im(Y_{11})}{Re(Y_{11})} \end{align}\]

So, the EMX model and foundary model is consistent.


O. Hanay, J. Hulsman and R. Negra, "Three-Port S-Parameter based characterization of integrated bridged-T-Coils," 2019 12th German Microwave Conference (GeMiC), Stuttgart, Germany, 2019, pp. 268-271 [https://sci-hub.se/10.23919/GEMIC.2019.8698123]

image-20251121002530468

pad & bump

EMX process file contain M0 up to RDL-AP

PM, CB2_FC, UBM is in the chip package

image-20250613235639258

PEX extract up to RDL-AP as expected

DC model

quick find routing resistance

image-20250705231456546

image-20250705231907284

GPDK045 metal resistor model is not consistent with its process file on sheet resistance

image-20250705233530802

Ports & Ground Reference

The ground pin confusion in EM transmission line models [https://muehlhaus.com/support/ads-application-notes/em_line_ground]

Momentum port: global ground or differential? [https://muehlhaus.com/support/ads-application-notes/momentum-port-global-ground-or-differential]

Effect of ground cutout size on RFIC inductor performance [https://muehlhaus.com/support/rfic-em-appnotes/inductor-ground-cutout]

img

image-20251220090704827

dummy metal fill

60 GHz on-chip balun transformer: effect of dummy metal fill [https://muehlhaus.com/support/rfic-em-appnotes/60-ghz-balun-filler-effect]

TODO 📅

Edge/Area pins in Momentum

Edge/Area pins in Momentum EM simulation [https://muehlhaus.com/support/ads-application-notes/edge-area-pins]

For accurate results from EM, the current in the model needs to flow in the physically correct way, similar to the hardware. With edge/area pins, we can help Momentum to create the physically correct current flow in complex port configurations.

Edge port with user defined location and size

img

Area pins with user defined location and size

If we manually control the area pin size, Momentum will equally distribute the injected current across that area

img

Port Referencing in S-Parameters

Peter J. Pupalaikis, DesignCon 2026: Port referencing in S-parameters - Critical Insights You Need to Know

TODO 📅

same Pin names in EMX

It remains shrouded in myth

image-20251220123154767

EMX Setup Tricks

Process file*

Process file encryption mostly for advanced nodes, like TSMC 16nm Finfet, whose process file is encrypted.

  • Use --key=EMXkey in the EMX Advanced options

GDSviewer has two options

  • EMX: shows the final gds sent to EMX for simulation after it has been processed by EMX
  • Raw: shows the raw gds

If there are port name with the # sign, it means EMX sees a port but it is not in the signal list.

EMX Accuracy

  • Edge mesh: controls layout discretization in the X-Y plane

    • For MoM capacitors, use the edge mesh to be the same as the width of the finger (for example, 0.1um).
  • Thickness: controls layout discretization in the Z dimension

  • 3D metals: skips all 2D assumptions about conductors and their currents and charges

    • If you set 3D metals to * then all metals are treated as 3D
      • For Inductor type structures, only thick metal needs 3D.
      • For MoM, all layers are needed.

Ports entered in Grounds will cause these nets to be grounded; these ports will not show up in the S-parameter result.

Setup Temperature

  • EMX: --temperature=100

ParaView

  • If check ParaView related options when ParaView is not setup properly, EMX simulation stop at Creating mesh... without waring or errors (version 6.2).

Paraview & stimulus

image-20251023000524293

LVS check

LVS issue for circuits with customized devices

  • auCdl: Analog and Microwave CDL, is a netlister used for creating CDL netlist for analog circuits
  • auLVS: Analog and Microwave LVS, is used for analog circuit LVS

Dr. Muehlhaus

Dr. Muehlhaus Consulting & Software GmbH, RFIC-Inductor-Toolkit-Open [https://github.com/VolkerMuehlhaus/RFIC-Inductor-Toolkit-Open]

—, lumpedmodel [https://github.com/VolkerMuehlhaus/lumpedmodel]

—, plot_inductor [https://github.com/VolkerMuehlhaus/plot_inductor]

TODO 📅

reference

Tips on Specifying Ports in EMX

Using 'Cadence pins' as ports with access direction in EMX simulations

EMX miscellaneous features [https://picture.iczhiku.com/resource/eetop/WyIFKleSLTRIuvCb.pdf]

Cadence Rapid Adoption Kit (RAK). Analysis of a Figure-Eight Inductor with EMX

image-20250817101208717


image-20250606203026237

discrete time jitter impulse response (normalized to the input jitter stimulus similar to the procedure used to represent a conventional system impulse response)

image-20250606203531304

When impulsive jitter is injected into clock distribution circuits (i.e., a small incremental time delay or advance applied to an individual clock edge), it results in jitter in multiple subsequent edges in the output clock

image-20250606215100752

image-20250606215204625

transient noise and rms_jitter function

image-20220608233610066

image-20220313230930333

RJ(rms): single Edge or Both Edge?

RJ(seed): what is it?

phase noise method

Directly compare the input phase noise and output phase noise, the input waveform maybe is the PLL output or other clock distribution end point

Jitter Impulse Response & Jitter Transfer Function

assuming linear, time-invariant phase response

image-20250522223530464

n =5 buffers, fclk = 10GHz

[Alphawave’s CTO, Tony Chan Carusone, High Speed Communications Part 8 – On Die CMOS Clock Distribution]

image-20250818230528344

Theoretically, the DC gain of JTF of low pass filter shall be 1. Unfortunately, the gain less than 1 due to numerical error or nonlinearity

image-20220313231027512

Example

Low Pass Filter

image-20220322124344158

1
2
3
4
5
6
7
8
9
10
11
N = 32;
x = zeros(N,1);
x(1) = 6;
x(2) = -2;
x(3) = 0.5;
x = x/5;
figure(1)
stem(x)
Y = fft(x, N);
figure(2)
plot(abs(Y(1:N/2)));

image-20220322124902394

image-20220322124932584

discrete time jitter impulse response

both input and output are discrete time signal, i.e. no sampling in the input, that's why ratio \(1/T_s\) is not in the jtf

High Pass Filter

image-20220327010223664

1
2
3
4
5
6
7
8
9
10
11
12
N = 128;
j_hp = zeros(N, 1);
j_hp(1)= 1;
j_hp(2) = 0.5;
j_hp(3) = -0.3;
j_hp(4) = 0.3;
j_hp(5) = -0.1;
jtf_hp = abs(fft(j_hp));
semilogx(jtf_hp(1:N/2+1));
xlabel('Freq');
ylabel('Jitter Amplification Factor');
grid on;

image-20220327010421475

inverter chain

image-20220608232251056

image-20220608232658054

image-20220608232438188

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
ji = 1e-12; % 1ps
data = importdata('/path/to/jir.csv');
jo = data.data(:, 2);
Ts = 31.25e-12;
Fs = 1/Ts;
jir = jo/ji;
N = 2^(nextpow2(length(jir)-1));
Y = fft(jir, N);
jtf = abs(Y(1:N/2+1));
freqs= Fs/N*(0:N/2);
plot(feqs/1e9, jtf, 'linewidth', 2);
grid on;
xlabel('Freq (GHz)');
ylabel('JTF');
title('JTF of inverter chain');

image-20220608233303576

Colored Jitter Amplification

image-20250522225506217

[Alphawave’s CTO, Tony Chan Carusone, High Speed Communications Part 8 – On Die CMOS Clock Distribution]

image-20250522234522208

Four major noise sources are included in the modeling: Input noise, DAC quantization noise (DAC QN), DCO random noise (DCO RN), and delay line random noise (DL RN).

H. Kang et al., "A 42.7Gb/s Optical Receiver with Digital CDR in 28nm CMOS," 2023 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), San Diego, CA, USA, 2023 [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10630516]

Dirac impulse at edge position

image-20250606205753885

Full-rate JTF

Singe edge is using

Half-rate

image-20250606220410050

image-20250607113251021

1
2
3
4
5
6
7
8
9
10
11
12
13
14
ji = 10;
jir = readmatrix("~/cargo/jir.csv");
jir = jir / ji;
jir_2e = jir(:, 2); % both edge
jir_1e = jir(1:2:end, 2); % single edge

[mag, w] = freqz(jir_2e, 1, [], 1);
plot(w, abs(mag), LineWidth=2);
hold on
[mag, w] = freqz(jir_1e, 1, [], 0.5);
plot(w, abs(mag), LineWidth=2);
grid on;
legend("2Edge", "1Edge")
title("JTF")

image-20250607113553089

image-20250607114048892

fck = 1GHz

Note: Phase Noise dBc is SSB, that's why we add 10log10(2)

image-20250607114018289

The calculated JTF from JIR is too small compared with Pnoise simualtion

245.1/272.4 = 89.98%


image-20250607123008345

image-20250607123126301

245.1/272.4 = 83.74%

image-20250607123235447

Reference

Sam Palermo, ECEN 720, Lecture 13 - Forwarded Clock Deskew Circuits

B. Casper and F. O'Mahony, "Clocking Analysis, Implementation and Measurement Techniques for High-Speed Data Links-A Tutorial," in IEEE Transactions on Circuits and Systems I. [https://people.engr.tamu.edu/spalermo/ecen689/clocking_analysis_hs_links_casper_tcas1_2009.pdf]

Rhee, W. (2020). Phase-locked frequency generation and clocking : architectures and circuits for modern wireless and wireline systems. The Institution of Engineering and Technology

Mathuranathan Viswanathan, Digital Modulations using Matlab : Build Simulation Models from Scratch

Tony Chan Carusone, University of Toronto, Canada, 2022 CICC Educational Sessions "Architectural Considerations in 100+ Gbps Wireline Transceivers"

X. Mo, J. Wu, N. Wary and T. Chan Carusone, "Design Methodologies for Low-Jitter CMOS Clock Distribution," in IEEE Open Journal of the Solid-State Circuits Society, 2021 [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9559395]

Y. Zhao and B. Razavi, "Phase Noise Integration Limits for Jitter Calculation," 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, 2022 [https://www.seas.ucla.edu/brweb/papers/Conferences/YZ_ISCAS_22.pdf]

Thomas Toifl. TWEPP 2012. Low-power High-Speed CMOS I/Os: Design Challenges and Solutions [https://indico.cern.ch/event/170595/contributions/266344/attachments/212179/297391/twepp_sept2012_final_v2.pdf]

Ganesh Balamurugan and Naresh Shanbhag, "Modeling and mitigation of jitter in multiGbps source-synchronous I/O links," Proceedings 21st International Conference on Computer Design, San Jose, CA, USA, 2003, pp. 254-260, doi: 10.1109/ICCD.2003 [https://shanbhag.ece.illinois.edu/publications/ganesh-ICCD2203.pdf]

Balamurugan, G. & Casper, Bryan & Jaussi, James & Mansuri, Mozhgan & O'Mahony, Frank & Kennedy, Joseph. (2009). Modeling and Analysis of High-Speed I/O Links. Advanced Packaging, IEEE Transactions on. [https://sci-hub.se/10.1109/TADVP.2008.2011366]

Jihwan Kim, ISSCC2019 F5: Design Techniques for a 112Gbs PAM-4 Transmitter

Process Variation

image-20220302000743092

[https://ewh.ieee.org/r5/denver/sscs/Presentations/2004_12_Loke.pdf]

image-20251213113351611

Global/Local Variation

image-20220302000808679

Timing and RC Modeling with Process Corners

image-20220302000916728

Global and Local variation by Gaussian

image-20230111003336823

Local Monte-Carlo (SSG, FFG with Local Gaussian) as Signoff golden

image-20231215234014594

Process Corner Model Limitations

image-20220323232119661

image-20230111003705417

Variation section

  • Total corner (TT/SS/FF/SF/FS)
    • E.g. TTMacro_MOS_MOS_MOSCAP
  • Global Corner (TTG/SSG/FFG/SFG/FSG) + Local MC
    • E.g. TTGlobalCorner_LocalMC_MOS_MOSCAP
  • Local MC
    • E.g. LocalMCOnly_MOS_MOSCAP
  • Global MC + Local MC (Total MC)
    • GlobalMC_LocalMC_MOS_MOSCAP

image-20230308003005235

image-20230111010426775

img

image-20230111011757049

SSGNP, FFGNP:

When N/P global correlation is weak (R^2=0.15), the corner of N/PMOS balance circuit (e.g. inverter) can be tightened (3sigma -> 2.5sgma) due to the cancellation between NMOS and PMOS

SSGNP, FFGNP usually used in Digital STA

  • Global variation validation with global corner
    • 3-sigma of global MC simulation is aligned with global corner
  • Total variation validation with total corner
    • 3-sigma of global MC + local MC (total) simulation is aligned with total corner

Global Corner

https://community.cadence.com/cadence_technology_forums/f/custom-ic-design/20466/monte-carlo-simulation-global-local-vs-local-and-process-vs-mismatch/1365101#1365101

The "total corner" is representative of the maximum device parameter variation including local device variation effects. However, it is not a statistical corner.

The "global" corner is defined as the "total" corner minus the impact of "local variation"

Hence, if you were to examine simulation results for a parameter using a "total" and "global" corner, you would find the range of variation will be less with the "global" corner than with the "total" corner.

The "global" corner is provided for use in statistical simulations. Hence, when performing a Monte-Carlo simulation, the "global" corner is selected - NOT the "total" corner.

image-20230511234313012

\[ \Delta V_{T,\sigma_{total}} = \sqrt{\Delta^2 _{T, \sigma_{global}}+\Delta^2 _{T, \sigma_{local}}} \]

RC corner

Traditional RC Corners

Metal width variation (\(\Delta W\)), Metal thickness variation (\(\Delta T\)), IMD thickness variation (\(\Delta H\))

image-20230511232521956

Capacitance Dominant: C-best, C-worst

Resistance Dominant: RC-best, RC-worst

DPT effect

When using two masks per layer (Double Patterning Technology, DPT) there is an issue of mask alignment where any mis-alignment will cause layer spacing values to change, therefore changing the parasitic coupling capacitance values.

Misalignment scale and direction are not deterministic facts: coupling cap and total cap may be increased or decreased.

Five new corners are added in a DPT flow to account for RC variations accurately:

sapced-dependent side-wall dielectric constant also affect coupling cap

and CC_worst means to increase both K1 and K2

CC_best means to decrease both K1 and K2

  • Setup time sign-off would use:

    Cworst_CCworst / RCworst_CCworst

  • Hold time sign-off would use:

    Cbest_CCbest / RCbest_CCbest / Cworst_CCworst / RCworst_CCworst

image-20230416143108231

Signoff corner

with misalignment effect without misalignment effect
cworst_CCworst, cworst_CCworst_T cworst, cworst_T
cbest_CCbest, cbest_CCbest_T cbest, cbest_T
rcworst_CCworst, rcworst_CCworst_T rcworst, rcworst_T
rcbest_CCbest, rcbest_CCbest_T rcbest, rcbest_T

BEOL Target: typical

The recommended RC corner:

cworst_CCworst, cbest_CCbest, rcworst_CCworst, rcbest_CCbest and typical

The others are for pre-color RC calculation purpose

_T stands for "Tighten DPT corner"; these are less pessimistic 1.5 sigma corners

image-20230513001426737

tmp_fohu0q4

Below table is caputre of Aragio's TSMC16: LVDS datasheet

image-20230416130433181

BEOL corner

image-20230416144136870

Spacing variation is implicitly defined by \(\Delta W_m\).

We denote the conductor width and thickness of the layer m by \(W_m\) and \(T_m\), respectively.

Similarly, we denote the thickness of the layer's interlayer dielectric (i.e., the distance between layer m and layer m +1) by \(H_m\)

  • C-based means worst and best caps
  • RC-based means worst and best R in adjustment with C (RC product)

Based on experience, it was found that C-based extraction provides worst and best case over RC for internal timing paths because Capacitance dominates short wire.

However, for large design, inter-block timing paths were often worst with RC worst parasitic since R dominates for long wires.

image-20230416155654008

signoff corners for setup & hold

tmpwlgwrlq0

reference

Process Variation

Eric J.-W. Fang, T5: Fundamentals of Process Monitors for Signoff-Oriented Circuit Design, 2022 IEEE International Solid-State Circuits Conference

Alvin Loke, Device and Physical Design Considerations for Circuits in FinFET Technology, ISSCC 2020 Short Course

簡報 Cln16ffcll Sr V1d0 2p1 Usage Guide URL: https://usermanual.wiki/Document/cln16ffcllsrv1d02p1usageguide.1649731847/view

Radojcic, Riko, Dan Perry and Mark Nakamoto. “Design for manufacturability for fabless manufactuers.” IEEE Solid-State Circuits Magazine 1 (2009): n. pag.

How To Reduce Implementation Headaches In FinFET Processes URL: https://semiengineering.com/how-to-reduce-implementation-headaches-in-finfet-processes/

陌上风骑驴看IC, STA | SSGNP, FFGNP. https://mp.weixin.qq.com/s/eJ8fYRJBR1E9XbfH95OUOg

陌上风骑驴看IC, STA | ssg 跟ss corner 的区别——谬误更正版 https://mp.weixin.qq.com/s?__biz=MzUzODczODg2NQ==&mid=2247486225&idx=1&sn=e9c68f6108ae6c9958d47ca0b29373ca&chksm=fad262cfcda5ebd949cc91353c7cbfaf4ba61179306f7d8e98461a4f4ca9d8a9baef5e9f2cc1&scene=178&cur_album_id=1326356275000705025#rd

The Evolution, Pitfalls, and Cargo Cult Engineering of Advanced Digital Timing Sign-off https://www.tauworkshop.com/2021/speaker_slides/christian_l.pdf

Don O'Riordan Cadence Design Systems. Recommended Spectre Monte Carlo modeling methodology [https://designers-guide.org/modeling/montecarlo.pdf]


RC corner

Modeling Sub-90nm On-Chip Variation Using Monte Carlo Method for DFM https://www.aspdac.com/aspdac2007/pdf/archive/2D-1.pdf

Double Patterning for IC Design, Extraction and Signoff https://semiwiki.com/eda/synopsys/1974-double-patterning-for-ic-design-extraction-and-signoff/

抽刀断水水更流,RC Corner不再愁:STA之RC Corner URL: http://mp.weixin.qq.com/s?__biz=MzUzODczODg2NQ==&mid=2247484115&idx=1&sn=de99f27aadf58ea316c284dad9000b7c&chksm=fad26b0dcda5e21b8c9750f738b55053f695843a66c3c202ff0ba586c738f45aa270254c3722&scene=21#wechat_redirect

一曲新词酒一杯,RC Corner继续飞: STA之RC Corner拾遗 URL:https://mp.weixin.qq.com/s?__biz=MzUzODczODg2NQ==&mid=2247484135&idx=2&sn=bddc632850bd10c32b5688fd7af46218&chksm=fad26b39cda5e22f1c3970f8c8c2e1287c9492c526c4caf02b61f61faffdf829381c392d6ea1&scene=21#wechat_redirect

且将新火试新茶,深究趁年华:STA之RC Corner再论 URL:https://mp.weixin.qq.com/s?__biz=MzUzODczODg2NQ==&mid=2247484144&idx=1&sn=059843381e77cd4008d25166db388d02&chksm=fad26b2ecda5e23816b33b3a949f34d4ca09118bad76f1089dfcb228b7da6491f423a5f4e703&cur_album_id=1326356275000705025&scene=189#wechat_redirect

LDP_IN_800_25V_DN: 1.2GHz LVDS Receiver http://aragiosolutions.com/pdf/rgo_tsmc16_18v25_lvds_product_brief_rev_1a.pdf

Parasitic extraction technologies Advanced node and 3D-IC design https://static.sw.cdn.siemens.com/siemens-disw-assets/public/81845/en-US/Siemens-SW-Parasitic-extraction-technologies-for-advanced-WP-81845-C2.pdf

New Game, New Goal Posts: A Recent History of Timing Closure https://pdfs.semanticscholar.org/9360/5ce48f9bd3b7527ae8979f41a9c7e310efa4.pdf

The Evolution, Pitfalls, and Cargo Cult Engineering of Advanced Digital Timing Sign-off https://www.tauworkshop.com/2021/speaker_slides/christian_l.pdf

T. -B. Chan, S. Dobre and A. B. Kahng, "Improved signoff methodology with tightened BEOL corners," 2014 IEEE 32nd International Conference on Computer Design (ICCD), Seoul, Korea (South), 2014, pp. 311-316, doi: 10.1109/ICCD.2014.6974699.

Chan, T. (2014). Mitigation of Variability and Reliability Margins in IC Implementation /. UC San Diego. ProQuest ID: Chan_ucsd_0033D_14269. Merritt ID: ark:/20775/bb52916761. Retrieved from https://escholarship.org/uc/item/35r1m001

Dr. Adam Temanm, Digital VLSI Design:Lecture 10: Routing https://www.eng.biu.ac.il/temanad/files/2017/02/Lecture-10-Routing.pdf

Article (20487193) Title: Setting Pegasus - LVS to Quantus av_extracted view Flow with TSMC16 packages

primary clock, generated clock and virtual clock in SDC

primary clocks

  • Primary clocks should be created at input ports and output pins of black boxes.
  • Never create clocks on hierarchy pins. Creating clocks on hierarchy will cause problems when reading SDF. The net timing arc becomes segmented at the hierarchy and PrimeTime will be unable to annotate the net successfully.

generated clocks

  • Generated clocks are generally created for waveform modifications of a primary clock (not including simple inversions). PrimeTime does not simulate a design and thus will not derive internally generated clocks automatically - these clocks must be created by the user and applied as a constraint.
  • PrimeTime caculate source latency for generated clocks if primary clock is propagated, otherwise its source latency is zero.
1
2
3
4
5
# primary clock
create_clock -period 4 [get_ports Clk]
set_clock_latency -source 2 [get_clocks Clk]
set_propagated_clock [get_clocks Clk]
create_generated_clock -divide_by 2 -name div_clk -source [get_ports Clk] FF3/Q

virtual clocks

  • Are clock objects without a source
  • Do not clock sequential devices within the current_design
  • Serve as references of input or output delays
1
2
3
4
# create a virtual clock, vclk, for input and output delay constraints
create_clock -period 5 -name vclk
set_input_delay -max 2 -clock vclk [get_ports in1]
set_output_delay -max 1 -clock vclk [get_ports out2]

There is no network latency to calculate, even if a virtual clock is propagated.

Clock Edges used for Setup and Hold in PrimeTime

PT picks the most restrictive pair of edges for setup and for hold. It determines which edges to be used as follows:

  1. Evaluate waveforms over the smallest common base period

  2. For each capture edge, find the closest setup launch edge. Call these the primary pairs

  3. Out of the primary pairs, pick the most restrictive setup launch and capture edges.

  4. For each primary pair, draw two hold relationships:

    • Launch to (capture - 1)

    • (Launch + 1) to Capture

      From all of these hold relationships, pick the most restrictive.

PrimeTime uses the ideal clock waveform (as reported in report_clock) to determine the appropriate clock edges for inter-clock analysis.

image-20220301203135490

The most restrictive setup pair is from Clk1 8ns to Clk2 9ns

The most restrictive hold pair is from Clk1 0ns to Clk2 0ns

Multicycle Paths Constraints in PrimeTime

PrimeTime® User Guide Version O-2018.06-SP4 Chapter 1: Introduction to PrimeTime Overview of Static Timing Analysis - Timing Exceptions

PrimeTime does not automatically identifies multicycle paths

specifying multicycle path for setup

1
2
3
4
5
set_multicycle_path 6 -from reg[26]/CP -to reg/D
# or
set_multicycle_path -setup 6 -from reg[26]/CP -to reg/D
# check the exception
report_exception

specifying multicycle path for hold (new data every 6 cycles)

1
2
set_multicycle_path -setup 6 -to [get_pins "*reg[*]/D"]
set_multicycle_path -hold [expr 6-1] -to [get_pins "*reg[*]/D"]

image-20220301215938296

MH stands for Hold Multiplier, MS for Setup Multiplier. The Setup multiplier counts up with increasing clock cycles, the Hold Multiplier counts up with decreasing cycles. The origin (0) for the Hold Multiplier is always at the Setup Multiplier - 1 position.

Reporting a multicycle path with report_timing

1
report_timing -exceptions all -from *reg[26]/CP -to *reg/D

Timing Exceptions

If certain paths are not intended to operate according to the default setup and hold behavior assumed by the PrimeTime tool, you need to specify those paths as timing exceptions. Otherwise, the tool might incorrectly report those paths as having timing violations.

The PrimeTime tool lets you specify the following types of exceptions:

  • False path – A path that is never sensitized due to the logic configuration, expected data sequence, or operating mode.
  • Multicycle path – A path designed to take more than one clock cycle from launch to capture.
  • Minimum or maximum delay path – A path that must meet a delay constraint that you explicitly specify as a time value.
0%