Isolation cells are additional cells inserted by the synthesis tools for isolating the buses/wires crossing from power-gated domain of a circuit to its always-on domain (AON).

To prevent corruption of always-on domain, we clamp the nets crossing the power domains to a value depending upon the design.

A simple circuit having a switchable (or gated) power domain

isolation-cells-1-1

The circuit shown in Figure 1, after isolation cells are inserted

isolation-cells-2

Always-On Buffer

640?wx_fmt=png

image-20230211001607578

image-20230211001708189

image-20230211001849150

reference

Isolation cells and Level Shifter cells URL: https://vlsitutorials.com/isolation-cells-level-shifter-cells-low-power-vlsi/

Jitter separation lets you learn if the components of jitter are random or deterministic. That is, if they are caused by crosstalk, channel loss, or some other phenomenon. The identification of jitter and noise sources is critical when debugging failure sources in the transmission of high-speed serial signals

  • Tail Fit Method
  • Spectral method
RJ Extraction Methods Rationale
Spectral Speed/Consistency to Past Measurements;
Accuracy in low Crosstalk or Aperiodic Bounded Uncorrelated Jitter (ABUJ) conditions
Tail Fit General Purpose;
Accuracy in high Crosstalk or ABUJ conditions

Jitter Components

image-20220521190326201

dual-Dirac model

image-20220521181604467

Figure-1


image-20250816100336592

image-20250816101651481

Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-scale [https://people.engr.tamu.edu/spalermo/ecen689/jitter_dual_dirac_agilent.pdf]

Spectral method

power spectral density (PSD) represents jitter spectrum and peaks in the spectrum can be interpreted as PJ or DDJ, while the average noise floor is the power of RJ

image-20220521182929127

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
S1 = sum(win);
S2 = sum(win.^2);
N = length(win);
spec_nospur2 = (spec_nospur*S1).^2/N/S2; % To obtain linear spectrum for rj
rj_utj = sqrt(sum(spec_nospur2))*1e12;

spec = 1*ones(length(spec_nospur), 1)*1e-21;
spec(index) = specx(index);
% insert fft nyquist frequency component between positive frequency and
% negative frequency component
% DC;posFreq;nyqFreq;negFreq
spec_ifft = [spec;specnyq;conj(spec(end:-1:2))]';
sfactor = sum(win)/sqrt(2);
spec_ifft = spec_ifft*sfactor;
sig_rec = real(ifft(spec_ifft));
sig_rec = sig_rec(:);
sig_rec_utj = sig_rec./win(1:end);

Tail Fit Method

Tail fitting algorithm based on the Gaussian tail model by using probability distribution of collected jitter value

image-20220521191029433

1
2
3
4
5
6
7
8
9
10
11
12
13
bin_sig = bin_sig*1e12;

x = qfuncinv(cdf_sig);

% coef(1)*bin_sig + coef(2) = x
% which x is norm(0, 1)
% bin_sig = (x - coef(2))/coef(1)
% Then bin is norm(-coef(2)/coef(1), 1/coef(1))
coef = polyfit(bin_sig, x, 1);
sigma = 1/coef(1);
mu = -coef(2)*sigma;

fprintf('sigma=%.3fps, mu=%.3fps\n', sigma, mu);

Least Squares (LS) method

image-20220524005848719

It is known that TIE jitter is a linear equation, shown in below formula \[ x[n] = d_n \times \left[ \Delta t_{pj}[n]+\Delta t_{DCD}[n] +\Delta t_{ISI}[n]+\Delta t_{RJ}[n]\right] \] LS can be used to estimate the PJ, DCD, RJ , and ISI parameters \([a,b,J_{DCD},J_0, J_1...J_{(2^k-1)}]\)

image-20220524185332637

image-20220524185351383

image-20220524010446422

Jitter modeling

Periodic Jitter (PJ)

PJ is a repeating jitter \[ \Delta t_{PJ}[n]=A\sin(2\pi f_0\cdot nT_s + \theta)=a \sin(2\pi f_0 \cdot nT_s)+b\cos(2\pi f_0 \cdot nT_s) \] where \(f_0\) represents the fundamental frequency of PJ; \(A\) is the amplitude of PJ; \(T_s\) is the data stream period, and \(\theta\) is the initial phase of PJ

In the spectrum, the frequency of maximum amount of the jitter is PJ frequency \(f_0\).

Duty Cycle Distortion (DCD)

DCD is viewed as a series of adjacent positive and negative impulses \[ \Delta t_{DCD}[n] = J_{DCD}\times (-1)^n = [-J_{DCD},J_{DCD},-J_{DCD},J_{DCD},...] \] Where \(J_{DCD}\) is the DCD amplitude.

Random Jitter (RJ)

RJ is created by unbounded jitter sources, such as Gaussian white noise. The statistical PDF for RJ is enerally treated as a Gaussian distribution \[ f_{RJ}(\Delta t) = \frac{1}{\sqrt{2\pi\sigma}}\exp(-\frac{(\Delta t)^2}{2\sigma^2}) \]

Remarks

Periodic Jitter Generator and Insertion

Analysis and Estimation of Jitter Sub-Components: Classification and Segregation of Jitter Components

image-20220521212129098

image-20220521212142719

Reference

Mike Li. 2007. Jitter, noise, and signal integrity at high-speed (First. ed.). Prentice Hall Press, USA.

余宥浚 Jacky Yu, Keysight Taiwan AEO, Advanced Jitter and Eye-Diagram Analysis

Y. Duan and D. Chen, "Accurate jitter decomposition in high-speed links," 2017 IEEE 35th VLSI Test Symposium (VTS), 2017, pp. 1-6, doi: 10.1109/VTS.2017.7928918.

Y. Duan's phd thesis URL: https://dr.lib.iastate.edu/handle/20.500.12876/30459

Y. Duan and D. Chen, "Fast and Accurate Decomposition of Deterministic Jitter Components in High-Speed Links," in IEEE Transactions on Electromagnetic Compatibility, vol. 61, no. 1, pp. 217-225, Feb. 2019, doi: 10.1109/TEMC.2018.2797122.

"Jitter Analysis: The Dual-Dirac Model, RJ/DJ, and Q-Scale", Whitepaper: Keysight Technologies, U.S.A., Dec. 2017

Sharma, Vijender Kumar and Sujay Deb. "Analysis and Estimation of Jitter Sub-Components." (2014).

Qingqi Dou and J. A. Abraham, "Jitter decomposition in ring oscillators," Asia and South Pacific Conference on Design Automation, 2006

E. Balestrieri, L. De Vito, F. Lamonaca, F. Picariello, S. Rapuano and I. Tudosa, "The jitter measurement ways: The jitter decomposition," in IEEE Instrumentation & Measurement Magazine, vol. 23, no. 7, pp. 3-12, Oct. 2020, doi: 10.1109/MIM.2020.9234759.

McClure, Mark Scott. "Digital jitter measurement and separation." PhD diss., 2005.

Ren, Nan, Zaiming Fu, Shengcu Lei, Hanglin Liu, and Shulin Tian. "Jitter generation model based on timing modulation and cross point calibration for jitter decomposition." Metrology and Measurement Systems 28, no. 1 (2021).

M. P. Li, J. Wilstrup, R. Jessen and D. Petrich, "A new method for jitter decomposition through its distribution tail fitting," International Test Conference 1999. Proceedings (IEEE Cat. No.99CH37034), 1999, pp. 788-794, doi: 10.1109/TEST.1999.805809.

K. Bidaj, J. -B. Begueret and J. Deroo, "RJ/DJ jitter decomposition technique for high speed links," 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS) [https://sci-hub.se/10.1109/ICECS.2016.7841269]

divide-by-1.5 circuit

TODO 📅

Phase Interpolator

A phase interpolator (PI) is normally used as a phase shifter (or phase rotator) to generate an output clock whose phase is precisely controlled

TODO 📅

B. Razavi, "The Design of a Phase Interpolator [The Analog Mind]," IEEE Solid-State Circuits Magazine, Volume. 15, Issue. 4, pp. 6-10, Fall 2023. [http://www.seas.ucla.edu/brweb/papers/Journals/BR_SSCM_4_2023.pdf]

Deterministic Jitter

image-20220516004008878


image-20220516004058916

image-20220516004206118

j_Djpp can be calculated by PSD,too

image-20220516004615033

1
2
3
4
5
6
7
8
9
fck = 38.4e6;
Nfft = 15000;
fres = fck/Nfft;
psddBc = -99.3343;
psBc = psddBc + 10*log10(fres); % psd -> ps;
phrad2 = 10^(psBc/10);
phrms = sqrt(phrad2);
Jrms = phrms/2/pi*1/fck;
Jpp = 2*sqrt(2)*Jrms;
1
2
3
Jpp =

6.4038e-12

For DJ, we usually use peak to peak value

BTW, the psd value at half of fundamental frequency (\(f_s/2\)) is duty cycle distortion due to the NMOS/PMOS imbalance, because of rising only data

Random Jitter

RJ can be accurately and efficiently measured using PSS/Pnoise or HB/HBnoise.

Note that the transient noise can also be used to compute RJ;

However, the computation cost is typically very high, and the accuracy is lesser as compared to PSS/Pnoise and HB/HBnoise.

Since RJ follows a Gaussian distribution, it can be fully characterized using its Root-Mean-Squared value (RMS) or the standard deviation value (\(\sigma\))

The Peak-to-Peak value of RJ (\(\text{RJ}_{\text{p-p}}\)) can be calculated under certain observation conditions \[ \text{RJ}_{\text{p-p}}\equiv K \ast \text{RJ}_{\text{RMS}} \] Here, \(K\) is a constant determined by the BER specification of the system given in the following Table

BER Crest factor (K)
\(10^{-3}\) 6.18
\(10^{-4}\) 7.438
\(10^{-5}\) 8.53
\(10^{-6}\) 9.507
\(10^{-7}\) 10.399
\(10^{-8}\) 11.224
\(10^{-9}\) 11.996
\(10^{-10}\) 12.723
\(10^{-11}\) 13.412
\(10^{-12}\) 14.069
\(10^{-13}\) 14.698
1
2
3
4
K = 14.698;
Ks = K/2;
p = normcdf([-Ks Ks]);
BER = 1 - (p(2)-p(1));
1
2
3
BER =

1.9962e-13

image-20220516160050961

image-20220516193125490

Total Jitter

\[ \text{TJ}_{\text{p-p}}\equiv \text{DJ}_{\text{p-p}} + \text{RJ}_{\text{p-p}}(\text{BER}) \]

tj.drawio

image-20220516160006909

image-20220516012200383

In the psd of TJ, the spur is DJ and floor is RJ

Phase Noise to Jitter

The phase noise is traditionally defined as the ratio of the power of the signal in 1Hz bandwidth at offset \(f\) from the carrier \(P\), divided by the power of the carrier \[ \ell (f) = \frac {S_v'(f_0+f)}{P} \] where \(S_v'\) is is one-sided voltage PSD and \(f \geqslant 0\)

Under narrow angle assumption \[ S_{\varphi}(f)= \frac {S_v'(f_0+f)}{P} \] where \(\forall f\in \left[-\infty +\infty\right]\)

Using the Wiener-Khinchin theorem, it is possible to easily derive the variance of the absolute jitter(\(J_{ee}\))via integration of the corresponding PSD \[ J_{ee,rms}^2 = \int S_{J_{ee}}(f)df \]

And we know the relationship between absolute jitter and excess phase is \[ J_{ee}=\frac {\varphi}{\omega_0} \] Considering that phase noise is normally symmetrical about the zero frequency, multiplied by two is shown as below \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}\ell(f)df}}{\omega_0} \] where phase noise is in linear units not in logarithmic ones.

Because the unit of phase noise in Spectre-RF is logarithmic unit (dBc), we have to convert the unit before applying the above equation \[ \ell[linear] = 10^{\frac {\ell [dBc/Hz]}{10}} \] The complete equation using the simulation result of Spectre-RF Pnoise is \[ J_{ee,rms} = \frac{\sqrt{2\int_{0}^{+\infty}10^{\frac {\ell [dBc/Hz]}{10}}df}}{\omega_0} \]

The above equation has been verified for sampled pnoise, i.e. Jee and Edge Phase Noise.

  • For pnoise-sampled(jitter), Direct Plot Form - Function: Jee:Integration Limits can calculate it conveniently
  • But for pnoise-timeaveage, you have to use the below equation to get RMS jitter.

One example, integrate to \(\frac{f_{osc}}{2}\) and \(f_{osc} = 16GHz\)

image-20220415100034220

Of course, it apply to conventional pnoise simulation.

On the other hand, output rms voltage noise, \(V_{out,rms}\) divied by slope should be close to \(J_{ee,rms}\) \[ J_{ee,rms} = \frac {V_{out,rms}}{slope} \]

Pulse Width Jitter (PWJ)

TODO 📅

[Spectre Tech Tips: Measuring Noise in Digital Circuits]

Pnoise sampled: Edge Delay mode measures the noise defined by two edges. Both edges are defined by a threshold voltage and rising or falling edges, which measures the noise of the pulse itself and direct plot calculate the variation of the pulse width

Power supply induced jitter (PSIJ)

A sampled pxf analysis can be used to simulate the deterministic jitter of a circuit due to power supply ripple

TODO 📅

DCC & AC-coupled buffer

The amount of correction can be set by intentional injection of an offset current into the summing input node of INV, threshold-adjustable inverter

Note that the change to the threshold is opposite in direction to the change to INV

increasing DC of input signal is equivalent to lower down the threshold of INV

image-20241215233057176


image-20241216205525818

voltage at INV1 will increased by: \[ \frac{\Delta V_{DAC} - \Delta {INV1}}{R_{DAC}} = \frac{\Delta {INV1} +A_0 \Delta {INV1}}{R_{F}} \] therefore \[ \Delta {INV1} = \Delta V_{DAC} \cdot \frac{R_F}{R_F+(A_0+1)R_{DAC}} \approx \Delta V_{DAC} \cdot \frac{R_F}{A_0R_{DAC}} \]

If \(R_{DAC} = R_F\) \[ \Delta {INV1}\approx \frac{\Delta V_{DAC}}{A_0} \]

C. Menolfi et al., "A 112Gb/S 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS," 2018 IEEE International Solid-State Circuits Conference - (ISSCC) [https://sci-hub.se/https://doi.org/10.1109/ISSCC.2018.8310205],[visual]

Bob Lefferts, Navraj Nandra. SNUG Israel 2007 [https://picture.iczhiku.com/resource/eetop/whKYwQorwYoPUVbm.pdf]


image-20240720073616597

Since duty-cycle error is high frequency component, the high-pass filter suppresses the duty-cycle error propagating to the output

image-20240720005226736

  • The AC-coupling capacitor blocks the low-frequency component of the input
  • The feedback resistor sets common mode voltage to the crossover voltage

Bae, Woorham; Jeong, Deog-Kyoon: 'Analysis and Design of CMOS Clocking Circuits for Low Phase Noise' (Materials, Circuits and Devices, 2020)

Casper B, O'Mahony F. Clocking analysis, implementation and measurement techniques for high-speed data links: A tutorial. IEEE Transactions on Circuits and Systems I: Regular Papers. 2009;56(1):17-39

reference

Article (20500632) Title: How to simulate Random and Deterministic Jitters URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1O3w000009fiXeEAI

Spectre Tech Tips: Measuring Noise in Digital Circuits - Analog/Custom Design - Cadence Blogs - Cadence Community https://community.cadence.com/cadence_blogs_8/b/cic/posts/s . . .

Cadence RAK: Deterministic Jitter Measurement using SpectreRF

Frank Wiedmann. Using sampled pxf analysis to simulate deterministic jitter [https://community.cadence.com/cadence_technology_forums/f/custom-ic-design/51605/using-sampled-pxf-analysis-to-simulate-deterministic-jitter]

supply noise sensitivity: PSS+PAC or PSS+PX [https://designers-guide.org/forum/YaBB.pl?num=1376500816]


J. Kim et al., "A 112 Gb/s PAM-4 56 Gb/s NRZ Reconfigurable Transmitter With Three-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 54, no. 1, pp. 29-42, Jan. 2019, doi: 10.1109/JSSC.2018.2874040

— et al., "A 224-Gb/s DAC-Based PAM-4 Quarter-Rate Transmitter With 8-Tap FFE in 10-nm FinFET," in IEEE Journal of Solid-State Circuits, vol. 57, no. 1, pp. 6-20, Jan. 2022, doi: 10.1109/JSSC.2021.3108969


J. N. Tripathi, V. K. Sharma and H. Shrimali, "A Review on Power Supply Induced Jitter," in IEEE Transactions on Components, Packaging and Manufacturing Technology, vol. 9, no. 3, pp. 511-524, March 2019 [https://sci-hub.st/10.1109/TCPMT.2018.2872608]

H. Kim, J. Fan and C. Hwang, "Modeling of power supply induced jitter (PSIJ) transfer function at inverter chains," 2017 IEEE International Symposium on Electromagnetic Compatibility & Signal/Power Integrity (EMCSI), Washington, DC, USA, 2017 [https://sci-hub.st/10.1109/ISEMC.2017.8077937]

Yin Sun, Chulsoon Hwang EMC Laboratory. Improving Power Supply Induced Jitter Simulation Accuracy for IBIS Model [https://ibis.org/summits/aug20/sun.pdf]

High Speed Communications Part 8 – On Die CMOS Clock Distribution. [https://youtu.be/nx5CiHcwrF0?si=-eSO-LaaaFrVuIA1]

Low-Jitter CMOS Clock Distribution [https://youtu.be/LMT-T41Y64U?si=y8IpWCtU90zpe4Ob]

Mo, Xunjun & Wu, Jiaqi & Wary, Nijwm & Carusone, Tony. (2021). Design Methodologies for Low-Jitter CMOS Clock Distribution. IEEE Open Journal of the Solid-State Circuits Society. 1. 94-103. 10.1109/OJSSCS.2021.3117930. [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9559395]


Mozhgan Mansuri. ISSCC2021 SC3: Clocking, Clock Distribution, and Clock Management in Wireline/Wireless Subsystems [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC3.pdf]

Phillip Restle. ISSCC2021 SC4: Processor Clock Generation, Distribution, and Clock Sensor/Management Loops [https://www.nishanchettri.com/isscc-slides/2021%20ISSCC/SHORT%20COURSE/ISSCC2021-SC4.pdf]

Sam Palermo. Spring 2025 ECEN720 : High-Speed Links Circuits and Systems [Lecture 14: Clock Distribution Techniques]

DC offset

Performing FFT to a signal with a large DC offset would often result in a big impulse around frequency 0 Hz, thus masking out the signals of interests with relatively small amplitude.

Remove_DC_Offset_Blog_10

One method to remove DC offset from the original signal before performing FFT

  • Subtracting the Mean of Original Signal

You can also not filter the input, but set zero to the zero frequency point for FFT result.

Nyquist component

If we go back to the definition of the DFT \[ X(N/2)=\sum_{n=0}^{N-1}x[n]e^{-j2\pi (N/2)n/2}=\sum_{n=0}^{N-1}x[n]e^{-j\pi n}=\sum_{n=0}^{N-1}x[n](-1)^n \] which is a real number.

The discrete function \[ x[n]=\cos(\pi n) \] is always \((-1)^n\) for integer \(n\)

One general sinusoid at Nyquist and has phase shift \(\theta\), this is \(T=2\) and \(T_s=1\)

\[\begin{align} x[n] &= A \cos(\pi n + \theta) \\ &= A \big( \cos(\pi n) \cos(\theta) - \sin(\pi n) \sin(\theta) \big) \\ &= \big(A\cos(\theta)\big) \cos(\pi n) + \big(-A\sin(\theta)\big) \sin(\pi n) \\ &= \big(A\cos(\theta)\big) (-1)^n + \big(-A\sin(\theta)\big) \cdot 0 \\ &= B \cdot (-1)^n \\ \end{align}\]

Where \(A\cos(\theta)=B\).

Moreover \(B \cdot (-1)^n = B\cdot \cos(\pi n)\), then \[ B\cdot \cos(\pi n) = A \cdot \cos(\pi n + \theta) \] We can NOT distinguish one from another.

In other words, you CAN'T infer the signal from \(X(\frac{N}{2})\) \[\begin{align} X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}} &= \frac{X\left(\frac{N}{2} \right)}{N}(-1)^n \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\cos(\pi n) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\left( \cos(\pi n) - \beta \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N}\sqrt{1+\beta^2}\left(\frac{1}{\sqrt{1+\beta^2}} \cos(\pi n) - \frac{\beta}{\sqrt{1+\beta^2}} \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)}\left(\cos(\theta) \cos(\pi n) - \sin(\theta) \sin(\pi n) \right) \\ &= \frac{X\left(\frac{N}{2} \right)}{N} \frac{1}{\cos(\theta)} \cos(\pi n+\theta) \end{align}\]

where \(\beta \in \mathbb{R}\) and you wouldn't know it because \(\sin(\pi n)=0 \quad \forall n \in \mathbb{Z}\)

For example, if \(\theta=0\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N} \cos(\pi n) \] However, if \(\theta=\frac{\pi}{3}\) \[ X(k)\frac{1}{N}e^{j 2 \pi \frac{nk}{N}}\bigg|_{k=\frac{N}{2}}= \frac{X\left(\frac{N}{2} \right)}{N}\cdot 2 \cos(\pi n+\frac{\pi}{3}) \]

That sort of ambiguity is the reason for the strict inequality of the sampling theorem's condition.

Duty Cycle Distortion

image-20250522203602328

[Prof. Tony Chan Carusone, Low-Jitter CMOS Clock Distribution]

Both edges is used for clock waveform to evaluate the duty cycle distortion,

Assuming TIE is [0 0.2 0 0.2 0 0.2 ...], then subtract DC offset, we get [-0.1 0.1 -0.1 0.1 ...], shown as below

image-20220516224616908

The amplitude is manifested in FFT amplitude spectrum, i.e. Nyquist component, which is 0.1 in follow figure

image-20220516233618663

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
N = 32;
n = (1:N);
x = 0.1*(-1).^n;
figure(1)
stem(n-1, x);
X = fft(x)/N;
Xshift = fftshift(X);
fprintf("nyquist component: %.2f\n", Xshift(1));
magXshift = abs(Xshift);
ph = phase(Xshift)/pi*180;
figure(2)
subplot(3, 1, 1)
fx = (-N/2:N/2-1);
stem(fx, magXshift);
xlabel('Freq');
ylabel('|X(k)|');
title('mag of DFT');
grid on;

subplot(3, 1, 2)
stem(fx, ph);
xlabel('Freq');
ylabel('\angle X(k)(^oC)');
title('phase of DFT');
grid on;

%% inverse dft
ninv = (0:32-1);
xinv = Xshift(1)*cos(pi*ninv);
subplot(3, 1, 3);
hold on;
stem(ninv, x, "filled", 'r');
stem(ninv, xinv,'bd-.');
ninfer = (0:0.1:32+1);
xinfer1 = Xshift(1)*cos(pi*ninfer); % theta = 0
xinfer2 = Xshift(1)*2*cos(pi*ninfer+pi/3); % theta = pi/3
plot(ninfer, xinfer1, 'm--');
plot(ninfer, xinfer2, 'c--');
hold off;
legend('original', 'IDFT', '\theta=0', '\theta=\pi/3');
xlabel('time');
ylabel('V');
title('sample points');
grid on;

Single-Sided Amplitude Spectrum

DC and Nyquist frequency of FFT left over

P1(2:end-1) = 2*P1(2:end-1);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
Fs = 1000;            % Sampling frequency
T = 1/Fs; % Sampling period
L = 1500; % Length of signal
t = (0:L-1)*T; % Time vector
S = 0.7*sin(2*pi*50*t) + sin(2*pi*120*t);
X = S + 2*randn(size(t));
figure(1)
plot(1000*t(1:50),X(1:50))
title('Signal Corrupted with Zero-Mean Random Noise')
xlabel('t (milliseconds)')
ylabel('X(t)')

figure(2)
Y = fft(X);
P2 = abs(Y/L); %!!! two-sided spectrum P2.
P1 = P2(1:L/2+1); %!!! single-sided spectrum P1
P1(2:end-1) = 2*P1(2:end-1); % exclude DC and Nyquist freqency
f = Fs*(0:(L/2))/L;
figure(2)
plot(f,P1)
title('Single-Sided Amplitude Spectrum of X(t)')
xlabel('f (Hz)')
ylabel('|P1(f)|')

image-20220514221609734

image-20220514221642170

Alternative View

The direct current (DC) bin (\(k=0\)) and the bin at \(k=N/2\), i.e., the bin that corresponds to the Nyquist frequency are purely real and unique.

sinusoidal waveform with \(10Hz\), amplitude 1 is \(cos(2\pi f_c t)\). The plot is shown as below with sampling frequency is \(20Hz\)

image-20220427172526711

Amplitude and Phase spectrum, sampled with \(f_s=20\) Hz

image-20220427174557915

The FFT magnitude of \(10Hz\) is 1 and its phase is 0 as shown as above, which proves the DFT and IDFT.

Caution: the power of FFT is related to samples (DFT Parseval's theorem), which may not be the power of continuous signal. The average power of samples is ([1 -1 1 -1 -1 1 ...]) is 1, that of corresponding continuous signal is \(\frac{1}{2}\).

Power spectrum derived from FFT provide information of samples, i.e. 1

Moreover, average power of sample [1 -1 1 -1 1 ...] is same with DC [1 1 1 1 ...].

code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
fc = 10;
fs = 2*fc;
fov = 64*fc;

ts = (0:1/fs:26);
tov = (0:1/fov:26);

ys = cos(2*pi*fc*ts);
yov = cos(2*pi*fc*tov);
%% waveform
stem(ts, ys)
hold on;
plot(tov, yov);
legend('sample', 'waveform')
ylim([-1.5 1.5])
grid on;
xlabel('time(s)');
ylabel('mag(V)')

nfft = 256;
X = fftshift(fft(ys, nfft))/nfft;
f = (-nfft/2:nfft/2-1)*fs/nfft;
magX = abs(X);
phsX = atan2(imag(X),real(X));
%% fft spectrum
figure(2)
subplot(2, 1, 1);
stem(f, magX);
xlabel('Frequency(Hz)');
ylabel('mag')
xlim([min(f)-1 max(f)+1])
title('Amplitude spectrum')
subplot(2, 1, 2)
plot(f, phsX);
xlabel('Frequency(Hz)');
ylabel('phase (rad)')
xlim([min(f)-1 max(f)+1])
title('Phase spectrum')

%% power spectrum
yssq_sum_avg = sum(ys(1:nfft).^2)/nfft;
specsq_sum_avg = sum(abs(X).^2);

reference

OriginLab, How to Remove DC Offset before Performing FFT URL: http://blog.originlab.com/how-to-remove-dc-offset-before-performing-fft

How to remove DC component in FFT? URL: https://www.mathworks.com/matlabcentral/answers/712808-how-to-remove-dc-component-in-fft#answer_594373

Analyzing a signal that contains frequency content at Fs/2 doesn't seem to work unless there is a phase shift URL: https://dsp.stackexchange.com/a/59807/59253

Nyquist–Shannon sampling theorem, Critical frequency URL: https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem#Critical_frequency

Why remove energy at Nyquist before ifft? URL: https://dsp.stackexchange.com/a/22851/59253

  • A glitch is an unwanted pulse at the output of a combinational logic network – a momentary change in an output that should not have changed
  • A circuit with the potential for a glitch is said to have a hazard
  • In other words a hazard is something intrinsic about a circuit; a circuit with hazard may or may not have a glitch depending on input patterns and the electric characteristics of the circuit.

When do circuits have hazards ?

Hazards are potential unwanted transients that occur in the output when different paths from input to output have different propagation delays

Types of Hazards (on an output)

static 1-hazard, static 0-hazard, dynamic hazard

image-20220508183800744

Hazard's Concern

  • Hazards do not hurt synchronous circuits
  • Hazards Kill Asynchronous Circuits
  • Glitches Increase Power Consumption

referece

CPE166/EEE 270 Advanced Logic Design-Digital Design: Time Behavior of Combinational Networks: https://www.csus.edu/indiv/p/pangj/166/f/sram/Handout_Hazard.pdf

John Knight, ELEC3500 Glitches and Hazards in Digital Circuits http://www.doe.carleton.ca/~shams/ELEC3500/hazards.pdf

verilog-mode.el

Emacs Online Documentation https://doc.endlessparentheses.com/

Emacs verilog-mode 的使用 URL: https://www.wenhui.space/docs/02-emacs/verilog_mode_useguide/

1
emacs --no-site-file --load path/to/verilog-mode.el --batch filename.v -f verilog-auto-save-compile

CAUTION: filename.v is overwrite by command

verilog-mode.el

1
2
3
4
/*AUTOINPUT*/
/*AUTOWIRE*/
/*AUTOINST*/
/*AUTO_TEMPLATE*/

-f verilog-batch-auto

For use with --batch, perform automatic expansions as a stand-alone tool. This sets up the appropriate Verilog mode environment, updates automatics with M-x verilog-auto on all command-line files, and saves the buffers. For proper results, multiple filenames need to be passed on the command line in bottom-up order.

-f verilog-auto-save-compile

Update automatics with M-x verilog-auto, save the buffer, and compile

Emacs

--no-site-file

Another file for site-customization is site-start.el. Emacs loads this before the user's init file (.emacs, .emacs.el or .emacs.d/.emacs.d). You can inhibit the loading of this file with the option --no-site-file

--batch

The command-line option --batch causes Emacs to run noninteractively. The idea is that you specify Lisp programs to run; when they are finished, Emacs should exit.

--load, -l FILE, load Emacs Lisp FILE using the load function;

--funcall, -f FUNC, call Emacs Lisp function FUNC with no arguments

-f FUNC

--funcall, -f FUNC, call Emacs Lisp function FUNC with no arguments

--load, -l FILE

--load, -l FILE, load Emacs Lisp FILE using the load function

Verilog-mode is a standard part of GNU Emacs as of 22.2.

multiple directories

AUTOINST only search in the file's directory default.

You can append below verilog-library-directories for multiple directories search

1
2
3
// Local Variables:
// verilog-library-directories:("." "subdir" "subdir2")
// End:

plusargs in Verilog

systemverilog-command-line-input URL: https://www.chipverify.com/systemverilog/systemverilog-command-line-input

PLUSARGS IN SYSTEMVERILOG URL:https://www.theartofverification.com/plusargs-in-systemverilog/

plusargs are command-line switches supported by the simulator. As per SystemVerilog LRM arguments beginning with the + character will be available using the $test$plusargs and $value$plusargs PLI APIs.

1
2
3
$test$plusargs (user_string)

$value$plusargs (user_string, variable)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// tb.v
module tb;
int a;
initial begin
if($test$plusargs("RUNSIM")) begin
$display("There is RUNSIM plusargs");
end else begin
$display("There is NO $test$plusargs");
end
if($value$plusargs("SEED=%d",a)) begin
$display("SEED=%d",a);
end else begin
$display("There is NO $value$plusargs");
end
end
endmodule
  • compile

    1
    2
    $ vlib work
    $ vlog -sv tb.v
  • simulate (QuestaSim)

    • without plusargs

      1
      $ vsim work.tb -c -do "run; exit"
      1
      2
      3
      4
      5
      6
      7
      8
      9
      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is NO $test$plusargs
      # There is NO $value$plusargs
      # exit
      # End time: 13:04:23 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0
    • with plusargs

      1
      $ vsim work.tb -c -do "run; exit" +SEED=31 +RUNSIM

      +SEED=31 +RUNSIM

      1
      2
      3
      4
      5
      6
      7
      8
      9
      # //
      # Loading sv_std.std
      # Loading work.tb(fast)
      # run
      # There is RUNSIM plusargs
      # SEED= 31
      # exit
      # End time: 13:04:55 on Jun 04,2022, Elapsed time: 0:00:01
      # Errors: 0, Warnings: 0

Inertial & transport delays

Verilog Nonblocking Assignments With Delays, Myths & Mysteries

Correct Methods For Adding Delays To Verilog Behavioral Models

Article (20488135) Title: Selecting Different Delay Modes in GLS (RAK) URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1O3w000009bdLyEAI

Article (20447759) Title: Gate Level Simulation (GLS): A Quick Guide for Beginners URL: https://support.cadence.com/apex/ArticleAttachmentPortal?id=a1Od0000005xEorEAE

image-20230414232256309

image-20230414232439556

Inertial delay

Inertial delay models are simulation delay models that filter pulses that are shorted than the propagation delay of Verilog gate primitives or continuous assignments (assign #5 y = ~a;)

COMBINATIONAL LOGIC ONLY !!!

  • Inertial delays swallow glitches
  • sequential logic implemented with procedure assignments DON'T follow the rule

continuous assignments

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
`timescale 1ns/100ps
module tb;

reg in;
/////////////////////////////////////////////////////
wire out;
assign #2.5 out = in;
/////////////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317000509716

procedure assignment - combinational logic

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
`timescale 1ns/100ps
module tb;

reg in;
reg out;

//////////// combination logic ////////////////////////
always @(*)
#2.5 out = in;
///////////////////////////////////////////////////////
/* the above code is same with following code
always @(*) begin
#2.5;
out = in;
end
*/
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235257361

procedure assignment - sequential logic

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
`timescale 1ns/100ps
module tb;
reg clk;

reg in;
reg out;

always begin
clk = 0;
#5;
forever begin
clk = ~clk;
#5;
end
end
//////////// sequential logic //////////////////
always @(posedge clk)
#2.5 out <= in;
///////////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
end

initial begin
#50;
$finish();
end

endmodule

image-20220316235620168

As shown above, sequential logic DON'T follow inertial delay

Transport delay

Transport delay models are simulation delay models that pass all pulses, including pulses that are shorter than the propagation delay of corresponding Verilog procedural assignments

  • Transport delays pass glitches, delayed in time
  • Verilog can model RTL transport delays by adding explicit delays to the right-hand-side (RHS) of a nonblocking assignment
1
2
always @(*)
y <= #5 ~a;

nonblocking assignment

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// nonblocking assignment ///
always @(*) begin
out <= #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003146825

blocking assignment

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
`timescale 1ns/100ps
module tb;

reg in;
reg out;
/////////////// blocking assignment ///
always @(*) begin
out = #2.5 in;
end
/////////////////////////////////////////
initial begin
in = 0;
#16;
in = 1;
#2;
in = 0;
#10;
in = 1;
#4;
in = 0;
end

initial begin
#50;
$finish();
end

endmodule

image-20220317003819457

It seems that new event is discarded before previous event is realized.

clocking block in SystemVerilog

Assignment at <interface>.<clocking block>.<output signal> (i.e. synchronous) do NOT change <interface>.<output signal> until active clock edge.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// router_io.sv

interface router_io(input bit clock);
logic reset_n;
logic [15:0] din;
logic [15:0] frame_n;
logic [15:0] valid_n;
logic [15:0] dout;
logic [15:0] valido_n;
logic [15:0] busy_n;
logic [15:0] frameo_n;

clocking cb @(posedge clock);
default input #1ns output #1ns;
output reset_n;
output din;
output frame_n;
output valid_n;
input dout;
input valido_n;
input frameo_n;
input busy_n;
endclocking: cb

// `reset_n` can be either a synchronous or an asynchronous signal
modport TB(clocking cb, output reset_n);

endinterface: router_io

All interface signals are asynchronous and without a direction spection (i.e. input, output, inout).

  • The direction can only be specified in clocking block for synchronous signals
  • or a modport for asynchronous signals

All directions for the signals in the clocking block must be with respect to the test program;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// test.sv

program automatic test(router_io.TB rtr_io);

initial begin
reset();
end

task reset();
rtr_io.reset_n = 1'b0;
rtr_io.cb.frame_n <= '1;
rtr_io.cb.valid_n <= '1;
repeat(2) @rtr_io.cb;
rtr_io.cb.reset_n <= 1'b1;
repeat(15) @(rtr_io.cb);
endtask: reset

endprogram: test
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
// router_test_top.sv

`timescale 1ns/100ps

module router_test_top;
parameter simulation_cycle = 100;

bit SystemClock = 0;

router_io top_io(SystemClock);
test t(top_io);

router dut(
.reset_n (top_io.reset_n),
.clock (top_io.clock),
.din (top_io.din),
.frame_n (top_io.frame_n),
.valid_n (top_io.valid_n),
.dout (top_io.dout),
.valido_n (top_io.valido_n),
.busy_n (top_io.busy_n),
.frameo_n (top_io.frameo_n)
);

initial begin
$timeformat(-9, 1, "ns", 10);
$fsdbDumpvars;
end

always begin
#(simulation_cycle/2) SystemClock = ~SystemClock;
end

endmodule

compile:

1
2
$ vcs -sverilog -full64 -kdb -debug_access+all router_test_top.sv test.sv router_io
.sv ../../rtl/router.v

file with `timescale must be placed in the first, which is router_test_top.sv in above example

clocking.output

image-20220621005749074

systemverilog don't pass clocking.output to interface's until current or next active edge and after output-skew

clocking.input

image-20220621010546293

Systemverilog automatically update clocking.input signal from interface's value, input-skew before active edge

Gotcha

An interface must be compiled separately like a module and CANNOT `include inside a package or ohter module

Cadence EE_pkg 101

What Is Real Number Modeling?

  • model analog blocks operation as signal flow model
  • only the digital solver is used for high-speed simulation
    • Event-driven
    • No convergence issues, because no analog solver is used
  • Five different language standards support real number modeling:
    • wreal (wired-real) ports in Verilog-AMS
    • real data type in VHDL
    • real data type in Verilog
    • real variables and nettypes in SystemVerilog (SV)
    • real types in e

Benefits of RNM

  • Most analog circuits that need to be modeled for MS verification at the SoC level can be described in terms of real-valued voltages or currents
  • RNM is a mixed approach, borrowing concepts from both continuous and discrete domains
    • The values are floating-point (real) number.
    • Time is discrete; the real signals change values based on discrete events
  • Applicability of RNM is bounded primarily by signal-flow model style
  • Migrating analog behavior from the analog domain to the event or pseudo-analog domain can bring huge benefits without sacrificing too much accuracy
  • Simulation is executed by a digital simulation engine without need for the analog solver
  • Hence real-number modeling enables very high performance simulation of mixed-signal systems

Limitations of RNM

  • connecting real or wreal signals to electrical signals requires careful consideration
    • Too conservative an approach can lead to large numbers of timepoints
    • Too liberal an approach can lead to losing signal accuracy
  • Time accuracy limited by the discrete sampling approach and the `timescale setting - no continuous signals anymore
  • Limited capability for combination of signals by wiring outputs together
    • Requires assumptions about impedances to do simple merging

constant part-select and indexed part-select in Verilog

Verilog scalar and vector [link]

What is the "+:" operator called in Verilog? [link]

A range of contiguous bits can be selected and is known as part-select. There are two types of part-selects, one with a constant part-select and another with an indexed part-select

1
2
reg [31:0] addr;
addr [23:16] = 8'h23; //bits 23 to 16 will be replaced by the new value 'h23 -> constant part-select

Having a variable part-select allows it to be used effectively in loops to select parts of the vector. Although the starting bit can be varied, the width has to be constant.

[<start_bit +: ] // part-select increments from start-bit

[<start_bit -: ] // part-select decrements from start-bit

Example

1
2
3
4
5
6
7
8
9
10
11
12
logic [31: 0] a_vect;
logic [0 :31] b_vect;

logic [63: 0] dword;
integer sel;

a_vect[ 0 +: 8] // == a_vect[ 7 : 0]
a_vect[15 -: 8] // == a_vect[15 : 8]
b_vect[ 0 +: 8] // == b_vect[0 : 7]
b_vect[15 -: 8] // == b_vect[8 :15]

dword[8*sel +: 8] // variable part-select with fixed width

Mixing Signed and Unsigned in Verilog

Bevan Baas, VLSI Digital Signal Processing, EEC 281 - VLSI Digital Signal Processing [https://www.ece.ucdavis.edu/~bbaas/281/]

Sign Extension

  1. Calculate the necessary minimum width of the sum so that it contains all input possibilities
  2. Extend the inputs' sign bits to the width of the answer
  3. Add as usual
  4. Ignore bits that ripple to the left of the answer's MSB
  1. signed
inA (signed) inB (signed) outSum
(signed/unsigned)
0101 (5) 1111 (-1)
extend sign 00101 11111
sum result 00100
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg; // signed result
reg [4:0] outSumUs; // unsigned result

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule
  1. mixed
inA (signed) inB (unsigned) outSum
(signed/unsigned)
0101 (5) 1111 (15)
extend sign 00101 01111
sum result 10100
1
2
reg signed [3:0] inA;
reg [3:0] inB;
inA (unsigned) inB (signed) outSum
(signed/unsigned)
0101 (5) 1111 (-1)
extend sign 00101 01111
sum result 10100
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSumSg;
reg [4:0] outSumUs;

initial begin
inA = 4'b0101;
inB = 4'b1111;
outSumUs = inA + inB;
outSumSg = inA + inB;

$display("signed out(%%d): %0d", outSumSg);
$display("signed out(%%b): %b", outSumSg);

$display("unsigned out(%%d): %0d", outSumUs);
$display("unsigned out(%%b): %b", outSumUs);
end
endmodule

xcelium

1
2
3
4
5
6
7
xcelium> run
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

vcs

1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:24 2022
signed out(%d): -12
signed out(%b): 10100
unsigned out(%d): 20
unsigned out(%b): 10100
V C S S i m u l a t i o n R e p o r t

observation

When signed and unsigned is mixed, the result is by default unsigned.

Prepend to operands with 0s instead of extending sign, even though the operands is signed

LHS DONT affect how the simulator operate on the operands but what the results represent, signed or unsigned

Therefore, although outSumUs is declared as signed, its result is unsigned

subtraction example

In logic arithmetic, addition and subtraction are commonly used for digital design. Subtraction is similar to addition except that the subtracted number is 2's complement. By using 2's complement for the subtracted number, both addition and subtraction can be unified to using addition only.

operands are signed
inA (signed) inB (signed) outSub
(signed/unsigned)
1111 (-1) 0010 (2)
extend sign 11111 00010
sub result 11101
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg signed [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule
1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:46 2022
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
V C S S i m u l a t i o n R e p o r t
1
2
3
4
5
6
xcelium> run
signed out(%d): -3
signed out(%b): 11101
unsigned out(%d): 29
unsigned out(%b): 11101
xmsim: *W,RNQUIE: Simulation is complete.
operands are mixed
inA (signed) inB (unsigned) outSub
(signed/unsigned)
1111 (-1) 0010 (2)
extend sign 01111 00010
sub result 01101
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
reg signed [3:0] inA;
reg [3:0] inB;
reg signed [4:0] outSubSg;
reg [4:0] outSubUs;

initial begin
inA = 4'b1111;
inB = 4'b0010;
outSubUs = inA - inB;
outSubSg = inA - inB;

$display("signed out(%%d): %0d", outSubSg);
$display("signed out(%%b): %b", outSubSg);

$display("unsigned out(%%d): %0d", outSubUs);
$display("unsigned out(%%b): %b", outSubUs);
end
endmodule
1
2
3
4
5
6
Compiler version S-2021.09-SP2-1_Full64; Runtime version S-2021.09-SP2-1_Full64;  May  7 17:50 2022
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
V C S S i m u l a t i o n R e p o r t
1
2
3
4
5
6
7
xcelium> run
signed out(%d): 13
signed out(%b): 01101
unsigned out(%d): 13
unsigned out(%b): 01101
xmsim: *W,RNQUIE: Simulation is complete.
xcelium> exit

Danger Sign

https://projectf.io/posts/numbers-in-verilog/

Verilog has a nasty habit of treating everything as unsigned unless all variables in an expression are signed. To add insult to injury, most tools won’t warn you if signed values are being ignored.

If you take one thing away from this post:

Never mix signed and unsigned variables in one expression!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
module signed_tb ();
logic [7:0] x; // 'x' is unsigned
logic signed [7:0] y; // 'y' is signed
logic signed [7:0] x1, y1;
logic signed [3:0] move;

always_comb begin
x1 = x + move; // !? DANGER: 'x' is unsigned but 'move' is signed
y1 = y + move;
end

initial begin
#10
$display("Coordinates (7,7):");
x = 8'd7;
y = 8'd7;
#10
$display("x : %b %d", x, x);
$display("y : %b %d", y, y);

#10
$display("Move +4:");
move = 4'sd4; // signed positive value
#10
$display("x1: %b %d *LOOKS OK*", x1, x1);
$display("y1: %b %d", y1, y1);

#10
$display("Move -4:");
move = -4'sd4; // signed negative value
#10
$display("x1: %b %d *SURPRISE*", x1, x1); // 0000_0111 + {0000}_1100 = 0001_0011
$display("y1: %b %d", y1, y1); // 0000_0111 + {1111}_1100 = 0000_0011
end
endmodule
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Chronologic VCS simulator copyright 1991-2021
Contains Synopsys proprietary information.
Compiler version S-2021.09-SP2-2_Full64; Runtime version S-2021.09-SP2-2_Full64; Nov 19 11:02 2022
Coordinates (7,7):
x : 00000111 7
y : 00000111 7
Move +4:
x1: 00001011 11 *LOOKS OK*
y1: 00001011 11
Move -4:
x1: 00010011 19 *SURPRISE*
y1: 00000011 3
V C S S i m u l a t i o n R e p o r t
Time: 60
CPU Time: 0.260 seconds; Data structure size: 0.0Mb

time format in Verilog

https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94062 https://verificationacademy.com/forums/systemverilog/time-vs-realtime#answer-94096

realtime vs time

  • $realtime round the current time to timeprecision

  • $time round the current time to integer

  • %t will scale the rounded value to represent timeprecision,

    i.e. \([\$\text{realtime}, \$\text{time}]\cdot \$\text{timeunit} / \$\text{timeprecision}\)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
module tb;
timeunit 10ns;
timeprecision 1ps;

initial begin
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);
// $realtime round to timeprecision
// $time round to integer
#1.10016
$display("$realtime = %g", $realtime);
$display("$time = %g", $time);

// %t format will scale the rounded value to represent timeprecision
// 1.1002*10e-9/1e-12 = 11002
$display("$realtime %%t = %t", $realtime);
$display("$time %%t = %t", $time);
end
endmodule

output

1
2
3
4
5
6
$realtime = 0
$time = 0
$realtime = 1.1002
$time = 1
$realtime %t = 11002
$time %t = 10000

timeunit, timeprecision

The time unit and time precision can be specified in the following two ways:

  • Using the compiler directive `timescale
  • Using the keywords timeunit and timeprecision
1
2
3
4
5
6
7
8
9
10
module D (...);
timeunit 100ps;
timeprecision 10fs;
...
endmodule

module E (...);
timeunit 100ps / 10fs; // timeunit with optional second argument
...
endmodule

The minimum of timeprecision determine %t output, the nearest timeunit and timeprecision determine the round of $realtime and $time. Of course, the simulator follow the time tick shown by $realtime.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
`timescale 1ns/1ns

// Due to minimum of timeprecision is 10ps
module rawplant;
timeunit 1ns;
timeprecision 100ps;

task print;
/*
1ns: 1
100ps: 6
10ps: 6
1ps: 6
*/
#1.666
$display("raw: $realtime = %g", $realtime); // 1.7
$display("raw: $time = %g", $time); // 2
$display("raw: $realtime %%t = %t", $realtime); // 1.7*1ns/10ps=170
$display("raw: $time %%t = %t", $time); // 2*1ns/10ps = 200
endtask

endmodule

module fineplant;
timeunit 100ps;
timeprecision 10ps;

task print;
/*
100ps: 2
10ps: 6
1ps: 6
*/
#2.66
$display("fine: $realtime = %g", $realtime); // 2.7
$display("fine: $time = %g", $time); // 3
$display("fine: $realtime %%t = %t", $realtime); // 2.7*100ps/10ps = 27
$display("fine: $time %%t = %t", $time); // 3*100ps/10ps = 30
endtask

endmodule

module tb;
rawplant rawblock();
fineplant fineblock();

initial begin
fork
rawblock.print();
fineblock.print();
join
end
endmodule

output

1
2
3
4
5
6
7
8
fine: $realtime = 2.7
fine: $time = 3
fine: $realtime %t = 27
fine: $time %t = 30
raw: $realtime = 1.7
raw: $time = 2
raw: $realtime %t = 170
raw: $time %t = 200

questasim cmd

1
2
3
vlib work
vlog tb.v
vsim -c -do "run -all;exit" work.tb

data dumping of simulation

$dumpvars and $dumpfile Verilog, [http://www.referencedesigner.com/tutorials/verilog/verilog_62.php]

FSDB

$fsdbDumpfile

It specifies the FSDB file name created by the Novas object files for FSDB dumping. If it is not specified, then the default FSDB file name is "novas.fsdb".

This command is valid only before executing $fsdbDumpvars and is ignored if specified after $fsdbDumpvars

$fsdbSuppress

The fsdbSuppressutility is used to skip dumping of few instances, scopes, modules and signals. The fsdbSuppressutility is a system task like other fsdb tasks.

For $fsdbSuppress() to be effective, it needs to be specified/called before $fsdbDumpvars

$fsdbAutoSwitchDumpfile

Automatically switch to a new dump file when the working FSDB file reaches the specified size or the specified wall time period.

After the dumping is finished, a virtual FSDB file (*.vf) is automatically created and list all of the generated FSDB files with the correct sequence. Only the virtual FSDB file, rather than all of the FSDB files, needs to be loaded to view the simulation results

When specified in the design to switch based on file size:

1
$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"]);

When specified in the design to switch based on time period

1
$fsdbAutoSwitchDumpfile(File_Size | File_Size_var, "FSDB_Name" |FSDB_Name_var, Number_of_Files | Number_of_Files_var[ ,"log_filename" | ,log_filename_var ], ["+no_overwrite"], “+by_period”);

“+by_period”

$fsdbDumpvars

This command dumps the change in signal value to the FSDB file.

1
$fsdbDumpvars([ depth, | "level=",depth_var, ],[instance | "instance=",instance_var])

For VCS users, to include memory, MDA, packed array and structure information in the generated FSDB file, the -debug_access option must be included when VCS is invoked to compile the design

  • depth

    Specify how many sub-scope levels under the given scope you want to dump.

    • Specify this argument as 1 to dump the signals under the given scope
    • Specify this argument as 0 to dump all signals under the given scope and its descendant scopes.

    0: all signals in all scopes.

    1: all signals in current scope.

    2: all signals in the current scope and all scopes one level below.

    n: all signals in the current scope and all scopes n-1 levels below.

    tb.clk tb.u_div2.div2 tb.u_div2.u_div2neg.div2neg
    $fsdbDumpvars(0)
    $fsdbDumpvars(1)
    $fsdbDumpvars(2)
    $fsdbDumpvars(1, tb.u_div2)
    $fsdbDumpvars(0, tb.u_div2)
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    module tb;
    reg clk;

    divider2 u_div2(clk);

    initial begin
    clk = 1'b0;
    forever #5 clk = ~clk;
    end

    initial begin
    #100;
    $finish();
    end

    initial begin
    #10;
    $fsdbDumpfile("tb.fsdb");
    //$fsdbDumpvars(0); // same with $fsdbDumpvars(0, tb)
    //$fsdbDumpvars(1); // same with $fsdbDumpvars(1, tb)
    //$fsdbDumpvars(2); // same with $fsdbDumpvars(2, tb)
    //$fsdbDumpvars(1, tb.u_div2);
    $fsdbDumpvars(0, tb.u_div2);
    #80 $finish();
    end

    endmodule


    module divider2 (
    input clk
    );
    reg div2;

    divider2neg u_div2neg(div2);

    always@(posedge clk) begin
    div2 = ~div2;
    end

    initial begin
    div2 = 1'b0;
    end

    endmodule

    module divider2neg (
    input clk
    );
    reg div2neg;

    always@(negedge clk) begin
    div2neg = ~div2neg;
    end

    initial begin
    div2neg= 1'b0;
    end

    endmodule

    compile

    1
    vcs -full64 -kdb -debug_access+all tb.v

    simulate

    1
    ./simv

    load fsdb

    1
    verdi -ssf tb.fsdb

    image-20220604192421888

$fsdbDumpon, $fsdbDumpoff

1
2
3
$fsdbDumpon(["+fsdbfile+filename"])

$fsdbDumpoff(["+fsdbfile+filename"])

These FSDB dumping commands turn dumping on and off. fsdbDumpon/fsdbDumpoff has the highest priority and overrides all other FSDB dumping commands.

fsdbDumpon/fsdbDumpoff is not restricted to only fsdbDumpvars. If there is more than one FSDB file open for dumping at one simulation run, fsdbDumpon/fsdbDumpoff may only affect a specific FSDB file by specifying the specific file name.

  • +fsdbfile+filename: Specify the FSDB file name. If not specified, the default FSDB file name is "novas.fsdb"

$fsdbDumpFinish

This command closes all FSDB files in the current simulation and stops dumping of signals. Although all FSDB files are closed automatically at the end of simulation, this dumping command can be invoked to explicitly close the FSDB files during the simulation

VCD

$dumpfile

The declaration onf $dumpfile must come before the $dumpvars or any other system tasks that specifies dump.

1
$dumpfile("test.vcd");

argument is necessary, there is no default value

$dumpvars

The $dumpvars is used to specify which variables are to be dumped ( in the file mentioned by $dumpfile). The simplest way to use it is without any argument.

1
$dumpvars(<levels> <, <module_or_variable>>* );

$dumplimit

It is possible that you inadvertantly generate huge file in Gigabytes ( for examples while dumping a Gigahertz clock for one second). To reduce such occurrences, we may use $dumplimit. It usage is

1
$dumplimit(<filesize>);

$dumpoff and $dumpon

During the simulation if you are bothered about about only during a certain interval then you can use $dumpoff and $dumpon. The following example shows its usage. It will dump the changes for first 100 units of time and then between 10200 and 10400 units of time.

1
2
3
4
5
6
7
8
9
10
11
12
initial
$monitor($time, " reset=%b,clk_out=%b",reset,clk_out);
initial begin
$dumpfile("clkdiv2n_tb.vcd");
$dumpvars(0,clkdiv2n_tb);
#100;
$dumpoff;
#10200;
$dumpon;
#10400;
$dumpoff;
end

demo

stimulus.v

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
`timescale 1ns / 1ps
module stimulus;
// Inputs
reg x;
reg y;
// Outputs
wire z;
// Instantiate the Unit Under Test (UUT)
comparator uut (
.x(x),
.y(y),
.z(z)
);

initial begin
$dumpfile("test.vcd");
$dumpvars(0);
// Initialize Inputs
x = 0;
y = 0;

#20 x = 1;
#20 y = 1;
#20 y = 0;
#20 x = 1;
#40 ;

end

initial begin
$monitor("t=%3d x=%d,y=%d,z=%d \n",$time,x,y,z, );
end

endmodule

comparator.v

1
2
3
4
5
6
7
8
9
module comparator(
input x,
input y,
output z
);

assign z = (~x & ~y) |(x & y);

endmodule
1
2
$ xrun stimulus.v comparator.v -access +rwc
$ simvision test.vcd

readmemb & readmemh in Verilog

Initialize Memory in Verilog [https://projectf.io/posts/initialize-memory-in-verilog/]

$readmemh("hex_memory_file.mem", memory_array, [start_address], [end_address]) $readmemb("bin_memory_file.mem", memory_array, [start_address], [end_address])

The system task has no versions to accept octal data or decimal data.

  • The 1st argument is the data file name.
  • The 2nd argument is the array to receive the data.
  • The 3rd argument is an optional start address, and if you provide it, you can also provide
  • The 4th argument optional end address.

Note, the 3rd and 4th argument address is for array not data file.

If the memory addresses are not specified anywhere, then the system tasks load file data sequentially from the lowest address toward the highest address.

The standard before 2005 specify that the system tasks load file data sequentially from the left memory address bound to the right memory address bound.

readtest.v

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
module readfile;
reg [7:0] array4 [0:3];
reg [7:0] array7 [6:0];
reg [7:0] array12 [11:0];

integer i;

initial begin
$readmemb("data.txt", array4);
$readmemb("data.txt", array7, 2, 5);
$readmemb("data.txt", array12);

for (i = 0; i < 4; i = i+1)
$display("array4[%0d] = %b", i, array4[i]);

$display("=========================");

for (i = 0; i < 7; i = i+1)
$display("array7[%0d] = %b", i, array7[i]);

$display("=========================");

for (i = 0; i < 12; i = i+1)
$display("array12[%0d] = %b", i, array12[i]);
end
endmodule

data.txt

1
2
3
4
5
6
7
8
00000000 
00000001
00000010
00000011
00000100
00000101
00000110
00001000

result

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
array4[0] = 00000000
array4[1] = 00000001
array4[2] = 00000010
array4[3] = 00000011
=========================
array7[0] = xxxxxxxx
array7[1] = xxxxxxxx
array7[2] = 00000000
array7[3] = 00000001
array7[4] = 00000010
array7[5] = 00000011
array7[6] = xxxxxxxx
=========================
array12[0] = 00000000
array12[1] = 00000001
array12[2] = 00000010
array12[3] = 00000011
array12[4] = 00000100
array12[5] = 00000101
array12[6] = 00000110
array12[7] = 00001000
array12[8] = xxxxxxxx
array12[9] = xxxxxxxx
array12[10] = xxxxxxxx
array12[11] = xxxxxxxx

iff in SystemVerilog

system verilog中的iff, [https://www.francisz.cn/2019/07/18/sv-iff]

1
2
@(posedge clk iff(vld));
do_something;

is equivalent to

1
2
3
4
5
forever begin
@(posedge clk);
if(vld) break;
end
do_something;

iff is more efficient than if because the expression is recalculated when vld transition rather than clk.

One example, detecting the negative edge of rtr_io.cb.frameo_n[da]

1
2
3
wait(rtr_io.cb.frameo_n[da] !== 0);
@(rtr_io.cb iff(rtr_io.cb.frameo_n[da] === 0 ));
$display("[DEBUG HGUO] %0t, rtr_io.cb.frameo_n[da] negedge", $realtime);

image-20220621182019927

[DEBUG HGUO] 6887250.0ns, rtr_io.cb.frameo_n[da] negedge

signed and unsigned arithmetic in Verilog

With implict sign extension, the implementation of signed arithmetic is DIFFERENT from that of unsigned. Otherwise, their implementations are same.

The implementations manifest the RTL's behaviour correctly

add without implicit sign extension

unsigned

rtl
1
2
3
4
5
6
7
module TOP (
input wire [2:0] data0
,input wire [2:0] data1
,output wire [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114215200

synthesized netlist

image-20220507114307439

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:43:27 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

vcs compile with -v /path/to/lib.v

signed

rtl
1
2
3
4
5
6
7
module TOP (
input wire signed [2:0] data0
,input wire signed [2:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507114654777

synthesized netlist

image-20220507114844111

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 11:48:54 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [2:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data0[0]), .A2(data1[0]), .Z(n5) );
nr02d0 U7 ( .A1(data0[0]), .A2(data1[0]), .ZN(n4) );
nr02d0 U8 ( .A1(n5), .A2(n4), .ZN(result[0]) );
ad01d0 U9 ( .A(data1[1]), .B(data0[1]), .CI(n5), .CO(n6), .S(result[1]) );
xr03d1 U10 ( .A1(n6), .A2(data0[2]), .A3(data1[2]), .Z(result[2]) );
endmodule

add WITH implicit sign extension

unsigned with 0 extension

rtl
1
2
3
4
5
6
7
module TOP (
input wire [2:0] data0 // 3 bit unsigned
,input wire [1:0] data1 // 2 bit unsigned
,output wire [2:0] result // 3 bit unsigned
);
assign result = data0 + data1;
endmodule

image-20220507121521303

synthesized netlist

image-20220507121622001

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:15:58 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n4, n5, n6;

an02d0 U6 ( .A1(data1[0]), .A2(data0[0]), .Z(n6) );
ad01d0 U7 ( .A(data1[1]), .B(data0[1]), .CI(n6), .CO(n4), .S(result[1]) );
xr02d1 U8 ( .A1(data0[2]), .A2(n4), .Z(result[2]) );
nr02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n5) );
nr02d0 U10 ( .A1(n6), .A2(n5), .ZN(result[0]) );
endmodule

signed with implicit sign extension

rtl
1
2
3
4
5
6
7
module TOP (
input wire signed [2:0] data0
,input wire signed [1:0] data1
,output wire signed [2:0] result
);
assign result = data0 + data1;
endmodule

image-20220507122053948

synthesized netlist

image-20220507122217830

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Sat May 7 12:21:51 2022
/////////////////////////////////////////////////////////////


module TOP ( data0, data1, result );
input [2:0] data0;
input [1:0] data1;
output [2:0] result;
wire n6, n7, n8, n9, n10;

nd02d0 U9 ( .A1(data1[0]), .A2(data0[0]), .ZN(n10) );
inv0d0 U10 ( .I(n10), .ZN(n9) );
nr02d0 U11 ( .A1(data0[1]), .A2(data1[1]), .ZN(n7) );
aor221d1 U12 ( .B1(n9), .B2(data1[1]), .C1(n10), .C2(data0[1]), .A(n7), .Z(
n6) );
xn02d1 U13 ( .A1(data0[2]), .A2(n6), .ZN(result[2]) );
ora21d1 U14 ( .B1(data1[0]), .B2(data0[0]), .A(n10), .Z(result[0]) );
aor21d1 U15 ( .B1(data1[1]), .B2(data0[1]), .A(n7), .Z(n8) );
mx02d0 U16 ( .I0(n10), .I1(n9), .S(n8), .Z(result[1]) );
endmodule

Latch Inference in Verilog

UC Berkeley CS150 Lec #20: Finite State Machines [slides]

always@( * )

always@( * ) blocks are used to describe Combinational Logic, or Logic Gates. Only = (blocking) assignments should be used in an always@( * ) block.

Latch Inference

If you DON'T assign every element that can be assigned inside an always@( * ) block every time that always@( * ) block is executed, a latch will be inferred for that element

The approaches to avoid latch generation:

  • set default values
  • proper use of the else statement, and other flow constructs

without default values

latch is generated

RTL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
endmodule
synthesized netlist

image-20220509170640006

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:09:18 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


lanhq1 C_reg ( .E(Trigger), .D(Pass), .Q(C) );
an02d0 U3 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule

add default value

Default values are an easy way to avoid latch generation

RTL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
module TOP (
input wire Trigger,
input wire Pass,
output reg A,
output reg C
);
always @(*) begin
A = 1'b0;
C = 1'b1;
if (Trigger) begin
A = Pass;
C = Pass;
end
end
synthesized netlist

image-20220509171319204

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
/////////////////////////////////////////////////////////////
// Created by: Synopsys DC Ultra(TM) in wire load mode
// Version : S-2021.06-SP5
// Date : Mon May 9 17:12:47 2022
/////////////////////////////////////////////////////////////


module TOP ( Trigger, Pass, A, C );
input Trigger, Pass;
output A, C;


nd12d0 U5 ( .A1(Pass), .A2(Trigger), .ZN(C) );
an02d0 U6 ( .A1(Pass), .A2(Trigger), .Z(A) );
endmodule

if evaluation

signed number cast to unsigned automatically before evaluating

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// tb.v
module tb;
reg signed [1:0] datasg;
reg [1:0] dataug;

initial begin
datasg = 2'b11;
dataug = 2'b11;

$display("datasg(%%d): %d", datasg);
$display("dataug(%%d): %d", dataug);

if (datasg)
$display("datasg is OK");
if (dataug)
$display("dataug is OK");

$finish();
end
endmodule
1
2
3
4
5
6
7
8
9
$ vlog tb.v
$ vsim -c -do "run;exit" work.tb
# Loading work.tb(fast)
# run
# datasg(%d): -1
# dataug(%d): 3
# datasg is OK
# dataug is OK
# ** Note: $finish : tb.v(16)

Arithmetic in Verilog

unsigned + unsigned = unsigned

1
2
3
4
5
6
7
8
9
10
function [7:0] satadd_uuu8b;   // unsigned + unsigned = unsigned
input [7:0] a;
input [7:0] b;

reg [8:0] t; // extend 1b
begin
t = {1'b0, a} + {1'b0, b};
satop_uuu16b = t[8] ? {8{1'b1}} : t[7:0];
end
endfunction

1'b1: overflow

signed + signed = signed

1
2
3
4
5
6
7
8
9
10
11
12
function [7:0] satadd_sss8b;    // signed + signed = signed
input signed [7:0] a;
input signed [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = a + b; // extend sign bit automatically
satadd_sss8b = (t[8:7] == 2'b01) ? {1'b0, 7{1'b1}} : // up sat
(t[8:7] == 2'b10) ? {1'b1, 7{1'b0}} : // dn sat
t[7:0];
end
endfunction

2'b01: overflow

2'b10: underflow

signed + unsigned = unsigned

1
2
3
4
5
6
7
8
9
10
11
12
function [7:0] satadd_suu8b;    // signed + unsigned = unsigned
input signed [7:0] a;
input [7:0] b;

reg signed [8:0] t; // extend 1b
begin
t = {a[7], a} + {1'b0, b};
satadd_ssu8b = (t[8:7] == 2'b10) ? {8{1'b1}} : // up saturate for unsigned
(t[8:7] == 2'b11) ? {8{1'b0}} : // dn saturate for unsigned
t[7:0];
end
endfunction

signed + unsigned = signed

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
function signed [7:0] satop_sus8b;    //signed +/- unsigned = signed
input signed [7:0] a;
input [7:0] b;
input plus;

reg signed [8:0] t; // extend 1b
begin
if(plus) begin
t = {a[7], a} + {1'b0, b};
satop_sus8b = (t[8:7]==2'b01) ? {1'b0, {7{1'b1}}} // up saturate for signed
: t[7:0];
end else begin
t = {a[7], a} - {1'b0, b};
satop_sus8b = (t[8:7]==2'b10) ? {1'b1, {7{1'b0}}} // dn saturate for signed
: t[7:0];
end
end
endfunction

Overflow Detection in Verilog

Overflow Detection: [http://www.c-jump.com/CIS77/CPU/Overflow/lecture.html]

  • Arithmetic operations have a potential to run into a condition known as overflow.
  • Overflow occurs with respect to the size of the data type that must accommodate the result.
  • Overflow indicates that the result was too large or too small to fit in the original data type.

Overflow when adding unsigned number

When two unsigned numbers are added, overflow occurs if

  • there is a carry out of the leftmost bit.

Overflow when adding signed numbers

When two signed 2's complement numbers are added, overflow is detected if:

  1. both operands are positive and the result is negative, or
  2. both operands are negative and the result is positive.

Notice that when operands have opposite signs, their sum will never overflow. Therefore, overflow can only occur when the operands have the same sign.

A B carryout_sum overflow
011 (3) 011 (3) 0_110 (6) overflow
100 (-4) 100 (-4) 1_000 (-8) underflow
111 (-1) 110 (-2) 1_101 (-3) -

carryout information ISN'T needed to detect overflow/underflow for signed number addition

EXTBIT:MSB

extended 1bit and msb bit can be used to detect overflow or underflow

1
2
3
4
5
6
7
8
9
10
reg signed  [1:0]      acc_inc;
reg signed [10-1:0] acc;
wire signed [10 :0] acc_w; // extend 1b for saturation
wire signed [10-1:0] acc_stat;

assign acc_w = acc + acc_inc; // signed arithmetic

assign acc_stat = (acc_w[10-1 +: 2] == 2'b01) ? {1'b0, {(10-1){1'b1}}} : // up saturation
(acc_w[10-1 +: 2] == 2'b10) ? {1'b1, {(10-1){1'b0}}} : // down saturation
acc_w[10-1:0];

2'b01 : overflow, up saturation

2'b10: underflow, down saturation

2's complement negative number

  1. Flip all bits
  2. Add 1.

N-bit signed number \[ A = -M_{N-1}2^{N-1}+\sum_{k=0}^{N-2}M_k2^k \] Flip all bits \[\begin{align} A_{flip} &= -(1-M_{N-1})2^{N-1} +\sum_{k=0}^{N-2}(1-M_k)2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -2^{N-1}+\sum_{k=0}^{N-2}2^k \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k -1 \end{align}\]

Add 1 \[\begin{align} A_- &= A_{flip}+1 \\ &= M_{N-1}2^{N-1}-\sum_{k=0}^{N-2}M_k2^k \\ &= -A \end{align}\]

reference

Lee, Weng Fook, Weng Fook Lee, and Glaser. Learning from VLSI design experience. Springer International Publishing, 2019.

Bevan Baas, EEC281 VLSI Digital Signal Processing, [https://www.ece.ucdavis.edu/~bbaas/281/]

One will see the following phenomena:

  1. A power spectrum is equal to the square of the absolute value of DFT.
  2. The sum of all power spectral lines in a power spectrum is equal to the power of the input signal.
  3. The integral of a PSD is equal to the power of the input signal.

power spectrum has units of \(V^2\) and power spectral density has units of \(V^2/Hz\)

Parseval's theorem is a property of the Discrete Fourier Transform (DFT) that states: \[ \sum_{n=0}^{N-1}|x(n)|^2 = \frac{1}{N}\sum_{k=0}^{N-1}|X(k)|^2 \] Multiply both sides of the above by \(1/N\): \[ \frac{1}{N}\sum_{n=0}^{N-1}|x(n)|^2 = \frac{1}{N^2}\sum_{k=0}^{N-1}|X(k)|^2 \] \(|x(n)|^2\) is instantaneous power of a sample of the time signal. So the left side of the equation is just the average power of the signal over the N samples. \[ P_{\text{av}} = \frac{1}{N^2}\sum_{k=0}^{N-1}|X(k)|^2\text{, }V^2 \] For the each DFT bin, we can say: \[ P_{\text{bin}}(k) = \frac{1}{N^2}|X(k)|^2\text{, k=0:N-1, }V^2/\text{bin} \] This is the power spectrum of the signal.

Note that \(X(k)\) is the two-sided spectrum. If \(x(n)\) is real, then \(X(k)\) is symmetric about \(fs/2\), with each side containing half of the power. In that case, we can choose to keep just the one-sided spectrum, and multiply Pbin by 2 (except DC & Nyquist):

\[ P_{\text{bin,one-sided}}(k) = \left\{ \begin{array}{cl} \frac{1}{N^2}|X(0)|^2 & : \ k = 0 \\ \frac{2}{N^2}|X(k)|^2 & : \ 1 \leq k \leq N/2-1 \\ \frac{1}{N^2}|X(N/2)|^2 & : \ k = N/2 \end{array} \right. \]

1
2
3
4
5
6
7
8
rng default
Fs = 1000;
t = 0:1/Fs:1-1/Fs;
x = cos(2*pi*100*t) + randn(size(t));
N = length(x);
xdft = fft(x);
xsq_sum_avg = sum(x.^2)/N;
specsq_sum_avg = sum(abs(xdft).^2)/N^2;

where xsq_sum_avg is same with specsq_sum_avg

For a discrete-time sequence x(n), the DFT is defined as: \[ X(k) = \sum_{n=0}^{N-1}x(n)e^{-j2\pi kn/N} \] By it definition, the DFT does NOT apply to infinite duration signals.

scaling DFT results

spectra type unit
amplitude spectrum V
power spectrum V^2
power spectrum density V^2/Hz

Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals Jens Ahrens, Carl Andersson, Patrik Höstmad, Wolfgang Kropp

[https://github.com/AppliedAcousticsChalmers/scaling-of-the-dft]

Different scaling is needed to apply for amplitude spectrum, power spectrum and power spectrum density, which shown as below

image-20220427220639646

\(f_s\) in Eq.(13) is sample rate rather than frequency resolution.

And Eq.(13) can be expressed as \[ \text{PSD}(k) =\frac{1}{f_{\text{res}}\cdot N\sum_{n}w^2(n)}\left|X_{\omega}(k)\right|^2 \] where \(f_{\text{res}}\) is frequency resolution

We define the following two sums for normalization purposes:

\[\begin{align} S_1 &= \sum_{j=0}^{N-1}w_j \\ S_2 &= \sum_{j=0}^{N-1}w_j^2 \end{align}\]

Given Eq.(12) and Eq.(13) \[\begin{align} \text{PS(k)} &= \text{PSD(k)} \cdot \frac{f_s \sum_{n}w^2(n)}{(\sum_{n}\omega(n))^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \frac{N \sum_{n}w^2(n)}{(\sum_{n}w(n))^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \frac{N S_2}{S_1^2} \\ &= \text{PSD(k)} \cdot f_{\text{res}} \cdot \text{NENBW} \\ &= \text{PSD(k)} \cdot \text{ENBW} \end{align}\]

where Normalized Equivalent Noise BandWidth is defined as \[ \text{NENBW} =\frac{N S_2}{S_1^2} \] and Effective Noise BandWidth is \[ \text{ENBW} =f_{\text{res}} \cdot \frac{N S_2}{S_1^2} \]

For Rectangular window, \(\text{ENBW} =f_{\text{res}}\)

This equivalent noise bandwidth is required when the resulting spectrum is to be expressed as spectral density (such as for noise measurements).

Eq.(13) Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Fs = 1024; 
t = (0:1/Fs:1-1/Fs).';
x = sin(2*pi*t*200);
Nx = length(x);
% Window data
w = hamming(Nx); % hamming window
xw = x.*w;
% Calculate power
nfft = Nx;
X = fft(xw,nfft);

NumUniquePts = nfft/2+1;
Pxx1 = abs(X(1:NumUniquePts)).^2/sum(w.^2)/Fs; %!!! Eq.(13)
Pxx1(2:end-1) = Pxx1(2:end-1)*2; % twosided -> onesided
Fx1 = (0:NumUniquePts-1)*Fs/nfft;

[Pxx2,Fx2] = pwelch(x,w,0,nfft,Fs);

plot(Fx1,10*log10(Pxx1),Fx2,10*log10(Pxx2),'r--');
legend('PSD via Eq.(13)','PSD via pwelch')

image-20220514165525958

window effects

It is possible to correct both the amplitude and energy content of the windowed signal to equal the original signal. However, both corrections cannot be applied simultaneously

Amplitude correction

Linear (amplitude) Spectrum \[ X(k) = \frac{X_{\omega}(k)}{S_1} \] power spectrum \[ \text{PS}=\frac{\left| X_{\omega}(k) \right|^2}{S_1^2} \]

usage: tone amplitude

Power correction

power spectral density (PSD) \[ \text{PSD} =\frac{\left|X_{\omega}(k)\right|^2}{f_s\cdot S_2} \]

We have \(\text{PSD} = \frac{\text{PS}}{\text{ENBW}}\), where \(\text{ENBW}=\frac{N \cdot S_2}{S_1^2}f_{\text{res}}\)

linear power spectrum \[ \text{PS}_L=\frac{|X_{\omega}(k)|^2}{N\cdot S_2} \]

usage: RMS value, total power \[ \text{PS}_L(k)=\text{PSD(k)} \cdot f_{\text{res}} \]

Window Correction Factors

While a window helps reduce leakage (The window reduces the jumps at the ends of the repeated signal), the window itself distorts the data in two different ways:

  • Amplitude – The amplitude of the signal is reduced

    This is due to the fact that the window removes information in the signal

  • Energy – The area under the curve, or energy of the signal, is reduced

rtaImage

Window correction factors are used to try and compensate for the effects of applying a window to data. There are both amplitude and energy correction factors.

Window Type Amplitude Correction (\(K_a\)) Energy Correction (\(K_e\))
Rectangluar 1.0 1.0
hann 1.9922 1.6298
blackman 2.3903 1.8155
kaiser 1.0206 1.0204

Only the Uniform window (rectangular window), which is equivalent to no window, has the same amplitude and energy correction factors.

\[\begin{align} X(k) &= \frac{X_\omega(k)}{\sum_n w(n)} \\ &= \frac{N}{\sum_n w(n)} \frac{X_\omega(k)}{N} \\ &= K_a \cdot \frac{X_\omega(k)}{N} \end{align}\] where \(K_a = \frac{N}{\sum_n \omega(n)}\)

In literature, Coherent power gain is defined show below, which is close related to \(K_a\) \[ \text{Coherent power gain (dB)} = 20 \; log_{10} \left( \frac{\sum_n w[n]}{N} \right) \]

amplitude-correction

With amplitude correction, by multiplying by two, the peak value of both the original and corrected spectrum match. However the energy content is not the same.

The amplitude corrected signal (red) appears to have more energy, or area under the curve, than the original signal (blue).

\[\begin{align} \text{PSD} &=\frac{1}{f_{\text{res}}\cdot N\sum_{n}w^2(n)}\left|X_{\omega}(K)\right|^2 \\ &= \frac{N}{\sum_{n}w^2(n)} \frac{\left|X_{\omega}(K)\right|^2}{f_{\text{res}}\cdot N^2} \\ &=\left|K_e \cdot \frac{X_{\omega}(K)}{N}\right|^2 \cdot \frac{1}{f_{\text{res}}} \end{align}\]

wherer \(K_e = \sqrt{\frac{N}{\sum_{n}\omega^2(n)}}\)

energey-correction

Multiplying the values in the spectrum by 1.63, rather than 2, makes the area under the curve the same for both the original signal (blue) and energy corrected signal (red)

hanning's correction factors:

1
2
3
4
N = 256;
w = hanning(N);
Ka = N/sum(w)
Ke = sqrt(N/sum(w.^2))
1
2
3
4
5
6
7
8
Ka =

1.9922


Ke =

1.6298

Amplitude,Power Correction Example

image-20220518002927545

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
%% 200 Hz, amplitude 1 and 200/3 Hz amplitude 0.3
fosc = 200;
Fs = 8192;
Nx = Fs*4;
tstop = Nx/Fs;
t = (0:1/Fs:tstop-1/Fs).';
x = sin(2*pi*t*fosc) + 0.3*sin(2*pi*t*fosc/3);
Nx = length(x);

%% direct method, time domain
P0 = sum(x.^2)/Nx;
xrms0 = sqrt(P0); % RMS value caculated by time domain
fprintf("xrms0: %.4f\n", xrms0);

%% Window data
w = hamming(Nx); % hamming window
%w = ones(Nx, 1); % rectangular window

S1= sum(w);
S2 = sum(w.^2);
NENBW = Nx*S2/S1^2;

xw = x.*w;
nfft = Nx;
X = fft(xw,nfft); % windowed signal fft
NumUniquePts = nfft/2+1;
Fx1 = (0:NumUniquePts-1)*Fs/nfft;

%% fft - ps -> rms
Xmag = abs(X);
PS1 = Xmag.^2/S1^2; %!!! Eq.(12)
PS1 = PS1(1:NumUniquePts);
PS1(2:end-1) = PS1(2:end-1)*2; % twosided -> onesided
Pmax_fft = max(PS1); % !!! amplitude OK
fprintf("Pmax_fft: %.4f\n", Pmax_fft);

% linear power spectrum
PSL_fft = Xmag.^2/nfft/S2;
PSL_fft = PSL_fft(1:NumUniquePts);
PSL_fft = PSL_fft(2:end-1)*2;
Ptot_fft1 = sum(PSL_fft);
xrms_PSL = sqrt(Ptot_fft1); % !!! power OK
fprintf("xrms_PSL: %.4f\n", xrms_PSL);

% use psd*fres to calculate total power
Pxx1 = abs(X(1:NumUniquePts)).^2/sum(w.^2)/Fs; %!!! Eq.(13)
Pxx1(2:end-1) = Pxx1(2:end-1)*2; % twosided -> onesided
Ptot_fft2 = sum(Pxx1*Fs/nfft);
xrms_psd = sqrt(Ptot_fft2); % !!! power OK
fprintf("xrms_psd: %.4f\n", xrms_psd);


%% pwelch
[Pxx2,Fx2] = pwelch(x,w,0,nfft,Fs);
fres = Fx2(2)-Fx2(1);

PS2 = Pxx2*fres;
Ptot_pwelch = sum(PS2);
xrms_pwelch = sqrt(Ptot_pwelch); % !!! amplitude OK
fprintf("xrms_pwelch: %.4f\n", xrms_pwelch);

% PS = PSD * ENBW always hold
ENBW = fres * NENBW;
PS2_pwelch = Pxx2*ENBW;
Pmax_pwelch = max(PS2_pwelch); % !!! amplitude OK
fprintf("Pmax_pwelch: %.4f\n", Pmax_pwelch);

%% plot psd of two methods
plot(Fx1,10*log10(Pxx1),Fx2,10*log10(Pxx2),'r--');
legend('PSD via Eq.(13)','PSD via pwelch')
grid on;
xlabel('Freq (Hz)');
ylabel('dB; V^2/Hz')
1
2
3
4
5
6
xrms0: 0.7383
Pmax_fft: 0.5000
xrms_PSL: 0.7382
xrms_psd: 0.7382
xrms_pwelch: 0.7382
Pmax_pwelch: 0.5000

RMS

We may also want to know for example the RMS value of the signal, in order to know how much power the signal generates. This can be done using Parseval’s theorem.

For a periodic signal, which has a discrete spectrum, we obtain its total RMS value by summing the included signals using \[ x_{\text{rms}}=\sqrt{\sum R_{xk}^2} \] Where \(R_{xk}\) is the RMS value of each sinusoid for \(k=1,2,3,...\) The RMS value of a signal consisting of a number of sinusoids is consequently equal to the square root of the sum of the RMS values.

This result could also be explained by noting that sinusoids of different frequencies are orthogonal, and can therefore be summed like vectors (using Pythagoras’ theorem)

For a random signal we cannot interpret the spectrum in the same way. As we have stated earlier, the PSD of a random signal contains all frequencies in a particular frequency band, which makes it impossible to add the frequencies up. Instead, as the PSD is a density function, the correct interpretation is to sum the area under the PSD in a specific frequency range, which then is the square of the RMS, i.e., the mean-square value of the signal \[ x_{\text{rms}}=\sqrt{\int G_{xx}(f)df}=\sqrt{\text{area under the curve}} \] The linear spectrum, or RMS spectrum, defined by \[\begin{align} X_L(k) &= \sqrt{\text{PSD(k)} \cdot f_{\text{res}}}\\ &=\sqrt{\frac{\left|X_{\omega}(k)\right|^2}{f_{\text{res}}\cdot N\sum_{n}\omega^2(n)} \cdot f_{\text{res}}} \\ &= \sqrt{\frac{\left|X_{\omega}(k)\right|^2}{N\sum_{n}\omega^2(n)}} \\ &= \sqrt{\frac{|X_{\omega}(k)|^2}{N\cdot S_2}} \end{align}\]

The corresponding linear power spectrum or RMS power spectrum can be defined by \[\begin{align} \text{PS}_L(k)&=X_L(k)^2=\frac{|X_{\omega}(k)|^2}{S_1^2}\frac{S_1^2}{N\cdot S_2} \\ &=\text{PS}(k) \cdot \frac{S_1^2}{N\cdot S_2} \end{align}\]

So, RMS can be calculated as below \[\begin{align} P_{\text{tot}} &= \sum \text{PS}_L(k) \\ \text{RMS} &= \sqrt{P_{\text{tot}}} \end{align}\]

DFT averaging

we use \(N= 8*\text{nfft}\) time samples of \(x\) and set the number of overlapping samples to \(\text{nfft}/2 = 512\). pwelch takes the DFT of \(\text{Navg} = 15\) overlapping segments of \(x\), each of length \(\text{nfft}\), then averages the \(|X(k)|^2\) of the DFT’s.

In general, if there are an integer number of segments that cover all samples of N, we have \[ N = (N_{\text{avg}}-1)*D + \text{nfft} \] where \(D=\text{nfft}-\text{noverlap}\). For our case, with \(D = \text{nfft}/2\) and \(N/\text{nfft} = 8\), we have \[ N_{\text{avg}}=2*N/\text{nfft}-1=15 \] For a given number of time samples N, using overlapping segments lets us increase \(N_{avg}\) compared with no overlapping. In this case, overlapping of 50% increases \(N_{avg}\) from 8 to 15. Here is the Matlab code to compute the spectrum:

1
2
3
4
5
6
7
8
nfft= 1024;
N= nfft*8; % number of samples in signal
n= 0:N-1;
x= A*sin(2*pi*f0*n*Ts) + .1*randn(1,N); % 1 W sinewave + noise
noverlap= nfft/2; % number of overlapping time samples
window= hanning(nfft);
[pxx,f]= pwelch(x,window,noverlap,nfft,fs); % W/Hz PSD
PdB_bin= 10*log10(pxx*fs/nfft); % dBW/bin

figure 3_99153

DFT averaging reduces the variance \(\sigma^2\) of the noise spectrum by a factor of \(N_{avg}\), as long as noverlap is not greater than nfft/2

image-20220420013255660

image-20230526011628682

fftshift

The result of fft and its index is shown as below

complexDFT_Matlab_index

After fftshift

FFTshift_Matlab_index_resolved

1
2
3
4
5
>> fftshift([0 1 2 3 4 5 6 7])

ans =

4 5 6 7 0 1 2 3

image-20250825230510355

1
2
3
4
5
6
7
8
9
clear;
N = 100;
fs = 1000;
fx = 100;
x = cos(2*pi*fx/fs*[0:1:N-1]);
s = abs(fftshift(fft(x)))/N;
fx = linspace(0,N-1,N)*fs/N -fs/2;
plot(fx, s, 'o-', 'linewidth', 2);
grid on; grid minor;

dft and psd function in virtuoso

  • dft always return

    image-20230525011604685

To compensate windowing effect, \(W(n)\), the dft output should be multiplied by \(K_a\), e.g. 1.9922 for hanning window.

  • psd function has taken into account \(K_e\), postprocessing is not needed

reference

Heinzel, Gerhard, A. O. Rüdiger and Roland Schilling. "Spectrum and spectral density estimation by the Discrete Fourier transform (DFT), including a comprehensive list of window functions and some new at-top windows." (2002). URL: https://holometer.fnal.gov/GH_FFT.pdf

Demonstration-DFT-PS-PSD

mathworks, Power Spectral Density Estimates Using FFT

Lyons, Richard, “Widowing Functions Improve FFT Results”, EDN, June 1, 1998. https://www.edn.com/electronics-news/4383713/Windowing-Functions-Improve-FFT-Results-Part-I

Rapuano, Sergio, and Harris, Fred J., An Introduction to FFT and Time Domain Windows, IEEE Instrumentation and Measurement Magazine, December, 2007. https://ieeexplore.ieee.org/document/4428580

Neil Robertson, The Power Spectrum

—, Use Matlab Function pwelch to Find Power Spectral Density – or Do It Yourself

—, Evaluate Window Functions for the Discrete Fourier Transform

—. Add a Power Marker to a Power Spectral Density (PSD) Plot [https://www.dsprelated.com/showarticle/1387.php]

Mathuranathan Viswanathan, Interpret FFT, complex DFT, frequency bins & FFTShift

robert bristow-johnson, Does windowing affect Parseval's theorem?

Window Correction Factors URL: https://community.sw.siemens.com/s/article/window-correction-factors

Brandt, A, 2011. Noise and vibration analysis: signal analysis and experimental procedures. John Wiley & Sons

What should be the correct scaling for PSD calculation using fft. URL:https://dsp.stackexchange.com/a/32205

POWER SPECTRAL DENSITY CALCULATION VIA MATLAB Revision C. URL: http://www.vibrationdata.com/tutorials_alt/psd_mat.pdf

Jens Ahrens, Carl Andersson, Patrik Höstmad, Wolfgang Kropp, “Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals” in 148th Convention of the AES, e-Brief 56, May 2020 [ pdf, web ].

Manolakis, D., & Ingle, V. (2011). Applied Digital Signal Processing: Theory and Practice. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511835261

In T* DRC deck, it is based on the voltage recognition CAD layer and net connection to calculate the voltage difference between two neighboring nets by the following formula:

\[ \Delta V = \max(V_H(\text{net1})-V_L(\text{net2}), V_H(\text{net2})-V_L(\text{net1})) \]

where \[ V_H(\text{netx}) = \max(V(\text{netx})) \] and \[ V_L(\text{netx}) = \min(V(\text{netx})) \]

  • The \(\Delta V\) will be 0 if two nets are connected as same potential
  • If \(V_L \gt V_H\) on a net, DRC will report warning on this net

Voltage recognition CAD Layer

Two method

  1. voltage text layer

    You place specific voltage text on specific drawing layer

  2. voltage marker layer

    Each voltage marker layer represent different voltage for specific drawing layer

voltage text layer has higher priority than voltage marker layer and is recommended

voltage text layer

For example M3

Process Layer CAD Layer# Voltage High Voltage High Top
(highest priority)
Voltage Low Voltage Low Top
(highest priority)
M3 63 110 112 111 113

where 63 is layer number, 110 ~ 113 is datatype

voltage marker layer

Different data type represent different voltage, like

DataType 100 101 102 ... 109
Voltage 0.0 0.1 0.2 0.3 0.9

Example

image-20220503171006936

reference

Automate those voltage-dependent DRC checks! - siemens

  • Laplace transform
    • a generalization of the continuous-time Fourier transform
    • converts integro-differential equations into algebraic equations
  • z-transforms
    • a generalization of the discrete-time Fourier transform
    • converts difference equations into algebraic equations

system function, transfer function: \(H(s)\)

frequency response: \(H(j\omega)\), if the ROC of \(H(s)\) includes the imaginary axis, i.e.\(s=j\omega \in \text{ROC}\)

Laplace Transform

To specify the Laplace transform of a signal, both the algebraic expression and the ROC are required. The ROC is the range of values of \(s\) for the integral of \(t\) converges

bilateral Laplace transform

image-20241001140721422

where \(s\) in the ROC and \(\mathfrak{Re}\{s\}=\sigma\)

The formal evaluation of the integral for a general \(X(s)\) requires the use of contour integration in the complex plane. However, for the class of rational transforms, the inverse Laplace transform can be determined without directly evaluating eq. (9.56) by using the technique of partial fraction expansion

ROC Property

The range of values of s for which the integral in converges is referred to as the region of convergence (which we abbreviate as ROC) of the Laplace transform

image-20241001133645466

image-20241001152307222

image-20241001153011746

i.e. no pole in RHP for stable LTI sytem

System Causality

For a causal LTI system, the impulse response is zero for \(t \lt 0\) and thus is right sided

image-20241001164006698

causality implies that the ROC is to the right of the rightmost pole, but the converse is not in general true, unless the system function is rational

System Stability

The system is stable, or equivalently, that \(h(t)\) is absolutely integrable and therefore has a Fourier transform, then the ROC must include the entire \(j\omega\)-axis

image-20241001223128791

image-20241001163547031

all of the poles have negative real parts

Unilateral Laplace transform

analyzing causal systems and, particularly, systems specified by linear constant-coefficient differential equations with nonzero initial conditions (i.e., systems that are not initially at rest)

image-20241001180202348

A particularly important difference between the properties of the unilateral and bilateral transforms is the differentiation property \(\frac{d}{dt}x(t)\)

Laplace Transform
Bilateral Laplace Transform \(sX(s)\)
Unilateral Laplace Transform \(sX(s)-x(0^-)\)

image-20241001183543726

Integration by parts for unilateral Laplace transform

image-20241001184124032

in Bilateral Laplace Transform

image-20241001231719320

In fact, the initial- and final-value theorems are basically unilateral transform properties, as they apply only to signals \(x(t)\) that are identically \(0\) for \(t \lt 0\).

\(z\)-Transform

The \(z\)-transform for discrete-time signals is the counterpart of the Laplace transform for continuous-time signals

image-20241001204402907

where \(z=re^{j\omega}\)

image-20241001204810918

The \(z\)-transform evaluated on the unit circle corresponds to the Fourier transform

image-20241001223944822

ROC Property

image-20241001222148012

system stability

The system is stable, or equivalently, that \(h[n]\) is absolutely summable and therefore has a Fourier transform, then the ROC must include the unit circle

\(z\)-Transform Table

image-20250622105429578

[http://www.ws.binghamton.edu/Fowler/fowler%20personal%20page/EE301_files/ZT%20Tables_rev2.pdf]

Unilateral \(z\)-Transform

image-20241001225603923

The time shifting property is different in the unilateral case because the lower limit in the unilateral transform definition is fixed at zero, \(x[n-n_0]\)

bilateral \(z\)-transform

image-20241001232749554

unilateral \(z\)-transform

image-20241001233014484

Initial rest condition

image-20241001233548788

Initial Value Theorem & Final Value Theorem

Laplace Transform

Two valuable Laplace transform theorem

  • Initial Value Theorem, which states that it is always possible to determine the initial value of the time function \(f(t)\) from its Laplace transform \[ \lim _{s\to \infty}sF(s) = f(0^+) \]

  • Final Value Theorem allows us to compute the constant steady-state value of a time function given its Laplace transform \[ \lim _{s\to 0}sF(s) = f(\infty) \]

    If \(f(t)\) is step response, then \(f(0^+) = H(\infty)\) and \(f(\infty) = H(0)\), where \(H(s)\) is transfer function

\(z\)-transform

Initial Value Theorem \[ f[0]=\lim_{z\to\infty}F(z) \] final value theorem \[ \lim_{n\to\infty}f[n]=\lim_{z\to1}(z-1)F(z) \]

Coert Vonk. Initial/final value proofs [https://coertvonk.com/physics/lfz-transforms/z/initial-final-value-proofs-31543]

\(s\)- and \(z\)-Domains Connection

discrete-time systems also can be analyzed by means of the Laplace transform — the \(z\)-transform is the Laplace transform in disguise and that discrete-time systems can be analyzed as if they were continuous-time systems.

image-20241002112611432

A continuous-time system with transfer function \(H(s)\) that is identical in structure to the discrete-time system \(H[z]\) except that the delays in \(H[z]\) are replaced by elements that delay continuous-time signals

delay element Time domain Transform
\(z\)-transform \(\delta[n-1]\) \(z^{-1}\)
Laplace transform \(\delta(t-T)\) \(e^{-sT}\)
\(z\)-transform Laplace transform
\(x[n]\) \(X[z]=\sum_{n=0}^{\infty}x[n]z^{-n}\) \(\bar{x}(t)=\sum_{n=0}^{\infty}x[n]\delta(t-nT)\) \(\overline{X}(s)=\sum_{n=0}^{\infty}x[n]e^{-snT}\)
\(y[n]\) \(Y[z]=\sum_{n=0}^{\infty}y[n]z^{-n}\) \(\bar{y}(t)=\sum_{n=0}^{\infty}y[n]\delta(t-nT)\) \(\overline{Y}(s)=\sum_{n=0}^{\infty}y[n]e^{-snT}\)
\(h[n]\) \(H[z]=\sum_{n=0}^{\infty}h[n]z^{-n}\) \(\bar{h}(t)=\sum_{n=0}^{\infty}h[n]\delta(t-nT)\) \(\overline{H}(s)=\sum_{n=0}^{\infty}h[n]e^{-snT}\)
\(y[n]=x[n]*h[n]\) \(Y[z]=X[z]H[z]\) \(\bar{y}(t)=\bar{x}(t)*\bar{h}(t)\) \(\overline{Y}(s)=\overline{X}(s)\overline{H}(s)\)

Therefore, we obtain

\[\begin{align} \sum_{n=0}^{\infty}y[n]z^{-n} &=\sum_{n=0}^{\infty}x[n]z^{-n}\cdot\sum_{n=0}^{\infty}h[n]z^{-n} \\ \sum_{n=0}^{\infty}y[n]e^{-snT} &=\sum_{n=0}^{\infty}x[n]e^{-snT}\cdot \sum_{n=0}^{\infty}h[n]e^{-snT} \end{align}\]

The \(z\)-transform of a sequence can be considered to be the Laplace transform of impulses sampled train with a change of variable \(z = e^{sT}\) or \(s = \frac{1}{T}\ln z\)

Note that the transformation \(z = e^{sT}\) transforms the imaginary axis in the \(s\) plane (\(s = j\omega\)) into a unit circle in the \(z\) plane (\(z = e^{sT} = e^{j\omega T}\), or \(|z| = 1\))

Note: \(\bar{h}(t)\) is the impulse sampled version of \(h(t)\)


Note \(\bar{h}(t)\) is impulse sampled signal, whose CTFT is scaled by \(\frac{1}{T}\) of continuous signal \(h(t)\), \(\overline{H}[e^{sT}]=\overline{H}(e^{sT})\) is the approximation of continuous time system response, for example summation \(\frac{1}{1-z^{-1}}\) \[ \frac{1}{1-z^{-1}} = \frac{1}{1-e^{-sT}} \approx \frac{1}{j\omega \cdot T} \] And we know transform of integral \(u(t)\) is \(\frac{1}{s}\), as expected there is ratio \(T\)


Staszewski, Robert Bogdan, and Poras T. Balsara. All-digital frequency synthesizer in deep-submicron CMOS. John Wiley & Sons, 2006

image-20250807205854057


D. Sundararajan. Signals and Systems A Practical Approach Second Edition, 2023

image-20250809233806280

Frequency Response of Discrete-Time LTI Systems

Eugenio Schuster, Lehigh University, Discrete-Time Signal Analysis in the Frequency-Domain [https://www.lehigh.edu/~eus204/teaching/ME450_SIRC/notes/DTS_Tutorial.pdf]

image-20250907091547262

image-20250907091735655

impulse invariance

A straightforward and useful relationship between the continuous-time impulse response \(h_c(t)\) and the discrete-time impulse response \(h[n]\)

image-20241002133303153

\[ h[n] = Th_c(nT) \]

and \(T\) is chosen such that

\[ H_c(j\omega)=0, \space\space |\hat{\omega}| \ge \frac{\pi}{T} \]

When \(h[n]\) and \(h_c(t)\) are related through the above equation, i.e., the impulse response of the discrete-time system is a scaled, sampled version of \(h_c(t)\), the discrete-time system is said to be an impulse-invariant version of the continuous-time system

we have \[ H(e^{j\hat{\omega}}) = H_c\left(j\frac{\hat{\omega}}{T}\right),\space\space |\hat{\omega}| \lt \pi \]

  • \(h[n] = Th_c(nT)\) and \(T\to 0\)

    guarantees \(y_c(nT) = y_r(nT)\), i.e. output equivalence only at the sampling instants

  • \(H_c(j\Omega)\) is bandlimited and \(T \lt 1/2f_h\)

    guarantees \(y_c(t) = y_r(t)\)

The scaling of \(T\) can alternatively be thought of as a normalization of the time domain, that is average impulse response shall be same i.e., \(h[n]\times 1 = h(nT)\times T\)

image-20250807202725879


note that the impulse-invariant transform is not appropriate for high-pass responses, because of aliasing errors

image-20250622082500575

⭐ bandlimited

image-20250621214228274

bandlimited

image-20250621214307332


image-20250809112858954


impulse_Ts.drawio

Transfer function & sampled impulse response

continuous-time filter designs to discrete-time designs through techniques such as impulse invariance

useful functions

  • using fft

    The outputs of the DFT are samples of the DTFT

  • using freqz

    modeling as FIR filter, and the impulse response sequence of an FIR filter is the same as the sequence of filter coefficients, we can express the frequency response in terms of either the filter coefficients or the impulse response

    fft is used in freqz internally

freqz method is straightforward, which apply impulse invariance criteria. Though fft is used for signal processing mostly,


image-20241002082937903

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
clear all;
clc;

%% continuous system
s = tf('s');
h = 2*pi/(2*pi+s); % First order lowpass filter with 3-dB frequency 1Hz
[mag, phs, wout] = bode(h);
fct = wout(:)/2/pi;
Hct_dB = 20*log10(mag(:));


fstep = 0.01; % freq resolution
fnyqst = 1000;
Ts = 1/(2*fnyqst);
Fs = 1/Ts; % sampling freq
Ns = ceil(Fs/fstep); % samping points
fstep = Fs/Ns; % update fstep
t = (0:Ns-1)*Ts; % sampling time points

y = impulse(h, t); % impulse resp

%% modelling as discrete system
Y = fft(y); % dft
Hfft = Y * Ts; % !!! multiply Ts
Hfft_dB = 20*log10(abs(Hfft(1:Ns/2+1)));
ffft = (1:Ns/2+1)*fstep - fstep;


[Hfir, ffir] = freqz(y, 1, [], 1/Ts); % modelling as FIR
Hfir = Hfir * Ts; % !!! multiply Ts
Hfir_dB = 20*log10(abs(Hfir));

%% plot
semilogx(fct, Hct_dB, '-ok', ffft, Hfft_dB, 'r.-', ffir, Hfir_dB, 'b--');
legend('bode(s)', 'fft', 'FIR model')
xlabel('Freq(Hz)');
ylabel('dB');
xlim([1e-2 1e2]);
grid on;
title('frequency response of different methods');

!!! only multiply Ts

image-20250517083748566


Neil Robertson. The Discrete Fourier Transform as a Frequency Response [https://www.dsprelated.com/showarticle/1498.php]

H(k) by using complex exponentials is indeed identical to the DFT of h(n)

H(f) by DFT -> fft

H(k) by using complex exponential -> freqz

image-20250907105453920

FIR Equalization

Frequency Response

image-20220322093428287 \[ z = e^{j\omega T_s} \]

Unit impulse

filter coefficients are [-0.131, 0.595, -0.274] and sampling period is 100ps

image-20220428125454912

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
%% Frequency response
w = [-0.131, 0.595, -0.274];
Ts = 100e-12;
[mag, w] = freqz(w, 1, [], 1/Ts);
subplot(2, 1, 1)
plot(w/1e9, 20*log10(abs(mag)));
xlabel('Freq(GHz)');
ylabel('dB');
grid on;
title('frequency response');

%% unit impulse response from transfer function
subplot(2, 1, 2)
z = tf('z', Ts);
h = -0.131 + 0.595*z^(-1) -0.274*z^(-2);
[y, t] = impulse(h);
stem(t*1e10, y*Ts); % !!! y*Ts is essential
grid on;
title("unit impulse response");
xlabel('Time(\times 100ps)');
ylabel('mag');

impulse:

For discrete-time systems, the impulse response is the response to a unit area pulse of length Ts and height 1/Ts, where Ts is the sample time of the system. (This pulse approaches \(\delta(t)\) as Ts approaches zero.)

Scale output:

Multiply impulse output with sample period Ts in order to correct 1/Ts height of impulse function.

PSD transformation

If we have power spectrum or power spectrum density of both edge's absolute jitter (\(x(n)\)) , \(P_{\text{xx}}\)

Then 1UI jitter is \(x_{\text{1UI}}(n)=x(n)-x(n-1)\), and Period jitter is \(x_{\text{Period}}(n)=x(n)-x(n-2)\), which can be modeled as FIR filter, \(H(\omega) = 1-z^{-k}\), i.e. \(k=1\) for 1UI jitter and \(k=2\) Period jitter \[\begin{align} P_{\text{xx}}'(\omega) &= P_{\text{xx}}(\omega) \cdot \left| 1-z^{-k} \right|^2 \\ &= P_{\text{xx}}(\omega) \cdot \left| 1-(e^{j\omega T_s})^{-k} \right|^2 \\ &= P_{\text{xx}}(\omega) \cdot \left| 1-e^{-j\omega T_s k} \right|^2 \end{align}\]

image-20220519172239916

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
clear all
close all
clc

xf_fs = 0:0.01:0.5;
k = 1;
H_1UI = 1 - exp(-1i*2*pi*xf_fs);
HH_1UI = abs(H_1UI).^2;
subplot(2, 1, 1);
plot(xf_fs, HH_1UI);
grid on;
xlabel('Freq');
ylabel('|H|^2')
title('Weight for 1UI jitter');

k = 2;
H_period = 1 - exp(-1i*2*pi*xf_fs);
HH_period = abs(H_period).^2;
subplot(2, 1, 2)
plot(xf_fs, HH_period);
grid on;
xlabel('Freq');
ylabel('|H|^2')
title('Weight for Period jitter');

image-20220709104127384 \[ x(t-\Delta T)\overset{FT}{\longrightarrow} X(s)e^{-\Delta T \cdot s} \]

Bilinear Transform

The design of IIR filters (cont.) [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/cc00ac6d468dc9dcf2238fc1d1a194d4_lecture_19.pdf]

image-20250907112138651

image-20250907112553846


[https://tttapa.github.io/Pages/Mathematics/Systems-and-Control-Theory/Digital-filters/Discretization/Bilinear-transform.html]

an algebraic transformation between the variables \(s\) and \(z\) that maps the entire imaginary \(j\Omega\)-axis in the \(s\)-plane to one revolution of the unit circle in the \(z\)-plane

\[\begin{align} z &= \frac{1+s\frac{T_s}{2}}{1-s\frac{T_s}{2}} \\ s &= \frac{2}{T_s}\cdot \frac{1-z^{-1}}{1+z^{-1}} \end{align}\]

where \(T_s\) is the sampling period

image-20241125001024136

frequency warping:

image-20241125002523844

  • The bilinear transformation avoids the problem of aliasing encountered with the use of impulse invariance, because it maps the entire imaginary axis of the \(s\)-plane onto the unit circle in the \(z\)-plane

  • impulse invariance cannot be used to map highpass continuous-time designs to high pass discrete-time designs, since highpass continuous-time filters are not bandlimited

  • Due to nonlinear warping of the frequency axis introduced by the bilinear transformation, bilinear transformation applied to a continuous-time differentiator will not result in a discrete-time differentiator. However, impulse invariance can be applied to bandlimited continuous-time differentiator

    The feature of the frequency response of discrete-time differentiators is that it is linear with frequency


The simple approximation \(z=e^{sT}\approx1+sT\), the first equal come from impulse invariance essentially

image-20241024230308374

image-20241026230904395


Stephen Roberts. Lecture 3 - Design of Digital Filters [https://www.robots.ox.ac.uk/~sjrob/Teaching/B14_SP/b14_sp_lect3.pdf]

Design of IIR Filters From Analog Filters

Modeling a Continuous-Time System with Matlab

Because the mapping between the continuous (\(s\)-plane) and discrete domains (\(z\)-plane) cannot be done exactly, the various design methods are at best approximations

image-20250623205733778

Neil Robertson. Modeling a Continuous-Time System with Matlab [https://www.dsprelated.com/showarticle/1055.php]

—. Modeling Anti-Alias Filters [https://www.dsprelated.com/showarticle/1418.php]

impinvar [https://www.mathworks.com/help/signal/ref/impinvar.html]

bilinear [https://www.mathworks.com/help/signal/ref/bilinear.html]

filter [https://www.mathworks.com/help/matlab/ref/filter.html]


The design of IIR filters [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/14f29be83ac38f11eb1a1bd5fdb6f5ea_lecture_18.pdf]

The design of IIR filters (cont.) [https://ocw.mit.edu/courses/2-161-signal-processing-continuous-and-discrete-fall-2008/cc00ac6d468dc9dcf2238fc1d1a194d4_lecture_19.pdf]

Lecture 19: Design of IIR Filters [http://smartdata.ece.ufl.edu/eee5502/2020_fall/media/2020_eee5502_slides28.pdf]

Derivatives Approximation

Perhaps the simplest method for low-order systems is to use backward-difference approximation to continuous domain derivatives

image-20250623205828010

Note this approximation is not same with impulse invariance, e.g. \(\frac{1}{s^3} \to \frac{T^3z(z+1)}{2(-1)^3}\) employing impulse invariance


\[ 1- z^{-1} = 1-e^{-j\Omega T} = 1-\cos(\omega T) + j\sin(\Omega T) \approx 1-1+j\Omega T = s T \]

That is

\[ s \approx \frac{1-z^{-1}}{T} \]


image-20250907140610175


Suppose we wish to make a discrete-time filter based on a prototype first-order low-pass filter \[ H_p(s) = \frac{1}{s\tau + 1} \] The differential equation describing this filter is \[ \tau\frac{dy}{dt} + y = x \] then differential equation gives

\[ \tau\frac{y_n-y_{n-1}}{T} + y_n=x_n \] or \[ (\frac{\tau}{T}+1)y_n - \frac{\tau}{T}y_{n-1} = x_n \] The transfer function is \[ H(z) = \frac{\frac{T}{T+\tau}}{1-\frac{\tau}{T+\tau}z^{-1}} \]

or substitute \(s\) with \(\frac{1-z^{-1}}{T}\) into \(H_p(s)\) \[ H_p(z) = \frac{1}{\tau\frac{1-z^{-1}}{T} + 1} = \frac{\frac{T}{T+\tau}}{1-\frac{\tau}{T+\tau}z^{-1}}= \frac{\alpha}{1 +(\alpha -1)z^{-1}} \] where \(\alpha = \frac{T}{\tau+T}\)

Under the assumption, time constants much longer than the timestep \(\tau \gg T\) \[\begin{align} \frac{T}{T+\tau}& = \frac{T}{\tau}\cdot \frac{\tau}{T+\tau}\approx \frac{T}{\tau} \\ \frac{\tau}{T+\tau} &= \frac{\tau - T}{\tau}\cdot\frac{\tau^2}{\tau^2-T^2} \approx \frac{\tau - T}{\tau} = 1- \frac{T}{\tau} \end{align}\]

Then the first-order low pass fiter \[ H_p(z) \approx \frac{\alpha}{1 +(\alpha -1)z^{-1}} \] where \(\alpha = \frac{T}{\tau}\)

1
2
3
4
5
6
7
8
9
10
11
12
13
# https://www.dsprelated.com/showarticle/1517/return-of-the-delta-sigma-modulators-part-1-modulation
# https://www.embeddedrelated.com/showarticle/779.php
def show_dsmod_samplewave(t,x,dsmod,args=(1,),tau=0.05, R=1, fig=None,
return_handles=False, filter_dsmod=False):
dt = t[1]-t[0]
if filter_dsmod:
x1 = dsmod(x, *args,R=R, modulate=False)
else:
x1 = x
y = dsmod(x,*args,R=R)
a = dt/tau
yfilt = scipy.signal.lfilter([a],[1,a-1],y)
xfilt = scipy.signal.lfilter([a],[1,a-1],x1)

Matched z-Transform (Root Matching)

The matched Z-transform method, also called the pole-zero mapping or pole-zero matching method, and abbreviated MPZ or MZT

\[ z = e^{sT} \]

image-20250623213116836

image-20250623214002049


image-20250623220318339

1
2
3
4
5
6
7
8
9
10
% https://www.dsprelated.com/showarticle/1642.php

%II One-pole RC filter model
Ts= 1/fs;
Wc= 1/(R*C); % rad -3 dB frequency
fc= Wc/(2*pi); % Hz -3 dB frequency
a1= -exp(-Wc*Ts);
b0= 1 + a1; % numerator coefficient
a= [1 a1]; % denominator coeffs
y_filt= filter(b0,a,y); % filter the DAC's output signal y

Impulse Invariance (impinvar)

  • To extend the accurate frequency range, we would need to increase the sample rate
  • impulse-invariant transform is not appropriate for high-pass responses, because of aliasing errors

Note \(h[n] = Th_c(nT)\), Multiply the analog impulse response by this gain to enable meaningful comparison (other response, like step, the amplitude correction is not needed)

image-20250622091049131


image-20250622095817885

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
% https://www.dsprelated.com/showarticle/1055.php
% modified by hguo, Sun Jun 22 09:21:48 AM CST 2025

% butter_3rd_order.m 6/4/17 nr
% Starting with the butterworth transfer function in s,
% Create discrete-time filter using the impulse invariance xform and compare
% its time and frequency responses to those of the continuous time filter.
% Filter fc = 1 rad/s = 0.159 Hz


% I. Given H(s), find H(z) using the impulse-invariant transform
fs= 4; % Hz sample frequency
% 3rd order butterworth polynomial
num= 1;
den= [1 2 2 1];
[b,a]= impinvar(num,den,fs); % coeffs of H(z)
%[b,a]= bilinear(num,den,fs)

% II. Impulse Response and Step Response
% find discrete-time impulse response
Ts= 1/fs;
N= 16*fs;
n= 0:N-1;
t= n*Ts;
x= [1, zeros(1,N-1)]; % impulse
x= fs*x; % make impulse response amplitude independent of fs
y= filter(b,a,x); % filter the impulse
subplot(2,2,1),plot(t,y,'.'),grid,
xlabel('seconds')

% Continuous-time Impulse response from inverse Laplace transform
% Blinchikoff and Zverev, p116
h= exp(-t) - 2/sqrt(3)*exp(-t/2).*cos(sqrt(3)/2*t + pi/6);
subplot(2,2,2),plot(t,y,'.',t,h),grid
title('Impulse Response'),legend('discrete-time filter response', 'continuous-time filter response'),xlabel('seconds')

% find discrete-time step response
x= ones(1,N); % step
y= filter(b,a,x); % filter the step
% Continuous-time step response. Blinchikoff and Zverev, p116
t= t+Ts/2; % offset t to to align step responses
h= 1 - exp(-t) - 2/sqrt(3)*exp(-t/2).*sin(sqrt(3)/2*t);
e= h-y; % error of discrete-time response
subplot(2,2,3),plot(t,y,'.',t,h),grid
title('Step Response'),legend('discrete-time filter response', 'continuous-time filter response'),xlabel('seconds')
% discrete-time response error
subplot(2,2,4),plot(t,e),grid
title('Error of discrete-time step response'),xlabel('seconds')

image-20250622093151404


  • To use the filter function with the b coefficients from an FIR filter, use y = filter(b,1,x).

image-20250622110340687

Bilinear Transformation (bilinear)

TODO 📅

bandwidth from pulse response

L.W. Couch, Digital and Analog Communication Systems, 8th Edition, Pearson, 2013.

Generating Basic signals – Rectangular Pulse and Power Spectral Density using FFT

Convolution Property of the Fourier Transform \[ x(t)*h(t)\longleftrightarrow X(\omega)H(\omega) \] pulse response can be obtained by convolve impulse response with UI length rectangular \[ H(\omega) = \frac{Y_{\text{pulse}}(\omega)}{X_{\text{rect}}(\omega)} = \frac{Y_{\text{pulse}}(\omega)}{\text{sinc}(\omega)} \]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
% Convolution Property of the Fourier Transform
% pulse(t) = h(t) * rect(t)
% -> fourier transform
% PULSE = H * RECT
% FT(RECT) = sinc
% H = PULSE/RECT = PULSE/sinc
xx = pi*ui.*w(1:plt_num);
y_sinc = ui.*sin(xx)./xx;
y_sinc(1) = y_sinc(2);
y_sinc = y_sinc/y_sinc(1); % we dont care the absoulte gain
h_ban1 = abs(h(1:plt_num))./abs(y_sinc);

h_dB = 20*log10(abs(h_ban1));
[hmin, Index] = min(abs(h_dB +3 ));
f_3dB = w(Index);
f_3dB = f_3dB/1e9;

image-20220520223115184

image-20220520223621924

sinc function

Notice that the complete definition of \(\operatorname{sinc}\) on \(\mathbb R\) is \[ \operatorname{Sa}(x)=\operatorname{sinc}(x) = \begin{cases} \frac{\sin x}{x} & x\ne 0, \\ 1, & x = 0, \end{cases} \] which is continuous.

image-20220521112425477

image-20220521112722838

To approach to real spectrum of continuous rectangular waveform, \(\text{NFFT}\) has to be big enough.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
clear;
close all;
clc;

fs=500; %sampling frequency
Ts = 1/fs;
T=0.2; %width of the rectangule pulse in seconds

figure(1)
hold on;
t=-0.5:1/fs:0.5; %time base
x=rectpuls(t,T); %generating the square wave
sum(x>0.5)
stem(t,x,'--k');
plot(t, x, 'b.-')
xstart = T/2-Ts/2;
xend = -T/2-Ts/2;
plot([xstart, xstart], [-1, 1.02], 'r--');
plot([xend, xend], [-1, 1.02], 'r--');
plot([-0.11, 0.11], [1, 1], 'r--')
hold off;
grid on;
title(['Rectangular Pulse width=', num2str(T),'s']);
xlabel('Time(s)');
ylabel('Amplitude');
xlim([-0.12, 0.12]);
ylim([-0.05, 1.05]);

figure(2)
hold on
Titer = [1, 4, 16, 32, 64, 128];
color = {'k', 'g', 'r', 'm', 'c', 'b'};
formatSpec = 'NFFT=%d';
for k=1:length(Titer)
t=-Titer(k):Ts:Titer(k); %time base
x=rectpuls(t,T); %generating the square wave

L=length(x);
Np = nextpow2(L)-1;
NFFT = 2^Np;
X = fftshift(fft(x,NFFT)); %FFT with FFTshift for both negative & positive frequencies
f = fs*(-NFFT/2:NFFT/2-1)/NFFT; %Frequency Vector
Xc = abs(X)*Ts; % continuous signal spectrum
plot(f, Xc, "Color",color{k}, 'LineWidth',2);
legend_str{k} = [num2str(NFFT, formatSpec)];
end
hold off
legend(legend_str);
title('Magnitude of X_c (original continuous signal)');
xlabel('Frequency (Hz)')
ylabel('Magnitude |X(f)|');
grid on;

Stability check in z-domain

Unit circle criterion; Jury’s stability criterion

image-20250819213545563

reference

Alan V. Oppenheim, Alan S. Willsky, and S. Hamid Nawab. 1996. Signals & systems (2nd ed.)

Alan V Oppenheim, Ronald W. Schafer. 2010. Discrete-Time Signal Processing, 3rd edition

B.P. Lathi, Roger Green. Linear Systems and Signals (The Oxford Series in Electrical and Computer Engineering) 3rd Edition

Sam Palermo, ECEN720, Lecture 7: Equalization Introduction & TX FIR Eq

Sam Palermo, ECEN720, Lab5 –Equalization Circuits

B. Razavi, "The z-Transform for Analog Designers [The Analog Mind]," IEEE Solid-State Circuits Magazine, Volume. 12, Issue. 3, pp. 8-14, Summer 2020. [https://www.seas.ucla.edu/brweb/papers/Journals/BR_SSCM_3_2020.pdf]

Jhwan Kim, CICC 2022, ES4-4: Transmitter Design for High-speed Serial Data Communications

Mathuranathan. Digital filter design – Introduction [https://www.gaussianwaves.com/2020/02/introduction-to-digital-filter-design/]

Daniel Boschen, "Fast Track to Designing FIR Filters with Python" [https://events.gnuradio.org/event/24/contributions/598/attachments/186/485/Boschen%20FIR%20Filter%20Presentation.pdf]

Daniel Boschen, "Quick Start on Control Loops with Python" [https://events.gnuradio.org/event/24/contributions/599/attachments/187/480/Boschen%20Control%20Presentation.pdf]

Matching network

TODO 📅

Q factor

TODO 📅

Self-Resonant Frequency

image-20240802210109935

\[ f_\text{SRF} = \frac{1}{2\pi \sqrt{LC}} \] The SRF of an inductor is the frequency at which the parasitic capacitance of the inductor resonates with the ideal inductance of the inductor, resulting in an extremely high impedance. The inductance only acts like an inductor below its SRF

image-20241221092745311

  • For choking applications, chose an inductor whose SRF is at or near the frequency to be attenuated

  • For other applications, the SRF should be at least 10 times higher than the operating frequency

    it is more important to have a relatively flat inductance curve (constant inductance vs. frequency) near the required frequency

[Understanding RF Inductor Specifications, https://www.ece.uprm.edu/~rafaelr/inel5325/SupportDocuments/doc671_Selecting_RF_Inductors.pdf]

[RFIC-GPT Wiki, https://wiki.icprophet.net/]

T-coil

  • Broaden the bandwidth
  • Create a constant, resistive input impedance in the presence of a heavy load capacitance (ESD protection circuit)

L1 = L2 for impedance matching

The use of T-coils can dramatically increase the bandwidth and improve the return loss in both TXs and RXs.

image-20240712234028634

equivalent circuit

image-20240713001415948

image-20240713011127116

image-20240713011152541

Note the negative sign of \(L_M\), which is a consequence of magnetic coupling; owing to this, the driving impedance at the center tap as seen by \(C\) is lower than without the coupling.

\[\begin{align} Z_{c} &= sL_a/2 -sL_M = sL/2 - sL_M/2 \\ Z_{noc} &= sL/2 \end{align}\]

Bridged T-Coil

The vertical stacking of the inductor halves causes a significant bridging capacitance \(C_B\)​ between them.

The most important aspect is that \(C_B\) does not prohibit perfect impedance matching but can be used to tune certain transfer functions.

Tcoil in RX

image-20220502201254057

ppwl: Independent Piece-Wise Linear Resistive Source

image-20220502201321341

simple model

L1, L2, Km, Cb

image-20220502215310947

image-20220622224842237

image-20220622225709712

lumped model

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
simulator lang=spectre
include "/path/to/INDL0.scs"
subckt for_import(p0 p1 p2 gnd)
xmod (p0 p1 p2 gnd) INDL0
ends for_import

x_1 (p_1_1 p_1_2 p_1_3 0) for_import
v_1_1 (p_1_1 0) vsource mag=-1
v_1_2 (p_1_2 0) vsource mag=0
v_1_3 (p_1_3 0) vsource mag=0
save v_1_1:p v_1_2:p v_1_3:p

x_2 (p_2_1 p_2_2 p_2_3 0) for_import
v_2_1 (p_2_1 0) vsource mag=0
v_2_2 (p_2_2 0) vsource mag=-1
v_2_3 (p_2_3 0) vsource mag=0
save v_2_1:p v_2_2:p v_2_3:p

x_3 (p_3_1 p_3_2 p_3_3 0) for_import
v_3_1 (p_3_1 0) vsource mag=0
v_3_2 (p_3_2 0) vsource mag=0
v_3_3 (p_3_3 0) vsource mag=-1
save v_3_1:p v_3_2:p v_3_3:p
Y ac start=1.000000e+08 stop=2.000000e+10 step=1.000000e+08

xsp (p_1 p_2 p_3 0) for_import
port1 (p_1 0) port
port2 (p_2 0) port
port3 (p_3 0) port
S sp start=1.000000e+08 stop=2.000000e+10 step=1.000000e+08 ports=[ port1 port2 port3]

image-20220503005728763

image-20220503005903105


EMX_plot_tcoil in emxform.ils

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
(define (EMX_plot_tcoil bgui wid what)
(needNCells 'adtComplex 100000)
(letseq ((to_z (lambda (ys)
(letseq ((y11 (EMX_matrix_ref ys 0 0))
(y12 (EMX_matrix_ref ys 0 1))
(y21 (EMX_matrix_ref ys 1 0))
(y22 (EMX_matrix_ref ys 1 1))
(det y11*y22-y12*y21)
(z11 y22/det)
(z12 -y12/det)
(z22 y11/det))
(list z11 z12 z22))))
(ground
(lambda (y p)
(letseq ((n (EMX_matrix_rows y))
(yg (make_EMX_matrix n-1)))
(do ((i 0 i+1))
((i >= n-1))
(do ((j 0 j+1))
((j >= n-1))
(EMX_matrix_set yg i j (EMX_matrix_ref y i+(if (i >= p) 1 0) j+(if (j >= p) 1 0)))))
yg)))
(reduce
(lambda (yys)
(letseq ((xvec (drGetWaveformXVec (car yys)))
(n (drVectorLength xvec))
(yyvecs (mapcar drGetWaveformYVec yys))
(zz1 (drCreateVec 'doublecomplex n))
(zz2 (drCreateVec 'doublecomplex n))
(zz3 (drCreateVec 'doublecomplex n)))
(do ((i 0 i+1))
((i >= n))
(let ((ys (mapcar (lambda (w) (drGetElem w i)) yyvecs)))
(setq ys (ground (as_EMX_matrix 3 3 ys) 2))
(let ((zz (to_z ys)))
(drSetElem zz1 i (nth 0 zz))
(drSetElem zz2 i (nth 1 zz))
(drSetElem zz3 i (nth 2 zz)))))
(list (drCreateWaveform xvec zz1)
(drCreateWaveform xvec zz2)
(drCreateWaveform xvec zz3)))))
(get_k
(lambda (l12 l1122)
(letseq ((xvec (drGetWaveformXVec l12))
(n (drVectorLength xvec))
(l12v (drGetWaveformYVec l12))
(l1122v (drGetWaveformYVec l1122))
(resultv (drCreateVec 'double n)))
(do ((i 0 i+1))
((i >= n))
(letseq ((l12i (drGetElem l12v i))
(l1122i (drGetElem l1122v i))
(kk (if (l1122i > 0.0)
l12i/(sqrt l1122i)
0.0))
(k (if ((abs kk) < 2.0) kk 0.0)))
(drSetElem resultv i k)))
(drCreateWaveform xvec resultv)))))
(EMX_plot_aux bgui wid what 3
'("Inductance" "Q" "k")
'("Henry" "" "")
(lambda (ys)
(letseq ((zs (reduce ys))
(z11 (nth 0 zs))
(z12 (nth 1 zs))
(z22 (nth 2 zs))
(pi 3.14159265358979)
(f (xval z11))
(l11 (imag z11)/(2*pi*f))
(q11 (imag z11)/(real z11))
(l12 (imag z12)/(2*pi*f))
(l22 (imag z22)/(2*pi*f))
(q22 (imag z22)/(real z22))
(k (get_k l12 l11*l22)))
`((,l11 ,l22) (,q11 ,q22) (,k))))
'(("L1" "L2") ("Q1" "Q2") ("k")))))

Tcoil vs tapped inductor

tcoil and tapped inductor share same EM simulation result, and use modelgen with different model formula.

The relationship is \[ L_{\text{sim}} = L1_{\text{sim}}+L2_{\text{sim}}+2\times k_{\text{sim}} \times \sqrt{L1_{\text{sim}}\cdot L2_{\text{sim}}} \] where \(L1_{\text{sim}}\), \(L2_{\text{sim}}\) and \(k_{\text{sim}}\) come from tcoil model result, \(L_{\text{sim}}\) comes from tapped inductor model result

\(k_{\text{sim}}\) in EMX have assumption, induce current from P1 and P2 Given Dot Convention:

Same direction : k > 0

Opposite direction : k < 0

So, the \(k_{\text{sim}}\) is negative if routing coil in same direction

image-20220623013225554

image-20220623013923263

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
% EMX - shield tcoil model
L1 = csvread('./L1sim.csv', 1, 0);
L2 = csvread('./L2sim.csv', 1, 0);
k = csvread('./ksim.csv', 1, 0);

% EMX - Tapped shield inductor
L = csvread('./Lsim.csv', 1, 0);

freq = L1(:, 1)/1e9; % GHz
L1 = L1(:, 2);
L2 = L2(:, 2);
k = -k(:, 2); % Caution: minus of EMX ksim due to same current direction

L = L(:, 2);

Lcalc = L1 + L2 + 2*k.*(L1.*L2).^0.5;

plot(freq, L*1e9, 'r', 'LineWidth', 3);
hold on;
plot(freq, Lcalc*1e9, '--b', 'LineWidth', 3);
grid on;
xlabel('Freq (GHz)');
ylabel('Inductance (nH)');
legend('Tapped inductor model', 'tcoil model calc');

电磁感应定律 (electromagnetic induction)

任何封闭电路中感应电动势大小,等于穿过这一电路磁通量的变化率。 \[ \epsilon = -\frac{d\Phi_B}{dt} \] 其中 \(\epsilon\)是电动势,单位为伏特

\(\Phi_B\)是通过电路的磁通量,单位为韦伯

电动势的方向(公式中的负号)由楞次定律决定

楞次定律: 由于磁通量的改变而产生的感应电流,其方向为抗拒磁通量改变的方向。

在回路中产生感应电动势的原因是由于通过回路平面的磁通量的变化,而不是磁通量本身,即使通过回路的磁通量很大,但只要它不随时间变化,回路中依然不会产生感应电动势。

自感电动势

当电流\(I\)随时间变化时,在线圈中产生的自感电动势为 \[ \epsilon = -L\frac{dI}{dt} \]

image-20240713131201618

image-20240713145315890

image-20240713145343423

image-20240713145005306

image-20240713145212327


image-20240713140911612

image-20240713135148710

image-20240713135236291

同名端:当两个电流分别从两个线圈的对应端子流入 ,其所 产生的磁场相互加强时,则这两个对应端子称为同名端。

image-20240713142238398

image-20240713142249362

image-20240713142506673

Non ideal capacitor and inductor

Tank Circuits/Impedances [https://stanford.edu/class/ee133/handouts/lecturenotes/lecture5_tank.pdf]

Resonant Circuits [https://web.ece.ucsb.edu/~long/ece145b/Resonators.pdf]

Series & Parallel Impedance Parameters and Equivalent Circuits [https://assets.testequity.com/te1/Documents/pdf/series-parallel-impedance-parameters-an.pdf]

ES Lecture 35: Non ideal capacitor, Capacitor Q and series RC to parallel RC conversion [https://youtu.be/CJ_2U5pEB4o?si=4j4CWsLSapeu-hBo]

Capacitor

image-20231224163730529


image-20240119000951498

image-20240119001025892

image-20240119001309410

Inductor

image-20231224163740411


image-20231224163905233

image-20231224163916109

reference

S. Shekhar, J. S. Walling and D. J. Allstot, "Bandwidth Extension Techniques for CMOS Amplifiers," in IEEE Journal of Solid-State Circuits, vol. 41, no. 11, pp. 2424-2439, Nov. 2006

David J. Allstot Bandwidth Extension Techniques for CMOS Amplifiers

B. Razavi, "The Bridged T-Coil [A Circuit for All Seasons]," IEEE Solid-State Circuits Magazine, Volume. 7, Issue. 40, pp. 10-13, Fall 2015.

B. Razavi, "The Design of Broadband I/O Circuits [The Analog Mind]," IEEE Solid-State Circuits Magazine, Volume. 13, Issue. 2, pp. 6-15, Spring 2021.

S. Galal and B. Razavi, "Broadband ESD protection circuits in CMOS technology," in IEEE Journal of Solid-State Circuits, vol. 38, no. 12, pp. 2334-2340, Dec. 2003, doi: 10.1109/JSSC.2003.818568.

M. Ker and Y. Hsiao, "On-Chip ESD Protection Strategies for RF Circuits in CMOS Technology," 2006 8th International Conference on Solid-State and Integrated Circuit Technology Proceedings, 2006, pp. 1680-1683, doi: 10.1109/ICSICT.2006.306371.

M. Ker, C. Lin and Y. Hsiao, "Overview on ESD Protection Designs of Low-Parasitic Capacitance for RF ICs in CMOS Technologies," in IEEE Transactions on Device and Materials Reliability, vol. 11, no. 2, pp. 207-218, June 2011, doi: 10.1109/TDMR.2011.2106129.

Bob Ross, "T-Coil Topics" DesignCon IBIS Summit 2011

Ross, Bob and Cong Ling. “Wang Algebra: From Theory to Practice.” IEEE Open Journal of Circuits and Systems 3 (2022): 274-285.

S. Lin, D. Huang and S. Wong, "Pi Coil: A New Element for Bandwidth Extension," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 56, no. 6, pp. 454-458, June 2009

Starič, Peter and Erik Margan. “Wideband amplifiers.” (2006).

Kosnac, Stefan (2021) Analysis of On-Chip Inductors and Arithmetic Circuits in the Context of High Performance Computing [https://archiv.ub.uni-heidelberg.de/volltextserver/30559/1/Dissertation_Stefan_Kosnac.pdf]

Chapter 4.5. High Frequency Passive Devices [https://www.cambridge.org/il/files/7713/6698/2369/HFIC_chapter_4_passives.pdf]

0%