maximum frequency

A conventional inverter-based ring oscillator consists of a single loop of an odd number of inverters. While compact, easy to design and tunable over a wide frequency range, this oscillator suffers from several limitations.

  • it is not possible to increase the number of phases while maintaining the same oscillation frequency since the frequency is inversely proportional to the number of inverters in the loop. In other words, the time resolution of the oscillator is limited to one inverter delay and cannot be improved below this limit.
  • the number of phases that can be obtained from this oscillator is limited to odd values. Otherwise, if an even number of inverters is used, the circuit remains in a latched state and does not oscillate.

To overcome the limitations of conventional ring oscillators, multi-paths ring oscillator (MPRO) is proposed. Each phase can be driven by two or more inverters, or multi-paths instead of having each phase in oscillator driven by a single inverter, or single path.

One thing that makes the MPRO design problem even more complicated is its property of having multiple possible oscillation modes. Without a clear understanding of what makes one of these modes dominant, it is very likely that a designer might end-up having an oscillator that can start-up each time in a different oscillation mode depending on the initial state of the oscillator.

image-20220320155440541

In practive, the oscillator starts first from a linear mode of operation where all the buffers are indeed acting as linear transconductors. All oscillation modes that have mode gains, \(a_n\), lower than the actual dc gain of the inverter, \(a_0\), start to grow. As the oscillation amplitude grows, the effective gain of the inverter drops due to nonlinearity. Consequently, modes with higher mode gain die out and only the mode that requires the minimum gain continues to oscillate and hence is the dominant mode.

The dominant mode is dependent only on the relative sizing vector

maximum oscillation frequency

The oscillation frequency of the dominant mode of any MPRO having any arbitrary coupling structure and number of phases is \[ f_{n^*} = \frac {1}{2\pi}\frac {(a_0-1) \cdot \sum_{i=1}^{N}x_isin\left ( \frac {2\pi n^*(i-1)}{N} \right)}{(a_0\tau _p - \tau _o)\cdot \sum_{i=1}^{N}-x_icos\left( \frac{2\pi n^*(i-1)}{N}+(\tau _o - \tau _p) \right)} \] image-20220320172952925

A linear increase in the maximum possible normalized oscillation frequency as the number of stages increases provided that the dc gain of the buffer sufficient to provide the required amplification

image-20220320170324494

image-20220320170523390

assuming unlimited dc gain and zero mode gain margins

mode stability

A common problem in MPRO design is the stability of the dominant oscillation mode. Mode stability refers to whether the MPRO always oscillates at the same mode regardless of the initial conditions of the oscillator. This problem is especially pronounced for MPROs with a large number of phases. This is due to the existence of many modes and the very small differences in the value of the mode gain of adjacent modes if the MPRO is not well designed.

In general, when the mode gain difference between two modes is small, the oscillator can operate in either one depending on initial conditions.

coupling configurations and simulation results

image-20220320165217231

Abou-El-Sonoun, A. A. (2012). High Frequency Multiphase Clock Generation Using Multipath Oscillators and Applications. UCLA. ProQuest ID: AbouElSonoun_ucla_0031D_10684. Merritt ID: ark:/13030/m57p9288. Retrieved from https://escholarship.org/uc/item/75g8j8jt

frequency stability

A problem associated with the design of MPROs is the existence of different possible modes of oscillation. Each of these modes is characterized by a different frequency, phase shift and phase noise.

Linear delay-stage model (mode gain)

mode gain is based on the linear model, independent of process but depends on coupling structure (coupling configuration, size ratio).

The inverting buffer modeled as a linear transconductor. The input-output relationship of a single buffer scaled by \(h_i\) and driving the input capacitance of a similar buffer can be expressed as \[\begin{align} h_ig_mv_{in}(t) + h_ig_oV_{out}(t)+h_iC_g\frac {dV_{out}(t)}{dt} &= 0 \\ a_nV_{in}(t)+V_{out}(t)+\tau \frac {V_{out}(t)}{dt} &= 0 \end{align}\] where \(g_m\) is the transconductance, \(g_o\) is the output conductance, \(C_g\) is the buffer input capacitance which also acts as the load capacitance for the driving buffer, and \(a_n = \frac {g_m}{g_o}\) is the linear dc gain of the buffer, and \(\tau=\frac {C_g}{g_o}\) is a time constant.

Similarly, \(V_1\), the output of the first stage in MPRO can be expressed \[ \sum_{i=1}^{N}h_ig_mV_i(t)+\sum_{i=1}^{N}h_ig_oV_i(t)+\sum_{i=1}^{N}h_iC_g\frac {dV_it(t)}{dt} = 0 \] Defining the fractional sizing factors and the total sizing factor as \(x_i=\frac{h_i}{H}\) and \(H=\sum_{i=1}^{N}h_i\) \[ a_n\sum_{i=1}^{N}x_iV_i(t) + V_1(t)+\tau\frac {dV_i(t)}{dt} = 0 \] where \(a_n = \frac {g_m}{g_o}\) and \(\tau=\frac {C_g}{g_o}\) are same dc gain and time constant defined previously

Since the total phase shift around the loop should be multiples of \(2\pi\), the oscillation waveform at the ith node can be expressed as \[ V_i(t) = V_o \cos(\omega_nt-\Delta \varphi \cdot i) \] where \(\omega_n\) is the oscillation frequency and \(\Delta \varphi = \frac {2\pi n}{N}\), \(N\) is the number of stages in the oscillator and \(n\) can take values between \(0\) and \(N-1\)

Plug \(V_i(t)\) into differential equation, we get \[ a_n\sum_{i=1}^{N}x_i\cos(\omega_n t-\frac{2\pi n}{N}i)+\cos(\omega_n t-\frac{2\pi n}{N}) - \omega_n \tau \sin(\omega_n t-\frac{2\pi n}{N}) = 0 \] By equating the \(cos(\omega_n t)\) and \(sin(\omega_n t)\) terms of the above equation, we get expressions for the oscillation frequency of the nth mode and the minimum dc gain required for this mode to exist. we refer to this gain as the mode gain \[\begin{align} \omega_n\tau &= \frac {\sum_{i=1}^{N}x_i \cdot \sin(\frac{2\pi n}{N}(i-1))}{-\sum_{i=1}^{N}x_i \cdot \cos(\frac{2\pi n}{N}(i-1))} \\ a_n &= \frac {1}{-\sum_{i=1}^{N}x_i \cdot \cos(\frac{2\pi n}{N}(i-1))} \end{align}\] where \(a_n\) should be greater than \(0\) for a existent mode

In practice, the oscillator starts first from a linear mode of operation where all the buffers are indeed acting as linear transconductors. All oscillation modes that have mode gains \(a_n\) lower than the actual dc gain of the inverter \(a_o\) start to grow. As the oscillation amplitude grows, the effective gain of the inverter drops due to nonlinearity. Consequently, modes with higher mode gain die out and only the mode that requires the minimum gain continues to oscillate and hence is the dominant mode

A. A. Hafez and C. K. Yang, "Design and Optimization of Multipath Ring Oscillators," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 10, pp. 2332-2345, Oct. 2011, doi: 10.1109/TCSI.2011.2142810.

Simulation-based approach

GCHECK is an automated verification tool that validate whether a ring oscillator always converges to the desired mode of operation regardless of the initial conditions and variability conditions. This is the first tool ever reported to address the global convergence failures in presence of variability. It has been shown that the tool can successfully validate a number of coupled ring oscillator circuits with various global convergence failure modes (e.g. no oscillation, false oscillation, and even chaotic oscillation) with reasonable computational costs such as running 1000-point Monte-Carlo simulations for 7~60 initial conditions (maximum 4 hours).

  • The verification is performed using a predictive global optimization algorithm that looks for a problematic initial state from a discretized state space
  • despite the finite number of initial state candidates considered and finite number of Monte-Carlo samples to model variability, the proposed algorithm can verify the oscillator to a prescribed confidence level

image-20220320183344877

  • The observation that the responses of a circuit with nearby initial conditions are strongly correlated with respect to common variability conditions enables us to explore a discretized version of the initial condition space instead of the continuous one.
  • the settling time increases as the initial state gets farther away from the equilibrium state allowed us to use the settling time as a guidance metric to find a problematic initial condition.

Selecting the Next Initial Condition Candidate to Evaluate

To determine whether the algorithm should continue or terminate the search for a new maximum, the algorithm estimates the probability of finding a new initial condition with the longer settling time, based on the information obtained with the previously-evaluated initial conditions.

GCHECK EXAMPLE

1
python gcheck_osc.py input.scs

output log:

1
2
3
4
5
6
7
8
Step 1/4: Simulating the setting-time distribution with the reference initial condition
...
Step 2/4: Simulating the setting-time distribution for randomly-selected initial probes
...
Step 3/4: Searching for Problematic Initial Conditions
...
Step 4/4: Reporting Verification Results and Statistics
...

image-20220320201718018

T. Kim, D. -G. Song, S. Youn, J. Park, H. Park and J. Kim, "Verifying start-up failures in coupled ring oscillators in presence of variability using predictive global optimization," 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2013, pp. 486-493, doi: 10.1109/ICCAD.2013.6691161.GCHECK: Global Convergence Checker for Oscillators

Before inserting sign-off metal fill, stream out a GDSII stream file of the current database. Specify the mapping file and units that match with the rule deck you specify while inserting metal fill. If necessary, include the detailed-cell (-merge option) Graphic Database System (GDS).

PVS:

1
2
streamOut -merge $GDSFile -mode ALL -units $GDSUNITS -mapFile $GDSMAP -outputMacros pvs.fill.gds
run_pvs_metal_fill -ruleFile $DUMMYRULE -defMapFile $DEFMAP -gdsFile pvs.fill.gds -cell [dbgDesignName]

Pegasus:

1
2
streamOut -merge $GDSFile -mode ALL -units $GDSUNITS -mapFile $GDSMAP -outputMacros pegasus.fill.gds
run_pegasus_metal_fill -ruleFile $DUMMYRULE -defMapFile $DEFMAP -gdsFile pegasus.fill.gds -cell [dbgDesignName]

Just replace run_pvs_metal_fill with run_pegasus_metal_fill

Note: Innovus metal fill (e.g. addMetalFill, addViaFill, etc.) does not support 20nm and below node design rules. We strongly recommend the Pegasus/PVS metal fill solution for 20nm and below. If you have sign-off metal fill rule deck for 28nm and above available, we recommend you to move to Pegasus/PVS solution too.

  1. trimMetalFillNearNet does not check DRC rules. It only removes the metal fill with specified spacing

  2. Do not perform ECO operations after dump in sign-off metal fill (by run_pvs_metal_fill or run_pegasus_metal_fill), especially, at 20nm and below nodes.

  3. If you perform an ECO action, the tool cannot get DRC clean because trimMetalFill does not support 20nm and below node design rules.

  4. The sign-off metal fill typically does not cause DRC issues with regular wires.

The run_pvs_metal_fill command does the following:

  • Runs PVS with the fill rules to create a GDSII output file.
  • Converts the GDSII to a DEF format file based on the GDSII to DEF layermap provided.
  • Loads the resulting DEF file into Innovus.

Pegasus is similar to PVS, shown as below,

The run_pegasus_metal_fill command does the following:

  • Runs Pegasus with the fill rules to create a GDSII output file.
  • Converts the GDSII to a DEF format file based on the GDSII to DEF layermap provided.
  • Loads the resulting DEF file into Innovus.

Reference:

Innovus User Guide, Product Version 21.12, Last Updated in November 2021

Process Corners

image-20220302000743092

Global/Local Variation

image-20220302000808679

Timing and RC Modeling with Process Corners

image-20220302000916728

Global and Local variation by Gaussian

image-20230111003336823

Local Monte-Carlo (SSG, FFG with Local Gaussian) as Signoff golden

image-20231215234014594

Process Corner Model Limitations

image-20220323232119661

image-20230111003705417

Variation section

  • Total corner (TT/SS/FF/SF/FS)
    • E.g. TTMacro_MOS_MOS_MOSCAP
  • Global Corner (TTG/SSG/FFG/SFG/FSG) + Local MC
    • E.g. TTGlobalCorner_LocalMC_MOS_MOSCAP
  • Local MC
    • E.g. LocalMCOnly_MOS_MOSCAP
  • Global MC + Local MC (Total MC)
    • GlobalMC_LocalMC_MOS_MOSCAP

image-20230308003005235

image-20230111010426775

img

image-20230111011757049

SSGNP, FFGNP:

When N/P global correlation is weak (R^2=0.15), the corner of N/PMOS balance circuit (e.g. inverter) can be tightened (3sigma -> 2.5sgma) due to the cancellation between NMOS and PMOS

SSGNP, FFGNP usually used in Digital STA

  • Global variation validation with global corner
    • 3-sigma of global MC simulation is aligned with global corner
  • Total variation validation with total corner
    • 3-sigma of global MC + local MC (total) simulation is aligned with total corner

Global Corner

https://community.cadence.com/cadence_technology_forums/f/custom-ic-design/20466/monte-carlo-simulation-global-local-vs-local-and-process-vs-mismatch/1365101#1365101

The "total corner" is representative of the maximum device parameter variation including local device variation effects. However, it is not a statistical corner.

The "global" corner is defined as the "total" corner minus the impact of "local variation"

Hence, if you were to examine simulation results for a parameter using a "total" and "global" corner, you would find the range of variation will be less with the "global" corner than with the "total" corner.

The "global" corner is provided for use in statistical simulations. Hence, when performing a Monte-Carlo simulation, the "global" corner is selected - NOT the "total" corner.

image-20230511234313012

\[ \Delta V_{T,\sigma_{total}} = \sqrt{\Delta^2 _{T, \sigma_{global}}+\Delta^2 _{T, \sigma_{local}}} \]

reference

Eric J.-W. Fang, T5: Fundamentals of Process Monitors for Signoff-Oriented Circuit Design, 2022 IEEE International Solid-State Circuits Conference

Alvin Loke, Device and Physical Design Considerations for Circuits in FinFET Technology, ISSCC 2020 Short Course

簡報 Cln16ffcll Sr V1d0 2p1 Usage Guide URL: https://usermanual.wiki/Document/cln16ffcllsrv1d02p1usageguide.1649731847/view

Radojcic, Riko, Dan Perry and Mark Nakamoto. “Design for manufacturability for fabless manufactuers.” IEEE Solid-State Circuits Magazine 1 (2009): n. pag.

How To Reduce Implementation Headaches In FinFET Processes URL: https://semiengineering.com/how-to-reduce-implementation-headaches-in-finfet-processes/

陌上风骑驴看IC, STA | SSGNP, FFGNP. https://mp.weixin.qq.com/s/eJ8fYRJBR1E9XbfH95OUOg

陌上风骑驴看IC, STA | ssg 跟ss corner 的区别——谬误更正版 https://mp.weixin.qq.com/s?__biz=MzUzODczODg2NQ==&mid=2247486225&idx=1&sn=e9c68f6108ae6c9958d47ca0b29373ca&chksm=fad262cfcda5ebd949cc91353c7cbfaf4ba61179306f7d8e98461a4f4ca9d8a9baef5e9f2cc1&scene=178&cur_album_id=1326356275000705025#rd

The Evolution, Pitfalls, and Cargo Cult Engineering of Advanced Digital Timing Sign-off https://www.tauworkshop.com/2021/speaker_slides/christian_l.pdf

Don O'Riordan Cadence Design Systems. Recommended Spectre Monte Carlo modeling methodology [https://designers-guide.org/modeling/montecarlo.pdf]

PrimeTime does not automatically identifies multicycle paths.

specifying multicycle path for setup

1
2
3
4
5
set_multicycle_path 6 -from reg[26]/CP -to reg/D
# or
set_multicycle_path -setup 6 -from reg[26]/CP -to reg/D
# check the exception
report_exception

specifying multicycle path for hold (new data every 6 cycles)

1
2
set_multicycle_path -setup 6 -to [get_pins "*reg[*]/D"]
set_multicycle_path -hold [expr 6-1] -to [get_pins "*reg[*]/D"]

image-20220301215938296

MH stands for Hold Multiplier, MS for Setup Multiplier. The Setup multiplier counts up with increasing clock cycles, the Hold Multiplier counts up with decreasing cycles. The origin (0) for the Hold Multiplier is always at the Setup Multiplier - 1 position.

Reporting a multicycle path with report_timing

1
report_timing -exceptions all -from *reg[26]/CP -to *reg/D

Timing Exceptions

If certain paths are not intended to operate according to the default setup and hold behavior assumed by the PrimeTime tool, you need to specify those paths as timing exceptions. Otherwise, the tool might incorrectly report those paths as having timing violations.

The PrimeTime tool lets you specify the following types of exceptions:

  • False path – A path that is never sensitized due to the logic configuration, expected data sequence, or operating mode.
  • Multicycle path – A path designed to take more than one clock cycle from launch to capture.
  • Minimum or maximum delay path – A path that must meet a delay constraint that you explicitly specify as a time value.

REF

PrimeTime® User Guide Version O-2018.06-SP4 Chapter 1: Introduction to PrimeTime Overview of Static Timing Analysis - Timing Exceptions

PT picks the most restrictive pair of edges for setup and for hold. It determines which edges to be used as follows:

  1. Evaluate waveforms over the smallest common base period

  2. For each capture edge, find the closest setup launch edge. Call these the primary pairs

  3. Out of the primary pairs, pick the most restrictive setup launch and capture edges.

  4. For each primary pair, draw two hold relationships:

    • Launch to (capture - 1)

    • (Launch + 1) to Capture

      From all of these hold relationships, pick the most restrictive.

PrimeTime uses the ideal clock waveform (as reported in report_clock) to determine the appropriate clock edges for inter-clock analysis.

image-20220301203135490

The most restrictive setup pair is from Clk1 8ns to Clk2 9ns

The most restrictive hold pair is from Clk1 0ns to Clk2 0ns

primary clocks

  • Primary clocks should be created at input ports and output pins of black boxes.
  • Never create clocks on hierarchy pins. Creating clocks on hierarchy will cause problems when reading SDF. The net timing arc becomes segmented at the hierarchy and PrimeTime will be unable to annotate the net successfully.

generated clocks

  • Generated clocks are generally created for waveform modifications of a primary clock (not including simple inversions). PrimeTime does not simulate a design and thus will not derive internally generated clocks automatically - these clocks must be created by the user and applied as a constraint.
  • PrimeTime caculate source latency for generated clocks if primary clock is propagated, otherwise its source latency is zero.
1
2
3
4
5
# primary clock
create_clock -period 4 [get_ports Clk]
set_clock_latency -source 2 [get_clocks Clk]
set_propagated_clock [get_clocks Clk]
create_generated_clock -divide_by 2 -name div_clk -source [get_ports Clk] FF3/Q

virtual clocks

  • Are clock objects without a source
  • Do not clock sequential devices within the current_design
  • Serve as references of input or output delays
1
2
3
4
# create a virtual clock, vclk, for input and output delay constraints
create_clock -period 5 -name vclk
set_input_delay -max 2 -clock vclk [get_ports in1]
set_output_delay -max 1 -clock vclk [get_ports out2]

There is no network latency to calculate, even if a virtual clock is propagated.

Mechanism in UVM-1.1

1
2
3
4
5
function void test_base::start_of_simulation_phase(uvm_phase phase);
super.start_of_simulation_phase(phase);
uvm_top.print_topology(); // Will not compile in UVM-1.2
factory.print(); // Will not compile in UVM-1.2
endfunction

Global handles uvm_top and factory in uvm_pkg have been removed in UVM-1.2 and later

Mechanism in UVM-1.1 and UVM-1.2

Call the get() method of the class to retrieve the singleton handle.

1
2
3
4
5
function void test_base::start_of_simulation_phase(uvm_phase phase);
super.start_of_simulation_phase(phase);
uvm_root::get().print_topology(); // Works in UVM-1.1 & UVM-1.2
uvm_factory::get().print(); // Works in UVM-1.1 & UVM-1.2
endfunction

Mechanism Only in UVM-1.2

1
2
3
4
5
function void test_base::start_of_simulation_phase(uvm_phase phase);
uvm_coreservice_t cs = uvm_coreservice_t::get();
cs.get_root().print_topology();
cs.get_factory().print();
endfunction

uvm_coreservice_t is the uvm-1.2 mechanism for accessing all the central UVM services such as uvm_root,uvm_factory, uvm_report_server, etc.

1
2
3
4
5
6
7
// Using the uvm_coreservice_t:
uvm_coreservice_t cs;
uvm_factory f;
uvm_root top;
cs = uvm_coreservice_t::get();
f = cs.get_factory();
top = cs.get_root();

Topology of the test

1
2
3
function void test_base::start_of_simulation_phase(uvm_phase phase);
uvm_root::get().print_topology(); // defaults to table printer
endfunction

The default printer policy is uvm_default_table_printer

There are three default printer policies that the uvm_pkg provides:

uvm_default_table_printer uvm_default_tree_printer uvm_default_line_printer

Check all configuration settings

Check all configuration settings in a components configuration table to determine if the setting has been used, overridden or not used. When recurse is 1 (default), configuration for this and all child components are recursively checked. This function is automatically called in the check phase, but can be manually called at any time.

To get all configuration information prior to the run phase, do something like this in your top object:

1
2
3
function void start_of_simulation_phase(uvm_phase phase);
check_config_usage();
endfunction

UVM phase

image-20220221230440017

  • Functions are executed bottom-up (Except for build and final phases , which are executed top-down)
  • Tasks are forked into concurrent executing threads

Objections are handled in pre/post body decalared in a base sequence class

This is efficient for all sequence execution options:

  • Default sequences use body objections

  • Test sequences use test objections

  • Subsequences use objections of the root sequence which calls them

1
2
3
4
5
6
7
8
9
10
11
12
13
class yapp_base_seq extends uvm_sequence#(yapp_packet);

task pre_body();
if(starting_phase != null);
starting_phase.raise_objection(this, get_type_name());
endtask
task post_body();
if(starting_phase != null);
starting_phase.drop_objection(this, get_type_name());
endtask

class yapp012 extends yapp_base_seq;

Set as default sequence of sequencer:

A default sequence is a root sequence, so the pre/post body methods are executed and objection are raised/dropped there

Test sequence:

A test sequence is a root sequence, so the pre/post body methods are executed. However, for a test sequence, starting_phase is null and so the objection is not handled in the sequence. The test must raise and drop objections

Subsequence:

Not a root sequence, so pre/post body methods are not executed. The root sequence which ultimately called the subsequence handles the objections, using one of the two options above.

If a sequence is call via a `uvm_do variant, the it is defined as a subsequence and its pre/post_body() methods are not executed.

objection change in UVM1.2

Raising or dropping objections directly from starting_phase is deprecated

  • must use get_starting_phase() method

  • prevents modification of phase during sequence

1
2
3
4
5
task pre_body();
uvm_phase sp = get_starting_phase();
if (sp != null)
sp.raise_objection(this, get_type_name());
endtask

First place place hard macro and add placement halo, then execute the following code to add endcap and tapcell.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
deleteFill -prefix ENDCAP
deleteFill -prefix WELL

setEndCapMode \
-leftEdge BOUNDARY_RIGHTBWP16P90CPD \
-rightEdge BOUNDARY_LEFTBWP16P90CPD \
-leftBottomCorner BOUNDARY_NCORNERBWP16P90CPD \
-leftTopCorner BOUNDARY_PCORNERBWP16P90CPD \
-rightTopEdge FILL3BWP16P90CPD \
-rightBottomEdge FILL3BWP16P90CPD \
-topEdge "BOUNDARY_PROW2BWP16P90CPD BOUNDARY_PROW3BWP16P90CPD"
-bottomEdge "BOUNDARY_NROW2BWP16P90CPD BOUNDARY_NROW3BWP16P90CPD" \
-boundary_tap true

set_well_tap_mode \
-rule 33
-bottom_tap_cell BOUNDARY_NTAPBWP16P90CPD \
-top_tap_cell BOUNDARY_PTAPBWP16P90CPD \
-cell TAPCELLBWP16P90CPD

addEndCap
addWellTap -checkerBoard -cell TAPCELLBWP16P90CPD -cellInterval 160

endcap_tapcell_floorplan.drawio

REF: StanfordAHA/PowerDomainDesign

0%