June 6, 2016, Full Video Transcript Below
Variation-Aware, Energy Efficient SRAM Designs
I’ll talk about variation-aware design that we specifically went through for one of our test chips, and the context in which design can potentially solve many of the variation issues that we see in memory.
If you look at some of the data from SOC providers, mostly in the wireless space, you can see the trend is clearly for:
- More memory at lower cost;
- Memory that is faster, denser, and that also demands much lower active and retention Vmin and currents.
The traditional solution for the cost-critical IoT space that we are entering is to use 8T or 10T cells, simply because their active Vmins are much lower than those of conventional 6T bit cells. However, the requirements on cost make a 6T solution much more attractive, if we can get the Vmin for 6T arrays as low as the near-threshold region.
The other reasons for pursuing a 6T solution are:
- It’s much less expensive, not only in area but also in power and speed.
- Design, verification, and test flows are readily available for 6T instances versus 8T or 10T arrays. And the 6T standard bit cell from the foundries is already optimized during the technology development of CMOS platforms.
From these trends and from emerging markets in IoT, you can clearly see fast, very low voltage, 6T SRAM is really the most desirable solution.
SRAM Design Primary Challenges
So what are the three primary challenges in SRAM design? As I just mentioned, the first is ultra-low Vmin in the near-threshold region, below 400 mV, mostly driven by battery-life requirements of weeks to months for edge SoCs in the IoT space.
The second is that some of them require multi-gigahertz performance as well, again at low voltages, for high-end wearable devices where functionality and user experience are critical. Examples are the Smart Watch and Smart Glass products already in the market.
And the third requirement is ultra-low retention Vmin, where the retention Vmin can be well into the sub-threshold region and requires retention currents as low as hundreds of femtoamps per bit.
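To see what that retention spec implies, here is a back-of-envelope sketch. The specific numbers (200 fA/bit, a 32 Mb array, a 225 mAh coin cell) are illustrative assumptions, not figures from the talk:

```python
# Quick arithmetic on the retention spec: at hundreds of femtoamps per
# bit, even a multi-megabit array retains data on mere microamps.
# All numbers below are assumed for illustration.
bits = 32 * 1024 * 1024            # assumed 32 Mb array
i_bit = 200e-15                    # assumed retention current per bit (A)
i_retention = bits * i_bit         # total array retention current (A)

battery_mah = 225.0                # assumed CR2032-class coin cell capacity
hours = battery_mah * 1e-3 / i_retention   # ideal, retention-only lifetime
print(f"retention current ~ {i_retention*1e6:.2f} uA, "
      f"ideal battery life ~ {hours/24/365:.1f} years")
```

This ignores all other leakage and active power, but it shows why femtoamp-per-bit retention is what makes weeks-to-months (and beyond) battery life plausible at all.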
Our best practice from the design perspective is to develop novel, simple, peripheral circuit solutions.
- They add value to the CMOS solutions already provided by the foundry.
- They are much cheaper and take much less time to develop.
- They add value to the technology solutions, whether from device structure, high-k metal gate, etc.
- They are widely applicable across foundries and to standard 6T bit cells.
- They can actually enable simultaneously better Vmin, fmax, reliability, and leakage.
- They enable differentiation of design IP at the same foundry, and with the same bit cell.
To pursue this we do need to have a holistic and accurate statistical design. There are several examples that we can show. I’ll go into detail on one such example a couple of slides downstream.
Variation-aware Design Must-Haves
The four variation must-haves, the four requirements in any variation-aware analysis, are:
- The tool must be foundry-calibrated, so that simulations of the foundry bit cell, logic devices, etc. match the foundry specs: for example, the bit cell yield for a 32-megabit array.
- Simulation speed, of course, ease of adoption, breadth and variety of circuits and circuit metrics to which the tool can be applied.
- Ease of post-processing, so that whatever data we get from statistical simulations can come in any desirable format, whether it’s Excel sheets, ASCII text, PDF, etc.
- Most importantly, the ease of automating, capturing, and analyzing statistical simulations from multiple runs, where you vary parameters across the entire design hierarchy: the device level, the circuit level, multiple variants in circuit styles, etc. The ability of a tool to provide these statistics of statistical simulations is also very useful.
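The foundry-calibration point above comes down to yield arithmetic. As a hedged sketch of what an array-level yield spec implies per bit (the 32 Mb size and the 99% yield target are assumed for illustration, and bits are treated as independent and identical):

```python
from statistics import NormalDist

# Back-of-envelope: what per-bit "sigma" does an array yield spec imply?
n_bits = 32 * 1024 * 1024          # assumed 32 Mb array
array_yield = 0.99                 # assumed array-level yield target

# Array yield Y = (1 - p_bit)^n_bits  ->  p_bit = 1 - Y^(1/n_bits),
# assuming independent, identical per-bit failures.
p_bit = 1.0 - array_yield ** (1.0 / n_bits)

# Express that per-bit failure probability as a one-sided Gaussian sigma.
sigma = NormalDist().inv_cdf(1.0 - p_bit)
print(f"per-bit failure prob ~ {p_bit:.2e}  ->  ~{sigma:.1f}-sigma design")
```

Numbers in this range are why bit cell simulations must be trusted out past six sigma, far beyond what a few thousand plain Monte Carlo samples can reach, and why foundry calibration of the statistical tool matters.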
Variation-Aware Design Results
Here is one example that I wanted to highlight. “Write Assist” is a common technique to improve the write margins of a 6T bit cell at low voltages.
Write Assist simply makes the bit cell PFET weaker and the pass-gate NFET stronger. This can be accomplished in many ways: lower the source-to-gate voltage of the PFET, or raise the gate-to-source voltage or the drain-to-source voltage of the pass-gate NFET.
You can accomplish this with a negative bitline, by raising the virtual ground, by lowering the column VDD, or with WLOD.
These are all circuit techniques that have been published in the literature. However, from our analysis, column VDD lowering is actually the most effective, and I’ll show you why.
Here is a simple timing diagram of the word line, the power line (which is the bit cell power supply), and the virtual ground (which is the ground terminal of the bit cell). So, in column lowering, essentially, during the write we lower the voltage of the power supply, or, in the case of a virtual-ground array, raise the virtual ground of the accessed bit cell, to improve the write margin.
If you look at the pair of devices in the bit cell that are most critical for a write, the pass-gate NFET and the pull-up PFET, the variability of the voltage across that stack is measured by the write-zero voltage: the voltage on the storage node that you’re trying to write a zero to.
And if you look at the measured distributions of that storage-node voltage as a function of the operating voltage and the power supply voltage, you can see that when you lower the column power supply, the distribution not only shifts the way you want it to, so that it’s much easier to write the bit cell, but also shrinks.
So this circuit technique is unique among write assist circuits in that it not only improves the nominal write margin of the bit cell, but actually improves its immunity to device fluctuations as well.
Just as a comparison, if you do the same thing with the virtual ground, that is, raise the virtual ground to accomplish the same goals of making the PFET weaker and the pass gate stronger, you can see that the distributions just shift. They don’t shrink in variance the way they do when you lower the column supply.
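The shift-versus-shrink distinction can be illustrated with a deliberately simplified Monte Carlo toy model. This is not device physics and not the talk’s data: the write-zero node voltage is modeled as a nominal level plus Gaussian Vt-mismatch noise, column-VDD lowering is modeled as a multiplicative (supply-referred) scaling, and a virtual-ground raise as an additive shift. All numeric values are assumed:

```python
import random
import statistics

random.seed(1)

# Toy model only: a multiplicative knob compresses the spread along with
# the mean, while an additive knob only moves the mean.
N = 100_000
noise = [random.gauss(0.0, 0.05) for _ in range(N)]   # Vt-mismatch term (V)

baseline   = [0.40 + n for n in noise]    # assumed nominal write-0 voltage
col_lower  = [0.75 * v for v in baseline] # column VDD lowered: scales mean AND sigma
vgnd_raise = [v - 0.10 for v in baseline] # virtual ground raised: shifts mean only

for name, dist in [("baseline", baseline),
                   ("column-VDD lowering", col_lower),
                   ("virtual-ground raise", vgnd_raise)]:
    print(f"{name:22s} mean={statistics.mean(dist):.3f} V  "
          f"sigma={statistics.stdev(dist):.3f} V")
```

The point of the sketch is the statistical behavior the speaker describes: the column-lowering distribution both moves and tightens, while the virtual-ground distribution only moves. A real comparison would of course come from SPICE-level statistical simulation of the bit cell.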
So this is an example of a circuit technique that can give you much larger improvements in write margin than you can accomplish at the technology level. Even if you go from planar transistors to FinFETs, for example, you will not accomplish as large an improvement in the write margin and its variability as a circuit technique like this would.
We accomplish this not by leveraging circuit solutions alone, but by designing very holistically with a statistical tool that has the must-haves I mentioned earlier. We also do the standard general things, such as statistical simulations to minimize the latency dispersion between matched race paths, which is also very critical for gaining performance at very low voltages.
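Why race-path dispersion needs statistical treatment can be shown with a small sketch. The stage count and per-stage delay numbers are assumed for illustration, not taken from the talk:

```python
import random
import statistics

random.seed(2)

# Two nominally matched delay chains (e.g. a timing path and its replica
# tracking path). Their nominal delays cancel; what is left is the
# statistical dispersion of the difference, which grows with stage count
# and per-stage sigma and must be covered by timing margin.
stages, trials = 12, 50_000
t_stage, sigma_stage = 50e-12, 5e-12   # assumed 50 ps/stage, 5 ps sigma

skew = []
for _ in range(trials):
    a = sum(random.gauss(t_stage, sigma_stage) for _ in range(stages))
    b = sum(random.gauss(t_stage, sigma_stage) for _ in range(stages))
    skew.append(a - b)   # path mismatch for one Monte Carlo sample

print(f"mean skew ~ {statistics.mean(skew)*1e12:.2f} ps, "
      f"sigma ~ {statistics.stdev(skew)*1e12:.1f} ps")
```

With independent stage delays, the sigma of the difference is sqrt(2 × stages) × sigma_stage, about 24.5 ps here, even though the nominal paths match exactly. At near-threshold voltages, per-stage sigma balloons, so this dispersion, not the nominal delay, sets the margin.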
Sensing bits at very low voltages with conventional sensing schemes, such as differential sense amplifiers, is a disaster: not only does the gain come down at low voltage in the near-threshold region, but the offset voltages also become a bigger fraction of the gate overdrive in conventional sensing schemes.
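The offset-versus-overdrive point is simple arithmetic, sketched below with assumed numbers (a 20 mV one-sigma input offset and a 0.4 V threshold are typical textbook-scale values, not figures from the talk):

```python
# The sense-amp input offset is set by device mismatch and stays roughly
# constant as the supply drops, while the available gate overdrive
# (VDD - Vt) shrinks. So the offset eats a growing fraction of the
# signal budget at low VDD. Numbers are illustrative assumptions.
sigma_offset = 0.020               # assumed 20 mV (1-sigma) input offset
vt = 0.40                          # assumed device threshold voltage (V)

for vdd in (1.0, 0.8, 0.6, 0.45):
    overdrive = vdd - vt
    print(f"VDD={vdd:.2f} V: offset is {100*sigma_offset/overdrive:.0f}% "
          f"of the {overdrive*1000:.0f} mV overdrive")
```

At nominal supply the offset is a few percent of the overdrive; near threshold it becomes tens of percent, which is why conventional differential sensing falls apart there and alternative sensing schemes are needed.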
So there are lots of circuit-technique opportunities that can enable many of the specs for 6T bit cell arrays in the near-threshold region for cost-critical IoT devices.