Sorin Dobre, Qualcomm

EDA Powered by Machine Learning

Hello. My presentation is seven minutes, so we do not have time to cover all the techniques for machine learning. It is focused more on their applicability in the EDA space, more of a kind of forecast of how these techniques can be used to address the design problems we are seeing at 10 nanometers, 7 nanometers, and beyond.

In terms of the usage of machine learning in the EDA design space, data analysis and pattern recognition have already been combined with algorithm-based solutions for physical and electrical verification in the existing EDA tools: in yield analysis, in model creation for electrical and power models, in vector generation with good-enough coverage, and in IP electrical validation, High-Sigma Monte Carlo analysis, and Worst Corner analysis and optimization.

These techniques are not exactly new; they are already used in the existing EDA tools. The question is how they can be taken to the next level of modeling to bring a higher level of productivity and throughput to the EDA tools.

There is a need for high-throughput EDA solutions that leverage machine learning, driven mainly by the big-data requirements of advanced SoC designs. If you look at the designs implemented in 10 nanometers and 7 nanometers, we have multi-billion-transistor designs.

We have a huge amount of data, and during design optimization, bringing these designs up in production requires new techniques for analyzing and processing the data so we can converge to an optimum solution.

Design implementation in 7 nanometers and 10 nanometers has become very, very challenging. How can you maintain your design cycle time when you move to the next process technology? If you have a timeline from design to tape-out or to manufacturing, how can you maintain the same schedule and the same timelines when your data size doubles, triples, or grows by 4x, and the functional complexity of the designs increases by an order of magnitude?

EDA has a great opportunity to expand the use of supervised and unsupervised machine learning solutions for RTL-to-GDS design flow optimization. We have highly automated flows, but they are not repeatable; they depend on the engineer's experience. If you have very good engineers with a very high level of experience, you can get very good quality designs. But if you bring up a new design team, you pretty much start from a very low level of experience, so you cannot guarantee the quality of your designs.

Then there is IT resource allocation. You have compute farms with tens of thousands of CPUs, but if you look at the number of designs which must be implemented and validated in parallel, the question is how to allocate these resources optimally, maximizing their utilization without an explosion in resource requirements.

Another area where you can see a significant opportunity for machine learning is IP characterization. If you go to the latest process technology nodes, let's say 10 nanometers, 7 nanometers, you see a lot of process variation. The effort on the foundry side to bring up a new process technology is significantly higher.

You are required to do design work that is closer to analog design. Even in the digital design space, you must validate and verify your design across many process conditions, which drives an explosion in the number of corners. The question is how you can get all these corners characterized on time, with very good quality, without an explosion in the number of resources and licenses you need.

This is a great area where machine learning can bring:

  • an improvement in productivity by 10X
  • a reduction in characterization time by weeks
  • and a reduction in the number of resources: instead of characterizing the library suite with 4,000 CPUs for a month, you can do it in two weeks with 1,000 CPUs.

It’s a significant economic benefit.

Then, as I mentioned, data management. How do you deal with this huge amount of data in the design space and on the foundry side in an efficient manner? How can you extract the knowledge and information you need for the next design and for the next product, to reduce your learning cycle on the design side and the foundry side?

In terms of areas where we see great potential for machine learning, the first one I put down is yield analysis, because from our experience at the latest process technology nodes, we can spend months root-causing different sources of yield issues on the process side.

Being able to improve your yield by two or three points has a significant economic benefit. On this huge amount of data, machine learning is an effective method for identifying the patterns which are driving your yield failures. So, this is an area where we see significant potential with a direct economic benefit, right now.
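As a hedged sketch of what this kind of pattern mining could look like (the parameter names, the synthetic data, and the choice of a random-forest classifier are assumptions for illustration, not a description of any production flow), one could train a simple classifier on per-die parameters and inspect which features correlate most with failing dies:

```python
# Illustrative sketch: rank which process/electrical parameters correlate with
# yield loss. Parameter names and data are synthetic placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_die = 5000
df = pd.DataFrame({
    "vth_shift":      rng.normal(0.0, 1.0, n_die),
    "cd_variation":   rng.normal(0.0, 1.0, n_die),
    "via_resistance": rng.normal(0.0, 1.0, n_die),
    "leakage":        rng.normal(0.0, 1.0, n_die),
})
# Synthetic ground truth: dies fail mostly when via resistance is high.
df["fail"] = (df["via_resistance"] + 0.3 * rng.normal(size=n_die) > 1.5).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(df.drop(columns=["fail"]), df["fail"])

# Feature importances give a first-pass ranking of which parameters are most
# associated with failing dies, to guide root-cause work on real test data.
for name, score in sorted(zip(df.columns[:-1], clf.feature_importances_),
                          key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {score:.3f}")
```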

In terms of characterization of IPs, we must address the challenges for the next process technology node. We cannot look only at what we have today, but must find solutions for the near future and for the long-term future. 

If I need to characterize 200 corners in two weeks for old IPs, what kind of solution can I employ to meet this challenge with existing resources? If I just scale the resources and the number of licenses with the number of corners from where I am today, I need to double or triple all these resources. That is not a viable solution to what is a straightforward technical problem.

In terms of design implementation and closure, we see a significant opportunity in timing closure, physical implementation, physical verification, and functional verification. If we just observe the effort that goes into closing a design with a very high level of complexity, we can identify multiple repetitive tasks across all the designers.

All these tasks can be captured, and then we can employ a system which applies these rules and this captured knowledge to automate the process, and to drive convergence much faster with predictable design quality.

That is very important. If you have a complex SoC and multiple design teams working across the globe, you can find that one design team is doing a very good job, another is doing average work, and another is under schedule pressure and the quality of its results is not on par with the other implementations.

Now all of these go into the same SoC. So, you have to find a solution where you can guarantee the quality of execution on the same timeline, and help the engineers get to that quality faster.

For physical design implementation, this solution can be employed inside the tools, and it can also work alongside the existing EDA solutions. It's not necessary that everything be implemented in the EDA tools. You can have independent EDA machine learning solutions which drive the existing tools to increase productivity, to increase predictability, and to increase the quality of design.

For functional verification, we are spending 80 to 90 percent of our resources in verification, and still after everything is done, I’m not sure the design is going to have 100 percent coverage. So, machine learning has a great opportunity to address some of these challenges in verification.

The full RTL-to-GDS flow which is employed today — from 180 nanometers to 7 nanometers to 5 nanometers to this extremely complex SoC — can benefit significantly from machine learning.

IT resource optimization for parallel computing: this is what we are using today for production, and there is a great opportunity for optimization. It is not just a matter of throwing more computers at the problem and buying another, later generation of CPUs to address the design problems we are facing today.

In terms of semiconductor areas where machine learning has the greatest potential, I’d like to highlight library characterization, where we can build models from existing characterization data to predict new data points.

What we have seen working with the characterization solution developed by Solido is that we can actually achieve very good accuracy by using probably half of the existing corners as reference data to predict the other corners. This is a significant benefit in terms of reducing the resources required for characterization and reducing the characterization time, with very good quality in the output results.
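As a minimal sketch of the underlying idea (this is not Solido's actual algorithm; the Gaussian-process model, the corner encoding, and the delay numbers are assumptions for illustration), a model fitted on the characterized corners can predict a timing quantity at the remaining corners and flag uncertain ones:

```python
# Illustrative sketch: predict a timing arc's delay at uncharacterized PVT
# corners from the corners that were fully characterized. The model choice,
# corner encoding, and numbers are assumptions for illustration only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Each corner is encoded as (voltage in V, temperature in C, process index);
# y is the characterized delay of one arc at that corner (made-up values, ps).
X_characterized = np.array([
    [0.60, -40, 0], [0.60, 125, 0], [0.75,  25, 1],
    [0.90, -40, 2], [0.90, 125, 2], [0.75, 125, 1],
])
y_characterized = np.array([82.0, 74.0, 55.0, 41.0, 38.0, 57.5])

# Anisotropic length scales roughly matched to each feature's range.
kernel = ConstantKernel() * RBF(length_scale=[0.1, 50.0, 1.0])
model = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
model.fit(X_characterized, y_characterized)

# Predict the same arc at corners that were not simulated; the uncertainty
# estimate can flag corners that should still get real SPICE characterization.
X_new = np.array([[0.60, 25, 0], [0.90, 25, 2]])
pred, std = model.predict(X_new, return_std=True)
for corner, d, s in zip(X_new, pred, std):
    print(f"corner {corner}: predicted delay {d:.1f} ps (+/- {s:.1f} ps)")
```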

For design implementation and timing closure, we need to develop methods to capture design knowledge during the implementation process. We have senior engineers with 20 years of experience who can guarantee extremely good design quality.

We also have junior engineers who should not have to start from scratch every time, and you cannot go through a five-year educational process to bring these engineers to full productivity. So, we need a system that can help the engineers generate data and be productive with very good quality.

Machine learning can improve the overall QoR of new designs and significantly improve the time to design closure, timing closure, IR drop closure, and physical verification closure. There is also a great opportunity in yield analysis of the data sets from semiconductor volume production, to identify the key physical and process parameters which drive yield loss.

We have seen the direct benefit to the design community in improving the design, and also on the foundry side. Root-causing these particular problems on the foundry side can take months before the issues are found and fixed.

Where and how can this solution be integrated into the existing EDA ecosystem?

Some of them can be integrated into EDA tools. For characterization, it makes sense to have the solution integrated into the characterization tool. It completely changes the paradigm of how these tools must be developed. There is a benefit in having these tools run faster and generate the data faster.

So, we want to develop a business model that rewards the EDA companies that are able to improve the productivity of the tool, to improve the throughput of the tool, and to reduce these run times by an order of magnitude.

Machine learning is one of the solutions that can provide the roadmap for this type of EDA solution. But machine learning can also be used to develop a standalone solution driving EDA tools. It's not necessary to know everything; you can have a core machine learning technology and drive existing EDA solutions, especially in the design implementation space, the design verification space, and the RTL-to-GDS flow.

You can build a standalone machine learning solution that works with the existing EDA solutions to improve the quality of the results coming through the flow. With machine learning, the expectation is the capability to optimize the overall design flow and to provide significant benefit to the semiconductor industry.

This standalone solution requires the capability to interact with and access multiple tools. It is possible to work with multiple EDA companies, but there is a need for flexibility from the major EDA companies to enable these new solutions and to work together on them. If you have a closed system and you don't allow anyone to interact with your tools, we cannot develop anything.

There’s a significant benefit to incorporating machine learning – unsupervised and supervised – in verification and characterization tools. Having access to massive data provides the opportunity for significant throughput and productivity improvement by using these advanced modeling and data analytics.

Thank you.

Q&A

Amit Gupta: So, the question is how has machine learning been used to drive sample selection as part of the technologies. Does someone want to address that?

Sorin: I have used [Solido's] Machine Learning Characterization. We developed a methodology to identify anchor corners which can be used for characterization.

First, you characterize all the corners for a small sample of cells. Then, based on these full characterizations, we identify which corners should be defined as anchor corners. Those anchor corners are fully characterized for all the libraries, and the remaining corners are predicted. To validate the technology, we applied this methodology to all the libraries.
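A hedged toy sketch of anchor-corner selection (the greedy procedure, the matrix shape, and the random placeholder data are assumptions for illustration, not the production methodology): using the fully characterized sample cells, pick the corners from which a simple model best reconstructs the remaining corners:

```python
# Illustrative sketch of anchor-corner selection on a small sample of cells.
# `delays` is a (cells x corners) matrix of fully characterized values for the
# sample cells; the data here are random placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_cells, n_corners, n_anchors = 50, 20, 6
delays = rng.normal(size=(n_cells, n_corners)).cumsum(axis=1)  # fake sample data

anchors = []
for _ in range(n_anchors):
    best_corner, best_err = None, np.inf
    for c in range(n_corners):
        if c in anchors:
            continue
        trial = anchors + [c]
        rest = [k for k in range(n_corners) if k not in trial]
        # Try predicting every non-anchor corner from the candidate anchor set,
        # and keep the corner whose addition reduces the error the most.
        model = LinearRegression().fit(delays[:, trial], delays[:, rest])
        err = np.mean((model.predict(delays[:, trial]) - delays[:, rest]) ** 2)
        if err < best_err:
            best_corner, best_err = c, err
    anchors.append(best_corner)

print("Selected anchor corners:", sorted(anchors))
```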

We have very complex libraries with sequential cells, what we call standard cell libraries. They include multi-bit flops and mini-macros; the cells have a very high level of complexity. We have standard cells in multiple architectures, with 9-track, 10-track, and 12-track libraries. Then we validated this methodology at multiple process technology nodes to see if it produces valid data with very good accuracy.

Then we identify the anchor corners, run the characterization for them, and then do the prediction. Then you do multiple validations. You do point-to-point validation for the libraries you have fully characterized: you check every point in the library to see the delta between the predicted value and the characterized value.

We have a spec. We characterize everything for these libraries: delay, slew, CCS data, power, and leakage. Everything you have today in the Liberty format gets predicted. Then we apply the same spec that is used for the SPICE characterization.

For characterization, you can use multiple SPICE tools. So how do I know that a given SPICE tool, or the data generated by SPICE characterization, is correct? We define a set of accuracy criteria for these SPICE tools. We use the same criteria for the prediction and then look at the rate of failure. Through these mechanisms you see different types of failures, and you must root-cause why you have them, what is driving them, and how you can address them.
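As a hedged sketch of this kind of point-to-point check and failure-rate reporting (the table contents, the merge keys, and the 2 percent threshold are assumptions for illustration, not real characterization data or the actual spec):

```python
# Illustrative sketch of point-to-point validation: compare every predicted
# library value against its SPICE-characterized reference and report how many
# points fall outside an accuracy spec. Data and threshold are placeholders.
import pandas as pd

reference = pd.DataFrame({
    "cell":  ["INVX1", "INVX1", "NAND2X1", "DFFX1"],
    "arc":   ["A->Y", "A->Y", "A->Y", "CK->Q"],
    "index": [0, 1, 0, 0],
    "value_ref": [12.0, 25.0, 18.0, 41.0],           # e.g. delays in ps
})
predicted = reference.rename(columns={"value_ref": "value_pred"})
predicted["value_pred"] = [12.1, 24.6, 18.9, 41.3]   # ML-predicted values

merged = reference.merge(predicted, on=["cell", "arc", "index"])
rel_error = (merged["value_pred"] - merged["value_ref"]).abs() / merged["value_ref"].abs()

spec = 0.02                      # assumed per-point accuracy criterion: 2%
failures = merged[rel_error > spec]
print(f"{len(failures)} of {len(merged)} points outside spec "
      f"({100 * len(failures) / len(merged):.1f}% failure rate)")
print(failures[["cell", "arc", "index", "value_ref", "value_pred"]])
# Points outside spec would then be root-caused and re-characterized with SPICE.
```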

So, we need to develop a recovery mechanism. If you have failures, you must detect them and then have a mechanism to recover from them to get back to accurate data.

As Jeff [Jeff Dyck, Solido] mentioned, you need a methodology for engineers to be able to use these solutions. That comes from practice and through this validation process.