Ting Ku, NVIDIA

DAC 2017: EDA Powered by Machine Learning

So, machine learning seems to be the biggest buzz out there right now. So, what’s the magic?

The first thing we need to do is understand the classification of machine learning. I have a really, really simplified way of looking at things:

On the biggest circle outside we have this general way of doing work that we are all used to. And within it, there’s this machine learning way of doing things. And within that, there’s a deep learning aspect to this.

So, what does that mean? In grand engineering fashion, we always make tables to describe things.

So here I have a very simplified table, where you can see that the regular way is just point automation, and it’s deterministic. There’s no database involved and there are defined features.

For the machine learning way of doing things, the style is statistical – it’s not deterministic. We do involve a database, because if you want to learn something, you must learn from experience, of course. 

And in the machine learning category, we have predefined features — that differentiates between machine learning and deep learning.

In deep learning, it’s all the same except there are no predefined features. The natural question would be: what are features?

Well, I have an example up that people are a little more familiar with. In [Solido] Variation Designer, if you want to model a probability density function, you must have attributes. Features are essentially attributes that differentiate between one thing over another.

For people, it could be your hair color, how tall you are, or your gender. Those are features. In this case the features would be: PVT corners, the algorithm necessary to define the device under evaluation, and all the random variables of the devices.

Keep that in mind: features are things which are important to a problem. Now what is the flow of how to do this work?

It turns out, machine learning is an iterative process. You have a decision algorithm to begin with, and that decision-making algorithm will create a suggestion from the model.

The answer from the model may or may not be correct, so you must verify it. Once that verification is done, the data is included back in the database — that’s where the learning takes place, and the retraining takes place. And then the cycle starts over and over again.

At some point, the hope is that all these iterative cycles will make the model prediction pretty accurate. So, the next time a new case comes in, the prediction will be very good. The concept is pretty simple. I will talk about how to exactly do it in the next few slides.
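That loop — suggest, verify, feed back into the database, retrain — can be sketched in a few lines. This is only an illustration, not any real NVIDIA or Solido tool: the "model" here is a least-squares line, and `verify()` is a made-up stand-in for an expensive step like running a simulation.

```python
# Minimal sketch of the iterative loop: the model suggests a case, an
# expensive "verification" step checks it, the result goes back into
# the database, and the model is retrained. All names are illustrative.
import random

random.seed(0)

def verify(x):
    # Stand-in for the expensive step (e.g. running a simulation).
    return 2.0 * x + 1.0

database = [(0.0, verify(0.0)), (1.0, verify(1.0))]  # seed experience

def retrain(db):
    # Trivial "model": a least-squares line through the database.
    n = len(db)
    mx = sum(x for x, _ in db) / n
    my = sum(y for _, y in db) / n
    num = sum((x - mx) * (y - my) for x, y in db)
    den = sum((x - mx) ** 2 for x, _ in db) or 1.0
    slope = num / den
    return slope, my - slope * mx

for _ in range(5):
    slope, intercept = retrain(database)
    x_new = random.uniform(0, 10)           # model suggests a new case
    prediction = slope * x_new + intercept  # the model's answer
    actual = verify(x_new)                  # may or may not be correct
    database.append((x_new, actual))        # the learning happens here

slope, intercept = retrain(database)
print(round(slope, 3), round(intercept, 3))
```

Each pass through the loop grows the database, so the retrained model tracks the true behavior more and more closely.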

One thing I’d like to emphasize is that when people talk about machine learning, most people think it’s neural-network related.

But it turns out, it doesn’t necessarily need to be modeled by a neural network. It could be modeled by a closed-form equation, or it could be modeled by a neural network.

Here, I’m making a differentiation. In this example, you could do a Kernel Density Estimation, or you could do a deep learning neural network methodology. They are both machine learning.
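To make the closed-form side concrete, here is a minimal Gaussian kernel density estimate — a model with an explicit equation and no neural network. The sample values and bandwidth are made up for illustration.

```python
# Minimal Gaussian kernel density estimate: a closed-form model.
# Each sample contributes a small Gaussian bump; the density at a
# point is the average of all the bumps evaluated there.
import math

def gaussian_kde(samples, bandwidth):
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    def pdf(x):
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) / norm
            for s in samples
        ) / len(samples)
    return pdf

samples = [-1.0, 0.0, 0.5, 1.0]  # made-up observations
density = gaussian_kde(samples, bandwidth=0.5)
print(round(density(0.0), 4))
```

Both this and a deep network are machine learning in the sense of the table: statistical, trained from a database — they just differ in how the model is represented.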

In this talk, I want to talk a little more about the neural network because that seems to be a topic of interest by a lot of people. To make the process a little easier, I thought I’d give a very simple example of how machine learning works in the neural network environment.

So here I have a very simple chart of a database. In this problem, I’m trying to answer, “Is this person an Olympian or not?”

We all know Michael Phelps is an Olympian. And we all know that I, Ting Ku – I am not an Olympian. I also put Z [Marcisz] there, and Missy Franklin.

So over here essentially what we have are people who we know for sure are Olympians or not. This is a database that we want to use to train the process. Now the training happens by looking at the attributes or the features that I mentioned before.

Over here we list out the features like height, reach, how big the wingspan is, body mass index, the number of years trained, and gender. We all know the most important feature is probably the number of years trained. If you don’t train, you’re not going to be very good.

But will the machine learn about that? That’s the interesting point. A typical neural network mechanism goes like this: You take the features and you apply a particular weight, which we don’t know yet, and you put the data into an activation function and you answer the question: Olympian or not Olympian? 

Because we have known answers in the previous data, you can play with these weights. Eventually, at some point, you find a magical set of weights that successfully answers the question from the known database. Think of it as guessing: you keep guessing the weights until you get the right answers for the known database.
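That weight-guessing process can be sketched with a classic perceptron. Everything below is invented for illustration: two made-up features (height and years trained) and four fabricated rows stand in for the Olympian table, and the perceptron update rule does the "keep guessing" systematically.

```python
# Toy perceptron for the "Olympian or not" question. The two features
# (height in meters, years trained) and all four rows are made up;
# only years trained actually separates the two classes here.
data = [
    ([1.93, 20.0], 1),  # hypothetical Olympian
    ([1.88, 18.0], 1),  # hypothetical Olympian
    ([1.75, 0.0], 0),   # hypothetical non-Olympian
    ([1.80, 1.0], 0),   # hypothetical non-Olympian
]

weights = [0.0, 0.0]
bias = 0.0
lr = 0.1  # learning rate

def predict(x):
    # Step activation: threshold the weighted sum of the features.
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

# "Keep guessing the weights until the known database comes out
# right": the perceptron update rule makes that adjustment for us.
for _ in range(20):
    for x, label in data:
        error = label - predict(x)
        for i in range(len(weights)):
            weights[i] += lr * error * x[i]
        bias += lr * error

print([predict(x) for x, _ in data])  # matches the known labels
```

After training, the years-trained weight dominates — which answers the "will the machine learn that?" question for this toy data.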

Once you do that, you can predict someone else, like Amit [Amit Gupta, Solido] to see if he is an Olympian or not, and hopefully the model is accurate and you can predict that he is.

[Amit: What’s the answer? (laughter)]

Ting: So (Amit), to answer that question we would need to know your height, your reach, your BMI, and years of training. So please enter the information. (laughter)

Ok, so this was a rather simple example of how machine learning works.

At NVIDIA, we have many applications of machine learning.

Here I show on the left-hand side an imaging mechanism that will look at the lunch you have and figure out how much money you need to pay for your lunch. It’s quite interesting. I always eat salad, and when I put the salad there, it recognizes the salad and says it costs four dollars. Great, so I just pay four dollars.

On the right-hand side is an internal tool that we made to suggest the circuit specifications. What we had before, was a problem of incorrect specifications in the data sheet. And since the data sheet is the source of all your information, if you have an error there, how would you know?

No one knows except the designer. However, there are so many specifications, that the designer cannot be so careful with hundreds of specifications.

What we did was to look at all the previous specifications, and created a prediction mechanism. In this particular example, the minimum spacing the designer put for this particular cell was this number.

And the suggestion is what? 80 percent of the time it was zero. But this time, it was this odd number. So, the designer now has a way to look at the previous history of what that entry should be, and he can decide what it needs to be. This helps to narrow down the error rate of the specs.
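A suggestion mechanism of this flavor can be sketched as a simple historical-majority check. The entry name and numbers below are invented, matching only the "80 percent of the time it was zero" behavior described above.

```python
# Hypothetical sketch of the spec-suggestion check: compare a new
# entry against the dominant value in the historical data and flag
# a disagreement for the designer to review.
from collections import Counter

# Made-up history of the "minimum spacing" entry across past cells.
history = [0, 0, 0, 0, 0, 0, 0, 0, 0.013, 0.02]

def suggest(values):
    # Most common historical value, plus how often it occurred.
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)

new_entry = 0.013  # the odd number the designer just typed in
suggestion, confidence = suggest(history)
if new_entry != suggestion:
    print(f"entry {new_entry} differs from the usual value "
          f"{suggestion} (seen {confidence:.0%} of the time)")
```

The point is not the prediction itself but the workflow: the designer still decides, with the history in front of them, which narrows down the spec error rate.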

That’s a pretty successful application of machine learning using prediction from the previous data.

Another example is this mechanism from Solido.

They have a statistical PVT mechanism that will cover 4-sigma with only 300 simulations. This is also another machine learning mechanism. It creates a model, it makes a prediction, and you just need to do an iteration of simulations to see if that prediction is accurate. It learns slowly — or maybe quickly, not slowly; that’s a bad word. In the end, you gain efficiency out of that.
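The efficiency gain can be sketched with a toy surrogate-guided flow. This captures only the spirit of the idea, not Solido’s actual algorithm: fit a cheap model to a handful of expensive "simulations", then spend the remaining simulation budget only on the corners the model predicts to be worst.

```python
# Toy surrogate-guided corner search. expensive_sim() is a made-up
# stand-in for a SPICE run, and the corner values are synthetic.
import random

random.seed(1)

def expensive_sim(corner):
    # Stand-in for a real simulation: delay grows with the corner.
    return 1.0 + 0.5 * corner + 0.05 * corner ** 2

corners = [random.uniform(-4.0, 4.0) for _ in range(1000)]  # candidates

# Step 1: spend a small simulation budget on a training sample.
train = random.sample(corners, 20)
results = [(c, expensive_sim(c)) for c in train]

# Step 2: fit a cheap surrogate model (a least-squares line).
n = len(results)
mc = sum(c for c, _ in results) / n
md = sum(d for _, d in results) / n
slope = (sum((c - mc) * (d - md) for c, d in results)
         / sum((c - mc) ** 2 for c, _ in results))
intercept = md - slope * mc

def surrogate(c):
    return slope * c + intercept

# Step 3: simulate only the 10 corners the surrogate predicts worst.
suspects = sorted(corners, key=surrogate, reverse=True)[:10]
worst = max(expensive_sim(c) for c in suspects)
true_worst = max(expensive_sim(c) for c in corners)
print(worst == true_worst)  # found with 30 simulations, not 1000
```

Thirty simulations instead of a thousand — that is the kind of efficiency the model-predict-verify loop buys you when the surrogate is good enough to rank the corners.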

Now the question is, “Is the machine just going to learn and replace all our jobs? Or perhaps, today the learning mechanism is actually doing simple functions like predicting models and predicting specs, but can it create things?” Well, that’s the million-dollar question, because we as human beings pride ourselves on being creative, so to speak.

So lo and behold, this team from a university created this picture. What they did was to have the machine look at master-level paintings. And then they said, “Create a picture based on the essence of this painting.”

This is the result — on the right-hand side.

I don’t know about you, but I think that’s pretty creative. It means that the machine learning technique is actually pretty powerful, and we are only seeing the tip of the iceberg.

What needs to happen for the EDA industry is for them to stop just providing us with data. Data is nice but we really want decisions.

All you need to do is put a layer in between the data and the decision and the machine algorithm will learn what the decision should be by looking at the data.

The EDA industry is in a perfect position to do this type of work. So that concludes my really brief introduction about machine learning.

Thank you.

Q&A

Audience Member Question:

Say, basically, that I have a digital block that has N inputs. I need to do some power analysis on it and use the data to subsequently do the IR analysis. The problem I have is, given those N inputs, I may have 2^N combinations of vectors. So the problem is very similar to either the Monte Carlo or Fast PVT approaches; you pick either one.

The main problem is the number of samples is very, very large and in some cases the actual simulation time can be very long, even with one vector, because it can be a fully extracted netlist and so on.

So out of that massive number, like a billion samples, can machine learning be used to come up with a smaller set of vectors that can be used for power analysis, so that we don’t end up using this vectorless approach or activity-propagation approach, which is mostly pessimistic? I haven’t seen any paper or publication on doing vectorless power analysis or IR analysis using machine learning.

Ting Ku:

You are describing a problem that is a classic fit for this notion called principal component analysis. What you do is that you somehow vectorize your problem. A vector could have N dimensions. But you have no idea which dimension is the most important one. What you do is you use a clustering mechanism to figure out the principal components of the problem.

In that, you would realize that even though you have 100 variables, only 5 of them are important. You run that through the regression on the principal components and can reduce your design space. That’s usually the classical way to reduce that problem.
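A minimal PCA sketch of that reduction, with synthetic data: six raw features driven by only two underlying variables, so the first two principal components carry essentially all the variance and the design space collapses from six dimensions to two.

```python
# Minimal PCA: eigen-decomposition of the covariance matrix of
# synthetic data. Six raw features, but only two real drivers.
import numpy as np

rng = np.random.default_rng(0)

drivers = rng.normal(size=(200, 2))  # the two "important" variables
mixing = np.array([                  # each row: one raw feature's mix
    [1.0, 0.0],
    [0.0, 1.0],
    [0.9, 0.1],
    [0.1, 0.9],
    [0.5, 0.5],
    [0.0, 0.0],                      # a feature that is pure noise
]).T
data = drivers @ mixing + 0.01 * rng.normal(size=(200, 6))

# Classic PCA: center, form the covariance, take the eigenvalues.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
explained = eigvals / eigvals.sum()
print(explained.round(3))  # the first two components dominate
```

Reading off the explained-variance ratios tells you how many dimensions you actually need to keep before running the regression.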