According to wikipedia Pentium FDIV bug affects the floating point unit(FPU) of the early intel Pentium processors. Because of this bug incorrect floating point results were returned. In December 1994, Intel recalled defective processors. In January 1995, Intel announced “a pre-tax charge of 475 Million against earnings. Total cost associated with the replacements of the flawed processors.

What went wrong here?

According byte.com web page archive, Intel wanted to boost the execution of their floating-point unit. To do that they moved from its traditional shift-and-subtract division algorithm to new method called SRT algorithm. SRT algorithm uses a lookup table. Pentium’s SRT lookup table implementation is matrix of 2048 cells. Only 1066 of these cells actually contained the valid values. Due to issue in script loading the lookup table 5/1066 entries were not programmed with valid values.

SRT algorithm is recursive. This bug leads to corruption only with certain pairs of divisors and numerators. The “buggy pairs” are identified. They always lead to corruption. At its worst, the error can rise as high as the fourth significant digit of decimal number. Chances of this happening randomly, is about 1 in 360 billion. Usually the error appears around the 9^th or 10^th decimal digit. The chances of this happening randomly, is about 1 in 9 billion.

Isn’t randomization on inputs sufficient?

Here you can see even if we had constrained random generation on divisor and numerators how low is probability of hitting the buggy pair. Even if we create the functional coverage on the inputs how low is probability that we would specifically write coverpoints for the buggy pair.

Of-course now we have formal methods to prove correctness of such computationally intensive blocks.

We are already convinced that constrained random hitting on this buggy pair has very low probability. Hence even if it hits this case it may take long wall clock time. Pure floating-point arithmetic verification approach alone is not sufficient.

For a moment if you think what could have helped maximize the probability of finding this issue in traditional simulation based verification?

Simulation world is always time limited. We cannot do exhaustive verification of everything. Even a simple 32-bit counter exhaustive verification would mean 2 ^ 32 = 4G of state space. That’s where the judgment of the verification engineers plays major role as to what to abstract and what to focus.

Story of curious engineer

Let’s say a curious verification engineer like you looked inside the implementation. He tried to identify the major blocks of design and spotted the lookup table. His mind suddenly flashed insight floating-point inputs must use all valid entries of the lookup table. This is one more critical dimension for the quality of the input stimulus.

Let’s say he thought of checking that by writing functional coverage on input address of the lookup table. He thought of adding a bin per each valid cell of matrix.

This could have helped maximize the probability of catching this issue. There is a notion that verification engineers should not look into design as it can bias their thinking. While the intent is good but let’s not forget we are verifying this specific implementation of requirements. Some level of balancing act is required. It cannot be applied in purity.

He scans the following lookup table organization. Thanks to david deley for providing it:

Lookup table organization

Our curious engineer quickly makes notes about the following facts about the Look up table:

The matrix is organized as 128×16 = 2048 cells.
Out of these only 1066 cells are used
His well trained eyes notices that table is slightly asymmetrical
There are total of
- 20 rows at the top that have varying level of cell usage
- 46 rows that use all the cells of row
- 20 rows at bottom that have varying level of cell usage

He decides functional coverage on access for all the valid cells and simple assertion to check if the values were one of the {-2, -1, 1, 2}. Well assertion is simple enough in this case.

But the functional coverage requires 1066 bins. Due to asymmetrical nature it would at least need 86 entries of varying ranges to be coded. Any small change in the lookup table he would need to repeat it again.

Now remember although our engineer is curious he also has packed schedule to code the stimulus and write checks. Effort to write functional coverage is certainly discouraging. So he drops writing functional coverage and we all know what happened.

What could have helped this curious engineer?

Certainly it’s a daunting task to satisfy the curiosity of verification engineers. Not if the functional coverage is made more programmable. What if he had access to use the python along with the SystemVerilog functional coverage?

What would he have said? He would have said “easy peasy lemon squeezy”

Store all the valid row addresses in dictionary as keys
Store the mask of valid column address as values
- For example the first row address “0101.011” would contain the mask of “0000_0000_0000_0001”
Use two loops one looping over the row address and another looping over valid column address to create the bins
This also leads to highly programmable approach making any future changes quite easier

Here is how the code would look like:

[xyz-ips snippet=”curiosity-eval-common”]
[xyz-ips snippet=”post-pentium-bug”]

Out of these 1066 cells following 5 cells were left uninitialized in this famous bug. Row and columns of address of uninitialized cells were following. You can spot them in generated functional coverage code.

0101000_1101
0100100_1010
0100000_0111
0011100_0100
0011000_0001

Tool curiosity can create an automatic covergroups for you based on your input in python. Tool can do much more than this. Check out our product page for more details.