Category: ZVM – Quality

Zen & Art of Verification methodology for quality, reliability and productivity

  • Register verification: Why are we still missing the issues?

    The RISC-V open virtual platform simulator carries the quote “Silicon without software is just sand”. Well, it’s true, isn’t it?

    For many design IPs, the programming interface to the software is a set of registers. The registers contained in a design IP can be broadly divided into three categories (a register model sketch follows the list).

    • Information registers: registers that provide static information about the design IP. Examples include vendor ID, revision information, and design capabilities.
    • Control registers: registers that control the behavior or features of the design IP. Examples include enable/disable controls, thresholds, and timeout values.
    • Status registers: registers that report various events. Examples include interrupt status, error status, link operational status, and faults.
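
    To make the categories concrete, here is a minimal UVM register abstraction layer (RAL) sketch, one register per category. The register names, field layouts, and reset values are invented for illustration.

    ```systemverilog
    import uvm_pkg::*;
    `include "uvm_macros.svh"

    // Information register: read-only vendor ID (hypothetical value).
    class vendor_id_reg extends uvm_reg;
      `uvm_object_utils(vendor_id_reg)
      uvm_reg_field vendor_id;
      function new(string name = "vendor_id_reg");
        super.new(name, 32, UVM_NO_COVERAGE);
      endfunction
      virtual function void build();
        vendor_id = uvm_reg_field::type_id::create("vendor_id");
        // configure(parent, size, lsb, access, volatile, reset, has_reset, is_rand, individually_accessible)
        vendor_id.configure(this, 32, 0, "RO", 0, 32'h0000_1AB4, 1, 0, 1);
      endfunction
    endclass

    // Control register: read-write enable bit and timeout threshold.
    class ctrl_reg extends uvm_reg;
      `uvm_object_utils(ctrl_reg)
      rand uvm_reg_field enable;
      rand uvm_reg_field timeout;
      function new(string name = "ctrl_reg");
        super.new(name, 32, UVM_NO_COVERAGE);
      endfunction
      virtual function void build();
        enable  = uvm_reg_field::type_id::create("enable");
        timeout = uvm_reg_field::type_id::create("timeout");
        enable.configure (this,  1, 0, "RW", 0, 1'b0,     1, 1, 1);
        timeout.configure(this, 16, 8, "RW", 0, 16'h0040, 1, 1, 1);
      endfunction
    endclass

    // Status register: write-1-to-clear interrupt status (volatile).
    class irq_status_reg extends uvm_reg;
      `uvm_object_utils(irq_status_reg)
      uvm_reg_field irq;
      function new(string name = "irq_status_reg");
        super.new(name, 32, UVM_NO_COVERAGE);
      endfunction
      virtual function void build();
        irq = uvm_reg_field::type_id::create("irq");
        irq.configure(this, 8, 0, "W1C", 1, 8'h00, 1, 0, 1);
      endfunction
    endclass
    ```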

    Software uses the information registers for the initial discovery of the design IP. It then programs a subset of the control registers in a specific order to make the design IP ready for operation. During operation, the status registers let the software determine whether the design IP is performing as expected or needs attention.
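
    As a sketch of that discover-configure-poll flow, assuming the hypothetical registers above live in a RAL block (here called my_reg_block, with instances VENDOR_ID, CTRL and IRQ_STATUS):

    ```systemverilog
    // Hypothetical bring-up flow driven through the RAL model.
    task automatic bring_up(my_reg_block regmodel, uvm_sequence_base parent);
      uvm_status_e   status;
      uvm_reg_data_t data;

      // 1. Discovery: read the information registers.
      regmodel.VENDOR_ID.read(status, data, .parent(parent));
      if (data[15:0] != 16'h1AB4)
        `uvm_fatal("DISCOVER", "Unexpected vendor ID")

      // 2. Configuration: program the control registers in the required order.
      regmodel.CTRL.timeout.set(16'h0080);
      regmodel.CTRL.enable.set(1'b1);
      regmodel.CTRL.update(status, .parent(parent)); // one bus write of both fields

      // 3. Operation: poll the status register until the event is reported.
      do
        regmodel.IRQ_STATUS.read(status, data, .parent(parent));
      while (data[0] == 1'b0);
    endtask
    ```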

    Register verification

    From a verification point of view, registers need to be looked at from two angles:

    • Micro-architecture point of view
    • Requirement specification point of view

    The micro-architecture point of view focuses on the correctness of the register structure’s implementation. It covers items such as the following (a sketch using UVM’s built-in register sequences follows the list):

    • Are the bit-level properties of each register implemented correctly: read-only, read-write, write-to-clear, read-to-clear, or write-1-to-clear?
    • Is the entire register address space accessible for both reads and writes?
    • If byte enables are used, are they working correctly?
    • Are all possible read and write sequences operational?
    • Does protected register behavior work as expected?
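
    The first two items map directly onto UVM’s built-in register sequences. A sketch, assuming a RAL model handle regmodel that is already connected to a bus agent:

    ```systemverilog
    class reg_microarch_test extends uvm_test;
      `uvm_component_utils(reg_microarch_test)
      my_reg_block regmodel; // hypothetical: assigned from the environment
      function new(string name, uvm_component parent);
        super.new(name, parent);
      endfunction

      virtual task run_phase(uvm_phase phase);
        uvm_reg_bit_bash_seq bash_seq;
        uvm_reg_access_seq   access_seq;
        phase.raise_objection(this);

        // Walks 1s and 0s through every field and checks each field's access
        // policy (RO/RW/W1C...) against what the RTL actually implements.
        bash_seq = uvm_reg_bit_bash_seq::type_id::create("bash_seq");
        bash_seq.model = regmodel;
        bash_seq.start(null);

        // Reads and writes every mapped register, checking that the whole
        // register address space is accessible.
        access_seq = uvm_reg_access_seq::type_id::create("access_seq");
        access_seq.model = regmodel;
        access_seq.start(null);

        phase.drop_objection(this);
      endtask
    endclass
    ```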

    The requirement specification point of view focuses on the correctness of the functionality provided by the registers. It covers items such as the following (a reset-check sketch follows the list):

    • Does the power-on reset value match the value defined by the specification?
    • For all the control registers, do the programmed values have the desired effect?
    • When the events corresponding to the different status registers take place, are they reflected correctly?
    • Are status register values that must survive reset cycling actually retained?
    • Are registers that must be restored to proper values through power cycling restored correctly?
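
    The reset-value item is again covered by a built-in sequence, and individual registers can be cross-checked against the model after reset or power cycling. A sketch with the same assumed regmodel:

    ```systemverilog
    task automatic check_reset_values(my_reg_block regmodel);
      uvm_status_e status;
      uvm_reg_hw_reset_seq reset_seq;

      // Built-in sequence: reads every register right after reset and compares
      // the value against the reset value captured in the register model.
      reset_seq = uvm_reg_hw_reset_seq::type_id::create("reset_seq");
      reset_seq.model = regmodel;
      reset_seq.start(null);

      // Spot-check after reset cycling: mirror() reads the DUT register and,
      // with UVM_CHECK, compares it against the model's expected value.
      regmodel.IRQ_STATUS.mirror(status, UVM_CHECK);
    endtask
    ```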

    Micro-architecture implementation correctness is typically verified by a set of generated tests. A single automation flow often covers RTL register generation, UVM register abstraction layer generation, and generation of the associated tests. These flows also tend to generate register documentation, which can serve as a programming guide for both the verification and software engineering teams.

    Functional correctness of the registers is the more challenging area. The information category is typically covered by initial-value checking tests. Control and status register functional correctness is spread across various tests. Some tests explicitly verify register functionality, but many of the tests that cover it are not really verifying only that: they focus on higher-level operational correctness and use the control and status registers along the way, so they verify the registers only indirectly.

    In spite of all this verification effort, register blocks still end up with issues that are found late in the verification cycle or during silicon bring-up.

    Why are we still missing the issues?

    Register functionality evolves throughout project execution. Typical changes take the form of additions, relocations, extensions, updates to the definitions of existing registers, and compaction of the register space by removing registers.

    The automation associated with register generation eases the process of making changes. At the same time, layers of automation can make review difficult, or give a false sense of security that all changes are verified by the automatically generated tests. The key point is that automation is only as good as the high-level register specification provided as input. If there are mistakes in the input, automation can mask them.

    Since register verification is spread across automated tests, register-specific tests, and other feature tests, it is difficult to pinpoint what gets verified and to what extent.

    What can we do about it?

    First is the traditional form of review. This can help catch many of the issues, but considering the total number of registers and their dynamic nature, it is difficult to carry out this review thoroughly and repeatedly.

    We need to aid the review process and open up the verification to questioning by designers and architects. This can be done effectively when high-level data about the verification performed on the registers is available.

    We have built a register analytics app that can provide various insights into your register verification in simulation.

    One of the app’s capabilities helped catch issues with dynamically reprogrammed registers. A subset of the control registers can be programmed dynamically, multiple times, during operation. As the register specification kept changing, transition coverage was never added for a specific register that was expected to be dynamically reprogrammed.

    Our register analytics app provided data on which registers were dynamically reprogrammed, how many times they were reprogrammed, and which unique value transitions were seen, delivered as a spreadsheet. One could quickly filter the columns for registers that were never dynamically reprogrammed and ask why. This flagged dynamically reprogrammable registers that the tests never reprogrammed; once they were dynamically reprogrammed, some of them even led to the discovery of additional issues.
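
    A sketch of the kind of transition coverage that was missing, sampled through a register-model callback. The 2-bit mode field and all names here are hypothetical:

    ```systemverilog
    // Records every dynamic reprogramming transition of a 2-bit mode field.
    // Registers with zero transition hits are exactly the ones to question.
    covergroup mode_reprogram_cg with function sample(bit [1:0] mode);
      coverpoint mode {
        bins values[]    = {[0:3]};
        bins reprogram[] = ([0:3] => [0:3]); // all value-to-value transitions
      }
    endgroup

    // Callback that samples the covergroup on every write to the register.
    class mode_cov_cb extends uvm_reg_cbs;
      mode_reprogram_cg cg = new();
      virtual task post_write(uvm_reg_item rw);
        cg.sample(rw.value[0][1:0]);
      endtask
    endclass

    // Hook it onto the control register of interest:
    //   mode_cov_cb cb = new();
    //   uvm_reg_cb::add(regmodel.CTRL, cb);
    ```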

    We have many more micro-architecture stimulus coverage analytics apps that can quickly provide useful insights about your stimulus. The data is available both per test and aggregated across the complete regression. Information from third-party tools additionally provides a level of redundancy to the verification effort, catching hidden issues in the automation already in use.

    If you are busy, we offer services to set up our analytics flow, covering both functional and statistical coverage, run your test suites, and share the insights that can help you catch critical bugs and improve your verification quality.

  • Verification quality improvement – Test bench

    Test bench quality overhaul involves three primary activities: cleaning, trimming, and refactoring.

    The first step of the quality overhaul is cleaning and trimming. Verification solutions with quality problems have typically developed weeds: redundant and useless growth in the code. Weeds are a distraction. In order to focus on the right problems, the first step is to trim and clean them out.

    Trim the verification plans and the regression lists of test variants and seeds per test. Trim the tests. Trim the test benches. Be ruthless: clean, trim, and cut it down, across all aspects of the functional verification.

    Cleaning test bench

    Cleaning the test bench consists of three steps: cleaning compile-time and run-time errors, cleaning redundant files, and cleaning dead code. More detail is given in the post on cleaning the test bench.

    Refactoring test bench

    According to Wikipedia, refactoring is the process of restructuring existing computer code, changing the factoring (decomposition), without changing its external behavior.
    (more…)

  • Testbench quality improvement: Refactoring test bench

    The first step in refactoring the test bench code is to identify the code that requires refactoring.

    Poor test bench code, typically bloated, unstable, limited, or only partially implemented, should be identified. This identification can be based on areas flagged during debug, previous bug history, and feedback from verification engineers.

    Let’s look at some of the test bench components that can have a major impact on quality, and how to address them.

    Poor quality bus functional model (BFM)

    Bus functional models are the pillars of the test bench. The quality of the BFMs used has a significant impact on overall quality. Poor quality BFMs can hide real bugs, raise false failures, and destabilize the regression.
    (more…)

  • Testbench quality improvement: Refactoring tests

    The first step in refactoring tests is to review them to identify possible reductions. Reduce the number of tests to what is needed to provide the necessary coverage. This will not increase verification coverage immediately, but it is an opportunity to set things right for future maintenance. Spot any opportunity to group related tests and reduce code redundancy.

    Sometimes, to add a single scenario, multiple related tests have to be updated individually because they are kept separate. If two tests share more than 60% of their code, they should be merged (see the sketch below). Tests that are no longer valid might still be running; eliminate them. Tests might be running for configurations and feature combinations that are no longer valid; eliminate those as well.
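
    A hypothetical sketch of such a merge: the shared flow moves into one base test, and the variation between the former near-duplicate tests is pushed into a small virtual method.

    ```systemverilog
    import uvm_pkg::*;
    `include "uvm_macros.svh"

    // Common flow previously duplicated across two tests.
    class base_timeout_test extends uvm_test;
      `uvm_component_utils(base_timeout_test)
      function new(string name, uvm_component parent);
        super.new(name, parent);
      endfunction

      // The single point of variation between the original tests.
      virtual function int unsigned timeout_cycles();
        return 100;
      endfunction

      virtual task run_phase(uvm_phase phase);
        phase.raise_objection(this);
        // ...shared stimulus: configure the DUT with timeout_cycles(),
        // start traffic, wait for completion, check status...
        phase.drop_objection(this);
      endtask
    endclass

    // The second test reduces to just the overridden variation.
    class long_timeout_test extends base_timeout_test;
      `uvm_component_utils(long_timeout_test)
      function new(string name, uvm_component parent);
        super.new(name, parent);
      endfunction
      virtual function int unsigned timeout_cycles();
        return 10000;
      endfunction
    endclass
    ```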

    Additionally, refactoring reviews should spot tests that are not meeting the intent called out for them in the test plan. For large verification projects this is a big challenge. Selective reviews should be conducted around areas where bugs are being discovered, tests with frequent false failures, tests that have been passing for a very long time, and tests that designers and verification engineers complain about. These are starting points; the review process needs to come up with smart criteria to maximize the return on review.

    Maintaining regression history can be very helpful in identifying the tests to be reviewed. The bulk of the cleaning can be completed up front, but periodic audits and reviews have to be conducted to keep things in that state.

  • Testbench quality improvement: Cleaning test bench

    The first step in the test bench quality overhaul is cleaning the test bench code.

    Cleaning compile-time and run-time warnings

    Review the compile warnings in the compile log file. This should be done periodically; some compile warnings turn into bugs later.

    Review the run-time warnings. Constraints might be failing without being caught because their status is never checked, checks downgraded from error to warning might never have been re-enabled, and the simulator itself may be issuing warnings.

    In fact, it is best to add a check on compile-time and run-time warnings to the check-in regression to keep them under control (a sketch follows).
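
    One way to enforce the run-time side in a UVM test bench is a small component that turns any accumulated UVM warnings into a failure; compile-log warnings still need a separate log scraper in the regression scripts. A sketch:

    ```systemverilog
    import uvm_pkg::*;
    `include "uvm_macros.svh"

    // Instantiate in the check-in regression's base test: an otherwise
    // ignored pile of run-time warnings becomes a hard failure.
    class warning_gate extends uvm_component;
      `uvm_component_utils(warning_gate)
      function new(string name, uvm_component parent);
        super.new(name, parent);
      endfunction

      virtual function void report_phase(uvm_phase phase);
        uvm_report_server svr = uvm_report_server::get_server();
        int unsigned n = svr.get_severity_count(UVM_WARNING);
        if (n > 0)
          `uvm_error("WARN_GATE", $sformatf(
              "%0d UVM warning(s) seen; the check-in regression treats warnings as failures", n))
      endfunction
    endclass
    ```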

    Cleaning redundant files

    Temporary files created during development that are no longer useful can accumulate over time.
    (more…)

  • Verification quality improvement – Regression & Debug

    The regression phase is the climax of functional verification. It is the phase during which the real value of functional verification is realized, so it has to be highly efficient to meet the planned verification objectives. By nature, this phase is very hard to keep under a strict schedule. The only way to keep it under control is a highly efficient flow for managing regression and debugging.

    Improve the efficiency of the regression and debug process so that it can support the reactive firefighting needed to meet the deadline.

    Regression productivity

    Full regression is a process that requires religious commitment to maintain quality. It contains elements that require repetition.

    Regressions are repetitive tasks. The efficiency of the overall regression process significantly affects the productivity of the verification effort and the team. Frequent repetition is only effective when it is automated.
    (more…)

  • Functional verification – Quality improvement

    Design quality is highly cared for, but it is important to realize that design quality cannot be achieved without functional verification quality. When functional verification is done from scratch, the principles of ZVM, the verification methodology for quality, can be applied to achieve quality. What if it is already late? There is still a chance to do a functional verification quality overhaul to restore the quality.

    Before we go deeper into the functional verification quality overhaul, it is important to understand what functional verification quality is and what leads to poor functional verification quality.

    A functional verification quality overhaul requires a good understanding of the bigger picture of functional verification. Functional verification execution has three phases: the planning phase, the development phase, and the regression phase. Poor functional verification quality is the net result of a poor planning phase, a poor development phase, and a poor regression phase. Based on the symptoms of poor quality, the right proportions of the following guidelines have to be adopted to restore quality.
    (more…)

  • Verification quality improvement – Legacy test benches

    Legacy test benches are the ones based on hardware description languages (HDLs) like VHDL or Verilog, with or without high-level languages such as C or C++. These test benches are not based on the coverage-driven constrained random approach, although they may contain randomization in rudimentary form.

    They were typically designed for the early, first generation of an older design. They would have provided the necessary coverage at that point, but as newer revisions show up with increased complexity, a test bench based on legacy technologies may not be able to do a good job.

    In fact, beyond a certain point of design complexity, the effort to verify a feature to the same coverage in a legacy test bench can become significantly higher than in a test bench driven by a modern HVL and verification methodology. This is because of the lack of built-in support for the constructs that aid constrained random verification. Finding engineers to maintain and update legacy test benches can also become challenging.
    (more…)

  • Verification quality improvement: Verification plan

    The root of most functional verification quality problems lies in the quality of the verification plan and its management. So a key component of a verification quality overhaul is a quality overhaul of the verification plans.

    The verification plan is a seed. A bad seed grows into a tree with bitter fruits. Poor verification quality is the result of poor quality in the planning phase.

    The verification plan consists of three plans: the test plan, the checks plan, and the coverage plan. In a coverage-driven constrained random approach, the test plan and checks plan tend to get ignored. This is dangerous. The test plan and checks plan are equally important, in fact more important than the functional coverage plan, because they state what needs to be achieved and how it needs to be achieved.
    (more…)

  • Verification plan review

    Verification plan review is still a manual process. The verification plan is made up of three plans: the test plan, the checks plan, and the coverage plan. As of today, I am not aware of any tools that do automatic verification plan review based on machine learning.

    The first challenge in verification plan review is to have a verification plan at all. The second challenge is to make it a traceable verification plan so that it can be trusted.

    What homework is required from the verification plan reviewer?

    Building a verification plan requires specification expertise and verification expertise. The verification plan writer gets time to do a detailed scan of the requirement specification while writing the plan.

    The challenge is that the verification plan reviewer will not get the same amount of time. The reviewer will still have to do the homework to make the review effective.
    (more…)