Verification project management

Verification project management gave an overview of the six essential parameters and three phases of a verification project. This article focuses on applying the six essential parameters to the regression phase, which follows the planning and development phases.

This is the final phase and the climax of verification. It’s a highly intense phase. The duration of the regression phase cannot be fully controlled, and it’s extremely difficult to plan for it and keep it schedule bound. As a general rule of thumb, it takes about 1x-3x the time taken by the development phase, depending on the quality of the planning and development phases. Regression is a highly reactive phase. Quick response with dynamic adaptation is the only way to closure.

Key objectives of this phase:

Minimize passing tests turning into failures
Maximize the bugs discovered
Get all planned tests executed and passing
Get the code coverage and functional coverage goals met

Regression phase: Clarity

Tests to be executed and the variants of tests to be exercised
The functional coverage and code coverage goals to be met
Set of new tests to be written
Targets for clearing the backlog of failures from previous milestones
Pool of engineers available for the debug and their areas of expertise
Process for running regressions, triaging regression failures, assigning failures and reporting debug status
Process for filing bugs and guidelines for interacting with the designers to achieve closure
Process for debug using the log files, waveforms, interactive debuggers and any additional tools available
Process for reproducing failing tests, especially seeded constrained random tests
Periodic clarity on regression and bug status with priorities
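
The repeatability point for seeded constrained random tests can be illustrated with a minimal Python sketch. It is not a simulator flow. `RunRecord`, `generate_stimulus`, `run_test` and `rerun` are hypothetical names, and the pass/fail rule is a toy stand-in for a real test. The idea it shows is that if every run logs its test name and seed, and all randomness is derived from that seed, a failure can be replayed exactly.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunRecord:
    """What gets logged for every regression run (hypothetical schema)."""
    test: str
    seed: int
    passed: bool


def generate_stimulus(seed: int, length: int = 8) -> List[int]:
    """Stand-in for constrained random stimulus, fully determined by the seed."""
    rng = random.Random(seed)  # private generator, so no hidden global state
    return [rng.randrange(256) for _ in range(length)]


def run_test(test: str, seed: int) -> RunRecord:
    """Toy pass/fail rule: the run fails if any stimulus byte exceeds 250."""
    stimulus = generate_stimulus(seed)
    return RunRecord(test, seed, passed=all(b <= 250 for b in stimulus))


def rerun(record: RunRecord) -> RunRecord:
    """Reproduce a logged run using only the test name and seed it recorded."""
    return run_test(record.test, record.seed)


if __name__ == "__main__":
    nightly = [run_test("burst_write", seed) for seed in range(200)]
    for failure in (r for r in nightly if not r.passed):
        assert rerun(failure) == failure  # same seed, same outcome
```

In a real environment the seed would be passed to the simulator, and the same code revision and tool version would also have to be recorded for the replay to be exact.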

Regression phase: Metrics

Test status
- Total tests
- Development
  - Total tests to be written
  - Total tests under development
Regression status
- Total tests enabled in regression
- Total passing tests
- Total failing tests
  - Unique failures among failures which need debugging
  - Breakup of open debugs per engineer
- These may have to be generated by different criteria
Bug status
- Total bugs filed
- Total bugs open
  - RTL Bugs open (can be grouped by priority)
  - Test bench bugs open (can be grouped by priority)
- Number of test bench bugs per RTL bug (test bench quality indicator)
Coverage status
- Code coverage status
- Functional coverage status
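
Several of these metrics can be derived automatically from regression results and the bug database. The Python sketch below is one possible way to do it, under stated assumptions: the `RunResult` and `Bug` records and their fields are hypothetical, and failures are treated as "unique" when they share the same error signature (for example, the first error line in the log).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RunResult:
    name: str
    passed: bool
    signature: Optional[str] = None  # first error line of the log (assumed field)
    owner: Optional[str] = None      # engineer assigned to the debug


@dataclass
class Bug:
    kind: str      # "rtl" or "tb"
    is_open: bool


def regression_status(results: List[RunResult]) -> Dict[str, object]:
    failing = [r for r in results if not r.passed]
    # Failures sharing a signature count as one debug item, owned by the
    # first owner seen; failures with no signature share a single bucket.
    owner_by_signature: Dict[Optional[str], str] = {}
    for r in failing:
        owner_by_signature.setdefault(r.signature, r.owner or "unassigned")
    return {
        "enabled": len(results),
        "passing": len(results) - len(failing),
        "failing": len(failing),
        "unique_failures": len(owner_by_signature),
        "open_debugs_per_engineer": dict(Counter(owner_by_signature.values())),
    }


def bug_status(bugs: List[Bug]) -> Dict[str, object]:
    rtl = [b for b in bugs if b.kind == "rtl"]
    tb = [b for b in bugs if b.kind == "tb"]
    return {
        "filed": len(bugs),
        "open": sum(b.is_open for b in bugs),
        "rtl_open": sum(b.is_open for b in rtl),
        "tb_open": sum(b.is_open for b in tb),
        # Test bench quality indicator; undefined until an RTL bug exists.
        "tb_per_rtl": len(tb) / len(rtl) if rtl else None,
    }
```

Grouping by signature is only one possible criterion. The same results could equally be grouped by block, by test family or by milestone.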

Regression phase: People

An enthusiastic engineer should hold the front on regressions. This engineer should have a good understanding of the Linux operating system, LSF and the flows of the regression process, and should also be an automation expert who ensures a productive experience for the full regression management
It may be tempting to push new engineers into regression debug. This should be avoided. Instead, pair them with experienced engineers to help them ramp up. If possible, it’s best to start them on writing tests so they become familiar with the verification environment; they can then participate in the regression debug activity
Developers from both the design and test bench teams should be part of the debug pool
Extroverted, analytical, go-getter engineers who can interact with engineers across design and verification to close debugs are best suited for this activity. It involves bringing many hands into the activity to get it to closure
The verification lead should set the appropriate debug strategy before the failures are rolled out to the team. When the number of failures is large, a decision needs to be made between a comparative debug approach and the regular debug flow. When there are more failures than engineers available in the debug pool, appropriate prioritization must be done

Regression phase: Tracking

This is one of the most tracking-intensive phases of verification. Good tracking and coordination can give a good boost to the overall activity, as things change very quickly
Depending on the overall project phase, hourly to daily tracking may sometimes be required. This frequency of updates provides clarity to the team involved in debugging and also helps avoid duplicate debug efforts
Since the frequency of tracking can get very high, automation for generating the desired metrics and for tracking failure debug is a must. This can significantly boost the efficiency of the regression triage cycle
A subset of critical debugs should be explicitly identified. Anything that needs quicker closure, due to a release or any such reason, should be put on a different track and tracked separately
Engineers can expedite, provided the urgency is clearly reasoned out and differentiated. Don’t fall into the trap of calling everything ASAP. It doesn’t work and can backfire. Please don’t
Bug filing may look like additional overhead, but it’s an important step. Engineers may show a certain level of hesitation in following it. The verification lead and manager should emphasize it. It provides history. Historic data about the bugs is very helpful in making final tape-out decisions
The regression cycle, Failure -> Disable test in regression -> Debug -> File bug -> Fix -> Validation -> Enable back in regression, needs close follow-up for closure. Without follow-up, the link between steps may break and bugs may become zombies
Regression debugs generate a lot of information. A large volume of information can hide small issues. Care should be taken that any additional related tasks emerging from the bugs are added to the task tracking system immediately. If needed, the tasks can be tagged with the bug filed to keep the context alive
Regression triaging and debug is an activity that can turn messy very quickly. Frequent quick meetings or phone calls may be more useful than email or bug report based interactions
Regression run duration, compute and memory requirement statistics should also be tracked periodically. This helps in release planning as well as in planning future compute infrastructure upgrades
Bug trends, test passing status trends and coverage trends have to be closely tracked and monitored, especially nearing closure, as these help in planning the sign-off of the verification activity
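
The regression cycle described above lends itself to simple automated follow-up. The Python sketch below is a minimal illustration and not a real tracking tool. The `Step` names mirror the cycle in the text, while `TrackedFailure`, `find_zombies` and the three-day idle threshold are assumptions made for the example. It flags any failure that is stuck between steps for too long, which is how a bug becomes a zombie.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum
from typing import List


class Step(IntEnum):
    """Steps of the regression cycle, in the order they must happen."""
    FAILURE = 0
    DISABLED = 1
    DEBUG = 2
    BUG_FILED = 3
    FIX = 4
    VALIDATION = 5
    ENABLED = 6


@dataclass
class TrackedFailure:
    test: str
    step: Step
    last_update: datetime

    def advance(self, now: datetime) -> None:
        """Move to the next step of the cycle and refresh the timestamp."""
        if self.step == Step.ENABLED:
            raise ValueError(f"{self.test} is already back in regression")
        self.step = Step(self.step + 1)
        self.last_update = now


def find_zombies(
    items: List[TrackedFailure],
    now: datetime,
    max_idle: timedelta = timedelta(days=3),  # assumed threshold
) -> List[TrackedFailure]:
    """Failures not yet re-enabled whose last update is older than max_idle."""
    return [
        item for item in items
        if item.step != Step.ENABLED and now - item.last_update > max_idle
    ]
```

A report built from `find_zombies` could be part of the hourly or daily status update, so that stalled items are followed up before the context is lost.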

Regression phase: Review

The full regression test list quickly starts bloating, and it can sometimes end up with a lot of redundant tests added for temporary purposes. This regression test list should be reviewed periodically and redundant tests pruned. Dead tests should also be removed from the list
The check-in regression should be periodically reviewed and refreshed with the new tests enabled in the full regression to keep it relevant
Code coverage and functional coverage reviews should be conducted periodically. Required actions such as increasing the seeds of certain constrained random tests, constraint tuning, writing more tests and test bench upgrades should be identified and fed back into the task tracking system
Areas where the most bugs are being discovered should be subjected to additional architecture and code reviews. If required, do not shy away from code refactoring. It pays back in the long run
Very long running tests that are a bottleneck in regression turnaround time should be identified. They should be broken into smaller tests wherever possible to improve regression cycle efficiency
Even with all these measures, failure rates during critical milestones can go out of control. To recover from such conditions, a code freeze on development may have to be called. The verification lead and manager should look out for such situations and make those hard calls to get things back under control
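
To make the point about long running tests concrete, here is a small Python sketch. The function names, the "5x the median" threshold and the sample durations are assumptions for illustration. It flags outlier tests and computes a simple lower bound on regression turnaround: with a fixed number of parallel job slots, a regression cannot finish faster than its longest test, nor faster than the total runtime divided by the number of slots.

```python
import statistics
from typing import Dict, List


def long_runners(durations: Dict[str, float], factor: float = 5.0) -> List[str]:
    """Tests whose runtime exceeds `factor` times the median, longest first."""
    median = statistics.median(durations.values())
    flagged = [name for name, d in durations.items() if d > factor * median]
    return sorted(flagged, key=lambda name: durations[name], reverse=True)


def turnaround_lower_bound(durations: Dict[str, float], slots: int) -> float:
    """Best-case wall-clock time for the regression on `slots` parallel jobs."""
    longest = max(durations.values())
    perfectly_packed = sum(durations.values()) / slots
    return max(longest, perfectly_packed)


if __name__ == "__main__":
    minutes = {"a": 10, "b": 12, "c": 11, "soak": 240}
    print(long_runners(minutes))               # ['soak']
    print(turnaround_lower_bound(minutes, 4))  # 240

    # Same work after splitting the soak test into four 60-minute tests.
    split = {"a": 10, "b": 12, "c": 11,
             "soak_0": 60, "soak_1": 60, "soak_2": 60, "soak_3": 60}
    print(turnaround_lower_bound(split, 4))    # 68.25
```

In this made-up example the total compute is unchanged, but the best-case turnaround drops from 240 to 68.25 minutes because no single test dominates the run.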

Regression phase: Closure

Regression is one of the most challenging activities to close. It can be compared to building a bridge over a river flowing in full force. The bridge will often be washed away. It’s a highly iterative process. It takes grind and patience to get a handle on closure
True closure on regression is not possible unless development activity tapers down and closes for most practical purposes. Step by step, the various threads of activity that can lead to code changes should be closed down one by one. A certain level of calculated risk is called for here to achieve closure
At an appropriate point, based on project status, the decision to freeze new feature development in the design should be made. From then on, only bug fixes should be allowed in the design code
After the design is frozen for new changes, the test bench features pending implementation should be carefully reviewed. Only the ones relevant for project closure should be allowed
An attempt should be made to minimize new development. Every development request should go through an approval process with the verification manager
Open failing tests and coverage not hit have to be reviewed for exclusion. Based on the return on investment (ROI), decisions may have to be made to exclude them from the final sign-off list. These should be carefully reviewed, with the reasons for exclusion documented
Not all open bugs may be closed in time. All open bugs, whether design or test bench, should be carefully reviewed to identify the ones that really need to be fixed. Bugs that are not critical should be tagged appropriately to be taken up in the next version
After all these threads leading to code changes are slowly shut down, the code changes will start tapering down. Verification focus should intensify on closing the remaining failures and on coverage convergence
Even when code changes have practically come down to zero, corner case failures may show up in seeded constrained random tests in regression. These should be carefully reviewed and their criticality evaluated. If they are not critical, code changes in the form of fixes should be minimized
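
The exclusion review for open failing tests and unhit coverage can be backed by a simple check. The Python sketch below is an illustration only; `signoff_blockers`, the item names and the idea of storing exclusion reasons in a dictionary are all assumptions. It lists every open item that has no documented reason for exclusion, so nothing is dropped from the sign-off list silently.

```python
from typing import Dict, Iterable, List


def signoff_blockers(
    failing_tests: Iterable[str],
    coverage_holes: Iterable[str],
    exclusions: Dict[str, str],
) -> List[str]:
    """Return open items that have no documented exclusion reason."""
    open_items = set(failing_tests) | set(coverage_holes)
    return sorted(
        item for item in open_items
        if not exclusions.get(item, "").strip()  # a blank reason does not count
    )


if __name__ == "__main__":
    blockers = signoff_blockers(
        failing_tests=["t_err_inject", "t_lowpower"],
        coverage_holes=["cg_fifo.full_x_reset"],
        exclusions={
            "t_lowpower": "Feature deferred to next version",
            "cg_fifo.full_x_reset": "  ",  # reason left blank, still a blocker
        },
    )
    print(blockers)  # ['cg_fifo.full_x_reset', 't_err_inject']
```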