Achilles Test Systems Founders present at DAC 2009

July 7th, 2009

Join Achilles Test Systems (www.achillestest.com) at the Design Automation Conference (DAC), July 26-31 in San Francisco. DAC is the premier event for bringing together industry leaders in electronic design, and the conference offers an interesting mix of sessions for electronic design professionals. Achilles Test Systems is presenting “Managing Information Silos: Reducing Project Risk through Multi-Metric Tracking.” Come talk to us about using multi-metric project health as a way to manage the trade-off between development time, cost, and quality. We will be discussing why traditional methods, such as measuring test coverage or design stability, are falling short because of fundamental changes occurring in electronic design.

Theresa Shafer

Tracking Customer Performance Data and Metrics

June 10th, 2009

Companies that sell complex and expensive equipment usually do so with the promise of better performance than their competition.  To deliver on these promises, companies usually gather performance metrics from their customers and track those metrics against their performance goals.
Done properly, tracking these metrics lets you watch for trends and spot problems. In particular, it lets you:

  • Track trends in the utilization of system resources. This information can be used to plan and tailor changes and upgrades to the configuration of the system.
  • Identify stress points on specific subsystems and elements of the system.
  • Balance the use of system resources during peak and normal usage.
  • Use historical data to accurately predict the effect of specific changes to the system.
  • Identify specific actions that may be causing problems with other activity on the system.
  • Efficiently manage utilization levels and trends for available resources.

The use of simple pass/fail criteria should be avoided.  These levels can be arbitrarily set, for example to twice a worst-case scenario so as to provide a margin of safety.  As time passes, performance levels can gradually slip until they hit that level.  Since the level was arbitrarily set, it is not clear when the actual problem occurred that allowed the system to reach this unacceptable level.  Therefore, when problems occur, it is essential to have performance data from before and after the incident to narrow down the cause of the performance problem and to find an appropriate resolution.
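As a minimal sketch of the difference, the Python below compares a fixed limit with a check against a rolling baseline of earlier samples. The response times, the window of five samples, the 20% tolerance, and the limit of 200 ms are all illustrative assumptions, not recommended values.

```python
from statistics import mean

def first_drift(samples, window=5, tolerance=0.20):
    """Return the index of the first sample that exceeds the mean of the
    preceding `window` samples by more than `tolerance`, or None."""
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if samples[i] > baseline * (1 + tolerance):
            return i
    return None

# Hypothetical response times in ms, with an assumed worst case of 100 ms
# and a fixed pass/fail limit set to twice that value.
response_ms = [100, 102, 99, 101, 100, 103, 140, 150, 165, 180, 210]
FIXED_LIMIT_MS = 200

drift_at = first_drift(response_ms)
limit_at = next(
    (i for i, value in enumerate(response_ms) if value > FIXED_LIMIT_MS), None
)
print(f"baseline drift first seen at sample {drift_at}")
print(f"fixed limit first exceeded at sample {limit_at}")
```

With this made-up data, the baseline check flags the change several samples before the fixed limit is crossed, which narrows the window in which to look for the cause.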

However, customer databases can be inflexible and hard to use.  Data input might be done manually.  Reports can be time-consuming and tedious to generate.  As a result, customer performance data can take hours to research.  Instead, data input needs to be done automatically and routinely.  Historic tracking of performance data cannot be done if the data is spotty or insufficient.  Organization of collected data needs to be flexible and dynamic.  Archives that are too rigid quickly become tedious to use.

Report generation also needs to be automated.  Real-time dashboards should be available to present data in a quick and customized manner.  Reports for important customers might look different than reports for non-critical customers.   Researching customer data should take a matter of minutes, not hours.

Greg Goss

Trends in Project Health

May 29th, 2009

Most organizations use project health as a way to manage the trade-off between development time, cost, and quality. Various measurements are used to assess the performance or effectiveness of a project team. Without measurements, such assessments would be subjective guesses. However, traditional methods such as measuring test coverage or design stability are falling short because of fundamental changes occurring in electronic design. Here are some of the top trends in design and the impact they have on tracking project health.

1. It’s an intelligent mesh, not just a flow.

There is no longer a sequential design flow in electronic design: there is architectural, layout, and verification exploration. Layout and verification often start before architecture and implementation details are finished. IP blocks further parallelize, loop, and interconnect the project flow. With mesh development, the next extremely difficult question is resource allocation. How do you apply people, machines, licenses, and testing runs? What stage needs more resources? What will be the resource impact on development time, cost, and quality?

2. Global teams

Expect more “follow the sun” design work. However, for distributed engineering groups, it can be difficult to manage and track progress. Global teams must share data, status, and issues. Project teams can no longer rely on self-reporting to get a picture of where things are. Teams must find ways to automate sharing status, escalating issues, and brainstorming on them.

3. Multiple dimensions of analysis

Engineering is managing constraints and trade-offs. A workable plan must balance cost, performance, schedule and quality to develop a useful design. Whole teams are dedicated to power, performance, fault, routing and timing. Each group needs to communicate and escalate optimization and trade-off decisions.

4. Runtime jobs growing faster than transistor count

Sure, transistor counts are growing, but the number of analysis jobs is exploding. Teams routinely run many hundreds of builds and automated tests. Expect to see this trend continue. How are you going to manage it? Are you getting the best use of software licenses and CPU resources? Are you spending more on licenses than hardware?

5. Summarizing and analyzing more testing

As system size grows, manual testing typically cannot keep up, so everyone is turning to test-data generation. But test generation, regardless of the framework used, requires manual sorting and analysis of the test results. Developers must wade through an ocean of data looking for warnings and errors. If you don’t automate the data collection and analysis, you will spend all your time grepping log files. If you add more CPU resources and run more jobs, can you still analyze the output by hand? Past a certain point, you must automate the analysis of the results as well, or further test generation is pointless.
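One simple way to replace grepping by hand is to bucket messages by a normalized signature so that thousands of lines collapse into a short ranked list. The sketch below assumes a hypothetical log format in which lines begin with "ERROR: " or "WARNING: "; the messages themselves are invented for illustration.

```python
import re
from collections import Counter

def signature(message):
    """Normalize a message so that runs differing only in numbers
    (times, addresses, counts) fall into the same bucket."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    return re.sub(r"\d+", "<n>", message)

def summarize(log_lines):
    """Count ERROR and WARNING lines by normalized signature."""
    counts = Counter()
    for line in log_lines:
        match = re.match(r"(ERROR|WARNING): (.*)", line)
        if match:
            counts[(match.group(1), signature(match.group(2)))] += 1
    return counts

# Hypothetical log lines gathered from several simulation runs.
log_lines = [
    "INFO: starting test",
    "ERROR: FIFO overflow at time 1200 ns",
    "ERROR: FIFO overflow at time 8450 ns",
    "WARNING: unexpected value 0x1f on bus 3",
    "ERROR: timeout waiting for ack after 500 cycles",
    "WARNING: unexpected value 0xa0 on bus 7",
]
summary = summarize(log_lines)
for (level, sig), count in summary.most_common():
    print(f"{count:3d}  {level}: {sig}")
```

The normalization rules here are deliberately crude. A real flow would likely need tool-specific patterns, but the principle of counting signatures instead of reading lines stays the same.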

Why Achilles Test Systems?

Achilles Test Systems’ DV Notebook is a flexible repository for collecting, analyzing and summarizing a large amount of test data from multiple sources. We excel in multi-vendor environments. The DV Notebook’s visualization enables users to sift through test results, categorize failures, and highlight areas of concern. It draws attention to the key points while retaining the relevant context and the lower level details.

The DV Notebook can parse log files and extract key pieces of information such as test name, test description, time of execution, duration of test, key parameters, random seed values, error messages, and pass/fail status. Built-in templates track verification testing and manage regression suites out of the box. They offer a starting point that can be customized to address your exact situation, extracting the precise data you need and working with the tools in your flow. Collecting key information and storing it in an organized and easily accessible dashboard saves time, avoids errors, and facilitates re-use.
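To make the kind of extraction described above concrete, here is a minimal Python sketch that pulls a few of those fields out of a single log. It is not the DV Notebook’s parser: the log format, the field labels, and the `parse_log` function are all hypothetical, chosen only to illustrate the idea.

```python
import re

# Hypothetical log format: one "LABEL: value" field per line.
SAMPLE_LOG = """
TEST: dma_burst_random
SEED: 48213
INFO: loading configuration
DURATION: 312.5 s
ERROR: data mismatch at address 0x40
STATUS: FAIL
"""

FIELD_PATTERN = re.compile(r"^(TEST|SEED|DURATION|ERROR|STATUS):\s*(.*)$")

def parse_log(text):
    """Extract key fields from one test log in the format shown above."""
    record = {"errors": []}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line.strip())
        if not match:
            continue  # skip lines that carry no tracked field
        key, value = match.groups()
        if key == "ERROR":
            record["errors"].append(value)
        elif key == "SEED":
            record["seed"] = int(value)
        elif key == "DURATION":
            record["duration_s"] = float(value.split()[0])
        elif key == "TEST":
            record["test"] = value
        else:  # STATUS
            record["passed"] = value.upper() == "PASS"
    return record

record = parse_log(SAMPLE_LOG)
print(record)
```

Keeping the random seed alongside the pass/fail status is what makes a failure reproducible later, which is why it is extracted as a first-class field here.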

About Our Organization
Achilles Test Systems products and services enable development teams to correlate results across multiple project data files, cutting debug time in a collaborative development process. Project status is always available and up to date with automatically generated tables and charts to highlight trends from historical data. Every member of a global team just needs a web browser to contribute insights and to access an integrated visualization of seed tracking, test data, and source-code revision history.

Theresa Shafer

Employing Boyd’s OODA loop in design and verification

March 2nd, 2009

Created by military strategist and USAF Colonel John Boyd, OODA stands for Observe-Orient-Decide-Act.  Quoting from Wikipedia,

According to Boyd, decision-making occurs in a recurring cycle of observe-orient-decide-act.  An entity (whether an individual or an organization) that can process this cycle quickly, observing and reacting to unfolding events [...], can gain the advantage.

The speed and agility of the decision-making loop is key.  The greater the body of information that can be brought to bear quickly to orient one’s decisions, the more productive each action will be.

Different team members need different information to orient themselves.

  • Verification engineers need to observe:
    • which tests are passing and failing
    • who is doing checkin operations
    • what is the historical behavior of each test
    • which tests are most likely to detect a bug at a given point in time
  • Design engineers need similar metrics:
    • which RTL blocks are most likely to see timing failures
    • which lines of code have changed the most over time
    • which lint warnings represent the most risk in each block
  • Managers tend to observe trends and comparisons:
    • how has timing improved over the last few weeks
    • how many bugs are being filed
    • how many tests are passing and failing per week
    • which blocks have the most: warnings, timing failures, bug reports, code changes

These lists can be long, but the genius of the OODA model is that it applies to everyone in an organization.  The DV Notebook is designed to accelerate all aspects of observing and orienting unfolding events in a design and verification environment.  Information is brought together from many sources into customized dashboards.  This provides proper context for decisions and actions.  Many of the most common actions can be simplified to a single mouse click to initiate debug, file a bug report, rerun a simulation, or browse detailed reports.

Chris Kappler

Conventional wisdom and the “Intelligent Test Bench”

February 27th, 2009

During a conversation this year at DVCon 2009, Gary Smith classified the Dynamic Play List features of the DV Notebook as belonging to the classic definition of an Intelligent Test Bench: specifically, the ability of a test bench to prune unproductive work and avoid wasteful re-runs of tests that are not going to find bugs.

Brian Bailey has written a compatible definition of an intelligent test bench in the following quote:

An intelligent testbench can either replace or enhance existing simulation based or formal verification methodologies. Constrained random generation techniques manage to create huge quantities of stimulus, but at the end of the day they have difficulties both with closure (achieving the desired verification goals) and secondly with efficiency (huge server farms required). An intelligent testbench can help either by determining efficient stimulus sets or by finding ways to reach difficult to reach coverage points.

In spite of this, one still needs to be clear about many aspects of intelligent test benches.  Most EDA products that belong in this category are focused on optimizing the outcome of a single test.  The vast majority of those are concerned with a single execution image.  This is in contrast with the idea of looking at a test population to selectively run those tests that are most likely to yield results.  (Also see “Do tests depreciate?”)

There are several different scales to consider:

  • Optimizing a single execution to target a desired outcome
  • Optimizing runs of a single test to meet particular coverage goals
  • Optimally selecting the members (tests) within one or more regression lists
  • Optimizing test populations over the life span of a project or on subsequent projects

In part, the problem may be nomenclature.  The last two bullets above probably relate more to wisdom than to intelligence.  Wisdom is well rounded knowledge accumulated through time and experience.

The goal of the DV Notebook products is to enhance existing simulation-based verification by determining efficient stimulus sets.  This entails looking more broadly at all available data (e.g. bugs, bug fixes, code check-ins, lint results,…) to determine how the code is evolving and how the project is progressing.  This gives a picture of status and overall project health.  It is from that basis that cost reductions in verification can be realized.

Chris Kappler

Do tests depreciate?

January 20th, 2009

Every test can be analyzed in terms of its costs and values.

The costs of a test are mainly measured in time values such as the time it took to write the test or the time it takes to run the test.   Hardware cost factors such as special requirements for simulation memory or emulation hardware may be important.  Another measure of the cost of a test is the cost to check the test result in the event that automatic checking is not being performed.

A test’s value can be looked at in many ways as well.  If writing or running a test makes engineers think differently, then that has value.  However, the most important value of a test is whether it finds bugs.

In a recent conversation at DVClub, the topic turned to the time-value of  tests.  Our host pointed out that the value of a test does not stay constant over time.  Each test goes through a prime of life where it finds most of its bugs.  After those bugs are understood and addressed, that test begins depreciating.

Some tests lose their value in more drastic ways.  When tests are specific to old versions of the design or when they require configuration sequences that are no longer supported, they immediately lose all of their bug-finding value.

Regression testing is an important function to ensure that new bugs are not introduced.  As a result, many tests always retain some minimal value.  However, a test that has never found a bug after 1000 executions is not likely to find one in a stable design.

What everyone seemed to agree on is that the natural depreciation of tests over time requires dynamic test list management to stay on top of which tests should be run most often.  Some of the big microprocessor companies are at the forefront of managing verification resources to maximize the productivity and minimize the cost of their compute infrastructure.  This is partly because their designs share many common components year after year, so they have many product cycles in which to improve their environment.

With the DV Notebook, this kind of corporate environment is not required to analyze and manage the time-value of tests.
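The depreciation idea can be sketched as a score that ranks tests by estimated bug-finding value per minute of runtime. The Python below is only an illustration under stated assumptions and is not the DV Notebook’s algorithm: the half-life of 200 runs, the floor value, the `CaseHistory` record, and the test data are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CaseHistory:
    name: str
    runtime_min: float        # cost: minutes per run
    runs_since_last_bug: int  # executions since this test last found a bug
    bugs_found: int           # total bugs ever found by this test

def value_per_minute(case, half_life_runs=200, floor=0.01):
    """Hypothetical score: bug-finding value halves every `half_life_runs`
    executions without a new bug, but never drops below `floor`, which
    stands for the minimal regression value every test retains."""
    decay = 0.5 ** (case.runs_since_last_bug / half_life_runs)
    value = max(case.bugs_found * decay, floor)
    return value / case.runtime_min

def ranked(cases):
    """Order cases from most to least valuable per minute of runtime."""
    return sorted(cases, key=value_per_minute, reverse=True)

# Made-up histories for three tests.
cases = [
    CaseHistory("smoke_basic", 2.0, 1000, 0),
    CaseHistory("dma_random", 10.0, 20, 4),
    CaseHistory("legacy_cfg", 30.0, 800, 6),
]
order = [case.name for case in ranked(cases)]
for case in ranked(cases):
    print(f"{case.name:12s} {value_per_minute(case):.4f}")
```

A test list built this way would run the top-ranked tests most often while still scheduling the low-scoring ones occasionally, since the floor keeps their score above zero.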

Chris Kappler

When the boss says he has great news…

January 2nd, 2009

I’ve got great news.  We’ve gotten approval for $100,000 to grow the cpu farm and buy more simulation licenses.  I think that we should be able to pull the schedule in by at least a month with this.

- Boss

We’ve all been in a similar discussion.  Management is looking for any way to increase productivity.  Random testing seems to create higher confidence with more instances of the same test.  The problem is that growing the number of simulations by 10x sometimes grows the debug effort too.

This particular case aside, each generation of ASIC requires more and more verification.  Each project therefore needs to run more simulations per day than the project before.  So, how can we scale up CPU usage year after year?  In short, the answer is organization.

Several companies in the Boston area employ SQL-based regression management tools to filter, sort, and organize testing lists and results.  The effect is CPU farms with hundreds or even thousands of CPUs running simulations.  A relatively small team can efficiently utilize these CPUs without wasting too many cycles on repeat errors or useless re-runs.

When tests fail, not all of them need to be debugged.  Some failures are debugged; others are stored in the database to be re-run once the bugs already filed against them are fixed.  One project that we know of required that every failing test and seed be re-run without failure as a pre-condition for tape-out.
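A small sketch shows how an SQL database supports this workflow. It uses Python’s built-in `sqlite3` module with an in-memory database; the schema, table names, bug numbers, and test names are hypothetical and are not taken from any of the tools mentioned above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE bugs (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE failures (
        test   TEXT,
        seed   INTEGER,
        bug_id INTEGER REFERENCES bugs(id)  -- NULL until triaged
    );
""")
db.executemany("INSERT INTO bugs VALUES (?, ?)",
               [(101, "fixed"), (102, "open")])
db.executemany("INSERT INTO failures VALUES (?, ?, ?)", [
    ("dma_random", 48213, 101),
    ("dma_random", 977, 101),
    ("pcie_link", 5521, 102),
    ("cache_evict", 31, None),
])

# Failures whose bug is fixed: re-run the same test and seed to confirm.
rerun = db.execute("""
    SELECT f.test, f.seed FROM failures f
    JOIN bugs b ON b.id = f.bug_id
    WHERE b.status = 'fixed'
    ORDER BY f.test, f.seed
""").fetchall()

# Failures not yet tied to a filed bug: these still need debugging.
to_debug = db.execute(
    "SELECT test, seed FROM failures WHERE bug_id IS NULL"
).fetchall()

print("re-run:", rerun)
print("debug: ", to_debug)
```

Failures tied to an open bug appear in neither list, so no CPU time or debug effort is spent on them until the fix lands.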

Chris Kappler

How to tell when you’ve outgrown your spreadsheet

December 20th, 2008

Spreadsheets are great.  We all use them to organize our thoughts and experiment with what-if calculations.  In talking to different engineering companies in the Boston area, we have realized that spreadsheets are being used to do real engineering.  Sometimes, a spreadsheet is the perfect tool for our needs.  Other times, it starts to grow until we realize that we have surpassed the limits of the spreadsheet.

Examples of exceeding the limits of a spreadsheet tend to come in two forms.  Either the spreadsheet is being used as an ad-hoc database, or it is being used to perform a relatively complicated multi-step calculation that might be better served by a script or a program.

When the spreadsheet is being used as a database, there are a few clues that it is time to consider moving the data into a more powerful tool.  At a high level, engineers will be better served by a simple database in place of the spreadsheet if any of the following are true:

  • If the spreadsheet has multiple authors
  • If data searching is required
  • If it has a combination of calculations and data storage
  • If it requires different numbers of different types of data

Similarly, a few clues indicate that a spreadsheet would probably save time and money if it were rewritten as a simple script for computation:

  • If the same calculation is replicated on different worksheets in order to consider different starting values
  • If many versions of the spreadsheet are saved with special names to indicate the what-if case that is being considered
  • If the data are being selected from indexed lists of pull downs

Migrating ad-hoc databases and multi-step calculations out of Excel and into other tools or scripts is usually pretty easy.
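As an illustration of the script form, the Python sketch below replaces a set of what-if worksheets with one function and a table of named scenarios. The calculation (days needed to run a regression on a CPU farm), the 80% utilization figure, and the scenario numbers are hypothetical examples, not data from a real project.

```python
def simulation_days(tests, minutes_per_test, cpus, utilization=0.8):
    """Multi-step calculation: wall-clock days to run a regression."""
    cpu_minutes = tests * minutes_per_test
    effective_cpus = cpus * utilization
    return cpu_minutes / effective_cpus / (60 * 24)

# Each what-if case is a named set of inputs, not a copied worksheet.
scenarios = {
    "baseline":     dict(tests=20000, minutes_per_test=12, cpus=100),
    "double_farm":  dict(tests=20000, minutes_per_test=12, cpus=200),
    "faster_tests": dict(tests=20000, minutes_per_test=6, cpus=100),
}
results = {name: simulation_days(**inputs) for name, inputs in scenarios.items()}
for name, days in results.items():
    print(f"{name:13s} {days:.2f} days")
```

Because the formula lives in one place, a correction to it applies to every scenario at once, which is exactly what replicated worksheets make difficult.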

In some cases, a spreadsheet relies on useful Excel functions that the user does not want to re-program or replace with standard libraries.  Most scripting languages offer ways to access Microsoft Excel functions directly.  Visual Basic, Ruby, and Python all have such extensions.  On Linux, there are also C libraries for the OpenOffice platform that can be used.

Chris Kappler