Verification Martial Arts: A Verification Methodology Blog

Archive for November, 2010

Communication Options in VMM 1.2

Posted by John Aynsley on 30th November 2010

 

John Aynsley, CTO, Doulos

Having given an example of transaction-level communication in VMM in my previous post, I am now going to say a little about each of the communication options available in VMM 1.2. The VMM user is presented with an array of communication options from which to chose, but which-is-which and which is best?

Blocking Transport

Blocking transport is used to send a single payload representing a transaction from an initiator to a target where very little timing information is required. Each transaction can have a begin time and an end time, but there is no more structure or detail to the lifetime of the transaction than that. It is possible to annotate timing information onto the b_transport call to offset the begin and end times of the transaction from the actual times as which the function is called and returned, but not to add further timing points or events during the lifetime of the transaction. The b_transport function is called with a transaction object as an argument, and returns only when the processing of that transaction at the target is complete. Blocking transport is the cleanest and simplest way to send a payload from A to B.

Non-blocking Transport

Like blocking transport, non-blocking transport is used to send a single payload representing a transaction from an initiator to a target. Unlike blocking transport, non-blocking transport allows significant events within the lifetime of the transaction to be modeled to whatever degree of timing granularity you like, and also allows multiple pipelined transactions to be in-flight at the same time. This is achieved by allowing multiple calls to nb_transport_fw and nb_transport_bw, in the forward and backward directions respectively, associated with a single transaction. The ability to model multiple timing points and pipelining comes at a cost in terms of the complexity of the interface, however. In the context of VMM, non-blocking transport is only of interest if you need to communicate with an existing SystemC TLM-2.0 model that uses this same interface.

Analysis Interface

The analysis interface is the third import to VMM from the TLM-2.0 standard. The analysis interface is used to broadcast a single payload to passive subscribers (aka observers or listeners). The defining characteristics of the analysis interface are firstly that there can be any number of subscribers to a single call and secondly that the subscribers are not permitted to modify the transaction object. Hence the analysis interface is ideal for distributing transactions to passive components such as checkers, scoreboards, and coverage collectors within a verification environment. The analysis interface is non-blocking and does not have any associated timing or status information.

Channels

Channels are the classic way to send payloads around a VMM verification environment. Unlike the three transaction-level interfaces described above, a channel is more than a function call. A channel is a FIFO buffer that can store multiple outstanding transactions. A transaction can be accessed by both producer and consumer transactors while it remains in the so-called active slot of the channel, and it is possible to model multiple timing points at that stage by having VMM notifications associated with the channel or with the transaction object. Channels are still the preferred mechanism used to model the timing and synchronization details of specific protocols within VMM, while blocking transport is preferred where a more abstract model is adequate.

Callbacks

In some respects callbacks play the same role as the analysis interface described above, while in other respects there are fundamentally different. Callbacks are like the analysis interface in the sense that they distribute function calls to any subscriber that has register itself with the transactor making the calls. They are different in that a callback is allowed to modify transactions passed as arguments to the function call whereas the analysis interface does not allow transactions to be modified. Also, callbacks are more flexible in the sense that both the callback function names and their argument lists are user-defined, which is not true of the analysis interface. In VMM, callbacks are the preferred mechanism where the behavior of a transactor needs to be modified by tweaking the contents of a transaction.

Notifications

VMM notifications are preferred for data-less synchronization where transactors need to synchronize without passing a payload. The analysis interface is preferred when passing around payloads. VMM notifications are also built into the transaction and channel base classes, so can be used to introduce additional timing points during the lifetime of a transaction when using VMM channels for communication as described above.

So, those are the six main choices for communication available to you in VMM. You can read more details in the VMM 1.2 Standard Library User Guide.

Posted in Communication, Transaction Level Modeling (TLM) | 1 Comment »

Planning for Functional Coverage

Posted by JL Gray on 23rd November 2010

JL Gray, Vice President, Verilab, Austin, Texas, and Author of Cool Verification

Functional coverage cures cancer and the common cold. It helps kittens down from trees and old ladies up the stairs. Functional coverage is the greatest thing since sliced bread, right? So all we need to do is put some in our verification environments and our projects will be a success.

Well… perhaps not.

Many teams I’ve worked with have struggled with various aspects of functional coverage. The area I’d like to address today is the part most engineers spend the least amount of time on – planning (as opposed to implementing) your functional coverage model. One of the critically important uses for functional coverage is to help us (engineers and project managers alike) know when we are done with verification. We all know it’s impossible to 100% verify a complex chip. But it is possible to define what “done” means for our individual teams and projects. The functional coverage model gives us a way to correlate what we’ve accomplished in the testbench with some definition of “done” in our verification plan.

What that means is that the verification plan itself is the key to understanding where we are relative to where we need to be to finish a project. It is the document that managers, designers, verification engineers and others can all interact and agree on what constitutes “good enough.” Verification plans often contain lists of design features and statements about what types of tests should be run to verify these features. Unfortunately, that is almost never sufficient to give stakeholders insight into what should be covered.

Verification plans need to contain a sufficient level of detail to meet the following goals:

  • 1. Allow key stakeholders who may not know anything about SystemVerilog to see what will be covered and then provide a platform for them to give feedback to the verification engineer.
  • 2. Allow this discussion to take place before any SystemVerilog code has been written.
  • 3. Allow an engineer other than the author of the document to understand and implement the coverage model.
  • 4. Allow the project manager to understand what coverage goals should be reached for the module to be considered verified.

Point 3 is important in ways some engineers may not realize. During my verification planning seminars I often ask students if they’ve ever heard of a project’s “truck factor”. Basically, a project’s truck factor is the number of people on the project who could get hit by a truck before the project is in trouble. A verification plan should specify coverage goals in enough detail that someone besides you (this could include the you 6 months from now who’s forgotten all relevant details about the block in question) can understand what the original intent of the document was. For example, some plans I’ve seen make comments like “test all valid packet lengths”. What are the valid lengths? They should be specified in the plan. Are there certain corner cases you think are important to cover? Make sure you call out this information with sufficient detail so that the project manager can tell if you’ve actually accomplished your goals!

Adding this extra level of detail to your verification plan takes time. But without the detail, how can you easily share with your colleagues what you plan to do? And how can you know which features your testbench needs to support in order to meet your coverage goals?

As always, questions and comments welcome.

Posted in Coverage, Metrics, Verification Planning & Management | 2 Comments »

You can also “C” with RAL.

Posted by S. Varun on 18th November 2010

The RAL C interface provides a convenient way for verification engineers to develop firmware code which can be debugged on a RTL simulation of the design. This interface provides a rich set of C API’s using which one can access fields, registers and memories included within the RAL model. The developed firmware code can be used to interface with the RAL model running on a SystemVerilog simulator using DPI (Direct Programming Interface) and can also be re-used as application-level code to be compiled on the target processor in the final application as illustrated in the figure below.

image

The “-gen_c” option available with “ralgen“ has to be used for this and that would cause the generator to generate the necessary files containing the API’s to interface to the C domain. The generated API’s can be used in one of two forms.

1. To interface to the System Verilog RAL model running on a System Verilog simulator using DPI and

2. As a pure standalone-C code designed to compile on the target processor.

Typically firmware runs a set of pre-defined sequences of writes/reads to the registers on a device performing functions for boot-up, servicing interrupts etc. You generally have these functions coded in C, and these would need to access the registers in the DUT. Using the RAL C model, these functions can be generated so that the firmware can now perform the register access through the SystemVerilog RAL model. Thus, this allows firmware and application-level code to be developed and debugged on a simulation of the design and the same functions can later be used as part of the device drivers to perform the same tasks on the hardware.

In the first scenario, when executing the C code within a simulation, it is necessary for the C code to be called by the simulation to be executed and hence the application software’s main() routine must be replaced by one or more entry points known to the simulation. All entry points must take at least one argument that will receive the base address of the RAL model to be used by the C code. The C-side reference is then used by the RAL C API to access required fields, registers or memories.

image

Consider a sequence of register accesses performed at the boot time of a router as defined in function system_bootup() shown below,

File : router_test.c

#ifdef VMM_RAL_PURE_C_MODEL

/*when used as a pure C model for the target processor */

int main() {

void *blk = calloc(410024,1);

system_bootup((void *) (((size_t)blk)>>2)+1);

}

#endif

void system_bootup (unsigned int blk){

unsigned int dev_id = 0xABCD;

unsigned int ver_num = 0×1234;

/* Invoking the generated RAL C APIs */

ral_write_DEV_ID_in_ROUTER_BLK(blk, &dev_id);

ral_write_VER_NUM_in_ROUTER_BLK(blk, &ver_num);

}

In the above example “ral_write_DEV_ID_in_ROUTER_BLK()” & “ral_write_VER_NUM_in_ROUTER_BLK()” are C API’s generated by “ralgen” to access registers named “DEV_ID” & “VER_NUM” located in a block named “ROUTER_BLK”. In general registers can be accessed using “ral_read_<reg>_in_<blk>” & “ral_write_<reg>_in_<blk>” macros present within the RAL-C interface.

The above C function can also be called from within a System Verilog testbench via DPI

File : ral_boot_seq_test.sv

import “DPI-C” context task system_bootup(int unsigned ral_model_ID);

program boot_seq_test();

initial

begin

router_blk_env env = new();

// Configuring the DUT

env.cfg_dut();

// Calling the C function using DPI

system_bootup(env.ral_model.get_block_ID()); // ‘ral_model ‘is the instance of the generated SV model in the env class

end

endprogram

In the example above, system_bootup is the service entry point which is called from the SV simulation and is passed the RAL Model reference. The ‘C’ code that is executed and simulation freezes till one of the registers accesses is made which then shifts the execution to the SV side.

image

The entire execution in ‘C’ is in ‘0’ time in the simulation timeline. The RAL C API hides the physical addresses of registers and the position and size of fields. The hiding is performed by functions and macros rather than an object-oriented structure like the native RAL model in SystemVerilog. This eliminates the need to compile a complete object-oriented model in an embedded processor object code with a limited amount of memory

Thus by using the RAL C interface one can develop a compact and efficient firmware code, while preserving as much as possible of the abstraction offered by RAL. I hope this was useful and do let me know your thoughts on the same if you plan to use this flow to meet some of your requirements of having your firmware and application-level code to be developed and debugged on a simulation of the design.

Please refer to the RAL userguide for more information. You can also refer to the example present within VCS installation located at $VCS_HOME/doc/examples/vmm/applications/vmm_ral/C_api_ex.

Posted in Register Abstraction Model with RAL, SystemC/C/C++, SystemVerilog, VMM | No Comments »

Reusing Your Block Level Testbench

Posted by JL Gray on 12th November 2010

JL Gray, Vice President, Verilab, Austin, Texas, and Author of Cool Verification

Building reusable testbench components is a desirable, but not always achievable goal on most verification projects. Engineers love the idea of not having to rewrite the same code over and over again, and managers love the idea that they can get more work out of their existing team by simply reusing code within a project and/or between projects. It is interesting, then, that I frequently encounter code written by engineers that cannot work in a reusable way.

Consider the following scenario. You’ve just coded up a new block level testbench to verify an OCP to AHB bridge. As any good engineer knows, your environment must be self-checking, so you create a VMM scenario that does the following:

class my_dma_scenario extends vmm_scenario;

  rand ocp_trans ocp;
  ahb_trans ahb;

  // ... 

  virtual task execute(ref int n);
    vmm_channel ocp_chan = this.get_channel("OCP");
    vmm_channel ahb_chan = this.get_channel("AHB");
    this.ocp.randomize();
    ocp_chan.put(this.ocp.copy());

    // Wait for the transaction to complete...

    ahb_chan.get(ahb) 

    // Compare actual and expected values
    // in the stimulus instead of the scoreboard.
    my_compare_function(ocp, ahb);

  endtask

endclass

Or, perhaps you decide to get fancy and update and create a situation where you add the expected transaction from the scenario directly to a scoreboard, which compares the results with transactions on the AHB interface as observed by a monitor. Either way, you go merrily about your business verifying the bridge. You get excellent coverage numbers, and eventually, with the help of the block’s designer, you feel you’ve fully verified the module at the block level. However, things are not going well in the full chip environment. Full chip tests are failing in unexpected ways, and other engineers debugging the issue feel it must be a bug in the brigde! They start assigning bugs to you and the block designer. If only you had some checkers operating at the full chip level you could prove your module was operating correctly…

Unfortunately, you have a problem. In your block level testbench, your checkers only work in the presence of stimulus, and in the full chip environment the stimulus components cannot be used. And at this stage of the project, there is no time to go back and create the additional infrastructure (monitor(s) and possibly a new scoreboard) required to run your self-checking module-level testbench at the full chip level.

Fortunately, there is another way to build a module level testbench so that it will be guaranteed to work at the full chip. Follow these simple rules:

  1. Always create a monitor, driver, and scenario generator for every testbench component.
  2. Never populate a scoreboard from a driver or scenario. Always pull in this information from a passive monitor.

In general, when writing your testbench components always assume they will need to run in the absence of testbench stimulus. Yes – this means additional up-front work. Writing checkers that work with passive monitors instead of the information you have available to you in the driver can be time consuming. Often, though, the amount of additional work required is much less than the time you will spend debugging issues without checkers at the full chip level. That being said, sometimes it is too difficult to implement purely passive checkers. You will have a good idea of whether or not this is the case based on the upfront work you did writing a comprehensive testplan. You do have a testplan, right?

Posted in Reuse, VMM, Verification Planning & Management | 3 Comments »

Modeling ISRs with VMM RAL

Posted by Amit Sharma on 4th November 2010

In a verification environment, different components may be trying to access the DUT registers and memories. For example, the BFM might be programming some registers while the bus monitor might be sampling the values of these registers. In specific cases, there may be an interrupt monitor which triggers an Interrupt Service Routine (ISR) whenever it sees an Interrupt pin toggling in the interface. The ISR might end up having to read the Interrupt registers and end up clearing the Interrupt bit/s through a front door access.

To ensure that different components in a verification environment can access the DUT registers at any given point in time, the RAL model instantiated in the environment can be passed to different VMM components. These different components whose methods are executing in separate parallel threads can now access the same set of registers in the DUT through the RAL model. A question many folks ask is: when there are multiple parallel register accesses, how do they get scheduled through the RAL layer?  Here is an explanation of how threads are scheduled in RAL:

A Register read/write from different threads is comparable to an ‘atomic’ channel.put() from  different threads. Hence it gets scheduled in the order of pipelining of the threads.

A write/read would basically consist of the following atomic operations:

- A generic vmm_rw_access transaction with its fields (addr, data, kind etc) being populated and posted onto the execute_single()   task of the Translate Transactor

- The transaction being translated in the execute_single() task and pushed into the input channel of the user BFM

- The transaction being retrieved through get/activate in the User BFM main thread and then subsequently driven to the DUT interface

Thus ‘posting’ of RAL accesses whenever a Read/Write/Mirror/Update is invoked is in the same order they are issued. Subsequently, the execute_single() task just translates the generic RW RAL transaction to a User BFM comprehensible transaction, and doesn’t change the order.

Now, how do we handle a scenario when specific register accesses like those coming from an ISR need to be given a higher priority than accesses coming from other threads in a verification environment?  VMM and RAL methods give us specific hooks to achievethis requirement. Here is one of the options on how this can be done.

If we look at the vmm_ral_reg::write method, we have the given signature:

virtual task write( output vmm_rw::status_e status,

    input bit [63:0] value,

    input vmm_ral::path_e path = vmm_ral::DEFAULT,

    input string domain = "",

    input int data_id = -1,

    input int scenario_id = -1,

    input int stream_id = -1)

Now, a generic RAL transaction that gets created through any Register access has the same data_id, scenario_id, stream_id arguments which get passed on from the read/write call. These arguments help us tag and track the transactions if that is so desired. Now, these arguments can be made use of in the execute_single() task to ensure that accesses from ISRs have the highest priority. But, first, if we go back to the earlier post by Varun, Issuing concurrent READ/WRITE accesses to the same register on two physical interfaces using RAL, Issuing concurrent accesses to the same registers on two physical interfaces using RAL , we note that if the RAL model is processing a ‘register access’ , it will not initiate the next one until the earlier one is completed. So, this is what we need to get done.We first use the RAL Proxy transactor to schedule the ISR register access simultaneously. After that, we flush out any existing accesses and prevent any new register access through RAL until access from the ISR is completed

This is how it, will be done:

image

For a normal register access, a read/write method will be invoked as follows:

          ral_model.<reg_name>.write(stat,wdata, “AHB”); //’wdata’ is the value to be driven, “AHB” is the domain /physical interface

For a register access in an ISR modeled in a MS Scenario, we have:

         env.bfm.to_ahb.grab(this); //grabs the channel

         env.ral.read(status, env.ral_model.<reg_name>.get_address_in_system("AHB"), data, 32, "AHB",,,1); // The last argument is again the “stream_id” argument

         env.bfm.to_ahb.ungrab(this); //allows the channel to be accessed from other threads once the ISR is completed

Once, this is done, the execute_single() task inside the translate Transcator will know if an access is through an ISR and can ensure that ISRs are processed on priority through a combination of functionality provided through the VMM Channel methods in combination with a semaphore:

virtual task execute_single(vmm_rw_access tr);

        AHB_tr cyc;

        AHB_tr cyc_active; //to keep any transactions residing on the active slot

        semaphore sem = new(1); //semaphore with a single key to prevent new accesses when a reg access from an ISR is processed

       // Translate the generic RW into a simple RW

       cyc = new;

       {cyc.dev, cyc.addr} = tr.addr;

       if (tr.kind == vmm_rw::WRITE) begin

         cyc.cycle = simple_tr::WRITE;

         cyc.data = tr.data;

         …

       end

      else begin

         cyc.cycle = simple_tr::READ;

         …

      end

      if (tr.stream_id != 1) sem.get(1); // regular register access gets blocked here when a reg access from an ISR is processed

      else begin

         if(this.bfm.to_ahb.is_full()) begin

           this.bfm.to_ahb.activate(cyc_active); //removes any existing transactions in the channel’s active slot so that the current access can be pushed through

           this.in_chan.start();

           this.in_chan.complete();

           this.in_chan.remove();

           this.bfm.to_ahb.put(cyc);

           this.bfm.to_ahb.put(cyc_active); //restores the original transaction back into the active slot

           cyc = cyc_active;

           sem.put(1); //ths ISR access puts back the key for normal accesses to resume

        end

        else this.bfm.to_ahb.put(cyc);

      end

      // Send the result back to the RAL

      if (tr.kind == vmm_rw::READ) begin

        tr.data = cyc.data;

      end

      endtask: execute_single

Posted in Register Abstraction Model with RAL, VMM infrastructure | No Comments »