1. Introduction

This document describes the test vector development environment for the RCP.
It briefly explains the partitioning of tests, the tools and methodologies
used, then describes the purpose of each module's directed tests.


2. Test Partitioning

We have grouped the test vectors into four categories:

    a) i/o	(AI, MI, PI, SI, VI) (these modules will be described below)
    b) misc	(imem_bist, io_pad, sp_mem, tmem_bist, vi_ms_buf)
    c) rsp
    d) rdp

Four directories with these names (io, misc, rsp, rdp) are maintained in the
PR/hw2/chip/vector directory, and contain the source files used to generate 
the test vectors.


3. Tools and Methodologies

All test vectors are captured as ascii signal dumps from a verilog simulation
of the chip.  The simulator takes its input from an external 'C' executable
named "iosim".  This program reads a ".tst" command file, then communicates 
with the verilog simulator to write to memory mapped locations as instructed
by the command file.

In the iosim test environment, all accesses to the RCP memory address space are
converted to the R4000 K1 segment space.  Because the K1 segment is in kernel
space, each such access from iosim automatically causes UNIX to raise an
exception.  The iosim exception handler then extracts the RCP physical address
from the exception address and, depending upon whether the faulting instruction
was a store or a load, sends the corresponding write or read request to the
verilog process.

The verilog process (via the r4200b.v module) receives the request, generates 
the proper signals on the MBUS to either read/write from/to a memory address 
(i.e., RDRAM, SP registers), collects the data, and sends it back to iosim. 
During all this time, iosim is blocked in the exception handler, waiting for 
the response from verilog. Upon receiving valid data, the iosim exception 
handler inserts the data into the proper CPU registers, increments the 
exception PC, and returns from exception. The iosim application can then 
continue to process the next command from the command file.
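
The address handling in the trap path above can be sketched as follows.  This
is a minimal illustration and not the iosim source: the K1 constants are the
standard MIPS kseg1 values, while the function names and the send_to_verilog
callback are hypothetical stand-ins for the real iosim-to-verilog channel,
which this document does not describe.

```python
K1_BASE = 0xA0000000        # start of the MIPS K1 (kseg1) segment
PHYS_MASK = 0x1FFFFFFF      # low 29 bits carry the physical address


def phys_to_k1(phys_addr):
    """Map an RCP physical address into the K1 segment."""
    return K1_BASE | (phys_addr & PHYS_MASK)


def k1_to_phys(fault_addr):
    """Recover the RCP physical address from the faulting K1 address."""
    return fault_addr & PHYS_MASK


def handle_fault(fault_addr, is_store, data, send_to_verilog):
    """Forward one trapped access to the simulator and return its reply.

    send_to_verilog(kind, phys_addr, data) is a hypothetical callback that
    blocks until the simulator answers.
    """
    phys = k1_to_phys(fault_addr)
    kind = "write" if is_store else "read"
    return send_to_verilog(kind, phys, data)
```

The real handler would additionally place the returned data in the proper CPU
register and advance the exception PC before returning, as described above.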

An important feature of the verilog simulator is its ability to initialize the
rdram model using a memory mapped file (a ".rdram" file) whose contents are to
be loaded directly into 1MB, 2MB, 4MB, or 8MB of rdram model memory.  The 
contents of the .rdram files often include RDP display lists and texture data,
which may then be rendered to the frame buffer (contained within the rdram
memory model) with a few simple memory mapped writes to RDP registers modeled
by the verilog simulator.
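
As a rough sketch of the loading step, the following maps a .rdram image
after checking that its size is one of the four model sizes listed above.  It
assumes the file is a raw byte-for-byte image of the rdram model, which the
text implies but does not state; both function names are hypothetical.

```python
import mmap
import os

MB = 1024 * 1024
VALID_SIZES_MB = (1, 2, 4, 8)      # rdram model sizes named in the text


def check_rdram_size(num_bytes):
    """Return the image size in MB, or raise if it is not a supported size."""
    if num_bytes % MB != 0 or num_bytes // MB not in VALID_SIZES_MB:
        raise ValueError("unexpected .rdram size: %d bytes" % num_bytes)
    return num_bytes // MB


def map_rdram_image(path):
    """Memory-map a .rdram image read-only (assumed raw binary layout)."""
    check_rdram_size(os.path.getsize(path))
    with open(path, "rb") as f:
        # mmap keeps its own duplicate of the descriptor, so the file object
        # can be closed as soon as the mapping exists.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```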

The test vector environment consists primarily of .tst files which provide
directed I/O tests to memory mapped locations modeled by the verilog simulator,
and .rdram files, whose embedded display lists exercise the indirectly 
accessible logic of the RDP.  If a display list written to exercise a 
particular logic block of the RDP uncovers a fault, the pixel data written
to the frame buffer (and thus captured as test vectors) will be in error.  

Unfortunately, there is no more direct way to test the RDP.

Finally, the raw ascii dumps (typically 200-800MB of information) are then
parsed into test vectors by tools external to the SGI environment (e.g. scripts
and programs on an HP workstation which provide vectors in the HP tester
format).

4. i/o test vectors

AI (Audio Interface)

The AI test vectors are generated from an iosim ".tst" file which writes to
programmed i/o (memory mapped) locations within the RCP such that the RCP is
initialized to perform audio DMA's from RDRAM to the AI.  The memory activity
is captured at the external pins and extracted as test vectors.  Many address
boundary conditions are exercised and tested.  We wait for the interrupt which
signals DMA completion prior to starting the next DMA cycle.

AI counters are also tested; these counters are used to program the DAC output
rates and the sampling rates of the incoming audio data.


MI (Microprocessor Interface)

An iosim ".tst" file tests the following features of the MI block:

    1. Ability to receive and generate coprocessor interrupts.

    2. Ability to set ebus test mode and RDRAM register mode.

    3. Perform a walking one's test through the SysAD bus (the basic i/o bus
       of the R4300 microprocessor).

    4. Perform multiple byte transfers across the SysAD bus (to test all 
       possible address boundary conditions).
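
The walking ones test in item 3 can be illustrated with a short sketch.  The
32 bit default width and both function names are assumptions made for this
example; the actual .tst commands are not shown in this document.

```python
def walking_ones(width=32):
    """Yield each pattern with exactly one bit set, from bit 0 upward."""
    for bit in range(width):
        yield 1 << bit


def failing_bits(read_back, width=32):
    """Return the bit positions whose pattern did not read back intact.

    read_back holds the values returned by the bus, in the same order the
    walking_ones() patterns were written.
    """
    return [bit for bit, value in zip(range(width), read_back)
            if value != (1 << bit)]
```

Writing each pattern and comparing the value read back isolates a stuck or
shorted line to a single bit position.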


PI (Peripheral Interface)

The PI vectors are generated from an iosim .tst file; in addition, a mock PIF
controller is implemented in the top level verilog file "pif_tasks.v", and
satisfies the PI test's reads from the PIF rom address space, as well as the
interrupts that would have been generated by the PIF during such read activity.
We do not test the PIF microprocessor; we just test the RCP's ability to read
and write the PIF address space.

We also explicitly test the following PI registers:

    1. PI RDRAM address register

    2. PI cartridge address register

    3. Rom access register; this register is used to control the speed of the
       AD16 (cartridge) bus.

The PI's ability to DMA to/from RDRAM is tested with differing lengths and
address boundary conditions.  Finally, we test the PI's ability to read/write
from the different cartridge address domains (address spaces).


SI (Serial Interface)

The SI vectors are also generated from an iosim .tst file.  The serial interface
is used primarily to read data from the handheld game controller, which is 
formatted by the PIF microprocessor into the small 2K ram space maintained 
within the PIF address space.  The following features are tested:

    1. Ability to read "preloaded" pif data from the dummy top level pif model.

    2. 4 byte and 64 byte DMA's from PIF to RDRAM (and vice versa), on all
       possible address boundaries.

    3. PIO (programmed i/o) reads from the PIF rom area.

    4. Back to back DMA's (should generate the DMA busy signal).
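
One way to enumerate the boundary cases in item 2 is sketched below.  The 8
byte window used for the start offsets is purely an assumption chosen for
illustration; the alignment rules of the real SI hardware are not given in
this document, and the function names are hypothetical.

```python
def dma_cases(lengths=(4, 64), alignment=8):
    """List (start_offset, length) pairs, one per offset in an aligned window."""
    return [(offset, length)
            for length in lengths
            for offset in range(alignment)]


def crosses_boundary(offset, length, boundary):
    """True when a transfer starting at offset spans a multiple of boundary."""
    first_block = offset // boundary
    last_block = (offset + length - 1) // boundary
    return first_block != last_block
```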


VI (Video Interface)

These iosim .tst files configure the VI for a 320 x 32 framebuffer, then load
RDRAM with test pattern data.  All VI registers are written (and read) as they
are configured for the framebuffer size & location.

The same tests are repeated, applying a 2x scale factor in the x & y directions.


5. Miscellaneous test vectors

The PR/hw2/chip/vector/rcp_misc_test directory contains verilog test programs
used to generate slow and fast speed tests for the RCP.  An initial test
program for IDDQ testing is also implemented in this directory.

The tests provided include fast and slow speed versions for the I/O pads, as
well as tests which invoke the built-in self-test (bist) features of the rsp 
IMEM (instruction memory) and TMEM (texture memory).

In addition to these tests, additional verilog tests were written to further
exercise the SP memories (IMEM, DMEM), the line buffer used by the VI block,
and the span buffer used by the MS (memspan) block.


6. RSP

Vectors are captured by running microcoded tests through a specially built
verilog based rsp simulator, and capturing the resultant signal dumps as raw
vectors.  The signal dumps are then parsed into test vectors by scripts and
programs maintained on our HP workstation.

This simulator was intended for daily regression tests of the RSP logic, and
does not include the RDP logic; it also includes special verilog modules which
dump the state of external RCP pins for vector capture purposes.

Blocks tested include the vector and scalar processing units (vu, su), the 
data memory (dmem), the instruction memory (imem), the divisor control logic
(divctl), the divisor rom table (divrom), the load/store unit (ls), and the
io_logic unit (io_logic).  Vector coverage for the io_logic block is also
provided by several of the RDP test vector modules.

The vectors for the io_logic block are created using the iosim 'C' interface
to the verilog simulator, so that we can easily exercise the memory mapped 
hardware of the RSP.


7. RDP

Very few of the RDP logic blocks are accessible to a memory mapped directed
test approach.  Only the memspan's span buffer can be tested in this fashion
(and is done so in the miscellaneous test category previously discussed).

We decided that the best method for achieving optimal coverage of the RDP logic
blocks was to generate and capture special RDP display lists, each of which 
would target a particular logic block of the RDP.  Several frames worth of 
rendering were required to obtain good vector coverage for each module, and
the resultant RDP display lists were captured from the development system and
converted into the ".rdram" file format.

The tool used to generate these display lists was a Nintendo 64 application
program (written by Silicon Graphics) entitled "rdpvector".  It utilizes special
RSP microcode which writes the RDP display list into an rdram buffer (the 
standard RSP microcode writes the RDP display list commands directly into the
XBUS channel which links the RSP and RDP together).  Once the RSP has written
the entire RDP display list into this buffer (maximum size is 64K bytes, or
8192 64 bit RDP commands), the rdpvector program utilizes a debug port to
dump (in ascii) the contents of this display list buffer, the contents of the
rdram texture segment (which is used by the RDP whenever a texture load command
is processed), and information regarding the framebuffer configuration needed
to execute this display list.

A second tool, rdpascii2rdram, converts the ascii formatted display list data
into a binary .rdram file suitable for use by the RCP verilog simulator.
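
The kind of conversion rdpascii2rdram performs can be sketched as follows.
This is not the tool itself: the input layout (one hexadecimal 64 bit command
per line) is a made-up stand-in because the real ascii dump format is not
described here, and the big-endian byte order is likewise an assumption.

```python
import struct


def ascii_to_binary(lines):
    """Pack one hexadecimal 64 bit command per line into big-endian bytes."""
    out = bytearray()
    for line in lines:
        text = line.strip()
        if not text:
            continue                      # skip blank lines
        word = int(text, 16)
        if not 0 <= word < (1 << 64):
            raise ValueError("not a 64 bit value: %r" % text)
        out += struct.pack(">Q", word)    # ">Q": big-endian unsigned 64 bit
    return bytes(out)
```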

Once captured as ".rdram" files, the RDP display lists (along with the texture 
memory segments and pre-initialized frame buffer memories) serve as input to an
iosim managed verilog simulation process, which subsequently dumps the signal
data that comprise the chip test vectors.  These raw test vectors are then 
parsed into the HP test vector format by additional scripts and tools that run
on our HP workstation.

A description of the display lists used to test the major RDP logic blocks
follows:


CS (Command Shuffle unit)

The command shuffle unit buffers commands and arguments for the RDP pipeline,
converting them from an RSP format to an RDP format, possibly reordering data
and commands for better pipelining.

Two vector files are captured; the vectors for these groups of tests are 
generated separately to avoid the need to run and test large vector files each
time a new test is added (or an old test is modified).  The .rdram files used
to generate these vector files are run through the simulator separately, and
the resulting vectors are composited together.

The first vector file, "cs_vector.rdram", renders a pair of triangles whose
vertices extend from the minimum screen coordinates to the maximum supported
by the RDP (a range from 0 to 4091).  These triangles walk bits through the
entire address range of the CS unit, the ST (stepper) unit, and the EW (edge
walker) unit.

The second vector file is similar to the first, except that the RSP generated
vertices were edited by hand and dumped into the RDP pipe as raw 64 bit 
commands.  The graphics header file "gbi.h" (graphics binary interface) provides
the definition for this special RSP display list command ("gsDPWord") which 
dumps 64 bits of data directly into the RDP without any processing.  By so 
doing, more precise control over the vertex data was possible, allowing us to
provide better coverage for the design.
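
Hand editing a raw 64 bit command amounts to splitting it into words and
overwriting individual bit fields.  The helpers below are a sketch of that
bookkeeping only; they are hypothetical, are not part of gbi.h, and do not
encode any real RDP command layout.

```python
def split_command(command):
    """Split a 64 bit command into (high, low) 32 bit halves."""
    if not 0 <= command < (1 << 64):
        raise ValueError("command must fit in 64 bits")
    return (command >> 32) & 0xFFFFFFFF, command & 0xFFFFFFFF


def join_command(high, low):
    """Rebuild a 64 bit command from its two 32 bit halves."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


def set_field(command, shift, width, value):
    """Overwrite the width-bit field starting at bit position shift."""
    mask = ((1 << width) - 1) << shift
    return (command & ~mask) | ((value << shift) & mask)
```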


TM (Texture Memory unit)

A variety of frames are rendered, and the resulting display lists are captured
as a single .rdram file (thus only one vector file is produced for this module).

Frame 1:

    Four textured rectangles of type RGBA are rendered in 16 bit mode, with data
    patterns of 0xa5, 0x5a, random data, and the complement of this random data.

    The rectangles are rendered once in 1-cycle mode, and once in copy mode.  The
    set of rectangles is rendered twice: once with texture data loaded low
    in tmem, and once with the texture data loaded high in tmem.

Frame 2:

    Similar to Frame 1, except that the scene is rendered in 32 bit mode instead
    of 16 bit.

Frame 3:

    Similar to Frame 1, with YUV textures rendered in 16 bit mode.

Frame 4:

    Similar to Frame 1, with Intensity Alpha textures rendered in 16, 8, and
    4 bits.

Frame 5:

    Similar to Frame 1, with Intensity textures rendered in 8 and 4 bits.

Frame 6:

    Similar to Frame 1, with Color Index textures rendered in 8 and 4 bits.

Frame 7:

    All types and sizes, random data, 1 cycle mode, with texture data loaded
    into both low and high tmem (this is a compressed test, allowing a single
    frame to capture the elements tested by the first 6 frames of this module's
    test sequence).

Frame 8:
    
    A copy mode rectangle is rendered using procedurally generated data whose
    values best exercise the TM unit's logic.
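
The four data patterns named for Frame 1 can be generated along these lines.
The function name, the byte-wise layout, and the seeded generator are
assumptions for illustration; the actual texture data used by rdpvector is
not reproduced here.

```python
import random


def frame1_patterns(num_bytes, seed=0):
    """Return the 0xa5, 0x5a, random, and complemented-random byte patterns."""
    rng = random.Random(seed)            # seeded so the pattern is repeatable
    rand = bytes(rng.randrange(256) for _ in range(num_bytes))
    inverse = bytes(b ^ 0xFF for b in rand)
    return [bytes([0xA5]) * num_bytes,
            bytes([0x5A]) * num_bytes,
            rand,
            inverse]
```

Because 0xa5 and 0x5a are bitwise complements, as are the two random
patterns, every memory bit is driven to both 0 and 1 across the four
rectangles.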


TC (Texture Coordinate unit)

The TC is responsible for addressing and retrieving data from the texture
memory on behalf of a rendered primitive.  Conversely, it is also responsible
for loading texture data into texture memory from RDRAM.

Two sets of test vectors were prepared; the first display list renders triangles
which test the TC dividers and {S, T} multipliers.  These are defined in the
gsDPWord() RDP input format for precise control over these carefully chosen 
coefficients.

Four triangles step {S, T} from 1023 to -1/32, and step W from 1.0 to -1/32K.

We also test the tile shift feature, S - SL and T - TL subtractors, and the 
wrap/mirror/clamp features.

The second vector set exercises the TC unit's ability to compute LOD (Level of
Detail), a necessary feature for mipmapped texture calculations.


TF (Texture Filter unit)

The TF unit is responsible for two functions; it performs a bilinear 
interpolation of four texel values received from the TM unit, and is capable of
performing half of the necessary matrix computations for the YUV to RGB color
space conversion.  The CC (Color Combiner) unit performs the final set of
computations for color space conversion.  Several display list tests, each of
which renders a frame of data, were created to test the TF unit.
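
For reference, a textbook bilinear interpolation of four texel values looks
like the sketch below.  It uses floating point for clarity; the arithmetic
and precision of the actual TF hardware are not described in this document
and may differ, and the argument names are hypothetical.

```python
def lerp(a, b, frac):
    """Linear interpolation from a (frac = 0) to b (frac = 1)."""
    return a + (b - a) * frac


def bilinear(t00, t10, t01, t11, s_frac, t_frac):
    """Blend four neighbouring texels by the fractional {S, T} position.

    t00/t10 are the texels on the upper row, t01/t11 on the lower row.
    """
    top = lerp(t00, t10, s_frac)
    bottom = lerp(t01, t11, s_frac)
    return lerp(top, bottom, t_frac)
```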

The first test exercises the YUV conversion multiplicands; a pair of triangles
65 pixels on a side is drawn.  This forces the steppers (ST) to iterate
through a range of {S, T} fractions, thereby exercising the TF's linear
interpolators better than a fixed T fractional value of 0 or 0.5 (which is what
you would get if the triangle width matched the texture width (64 x 64)
precisely).
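
The effect of the 65 pixel width can be checked with a simplified one
dimensional model in which 64 texels are stepped linearly across a span of
pixels.  The model and the function name are assumptions for illustration
and ignore the details of the real steppers.

```python
from fractions import Fraction


def s_fractions(pixels, texels=64, start=Fraction(0)):
    """Fractional part of S at each pixel when texels span the given pixels."""
    step = Fraction(texels, pixels)
    return [(start + i * step) % 1 for i in range(pixels)]
```

With 64 pixels the step is exactly one texel, so the fraction stays fixed at
the starting value (0, or 0.5 with a half texel offset).  With 65 pixels the
step is 64/65 of a texel and every pixel in the span lands on a different
fraction.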

The YUV conversion matrix coefficients are loaded with the nominal values
used for YUV conversion, then the triangles are rendered using a texture loaded
with embedded powers of two.  The act of rendering causes the YUV conversion to
occur (assuming the TF and CC have been properly programmed to perform this
conversion).  The same test is repeated using a random data pattern which is
loaded into texture memory (instead of the powers of two texture data pattern).

A third test renders a perspective corrected texture, using a powers of two
mipmap texture image.

The final test renders a lengthy set of textured rectangles, varying the
x, y, s, t coordinates through a carefully chosen range of values that provides
for maximal coverage of the TF logic.


CC (Color Combiner unit)

The CC has a great deal of programmability and range of operation.  The CC 
display list test renders triangles with carefully chosen coordinates and
color values, then repeats the rendering of these triangles with all of the
different CC rendering options available (this is done by enabling/disabling
the CC modes of operation with RSP display list commands).  For specific 
details on the rendering modes tested, consult PR/apps/rdpvector/cs_static.c,
and look at the contents of the "CC_dl" display list test.

BL (Blender unit)

The Blender unit of the RDP is responsible for Z-buffering, coverage updates to
the frame buffer (the coverage bit planes are used by the VI's anti-aliasing
filter), and color/alpha blending.

Special RSP microcode was written to compute every combination of pixel color
and alpha for every possible blender multiplicand.  Pixel colors and alpha 
values both come from the shade color of the primitive being rendered, except
in the alpha = 1.0 case, where alpha must come from the coverage value.

Separate display list tests using the standard graphics microcode were written 
to issue RDP commands which best exercise the many different rendering modes of
the blender.

MS (MemSpan unit)

The memspan unit is at the end of the RDP rendering pipeline, and is responsible
for all memory read-modify-write cycles.  A variety of address alignment cases
are handled by the memspan as special cases, and several state machines are
defined to generate a stall condition to the other RDP units whenever the 
memspan's internal span buffer is full and cannot be flushed or filled because
RDRAM access is blocked due to higher priority RDRAM access from the R4300 cpu,
VI, or AI units.  The "stall" condition is simply the absence of the "gclock"
signal.

The MS display list tests render triangles and rectangles in all of the MS
operating modes (load, copy, fill, 1-cycle, 2-cycle), in 16 and 32 bit pixel
modes.  The coordinate values chosen are intended to cover all possible span
address combinations.
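
The mode coverage described above can be written out as a simple test
matrix.  This sketch assumes every operating mode is paired with both pixel
sizes, which the text suggests but does not spell out; the names are
hypothetical.

```python
from itertools import product

MS_MODES = ("load", "copy", "fill", "1-cycle", "2-cycle")
PIXEL_BITS = (16, 32)


def ms_test_matrix():
    """Return every (operating mode, pixel size) pair as a list of tuples."""
    return list(product(MS_MODES, PIXEL_BITS))
```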