testvector.txt
1. Introduction
This document describes the test vector development environment for the RCP.
It briefly explains the partitioning of tests, the tools and methodologies
used, then describes the purpose of each module's directed tests.
2. Test Partitioning
We have grouped the test vectors into four categories:
a) i/o (AI, MI, PI, SI, VI) (these modules will be described below)
b) misc (imem_bist, io_pad, sp_mem, tmem_bist, vi_ms_buf)
c) rsp
d) rdp
Four directories with these names (io, misc, rsp, rdp) are maintained in the
PR/hw2/chip/vector directory, and contain the source files used to generate
the test vectors.
3. Tools and Methodologies
All test vectors are captured as ascii signal dumps from a verilog simulation
of the chip. The simulator takes its input from an external 'C' executable
named "iosim". This program reads a ".tst" command file, then communicates
with the verilog simulator to write to memory mapped locations as instructed
by the command file.
In the iosim test environment, all accesses to the RCP memory address space are
converted to the R4000 K1 segment space. Because the K1 segment is in kernel
space, each such access automatically causes UNIX to raise an exception. The
iosim exception handler then extracts the RCP physical address from the
exception address and, depending upon whether the faulting instruction was a
store or a load, sends the proper write/read request to the verilog process.
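The address conversion described above can be sketched as follows. This is an
illustration only, not the iosim source: it relies on the standard R4000 layout
in which the K1 segment (kseg1) spans 0xA0000000-0xBFFFFFFF and the low 29 bits
carry the physical address. The function names and the (kind, address) request
tuple are hypothetical.

```python
# R4000 kseg1: unmapped, uncached window at 0xA0000000-0xBFFFFFFF.
K1_BASE = 0xA0000000
SEGMENT_MASK = 0xE0000000  # top three bits select the segment
PHYS_MASK = 0x1FFFFFFF     # low 29 bits carry the physical address


def phys_to_k1(phys_addr: int) -> int:
    """Map a physical address into the K1 segment."""
    if not 0 <= phys_addr <= PHYS_MASK:
        raise ValueError("physical address does not fit in the K1 window")
    return K1_BASE | phys_addr


def k1_to_phys(fault_addr: int) -> int:
    """Recover the physical address from a faulting K1 address."""
    if fault_addr & SEGMENT_MASK != K1_BASE:
        raise ValueError("address is not in the K1 segment")
    return fault_addr & PHYS_MASK


def make_request(fault_addr: int, is_store: bool) -> tuple:
    """Build a hypothetical (kind, address) request for the verilog process."""
    return ("write" if is_store else "read", k1_to_phys(fault_addr))
```

For example, make_request(0xA4040010, True) yields a write request for physical
address 0x04040010 (an arbitrary example address).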
The verilog process (via the r4200b.v module) receives the request, generates
the proper signals on the MBUS to either read/write from/to a memory address
(i.e., RDRAM, SP registers), collects the data, and sends it back to iosim.
During all this time, iosim is blocked in the exception handler, waiting for
the response from verilog. Upon receiving valid data, the iosim exception
handler inserts the data into the proper CPU registers, increments the
exception PC, and returns from exception. The iosim application can then
continue to process the next command from the command file.
An important feature of the verilog simulator is its ability to initialize the
rdram model using a memory mapped file (a ".rdram" file) whose contents are to
be loaded directly into 1MB, 2MB, 4MB, or 8MB of rdram model memory. The
contents of the .rdram files often include RDP display lists and texture data,
which may then be rendered to the frame buffer (contained within the rdram
memory model) with a few simple memory mapped writes to RDP registers modeled
by the verilog simulator.
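A minimal sketch of this kind of initialization is shown below. The .rdram file
layout is not described here, so the sketch assumes a raw byte image copied to
offset 0 of the model, and it assumes the smallest supported size that fits is
chosen. Both are assumptions, as are the function names.

```python
MB = 1 << 20
SUPPORTED_SIZES = (1 * MB, 2 * MB, 4 * MB, 8 * MB)


def pick_rdram_size(image_len: int) -> int:
    """Return the smallest supported model size that holds the image."""
    for size in SUPPORTED_SIZES:
        if image_len <= size:
            return size
    raise ValueError("image is larger than the 8MB model")


def load_rdram_image(image: bytes) -> bytearray:
    """Copy a raw image to the start of a zero-filled memory model."""
    memory = bytearray(pick_rdram_size(len(image)))
    memory[:len(image)] = image
    return memory
```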
The test vector environment consists primarily of .tst files which provide
directed I/O tests to memory mapped locations modeled by the verilog simulator,
and .rdram files, whose embedded display lists exercise the indirectly
accessible logic of the RDP. If a display list written to exercise a
particular logic block of the RDP uncovers a fault, the pixel data written
to the frame buffer (and thus captured as test vectors) will be in error.
Unfortunately, there is no better, more direct way to test the RDP.
Finally, the raw ascii dumps (typically 200-800MB of information) are then
parsed into test vectors by tools external to the SGI environment (e.g. scripts
and programs on an HP workstation which provide vectors in the HP tester
format).
4. i/o test vectors
AI (Audio Interface)
The AI test vectors are generated from an iosim ".tst" file which writes to
programmed i/o (memory mapped) locations within the RCP such that the RCP is
initialized to perform audio DMA's from RDRAM to the AI. The memory activity
is captured at the external pins and extracted as test vectors. Many address
boundary conditions are exercised and tested. We wait for the interrupt which
signals DMA completion prior to starting the next DMA cycle.
AI counters are also tested; these counters are used to program the DAC output
rates and the sampling rates of the incoming audio data.
MI (Microprocessor Interface)
An iosim ".tst" file tests the following features of the MI block:
1. Ability to receive and generate coprocessor interrupts.
2. Ability to set ebus test mode, RDRAM register mode.
3. Perform a walking ones test through the SysAD bus (the basic i/o bus
of the R4300 microprocessor).
4. Perform multiple byte transfers across the SysAD bus (to test all
possible address boundary conditions).
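The walking ones pattern in item 3 can be illustrated with a short sketch. The
bus width is left as a parameter, and "transfer" is a hypothetical callable
standing in for whatever sends a word across the bus and returns the value read
back; neither is taken from the actual .tst file.

```python
def walking_ones(width: int = 32):
    """Yield words with a single 1 bit walking from bit 0 to bit width-1."""
    for bit in range(width):
        yield 1 << bit


def failing_bits(transfer, width: int = 32) -> list:
    """Return the bit positions whose walking-one word came back altered."""
    return [
        bit
        for bit, word in enumerate(walking_ones(width))
        if transfer(word) != word
    ]
```

A bus with bit 2 stuck at zero, modeled as transfer = lambda w: w & ~0x4, would
be reported as [2].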
PI (Peripheral Interface)
The PI vectors are generated from an iosim .tst file; in addition, a mock PIF
controller is implemented in the top level verilog file "pif_tasks.v", and
satisfies the PI test's reads from the PIF rom address space, as well as the
interrupts that would have been generated by the PIF during such read activity.
We do not test the PIF microprocessor; we just test the RCP's ability to read
and write the PIF address space.
We also explicitly test the following PI registers:
1. PI RDRAM address register
2. PI cartridge address register
3. Rom access register; this register is used to control the speed of the
AD16 (cartridge) bus.
The PI's ability to DMA to/from RDRAM is tested with differing lengths and
address boundary conditions. Finally, we test the PI's ability to read/write
from the different cartridge address domains (address spaces).
SI (Serial Interface)
The SI vectors are also generated from an iosim .tst file. The serial interface
is used primarily to read data from the handheld game controller, which is
formatted by the PIF microprocessor into the small 2K ram space maintained
within the PIF address space. The following features are tested:
1. Ability to read "preloaded" pif data from the dummy top level pif model.
2. Test 4 byte, 64 byte DMA's from PIF to RDRAM (and vice versa), on all
possible address boundaries.
3. PIO (programmed i/o) reads from the PIF rom area.
4. Back to back DMA's (should generate the DMA busy signal).
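One way to enumerate the boundary cases in item 2 is sketched below. The 8 byte
block size is a hypothetical alignment granule chosen for illustration, not a
documented property of the SI, and the function names are likewise invented.

```python
from itertools import product


def dma_cases(lengths=(4, 64), block=8):
    """Pair every transfer length with every byte offset inside one block."""
    return [
        (offset, length)
        for length, offset in product(lengths, range(block))
    ]


def crosses_block(offset: int, length: int, block: int = 8) -> bool:
    """True when a transfer starting at offset spills past its first block."""
    return (offset % block) + length > block
```

With the defaults this produces 16 (offset, length) pairs: 8 start offsets for
each of the two transfer lengths.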
VI (Video Interface)
These iosim .tst files configure the VI for a 320 x 32 framebuffer, then load
RDRAM with test pattern data. All VI registers are written (and read) as they
are configured for the framebuffer size & location.
The same tests are repeated, applying a 2x scale factor in the x & y directions.
5. Miscellaneous test vectors
The PR/hw2/chip/vector/rcp_misc_test directory contains verilog test programs
used to generate slow and fast speed tests for the RCP. An initial test
program for IDDQ testing is also implemented in this directory.
The tests provided include fast and slow speed versions for the I/O pads, as
well as tests which invoke the built-in self-test (bist) features of the rsp
IMEM (instruction memory) and TMEM (texture memory).
In addition to these tests, additional verilog tests were written to further
exercise the SP memories (IMEM, DMEM), the line buffer used by the VI block,
and the span buffer used by the MS (memspan) block.
6. RSP
Vectors are captured by running microcoded tests through a specially built
verilog based rsp simulator, and capturing the resultant signal dumps as raw
vectors. The signal dumps are then parsed into test vectors by scripts and
programs maintained on our HP workstation.
This simulator was intended for daily regression tests of the RSP logic, and
does not include the RDP logic; it also includes special verilog modules which
dump the state of external RCP pins for vector capture purposes.
Blocks tested include the vector and scalar processing units (vu, su), the
data memory (dmem), the instruction memory (imem), the divisor control logic
(divctl), the divisor rom table (divrom), the load/store unit (ls), and the
io_logic unit (io_logic). Vector coverage for the io_logic block is also
provided by several of the RDP test vector modules.
The vectors for the io_logic block are created using the iosim 'C' interface
to the verilog simulator, so that we can easily exercise the memory mapped
hardware of the RSP.
7. RDP
Very few of the RDP logic blocks are accessible to a memory mapped directed
test approach. Only the memspan's span buffer can be tested in this fashion
(and is done so in the miscellaneous test category previously discussed).
We decided that the best method for achieving optimal coverage of the RDP logic
blocks was to generate and capture special RDP display lists, each of which
would target a particular logic block of the RDP. Several frames worth of
rendering were required to obtain good vector coverage for each module, and
the resultant RDP display lists were captured from the development system and
converted into the ".rdram" file format.
The tool used to generate these display lists was a Nintendo 64 application
program (written by Silicon Graphics) entitled "rdpvector". It utilizes special
RSP microcode which writes the RDP display list into an rdram buffer (the
standard RSP microcode writes the RDP display list commands directly into the
XBUS channel which links the RSP and RDP together). Once the RSP has written
the entire RDP display list into this buffer (maximum size is 64K bytes, or
1024 64 bit RDP commands), the rdpvector program utilizes a debug port to
dump (in ascii) the contents of this display list buffer, the contents of the
rdram texture segment (which is used by the RDP whenever a texture load command
is processed), and information regarding the framebuffer configuration needed
to execute this display list.
A second tool, rdpascii2rdram, converts the ascii formatted display list data
into a binary .rdram file suitable for use by the RCP verilog simulator.
Once captured as ".rdram" files, the RDP display lists (along with the texture
memory segments and pre-initialized frame buffer memories) serve as input to an
iosim managed verilog simulation process, which subsequently dumps the signal
data that comprise the chip test vectors. These raw test vectors are then
parsed into the HP test vector format by additional scripts and tools that run
on our HP workstation.
A description of the display lists used to test the major RDP logic blocks
follows:
CS (Command Shuffle unit)
The command shuffle unit buffers commands and arguments for the RDP pipeline,
converting them from an RSP format to an RDP format, possibly reordering data
and commands for better pipelining.
Two vector files are captured; the vectors for these groups of tests are
generated separately to avoid the need to run and test large vector files each
time a new test is added (or an old test is modified). The .rdram files used
to generate these vector files are run through the simulator separately, and
the resulting vectors are composited together.
The first vector file, "cs_vector.rdram", renders a pair of triangles whose
vertices extend from the minimum screen coordinates to the maximum supported
by the RDP (a range from 0 to 4091). These triangles walk bits through the
entire address range of the CS unit, the ST (stepper) unit, and the EW (edge
walker) unit.
The second vector file is similar to the first, except that the RSP generated
vertices were edited by hand and dumped into the RDP pipe as raw 64 bit
commands. The graphics header file "gbi.h" (graphics binary interface) provides
the definition for this special RSP display list command ("gsDPWord") which
dumps 64 bits of data directly into the RDP without any processing. By so
doing, more precise control over the vertex data was possible, allowing us to
provide better coverage for the design.
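To make the idea of a raw 64 bit command concrete, the sketch below packs two
32 bit halves into the 8 bytes of one command. The high-word-first ordering and
the big-endian byte layout are assumptions made for illustration; consult gbi.h
for the actual "gsDPWord" definition. The function names are hypothetical.

```python
import struct


def pack_rdp_command(word_hi: int, word_lo: int) -> bytes:
    """Pack two 32 bit halves into one 8 byte command (assumed big-endian)."""
    return struct.pack(">II", word_hi, word_lo)


def unpack_rdp_command(raw: bytes) -> tuple:
    """Split an 8 byte command back into its (high, low) 32 bit halves."""
    return struct.unpack(">II", raw)
```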
TM (Texture Memory unit)
A variety of frames are rendered, and the resulting display lists are captured
as a single .rdram file (thus only one vector file is produced for this module).
Frame 1:
Four textured rectangles of type RGBA are rendered in 16 bit mode, with data
patterns of 0xa5, 0x5a, random data, and the complement of this random data.
The rectangles are rendered once in 1-cycle mode, and once in copy mode. The
set of rectangles is rendered twice: once with the texture data loaded low
in tmem, and once with the texture data loaded high in tmem.
Frame 2:
Similar to Frame 1, except that the scene is rendered in 32 bit mode instead
of 16 bit.
Frame 3:
Similar to Frame 1, with YUV textures rendered in 16 bit mode.
Frame 4:
Similar to Frame 1, with Intensity Alpha textures rendered in 16, 8, and
4 bits.
Frame 5:
Similar to Frame 1, with Intensity textures rendered in 8 and 4 bits.
Frame 6:
Similar to Frame 1, with Color Index textures rendered in 8 and 4 bits.
Frame 7:
All types and sizes, random data, 1 cycle mode, with texture data loaded
into both low and high tmem (this is a compressed test, allowing a single
frame to capture the elements tested by the first 6 frames of this module's
test sequence).
Frame 8:
A copy mode rectangle is rendered using procedurally generated data whose
values best exercise the TM unit's logic.
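The four data patterns named for Frame 1 can be sketched as follows. Note that
0xa5 and 0x5a are bitwise complements of each other, so each pair of patterns
toggles every bit of the texture memory. The buffer size, the seed, and the
function name are hypothetical; the real textures live in the rdpvector
application.

```python
import random


def texture_patterns(n_bytes: int, seed: int = 0) -> dict:
    """Build the 0xa5, 0x5a, random, and complemented-random byte patterns."""
    rng = random.Random(seed)  # seeded so the random data is repeatable
    rand = bytes(rng.randrange(256) for _ in range(n_bytes))
    return {
        "a5": bytes([0xA5]) * n_bytes,
        "5a": bytes([0x5A]) * n_bytes,
        "random": rand,
        "complement": bytes(b ^ 0xFF for b in rand),
    }
```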
TC (Texture Coordinate unit)
The TC is responsible for addressing and retrieving data from the texture
memory on behalf of a rendered primitive. Conversely, it is also responsible
for loading texture data into texture memory from RDRAM.
Two sets of test vectors were prepared; the first display list renders triangles
which test the TC dividers and {S, T} multipliers. These are defined in the
gsDPWord() RDP input format for precise control over these carefully chosen
coefficients.
Four triangles step {S, T} from 1023 to -1/32, and step W from 1.0 to -1/32K.
We also test the tile shift feature, S - SL and T - TL subtractors, and the
wrap/mirror/clamp features.
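For readers unfamiliar with the wrap/mirror/clamp addressing modes, the sketch
below gives the usual textbook definitions on integer texel coordinates. It is
not a model of the TC hardware, which works on fixed point values and tile
descriptors that are not reproduced here.

```python
def wrap(coord: int, size: int) -> int:
    """Repeat the texture: coordinates fold back modulo the texture size."""
    return coord % size


def mirror(coord: int, size: int) -> int:
    """Repeat the texture, reversing direction on every other repetition."""
    period = coord % (2 * size)
    return period if period < size else 2 * size - 1 - period


def clamp(coord: int, size: int) -> int:
    """Hold the edge texel for coordinates outside the texture."""
    return max(0, min(size - 1, coord))
```

For a 4 texel wide texture, coordinates 0 through 7 map to 0, 1, 2, 3, 3, 2, 1, 0
under mirror, and to 0, 1, 2, 3, 0, 1, 2, 3 under wrap.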
The second vector set exercises the TC unit's ability to compute LOD (Level of
Detail), a necessary feature for mipmapped texture calculations.
TF (Texture Filter unit)
The TF unit is responsible for two functions: it performs a bilinear
interpolation of four texel values received from the TM unit, and is capable of
performing half of the necessary matrix computations for the YUV to RGB color
space conversion. The CC (Color Combiner) unit performs the final set of
computations for color space conversion. Several display list tests, each of
which renders a frame of data, were created to test the TF unit.
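As a reminder of what the interpolation computes, here is a generic bilinear
blend of four texel values. It uses floating point and the textbook formula; the
TF's fixed point arithmetic and exact rounding are not modeled, and the argument
names are hypothetical.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b by fraction t in [0, 1]."""
    return a + (b - a) * t


def bilinear(t00, t10, t01, t11, s_frac, t_frac):
    """Blend four neighboring texels using the S and T fractions.

    t00/t10 are the texels on the upper row, t01/t11 on the lower row.
    """
    upper = lerp(t00, t10, s_frac)
    lower = lerp(t01, t11, s_frac)
    return lerp(upper, lower, t_frac)
```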
The first test exercises the YUV conversion multiplicands; a pair of triangles
65 pixels on a side is drawn. This forces the steppers (ST) to iterate
through a range of {S, T} fractions, thereby exercising the TF's linear
interpolators better than a fixed T fractional value of 0 or 0.5 (which is what
you would get if the triangle width matched the texture width (64 x 64)
precisely).
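The effect of choosing 65 pixels rather than 64 can be checked with a little
exact arithmetic. The sketch assumes the 64 texel texture is mapped linearly
across the primitive and sampled once per pixel; the "start" offset models
sampling at a texel edge (0) or a texel center (1/2). These are simplifying
assumptions, not a description of the stepper hardware.

```python
from fractions import Fraction


def coord_fractions(pixels: int, texels: int = 64, start=Fraction(0)) -> set:
    """Return the distinct fractional parts seen while stepping a coordinate."""
    step = Fraction(texels, pixels)  # texels advanced per pixel
    return {(start + i * step) % 1 for i in range(pixels)}
```

With 64 pixels the step is exactly one texel, so the fraction never changes;
with 65 pixels the step is 64/65 and every pixel sees a different fraction.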
The YUV conversion matrix coefficients are loaded with the nominal values
used for YUV conversion, then the triangles are rendered using a texture loaded
with embedded powers of two. The act of rendering causes the YUV conversion to
occur (assuming the TF and CC have been properly programmed to perform this
conversion). The same test is repeated using a random data pattern which is
loaded into texture memory (instead of the powers of two texture data pattern).
A third test renders a perspective corrected texture, using a powers of two
mipmap texture image.
The final test renders a lengthy set of textured rectangles, varying the
x, y, s, t coordinates through a carefully chosen range of values that provides
for maximal coverage of the TF logic.
CC (Color Combiner unit)
The CC has a great deal of programmability and range of operation. The CC
display list test renders triangles with carefully chosen coordinates and
color values, then repeats the rendering of these triangles with all of the
different CC rendering options available (this is done by enabling/disabling
the CC modes of operation with RSP display list commands). For specific
details on the rendering modes tested, consult PR/apps/rdpvector/cs_static.c,
and look at the contents of the "CC_dl" display list test.
BL (Blender unit)
The Blender unit of the RDP is responsible for Z-buffering, coverage updates to
the frame buffer (the coverage bit planes are used by the VI's anti-aliasing
filter), and color/alpha blending.
Special RSP microcode was written to compute every combination of pixel color
and alpha for every possible blender multiplicand. Pixel colors and alpha
values both come from the shade color of the primitive being rendered, except
in the alpha = 1.0 case, where alpha must come from the coverage value.
Separate display list tests using the standard graphics microcode were written
to issue RDP commands which best exercise the many different rendering modes of
the blender.
MS (MemSpan unit)
The memspan unit is at the end of the RDP rendering pipeline, and is responsible
for all memory read-modify-write cycles. A variety of address alignment cases
are handled by the memspan as special cases, and several state machines are
defined to generate a stall condition to the other RDP units whenever the
memspan's internal span buffer is full and cannot be flushed or filled because
RDRAM access is blocked due to higher priority RDRAM access from the R4300 cpu,
VI, or AI units. The "stall" condition is simply the absence of the "gclock"
signal.
The MS display list tests render triangles and rectangles in all of the MS
operating modes (load, copy, fill, 1-cycle, 2-cycle), in 16 and 32 bit pixel
modes. The coordinate values chosen are intended to cover all possible span
address combinations.