README file for ci4fb demo

Author: Rob Moore, rmoore@sgi.com

NOTE: This demo still needs some work. In particular, the combining of
the two texels is being screwed up somehow. Also, it should show
different combinations of +/- DxS and left/right major tile offsets.

This program is an example of how a 4-bit CI framebuffer would work.
It renders the image into an 8-bit buffer, packing two 4-bit pixels
into one 8-bit output pixel, then copies/converts the buffer to a
16-bit RGBA framebuffer. The 8-bit buffer is actually located in the
same memory as the final 16-bit framebuffer, so no extra memory is
used.

A quick description of the method used:

	2 cycle mode.
	no perspective.
	render 4-bit CI textures with TLUT *disabled* into 8-bit
	    framebuffer. No lookup table is needed at this point.
	step DxS by two texels per pixel (2*DxS).
	set up two tiles, first has SL=0, second has SL=DxS. This is
	    how each intermediate texel is 'stepped'.
	In the color combiner do the following operation to combine
	    the two 4-bit indexes: (T0 << 4) + T1.
	Bypass the blender and write the 8-bit output value.
	After all rendering, copy the 8-bit framebuffer to a 16-bit
	    RGBA framebuffer.

This rendering mode has several limitations:

	no perspective
	no anti-aliasing
	no zbuffering?
	half as much resolution in horizontal direction
	vertices located on 8-bit rather than 4-bit pixels
	requires microcode to generate DxS/tile for general case.

This program prints data concerning the fill rate performance for
triangles of different areas and aspect ratios. This data can easily
be plotted as a 3-D graph using gnuplot (see plot.gnu).

The program works by increasing the number of triangles until a frame
time is exceeded and then backing off slightly. At this point the fill
rate is recorded and the next type of triangle is drawn. The triangles
are drawn at random locations, but within the 320x240 screen so that
clipped triangles are not a factor.
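The search for the triangle count described above can be sketched in C
as follows. This is only an illustration, not the demo's code: the
function names, the step size, and the cost model in frame_time_ns()
(a fixed per-triangle overhead plus a per-pixel cost) are all
hypothetical stand-ins for actually timing a frame on the RDP.

```c
#include <assert.h>

/* HYPOTHETICAL cost model standing in for a measured frame time.
 * Assumes 2000 ns of overhead per triangle and 20 ns per pixel. */
static long long frame_time_ns(int ntris, int area_pixels)
{
    const long long setup_ns = 2000;      /* assumed, not measured */
    const long long ns_per_pixel = 20;    /* assumed, not measured */
    return (long long)ntris * (setup_ns + area_pixels * ns_per_pixel);
}

/* Increase the triangle count by 'step' until the frame budget is
 * exceeded, then back off one step to the last count that fit.
 * Returns 0 if even the first step does not fit. */
static int max_tris_per_frame(int area_pixels, long long budget_ns, int step)
{
    int n = step;

    assert(step > 0 && area_pixels > 0 && budget_ns > 0);
    while (frame_time_ns(n, area_pixels) <= budget_ns)
        n += step;
    return n - step;
}

/* Fill rate in pixels per second for n triangles drawn in one frame. */
static double fill_rate(int ntris, int area_pixels, long long budget_ns)
{
    return (double)ntris * area_pixels * 1e9 / (double)budget_ns;
}
```

A caller would run max_tris_per_frame() once per triangle type (one
area and aspect ratio), record fill_rate() for the result, and move on
to the next type.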
To define the extents of the graph, define the constants (in fill.h):

	ASPECT_HI
	ASPECT_LO
	ASPECT_STEP
	AREA_HI
	AREA_LO
	AREA_STEP

Note that only the RDP performance is measured. There is a certain
amount of DRAM bandwidth that is dedicated to loading the display list
into the input FIFO of the RDP. To get the most accurate answers, you
should run with video blanked (-b option) and skip frame buffer clears
(-c option).
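As a software reference for the 4-bit CI packing described at the top
of this file, the color combiner operation (T0 << 4) + T1 and the
final conversion to 16 bits can be sketched in C. This may help when
checking the texel combining on the hardware. The function names are
made up for this sketch. It also assumes that the high nibble is the
left pixel, that the conversion is a 16-entry palette lookup, and that
the output goes to a separate buffer (the demo itself reuses the
framebuffer memory).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two 4-bit color indexes into one 8-bit framebuffer value,
 * as in the color combiner step: (T0 << 4) + T1. */
static uint8_t pack_ci4(uint8_t t0, uint8_t t1)
{
    return (uint8_t)(((t0 & 0x0F) << 4) + (t1 & 0x0F));
}

/* Expand a row of packed bytes into 16-bit pixels. Each byte yields
 * two output pixels. ASSUMPTIONS: high nibble comes first, and
 * 'palette' is a 16-entry table of 16-bit RGBA values. */
static void expand_row(const uint8_t *packed, size_t nbytes,
                       const uint16_t palette[16], uint16_t *out)
{
    size_t i;

    assert(packed != NULL && palette != NULL && out != NULL);
    for (i = 0; i < nbytes; i++) {
        out[2 * i]     = palette[packed[i] >> 4];    /* T0 */
        out[2 * i + 1] = palette[packed[i] & 0x0F];  /* T1 */
    }
}
```

For a 320-pixel-wide screen, each row holds 160 packed bytes, so
expand_row() would be called with nbytes = 160 and an output row of
320 16-bit pixels.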