relnotes.090795 13.7 KB
Sept 07 release RELEASE NOTES
=============================

Summary

	The following is a summary of changes since the last release.

	---------------------------------------------------------------------

	IMPORTANT: The memory map has changed. The final machine will have 3 MB
	of memory.  The color and z framebuffer can only reside in the
	first 2 MB of memory.  The boot code must reside in the 3rd MB of
	memory. Therefore, if you DMA data into the 3rd MB, you can potentially
	wipe out the code/data that was loaded duing boot time.

	A new application called midiDmon allows you to download your audio
	banks and playback on the Ultra64 with midi from keyboard or sequencer.
	To use this application, the ultra inst image depends on installation
	of the dmedia_eoe (version 5.5) image. The 5.5 dmedia_eoe is included
	on the inst tape sent to you.

	audio performance has dramatically improved on this release. We have
	rearranged code to minimize the ICache misses. The cost for audio
	is 1% per channel at 30Hz update for 22KHz output.

	There are 3 new demos. detail shows you how to use multitile textures.
	Including detailed textures, mipmaps etc. morphface shows you how to
	use the texturing engine to morph between 2 pictures. bumpmap shows you
	how to use texture lookup tables to create bump map surfaces. Gives it
	that not so "clean" look.

	The simple demos includes a audio/graphics task scheduler. We have fixed
	bugs in this scheduler and there are still more bugs to be isolated.
	If you are use this scheduler as a template for your game, you should
	reexamine this code.

	We are shipping 2 new tools to help make games more efficient. stacktool
	helps identify the maximum depth of stacks in each thread. gperf can
	help you identify where your code is spending most of its time.

	The graphics ucode has a new feature gSPClipRatio that lets you reduce
	clipping by using a large viewing frustum projection with RDP
	scissoring.  You can also turn near clipping on and off with
	gSPNearClip.  See the gSPClipRatio and gSPNearClip manpages.

	we now support user definable reverb effects.

	There has been a bug fix for the multi-thread window in gvd.
	After user has switched to a specific thread, he can then invoke
	the multi-thread window to obtain a complete list of all active threads
	running on the target machine. There is still a bug in showing the
	actual state of the threads in this window; that is, a stopped thread 
	can still be listed as running on the window. Therefore, user should 
	use the threadstatus utility to obtain accurate state information of 
	a specific thread.
	
	Currently, the debugger cannot support background mode debugging.
	That is, if a breakpoint is encountered, all user threads are stopped.
	gvd does not support private breakpoints per thread; all breakpoints
	are shared globally. Furthermore, the Continue button on the main gvd
	window only resumes execution of the thread being debugged. To resume
	or stop execution of all threads, user should use the Continue or Stop
	button in the multi-thread window.

	The fxSize field in the ALSynConfig struct has been replaced by a
	pointer which the developer must assign to an array of parameters
	which specify their custom effect

	The API call alSynSetFXType has been deleted

	The parameters to the API call alSynAllocFx have been changed. See
	the new man page

KNOWN BUGS and CAVEATS ======================================================

Graphics

	---------------------------------------------------------------------

	decal polygon bug

	Z-buffered decal polygons use the delta Z value for a pixel,
	defined as:

		delta_z = log2 (| dz_x | + | dz_y |)

	to determine whether two surface should be considered co-planar.
	A 'fuzzy' compare of the current pixel's Z and the previously
	stored Z is made.  A surface is coplanar if your have set the Z
	compare mode bits to decal and:

		(pix_z + delta_z >= memory_z) && (pix_z - delta_z <= mem_z)

	If this condition is met, the color is written and the mem_z is
	not updated.

	Unfortunately, when the two surfaces are perpendicular or nearly
	perpendicular to the eyepoint, the delta_z goes to zero and the
	Z comparison is not fuzzy anymore, but exact.  Because of differences
	in accumulated error in the rasterization of two planes that don't
	share the same vertices, the comparison will sometimes indicate the
	decal poly is in front and sometimes behind.  This will show up as
	random flashing in the decal as you move the eyepoint when the
	decal is approaching perpendicular.

	One hardware solution is to use the max of delta_z and some 
	hardware register value to ensure that a fuzzy Z compare is
	always possible.  Unfortunately, this bug was not found in time
	to fix in the second silicon.

	Other workarounds are to always make the decal and the underlying
	polygon share the same vertices.  This has the disadvantage of 
	wasting pixel visits.

	It may be possible to use texture decals rather than geometric
	decals.  This technique is discussed in the Texture document
	shipped to  developers.

	Always a possibility is to simply not use geometric decals.  Instead,
	break up the surface to use more polygons around the decal and
	remove the part of the polygon underneath the decal.  This increases
	the number of polygons but is more efficient in terms of pixel visits
	(fill rate).

	Finally, a microcode fix may be possible that clamps the dZy term
	during setup.  This workaround is currently being investigated and
	is _not_ in the released microcode.

	Note that this bug only applies to Z-buffered decals.  Some
	applications may be able to use non zbuffered decals.

	---------------------------------------------------------------------

	gDPLoadTextureBlock has a bug that some of the sizes results in
		incorrect texture loads.

	---------------------------------------------------------------------

	alpha X coverage in parallel with alpha dither bug

	---------------------------------------------------------------------

>	To get better texture accuracy, we scale the texture coordinates using
	perspective normalization gSPPerspNormalize() to retain more useful
	bits in the RSP fixed point transformation. We then rely on perspective
	correction hardware in the RDP to divide and regain the correct (s,t)
	value. Basically, this is a fixed point trick to scale up the number
	prior to divide in hope of retaining maximum significant bits.

	Anyways, this means that you need to turn on perspective correction.
	gDPSetTexturePersp(G_TP_PERSP) whenever you draw textured triangles.
	Even in orthographic projection. However, when doing texture rectangles
	where you specify (s,t) directly without any geometry transformation.
	You need to turn off perspective correction.
	
	---------------------------------------------------------------------

>	The RDP has a bug that causes some small thin triangles to hang
        the RDP pipeline.  The RSP graphics ucode detects MOST of these
        polys and prevents the RDP from rendering them.  As a result,
        some tiny thin triangles may 'drop out' from the rendered image.
        Even if your model doesn't contain any such triangles, they can
        be generated by the clipping step.

        There are a few very rare polygons which will hang the RDP pipeline
        but which are not detected and discarded by the RSP's ucode.  If
        you suspect such a hang, there are a few steps you can take to
	verify and/or avoid the problem:

        a) To determine whether the RDP is hung during rendering, block
           on OS_EVENT_SP and OS_EVENT_DP separately.  If you get an SP event 
	   for a particular frame but never get a DP event, then the DP
	   is hung.  The most likely cause is one of the "bad" triangles.

        b) There is a parameter defined in gbi.h that you can change to
           prevent hangs from the "bad" triangles at the cost of cracks
           in your images.  This parameter is called BOWTIE_VAL.
           The default value is zero.  Changing it to 4, 8, or 12 will give
           increased protection against hangs from bad triangles, with
	   progressively poorer quality images.

	   The BOWTIE_VAL parameter needs to make its way to the RSP where
	   it is actually used.  The gSPTexture() and gsSPTexture()
	   commands are the ones that send the value down to the RSP.  If
	   you wish to make use of BOWTIE_VAL you need to include one of
	   these commands before any triangles which might require it are
	   rendered.  If you're not using textures, it's OK to use something
	   like gSPTexture(glist++, 0, 0, 0, 0, 0), but you MUST use some
	   form of the gSPTexture command.

           You can change BOWTIE_VAL in gbi.h and then rebuild your entire
	   game; OR, you can do a

	   #undefine BOWTIE_VAL
	   #define   BOWTIE_VAL 4	(or 8, or 12)

	   in EVERY source file which uses a gSPTexture() or gsSPTexture()
	   command. The effect will be the same.

	The RDP hardware bug which causes these hangs will be fixed in the
	next release of hardware.  At that point the RSP ucode workaround
	for the bug and support for BOWTIE_VAL will be removed.
	---------------------------------------------------------------------

>	The RDP has a bug that gDPLoadTextureTile* does not work for certain
	texture pointer alignments and texture line sizes.

	This will be fixed on the next release of hardware.

	A workaround for gDPLoadTextureTile* has been implemented.  This uses
	the RDP load_block command to simulate loading each line of the tile.
	By using this workaround, the user *does not* need any special 
	alignment of tile pointers other than the normal 64b alignment
	restriction of the texture image pointer.

	The workaround creates about 33% more DMA traffic than the real 
	load_tile primitive.  It also adds commands to your dynamic display 
	list.  The usage for the gDPLoadTextureTile* commands doens't change, 
	but you must include gu.h in you app (already present if you include 
	ultra64.h). You can determine how many commands will be added to your 
	display list by using the guGetDPLoadTextureTile*Sz function.  Note
	that this workaround only works for dynamic display lists.

	---------------------------------------------------------------------

>	The RDP has a bug that gDPLoadTLUT does not work correctly when
	replicating each index entry 4x for the 4 banks for TMEM.

	In order to work around this bug, TLUTs must be replicated in DRAM
	and loaded using the load_block RDP primitive.  The g*LoadTLUT*
	macros can be used as before (they've been changed to use load_blocks
	internally), but the table that gets loaded should look like:
		0a 0b 0c 0d
		1a 1b 1c 1d
		2a 2b 2c 2d
		.  .  .  .
		.  .  .  .
		na nb nc nd

	where each short a,b,c,d are identical RGBA16 or IA16 values and
	n is the number of entries in the table.  If you use rgb2c to create 
	your textures, use the -Q option to create quadricated TLUTs.
	When this bug is fixed in the latest hardware, users will have to 
	convert their TLUTs back to one entry per line (just the 'a' column 
	above, for example).

	The performance hit is estimated to be about 2x a normal TLUT load.
	The workaround TLUTs will use 3x the DRAM of normal TLUT.

	This is fixed on the next rev of the chip

	---------------------------------------------------------------------

>	The RSP/RDP pipeline has an algorithmic bug in LOD calculation. LOD
	calculation is incorrect when clipping. This error is most noticible
	when the eye point is near a mipmapped textured surface in perspective.
	The surface may look blurry.

	---------------------------------------------------------------------

>	The 'lodscale' parameter of gSPTextureL() is not implemented. 
	This macro behaves just like gSPTexture().

	---------------------------------------------------------------------

>	gSPCullDisplayList() is not implemented.
	
	---------------------------------------------------------------------

>	gSPMatrixState() is not implemented.
	
	---------------------------------------------------------------------

>	When using the LAN2 and LAF2 video modes (especially LAF2), some
        glitches in the video output can occur.  When the glitch occurs,
	a portion of a displayed scanline will be corrupted for
	a single frame.  The bug is related to timing of memory accesses with
	the 50 MHz system frequency, and so it is sensitive to other memory
        activity (including RDP rendering) in the system.  This problem will
        not occur on the final hardware.  The bug will not cause any side
        effects in your game; it only affects the displayed video.  Thus,
        the workaround is to simply ignore the glitches.

	---------------------------------------------------------------------

>	The Video output pipeline which performs scaling on frame buffer data
	has a pipeline delay bug which causes 2 pixel wide artifacts to be seen
	on scaled pixel data whenever a scale factor other than .5 or 1.0 is 
	used.  This bug will be fixed in the next version of the RCP.
	
	The video scaling hardware was not designed to support scale factors 
	less than .25, and doing so results in undesirable image breakup; the
	os calls osViSetXScale(), osViSetXScale() now have assertion checking
	for a minimum scale factor of .25 (formerly they asserted for a 
	minimum scale factor of 0.0).

	---------------------------------------------------------------------

>	The Line microcode has 3 known bugs:

        o The flag parameter is currently required but ignored
        o Nearly horizontal lines are shaded incorrectly
        o Line Zbuffer values are scaled incorrectly

	---------------------------------------------------------------------

Audio
	---------------------------------------------------------------------

Operating System

Tools

	---------------------------------------------------------------------

Demos

	---------------------------------------------------------------------

Debugger

	---------------------------------------------------------------------