Charles R. Johns and Taggart Robertson
Design Goals and Considerations
Graphics Subsystem Organization
Graphics Subsystem Block Diagram
PAX attaches directly to the PowerPC 601 Microprocessor
bus. Attaching directly to the
PAX Block Diagram
The frame buffer is constructed of specialized memory called video RAM
The serial output port is a dedicated high speed interface used by the Palette–DAC for scanning pixels out of the frame buffer VRAM. This interface eliminates the overhead of refreshing the screen from the parallel port, an overhead that more conventional DRAM interfaces often incur.
Block Write is a unique feature of VRAM which allows a constant color to be written to multiple locations within a single write cycle. This feature allows PAX to render up to 32 pixels in a single write cycle.
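As a rough illustration of why Block Write matters for fills, the sketch below counts the write cycles needed for one row of constant color. The 32 pixel block size comes from the text; the assumption that blocks are aligned to 32 pixel boundaries, and the function name, are hypothetical.

```python
BLOCK_PIXELS = 32  # pixels covered by one Block Write cycle, per the text


def block_write_cycles(x_start: int, width: int) -> int:
    """Count write cycles needed to fill `width` pixels starting at x_start.

    Assumes (hypothetically) that each Block Write covers one aligned
    group of 32 pixels, so a partially covered group still costs a cycle.
    """
    if width <= 0:
        return 0
    first_block = x_start // BLOCK_PIXELS
    last_block = (x_start + width - 1) // BLOCK_PIXELS
    return last_block - first_block + 1
```

Under these assumptions a 1280 pixel row that starts on a block boundary takes 40 cycles instead of 1280 single pixel writes.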
The Write per bit feature provides a write mask which the rendering engine uses to select which bits of the pixel are to be updated. This feature eliminates costly read / modify / write operations.
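The effect of a write mask can be modeled in a few lines. This is only a software model of the result: in VRAM the mask is applied inside the memory device, which is what removes the read / modify / write sequence. The function name and the 8 bit default width are assumptions for illustration.

```python
def masked_write(old: int, new: int, mask: int, bits: int = 8) -> int:
    """Return the stored pixel after a write-per-bit style update.

    Bits set in `mask` take their value from `new`; all other bits
    keep the value already stored in `old`.
    """
    full = (1 << bits) - 1
    mask &= full
    return (old & ~mask & full) | (new & mask)
```

For example, writing 0xFF through the mask 0x0F over a stored 0xAA changes only the low four bits and yields 0xAF.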
PAX’s 32–bit frame buffer architecture exploits these VRAM features and also employs advanced interleaving techniques, referred to as Pixel Interleaving and Load Clock Interleaving, to enhance performance. The architecture supports 1M to 5M bytes of memory, which allows screen resolutions ranging from 1024 x 768 to 1280 x 1024. With 2M bytes or more of memory, PAX can support a double buffer display.
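The memory figures can be sanity checked with simple arithmetic. The sketch below assumes 8 bits per pixel for the color planes (suggested by the 256 color discussion later in the text, but not stated here) and takes 1M bytes to mean 2^20 bytes; it ignores any additional planes the frame buffer may hold.

```python
MIB = 1024 * 1024  # assumed meaning of "1M bytes"


def buffer_bytes(width: int, height: int, bits_per_pixel: int = 8) -> int:
    """Bytes needed for one display buffer of color planes only."""
    return width * height * bits_per_pixel // 8


single_1024 = buffer_bytes(1024, 768)    # 786432 bytes, 0.75 MiB
double_1024 = 2 * single_1024            # 1572864 bytes, 1.5 MiB
single_1280 = buffer_bytes(1280, 1024)   # 1310720 bytes, 1.25 MiB
```

Under these assumptions, two 1024 x 768 buffers fit within 2M bytes, which is consistent with the double buffer threshold given above.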
The IBM RGB 530 Palette–DAC serves as the video controller. It provides four color palettes, video output (red, green, blue), display timings, VRAM serial port control, and two hardware cursors. This device also provides an on–chip programmable Phase Locked Loop (PLL) which allows the graphics subsystem to support a wide range of monitors with varying timing requirements.
As the block diagram and descriptions illustrate, most of the logic required for the graphics subsystem is integrated into the PAX and the IBM RGB 530 custom chips. This level of integration is key to keeping the cost of the subsystem to a minimum while enhancing the performance and function.
Direct Frame Buffer Access allows the Frame Buffer to be
accessed by the PowerPC 601 processor as if it were part of system memory.
This simplified interface is very effective in reducing the X Window software
development time. X Window software development uses DFA to directly map
the X Window System code received from Massachusetts Institute of Technology
(MIT) into the initial device driver for the adapter. Then, X Window software
The “Poly” command interface provides a rich set of rendering instructions which complement the DFA programming model. The word “Poly” refers to the ability to render multiple primitives of the same type with only one command, which eliminates the overhead of sending a new command with each primitive. The command set is designed to map closely to the X Window protocol. Accelerated functions issued through the “Poly” command interface run significantly faster than the same operations performed through DFA. These accelerated functions include line draw, area fill, and bit block transfer.
Rendering Engine Architecture
See Figure 2 for a block diagram of the internal architecture. These processing units provide the X Window server (X server) with dedicated hardware to accelerate the rendering of lines, points, and area fills. In addition to rendering, PAX also provides assistance for moving pixels to and from system memory or to and from another screen location. This function is referred to as Bit Block Transfer (Blits).
Vertices define the boundary of the region to be rendered. These vertices are included in ”Poly” commands. All vertices for these drawing commands can be sent as 16–bit, two’s complement, window relative coordinates. This 16–bit vertex provides the X server with a 64 K x 64 K virtual screen. The origin of this virtual screen space is at the center, thus allowing for both positive and negative X and Y addresses.
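The coordinate range follows directly from the 16–bit two's complement encoding. A minimal sketch of the conversion, with hypothetical helper names:

```python
def decode_coord(raw: int) -> int:
    """Interpret the low 16 bits of `raw` as a two's complement value."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def encode_coord(value: int) -> int:
    """Encode a window relative coordinate as a 16-bit field."""
    if not -32768 <= value <= 32767:
        raise ValueError("coordinate outside the 64 K virtual screen")
    return value & 0xFFFF
```

Each axis therefore spans -32768 to 32767, which places the origin at the center of the 64 K x 64 K space.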
With the Poly Line command, every vertex after the first defines a new line that starts where the previous line ended. This is useful for drawing connected lines. The Poly Segment command requires two vertices to define each line. This command is used to render multiple non–connecting lines.
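The difference between the two commands is easiest to see in how a vertex list expands into lines. The sketch below models only that expansion; the function names are hypothetical and are not PAX command encodings.

```python
def poly_line(vertices):
    """Connected lines: each vertex after the first ends a new line."""
    return list(zip(vertices, vertices[1:]))


def poly_segment(vertices):
    """Disjoint lines: vertices are consumed in pairs."""
    if len(vertices) % 2:
        raise ValueError("Poly Segment needs an even number of vertices")
    return list(zip(vertices[0::2], vertices[1::2]))
```

Four vertices therefore produce three connected lines with the first form and two independent lines with the second.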
The X Window protocol supports styled lines. These are
referred to as OnOffDashed and
PAX supports a set of dash counters which allow the server to define a line style with up to 8 unique segments or dashes. The X server selects between Dashed and Double Dashed by enabling transparent rendering.
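A software model of a dash pattern may help here. It assumes the pattern alternates between "on" and "off" runs starting with "on", as in the X Window protocol, and that transparent rendering means the "off" pixels are simply not written. The function name and the use of None for an unwritten pixel are illustrative choices, not a description of the hardware counters.

```python
from itertools import cycle


def dash_pixels(length, dashes, fg, bg, transparent):
    """Return the color written at each pixel along a styled line.

    `dashes` holds up to 8 run lengths. "On" runs use fg. "Off" runs
    use bg (Double Dashed) or None, meaning not written, when
    transparent rendering is enabled (Dashed).
    """
    if not 1 <= len(dashes) <= 8:
        raise ValueError("expected 1 to 8 dash segments")
    if any(run <= 0 for run in dashes):
        raise ValueError("each dash length must be positive")
    pixels = []
    on = True
    for run in cycle(dashes):
        for _ in range(run):
            if len(pixels) >= length:
                return pixels
            pixels.append(fg if on else (None if transparent else bg))
        on = not on
```

With dashes of [2, 1], the transparent form writes two foreground pixels and skips one, while the non-transparent form fills the skipped pixel with the background color.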
The major performance bottleneck for the line generation logic is the frame buffer. As mentioned earlier, PAX employs Pixel Interleaving and Load Clock Interleaving to reduce this bottleneck.
A span is a continuous row of pixels. Spans are the basic area fill primitives. All other types of area fills are broken down into spans by the internal area fill logic.
The triangle and quadrilateral fill logic uses the line
logic and another simpler line generator to find the edges of the area.
Since only two line generators (edge walkers) are available, PAX can only
support quadrilaterals which are convex in the Y direction. See Figure
3 for examples of the
Rectangles are a special case of a quadrilateral and are handled separately. Special casing rectangles provides additional performance since the overhead of walking the edges is eliminated. This extra performance is beneficial to window management and clearing areas on the screen.
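To see why rectangles need no edge walking, note that every span of a rectangle has the same start and end X. The sketch below decomposes a rectangle into (y, x_start, x_end) spans with inclusive ends; the tuple layout is an assumption made for illustration.

```python
def rectangle_spans(x, y, width, height):
    """Break a rectangle into horizontal spans (y, x_start, x_end)."""
    if width <= 0 or height <= 0:
        return []
    return [(row, x, x + width - 1) for row in range(y, y + height)]
```

A triangle or general quadrilateral would instead need the two edge walkers to recompute x_start and x_end for every row.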
Quadrilaterals, rectangles, and triangles can
be drawn in one of two modes: X Window compliant or Full Fill. The X Window
mode draws the fill area such that when an object is connected to other
objects along an edge, no pixel is written twice. This is accomplished
Figure 3 Area Fill Examples
Bit Block Transfer
Screen to Screen Blits are used to copy a block of pixels from one location on the screen to another. PAX automatically handles overlapping source and destination blocks so that the source block appears correctly at the destination.
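One common way to handle overlap, shown here as a sketch and not necessarily the method PAX uses, is to choose the copy direction so that each source pixel is read before it is overwritten. The frame buffer is modeled as a list of rows, which is an assumption for illustration.

```python
def blit(fb, src_x, src_y, dst_x, dst_y, w, h):
    """Copy a w x h block within fb, safe for overlapping regions."""
    # Moving down: copy bottom row first. Moving right: copy right column first.
    rows = range(h) if dst_y <= src_y else range(h - 1, -1, -1)
    cols = range(w) if dst_x <= src_x else range(w - 1, -1, -1)
    for r in rows:
        for c in cols:
            fb[dst_y + r][dst_x + c] = fb[src_y + r][src_x + c]


row = [[0, 1, 2, 3, 4]]
blit(row, 0, 0, 1, 0, 4, 1)  # shift four pixels right by one
```

After the call, row holds [[0, 0, 1, 2, 3]]; a naive left to right copy would have smeared the first pixel across the whole destination.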
Screen to Screen Blits continuously switch between frame buffer reads and writes. These transitions drastically reduce the usable frame buffer bandwidth. PAX has an internal buffer that reduces the number of transitions, which increases the utilization of the frame buffer’s bandwidth.
System to Screen and Screen to System Blits are used to
copy pixels between system memory and the frame buffer. There are two modes
of operation for these commands: Direct and Indirect. In Direct mode, system
software controls the pixel transfer from system memory and PAX only controls
the frame buffer address. In Indirect mode, the PAX chip becomes a master
of the PowerPC bus and completes the transfer of data with no processor
intervention. The software in
PAX supports a special System to Screen Blit mode which
accelerates character performance. When operating in this mode, each bit
of the data sent is interpreted as a pixel. The hardware renders the foreground
color for all the 1s in the data word. The background color is rendered
for all the 0s in the data word if transparency is disabled and nothing
is rendered if transparency is
These attributes modify and control the rendering functions by modifying vertices and pixel generation. Different attributes apply at different stages in the rendering process.
By implementing Boolean operations in hardware, the X server is relieved of the slow task of reading the frame buffer, modifying the color, and writing the new color to the frame buffer. This function is extremely important to applications which require Boolean operations.
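The operation being offloaded combines the incoming source color with the destination color already in the frame buffer. The sketch below models a few such operations in software; which operations PAX actually implements is not stated in the text, so the set shown here is an assumption.

```python
RASTER_OPS = {
    "copy": lambda src, dst: src,
    "and": lambda src, dst: src & dst,
    "or": lambda src, dst: src | dst,
    "xor": lambda src, dst: src ^ dst,
}


def apply_rop(op: str, src: int, dst: int, bits: int = 8) -> int:
    """Combine source and destination pixel values with a Boolean op."""
    return RASTER_OPS[op](src, dst) & ((1 << bits) - 1)
```

XOR is the classic example: applying the same source twice restores the original pixel, which is why it is popular for rubber band outlines and cursors.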
The Rectangular Clipping logic consists of four pairs of
extent registers. Each pair defines a rectangular region with either an
inclusive (all pixels inside the region are rendered) or exclusive (all
pixels inside the region are NOT rendered) attribute. The X server uses
these registers to define the window’s geometry to the PAX chip so that
pixels in the obscured sections of the window
Four regions are not always enough to define a window’s
geometry. Such is the case when the window is obscured in four or more
unique areas or when shaped (i.e. non rectangular) windows are used. For
these cases, the X server may render the window’s geometry to off screen
memory, referred to as clipping planes, with a unique ID. PAX then reads
the pixel’s corresponding clip plane value and compares it with the clipping
ID to determine if the pixel should be written to the
Applications draw using a window relative coordinate system. Vertices in this system must be converted to screen coordinates before they can be used to render an object. This conversion is accomplished, in hardware, by adding the Window Origin Offset to every pixel before it is rendered. Providing this capability in the PAX chip eliminates the task of converting the coordinates in software, which increases the rate at which the vertices are sent to PAX. Overall, adding the window origin offset in hardware increases the performance of processor bound primitives such as points.
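The conversion itself is a single addition per axis, which is the work moved from software into PAX. A minimal model, with hypothetical names:

```python
def to_screen(vertex, window_origin):
    """Convert a window relative (x, y) to screen coordinates."""
    return (vertex[0] + window_origin[0], vertex[1] + window_origin[1])
```

For a window whose origin is at screen position (100, 200), the window relative point (-5, 10) lands at (95, 210).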
Some applications require a different color palette than
the default. If only one palette is available, the colors of other windows
will change when focused on these types of applications. The PAX chip supports
an additional four planes of memory, referred to as Window ID planes, which
allow the X server to select a unique palette on a per pixel basis. These
planes also identify other attributes of the pixels such as: frame buffer
PAX supports a fixed 16 x 16 Stipple pattern. This pattern is addressed by the four least significant bits of the window coordinate of the pixel to be rendered. The value at that location is the stipple value for that pixel. PAX supports stippling for all rendering operations.
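The addressing described above can be modeled as follows. The text specifies only that the four least significant bits of the window coordinate select the pattern location; storing the pattern as 16 row words with bit 0 at x = 0 is an assumption made for this sketch.

```python
def stipple_bit(pattern, x: int, y: int) -> int:
    """Look up the stipple value for the pixel at window (x, y).

    `pattern` is a list of 16 integers, one 16-bit word per row.
    """
    return (pattern[y & 0xF] >> (x & 0xF)) & 1


# A checkerboard: even rows 0x5555, odd rows 0xAAAA.
CHECKER = [0x5555 if row % 2 == 0 else 0xAAAA for row in range(16)]
```

Because only the low four bits are used, the pattern tiles across the window every 16 pixels in both directions.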
Applying a transparent stipple pattern to an area fill
operation does not affect the ability to use the Block Write function.
However, applying an opaque stipple prevents the use of Block Write since
two colors must be rendered (foreground and background). This normally
reduces the performance of large opaque stippled objects by a factor of
approximately four. To prevent such a drastic drop in performance, the
PAX architecture employs a unique feature called Stipple
3D API Assist Functions
The Anti–Aliased line draw function provides the system software with the ability to render lines without the familiar problem of ”stair steps” or ”jaggies” (i.e. aliasing). This function uses a proprietary two pixel approximation technique to visually remove the aliasing caused by the discrete pixels on a raster display.
Sub pixel positioning of lines allows the system software to more precisely place the anti–aliased lines on the display.
Dithering is a technique which trades spatial resolution for more color resolution. Essentially a 24–bit color value is converted to an 8–bit value and then slightly modified based on the pixel’s position in the window. The overall appearance is that the graphics subsystem has more than 256 colors.
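The text does not describe the PAX dithering algorithm, so the following sketch substitutes a standard technique: ordered dithering with a 4 x 4 Bayer matrix, reducing 24–bit RGB to an 8–bit value with 3 bits of red, 3 of green, and 2 of blue. The matrix, the 3-3-2 layout, and the function names are all assumptions chosen to illustrate position dependent rounding.

```python
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_channel(value: int, bits: int, x: int, y: int) -> int:
    """Reduce an 8-bit channel to `bits` bits, rounding by position."""
    levels = (1 << bits) - 1
    base, frac = divmod(value * levels, 255)
    # Threshold varies across a 4 x 4 tile, so neighbors round differently.
    threshold = (BAYER_4X4[y & 3][x & 3] + 0.5) / 16 * 255
    return min(levels, base + (1 if frac > threshold else 0))


def dither_rgb332(r: int, g: int, b: int, x: int, y: int) -> int:
    """Pack a dithered 24-bit color into an assumed 3-3-2 byte."""
    return (
        (dither_channel(r, 3, x, y) << 5)
        | (dither_channel(g, 3, x, y) << 2)
        | dither_channel(b, 2, x, y)
    )
```

A mid-gray channel value of 128 rounds up at some pixel positions and down at others, so a region of that color averages to an intermediate shade that neither 3 bit level can represent alone.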