# Enhanced HyperSparc Challenges UltraSparc

Ross's "Colorado 4" Offers Speedy Upgrade for SPARC Systems



#### by Linley Gwennap

Refusing to yield to Sun in the SPARC processor race, Ross Technology continues to improve its HyperSparc line. The company disclosed its latest revision, known as Colorado 4, at the Microprocessor Forum in October. According to architect Mitch Alsup, the new design will match the integer performance of a 167-MHz UltraSparc but with a smaller die. Because Colorado 4 continues to use the 32-bit SPARC v8 architecture and an MBus interface, unlike UltraSparc, it offers a high-performance yet compatible upgrade to the large installed base of SuperSparc systems.

### Colorado 3 Now Shipping

HyperSparc is an excellent example of how a company that lacks the huge design resources of Intel and Sun can develop competitive products through an incremental design strategy. The original HyperSparc, known as Pinnacle, used a 0.8-micron CMOS process and TAB packages to reach speeds of 66 MHz (*see 060405.PDF*). A shrink to 0.5-micron CMOS, combined with a move to an MCM package, pushed the clock rate to 100 MHz for the first "Colorado" design. As Figure 1 shows, reducing the gate size to 0.4 microns boosted the frequency to 125 MHz for Colorado 2.

Figure 1. A roadmap of HyperSparc processors shows steadily increasing clock speed as the original design has evolved.

Ross architect Mitch Alsup describes the new features of the Colorado 4 CPU at the Forum.

While all of these parts use the same basic design, the current offering, Colorado 3, includes some new features. The biggest change is the addition of a second integer ALU. All HyperSparc processors use a simple two-way superscalar design, but the initial versions must pair simple integer operations with memory operations, branches, or FP instructions to reach the peak issue rate. The second ALU allows Colorado 3 to issue two simple integer instructions at once, improving throughput by 1-2%.
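The pairing restriction can be illustrated with a toy issue model. This is a minimal sketch, not Ross's actual issue logic: the instruction classes (`int`, `mem`, `br`, `fp`), the `can_pair` rule for non-integer classes, and the sample stream are all assumptions made for illustration, and data dependencies are ignored entirely.

```python
def can_pair(first, second, int_alus):
    """Return True if two adjacent instructions may issue together.

    Assumed rule: two simple integer ops need two ALUs; otherwise the
    pair must use different (hypothetical) functional-unit classes.
    """
    if first == "int" and second == "int":
        return int_alus >= 2
    return first != second


def issue_cycles(stream, int_alus):
    """Count issue cycles for an in-order, two-wide toy machine."""
    cycles = 0
    i = 0
    while i < len(stream):
        has_next = i + 1 < len(stream)
        if has_next and can_pair(stream[i], stream[i + 1], int_alus):
            i += 2  # both instructions issue this cycle
        else:
            i += 1  # only the first instruction issues
        cycles += 1
    return cycles


# Hypothetical instruction mix, heavy on simple integer operations.
sample = ["int", "int", "mem", "int", "int", "br"]
print("one ALU: ", issue_cycles(sample, int_alus=1), "cycles")
print("two ALUs:", issue_cycles(sample, int_alus=2), "cycles")
```

The benefit in this toy depends entirely on how often two simple integer operations sit next to each other, so the sample stream says nothing about the throughput gain on real code.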

The design team also reworked critical circuit paths, improving the clock speed to 150 MHz with the same 0.4-micron process as Colorado 2. The newer processor comes with larger caches (512K and 1M) than Colorado 2. The company is still characterizing Colorado 3's performance with SPEC95, but Alsup estimates that the processor will deliver 3.9 SPECint95 and 4.9 SPECfp95 (base) at 150 MHz with a 512K cache.

### Colorado 4 Adds Data Cache

Aimed at improving performance further, the Colorado 4 design is the most extensive update since the HyperSparc line debuted. A move to Fujitsu's 0.35-micron CS-60 process (see MPR 7/10/95, p. 16) increases the clock rate to 200 MHz. At this speed, however, it becomes difficult for the external cache to keep up with the core processor.

All previous HyperSparc designs lack an on-chip data cache, instead using the external cache as the first-level data cache and accepting a two-cycle load latency. Running the external cache slower than the CPU would have further increased the load latency, an unacceptable performance penalty. Instead, Ross added an on-chip data cache, as Figure 2 shows, reducing the load latency to one cycle. The tighter metal pitches of CS-60 allowed the company to add a 16K data cache and double the instruction cache size to a matching 16K, all without significantly increasing the die size.

As a result, Colorado 4 can use the same cache-memory chips as its predecessor, running the external cache at half of the CPU frequency. The added on-chip cache greatly reduces the number of accesses to the external cache, so halving the cache bandwidth has a limited impact on performance. Building the cache chips in the 0.4-micron process saves cost and eases the company's dependence on Fujitsu's most advanced process.
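The tradeoff can be sketched with the usual average-latency formula. Only the one-cycle on-chip hit and the earlier two-cycle external latency come from the text; the miss penalty to the half-speed external cache and the miss rate used below are hypothetical placeholders, and misses in the external cache are ignored.

```python
def avg_load_latency(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average load latency in CPU cycles for a single on-chip cache level."""
    return hit_cycles + miss_rate * miss_penalty_cycles


def break_even_miss_rate(miss_penalty_cycles, hit_cycles=1.0, flat_cycles=2.0):
    """Miss rate below which the on-chip cache beats a flat external latency.

    Solves hit_cycles + m * miss_penalty_cycles = flat_cycles for m.
    """
    return (flat_cycles - hit_cycles) / miss_penalty_cycles


# Assumed value, not from Ross: extra cycles to reach the external cache.
ASSUMED_PENALTY = 4.0
# Assumed value, not from Ross: on-chip data-cache miss rate.
ASSUMED_MISS_RATE = 0.05

print("average latency:", avg_load_latency(1.0, ASSUMED_MISS_RATE, ASSUMED_PENALTY))
print("break-even miss rate:", break_even_miss_rate(ASSUMED_PENALTY))
```

Under these assumptions, the on-chip cache wins whenever its miss rate stays below the break-even value, which is why a slower external cache can be tolerable.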

#### MICROPROCESSOR REPORT

Another major change is an improved floating-point unit. The original Pinnacle FPU, carried through successive versions, required three cycles for most FP calculations. Because the chip had only a single-precision multiplier, double-precision multiplication took four cycles. Taking advantage of the process shrink, Colorado 4 includes a larger FPU that executes most FP calculations in two cycles. A full DP multiplier eliminates the extra cycle for double-precision multiplication.

To deliver strong performance even for programs with large data sets, Ross increased the MBus speed to 66 MHz. Unfortunately, most MBus-based workstations run the bus at 40 or 50 MHz. A large portion of Ross's sales are in the upgrade market, and these processors must run at the system's bus frequency. A 200-MHz HyperSparc with a 1M cache loses little SPECint92 performance with a 40-MHz MBus, but some larger programs, including SPEC95, will lose significant performance in this configuration. New system designs can use the faster bus to reduce this performance degradation.

### UltraSparc Offers Stiff Competition

HyperSparc was originally designed to compete with SuperSparc, but it must now match up against Sun's UltraSparc. Alsup estimates that a 200-MHz HyperSparc with 1M of cache and a 50-MHz MBus will deliver 5.3 SPECint95, roughly matching a 167-MHz UltraSparc. The Ross chip is rated at 5.9 SPECfp95, however, well behind UltraSparc. It does hold a cost advantage: at 144 mm<sup>2</sup>, Colorado 4 will be much easier to build than the 315-mm<sup>2</sup> UltraSparc.
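A rough way to see the cost gap is to estimate gross dies per wafer from the two die areas. The sketch below uses a common textbook approximation and assumes a 200-mm wafer; the wafer size is an assumption, not a figure from Ross or Sun, and defect-limited yield (which would further favor the smaller die) is ignored.

```python
import math


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=200.0):
    """Approximate gross die count: wafer area over die area, minus edge loss.

    The 200-mm default wafer diameter is an assumption for illustration.
    """
    radius = wafer_diameter_mm / 2.0
    whole = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return math.floor(whole - edge_loss)


colorado4 = gross_dies_per_wafer(144.0)   # die area quoted for Colorado 4
ultrasparc = gross_dies_per_wafer(315.0)  # die area quoted for UltraSparc
print("Colorado 4 dies per wafer:", colorado4)
print("UltraSparc dies per wafer:", ultrasparc)
print("ratio:", round(colorado4 / ultrasparc, 2))
```

Because edge loss penalizes large dies more, the die-count ratio comes out somewhat higher than the raw area ratio of 315/144.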

UltraSparc is currently shipping, however, while Ross just taped out Colorado 4 in November. Ross hopes that the high degree of leverage from previous designs will allow it to ship Colorado 4 processors in 2Q96. We would not be surprised if this aggressive schedule slips to 3Q96, the time UltraSparc-2 (see **091505.PDF**) is expected to hit the streets. At just 148 mm<sup>2</sup>, this 0.3-micron UltraSparc will cost about the same to build as Colorado 4, but it is expected to deliver 8.5–11 SPECint95, far more than the Ross processor. For these reasons, Sun, which currently supplies 26% of Ross's revenue, is likely to cut back its HyperSparc orders.

Ross, of course, is working on a next-generation processor that it hopes will compete with UltraSparc-2. In the meantime, Colorado 3 and 4 are most appealing to the upgrade market. Sun has no plans to make UltraSparc work with either MBus or the older SunOS, while HyperSparc plugs directly into existing systems, offering far better performance than the MBus- and SunOS-compatible SuperSparc. With one million MBus slots in the field, the upgrade opportunity is large. ◆

Figure 2. Block diagram of Colorado 4 shows the new on-chip data cache and dual integer ALUs. The MCM also contains the CMTU (cache memory and tags unit) and up to 1M of level-two cache.

## Price & Availability

HyperSparc processor modules are shipping today at 150 MHz, with the 200-MHz version expected to ship in volume in 2Q96. In 10K quantities, a 150-MHz MBus module sells for \$1,350 with a 512K cache. No pricing has been announced for the faster version.

For more information, contact Ross Technology (Austin, Texas) at 512.892.7802; fax 512.892.3036, or check the Web at *www.ross.com*.



Figure 3. Die plot of the 4.9-million-transistor Colorado 4 CPU also shows the sizable caches. The die will measure 12 × 12 mm when fabricated in Fujitsu's 0.35-micron four-layer-metal CS-60 process.