# THE INSIDERS' GUIDE TO MICROPROCESSOR HARDWARE

#### VOLUME 7 NUMBER 15

#### NOVEMBER 15, 1993

# Sun Enters Chip Market

# First Product, MicroSPARC-2, Increases Low-End Performance

#### by Linley Gwennap

Sun Microsystems, hoping to spur the dormant SPARC market, will become a vendor of microprocessors and chip sets through its SPARC Technology Business (STB) arm. The company will provide an alternative to SPARC processor sources, such as TI and Fujitsu. Sun also plans to market ASICs from its current and future workstations.

The STB unit now consists of more than 400 people, including all the SPARC processor designers within Sun and its own sales and marketing force. STB continues as part of Sun Microsystems Computer Company (SMCC), Sun's hardware systems subsidiary. The group's goal is to increase the penetration of SPARC in the computer market.

Because Sun is by far the largest consumer of SPARC processors, some CPU vendors have shown little interest in selling to other customers. For example, TI refused to sample SuperSPARC chips outside of Sun until well after Sun had announced systems, delaying the development of any compatible systems. Sun has also caused problems by not making key ASICs available or withholding them for months after system release.

Sun says it has turned over a new leaf. For example, STB's early access program (*see* **0712MSB.PDF**) gives other companies an opportunity to develop SPARC systems in parallel with SMCC's own developments—for a \$750,000 fee. STB announced Sun's next-generation graphics controller (see sidebar below) a full two weeks before SMCC announced systems using this chip. STB hopes to extend this interval in the future.

By selling processors and ASICs, STB plans to offer one-stop shopping to companies interested in building SPARC systems. The new division hopes to be more active than current CPU vendors in promoting the SPARC architecture to new system makers.

It is not clear whether STB will offer lower prices for processors than its current vendors. While Sun has the disadvantage of relying on external companies to build the chips, it has the advantage of not paying itself royalties on the design. Ultimately, prices should be comparable in most cases; for example, STB is quoting \$179 for microSPARC chips in 1K volumes, while TI has priced the chip at \$179 in 10K volumes.

#### MicroSPARC-2 Upgrades Low End

The first product being released under this new strategy is microSPARC-2 (MS-2), a follow-on to Sun's original microSPARC processor (MS-1). Although the Sun-designed chip is being built by Fujitsu, STB is the first to announce availability of samples. Eventually, both companies will market MS-2. Neither company, however, is willing to discuss volume pricing at this time.

The new chip is being introduced just one year after the first microSPARC processor. Like its predecessor, MS-2 is a highly integrated, low-cost system solution. By quadrupling the cache sizes and increasing the clock frequency, Sun has increased its low-end performance from that of a 486DX to near a low-end Pentium.

Due to its high integration, MS-2 maintains a low system cost; it achieves its high performance without external cache and without memory or bus interface chips. MS-1 is used in Sun's SPARC Classic system, which has a fully-configured list price of \$4,295; systems using MS-2 could be priced around \$5,000.

Although MS-1 is being built by TI, Sun decided to change tacks and use Fujitsu's 0.5-micron CMOS process for the new chip. Fujitsu, a long-time SPARC supporter, has been slowly building its own SPARC business. Its state-of-the-art fabrication process allows the larger caches and faster clock rate (70 MHz versus 50 MHz) that provide much of the performance boost from MS-1. Sun's engineers also made many logic-design changes to improve performance.

#### Workstation on a Chip?

The new chip includes all the system features of the original microSPARC, creating the core of a low-cost workstation. A SPARC Classic, for example, consists of little more than an MS-1 chip, SIMM sockets for DRAM, a graphics chip, and two NCR peripheral chips that provide disk, keyboard, and other interfaces. MS-1 includes the CPU, FPU, cache, MMU, memory controller, and SBus interface, to which the graphics and peripherals are connected.

Figure 1 shows a block diagram of MS-2. It is similar to MS-1's; the only new block is a PLL clock generator, which allows the use of a lower-frequency clock input rather than the  $2\times (100 \text{ MHz})$  clock used by MS-1. For MS-2, the PLL accepts a half-speed clock (35 MHz), which simplifies system design.

Compared with MS-1, other changes are the larger caches, write buffer, and TLB. The 24K of on-chip cache, four times the amount in MS-1, is comparable to other leading microprocessors. The caches are now virtually tagged to improve timing over MS-1's physical caches.

The larger caches are particularly significant to microSPARC due to the lack of a secondary cache; the improved hit rate reduces the number of lengthy main-memory accesses. (Neither microSPARC chip can support even an externally controlled secondary cache due to the integrated memory controller.)

To improve performance when the cache does miss, the new design features a 64-bit cache-refill bus, twice the width of MS-1's. The write buffer is increased from a single entry to four entries, to help prevent the processor from stalling on a sequence of stores to memory or I/O. The size of the TLB is doubled, compared with MS-1, to 64 entries.

#### Branch Folding is Pseudo-Superscalar

The pipeline and instruction dispatch logic for MS-2 is quite similar to that in MS-1. One significant improvement is support for branch folding. This is similar to the technique used in Hobbit and the PowerPC 603; branches are detected early in the pipeline and the instruction flow follows the predicted outcome of the branch. Since branches are removed from the instruction stream before the execute stage of the pipeline, the integer unit can, in effect, handle two instructions in a single cycle if one of them is a branch.

Figure 1. MicroSPARC-2 contains a complete SPARC processor, with memory and SBus interfaces, on a single chip.

MS-2 uses a simple prediction scheme: all branches are predicted taken. If a folded branch is correctly predicted, there is no penalty; a mispredicted (not taken) branch costs only a single cycle. The misprediction penalty is kept low by the short (five-stage) pipeline and an instruction prefetch queue. This new four-entry queue normally contains the next four instructions; during branch handling, the processor will begin fetching instructions from the taken path, but instructions from the sequential path will still be in the queue when the branch condition is resolved. Thus, even if the branch is mispredicted, the correct instruction can still be quickly fetched from the queue, saving one cycle.

Most integer conditional branches can be folded, but branches that access the register file (such as a return from subroutine) cannot, since their destination is not immediately known. Such branches typically take two cycles to execute.
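The branch timings described above can be combined into a rough per-branch cost model. The Python sketch below is illustrative only: the three cycle costs come from the text, while the function name, the taken rate, and the fraction of foldable branches are hypothetical inputs chosen for the example, not measurements of MS-2.

```python
# Cycle costs as quoted in the text for MS-2.
FOLDED_TAKEN = 0       # folded branch, predicted taken and taken: no penalty
FOLDED_NOT_TAKEN = 1   # folded branch, mispredicted (not taken): one cycle
UNFOLDABLE = 2         # e.g. register-indirect return: typically two cycles


def avg_branch_cycles(taken_rate, foldable_fraction):
    """Average cycles charged per branch.

    taken_rate and foldable_fraction are assumed workload parameters
    in the range 0.0 to 1.0; they are not figures from the article.
    """
    folded_cost = (taken_rate * FOLDED_TAKEN
                   + (1.0 - taken_rate) * FOLDED_NOT_TAKEN)
    return (foldable_fraction * folded_cost
            + (1.0 - foldable_fraction) * UNFOLDABLE)


# Hypothetical mix: 60% of branches taken, 90% of branches foldable.
print(avg_branch_cycles(0.6, 0.9))
```

Under this model the always-taken prediction costs nothing when every branch is taken and foldable, and the average cost rises toward two cycles as the share of unfoldable branches grows.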

To keep the prefetch queue fed, the instruction cache delivers two instructions (64 bits) per cycle. The data-cache bus width has also been doubled to 64 bits. The latter change improves the timing of 64-bit FP loads, which take two cycles on MS-1 but only a single cycle on MS-2. Store timing is also improved from 2–3 cycles in the previous design to one cycle in the new chip, except for 64-bit integer stores, which take two cycles.

#### New Multiplier Speeds Floating Point

The faster doubleword loads and stores are a benefit only to floating-point code, since the integer registers are 32 bits wide. Along with the reduced execution time, the new chip eliminates the one-cycle FP load-use penalty in MS-1.

To further improve FP performance, Sun added a new multiplier to MS-2. The original FPU design was obtained from Meiko, a minisupercomputer vendor, but it does not deliver competitive performance, particularly on FP multiplication. While the old multiplier took nine cycles for a DP multiply, the new one takes only three. Since the rest of the FPU has not been changed, MS-2 has the unusual property of executing a floating-point multiply in one fewer cycle than an FP add.

The new multiplier is added to the Meiko FPU. This allows an FP multiply and FP add to execute in parallel in MS-2; in the old FPU, each FP operation must execute in series. Although FP instructions still must be dispatched one at a time, the new processor includes a three-entry FP instruction queue to reduce stalls in the integer unit; MS-1 uses a queue with only a single entry.

Figure 2. MicroSPARC-2, which uses a 0.5-micron CMOS process, requires 2.3 million transistors on a  $15.3\times15.2$  mm die.

#### System Interface Improved

In the spirit of making small improvements to nearly every feature of the chip, the MS-2 designers doubled the maximum amount of main memory to 256M by doubling the number of banks to eight. Up to two page-mode accesses can be active at one time, reducing memory latency.

The new chip allows for programmable memory timing and supports a variety of DRAM types. With 60ns DRAM, the new processor can do page-mode accesses every four cycles at 70 MHz (11-4-4-4 access pattern). MS-2 also allows a graphics accelerator to be optionally connected to the memory bus rather than to the much slower SBus.
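To put the 11-4-4-4 pattern in concrete terms, the following Python sketch converts it to elapsed time and burst bandwidth at 70 MHz. The access pattern and clock rate are from the text; the assumption that each of the four transfers moves 8 bytes (matching the 64-bit cache-refill bus) is ours, as is the function name.

```python
def burst_stats(pattern, clock_mhz, bytes_per_beat=8):
    """Return (cycles, nanoseconds, Mbytes/s) for one DRAM burst.

    pattern        -- cycles per transfer, e.g. (11, 4, 4, 4)
    clock_mhz      -- processor clock in MHz
    bytes_per_beat -- ASSUMED 8 bytes per transfer (64-bit bus)
    """
    cycles = sum(pattern)
    ns = cycles * 1000.0 / clock_mhz              # one cycle = 1000/MHz ns
    total_bytes = len(pattern) * bytes_per_beat
    mbytes_per_s = total_bytes / ns * 1000.0      # bytes/ns -> Mbytes/s
    return cycles, ns, mbytes_per_s


cycles, ns, rate = burst_stats((11, 4, 4, 4), 70.0)
print(cycles, round(ns, 1), round(rate, 1))
```

With these assumptions, a four-transfer burst takes 23 cycles, or roughly 329 ns, and each page-mode access after the first takes about 57 ns.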

The SBus controller has been tweaked to allow more clock dividers; this was necessary to handle the 70-MHz clock rate, since SBus is limited to 25 MHz and the original microSPARC design allowed for only a  $2\times$  difference between the CPU clock and the SBus clock. MS-2 can support up to a  $5\times$  ratio, providing for future upgrades as fast as 125 MHz.
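The divider arithmetic can be checked with a few lines of Python. This is a sketch under the assumption that the CPU-to-SBus ratio is a whole number and that the controller picks the smallest ratio that keeps SBus at or below 25 MHz; the function name is hypothetical.

```python
import math

SBUS_MAX_MHZ = 25.0  # SBus clock limit quoted in the text


def sbus_divider(cpu_mhz, max_ratio):
    """Smallest whole-number divider that keeps SBus within its limit.

    Returns (divider, sbus_mhz), or None if the required divider
    exceeds max_ratio. Integer ratios are an assumption of this sketch.
    """
    divider = math.ceil(cpu_mhz / SBUS_MAX_MHZ)
    if divider > max_ratio:
        return None
    return divider, cpu_mhz / divider


print(sbus_divider(50, 2))    # MS-1 at 50 MHz with its 2x limit
print(sbus_divider(70, 2))    # 70 MHz does not fit within a 2x limit
print(sbus_divider(70, 5))    # MS-2 at 70 MHz
print(sbus_divider(125, 5))   # the 125-MHz ceiling implied by a 5x ratio
```

A 70-MHz core needs at least a 3× divider, which puts SBus at about 23.3 MHz; the 2× limit of the original design cannot accommodate it.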

The new chip implements longer SBus burst transactions and pipelined DMA, more than doubling SBus throughput. Up to 16 TLB entries (25% of the total) can be allocated to I/O page-table entries to eliminate page lookups during DMA. MS-2 also supports five SBus slots, one more than MS-1.

The processor core operates at 3.3 V, but the chip can interface directly to 5-V SBus devices and either 5-V or 3.3-V DRAMs. Even with its lower operating voltage, the new chip dissipates more power than MS-1, due to its higher clock rate and much larger number of transistors. Maximum power at 70 MHz is 6 W, 33% more than MS-1. Typical power usage at that clock rate is about 5 W (estimated). During normal operation, most of the cache is powered down when it is not being accessed, which helps reduce power consumption.

## New Chips from STB

Along with microSPARC-2 samples, STB has also announced availability of a number of chips and chip sets used in Sun systems. Many have already been announced by their respective manufacturers; for example, STP2000 and STP2001 are STB's names for the two SBus combination-I/O chips that NCR markets as the 89C100 and 89C105 (*see* **061402.PDF**). These chips are used in Sun's SPARC Classic workstation.

One new product is the SX graphics chip set. The SX integrates graphics into the memory subsystem, taking advantage of MBus' 320 Mbytes/s peak bandwidth. The heart of the chip set is the SMC, which controls up to 512M of DRAM and 8M of VRAM. The SMC also contains a 2D-graphics accelerator (the CPU performs 3D-graphics calculations) that delivers high performance due to its close proximity to the memory system. Two other chips convey data from the VRAM serial ports to a standard DAC and perform 24-bit color translation. The SX chip set is currently available for \$933 in 1K quantities from STB.

Other chip sets are also available. For example, a 50-MHz SuperSPARC with cache control, memory control, SBus and EBus interfaces, and ISDN is priced at \$1,819 in quantities of 1K. A 60-MHz SuperSPARC with similar system logic plus the SX graphics chips costs \$3,276 for the same volume.

Although the primary target for MS-2 is desktop systems, Sun has included some power-management features to make the chip more attractive for portable systems. Tadpole, for example, markets notebooks using the Cypress 601 (SPARCstation 2) processor; the notebook vendor may pick up MS-2 for future products.

The new processor uses a fully static design. Since the chip is clocked through a PLL, however, external system logic cannot rapidly vary the clock frequency to save power. Instead, MS-2 takes advantage of its static design by offering a standby mode that stops the internal clocks to all logic blocks, although DRAM refresh and SBus clocks continue. In this mode, which is triggered by asserting the STANDBY pin, power consumption is reduced by 90% to about half a watt.
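The power figures quoted for MS-2 imply two numbers that the text does not state directly. The Python sketch below works them out; it assumes the 90% standby reduction is measured against the estimated 5-W typical figure, and the helper names are hypothetical.

```python
MS2_MAX_W = 6.0       # maximum power at 70 MHz (from the text)
MS2_TYPICAL_W = 5.0   # typical power at 70 MHz (estimate from the text)


def implied_baseline(value, percent_more):
    """Baseline that `value` exceeds by `percent_more` percent."""
    return value / (1.0 + percent_more / 100.0)


def after_reduction(value, percent_less):
    """Value remaining after a reduction of `percent_less` percent."""
    return value * (1.0 - percent_less / 100.0)


# MS-2's 6 W is described as 33% more than MS-1.
ms1_max_w = implied_baseline(MS2_MAX_W, 33)
# ASSUMPTION: standby savings are relative to typical power.
standby_w = after_reduction(MS2_TYPICAL_W, 90)

print(round(ms1_max_w, 1), round(standby_w, 1))
```

On these assumptions MS-1's maximum power works out to about 4.5 W, and the standby figure comes to the "about half a watt" cited above.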

#### TAB Package Rejected

While MS-1 is available only in a TAB package (*see* **071304.PDF**), the new chip is not offered in TAB at all and instead uses a 321-pin PGA. The TAB package shaves about \$20 off the manufacturing cost of the chip, but SMCC believes that it spends nearly that much in added board-level manufacturing costs. Given SMCC's ambivalence, the packaging decision rested with STB, which feels that the familiar PGA package will be easier to market to other customers.

The original microSPARC design took a lot of criticism for using 225 mm<sup>2</sup> of die area for just 800,000 transistors, which seems like a lot of silicon for a processor that achieves only 486DX performance. As shown in Figure 2, the new chip is even larger, at 233 mm<sup>2</sup>. To pack the larger caches and new functions onto the die, Sun turned to Fujitsu's 0.5-micron, three-layer-metal CMOS process, gaining a significant advantage over the 0.8-micron, two-metal process used by TI for MS-1.
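The die figures above can be cross-checked against the dimensions and transistor count given in the Figure 2 caption. The short Python sketch below does so; all inputs are from the article, and the variable names are ours.

```python
# MS-2: die dimensions and transistor count from the Figure 2 caption.
ms2_area_mm2 = 15.3 * 15.2
ms2_transistors = 2.3e6

# MS-1: die area and transistor count from the text.
ms1_area_mm2 = 225.0
ms1_transistors = 800_000

ms2_density = ms2_transistors / ms2_area_mm2   # transistors per mm^2
ms1_density = ms1_transistors / ms1_area_mm2
density_gain = ms2_density / ms1_density

print(round(ms2_area_mm2), round(ms1_density), round(ms2_density),
      round(density_gain, 2))
```

The die dimensions multiply out to the quoted 233 mm<sup>2</sup>, and the new chip packs roughly 2.8 times as many transistors into each square millimeter as MS-1 does.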

The more advanced process results in nearly twice the wafer manufacturing cost, however. According to the MPR Cost Model (*see* **071004.PDF**), MS-2 will cost about \$195 to manufacture, more than twice the \$85 cost of its predecessor. The cost increase is due to the more expensive process and the PGA package. This compares with \$230 for Digital's 21066 and about \$200 for the forthcoming 0.6-micron Pentium (P54C).

Neither STB nor Fujitsu has announced volume pricing for MS-2, but STB indicates that the new part will supplement, not replace, MS-1. We expect that the 70-MHz MS-2 will be priced around \$350 in thousands, while MS-1 will continue to be available for low-end systems at its current price of \$179. The new price would compare well with the 40-MHz SuperSPARC, which TI offers for \$300 in quantities of 10,000, since MS-2 includes memory and SBus interfaces not found in the TI chip.

| System              | SPARC Classic | Sun prototype | SPARCstation 10 Model 40 | Sun prototype | SPARCstation 10 w/upgrade |
|---------------------|---------------|---------------|--------------------------|---------------|---------------------------|
| Processor           | TI MS-1       | Fujitsu MS-2  | TI SuperSPARC            | TI SuperSPARC | Fujitsu hyperSPARC        |
| Clock Rate          | 50 MHz        | 70 MHz        | 40 MHz                   | 60 MHz        | 66 MHz                    |
| Cache (on/off-chip) | 6K/none       | 24K/none      | 36K/none                 | 36K/1M        | 8K/256K                   |
| espresso            | 26.2          | 49.1          | 44.9                     | 69.5          | 62.1                      |
| li                  | 20.7          | 54.6          | 48.9                     | 77.3          | 71.1                      |
| eqntott             | 51.9          | 106.8         | 82.7                     | 127.0         | 88.1                      |
| compress            | 18.5          | 33.6          | 29.0                     | 42.1          | 42.0                      |
| sc                  | 34.2          | 64.6          | 71.0                     | 115.9         | 78.0                      |
| gcc                 | 18.8          | 39.8          | 42.7                     | 62.2          | 55.5                      |
| SPECint92           | 26.4          | 54.0          | 50.2                     | 76.9          | 64.6                      |
| spice               | 17.0          | 34.5          | 38.4                     | 66.4          | 49.9                      |
| doduc               | 17.0          | 36.1          | 56.5                     | 96.8          | 77.2                      |
| mdljdp2             | 24.6          | 52.9          | 65.0                     | 103.1         | 116.6                     |
| wave5               | 13.7          | 26.9          | 39.8                     | 68.4          | 60.9                      |
| tomcatv             | 20.8          | 52.3          | 71.8                     | 86.9          | 58.8                      |
| ora                 | 31.2          | 66.3          | 127.9                    | 191.1         | 157.9                     |
| alvinn              | 34.0          | 87.3          | 123.2                    | 209.3         | 137.1                     |
| ear                 | 28.9          | 59.5          | 78.8                     | 113.3         | 119.3                     |
| mdljsp2             | 14.5          | 29.0          | 33.2                     | 50.3          | 58.7                      |
| swm256              | 13.3          | 31.6          | 46.4                     | 49.5          | 49.7                      |
| su2cor              | 25.9          | 47.7          | 55.0                     | 136.0         | 109.4                     |
| hydro2d             | 22.3          | 35.4          | 49.8                     | 93.7          | 83.4                      |
| nasa7               | 27.3          | 51.7          | 60.8                     | 110.6         | 94.6                      |
| fpppp               | 16.9          | 46.2          | 63.3                     | 121.3         | 86.5                      |
| SPECfp92            | 21.0          | 44.4          | 60.2                     | 98.1          | 85.5                      |
| Chip price          | \$179         | \$350 (est)   | \$450                    | \$1850\*      | \$1595\*\*                |

Table 1. MicroSPARC-2 nearly doubles the performance of MS-1 but lags high-end SPARC processors. \*includes cost of cache and cache controller. \*\*includes complete processor module. All prices in 1K quantities.

As shown in Table 1, the 70-MHz MS-2 offers about twice the performance of a 50-MHz MS-1 and matches the integer performance of a 40-MHz SuperSPARC with no external cache. The new processor significantly lags the performance of high-end SPARC processors, however, although it does offer better price/performance.
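The composite scores and the price/performance claim can be reproduced from Table 1. SPECint92 is the geometric mean of the six integer SPECratios, so the Python sketch below recomputes it from the individual rows and divides by the 1K chip price. The benchmark values and prices are copied from Table 1; note that the \$350 MS-2 price is this article's estimate, and the dictionary and function names are ours.

```python
import math

# Six integer benchmark results per chip, in Table 1 row order.
INT_RESULTS = {
    "MS-1 50 MHz":       [26.2, 20.7, 51.9, 18.5, 34.2, 18.8],
    "MS-2 70 MHz":       [49.1, 54.6, 106.8, 33.6, 64.6, 39.8],
    "SuperSPARC 40 MHz": [44.9, 48.9, 82.7, 29.0, 71.0, 42.7],
}

# 1K-quantity chip prices from Table 1 (MS-2 price is an estimate).
CHIP_PRICE = {
    "MS-1 50 MHz": 179,
    "MS-2 70 MHz": 350,
    "SuperSPARC 40 MHz": 450,
}


def geomean(values):
    """Geometric mean, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


for chip, results in INT_RESULTS.items():
    score = geomean(results)
    per_dollar = score / CHIP_PRICE[chip]
    print(f"{chip}: SPECint92 ~ {score:.1f}, {per_dollar:.3f} per dollar")
```

The recomputed means agree with the table's SPECint92 row to within rounding. At the estimated price, MS-2 delivers more integer performance per dollar than the 40-MHz SuperSPARC and slightly more than MS-1.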

#### Approaching Pentium Performance

A system using the 70-MHz MS-2 should offer about the same performance as a low-end Pentium system, or about 20% less than a maxed-out 66-MHz Pentium box. The new SPARC chip should sell for several hundred dollars less than a Pentium processor, however. Since MS-2 requires no external cache or system-logic chips, the system cost difference could be even greater.

The new Sun processor is comparable to Digital's 21066 (*see* **071201.PDF**) in its integration of memory and bus interfaces. The Alpha processor delivers 40% more performance but requires an external cache to achieve this result. The 166-MHz 21066 costs \$424, a price that STB needs to beat to be competitive.

Although MS-2 systems will not run Windows NT, unlike Pentium and 21066 systems, the three processors are likely to collide in the low-end technical desktop market. Many CAD applications will soon be available on Alpha and Pentium systems running under both NT and UNIX, which will give users a choice of hardware platforms. Digital expects configured 21066 systems to sell for \$3,000-\$4,000, and complete Pentium systems are already at this price point.

Since Sun's MS-1 system is priced above \$4,000, the company must make significant price cuts to match these other systems with MS-2 boxes, a problem given the higher cost of the new processor. While MS-2 will give a performance boost to users already committed to SPARC platforms, those with more freedom of choice may be attracted to other low-cost RISC systems or Pentium PCs.

#### Chip Strategy is Low Risk

Sun's entry into the chip business is the culmination of a series of actions over the past several months. First, the company detailed a roadmap of future SPARC processors (*see* **070404.PDF**). Soon after, STB was formed to license Sun's processor designs to vendors who want to manufacture their own modified chips (*see* **0705MSB.PDF**). Since then, the company has moved to make additional operating environments available on SPARC, including Novell NetWare and Windows NT (*see* **0710MSB.PDF**), and later announced its early access program.

The company appears concerned about SPARC's stagnant market share. SPARC systems account for more than half of all workstations sold today; the vast majority of these are from Sun. Having established such a dominant position, the company is finding it difficult to pry additional market share from entrenched rivals such as HP and IBM. With powerful systems such as Sun's Dragon (*see* **070301.PDF**), Cray's Superserver (*see* **0715MSB.PDF**), and Thinking Machines' CM-5, SPARC has made some inroads into the high-end market, but unit volume for these machines is tiny compared to workstation sales.

Thus, any significant increase in SPARC volume must come from expanding into the low-cost desktop market, i.e., PCs. Sun itself, however, has built its corporate strategy around Solaris; directly marketing boxes running PC operating systems would conflict with this strategy. To solve this dilemma, Sun finds itself in need of SPARC partners.

While Sun has a history of empowering other SPARC system vendors with one hand and squashing them with the other, this dichotomy was due to the other vendors cloning Sun systems and competing directly. The new initiative could be more successful because it encourages companies to compete in markets in which Sun is not yet interested.

This lets Sun use a two-pronged strategy. First, the company is trying to expand its low-end sales with aggressively priced Solaris boxes, using WABI (*see* **0707ED.PDF**) to support PC software. Second, by increasing the availability of SPARC processors and chip sets, Sun is enabling its partners to build SPARC-based NetWare (and eventually Windows NT) systems that compete head-to-head with x86-based PCs.

"The design took less than one year from when it was fully staffed until tapeout...The first silicon booted UNIX at the target operating frequency on day one." Chris Yau, Sun Microsystems

### For More Information

STB has not announced price and availability for microSPARC-2. Contact STB at 415.336.2299, fax 415.336.0822.

The cost of supplying SPARC chips to these vendors is small, since Sun already has in-house expertise. If vendors insist on using these parts to clone Sun systems, the workstation leader can rely on its marketing prowess and channel ownership to prevent them from stealing Sun's market share. If these vendors instead attack the PC market, they expand the presence of SPARC and STB gets the resulting licensing revenue, without the hassle of banging its corporate head against the x86 wall.

Sun is nothing if not opportunistic. The company quickly dropped its x86 and 68000 workstation lines to promote its then-new SPARC architecture, and (all jokes aside) Scott McNealy was able to say "Motif" when it counted. If SPARC boxes with PC operating systems approach the sales of Sun's Solaris systems, watch for the company to change course. Encouraging other vendors to wring out new technology will make it easier for Sun to change if and when it becomes necessary. The company might even switch CPU architectures again if SPARC falters badly.

Unless that happens, however, Sun will try to position SPARC as the architecture of choice no matter which operating system succeeds. By increasing the supply of SPARC processors and chip sets, the new STB will encourage this trend.  $\blacklozenge$