# Most Significant Bits

## Cyrix 6x86 Challenges Pentium

Formally unveiling its M1 processor, Cyrix became the first vendor to deliver a product that matches Intel's Pentium with software compatibility, hardware compatibility, and performance up and down the line. As expected, the company dubbed its chip the 6x86, following the lead of its 5x86 processor introduced earlier this year (*see 090901.PDF*).

The 6x86 is available immediately at a clock speed of 100 MHz. Cyrix set the 1,000-piece list price of its new part at \$450, well below the \$694 list price of Intel's 133-MHz Pentium. Even after Intel cuts the price of its chips on 11/1, the 6x86 should still hold a significant price advantage, putting it about on par with a Pentium-120.

To bolster its claim to Pentium-133 performance, Cyrix released additional performance data on the 6x86. Using Winstone 95 and Windows for Workgroups, the company measured its internal reference design at 217 Winstones using a 1M secondary cache and a Diamond Stealth Video 64 card with 2M of VRAM. In a standard Micronics motherboard, where the Cyrix chip is hampered by nonlinear burst ordering and a more reasonable 256K cache, the 6x86 achieves 196 Winstones using the same video card.

Published Winstone ratings for 133-MHz Pentium PCs range from 154 to 218, putting the 6x86 at the high end of this range. Similarly, on CPUmark16, Cyrix measured the 6x86 at scores of 271 and 287, while Pentium-133 systems rate from 260 to 289. Comparing comparably configured systems, we believe the 6x86-100 is somewhere between a Pentium-120 and Pentium-133 in PC application performance.

Cyrix's initial 6x86 contains 3.0 million transistors and measures 20 mm × 19.4 mm (394 mm<sup>2</sup>) in a 0.65-micron three-layer-metal CMOS process manufactured by IBM.

Given Pentium Pro's poor performance on 16-bit code (see **091001.PDF**), Cyrix claims the 6x86 is the fastest processor available for 16-bit Windows applications. In any case, Cyrix has demonstrated that it can compete with Intel's fastest processors using a part that can be dropped into existing motherboard designs.

PC makers AST, Epson, and Germany's Peacock will be the first companies to introduce 6x86 systems, which are expected to begin shipping by the end of this year. Compaq, which is quoted in Cyrix's press release, is also likely to pick up the 6x86. Motherboard vendors such as Acer and Micronics are planning to produce 6x86 boards, supplying numerous smaller PC makers. A wide variety of chip-set and BIOS vendors support the new processor as well.

Cyrix also reported good progress on the shrink version of the 6x86. The current part uses an enormous 394-mm<sup>2</sup> die, shown in the photo below, but the company is already testing a 210-mm<sup>2</sup> version built in a five-layer-metal 0.65-micron process with better metal pitches. This shrink will not only bring manufacturing cost to a manageable level but will also boost the clock speed to 120 MHz. A further shrink, to a 0.5-micron process, is due in 1H96 with clock speeds of 133 MHz, putting the pressure on Intel to keep ramping Pentium's frequency.
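To see why the shrink matters for cost, a rough sketch can estimate how many dies fit on a wafer at each die size. The 200-mm wafer diameter is our assumption (the text does not state what wafer size is used), the function name is ours, and the formula is a common first-order approximation that ignores yield, scribe lines, and die aspect ratio.

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 200.0) -> int:
    """Approximate whole dies per wafer, before any yield loss.

    Uses a common estimate: wafer area / die area, minus an edge-loss
    term of pi * diameter / sqrt(2 * die area).
    The 200-mm default wafer diameter is an assumption, not from the article.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

current = gross_dies_per_wafer(394.0)   # die area of the current 6x86, from the text
shrink = gross_dies_per_wafer(210.0)    # die area of the shrink, from the text
print(f"gross dies per wafer: {current} now, {shrink} after the shrink")
```

Under these assumptions, the shrink roughly doubles the number of candidate dies per wafer. Because smaller dies also tend to yield better at a given defect density, the cost per good die would be expected to fall by more than the die count alone suggests.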

Even with the reduced die size, the 6x86 still costs about 50% more to build than the 0.35-micron Pentium. But as long as Cyrix can deliver strong performance, it keeps its parts in the high-margin zone where manufacturing cost is relatively unimportant. The company also has the 5x86 to compete with low-end Pentiums. Particularly given AMD's K5 stumbles (*see 0914MSB.PDF*), the 5x86 and new 6x86 position Cyrix for a good year in 1996.

## Alpha Extends Performance Lead with 21164A

Just days after boosting Alpha performance to an estimated 400 SPECint92 (*see 0913MSB.PDF*), Digital revealed a 500-SPECint92 processor at the recent Microprocessor Forum. The new 21164A is a slightly enhanced version of the 21164 in a 0.35-micron process. The performance increase is due mainly to an increase in clock speed from 333 MHz to 417 MHz (2.4-ns cycle time).
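A quick check of the figures above shows how closely the performance gain tracks the clock speed. This sketch uses only the numbers quoted in the paragraph; the helper name `cycle_time_ns` is ours.

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / clock_mhz

old_mhz, new_mhz = 333.0, 417.0      # clock speeds quoted in the text
old_spec, new_spec = 400.0, 500.0    # estimated SPECint92 ratings quoted in the text

clock_gain = new_mhz / old_mhz       # ratio of new clock to old clock
spec_gain = new_spec / old_spec      # ratio of new rating to old rating

print(f"cycle time at {new_mhz:.0f} MHz: {cycle_time_ns(new_mhz):.2f} ns")
print(f"clock gain: {clock_gain:.3f}, SPECint92 gain: {spec_gain:.3f}")
```

The two ratios agree to within a fraction of a percent, consistent with the statement that the clock speed accounts for most of the improvement.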

The process shrink provides significant improvement in both cost and power. The 21164A has a die size of 209 mm<sup>2</sup>, a third smaller than the original 21164. This brings the estimated manufacturing cost to about \$240, according to the MDR Cost Model, down from \$310. Power dissipation is improved by a move to a 2.0-V core instead of the more standard 3.3-V supply used by the 21164, although the I/O remains 3.3-V compatible. Thus, despite the faster clock speed, the new device dissipates 20 W, far less than the 50 W burned by the incandescent 300-MHz part.
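The power figures can be sanity-checked with the standard first-order relation for CMOS dynamic power, P ∝ C·V²·f. The sketch below holds the switched capacitance C constant, which is a simplifying assumption on our part, and treats the whole 50 W as core dynamic power even though the I/O stays at 3.3 V. The function name is ours.

```python
def scaled_dynamic_power(p_old_w: float, v_old: float, v_new: float,
                         f_old_mhz: float, f_new_mhz: float) -> float:
    """First-order CMOS estimate: P is proportional to C * V^2 * f.

    Switched capacitance C is assumed unchanged (an assumption, not a
    figure from the article).
    """
    return p_old_w * (v_new / v_old) ** 2 * (f_new_mhz / f_old_mhz)

# 50 W at 3.3 V and 300 MHz, scaled to 2.0 V and 417 MHz (values from the text)
estimate_w = scaled_dynamic_power(50.0, 3.3, 2.0, 300.0, 417.0)
print(f"estimated power at 2.0 V and 417 MHz: {estimate_w:.1f} W")
```

Under these assumptions, the voltage drop alone more than offsets the higher clock speed and brings the estimate to the mid-20-W range. The remaining gap to the quoted 20 W would plausibly come from the lower capacitance of the smaller process, which this sketch does not model.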

The processor includes a few enhancements over the 21164 (see **081201.PDF**). The original Alpha architecture was unique in its complete rejection of 8- and 16-bit data types and was often criticized for this shortcoming (see **060302.PDF**). Digital finally succumbed and added load and store instructions for these data types, improving performance when emulating x86 code or executing drivers ported from x86. Digital's increasing emphasis on Windows NT necessitated these changes.

The new L3 cache controller supports Pentium-style synchronous SRAMs, which are becoming inexpensive and widely available. Most 21164 systems today use asynchronous SRAMs, slightly lowering performance.

Digital did not formally announce the 21164A as a product; no pricing is available. The current high-end Alpha part lists for nearly \$3,000, and the 21164A is likely to inherit this price point. The company is already testing first silicon of the device, which it expects to reach volume production in 3Q96. At the conference, Digital's Pete Bannon estimated the 417-MHz part would deliver about 11 SPECint95 and 17 SPECfp95 (baseline).

By moving its industry-leading processor to 0.35-micron before competitors upgrade their parts, Digital should retain its performance leadership with the 21164A. The reduction in manufacturing cost is a nice side effect, but until Digital cuts margins on its high-end products, manufacturing cost is not an issue. The 21164A will be an appealing part for those who need the highest uniprocessor performance, for either Unix or Windows NT, and are willing to pay for it.

## Enhanced PowerPC 604 Debuts

The PowerPC team also is moving its entire product family to 0.35-micron CMOS. With the 601 already shipping at that level and the 603e (*see 0911MSB.PDF*) planned for 1Q96, the next step is the PowerPC 604e, disclosed at the recent Microprocessor Forum. The new chip offers larger caches, higher clock speeds, and other improvements over the current 604.

IBM's Kaivalya Dixit said that the 604e will reach 166 MHz, a 20% increase from the 0.45-micron version, and possibly faster. He estimates the performance of the 604e in a high-end configuration will be 6.0 SPECint95 and 5.0 SPECfp95 (baseline). This represents a 33% increase from the 133-MHz 604 on the integer test and 50% on the floating-point tests, significantly better than would be expected from the clock-speed increase.

An important factor in the improved performance is the doubling of the on-chip caches, to 32K each for instructions and data. The SPEC95 tests generate more cache misses than does the SPEC92 suite, so the larger caches have a bigger impact with the newer tests. The higher scores also include some compiler improvements and a faster system configuration.

Other hardware enhancements include more buffering on bus transactions and slightly improved branch handling. The 604e adds hardware support for little-endian misaligned loads and stores, improving performance when emulating x86 code. New clock multiples (2.5× and 4×) allow more flexibility in choosing a bus frequency. The maximum power dissipation of the new part is estimated at 12–13 W, about the same as earlier versions of the 604.

The new process shrinks the die to 148 mm<sup>2</sup>, 25% smaller than the 604's, despite the increased cache size. The MDR Cost Model estimates the manufacturing cost of the new chip at about \$90, also 25% less than its predecessor's and about the same as a 0.35-micron Pentium. The cost is improved by a move to a 255-pin BGA package, although a 304-pin CQFP is offered for compatibility with existing designs.

The 604e has achieved first silicon and already runs at 166 MHz. Both IBM and Motorola expect volume production in 1H96. The performance of the new part exceeds that of any current processor except Digital's 21164, although forthcoming next-generation RISCs may surpass that performance level before the 604e begins shipping.

Intel's fastest Pentium, however, is rated at just 3.64 SPECint95, well behind the 604e's expected performance, so it is unlikely that even a future Pentium will match the 604e's performance. Instead, Intel will need Pentium Pro (the P6) to compete at that performance level. If the 604e meets its goals, it will offer P6-class performance with the manufacturing cost of a Pentium, a potent combination.

The PowerPC vendors hope that the 604e avoids the 620's fate. That processor, originally due in systems in 2H95, has slipped into 1H96, although some samples are now available. To improve performance, the 620 may debut in a 0.35-micron process rather than the 0.5-micron process originally planned. IBM and Motorola need a better showing to deliver the 604e on target.

## AMD SSA5 Really Is K5

Since our last issue (*see* **0913MSB.PDF**), AMD has released additional information on its "new" SS/5 processor, now renamed the SSA5. Contrary to our speculation, the SSA5 is a fully functional K5 processor. It turns out that the K5, as originally designed, does not meet its goal of delivering 30% better performance than Pentium at the same clock speed. Although it does well on Unix applications (e.g., SPEC), on typical PC software, the original design has roughly the same performance as a Pentium of the same clock speed.

Instead of redefining the K5's performance target, AMD simply redefined the K5 name. The SSA5 is merely the K5 as originally designed. This chip is being tested now and, if all goes well, will begin shipments late in 1Q96, with volume production the following quarter. In the meantime, the company will work on bringing the chip's performance to its original 1.3× target.

AMD claims that the improvement will be relatively simple and quick. The performance degradation on PC software is due to a few long-latency instructions that occur far more frequently in real applications than AMD expected. For example, the company expected few occurrences of the far CALL instruction and so didn't bother to reduce its execution time below 15 cycles or so. Upon testing the device, AMD discovered that far CALL, REP MOVS, and a few other instructions consumed nearly half the cycles, bogging down performance. By trimming the execution time of these instructions to a handful of cycles, the company expects a significant performance boost, putting the K5 back to its original target.
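AMD's reasoning can be illustrated with Amdahl's law, which gives the overall speedup when only part of the execution time is improved. In the sketch below, the 50% fraction stands in for "nearly half the cycles," and the drop from 15 cycles to 4 is a hypothetical value for "a handful"; we also assume every affected instruction speeds up by the same factor. None of these are AMD's figures, and the function names are ours.

```python
def amdahl_speedup(fraction: float, local_speedup: float) -> float:
    """Overall speedup when `fraction` of the cycles runs `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def required_local_speedup(fraction: float, target: float) -> float:
    """Local speedup needed on `fraction` of the cycles to reach an overall `target`."""
    return fraction / (1.0 / target - (1.0 - fraction))

slow_fraction = 0.5       # assumption: "nearly half the cycles"
cycles_before = 15.0      # from the text: "15 cycles or so"
cycles_after = 4.0        # hypothetical value for "a handful of cycles"

overall = amdahl_speedup(slow_fraction, cycles_before / cycles_after)
needed = required_local_speedup(slow_fraction, 1.3)

print(f"overall speedup: {overall:.2f}x")
print(f"local speedup needed for 1.3x overall: {needed:.2f}x")
```

Under these assumptions, the slow instructions would need to run a little less than twice as fast to recover a 1.3× overall gain, which makes AMD's claim of a relatively simple fix at least plausible.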

AMD plans to make the necessary modifications quickly enough that the new version, which the company now calls the K5, will reach production just one quarter behind the SSA5. If there are any problems, however, the new K5 may slip further, or the company may release another K5 that doesn't meet the 1.3× target. AMD is also working to improve some speed paths, boosting the clock frequency to 100 MHz instead of the 75-MHz clock used by the SSA5.

It is disappointing, if perhaps understandable, that the K5 team, in its first x86 design, miscalculated the instruction distribution of common PC applications. Hopefully, the team will be able to fix the problems as quickly as planned, getting the K5 program back on track, albeit several months behind schedule.

## Mobile Pentium Jumps to 120 MHz

As previewed in our last issue (see **091301.PDF**), Intel today rolled out a 120-MHz version of its Mobile Pentium processor. The new part is identical to the desktop version except that it operates at 2.9 V. The new chip dissipates 3–4 W under typical conditions, the same as the 90-MHz Mobile Pentium. Power dissipation is held constant, despite the higher clock speed, by a switch to Intel's 0.35-micron BiCMOS process.

Intel's new part, also known as the 120-MHz Pentium VRT, carries a 1,000-piece price of \$681, a 17% premium over the desktop Pentium-120. By comparison, the 90-MHz Pentium VRT costs \$341. The VRT parts are available in the standard 296-pin PGA or a smaller 320-lead TAB package, which Intel calls a TCP; the price is the same for either packaging option. Production volumes are available immediately.

The new entry narrows the performance gap between high-end notebook and desktop systems while pushing 75- and 90-MHz Mobile Pentiums into midrange notebooks. The next step, a 133-MHz Mobile Pentium, should appear in 1H96.

## Gassée's Firm Debuts BeBox

After four years of development, Be Inc., a startup led by former Apple executive Jean-Louis Gassée, has revealed the BeBox, a dual-processor PowerPC-based system with an entirely new operating system. Although at first it may be hard to justify a new hardware and software platform in today's standards-dominated market, Be's strategy is not as crazy as it may seem.

The company has no illusions of pushing into the mainstream office computing market with the BeBox. Instead, Be is focusing on audio-visual computing and other real-time, essentially dedicated applications. The BeBox may also appeal to computer hobbyists and experimenters, and eventually to some brave business users. The tiny company's slogan is "Amiga 96," which offers a good idea of its orientation.

By starting fresh, rather than extending existing software, Be has created an exceptionally fast, elegant platform. The initial system is based on two 66-MHz PowerPC 603 processors with no L2 cache—hardware that would seem very modest in the Mac environment. Yet the performance is vastly superior to even the fastest Macs when it comes to responsiveness. The system feels dramatically faster: windows pop up nearly instantaneously, and simultaneous tasks run smoothly.

The microkernel-based OS supports preemptive multitasking for up to eight processors, with a real-time architecture that provides high-performance audio and video. A full set of Internet access software is included. Be plans to port its OS to the CHRP platform (*see* **081602.PDF**) and also expects to support CHRP operating systems, including Mac OS, on its future hardware.

The BeBox uses PC-standard I/O chips and provides three PCI slots and four ISA slots. The system includes extensive built-in I/O, including two MIDI ports, four serial ports, a parallel port, two joystick ports, three IR controller ports, 16-bit 44.1-kHz stereo audio in and out, and a "GeekPort" that provides 16 parallel I/O lines and four A/D and D/A channels.

Developer systems are available now, with volume production planned by year-end. A basic system is priced at only \$1,600, but this includes no DRAM, disk, keyboard, or monitor. Be is offering licenses to its hardware and software for only \$50 per unit. Extensive specifications are available from Be's Web site, *www.be.com*; the company can also be reached at 415.462.4141.