# THE INSIDERS' GUIDE TO MICROPROCESSOR HARDWARE

# **SA-1100 Puts PDA on a Chip**

Integrated StrongArm Is Fastest Windows CE Chip to Date

by Jim Turley

A new generation of PDA processors is growing up, and Digital Semiconductor has produced the pick of the litter. The company's new StrongArm-1100 combines the spectacular performance of last year's SA-110 microprocessor (see MPR 2/12/96, p. 1) with the integrated logic of a PDA processor. At \$30–\$40 in volume, the SA-1100 offers the best combination of price/performance and MIPS/watt of any integrated CPU on the market.

The chip combines all the logic of a Newton or other handheld computer onto a single chip. The SA-1100 includes a memory controller, color LCD driver, PCMCIA interface, IrDA and USB communication channels, and extensive power management. With the addition of a touch-sensitive LCD panel, main memory, and batteries, designers can create a complete portable system with a minimum of logic. At 200 MHz, that system will easily outperform the current crop of handheld PCs running Windows CE.

# Core, Cache Taken Straight From SA-110

The SA-1100 is the first of a planned series of integrated StrongArm chips combining the company's expertise in CPU design with a number of peripherals. Given StrongArm's relative success with Newton, it's reasonable that the PDA-oriented SA-1100 would be the first in the series.

The new device incorporates the existing SA-110 processor almost intact but halves the size of the data cache to 8K, making room for peripherals without bloating the size of the die. Digital then added nearly every peripheral device and controller one could want for a self-contained handheld or portable system, as Figure 1 shows. Most of these peripheral blocks, except the DRAM controller, are VHDL or Verilog designs carried out by third parties to Digital's specifications. All but the USB interface are Digital-owned.

Unfortunately, the SA-1100's data cache suffers from the same problem as the SA-110's: the write-back cache must be flushed manually before shutting down the chip. New data must be forced in to force the old data out. The SA-1100 has a special memory space allocated for just this purpose. Reading from this space returns zeros with single-cycle latency. An iterative software loop can therefore flush the data cache in a few microseconds.
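The flush-by-reading trick can be illustrated with a toy cache model. This is a sketch under stated assumptions, not Digital's code: the 32-byte line size, 32-way organization, round-robin replacement, and the `FLUSH_BASE` address are all assumptions made for illustration; only the 8K cache size and the idea of reading a dedicated region come from the text above.

```python
# Toy model of a write-back cache that is flushed by reading a dedicated region.
# Assumptions (not from the article): 32-byte lines, 32 ways, round-robin
# replacement, and a hypothetical FLUSH_BASE for the zero-filled flush space.

CACHE_BYTES = 8 * 1024       # data cache size given in the article
LINE_BYTES = 32              # assumption
WAYS = 32                    # assumption
SETS = CACHE_BYTES // (LINE_BYTES * WAYS)
FLUSH_BASE = 0xE000_0000     # hypothetical base address of the flush region


class WriteBackCache:
    def __init__(self):
        # Each set holds up to WAYS entries of [tag, dirty].
        self.sets = [[] for _ in range(SETS)]
        self.next_victim = [0] * SETS    # round-robin pointer per set
        self.written_back = []           # line addresses written to memory

    def access(self, addr, write=False):
        line = addr // LINE_BYTES
        index, tag = line % SETS, line // SETS
        ways = self.sets[index]
        for entry in ways:
            if entry[0] == tag:          # hit: a store marks the line dirty
                entry[1] = entry[1] or write
                return
        new_entry = [tag, write]
        if len(ways) < WAYS:             # miss with a free way
            ways.append(new_entry)
            return
        victim = self.next_victim[index] # miss: evict in round-robin order
        old_tag, dirty = ways[victim]
        if dirty:
            self.written_back.append((old_tag * SETS + index) * LINE_BYTES)
        ways[victim] = new_entry
        self.next_victim[index] = (victim + 1) % WAYS

    def dirty_lines(self):
        return sum(entry[1] for ways in self.sets for entry in ways)


def flush_by_reading(cache):
    """Load one word per line across a cache-sized region; return the load count."""
    loads = 0
    for offset in range(0, CACHE_BYTES, LINE_BYTES):
        cache.access(FLUSH_BASE + offset)
        loads += 1
    return loads


cache = WriteBackCache()
for addr in range(0, CACHE_BYTES, LINE_BYTES):   # fill the cache with dirty data
    cache.access(addr, write=True)

loads = flush_by_reading(cache)
# Lower bound only: one cycle per load at 200 MHz, ignoring write-back bus time.
min_flush_us = loads / 200.0
print(loads, cache.dirty_lines(), len(cache.written_back), min_flush_us)
```

In this model, one load per cache line is enough to push every dirty line out to memory. The time estimate is only a lower bound, since the evicted data still has to cross the external bus.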

Like most high-end microprocessors, especially ones with a large disparity between bus speed and processor speed, the SA-1100 has a write buffer. Each of the buffer's eight entries can hold 1–16 bytes. Entries are always flushed to memory in order; the chip does not reorder its write buffer or merge writes to the same address. As on the SA-110, buffered write cycles cannot be aborted; once data is committed to the write buffer, the bus cycle must complete.

The SA-1100 does impose consistency checking: if software accesses an address that is in the write buffer, the chip stalls until all entries in the buffer have been written to memory. Although this stall delays the processor, it avoids problems with memory consistency. Unbuffered writes (i.e., stores to address ranges that have been declared uncachable) also flush the write buffer, forcing long delays.
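The write-buffer rules described above can be captured in a small behavioral model. This is a rough sketch rather than a description of the actual hardware: the class and method names are invented, and the model assumes that only reads trigger the consistency check and that a full buffer waits for just its oldest entry to drain.

```python
from collections import deque

BUFFER_ENTRIES = 8       # from the article
MAX_ENTRY_BYTES = 16     # from the article


class WriteBuffer:
    """In-order write buffer: no reordering, no merging, drain on address match."""

    def __init__(self):
        self.entries = deque()
        self.memory_log = []     # (address, data) pairs in the order they reach memory
        self.stalls = 0          # times the CPU had to wait for the buffer

    def drain(self, count=None):
        count = len(self.entries) if count is None else count
        for _ in range(count):
            self.memory_log.append(self.entries.popleft())

    def buffered_write(self, addr, data):
        if not 1 <= len(data) <= MAX_ENTRY_BYTES:
            raise ValueError("an entry holds 1-16 bytes")
        if len(self.entries) == BUFFER_ENTRIES:
            self.stalls += 1
            self.drain(1)                    # assumption: wait for the oldest entry only
        self.entries.append((addr, data))    # a repeated address gets its own entry

    def read(self, addr):
        # Consistency check: a pending write covering this address forces a full drain.
        if any(a <= addr < a + len(d) for a, d in self.entries):
            self.stalls += 1
            self.drain()

    def unbuffered_write(self, addr, data):
        # Stores to uncachable ranges flush the buffer first, then go straight out.
        if self.entries:
            self.stalls += 1
        self.drain()
        self.memory_log.append((addr, data))


wb = WriteBuffer()
wb.buffered_write(0x1000, b"\x01\x02\x03\x04")
wb.buffered_write(0x1000, b"\x05\x06\x07\x08")   # second entry, not merged
wb.buffered_write(0x2000, b"\xff")
wb.read(0x1002)                                   # matches a pending write
print(len(wb.entries), wb.stalls, [hex(a) for a, _ in wb.memory_log])
```

The single read of an address still sitting in the buffer empties all three entries, in order, before the processor can proceed.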



Figure 1. Digital's SA-1100 integrates the StrongArm-110 core and all the peripherals for a single-chip handheld computer.

# Mini-Cache and Read Buffer Supplement Caches

In addition to the chip's normal data cache, the SA-1100 has two additional internal data stores that Digital calls the mini-cache and the read buffer. These were added to prevent applications from thrashing the conventional data cache.

The mini-cache is a 512-byte, two-way set-associative cache, like a much smaller version of the main data cache. Data can be cached in either the regular cache or the mini-cache, but never both. The choice of cache is determined by an attribute bit in the MMU translation tables.

The mini-cache is intended to speed updates to the LCD frame buffer. By allocating the memory space of the buffer to the mini-cache, software can read, modify, and update 512-byte display segments from the LCD without disturbing the main data cache.
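To see why routing the frame buffer through the mini-cache protects the main cache, consider the following sketch. It simplifies heavily: both caches are modeled as fully associative LRU stores with an assumed 32-byte line, the frame-buffer address range is hypothetical, and `uses_mini_cache` merely stands in for the MMU attribute bit.

```python
from collections import OrderedDict

LINE_BYTES = 32   # assumption


class LruCache:
    """Fully associative LRU cache of line numbers (a simplification)."""

    def __init__(self, size_bytes):
        self.capacity = size_bytes // LINE_BYTES
        self.lines = OrderedDict()

    def touch(self, addr):
        line = addr // LINE_BYTES
        if line in self.lines:
            self.lines.move_to_end(line)       # hit: mark as most recently used
            return
        if len(self.lines) == self.capacity:
            self.lines.popitem(last=False)     # miss: evict least recently used
        self.lines[line] = True


main_cache = LruCache(8 * 1024)   # main data cache size from the article
mini_cache = LruCache(512)        # mini-cache size from the article

# Hypothetical location of a 480 x 240, 4-bit-per-pixel frame buffer.
FRAME_BUFFER = range(0x0010_0000, 0x0010_0000 + 480 * 240 // 2)


def uses_mini_cache(addr):
    """Stand-in for the per-page attribute bit in the MMU translation tables."""
    return addr in FRAME_BUFFER


def load(addr):
    (mini_cache if uses_mini_cache(addr) else main_cache).touch(addr)


for addr in range(0, 8 * 1024, LINE_BYTES):    # application working set
    load(addr)
working_set = set(main_cache.lines)

for addr in range(FRAME_BUFFER.start, FRAME_BUFFER.stop, 4):   # sweep the display
    load(addr)

print(set(main_cache.lines) == working_set, len(mini_cache.lines))
```

Sweeping the entire frame buffer recycles the mini-cache's handful of lines over and over, while the application's working set in the main cache is left untouched.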

The read buffer is less transparent to the programmer. Its four entries are loaded via explicit coprocessor instructions. The SA-1100 then fulfills the read requests during idle bus cycles. After the buffer is loaded, references to the data are treated as if they were cache hits. The read buffer's main advantage is that it is manually controlled, so often-used data, such as lookup tables or DSP coefficients, can be available with fixed timing and without allocating space in the caches.

Managing the read buffer is the programmer's responsibility. In particular, the buffer must be loaded well in advance of the code that references its contents. Since the buffer is loaded during idle cycles, predicting exactly when the data will be available is tough. If data destined for the read buffer is requested before the load is finished, the chip stalls until the read is done.

Figure 2. In a 0.35-micron three-layer-metal process, the SA-1100 measures 75 mm<sup>2</sup>, 50% larger than the original SA-110.

# Power Modes Include Powering Down

Like most integrated chips in this market, the SA-1100 indulges in a lot of self-initiated power conservation. Peripherals are clocked only when active, the caches are powered one set at a time, and even portions of the ALU (such as the barrel shifter) are powered only when needed. The chip also has two power-conservation modes.

In idle mode, which the chip enters via explicit software command or automatically any time data is loaded from noncachable memory (a long-latency operation), all clocks are stopped. The chip restarts immediately when an interrupt is received or the read cycle terminates.

Sleep mode is more of a controlled power-down. This mode can be used when, for example, a PDA is in the "off" state for hours or days. The chip requires several microseconds to enter sleep mode and a few hundred milliseconds to recover. Internal state information is lost, but memory contents can be preserved if SRAMs or self-refreshing DRAMs are used. In sleep mode, all power to the SA-1100 can be removed. Oscillator inputs can also be stopped, at the user's option; leaving the core oscillator running during sleep shortens the wakeup time to less than 20 ms.

# Memory Controller Does DRAM or SRAM

The SA-1100's memory controller manages both volatile and nonvolatile memory. Main memory may be either SRAM or DRAM (fast page-mode or extended data-out); nonvolatile memory can be either ROM (burst or nonburst) or flash memory. Unfortunately, the chip cannot handle both DRAM and SRAM in the same system because many of the necessary control signals are multiplexed together. All memory resides on the same 32-bit data bus.

Even the "volatile" memory can be made nonvolatile, with DRAMs refreshing themselves while the SA-1100 is idle or powered down. For DRAMs that support self-refresh mode (when CAS and RAS are driven low), the SA-1100 initiates an orderly shutdown that enables the self-refresh feature and preserves DRAM contents as long as the memory chips have sufficient power.

As an interesting (and useful) power-saving measure, Digital designed the SA-1100's memory controller to minimize signal transitions. Between memory cycles, all address and data lines are held in their previous state while control signals are negated. This technique reduces power consumption and EMI slightly and also eliminates the need for pullup resistors on the buses.

# LCD Handles High Resolutions, Multiple Colors

The LCD controller on the SA-1100 is among the best available on an integrated processor and is similar to the one on the Philips 31700 (see MPR 5/12/97, p. 13). The chip can drive color (active or passive) or monochrome LCD panels, at resolutions up to $1024 \times 1024$ with 256 colors or 15 gray levels.

Higher resolutions or greater color depths will sap increasing amounts of the SA-1100's available bandwidth. Screen data is stored in main memory; because the DMA channel dedicated to the LCD has the highest priority, increasing resolution or color depth can begin to starve other tasks, depending on memory speed.
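A back-of-the-envelope calculation shows how quickly the display's appetite grows. The sketch below assumes a 60-Hz refresh rate, which is not a figure from Digital, and it ignores palette fetches and any panel-specific overhead; the function name is ours.

```python
REFRESH_HZ = 60   # assumption: typical panel refresh rate, not a Digital figure


def lcd_fetch_rate(width, height, bits_per_pixel, refresh_hz=REFRESH_HZ):
    """Bytes per second the LCD DMA channel must read from main memory."""
    return width * height * bits_per_pixel * refresh_hz // 8


PANELS = [(480, 240, 4), (480, 240, 8), (640, 480, 8), (1024, 1024, 8)]

for width, height, bpp in PANELS:
    rate = lcd_fetch_rate(width, height, bpp)
    print(f"{width} x {height} x {bpp}: {rate / 1e6:.2f} Mbytes/s")
```

Under these assumptions, a quarter-VGA panel with 16 shades needs only a few megabytes per second, but the maximum resolution at 256 colors needs more than 60 Mbytes/s of top-priority memory traffic.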

# Lots of Serial Channels; Minimal On-Chip Debug

The SA-1100 has no fewer than six serial channels, each with a special purpose. Port 0 is a USB port; port 1 doubles as an SDLC or UART channel; port 2 is IrDA-compatible; port 3 is a basic UART; and port 4, which is actually two channels, connects to telecom or audio codecs and serial peripherals. The USB port is a slave only (no host or hub functions) and handles the faster, 12-Mbps data rate. The SA-1100 is the only integrated CPU so far with USB. It is also among the few to handle the faster, 4-Mbps IrDA data rate.

In a typical Newton or WinCE PDA, the IrDA port would be used for short-range communication with other PDAs or a host system; the USB port for hot-linking with desktop PCs; the codec port for touchscreen, telephone, and audio hookups; the SDLC port for wired networking; and the UART for a general-purpose serial interface, such as for a bar-code wand. The only thing missing is a dedicated keyboard interface, but with support for touch screens and enough CPU muscle for speech recognition, keyboards can be optional (or connected via an unused serial port).

Some simple debug support has been added to the part, which sports a pair of hardware breakpoint registers. Instruction or data breakpoints can be set by loading an address into one register and match data into the other. Data breakpoints can be set on a load or on a store with particular data. Although the SA-1100's debug support is primitive, it's better than nothing, and it is comparable to what Hitachi's SH7708 and NEC's R41xx chips provide.

# Die Size Larger Than First StrongArm

As the die photo in Figure 2 shows, the SA-1100's 2.5 million transistors cover an area of about 75 mm<sup>2</sup>. The new StrongArm is significantly larger than its predecessor; the peripheral logic takes up more space than the 8K of data cache that was removed. Like the SA-110, the SA-1100 is built in Hudson (Mass.) using Digital's 0.35-micron process.

The chip requires two supply voltages, also like its predecessor. The peripheral I/O runs on 3.3 V while the processor core requires a separate supply, at 1.35 V to 1.5 V. Varying the core voltage does not affect performance, although the lower voltage naturally reduces power consumption.

The SA-1100 also requires two crystals: a common 32.768-kHz crystal for the real-time clock and a 3.6864-MHz crystal for the PLL that drives the processor. This latter frequency was chosen because such crystals are commonly available and inexpensive, because it keeps electromagnetic radiation to a minimum, and because it still provides a workable baseline frequency for the PLL. System software can dial in the speed of the CPU using one of 11 clock multipliers, from about 59 MHz to around 383 MHz, with a 200-MHz limit imposed on the current devices.
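The quoted clock speeds line up neatly with whole-number multiples of the 3.6864-MHz crystal. The sketch below assumes the PLL multiplies by an integer; the article does not list the 11 multiplier values, so the ones computed here are inferences, not Digital's published settings.

```python
CRYSTAL_MHZ = 3.6864   # PLL reference crystal, from the article


def nearest_multiplier(target_mhz):
    """Integer multiplier that lands closest to the target clock (an assumption)."""
    return round(target_mhz / CRYSTAL_MHZ)


def core_clock_mhz(multiplier):
    return CRYSTAL_MHZ * multiplier


for target in (59, 133, 200, 383):   # speeds mentioned in the article
    multiplier = nearest_multiplier(target)
    print(f"{target} MHz ~ {multiplier} x {CRYSTAL_MHZ} = "
          f"{core_clock_mhz(multiplier):.2f} MHz")
```

If the integer-multiplier assumption holds, the "133-MHz" and "200-MHz" parts actually run slightly below their nominal speeds.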

# Power Below 250 mW

Power conservation has always been StrongArm's strong point, and the SA-1100 is no exception. At 133 MHz, Digital specs the part at 200 mW (typical) or 330 mW (maximum) combined power consumption from the 1.35-V core supply and the 3.3-V I/O supply. Boosting the clock speed by 50% raises power consumption by one-quarter to about one-third, to 250 mW (typical) or 450 mW (maximum), as Table 1 shows. The bottom line: the SA-1100 draws no more from its batteries than many of its slower competitors, but it delivers better performance and at least as much on-chip I/O.
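The scaling is easy to check from the quoted figures. The short calculation below uses only the numbers in the paragraph above; the helper name and the milliwatts-per-megahertz metric are ours.

```python
# Power specifications quoted in the article, in milliwatts, keyed by clock (MHz).
SPECS = {
    133: {"typical": 200, "maximum": 330},
    200: {"typical": 250, "maximum": 450},
}


def percent_increase(old, new):
    return 100.0 * (new - old) / old


clock_gain = percent_increase(133, 200)
typical_gain = percent_increase(SPECS[133]["typical"], SPECS[200]["typical"])
maximum_gain = percent_increase(SPECS[133]["maximum"], SPECS[200]["maximum"])

# Typical power per megahertz: a rough measure of energy spent per clock cycle.
mw_per_mhz = {mhz: spec["typical"] / mhz for mhz, spec in SPECS.items()}

print(f"clock +{clock_gain:.0f}%, typical +{typical_gain:.0f}%, "
      f"maximum +{maximum_gain:.0f}%")
print({mhz: round(value, 2) for mhz, value in mw_per_mhz.items()})
```

Because power grows more slowly than clock speed, the faster part spends less energy per cycle, at least by these typical figures.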

How does Digital do it? Part of the secret lies in the StrongArm core design (see MPR 11/13/95, p. 16), which merges an inherently simple CPU architecture with leading-edge circuit-design techniques that include conditional clocking, segmented distribution trees, and edge-triggered logic. Part of the credit also goes to Digital's first-rate fab process—the same one that keeps 600-MHz Alpha chips humming along. While Alpha has been described as a large RF circuit with a processor hiding in the stray capacitance, StrongArm parts, as Figure 2 shows, are big caches with a CPU lurking in the unused silicon.

# SA-1100 May Give Competitors Sweaty Palms

Digital's SA-1100 joins a growing family of new palmtop processors that have been announced in the past 12 months. Fueled by the release of Windows CE and an initial flurry of handheld PCs, the category has grown to include AMD's Elan400 and Elan410, NEC's R4101 and R4102, Philips's 31500 and 31700, Hitachi's SH7708, and other ARM chips from Cirrus, Sharp, and VLSI Technology.

The SA-1100 has the most on-chip peripheral logic of any of these contenders. In addition to the valuable color LCD and memory controllers, Digital has included more (and more varied) serial channels, larger caches, faster IrDA, and, as Table 2 shows, the only USB interface. Like most of these chips (except the x86-based Elan), the SA-1100 instruction set includes pseudo-DSP operations that enable software modems, media processing, and basic signal processing.

| Resolution                | Condition | 100 MHz | 200 MHz |
|---------------------------|-----------|---------|---------|
| $480 \times 240 \times 4$ | Typical   | 198 mW  | 253 mW  |
|                           | All on    | 48 mW   | 61 mW   |
|                           | LCD on    | 37 mW   | 45 mW   |
| $480 \times 240 \times 8$ | Typical   | 204 mW  | 259 mW  |
|                           | All on    | 54 mW   | 67 mW   |
|                           | LCD on    | 42 mW   | 51 mW   |
| $640 \times 480 \times 8$ | Typical   | 223 mW  | 279 mW  |
|                           | All on    | 74 mW   | 87 mW   |
|                           | LCD on    | 62 mW   | 71 mW   |

Table 1. Typical power consumption for the SA-1100 varies from about 250 mW to well below 50 mW, depending on clock speed and the number of active peripherals. Typical values are with CPU active, others are for peripherals only; all values assume 30-pF bus loading. (Source: Digital Semiconductor)

# Price & Availability

Digital's StrongArm-1100 processor is sampling now to selected customers; production is scheduled for the end of 1997. The chip is housed in a 208-lead TQFP package; a mini-BGA package is planned for 1998. In 10,000-unit quantities, the 133-MHz part is priced at \$29; the 200-MHz version sells for \$39.

For more information, contact Digital (Maynard, Mass.) at 111.222.3333 or visit Digital's Web site at *www.digital.com/semiconductor/strongarm/strongar.htm.* 

Collectively, the other chips offer nearly all the same features as the SA-1100, but individually, each leaves off something valuable. The R4102, for example, has neither the memory controller nor the LCD controller found on Digital's chip. NEC believes in leaving these features off so OEMs can choose their own. The SA-1100's fifth serial channel is compatible with Philips's UCB1200, a mixed-signal chip used by the 31500 and 31700 for interfacing with the analog world (telephones, speakers, and resistive touchscreens). Feature for feature, AMD's 486-based Elan400 is the nearest match to the SA-1100, but it also has the lowest performance, worst power consumption, and highest price.

The SA-1100 certainly offers the most integer performance for the money. At 133 MHz, the chip easily outruns its 33–75-MHz competitors, even as its \$29 price tag stays in line with theirs. The 200-MHz version sells for just \$39. Based on the notoriously inaccurate Dhrystone, the SA-1100 should churn out $5$–$10\times$ the performance of these other parts yet consume about the same amount of battery power.

|                                                               | SA-1100 | R4102  | 31700   | SH7708  | Elan400 | 77790  |
|---------------------------------------------------------------|---------|--------|---------|---------|---------|--------|
| Vendor                                                        | Digital | NEC    | Philips | Hitachi | AMD     | Sharp  |
| Max freq                                                      | 200 MHz | 66 MHz | 75 MHz  | 60 MHz  | 66 MHz  | 25 MHz |
| CPU                                                           | ARM     | MIPS   | MIPS    | SuperH  | 486     | ARM    |
| I/D cache                                                     | 16K/8K  | 4K/1K  | 4K/1K   | 8K      | 8K      | 4K     |
| MMU?                                                          |         |        |         |         |         |        |
| FPU?                                                          |         |        |         |         |         |        |
| LCD?                                                          |         |        |         |         |         |        |
| DRAM ctrl?                                                    |         |        |         |         |         |        |
| PCMCIA?                                                       |         |        |         |         |         |        |
| USB?                                                          |         |        |         |         |         |        |
| IrDA?                                                         |         |        |         |         |         |        |
| Keyboard?                                                     |         |        |         |         |         |        |
| A/D conv                                                      |         |        |         |         |         |        |
| D/A conv                                                      |         |        |         |         |         |        |
| Power (typ)                                                   | 250 mW  | 250 mW | 290 mW  | 570 mW  | 875 mW  | 550 mW |
| Price (10K)                                                   | \$39    | \$25   | \$39*   | \$20    | \$44    | \$18   |
| Availability                                                  | 1Q98    | Now    | Now     | Now     | Now     | Now    |

Legend: color LCD control; function supplied by UCB1200 companion chip.

Table 2. Digital's SA-1100 includes most of the features found on other recent PDA processors, but its much faster, 200-MHz StrongArm processor core gives it the best performance. \*price includes UCB1200.
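The clock, power, and price rows of Table 2 can be boiled down to two crude ratios. The sketch below uses only the table's figures; note that clock rate is a poor proxy for performance across different architectures, so megahertz per watt and megahertz per dollar are illustrative only, and the variable names are ours.

```python
# (max clock in MHz, typical power in mW, 10K price in dollars), from Table 2.
CHIPS = {
    "SA-1100": (200, 250, 39),
    "R4102":   (66, 250, 25),
    "31700":   (75, 290, 39),   # price includes the UCB1200
    "SH7708":  (60, 570, 20),
    "Elan400": (66, 875, 44),
    "77790":   (25, 550, 18),
}


def mhz_per_watt(name):
    mhz, milliwatts, _ = CHIPS[name]
    return mhz / (milliwatts / 1000.0)


def mhz_per_dollar(name):
    mhz, _, dollars = CHIPS[name]
    return mhz / dollars


for name in CHIPS:
    print(f"{name:8s} {mhz_per_watt(name):6.1f} MHz/W  "
          f"{mhz_per_dollar(name):5.2f} MHz/$")

best_per_watt = max(CHIPS, key=mhz_per_watt)
best_per_dollar = max(CHIPS, key=mhz_per_dollar)
print(best_per_watt, best_per_dollar)
```

By either ratio, Digital's part comes out well ahead of the field, which matches the table's headline message even before accounting for differences in work done per clock.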

At 133 MHz, the SA-1100 is more than fast enough for soft-modem functions, quick screen updates, handwriting recognition, and most other chores a handheld device might be expected to perform. For an extra \$10, the 200-MHz version of the part will be attractive primarily to OEMs that expect to run Java, which will benefit from the increased CPU performance. It also allows OEMs to take advantage of larger LCD screen resolutions or more colors before saturating the part's internal bus bandwidth.

# A Strong Part But Not a Strong Market

There's little question that Digital has done an excellent job with the SA-1100. On the software side, the company has lined up the usual suspects: Windows CE, Newton, JavaOS, VxWorks, Psion's EPOC32, and others. Soft-modem and speech-recognition libraries are also available, as is the obligatory evaluation board.

The trick now is to line up customers. Newton, StrongArm's biggest and best-known design win to date, has an uncertain future ahead of it, and sales have never been spectacular. The upcoming Newton 2100 will be the first volume product based on the SA-1100; that, and some software upgrades, should improve Newton's performance even as it reduces its cost.

The market for Windows CE units is also sluggish, with many of last year's HPCs now discounted to half their initial prices. The ARM port of WinCE has been imminent for months now; the SA-1100 will be the first platform for that version of the OS. In the Windows CE world, the MIPS and SuperH vendors have a one-year headstart, however, that may be difficult for Digital to overcome. If Digital bets big on catching the current wave of pocket organizers, it may be badly disappointed. With a very few exceptions, the public has consistently resisted adopting the industry's vision of portable, ubiquitous computing.

In the meantime, Digital has pursued an alternate course, gaining a foothold with makers of network computers, such as Corel, LG, Wyse, and others, on the strength of StrongArm and its ability to make Java performance tolerable. The original StrongArm chip, the SA-110, appeals to makers of larger, AC-powered NCs, while the newer SA-1100 should attract those planning portable or flat-panel versions. Nortel (formerly Northern Telecom) has plans to build a "smart phone" around the SA-1100, with an LCD screen and a fold-up keyboard.

The SA-1100 will be a fierce competitor for embedded systems, tethered or portable, with a need for an LCD screen and lots of communication channels. Even if some of the part's peripherals go to waste, it's still aggressively priced and easy to integrate, and its performance can't be beat. With Alpha, Digital staked out the high end of the high-end market; with StrongArm, the company has the tools to dominate the high end of the low-end market as well.