# Philips Advances TriMedia Architecture: New CPU64 Core Aimed at Digital Video Market



by Peter N. Glaskowsky

Once the most dynamic market for innovative processor microarchitectures, the media-processor business has changed greatly in recent years. Philips' TriMedia architecture is the sole survivor of the first crop of media processors. Now, Philips is preparing to move TriMedia forward with a new core that extends the architecture into new markets, including high-end digital video processing.

Despite early support from industry heavyweights such as Microsoft and Apple, TriMedia failed to find a niche in the personal-computer industry. Since then, however, the TriMedia group has been finding and developing other markets for its chips. The TM-1000 is now used in videoconferencing systems such as Polycom's Viewstation and in digital-TV receivers from Samsung and Philips itself. In addition, several vendors use the TM-1000 in video-editing systems.

Speaking at the Microprocessor Forum earlier this month, Philips senior scientist Frans Sijstermans provided the first look at the new CPU64 core. With twice the datapath width and new features that include a MIPS-compatible MMU, the CPU64 core provides more than twice the performance (at twice the complexity) of the original TriMedia core design, now dubbed the CPU32.

### TriMedia Roadmap Runs Into Roadblocks

In the four years since the TriMedia architecture was unveiled, Philips has shown roadmaps with at least eight different TriMedia chips (see MPR 1/27/97, p. 10). Some of these chips were intended for applications, such as Microsoft's Talisman 3D initiative, that have either been canceled or moved in other directions. Figure 1 shows the company's current roadmap, which still has seven chips but is now tightly focused on digital-video applications.

Only the original TM-1000 is in production. This chip, built in 0.35-micron technology, is a 32-bit 100-MHz device that is slightly too slow for full-speed DVD decoding, due primarily to limited local-memory bandwidth through its 32-bit 100-MHz SDRAM interface. Philips has marketed the TM-1000 as a DVD-assist engine, but it recognized the need for a standalone DVD-decoding product.

The 133-MHz TM-1100 follow-on is designed to take over the complete DVD decoding task, though it still requires an additional microprocessor for system-control functions such as memory management and the user interface. The TM-1100, now sampling, adds logic to support DVD transport-stream descrambling and decoding. These tasks rely heavily on bit-manipulation operations that are relatively inefficient on the TM-1000's programmable core, which is optimized for 8-bit or larger values.

The TM-1300, which should begin sampling by the end of the year, is the first TriMedia part built on 0.25-micron technology. This chip has no significant enhancements, but the better process should boost the TM-1300's clock frequency to 180 MHz. The extra speed will allow the TM-1300 to pursue more advanced video-editing tasks.

The TM-1300 will still be too slow to handle MPEG-2 and Dolby Digital decoding at the higher data rates and resolutions required by Advanced Television Systems Committee (ATSC) digital television. Philips has developed ATSC-decoder reference platforms for the TM-1000 and TM-1100, but these use a separate MPEG-2 decoder chip to do most of the work. To reduce subsystem cost, Philips began developing a new TriMedia processor that could handle ATSC decoding unassisted.

The result, the TM-2000 family, should begin sampling by the end of the year. These new chips incorporate the first major improvements to the TriMedia design. Philips anticipates 150-MHz operation on the first TM-2000 silicon, with a 200-MHz version to follow shortly.

In addition to minor (and unspecified) improvements to the core, the TM-2000 includes a fixed-function MPEG-2 video-decoding unit and special image-processing hardware devoted to high-definition television. Two versions of the TM-2000 are planned: one for high-definition television, the other for standard-definition applications. The TM-2HD adds on-chip memory and logic for better performance on high-definition video processing.



**Figure 1.** The TriMedia roadmap currently defines seven parts, ranging from the current TM-1000 to the TM-3HD, expected in late 2000. TM-1x00 parts are general-purpose devices; the TM-2000 and 3000 series are targeted at digital-video applications.

#### For More Information

Additional information on Philips' current TriMedia processors, the TriMedia software development environment, and TriMedia-based reference designs for digital television and videoconferencing is available online at *www.trimedia.philips.com*.

The first CPU64 device will be the general-purpose TM-1400, scheduled to sample in mid-2000. The TM-1400 is being designed for 0.18-micron process technology and a 1.8-V supply. The CPU64 core alone will have about seven million transistors and should achieve a 300-MHz clock frequency. This speed is said to be fast enough to handle real-time MPEG-2 encoding, even without the dedicated motion-estimation logic found in MPEG-2 encoder chips from C-Cube (see MPR 12/8/97, p. 1) and others.

Later in 2000, Philips plans a part currently known as the TM-3HD, which will add hardware acceleration for MPEG-2 processing. This chip, the first in the TM-3000 family, will use motion-compensated frame-rate up-conversion to boost video quality for both analog and digital television.

### Twice the Core It Was Before

The new TriMedia CPU64 core incorporates two major improvements over the original CPU32 core. The most obvious is the doubled datapath width. Many operations that formerly operated only on 32 bits of data have been extended with 64-bit forms, doubling the work done in each clock period. The other major improvement is the addition of new SIMD instructions, including several multimedia-calculation and data-movement operations that facilitate digital video editing and other video applications.



**Figure 2.** The TriMedia CPU64 core is augmented by three memory-management units, one for instructions and two for data. These MMUs allow future TriMedia chips to run complex embedded operating systems.

By leveraging both instruction-level and data-level parallelism, a 300-MHz CPU64 core reaches a peak execution rate of 24 billion 8-bit operations per second (five issue slots, eight byte-sized data elements per instruction, two operations per byte), more than twice the rate of the CPU32 core on a clock-for-clock basis. Including the effect of the higher clock speed, Philips' simulations show the TM-1400 is effectively 4.5 to 12 times faster than the original TM-1000 on real-world tasks, averaging about 6× faster. Peak floating-point throughput increases to 2.4 GFLOPS at 300 MHz.
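The peak-rate figure follows directly from the factors listed in parentheses. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only, and the per-cycle floating-point figure is derived from the quoted 2.4 GFLOPS rather than stated by Philips.

```python
def peak_ops_per_second(clock_hz, issue_slots, elements_per_instruction, ops_per_element):
    """Peak rate = clock x issue slots x SIMD elements x operations per element."""
    return clock_hz * issue_slots * elements_per_instruction * ops_per_element


CLOCK_HZ = 300_000_000  # 300-MHz CPU64 core, as quoted in the text

# Five issue slots, eight byte-sized elements per instruction, two operations per byte.
cpu64_peak = peak_ops_per_second(CLOCK_HZ, 5, 8, 2)
ops_per_cycle = cpu64_peak // CLOCK_HZ

# Derived, not quoted: 2.4 GFLOPS at 300 MHz implies this many FP operations per cycle.
flops_per_cycle = 2_400_000_000 // CLOCK_HZ

print(f"peak 8-bit ops/s: {cpu64_peak:,}")
print(f"8-bit ops per cycle: {ops_per_cycle}")
print(f"floating-point ops per cycle: {flops_per_cycle}")
```

These are peak numbers only; the 4.5× to 12× range reported from Philips' simulations reflects real workloads, which this calculation does not model.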

Philips' simulations show the CPU64 core can perform an IEEE 1180-compliant inverse discrete-cosine transform (iDCT) operation in just 56 cycles, compared with 160 for the CPU32 core. The CPU64 core is therefore almost twice as efficient as a PowerPC G4 with AltiVec (see MPR 5/11/98, p. 1), which can perform the same function in 102 cycles, according to Apple. However, the G4 is expected to run at speeds of 400 MHz or more, giving it roughly 1.5× the effective performance of a 150-MHz TM-1400 on this algorithm.

The same comparison made against Intel's measurements of a 400-MHz Pentium II (which uses MMX) gives the TM-1400 a 61% advantage. By the time the TM-1400 ships, however, Intel's slowest CPU will use the Katmai core at speeds of at least 500 MHz. Katmai's performance will likely keep the TM-1400 out of PC designs.
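The iDCT comparison above combines cycle counts with clock speeds. The sketch below works through that conversion using only the cycle counts and clock figures quoted for the CPU64, CPU32, and G4; the function name is illustrative, and the Pentium II is omitted because its cycle count is not given here.

```python
def idcts_per_second(clock_hz, cycles_per_idct):
    """Effective iDCT throughput for a core at a given clock speed."""
    return clock_hz / cycles_per_idct


# Cycle counts per iDCT, as quoted in the text.
CPU64_CYCLES = 56
CPU32_CYCLES = 160
G4_CYCLES = 102

# Clock-for-clock efficiency ratios (fewer cycles is better).
cpu64_vs_g4_per_clock = G4_CYCLES / CPU64_CYCLES
cpu64_vs_cpu32_per_clock = CPU32_CYCLES / CPU64_CYCLES

# Effective throughput at the clock speeds used in the comparison.
g4_rate = idcts_per_second(400e6, G4_CYCLES)
tm1400_rate = idcts_per_second(150e6, CPU64_CYCLES)
g4_vs_tm1400 = g4_rate / tm1400_rate

print(f"CPU64 vs G4, per clock: {cpu64_vs_g4_per_clock:.2f}x")
print(f"CPU64 vs CPU32, per clock: {cpu64_vs_cpu32_per_clock:.2f}x")
print(f"400-MHz G4 vs 150-MHz CPU64: {g4_vs_tm1400:.2f}x")
```

Because throughput scales linearly with clock speed in this model, substituting a different TriMedia clock changes the final ratio proportionally.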

### New MMU Enables Standalone Operation

Perhaps the most useful feature added to the TriMedia CPU64 architecture is a memory-management unit (MMU). Current TriMedia processors lack an MMU, limiting them to real-time operating systems that use physical addressing only. Philips provides the pSOS operating system from Integrated Systems (*www.isi.com*) as part of the software development environment for TriMedia, but only the pSOS kernel is provided. Higher-level services such as networking protocol stacks and user interfaces must be developed separately, and, currently, no TriMedia customer uses the TriMedia chip for these functions.

More advanced embedded operating systems, such as Windows CE, provide such services. These OSs use a virtual-memory addressing model to allocate physical memory among multiple tasks. Designers who wish to use current TriMedia processors in systems that use virtual memory must include a separate microprocessor capable of handling virtual-memory management and other tasks for which today's TriMedia chips are not well suited.

Philips has not described any plans to support Windows CE, but our analysis suggests that the CPU64 core with its MMU will be able to handle the demands of that operating system. TriMedia would be a natural choice for digital televisions, and both Philips and Microsoft have expressed a strong interest in this market. Only a few embedded processors, including the StrongARM-1500 (see MPR 12/8/97, p. 12), combine the features needed by Windows CE with the ability to handle digital-TV decoding.

The CPU64 core comes with an instruction MMU and a pair of data MMUs, as Figure 2 shows. Each MMU performs 32-bit address translation with page sizes from 4K to 16M and has a 64-entry fully associative TLB. Philips has made these MMUs compatible with the MIPS memory-management model to facilitate memory sharing in designs that include both a MIPS processor to manage the user interface and a TriMedia chip to handle multimedia operations.
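To make the translation scheme concrete, here is a toy software model of a 64-entry fully associative TLB with variable page sizes from 4K to 16M. It is a sketch under stated assumptions, not a description of Philips' hardware: the entry layout, the class and method names, and the FIFO eviction policy are all hypothetical, and MIPS-specific details such as address-space identifiers are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PAGE = 4 * 1024           # 4K, smallest page size cited in the text
MAX_PAGE = 16 * 1024 * 1024   # 16M, largest page size cited in the text
TLB_ENTRIES = 64              # entries per MMU, per the text


@dataclass
class TlbEntry:
    virt_base: int   # page-aligned virtual base address
    phys_base: int   # page-aligned physical base address
    page_size: int   # power of two between MIN_PAGE and MAX_PAGE


class FullyAssociativeTlb:
    """Toy model: any entry may hold any page, so a lookup checks every entry."""

    def __init__(self, capacity=TLB_ENTRIES):
        self.capacity = capacity
        self.entries = []

    def insert(self, virt_base, phys_base, page_size):
        if not (MIN_PAGE <= page_size <= MAX_PAGE) or page_size & (page_size - 1):
            raise ValueError("page size must be a power of two from 4K to 16M")
        if virt_base % page_size or phys_base % page_size:
            raise ValueError("base addresses must be page-aligned")
        if len(self.entries) == self.capacity:
            self.entries.pop(0)  # FIFO eviction: an assumption, not Philips' policy
        self.entries.append(TlbEntry(virt_base, phys_base, page_size))

    def translate(self, vaddr) -> Optional[int]:
        vaddr &= 0xFFFFFFFF  # 32-bit address translation
        for entry in self.entries:
            offset_mask = entry.page_size - 1
            if (vaddr & ~offset_mask) == entry.virt_base:
                return entry.phys_base | (vaddr & offset_mask)
        return None  # TLB miss; refill is outside the scope of this sketch


tlb = FullyAssociativeTlb()
tlb.insert(0x00400000, 0x10000000, MIN_PAGE)   # one 4K page
tlb.insert(0x01000000, 0x20000000, MAX_PAGE)   # one 16M page
small_hit = tlb.translate(0x00400123)
large_hit = tlb.translate(0x01ABCDEF)
miss = tlb.translate(0x00500000)
```

The model shows why large pages matter for media workloads: a single 16M entry covers a frame buffer that would otherwise consume thousands of 4K entries in a 64-entry TLB.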

The caches planned for the TM-1400 are no larger than those found on earlier TriMedia processors, however. Philips says that the current 32K instruction cache and 16K data cache will continue to work well, even in the more complex software environments enabled by the CPU64 MMUs.

Frans Sijstermans, senior scientist at Philips Research Labs, describes the new CPU64 core.

The company reasons that most multimedia algorithms have small inner loops that fit into 8K of instruction cache. At the other extreme, the large data sets used by these algorithms cannot be cached without more on-die cache than Philips can afford to put on these new chips.

We believe Philips is underestimating the cache demands that will be imposed by more complex operating systems, especially if user-interface features are added to create TV-based Web browsers or other applications running on TriMedia. If the TM-1400 caches turn out to be too small, Philips may boost the cache size on later TM-3000 parts.

### TriMedia Finds Defensible Niche

Despite the collapse of competing efforts from Chromatic and Samsung, Philips can't afford to let its TriMedia effort stagnate. The CPU64 core, though a substantial improvement over current TriMedia offerings, will face stiff competition for digital-television applications from a number of directions. The Project X media processor from VM Labs (see MPR 6/22/98, p. 4) will be used in Motorola's Blackbird reference design for digital TV. At the high end of this space, the forthcoming media-processor design from Equator Technologies (see MPR 6/22/98, p. 4) is said to provide impressive performance.

Even if media processors are never widely used in personal computers, Philips has found a way to sustain future growth for TriMedia. The company is one of the world's largest makers of consumer electronics, especially televisions. As long as the TriMedia group heeds the advice it is surely getting from Philips' other divisions, it has an almost guaranteed market for its products.