Multicore Gives More Bang for the Buck

  By Peter Claydon
EE Times

October 15, 2007

It has been clear for some time that a law of diminishing returns applies to the advancement of conventional processor architectures. Each new process geometry and microarchitecture delivers successively less in terms of performance gains: It is simply no longer possible to deliver Moore's Law by going faster.

Intel has acknowledged this truth by complementing Moore's Law with Pollack's Rule, named after Fred Pollack, director of Intel's microprocessor research labs. Pollack observed that each new Intel architecture, starting with the i386, has required two to three times the silicon area (in a comparable process) while delivering only a 1.4 to 1.7 times improvement in performance. In short, performance increases in proportion to the square root of complexity. In two generations, performance doubles for a fourfold increase in complexity; a 4X increase in speed (six years of Moore's Law) requires 16 times more transistors.
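The arithmetic behind Pollack's Rule can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the 18-month doubling period is an assumed reading of "Moore's Law," chosen because it reproduces the six-year figure quoted above.

```python
import math


def pollack_speedup(complexity_ratio: float) -> float:
    """Performance gain under Pollack's Rule: the square root of the complexity increase."""
    return math.sqrt(complexity_ratio)


def complexity_for_speedup(speedup: float) -> float:
    """Complexity (transistor or area) multiple needed for a given speedup."""
    return speedup ** 2


def years_of_doubling(complexity_ratio: float, doubling_period_years: float = 1.5) -> float:
    """Years of transistor-count doubling needed to reach the given multiple.

    The 18-month doubling period is an assumption, not a figure from the article.
    """
    return math.log2(complexity_ratio) * doubling_period_years


if __name__ == "__main__":
    # Two to three times the area should give roughly 1.4 to 1.7 times the performance.
    for area in (2.0, 3.0, 4.0):
        print(f"{area:.0f}x area -> {pollack_speedup(area):.2f}x performance")

    needed = complexity_for_speedup(4.0)
    print(f"4x speed needs {needed:.0f}x transistors, "
          f"about {years_of_doubling(needed):.0f} years of doubling")
```

Under these assumptions, doubling or tripling the area yields about 1.41 and 1.73 times the performance, which is consistent with the 1.4 to 1.7 range Pollack observed.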

Meanwhile, many believe they have also determined what will prove to be the limiting factor in terms of process shrinks: not lithography or quantum physics, but the enormous power densities inherent in doing a great deal of processing work in a very small physical space (notoriously, the power density in a core has passed that of an iron, and is nearing that of a rocket nozzle).

With power consumption and the limitations of microarchitecture enhancements putting a brake on uniprocessor progress, attention has turned to the potential of multiprocessor, or multicore, chip architectures. The concept is becoming familiar in desktop processors, although multicore products have, in fact, been around for some time. PicoChip's picoArray products, for instance, have 300 cores on one die and have been on the market since 2002.

Because multicore is used to overcome the limitations of uniprocessor systems, many people assume that it is all about sheer performance. Our experience at picoChip is that the technology's impact is more subtle, and at the same time more far-reaching, than simple processing power. For us and our customers, the most important point is that multicore allows the picoArray to deliver 10 times more MIPS per dollar than standard processors.

What that means in practice is that, for the same price and with the same power consumption, one can build a device with 10 times more processing power than a standard DSP. And one can implement applications that would traditionally have required a large FPGA, at lower cost and with lower power consumption.

On an even more practical level, these attributes of cost-effectiveness enable the development of mass-market products such as 3G femtocells (also known as home basestations), which require immense processing capability at aggressive price points and with fast time-to-market. They also underlie the picoArray's success in implementing promising technologies such as WiMax, which, again, have exacting requirements of processing power and bill of materials cost.

There are other misconceptions about multicore processors as well. The second main one is that the term "multicore" refers to a single technology. In fact, it encompasses a whole range of approaches, differentiated by two factors: size of the core and number of cores used.

Clearly, there is a huge difference between a dual-core processor that powers a desktop computer, and a product like the picoArray, with hundreds of cores working in parallel. And yet both systems truly are "multicore."

There is a major divergence between the needs of an embedded system and a general-purpose laptop. In the latter, an open environment with constantly changing circumstances (as the consumer may run any program at any time) places a huge emphasis on consistent performance even as hardware generations change. In the former, by contrast, it is more common for a system to run one single (albeit complex) task, and to recompile code as hardware changes.

In fact, size of core and size of array are fundamental decisions in delivering high MIPS-per-dollar figures from a multicore architecture, and they must be made with full knowledge of the target applications for the device. PicoChip did a careful analysis of the optimum core size--or MIPS per million transistors--for our target applications in wireless signal processing. The resultant cores are relatively simple--nearer to the complexity of the Intel 8086 than that of the Pentium 4.

Our analysis suggests that a 16-bit three-way long instruction word architecture, with a three-deep pipeline and a 100- to 200-MHz clock, is about the optimum in terms of MIPS/mm² of silicon or MIPS/mW. There is a Darwinian logic to this conclusion: This is the architecture inside the billion-plus cell phones shipped each year; if there were a more efficient architecture, the intense price pressures and fast-moving product cycles of that market would have driven it to that optimum very swiftly. Using such structures, we can fit several hundred cores on each device. This immediately surmounts one of the traditional problems of multiprocessor systems: that making a process "a little bit parallel" provides limited performance gains for a significant increase in complexity and decrease in usability. The proof of this is that independent benchmarks do indeed show (as predicted) a 10 to 20X gain in performance per dollar and performance per watt: The same number of transistors (die area, cost, power) can deliver dramatically more results. In essence, several years of Pollack's Rule have been "unwound."
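One way to see how Pollack's Rule can be "unwound" is a back-of-the-envelope model in Python. It is not picoChip's own analysis, and the benchmark figures above come from measurement rather than from this model. The sketch assumes that Pollack's Rule holds for small cores as well as large ones, that a fixed transistor budget is split evenly across identical cores, and that the workload is perfectly parallel with no communication cost. All function names are hypothetical.

```python
import math


def core_performance(transistors: float) -> float:
    """Relative performance of one core under Pollack's Rule (square root of complexity)."""
    return math.sqrt(transistors)


def array_throughput(total_transistors: float, cores: int) -> float:
    """Aggregate throughput when the budget is split evenly across identical cores.

    Assumes ideal parallelism: no communication cost and no serial work.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * core_performance(total_transistors / cores)


def gain_over_single_core(cores: int, total_transistors: float = 1.0) -> float:
    """Throughput of the array relative to one large core using the same budget."""
    return array_throughput(total_transistors, cores) / core_performance(total_transistors)


if __name__ == "__main__":
    for n in (1, 2, 100, 300, 400):
        print(f"{n:>3} cores: {gain_over_single_core(n):.1f}x")
```

In this idealized model the gain is the square root of the number of cores, so two cores give only about 1.4 times the throughput, while several hundred give well over 10 times. Real workloads with serial sections or heavy interprocessor traffic would fall short of these figures.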

This brings us to usability--arguably, the most important issue. A hardware structure that supports interprocessor communication is vital in getting the most out of a multicore architecture. Above that, it is the software development tool chain that will make or break a design project. More MIPS per dollar is all very well, but if the design team takes twice as long to harness those MIPS, the economic argument (not to mention the time-to-market argument) disappears. That's why picoChip has invested in a tool chain that supports structured programming via familiar hardware- and software-description languages.

So we can conclude that, though multicore is definitely here to stay, the key challenges are not solved simply by throwing extra cores at the problem. It is necessary to scale the number and size of the processor units to the application; to provide adequate interprocessor communication; and, most important, to ensure that software designers can be as productive as possible, by supplying a standards-based, intuitive tool chain.

The bad news is that only with all these things in place does multicore deliver the extra bang for the buck. The good news is that not only are these things realistically possible, they have been proven in products that are already on the market.

Peter Claydon (peter.claydon@picochip.com) is COO and co-founder of picoChip. Claydon was general manager of Oak Technology Ltd. and has 20 years of management and engineering experience.