Saturday, July 16

Geek

The Magic May Return

A guest post, by, well, me, from seven years ago, with added commentary by me from today.

I've written recently on the untimely death of Moore's Law and on one of the first side-effects of the faltering and failure of that law. But, being somewhat dead myself, I didn't have the time or energy to go into any detail, and probably left my less-geeky readers saying something along the lines of Huh?

But this is important, so I'm going to give it another try.

Way back in 1965, just four years after the first integrated circuit was built, Gordon Moore, then working at Fairchild, made an observation and a prediction.

His observation was that the number of components in an integrated circuit was increasing, while the cost of each component was decreasing; his prediction was that this trend would continue. Intel has made his original paper available for you to read. It's a little bit complicated; Moore is talking about trends in the number of elements in an integrated circuit required to achieve the minimum cost per component - efficiencies of scale, in other words.

Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate.

For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent piece of semiconductor in the equivalent package containing more components. But as components are added, decreased yields more than compensate for the increased complexity, tending to raise the cost per component.

Thus there is a minimum cost at any given time in the evolution of the technology. At present, it is reached when 50 components are used per circuit. But the minimum is rising rapidly while the entire cost curve is falling (see graph below). If we look ahead five years, a plot of costs suggests that the minimum cost per component might be expected in circuits with about 1,000 components per circuit (providing such circuit functions can be produced in moderate quantities.) In 1970, the manufacturing cost per component can be expected to be only a tenth of the present cost.

The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph on next page). Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.

I believe that such a large circuit can be built on a single wafer.

What he's saying is that by 1975, it would be cheaper to build a single integrated circuit with 65,000 components than to build two 32,500-component circuits - and, by comparison, a 130,000-component circuit (if such a thing could be built) would cost more than twice as much.
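As a rough check on that arithmetic, here is a minimal sketch of the doubling-per-year projection. The starting counts are assumptions: 50 is the figure Moore quotes for the minimum-cost circuit of 1965, and 64 is a hypothetical round power of two, chosen only to show where a figure near 65,000 could come from. The function name is made up for illustration.

```python
def project_components(start, years, doubling_period_years=1.0):
    """Components per circuit after `years`, doubling once every `doubling_period_years`."""
    return start * 2 ** (years / doubling_period_years)

# 1965 -> 1975 is ten doublings at one doubling per year.
from_50 = project_components(50, 10)  # starting from Moore's quoted 50 components
from_64 = project_components(64, 10)  # hypothetical baseline of 64 (2**6) components

print(f"from 50 components: {from_50:,.0f}")
print(f"from 64 components: {from_64:,.0f}")
```

Either baseline lands in the tens of thousands, which is the point: ten years of annual doubling is a factor of 1,024.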

Events since then have proved him right (and happily he is still around to enjoy it). [Me'11: And still is today.] And more right than he imagined, because not only have the components been getting smaller and cheaper, but at the same time they have been getting faster and using less power. And this has been going on, following a curve where (to take the most famous example) processing power has been doubling every 18 months. For my entire life processing power has been doubling roughly every 18 months.

My first computer, which I bought as a teenager, saving pocket money every week until the day of the Big! Christmas! Sale! was a Tandy (Radio Shack to many) Colour Computer. It had 16k of ROM (which contained the BASIC interpreter; there was no operating system as such) and 16k of RAM. It was powered by a Motorola 6809 processor and a 6847 video chip. It had a maximum resolution of 256 by 192 - in black and white - or 16 lines of 32 columns in text mode.

It ran at 895kHz.

Yes, boys and girls, kilohertz. It was an 8-bit chip (with a few 16-bit tricks up its sleeve, admittedly); it could execute, at most, one instruction each cycle, and it ran at less than a megahertz. (Also, it had no disk drives at all; everything was stored on cassette tape, which fact is directly responsible for the irretrievable loss of my version of Star Trek and the completely original game Cheese Mites.)

Not quite twenty years on, I'm typing this on a system with a 2.6 gigahertz 32-bit processor that can execute as many as three instructions per cycle, some of which can perform multiple operations like doing 4 16-bit multiply-accumulates all at once. It has more level-one cache than my Colour Computer had total memory. Its front-side bus is eight times as wide and nearly a thousand times as fast. My display is running at 1792 by 1344 in glorious 24-bit colour. [That display sadly died not long after; it just got brighter and brighter until it burned out. Yes, a CRT.] And it has six hundred and fifty gigabytes of disk.* [Today you could buy that for less than $30.]

It cost a bit more, it's true. My 1984 Colour Computer cost me $149.95, and Kei, my 2003 Windows XP box, cost me around $2000. The best I can do today for $149.95 (ignoring for the moment two decades of inflation and the fact that this now represents a morning's earnings rather than a year's) is a Nintendo Gamecube. [Remember those?] The Gamecube only runs at 485MHz (achieving a measly 1125 MIPS); it only has 40MB of memory; it only has 1.5GB of storage. Its peak floating-point performance is a mere 10.5 GFLOPS, compared to the Colour Computer's... I don't know, exactly, since the CoCo had no floating-point hardware at all, and I doubt that the software emulation achieved so much as 10.5 kiloFLOPS.

So, depending on exactly what you wish to measure, 20 years of innovation has given us somewhere between a thousand and a million times better value for money.
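To see how the answer swings with the thing being measured, here is a small sketch that divides the GameCube figures by the Colour Computer figures quoted above. Both machines cost $149.95, so the raw ratios double as value-for-money ratios. The assumptions are labelled in the comments: "16k" and "40MB" are read as binary units, and the CoCo's floating-point rate uses the generous 10.5 kiloFLOPS ceiling, so that ratio is a lower bound. The 18-month doubling figure is included only for comparison.

```python
# Figures quoted in the post; both machines cost $149.95.
coco = {
    "clock_hz": 895e3,           # 895 kHz
    "ram_bytes": 16 * 1024,      # "16k", read here as 16 KiB (assumption)
    "flops": 10.5e3,             # upper bound guessed in the post for software floating point
}
gamecube = {
    "clock_hz": 485e6,           # 485 MHz
    "ram_bytes": 40 * 1024**2,   # "40MB", read here as 40 MiB (assumption)
    "flops": 10.5e9,             # 10.5 GFLOPS peak
}

ratios = {key: gamecube[key] / coco[key] for key in coco}

def doubling_growth(years, doubling_period_years=1.5):
    """Growth factor after `years` if capability doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

expected = doubling_growth(20)

for key, value in ratios.items():
    print(f"{key:>10}: {value:,.0f}x")
print(f"18-month doubling over 20 years: {expected:,.0f}x")
```

Clock speed alone gives a few hundred times, memory a few thousand, and peak floating point at least a million, which brackets the figure that steady 18-month doubling would predict.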

And here it is again: This has been going on for my entire life. Every year, tick tick tick, new and better and faster and cheaper. You buy the latest and greatest and it's obsolete before you get home from the mall. It's so much a part of our lives that it's a joke, a cliche.

And it just died. [That last link goes to an IBM presentation, the first 13 pages of which are just general marketing material, but pages 14 to 24 go right to the heart of the problem.] [In fact, all of those links are now dead, so I've excised them. This PDF of a PPT from Intel covers the same ground though - on page 14:
"Classical MOSFET scaling ENDED at the 130nm node (and nobody noticed)"
- almost the exact same sentiment expressed by IBM.]

The death of Moore's Law has been predicted many times, not least by Moore himself, but when you get IBM's Chief Technology Officer saying
"Scaling is already dead but nobody noticed it had stopped breathing and its lips had turned blue."
you know something's up. Particularly when he's not making a prediction, but talking about what's happening right now. [Scaling may be dead, but Wired have kept that link alive for 7 years.]

And everything was planned so neatly too. 90 nanometres was to come on line late '03, ramping up this year; 65 nanometres was to be the big thing of '05, followed by 45 nanometres in '07. Now, beyond that, at 30 nanometres and 20 nanometres, things were less clear, and beyond 20 nanometres not clear at all, but at least the path was marked out from the old 130 nanometre stuff down to 45, giving us 9 times the transistors and 3 times the speed. Only someone forgot to check with the laws of physics.
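The "9 times the transistors and 3 times the speed" figures can be roughly reproduced with a minimal sketch of ideal scaling. The model is an assumption, not something the post spells out: transistor density is taken to grow with the square of the linear shrink, and speed with the shrink itself. The function name is hypothetical.

```python
def ideal_scaling(old_nm, new_nm):
    """Idealised gains from shrinking feature size from old_nm to new_nm.

    Assumes density scales with the square of the linear shrink and
    switching speed scales linearly with it.
    """
    shrink = old_nm / new_nm
    return {"density": shrink ** 2, "speed": shrink}

# The roadmap nodes named in the post, each compared against 130nm.
for node_nm in (90, 65, 45):
    gains = ideal_scaling(130, node_nm)
    print(f"130nm -> {node_nm}nm: "
          f"{gains['density']:.1f}x transistors, {gains['speed']:.1f}x speed")
```

Under these assumptions 130nm to 45nm works out to a little over 8 times the transistors and a little under 3 times the speed, so the figures in the text are the same numbers rounded up.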

Wired: How long will Moore's Law hold?

Moore: It'll go for at least a few more generations of technology. Then, in about a decade, we're going to see a distinct slowing in the rate at which the doubling occurs. I haven't tried to estimate what the rate will be, but it might be half as fast - three years instead of eighteen months.

Wired: What will cause the slowdown?

Moore: We're running into a barrier that we've run up against several times before: the limits of optical lithography. We use light to print the patterns of circuits, and we're reaching a point where the wavelengths are getting into a range where you can't build lenses anymore. You have to switch to something like X rays.

So, what exactly is the problem? It's not, as Moore and others predicted, a question of actually building the circuits - that's still working fine. IBM, Intel, AMD and others have all produced working chips at 90 nanometres. The problem is leakage. Each of the millions of transistors in a chip is a tiny switch, turning on and off at incredible speeds. Each time you turn the transistor on, or off, you need to use a little bit of electricity to do so. That's okay, and it's expected, because you don't get anything for free. The problem is that the transistors are now so small, and the layers of insulation - the dielectric - so thin, that they leak. There's a partial short-circuit, and so instead of only using power when the switch switches, it's using power all the time.
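The difference between switching power and leakage can be sketched with the usual first-order model: dynamic power is activity x capacitance x voltage squared x frequency, and leakage power is simply voltage x leakage current. Every number below is a hypothetical value picked to give light-bulb-sized totals, not a measurement of any real chip, and the function names are made up for illustration.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """Switching power in watts: only spent when transistors change state."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz

def leakage_power(voltage_v, leakage_current_a):
    """Static power in watts: drawn all the time, switching or not."""
    return voltage_v * leakage_current_a

def total_power(capacitance_f, voltage_v, frequency_hz, activity, leakage_current_a):
    return (dynamic_power(capacitance_f, voltage_v, frequency_hz, activity)
            + leakage_power(voltage_v, leakage_current_a))

# Hypothetical chip: 20 nF of switched capacitance, 1.4 V, 3 GHz, 20 A of leakage.
busy = total_power(20e-9, 1.4, 3e9, activity=0.5, leakage_current_a=20.0)
idle = total_power(20e-9, 1.4, 3e9, activity=0.0, leakage_current_a=20.0)

print(f"busy: {busy:.1f} W, idle: {idle:.1f} W")
```

With no leakage, the idle figure would be zero; with it, the chip burns power even when nothing is switching, which is the problem the paragraph above describes.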

So what? Electricity is cheap. Well, the so what is heat. Modern microprocessors use as much electricity as a light bulb, and that means they produce just as much heat. If they didn't have huge heat sinks and fans bolted onto them, they'd very quickly overheat and fail - a fact that some people have independently discovered.

Until now, each new generation of scaling, each new node, has brought smaller, faster, cheaper and cooler transistors. At 90 nanometres, transistors are smaller, cheaper, probably faster again - but they run hotter. And the competition in the processor market has already driven power consumption (and heat generation) about as high as it can go. So when the new generation was discovered to increase the heat rather than decrease it, the whole forty-year process of accelerating change ran head-first into a wall.

Back at the end of 2002, I made the following set of predictions for the coming year. I felt pretty comfortable in all of them, the first no less than any of the others:

My predictions for 2003:

1. Microprocessors will hit 4GHz by the end of the year. Marketers will try and largely fail to convince the public to buy them.
2. A major scientific breakthrough will lead to a new and deeper understanding of something.
3. A major political scandal will result in a huge media kerfuffle and only die down when someone resigns.
4. There will be a war.
5. Bad weather will affect the lives of millions of people.
6. There will not be any major, civilisation-destroying meteor impacts.
7. Astronomers will find new and interesting things in the sky.
8. Spam, pop-ups and viruses will continue to plague us. The Internet will fail to collapse under the strain. Pundits will predict that this will now happen in 2004.
9. A rocket will explode either on the launch pad or early in its flight, destroying its expensive payload - which will turn out to be uninsured.
10. Cod populations in European waters will continue to fall, and the European parliament will fail to act to prevent this.
11. A new species of mammal will be discovered.
12. A species of reptile or amphibian will be reported as extinct.

But not only did we not see 4GHz processors in 2003, it's doubtful that we'll see them in 2004 either. [Nope.] (I was wrong about number 3, too. No-one resigned, and the media moved onto the next scandal. Rinse, repeat.)

Now, assuming you're not a hard-core computer gamer, hanging out for the release of Doom 3 [Mass Effect 3] and Half-Life 2 [Half-Life 2 Episode 3], why should you care?

Well if you have broadband internet, or a mobile phone, or a DVD player, or a PDA [A what?], or a notebook computer, or a digital camera (or a digital video camera), or you use GPS on your camping trips, or you enjoy the low cost of long-distance phone calls these days, if you download anime or the latest episode of Angel [Doctor Who] off the net, if you take your iPod [iPad] with you everywhere you go, if your job or your hobby involves using e-mail or looking things up on the Web, you can thank Moore's Law for it.

Modern communications depend critically on advanced signal processing techniques, performed by specialised chips called Digital Signal Processors, or DSPs. These things are everywhere - every modem, every mobile or cordless phone, every digital camera, every TV or VCR or DVD player, every stereo, every disk drive. It's the relentless advance of Moore's Law that has made DSPs fast enough and cheap enough to do all this, and made them efficient enough to run on batteries so well that your mobile phone might last a week between charging. (My first mobile was lucky to make it through the day.) Disk drives demand high-speed DSPs to sort out the signals coming from the magnetic patterns on the disk and turn them back into the original data. DVD players need them to turn the tiny pits pressed into the aluminium surface into a picture. The entire global telephone network, mobile and fixed, depends on DSPs. And any advances in any of these areas will require more and faster and cheaper DSPs and - uh-oh.

And there's more: The advances in computers and communications over the past four decades have been the primary driver of the global economy. The economy has been growing all that time, even though we have made no fundamental breakthroughs in finding new resources or new materials. If you're better off than your parents, you can thank Moore's Law for a big chunk of that - if not the effort you put in, then the new opportunities it opened up.

And it just died.

I don't think the financial markets have a clue yet what's going on, but in any case it's going to be a soft landing. All of the processor manufacturers have been in a mad rush over the last decade to produce faster chips at the expense of pretty much anything else. The funny thing is that they've been pushing so hard, they've left a lot of things behind. Take a look at this chart:

 int    fp
base  base  Processor
1076   763  Pentium M 1.6GHz
 805   635  Pentium M 1.1GHz
 237   148  C3 1.0GHz (C5XL)
 398   239  Celeron 1.2GHz (FSB100)
 543   481  Athlon XP Barton 1.1GHz (FSB100 DDR)
 581   513  Athlon XP Thoroughbred-B 1.35GHz (FSB100 DDR)
1040   909  Athlon XP 3200+ (Barton 2.2GHz, FSB200 DDR)
1276  1382  Pentium 4 3.0E GHz Prescott (FSB800), numbers from spec.org
1329  1349  Pentium 4 3.2E GHz Prescott (FSB800)
 560   585  Athlon 64 3200+ 0.8GHz 1MB L2
1257  1146  Athlon 64 3200+ 2GHz 1MB L2

You don't have to understand exactly what this means, but the first number relates to "integer" performance, which is important for things like word processing and web browsing and databases, and the second number relates to "floating-point" performance, which is important for games. (Well, and other things too.)

The Pentium M is a modified version of the Pentium III, customised for notebook computers. Since notebook computers run off batteries, and batteries don't hold much power at all, the Pentium M has been tweaked to provide as much speed as possible while using as little power as possible. The Pentium 4, on the other hand, is designed for speed at the expense of everything else. And what we find is that the 3.2GHz Pentium 4, despite having twice the clock speed of the 1.6GHz Pentium M, is just 25% faster on integer (useful work) and 75% faster on floating point (games).

And - here's the tricky bit, and the cause of Intel's recent and dramatic change in direction - the Pentium 4 uses four times as much power as the Pentium M. So if, instead of putting one Pentium 4 onto a chip, you put four Pentium Ms, it would use the same amount of power and produce the same amount of heat, but it would run up to three times as fast... Overall.
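The percentages and the "up to three times" claim can be checked directly against the chart. This is a minimal sketch using the 1.6GHz Pentium M and 3.2GHz Pentium 4 rows; the four-way figure assumes the four cores scale perfectly, which is an idealisation, and takes the post's four-to-one power ratio at face value.

```python
# Scores from the chart above (int base, fp base).
pentium_m = {"int": 1076, "fp": 763}    # Pentium M 1.6GHz
pentium_4 = {"int": 1329, "fp": 1349}   # Pentium 4 3.2E GHz Prescott

POWER_RATIO = 4  # the post's figure: one Pentium 4 draws as much as four Pentium Ms

results = {}
for kind in ("int", "fp"):
    p4_advantage = pentium_4[kind] / pentium_m[kind]
    # Ideal aggregate throughput of four Pentium Ms in the same power budget.
    four_m_vs_p4 = POWER_RATIO * pentium_m[kind] / pentium_4[kind]
    results[kind] = (p4_advantage, four_m_vs_p4)
    print(f"{kind:>3}: Pentium 4 is {100 * (p4_advantage - 1):.0f}% faster; "
          f"four Pentium Ms would be {four_m_vs_p4:.1f}x one Pentium 4")
```

On these numbers the four-core arrangement comes out a bit over three times faster on integer work and a bit over twice as fast on floating point, for the same power.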

Which is great and wonderful if you can use four processors at once. I can, quite happily, and more than that. A word processor can't, not easily, but then word processors already run pretty well. Games, and other graphics-intensive stuff like Photoshop or 3D animation software certainly can, though most games haven't been written to do so. Not yet.
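How much four processors help depends on how much of the work can actually run in parallel. One standard way to put a number on that is Amdahl's law, which the post does not name; it is brought in here purely as an illustration. The parallel fractions below are hypothetical stand-ins, not measurements of any real word processor or game.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Overall speedup when only `parallel_fraction` of the work can use all processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# Hypothetical workloads on a four-processor chip.
workloads = {
    "mostly serial (10% parallel)": 0.10,
    "half and half (50% parallel)": 0.50,
    "highly parallel (95% parallel)": 0.95,
}

for name, fraction in workloads.items():
    print(f"{name}: {amdahl_speedup(fraction, 4):.2f}x on 4 processors")
```

A workload that is only half parallel gets 1.6 times faster on four processors, not four times, which is why software that hasn't been written for multiple cores sees little of the benefit.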




Or so the situation was seven years ago. What's changed? Well, now I can have a game busy-wait on two cores at once.

The situation turned out not to be quite so dire as it appeared at the time, though a huge amount of engineering effort has gone into the advances we've seen in recent years. And yet, the server we recently deployed at my day job, while it has forty processors (yes, four-zero), is still based on the Pentium Pro (through at least six generations of intermediary designs) and only runs at 2GHz.

The bright spots have been not so much in the CPU cores themselves, as in the vector (a.k.a. SIMD) units, which have grown from 64 to 128 to 256 bits, and in video cards, which are just masses of vector processors all working together. Video cards have hit a limit too, though; they're choking on their own heat. A single high-end card can use more than 300W, about as much as a well-configured PC at the time of the original post.

And we still haven't broken 4GHz in a mainstream processor.

There are two particular beacons on the horizon at the moment. One comes from AMD, the Zambezi-Orochi-Bulldozer chip I mentioned in a recent post. If pre-launch data is correct, they expect to provide 8 cores running at 4.2GHz (and up to 4.7GHz when conditions are right) within a 125W power budget. That's a lot of processing power for a fairly low-end chip. It has some limitations; in particular, it only has one full vector unit per pair of cores, so for floating-point heavy applications like games and video editing, it will be no faster than Intel's four-core chips. For the stuff I do, though - web sites and databases - it will (again assuming the details are correct) slam Intel's chips into the ground.

The other ray of light comes from Intel, because, while we might loathe the behaviour of their marketing department, they are no slouches when it comes to engineering. Their new FinFET transistors, debuting on the upcoming 22nm node, allow their chips to cut overall power consumption in half. Which means, since everything computational nowadays is limited either directly by available power or indirectly by heat dissipation, that everything can get twice as fast. Not vertically, but at least horizontally.

So we're talking about mainstream desktop processors with sixteen cores, running at well over 4GHz, coming your way in the next year or two. It's not the 10GHz Pentium 4 that Intel promised us all those years ago, but it will serve. Before much time has passed we'll see games busy-waiting on eight cores, you mark my words.

* That's dedicated disk; we'll set aside the terabyte or so living in the file server. **

** Which died in the great server crash of...  Around 2007, I think.  Took me ages to recover all that anim...  Data.

Posted by: Pixy Misa at 11:16 PM