Oh, lovely, you're a cheery one aren't you?
Sunday, July 31
A database is a record storage system that provides a generalised mechanism for locating any given item among N items in less than O(N) time.
First Corollary: Any record storage system that does not provide such a mechanism, regardless of what other capabilities it might exhibit, is not a database.
Second Corollary: Any record storage system that does not provide such a mechanism will eventually fail due to unforeseen queries that require O(N) time.
* From Hal Draper's seminal but little-known 1961 paper on information science.
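To make the definition concrete, here is a minimal Python sketch (mine, not Draper's) contrasting a plain scan, which may touch all N records, with a lookup through a sorted key index, which needs roughly log2(N) comparisons. The record layout and the names linear_find and indexed_find are made up for illustration.

```python
from bisect import bisect_left

def linear_find(records, key):
    """O(N): walk the records one by one; returns (value, records examined)."""
    for steps, (k, value) in enumerate(records, start=1):
        if k == key:
            return value, steps
    return None, len(records)

def indexed_find(sorted_keys, values, key):
    """O(log N): binary search over a sorted key index."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return values[i]
    return None

# Hypothetical data set: 100,000 records keyed by consecutive integers.
records = [(n, f"record-{n}") for n in range(100_000)]
sorted_keys = [k for k, _ in records]
values = [v for _, v in records]

value, steps = linear_find(records, 99_999)
print(value, "found by scan after", steps, "records")
print(indexed_find(sorted_keys, values, 99_999), "found via the index")
```

The scan has to look at every record to find the last one; the index gets there in about 17 probes.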
Monday, July 25
(Cute Overload < Zooborns < Koala Land)
Thursday, July 21
Also, the new Mac Mini Lion Server. A teeny-tiny quad-core i7 running MacOS X Server? Sold! It has everything except for tons of storage, and Synology will deal with that. I've been looking at the Mac Mini for years, and this is easily the best config they've ever offered.
Sunday, July 17
It's been a while since I bought myself a new toy.0 My current pair of laptops from last year (Mio and Sae) are doing fine and don't need replacing. My current Windows desktop (which dates back to 2008) needs to be rebuilt, but I already have the parts for that and just haven't found the time to do the necessary work. My Linux box is pretty much okay, though it's full of backup files because I have nowhere else to put them. More memory would be nice when I'm playing around with large databases or working with lots of virtual machines at once, but to get there I'd have to replace the motherboard as well, because while the current board supports 16GB of DDR2, up from the 8GB actually installed, it would be cheaper to swap the board for a DDR3 model than to track down the rare and expensive unbuffered 4GB DDR2 DIMMs I need.
Which ain't no fun.
A new video card is always nice, but my 4850 is doing pretty well, and I've finished Mass Effect 2 and Dragon Age: Origins, so there's nothing taxing I really need to play until Mass Effect 3 comes out next year. And with AMD's 7000 series based on a 28nm process expected around the end of the year, now is not the time to buy.
I even have an SSD - an 80GB Intel X25-M - sitting here that I haven't had time to install.
What I really need is products that are reasonably priced and simply do what it says on the box. Buy them, plug them in, leave them to work.
I've had my eye on some cheap Buffalo Linkstation Pro Quads - a cute little 4-bay NAS, about 6x6x9 inches, which goes for less than $250 locally without disks. I can pick up 2TB Seagate or Western Digital drives1 for less than $90 each, so it's about $600 for a 6TB RAID-5 NAS. Pretty good; my LaCie 8TB RAID-5 unit ran something like $2000 back when. But I'd need more than one of the Buffaloes, and that's kind of a bore, even if they're small and cute and I can name them after the fairies from Sugar.2
I regretted not getting more of the old Acer Easystores when those were going cheap (the discontinued Linux model, not the later Windows Home Server model), so I've been thinking of maybe getting three or even four of the Buffaloes.
Then I saw this:
And I thought to myself, wait a minute. Wait a minute. 12 bays? Nearly 200MB per second?4 InfiniBand5 expansion for a second cabinet and another 12 bays? That's got to cost a fortune.
Well, in fact it's not cheap, but it's not nearly as expensive as I'd expected. $1550 for a 12-bay high-performance (for the market segment) SMB6 NAS is a steal. It's not rack mount, but I don't have a rack. The expansion cabinet isn't much cheaper than the main unit itself, making that a dubious proposition, but it's there if you need it.
As well as the basic SMB,7 NFS, and FTP, it's an iSCSI target (that is, it can serve as raw disk as well as shared filesystems), a web server with MySQL and PHP, a recording station for TCP/IP-based video cameras, a streaming audio server, an automated BitTorrent/eMule/Usenet downloader, an iTunes server, a print server (it has four USB ports for attaching widgets), a mail server, a firewall, and a VPN server.
Or to put it another way, it runs Linux. But it has a pretty front-end.
It does all the usual things you'd expect storage-wise: RAID-0, 1, 5, 6, and 10, online capacity expansion and RAID level migration, and it has a hybrid RAID mode (like the Drobo) that lets you mix and match drive sizes and automatically balances them with single or double failure protection and gives you the optimum disk space.
It's a fair bit bigger than the Buffaloes - about a twelve-inch cube, so about four times the size and six times the price for the empty box. But I can pop a dozen cheap 2TB drives in there, RAID-6 them, and then forget about it. That's precisely what I need.
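The capacity and cost figures above are simple arithmetic, sketched here in Python. It assumes equal-sized drives, one drive's worth of parity for RAID-5 and two for RAID-6, and takes the "less than" prices quoted in this post as upper bounds; the function name usable_tb is just for illustration.

```python
def usable_tb(drives, drive_tb, parity_drives):
    """Usable capacity when `parity_drives` worth of space goes to parity."""
    return (drives - parity_drives) * drive_tb

# Prices are the "less than" figures from the post, so treat them as upper bounds.
DRIVE_PRICE = 90

buffalo_tb = usable_tb(drives=4, drive_tb=2, parity_drives=1)    # RAID-5
buffalo_cost = 250 + 4 * DRIVE_PRICE

big_nas_tb = usable_tb(drives=12, drive_tb=2, parity_drives=2)   # RAID-6
big_nas_cost = 1550 + 12 * DRIVE_PRICE

for name, tb, cost in [("Buffalo, RAID-5", buffalo_tb, buffalo_cost),
                       ("12-bay, RAID-6", big_nas_tb, big_nas_cost)]:
    print(f"{name}: {tb}TB usable, ${cost}, ${cost / tb:.0f} per TB")
```

On these numbers the big box costs a little more per usable terabyte, but it survives two drive failures and it's one thing to manage instead of three or four.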
There's an even more powerful model, the DS3611xs, with four network ports, two InfiniBand ports for expansion, and two expansion slots besides, but that costs more than twice as much and still only has 12 bays, so it's not what I need. Might be just the ticket for the office, though. And this is where the expansion units make sense - with a much more capable but concomitantly exorbitant main unit, you'd want the ability to add as much storage as possible, and the DX1211 expansion runs a little over a third the price of the DS3611xs.
Update: With the latest version of the management software (currently in beta) it's also an ISO server (serving CD/DVD/Blu-ray images as network drives), a syslog server (for collecting logging data from multiple Linux or Unix systems), an Apple Time Machine backup server, a scan and fax server, an LDAP server, and a YouTube video snaffler, among other new features. It's really quite shiny.
0 The Steam and GOG sales don't count.
1 I already have a small herd of external 2TB Western Digital drives here that I've been using for backups for my dying Windows box. Wonder if they come out of their casings easily?
2 The LaCie is in fact named Sugar. It's white3, and it's a cube, so that was obvious.
3 Well, it looked white in the photos. It's not white. It's an unpainted cast aluminium block. I named it Sugar anyway.
4 Link aggregation.
6 Small-Medium Business.
7 Server Message Block, a.k.a. CIFS.
Saturday, July 16
A guest post, by, well, me, from seven years ago, with added commentary by me from today.
I've written recently on the untimely death of Moore's Law and on one of the first side-effects of the faltering and failure of that law. But, being somewhat dead myself, I didn't have the time or energy to go into any detail, and probably left my less-geeky readers saying something along the lines of Huh?
But this is important, so I'm going to give it another try.
Way back in 1965, just four years after the first integrated circuit was built, Gordon Moore, then working at Fairchild, made an observation and a prediction.
His observation was that the number of components in an integrated circuit was increasing, while the cost of each component was decreasing; his prediction was that this trend would continue. Intel has made his original paper available for you to read. It's a little bit complicated; Moore is talking about trends in the number of elements in an integrated circuit required to achieve the minimum cost per component - efficiencies of scale, in other words.
"Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate.
"For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent piece of semiconductor in the equivalent package containing more components. But as components are added, decreased yields more than compensate for the increased complexity, tending to raise the cost per component.
"Thus there is a minimum cost at any given time in the evolution of the technology. At present, it is reached when 50 components are used per circuit. But the minimum is rising rapidly while the entire cost curve is falling (see graph below). If we look ahead five years, a plot of costs suggests that the minimum cost per component might be expected in circuits with about 1,000 components per circuit (providing such circuit functions can be produced in moderate quantities.) In 1970, the manufacturing cost per component can be expected to be only a tenth of the present cost.
"The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph on next page). Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.
"I believe that such a large circuit can be built on a single wafer."
What he's saying is that by 1975, it would be cheaper to build a single integrated circuit with 65,000 components than to build two 32,500-component circuits - and, by comparison, a 130,000-component circuit (if such a thing could be built) would cost more than twice as much.
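The extrapolation in the quoted passage is plain compounding: double once a year for ten years. A rough Python sketch follows. The starting count of 64 components in 1965 is my assumption, picked because it lands on the 65,000 figure; starting from the "50 components" mentioned in the text gives a somewhat lower number, so the quoted figures are clearly rounded.

```python
def components_at_min_cost(year, base_year=1965, base_count=64):
    """Components per circuit at minimum cost, doubling once a year.

    base_count=64 is an assumed 1965 starting point, not a figure from the post.
    """
    return base_count * 2 ** (year - base_year)

for year in (1965, 1970, 1975):
    print(year, components_at_min_cost(year))

# Same rule, starting from the 50 components quoted for "at present".
print(1975, components_at_min_cost(1975, base_count=50))
```

Ten doublings is a factor of 1,024, which is how you get from a few dozen components to tens of thousands in a decade.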
Events since then have proved him right (and happily he is still around to enjoy it). [Me'11: And still is today.] And more right than he imagined because not only have the components been getting smaller and cheaper, but at the same time they have been getting faster and using less power. And this has been going on, following a curve where (to take the most famous example) processing power has been doubling every 18 months. For my entire life processing power has been doubling roughly every 18 months.
My first computer, which I bought as a teenager, saving pocket money every week until the day of the Big! Christmas! Sale! was a Tandy (Radio Shack to many) Colour Computer. It had 16k of ROM (which contained the BASIC interpreter; there was no operating system as such) and 16k of RAM. It was powered by a Motorola 6809 processor and a 6847 video chip. It had a maximum resolution of 256 by 192 - in black and white - or 16 lines of 32 columns in text mode.
It ran at 895kHz.
Yes, boys and girls, kilohertz. It was an 8-bit chip (with a few 16-bit tricks up its sleeve, admittedly); it could execute, at most, one instruction each cycle, and it ran at less than a megahertz. (Also, it had no disk drives at all; everything was stored on cassette tape, which fact is directly responsible for the irretrievable loss of my version of Star Trek and the completely original game Cheese Mites.)
Not quite twenty years on, I'm typing this on a system with a 2.6 gigahertz 32-bit processor that can execute as many as three instructions per cycle, some of which can perform multiple operations like doing 4 16-bit multiply-accumulates all at once. It has more level-one cache than my Colour Computer had total memory. Its front-side bus is eight times as wide and nearly a thousand times as fast. My display is running at 1792 by 1344 in glorious 24-bit colour. [That display sadly died not long after; it just got brighter and brighter until it burned out. Yes, a CRT.] And it has six hundred and fifty gigabytes of disk.* [Today you could buy that for less than $30.]
It cost a bit more, it's true. My 1984 Colour Computer cost me $149.95, and Kei, my 2003 Windows XP box, cost me around $2000. The best I can do today for $149.95 (ignoring for the moment two decades of inflation and the fact that this now represents a morning's earnings rather than a year's) is a Nintendo Gamecube. [Remember those?] The Gamecube only runs at 485MHz (achieving a measly 1125 MIPS); it only has 40MB of memory; it only has 1.5GB of storage. Its peak floating-point performance is a mere 10.5 GFLOPS, compared to the Colour Computer's... I don't know, exactly, since the CoCo had no floating-point hardware at all, and I doubt that the software emulation achieved so much as 10.5 kiloFLOPS.
So, depending on exactly what you wish to measure, 20 years of innovation has given us somewhere between a thousand and a million times better value for money.
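For a rough sanity check on that range, here is a small Python sketch comparing what an 18-month doubling period predicts for 1984 to 2003 with two ratios taken from the figures in this post. The 10.5 kiloFLOPS figure is the generous upper guess from the paragraph above, so the floating-point ratio is a floor, not a measurement.

```python
def doubling_factor(years, doubling_period_years=1.5):
    """Growth factor after `years` of doubling every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

predicted = doubling_factor(2003 - 1984)   # 1984 CoCo to 2003 desktop
clock_ratio = 2.6e9 / 895e3                # 2.6GHz versus 895kHz
flops_ratio = 10.5e9 / 10.5e3              # Gamecube GFLOPS versus a generous CoCo guess

print(f"18-month doubling predicts about {predicted:,.0f}x")
print(f"clock speed alone improved about {clock_ratio:,.0f}x")
print(f"floating point improved at least {flops_ratio:,.0f}x")
```

Clock speed alone lands in the low thousands, and the floating-point comparison reaches a million, which brackets the range quoted above.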
And here it is again: This has been going on for my entire life. Every year, tick tick tick, new and better and faster and cheaper. You buy the latest and greatest and it's obsolete before you get home from the mall. It's so much a part of our lives that it's a joke, a cliche.
[Classical MOSFET scaling ENDED at the 130nm node (and nobody noticed) - almost the exact same sentiment expressed by IBM.]
The death of Moore's Law has been predicted many times, not least by Moore himself, but when you get IBM's Chief Technology Officer saying "Scaling is already dead but nobody noticed it had stopped breathing and its lips had turned blue," you know something's up. Particularly when he's not making a prediction, but talking about what's happening right now. [Scaling may be dead, but Wired have kept that link alive for 7 years.]
And everything was planned so neatly too. 90 nanometres was to come on line late '03, ramping up this year; 65 nanometres was to be the big thing of '05, followed by 45 nanometres in '07. Now, beyond that, at 30 nanometres and 20 nanometres, things were less clear, and beyond 20 nanometres not clear at all, but at least the path was marked out from the old 130 nanometre stuff down to 45, giving us 9 times the transistors and 3 times the speed. Only someone forgot to check with the laws of physics.
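The "9 times the transistors and 3 times the speed" figures fall out of the node sizes, more or less. Here is a back-of-the-envelope Python sketch, assuming ideal scaling where transistor density goes with the square of the linear shrink and speed goes roughly linearly with it. Real processes never scale this cleanly, so this is only the arithmetic behind the rounded claim.

```python
nodes_nm = [130, 90, 65, 45]

def density_gain(old_nm, new_nm):
    """Ideal-scaling assumption: area per transistor shrinks with the square."""
    return (old_nm / new_nm) ** 2

def speed_gain(old_nm, new_nm):
    """Ideal-scaling assumption: switching speed improves linearly."""
    return old_nm / new_nm

# Each step on the roadmap roughly doubles the transistor count.
for old, new in zip(nodes_nm, nodes_nm[1:]):
    print(f"{old}nm -> {new}nm: {density_gain(old, new):.2f}x transistors")

print(f"130nm -> 45nm: {density_gain(130, 45):.1f}x transistors, "
      f"{speed_gain(130, 45):.1f}x speed")
```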
Wired: How long will Moore's Law hold?
It'll go for at least a few more generations of technology. Then, in about a decade, we're going to see a distinct slowing in the rate at which the doubling occurs. I haven't tried to estimate what the rate will be, but it might be half as fast - three years instead of eighteen months.
What will cause the slowdown?
We're running into a barrier that we've run up against several times before: the limits of optical lithography. We use light to print the patterns of circuits, and we're reaching a point where the wavelengths are getting into a range where you can't build lenses anymore. You have to switch to something like X rays.
So, what exactly is the problem? It's not, as Moore and others predicted, a question of actually building the circuits - that's still working fine. IBM, Intel, AMD and others have all produced working chips at 90 nanometres. The problem is leakage. Each of the millions of transistors in a chip is a tiny switch, turning on and off at incredible speeds. Each time you turn the transistor on, or off, you need to use a little bit of electricity to do so. That's okay, and it's expected, because you don't get anything for free. The problem is that the transistors are now so small, and the layers of insulation - the dielectric - so thin, that they leak. There's a partial short-circuit, and so instead of only using power when the switch switches, it's using power all the time.
So what? Electricity is cheap. Well, the so what is heat. Modern microprocessors use as much electricity as a light bulb, and that means they produce just as much heat. If they didn't have huge heat sinks and fans bolted onto them, they'd very quickly overheat and fail - a fact that some people have independently discovered.
Until now, each new generation of scaling, each new node, has brought smaller, faster, cheaper and cooler transistors. At 90 nanometres, transistors are smaller, cheaper, probably faster again - but they run hotter. And the competition in the processor market has already driven power consumption (and heat generation) about as high as it can go. So when the new generation was discovered to increase the heat rather than decrease it, the whole forty-year process of accelerating change ran head-first into a wall.
Back at the end of 2002, I made the following set of predictions for the coming year. I felt pretty comfortable in all of them, the first no less than any of the others:
My predictions for 2003:
1. Microprocessors will hit 4GHz by the end of the year. Marketers will try and largely fail to convince the public to buy them.
2. A major scientific breakthrough will lead to a new and deeper understanding of something.
3. A major political scandal will result in a huge media kerfuffle and only die down when someone resigns.
4. There will be a war.
5. Bad weather will affect the lives of millions of people.
6. There will not be any major, civilisation-destroying meteor impacts.
7. Astronomers will find new and interesting things in the sky.
8. Spam, pop-ups and viruses will continue to plague us. The Internet will fail to collapse under the strain. Pundits will predict that this will now happen in 2004.
9. A rocket will explode either on the launch pad or early in its flight, destroying its expensive payload - which will turn out to be uninsured.
10. Cod populations in European waters will continue to fall, and the European parliament will fail to act to prevent this.
11. A new species of mammal will be discovered.
12. A species of reptile or amphibian will be reported as extinct.
But not only did we not see 4GHz processors in 2003, it's doubtful that we'll see them in 2004 either. [Nope.] (I was wrong about number 3, too. No-one resigned, and the media moved onto the next scandal. Rinse, repeat.)
Now, assuming you're not a hard-core computer gamer, hanging out for the release of Doom 3 [Mass Effect 3] and Half-Life 2 [Half-Life 2 Episode 3], why should you care?
Well if you have broadband internet, or a mobile phone, or a DVD player, or a PDA [A what?], or a notebook computer, or a digital camera (or a digital video camera), or you use GPS on your camping trips, or you enjoy the low cost of long-distance phone calls these days, if you download anime or the latest episode of Angel [Doctor Who] off the net, if you take your iPod [iPad] with you everywhere you go, if your job or your hobby involves using e-mail or looking things up on the Web, you can thank Moore's Law for it.
Modern communications depend critically on advanced signal processing techniques, performed by specialised chips called Digital Signal Processors, or DSPs. These things are everywhere - every modem, every mobile or cordless phone, every digital camera, every TV or VCR or DVD player, every stereo, every disk drive. It's the relentless advance of Moore's Law that has made DSPs fast enough and cheap enough to do all this, and made them efficient enough to run on batteries so well that your mobile phone might last a week between charging. (My first mobile was lucky to make it through the day.) Disk drives demand high-speed DSPs to sort out the signals coming from the magnetic patterns on the disk and turn them back into the original data. DVD players need them to turn the tiny pits pressed into the aluminium surface into a picture. The entire global telephone network, mobile and fixed, depends on DSPs. And any advances in any of these areas will require more and faster and cheaper DSPs and - uh-oh.
And there's more: The advances in computers and communications over the past four decades have been the primary driver of the global economy. The economy has been growing all that time, even though we have made no fundamental breakthroughs in finding new resources or new materials. If you're better off than your parents, you can thank Moore's Law for a big chunk of that - if not the effort you put in, then the new opportunities it opened up.
And it just died.
I don't think the financial markets have a clue yet what's going on, but in any case it's going to be a soft landing. All of the processor manufacturers have been in a mad rush over the last decade to produce faster chips at the expense of pretty much anything else. The funny thing is that they've been pushing so hard, they've left a lot of things behind. Take a look at this chart:
You don't have to understand exactly what this means, but the first number relates to "integer" performance, which is important for things like word processing and web browsing and databases, and the second number relates to "floating-point" performance, which is important for games. (Well, and other things too.)
Integer  Floating-point  Processor
1076     763             Pentium M 1.6GHz
805      635             Pentium M 1.1GHz
237      148             C3 1.0GHz (C5XL)
398      239             Celeron 1.2GHz (FSB100)
543      481             Athlon XP Barton 1.1GHz (FSB100 DDR)
581      513             Athlon XP Thoroughbred-B 1.35GHz (FSB100 DDR)
1040     909             Athlon XP 3200+ (Barton 2.2GHz, FSB200 DDR)
1276     1382            Pentium 4 3.0E GHz Prescott (FSB800), numbers from spec.org
1329     1349            Pentium 4 3.2E GHz Prescott (FSB800)
560      585             Athlon 64 3200+ 0.8GHz 1MB L2
1257     1146            Athlon 64 3200+ 2GHz 1MB L2
The Pentium M is a modified version of the Pentium III, customised for notebook computers. Since notebook computers run off batteries, and batteries don't hold much power at all, the Pentium M has been tweaked to provide as much speed as possible while using as little power as possible. The Pentium 4, on the other hand, is designed for speed at the expense of everything else. And what we find is that the 3.2GHz Pentium 4, despite having twice the clock speed of the 1.6GHz Pentium M, is just 25% faster on integer (useful work) and 75% faster on floating point (games).
And - here's the tricky bit, and the cause of Intel's recent and dramatic change in direction - the Pentium 4 uses four times as much power as the Pentium M. So if, instead of putting one Pentium 4 onto a chip, you put four Pentium Ms, it would use the same amount of power and produce the same amount of heat, but it would run up to three times as fast... Overall.
Which is great and wonderful if you can use four processors at once. I can, quite happily, and more than that. A word processor can't, not easily, but then word processors already run pretty well. Games, and other graphics-intensive stuff like Photoshop or 3D animation software certainly can, though most games haven't been written to do so. Not yet.
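The comparison above can be checked against the chart with a few lines of Python. This sketch takes the two scores for each chip straight from the table, takes the "four times as much power" figure at face value, and assumes (optimistically) that a workload scales perfectly across four cores.

```python
# (integer score, floating-point score) from the chart above
pentium_m = (1076, 763)      # Pentium M 1.6GHz
pentium_4 = (1329, 1349)     # Pentium 4 3.2E GHz Prescott

POWER_RATIO = 4  # Pentium 4 power draw relative to Pentium M, per the post

int_advantage = pentium_4[0] / pentium_m[0] - 1
fp_advantage = pentium_4[1] / pentium_m[1] - 1
print(f"Pentium 4 is {int_advantage:.0%} faster on integer, "
      f"{fp_advantage:.0%} faster on floating point")

# Same power budget spent on four Pentium M cores, assuming perfect scaling.
quad_int = POWER_RATIO * pentium_m[0] / pentium_4[0]
quad_fp = POWER_RATIO * pentium_m[1] / pentium_4[1]
print(f"Four Pentium Ms: {quad_int:.1f}x integer, {quad_fp:.1f}x floating point")
```

The integer figure is where "up to three times as fast" comes from; floating point gains rather less, and real software rarely scales perfectly.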
Or so the situation was seven years ago. What's changed? Well, now I can have a game busy-wait on two cores at once.
The situation turned out not to be quite so dire as it appeared at the time, though a huge amount of engineering effort has gone into the advances we've seen in recent years. And yet, the server we recently deployed at my day job, while it has forty processors (yes, four-zero), is still based on the Pentium Pro (through at least six generations of intermediary designs) and only runs at 2GHz.
The bright spots have been not so much in the CPU cores themselves, as in the vector (a.k.a. SIMD) units, which have grown from 64 to 128 to 256 bits, and in video cards, which are just masses of vector processors all working together. Video cards have hit a limit too, though; they're choking on their own heat. A single high-end card can use more than 300W, about as much as a well-configured PC at the time of the original post.
And we still haven't broken 4GHz in a mainstream processor.
There are two particular beacons on the horizon at the moment. One comes from AMD, the Zambezi-Orochi-Bulldozer chip I mentioned in a recent post. If pre-launch data is correct, they expect to provide 8 cores running at 4.2GHz (and up to 4.7GHz when conditions are right) within a 125W power budget. That's a lot of processing power for a fairly low-end chip. It has some limitations; in particular, it only has one full vector unit per pair of cores, so for floating-point heavy applications like games and video editing, it will be no faster than Intel's four-core chips. For the stuff I do, though - web sites and databases - it will (again assuming the details are correct) slam Intel's chips into the ground.
The other ray of light comes from Intel, because, while we might loathe the behaviour of their marketing department, they are no slouches when it comes to engineering. Their new FinFET transistors, debuting on the upcoming 22nm node, allow their chips to cut overall power consumption in half. Which means, since everything computational nowadays is limited either directly by available power or indirectly by heat dissipation, that everything can get twice as fast. Not vertically, but at least horizontally.
So we're talking about mainstream desktop processors with sixteen cores, running at well over 4GHz, coming your way in the next year or two. It's not the 10GHz Pentium 4 that Intel promised us all those years ago, but it will serve. Before much time has passed we'll see games busy-waiting on eight cores, you mark my words.
* That's dedicated disk; we'll set aside the terabyte or so living in the file server. **
** Which died in the great server crash of... Around 2007, I think. Took me ages to recover all that anim... Data.
Thursday, July 14
According to this handy chart, AMD's new FX-8170P CPU (Order Orochi, Family Zambezi) will have 8 cores running at 4.2GHz base speed, 4.7GHz in turbo mode.
That looks like a worthwhile upgrade for my current 2.4GHz quad core. Well over three times the compute power. And because AMD has maintained a sensible continuity in their platform, I can build a system now with the latest AM3+ socket, drop my current AM3 CPU into it, swap in the octocore goodness when it lands, and use the spare CPU to upgrade my AM2 Linux box. With Intel you'd be faced with three different pin counts.
I really want to see the server versions of these chips now. We're building a cluster of AMD-based servers at my day job, and we're using the cheapest current CPUs with the plan to swap them out for the newer models when they arrive. I was expecting more cores but a slower clock speed, but based on what they've achieved on the desktop I could get more cores and a higher clock speed. That would be very nice.
Monday, July 11
Not official yet, but clearly on its way. Thanks for all your hard work, CentOS peeps.
Bimped: It's here!
That's one of the blockers for the new Minx platform rollout fixed. The others include a stable release of OpenVZ for RedHat/CentOS 6, and Intel's 710 series SSDs. The latter are expected this month.
Oh, and me getting time to do some work on it. That's much more likely to happen now than it was six weeks ago, since we have now filled all our situations vacant at my day job, and I'm hoping to see my hours drop from ~60 to ~35 a week.
Saturday, July 09
A while back, in between houses falling on me, I was working on a database written in Python, which I called Pita. I actually got it working, enough to start doing some performance tests...
At which point I shelved the project, because (a) I was absurdly busy what with the houses and all and (b) even though it had pluggable low-level storage engines, the overhead of the Python layer made it significantly slower than just using MySQL.
What Pita could do, which was nice, was (a) offer a choice of in-memory or on-disk tables using identical syntax and selectable semantics and (b) provide a log-structured database that did sequential writes for random updates. Cassandra also has this trick. The advantage here is that it (a) can cope with a huge volume of incoming data, and (b) doesn't fry consumer-grade SSDs the way MySQL would.
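For anyone wondering how random updates turn into sequential writes, here is a toy Python sketch of the log-structured idea. It is not Pita's actual code (or Cassandra's); the class name LogStore and its methods are invented for illustration, the index lives only in memory, and there is no compaction or crash recovery.

```python
import os
import tempfile

class LogStore:
    """Toy log-structured store: every update is appended, never rewritten."""

    def __init__(self, path):
        self.path = path
        self.index = {}                    # key -> (offset, length) of latest value
        self.log = open(path, "ab")        # append-only handle; creates the file
        self.end = os.path.getsize(path)   # where the next record will land

    def put(self, key, value):
        """Append the new value and point the index at it."""
        self.log.write(value)
        self.log.flush()
        self.index[key] = (self.end, len(value))
        self.end += len(value)

    def get(self, key):
        """Random read of the most recent value for `key`."""
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def close(self):
        self.log.close()

store = LogStore(os.path.join(tempfile.mkdtemp(), "demo.log"))
store.put("a", b"one")
store.put("b", b"two")
store.put("a", b"three")    # an "update" is just another append
print(store.get("a"), store.get("b"), store.end)
store.close()
```

The old value of "a" is still sitting in the file as garbage, which is why real log-structured stores need a compaction step sooner or later.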
Unfortunately, Cassandra is a bit of a cow. Undeniably useful, but indubitably bovine.
Redis with AOF can offer similar performance, but only so long as your data fits in memory, because it's simply snapshot+log persistence (like Pita) and single threaded (unlike Pita) so it can't cope with I/O delays. This makes Redis and its support for data structures beyond simple records (hashes, lists, sets, sorted sets) great for your hot data but no use for your long tail - if, say, you've been running a blogging service for 8 years.
What you could do in that situation is use Redis for your hot data (great performance, easy backups, easy replication) and stick your cold data in a key-value store.
Like Keyspace, except that's dead.
Or Cassandra, except that's a cow.
Or MySQL, except that defeats the purpose.
Or MongoDB, except that you'd like to keep your data.
Or Kyoto Tycoon, which has pluggable APIs (don't like REST - use RPC or memcached protocol) and pluggable storage engines... Like Google's LevelDB. Kyoto Tycoon running Kyoto Cabinet uses snapshot+log for backups, but the database itself is a conventional B+ tree, so it needs to do random writes. LevelDB, on the other hand, uses log-structured merge trees - sequential writes, even for the indexes.
So Redis and Kyoto Tycoon with LevelDB both provide:
- Key-value store
- Range lookups
- Sequential writes (SSD friendly)
- Snapshot+log backups (bulletproof)
- Instant replication (just turn it on, unlike MySQL replication, which is a pain)
- Lua scripting (not yet in mainstream Redis, but coming)
- Key expiry (for caching)
- Data structures
- Lists (which can be used to provide stacks, queues, and deques)
- Sorted sets
- Bytestrings (update-in-place binary data)
- Pub/Sub messaging
- Support for databases larger than memory
- Very fast data loads
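The hot/cold split suggested earlier in this post might look something like the following Python sketch. Plain dictionaries stand in for both tiers here: the OrderedDict plays the part of the in-memory store and the cold dict plays the part of the on-disk key-value store. The TieredStore class, the write-through policy, and the least-recently-used eviction are all my assumptions for illustration, not a description of any real Redis or Kyoto Tycoon API.

```python
from collections import OrderedDict

class TieredStore:
    """Small hot tier in front of a cold tier that holds everything."""

    def __init__(self, cold, hot_capacity):
        self.hot = OrderedDict()          # stand-in for the in-memory store
        self.cold = cold                  # stand-in for the on-disk store
        self.hot_capacity = hot_capacity

    def put(self, key, value):
        self.cold[key] = value            # write-through: cold tier is complete
        self._promote(key, value)

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as recently used
            return self.hot[key]
        value = self.cold[key]            # long-tail read hits the cold tier
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used key

blog = TieredStore(cold={}, hot_capacity=2)
for post in ("2003-post", "2007-post", "2011-post"):
    blog.put(post, f"body of {post}")
print(list(blog.hot))          # only the two newest posts stay hot
print(blog.get("2003-post"))   # old post comes back from the cold tier
```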
Saturday, July 02
I'll say this up front: I think AMD's new Fusion range of processors are some of the most important integrated circuits since Signetics' 555.
Why? Let's start at the low end and work our way up.
The C-50 model provides two dual-issue, out-of-order x64 cores (codenamed Bobcat) at 1GHz, an 80-shader GPU at 280MHz (44 gigaflops), 1MB of cache, and a 1066MHz 64-bit memory bus. That's enough hardware to make my SGI O2 look sad, and it has a total power consumption of 9 watts in a 40nm process. The C-60 refresh due this quarter enables a turbo mode that can increase CPU speed by 33% and GPU speed by 44% when that fits within the power and thermal envelope, still with the same 9 watts draw.
The E-350 has the same architecture, but bumps the CPU clock to 1.6GHz and the GPU to 500MHz (80 gigaflops). The power consumption goes up to 18 watts, but that's still pretty modest, less than a single-core 500MHz AMD K6-2, which lacked most of the features of these new chips and was obviously much, much slower. (But a solid little workhorse in its day.) An E-450 version is due out this quarter with a modest CPU speed bump and a 20% GPU and 25% memory speed increase.
They're small and cheap to produce, too - 75mm² on a 40nm process, which is in itself not leading-edge.
The second half of AMD's Fusion range for 2011 is the Llano family, the A-series. Where the C and E-series chips target netbooks, ultralight notebooks and embedded designs, the A-series are aimed at full-feature laptops and low-to-mid-range desktops.
These don't have a new CPU core; they're based on the K10.5 core, a derivative of the long-lived K7 Athlon. But they deliver the goods nonetheless.
The A8-3500M is a notebook chip: 4 cores running at 1.5GHz standard, and up to 2.5GHz in turbo mode: If you are only using one of the cores right now, it will instantly shut off the other three to save power and speed up the one that is actually in use. 4MB of cache, a GPU with 400 shaders at 444MHz (355 gigaflops) and a 128-bit 1333MHz memory bus. Maximum power consumption is 35 watts.
The A8-3800 is its desktop counterpart. The 4 cores run at 2.4GHz and up to 2.7GHz in turbo mode; the 400 shaders at a zippy 600MHz (480 gigaflops), the memory bus at up to 1866MHz. Total power draw is 65 watts.
That is, it's as fast as my current desktop CPU, uses 30% less power, and throws in half the performance of my 110 watt graphics card for free.*
Or to look at it another way, AMD's new budget desktop solution offers twice the graphics performance of an Xbox 360 or Playstation 3, while costing no more and using less power than their existing CPUs alone.
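The gigaflops figures quoted for these chips all follow from shader count and GPU clock. A quick Python check, assuming each shader retires one multiply-add (counted as two floating-point operations) per clock; that factor of two is my assumption, though it reproduces the quoted numbers to within rounding.

```python
FLOPS_PER_SHADER_PER_CLOCK = 2   # assumption: one multiply-add per clock

# (shader count, GPU clock in MHz) as quoted in this post
chips = {
    "C-50": (80, 280),
    "E-350": (80, 500),
    "A8-3500M": (400, 444),
    "A8-3800": (400, 600),
}

def peak_gigaflops(shaders, clock_mhz):
    """Theoretical peak, not what any real workload will see."""
    return shaders * clock_mhz * FLOPS_PER_SHADER_PER_CLOCK / 1000

for name, (shaders, clock_mhz) in chips.items():
    print(f"{name}: {peak_gigaflops(shaders, clock_mhz):.1f} gigaflops")
```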
Okay, so technically all very nice. Now, why do I think they're so important?
Well, consider the Amiga. Brilliant piece of work, but the fastest production model ever made was a 25MHz 68040. The slowest of the Fusion chips can emulate an entire Amiga without breaking a sweat. Want an Amiga? C-50, Linux, emulator. Job done.
Or the Be Box. Neat concept, neat OS, ran out of money and died, but not before BeOS was ported to x86. Want a Be Box? C-50.
Want a game machine that can knock over any of the current-generation consoles? A8-3500M or A8-3800. No chip design, no integration hassles, your job is done.
Want a solid little desktop for Windows or Linux? A8-3800, 16GB of cheap RAM, and you're set. Okay, you won't want to play Civ 5 on a 30 inch monitor with that, but at 1920x1080 it should actually work pretty well.
Intel's Sandy Bridge chips (their current low-end desktop CPUs) have better single-threaded CPU performance, but suffer from truly second-rate GPUs. With AMD's Fusion chips you don't have to compromise on graphics: Their new embedded GPUs are genuinely good.
The performance that any of these chips can deliver would make high-end workstation designers of a decade ago turn green, and they're just dirt cheap. We live in a world of riches unimagined.
* Radeon 4850. Still a solid card.
Friday, July 01
Or, Much Ado About Random Write Endurance
Intel's 320-series 300GB SSD has a quoted 4KB random write endurance - that is, the minimum total volume of data you can write to it in individual 4KB randomly located blocks before it begins to fail - of 30TB.
30TB may sound a lot to you. The primary MySQL server at my day job does 2.5TB of writes per day (and it's only one of several database servers). MySQL writes tend to be random-ish, so you might at first glance expect the abovementioned drive under those conditions to burn out in 12 days. For that reason (and the fact that the database is rather larger than 300GB), we don't use a 320-series SSD; we use a RAID-50 array of 20 enterprise drives each with about 60x the quoted write endurance. Based on the quoted numbers and measured load, we should be good for at least 10 years.
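The back-of-envelope arithmetic behind those two lifetimes can be sketched like this. The figures are the ones quoted above; the even spread of writes across the array and the two-physical-writes-per-logical-write parity cost are my own simplifying assumptions, not measurements.

```python
QUOTED_ENDURANCE_TB = 30.0   # Intel 320 300GB, 4KB random writes
DAILY_WRITES_TB = 2.5        # measured load on the primary MySQL server

# Single consumer drive, taking the quoted figure at face value.
naive_days = QUOTED_ENDURANCE_TB / DAILY_WRITES_TB

# Enterprise array: 20 drives at roughly 60x the quoted endurance each.
# ASSUMPTIONS: writes are spread evenly over all drives, and every
# logical write costs two physical writes (data plus parity).
ARRAY_DRIVES = 20
ENDURANCE_MULTIPLE = 60
PARITY_WRITE_FACTOR = 2

array_endurance_tb = ARRAY_DRIVES * ENDURANCE_MULTIPLE * QUOTED_ENDURANCE_TB
array_years = array_endurance_tb / (DAILY_WRITES_TB * PARITY_WRITE_FACTOR) / 365

print(f"single 320-series drive: {naive_days:.0f} days")
print(f"enterprise array:        {array_years:.1f} years")
```

Even with the parity penalty thrown in, the array comes out comfortably past the ten-year mark.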
The question is, though, what is the real-world longevity of SSDs under heavy random write conditions? I've been very conservative about SSD deployment - for mee.nu I've used the more expensive enterprise SLC drives as well (and RAID-5 at that) even though our write activity is a couple of orders of magnitude lower. The only MLC drives I've deployed in a production environment have been in applications where reads are random and writes are sequential - some of the Cassandra and Xapian databases at my day job fit this description.
However, this paper, presented at last year's Hot Storage conference, suggests that things might not be nearly so bad. The authors examine a model of flash cell burnout, and note that if cells are given time to rest between write/erase cycles, their endurance can be expected to increase significantly.
How significantly? Let's take our 300GB SSD and hit it with 2.5TB of data a day. Let's assume a worst-case scenario on two aspects - all of that is individual 4KB random writes, and there's no write-combining done by the OS or RAID controller. Let's assume a best-case scenario on the other aspects - write amplification is 1.0 (that is, no blocks need to be moved to allow for the updates) and wear-levelling is perfect across the drive (all blocks are updated evenly). (All of these assumptions are completely implausible, but the idea is that they'll kind of balance out until I can get more precise data.)
That means that every block on the drive is updated every three hours. A little less than three hours, but near enough. That paper suggests that with a 10,000 second - a little less than three hours - recovery period between write/erase cycles, write endurance of MLC cells can be expected to be 90 times the worst-case situation the manufacturers cite.
That is, rather than two weeks, the drive would last for three years. And then drop dead all at once given our rather unreasonable scenario.
Which is a completely different picture from what the manufacturer's worst-case numbers might suggest. And with a RAID controller with battery-backed write-back cache, the number of writes that actually hit the SSD can be significantly less.
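Here is the rest-period arithmetic spelled out, as a rough sketch under the same implausible assumptions as above (perfect wear-levelling, write amplification of 1.0, decimal units). The 90x figure is the paper's number as I've described it, not something this code derives.

```python
DRIVE_CAPACITY_TB = 0.3      # 300GB drive, decimal units
DAILY_WRITES_TB = 2.5
QUOTED_ENDURANCE_TB = 30.0
RECOVERY_GAIN = 90           # endurance multiple at ~10,000 s of rest

# With perfect wear-levelling, each block is rewritten once per full
# pass over the drive, so the rest time is a day divided by the
# number of full passes per day.
full_passes_per_day = DAILY_WRITES_TB / DRIVE_CAPACITY_TB
rest_seconds = 86400 / full_passes_per_day
rest_hours = rest_seconds / 3600

# Scale the naive lifetime by the recovery gain.
naive_days = QUOTED_ENDURANCE_TB / DAILY_WRITES_TB
rested_years = naive_days * RECOVERY_GAIN / 365

print(f"rest between rewrites: {rest_seconds:.0f} s ({rest_hours:.2f} hours)")
print(f"lifetime with recovery: {rested_years:.1f} years")
```

The rest period comes out slightly longer than 10,000 seconds, so using the paper's figure for that interval is, if anything, a touch conservative.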
The problem is, this is a simulation. It's a very careful simulation based on the known physical properties of the semiconductor materials used in flash fabrication, but it's still a simulation. I'm hoping I can get a couple of SSDs solely for the purpose of killing them, because I haven't seen anyone else publish good data on that.
The reason all this matters is that where a 300GB Intel MLC drive costs $600, 300GB of Intel SLC enterprise SSD storage comes to five drives totalling $4000. The point may become somewhat moot when Intel's 710 MLC-HET drives launch. The HET, I would guess, stands for something like high-endurance technology; these drives are based on cheaper MLC flash but optimised for reliability rather than capacity. They will likely (based on reports in the trade press) cost twice as much as the regular MLC drives, but offer 20 to 40 times the endurance - nearly as good as SLC. If the price and endurance turn out that way, then there will be 3x less reason to risk your data on a statistical model and a consumer drive.
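To put the prices side by side, here's a small sketch using the figures above. The MLC-HET price is only the trade-press guess of twice the regular MLC price, so treat that row as hypothetical.

```python
CAPACITY_GB = 300
MLC_PRICE = 600.0            # one 300GB consumer MLC drive
SLC_PRICE = 4000.0           # five SLC enterprise drives, 300GB total
HET_PRICE_MULTIPLE = 2       # ASSUMPTION: trade-press estimate, not a list price

het_price = MLC_PRICE * HET_PRICE_MULTIPLE

options = [
    ("MLC", MLC_PRICE),
    ("MLC-HET (est.)", het_price),
    ("SLC", SLC_PRICE),
]
for name, price in options:
    print(f"{name:15s} ${price:6.0f}  ${price / CAPACITY_GB:.2f}/GB")

# How much more SLC costs than each MLC option for the same capacity.
slc_vs_mlc = SLC_PRICE / MLC_PRICE
slc_vs_het = SLC_PRICE / het_price
print(f"SLC costs {slc_vs_mlc:.1f}x MLC and {slc_vs_het:.1f}x the estimated MLC-HET price")
```

If the estimate holds, the premium for SLC over the endurance-optimised option drops from nearly seven times to a bit over three.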
Another thing: Intel's 320 series (unlike the earlier M-series) implements internal full-chip parity in the spare area, so even if one of the flash chips dies completely, the drive will continue operation unaffected.