Sunday, February 07


Seeking A Database That Doesn't Suck


Quick recap of databases that suck - or at least, suck for my purposes - and some that I'm still investigating.

  • MySQL Lacks intersection and except, lacks array support, has only so-so full-text indexing, offers either concurrency or full-text indexes and GIS, but not both.
  • PostgreSQL Provides arrays, concurrency and full-text indexes and GIS, but a lot of this is lost in a twisty maze of plugins and thoroughly non-standard operators. And the full-text indexing sucks.
  • Ingres Ingres is free these days (if you don't need a support contract). It's a good, solid database, but doesn't actually offer anything I can't get from MySQL with InnoDB.
  • Firebird Doesn't seem to offer anything more than MySQL or PostgreSQL or Ingres. Which doesn't mean that it's bad, but doesn't really help me either.
  • SQL Server Needs Windows, which is worth an automatic 6 demerits, even though I can get enterprise-level Windows and SQL Server products for free. (My company is a Microsoft Bizspark member.) Full-text and GIS, intersect and except are all there, but still no arrays.
  • IBM DB2 Costs too much.
  • Oracle Costs way too much.
  • Progress / OpenEdge Solid database, lovely 4GL, but still, the last time I looked at it (2006) mired in 16-bitness (!) and the 4GL is too slow for anything complicated. Also expensive and has a screwed-up pricing model. Would use it if I could.

  • Redis Nice feature set, and looks very useful for small systems, but the current version is strictly memory-based. (It's persistent through snapshots and logging, but the whole database must fit in memory.) The developers are working on this, though. The API could do with a tidy-up too; it has different calls for the same operation on different data structures.
  • MongoDB Very nice feature set. It's a document database, but it stores the documents in a JSON-like structure (called BSON) and can nest documents arbitrarily and inspect the fields within a document and build indexes on them. But its memory handling is lousy; while it's not explicitly memory-based, I wouldn't want to run it on anything but a dedicated physical server with more memory than my total database size. I could just throw money at it and put another 24GB of RAM in the server (far cheaper than a commercial RDBMS license) which would last us for a while, but I have serious doubts about its robustness as well.
  • CouchDB Written in Erlang, which is always a warning sign. Erlang programmers seem to care about performance and reliability far more than they care about making a product that anyone would want to use. In this case, instead of MongoDB's elegant query-by-example (with extensions) I write map/reduce functions in JavaScript and send them to the server. In what universe is that an improvement on SQL? On the plus side, it apparently has replication. On the minus side, it's an Apache project, and I have yet to meet an Apache project that didn't suck in some major way.
  • HBase Looks good if you have billions of very regular rows (which I do at my day job, but not here). Nothing wrong with it, but not a good fit.
  • Project Voldemort Pure evil. No, wait. This one came out of LinkedIn. It's one of the recent flock of inherently scalable (automatic sharding and multi-master replication) key/value databases. In their own words, [it] is basically just a big, distributed, persistent, fault-tolerant hash table. That's a very useful thing, but I need defined ordering (multiple defined orderings for the same dataset, in fact) which a hash table can't give me.
  • Cassandra This is Facebook's distributed hash table thingy (it's like the old days when every server company designed their own CPU). May have some vague concept of ordering, so I'll take a closer look.
  • Jackrabbit It's a Java/XML datastore from the Apache Foundation. Uh-uh. I've used ActiveMQ guys. You can't fool me twice. I'd sooner chew rusty nails.
  • Riak Bleah. Another key/value/map/reduce thing. In their own words:
    A "map phase" is essentially just a function ("F") and an argument ("A") that is defined as part of a series of phases making up a given map/reduce query. The phase will receive a stream of inputs ("I"), each of which consists of a key identifying a Riak object and an optional additional data element to accompany that object. As each input is received by the phase, a node that already contains the document ("D") corresponding to "I" will run F(D,A) and stream along the results to the next phase. The point here is that your function can be executed over many data items, but instead of collecting all of the data items in one place it will execute wherever the data is already placed.
    Clear? Right. Not interested at all.
  • LightCloud A distributed key-value store from Plurk. On the plus side, it's written in Python, supports Tokyo Tyrant and/or Redis for a back end, and "plurk" is fun to say. On the downside, it seems to be just a key/value database and not all that fast; it doesn't seem to expose the more interesting features of Tokyo Cabinet or Redis. It does at least have some update-in-place operations.
  • GT.M  GT.M is a crocodile.  That's not an aspersion, exactly.  Crocodiles were contemporaries of the dinosaurs, but when the dinosaurs went extinct, the crocodiles survived, and they are still around and occasionally snacking on jumped-up bipeds today.  It's a hierachical key-value store with a variety of access mechanisms.  It's unquestionably powerful, but it looks clunky; the MUMPS code reminds me of the systems I was employed to replace as little boy programmer in the 80's, and the Python interface doesn't actually look like Python, but more like some odd offspring of Cobol and Pascal.
  • Neo4j  Neo4j is a graph database, which is not something I've worked with before.  Graphs are a very general data structure for mapping relationships; while relational databases model parent-child relationships well, graphs are the natural model for networks of friends (for example) where you can end up back where you started by any number of differing paths.  The shortcoming of graphs is that they do not have a defined order, which is something I need a lot of around here.
Libraries and Other
  • Berkeley DB An oldie but a goodie. An embedded, transactional database. You can shove pretty much anything into it; it doesn't care. No query language, but does have indexes. One problem is this clause from the license:
    Redistributions in any form must be accompanied by information on how to obtain complete source code for the DB software and any accompanying software that uses the DB software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions.
    Any code that uses the DB software? I assume they mean direct code embedding/linking, but that's pretty broad. And it's really just a library, albeit a good one; it could serve as the basis for a database server, but it isn't that by itself.
  • Metakit Metakit is a column-oriented database library, with a very nice, clean Python interface. For example, to display all posts by user 'Pixy Misa', you could simply write:
    for post in = 'Pixy Misa'):
      print post.title,
    The problem is, it doesn't scale. I tried using it for the first pass at Minx, about four years ago, and it broke long before it reached our current database size. Like MongoDB, nice semantics, not so great on the implementation.
  • Tokyo Cabinet / Tokyo Tyrant / Tokyo Dystopia, VertexDB Tokyo Cabinet is a database library similar to Berkeley DB, but licensed under the no-worries LGPL. Tyrant is a "lightweight" database server built on Cabinet, Dystopia a full-text search engine built on Cabinet, and VertexDB a graph database built on Cabinet. I haven't explored these in depth yet because the standard Tokyo Cabinet distribution doesn't include Python libraries (Perl, Ruby, Java and Lua, but no Python?), but there are third-party libraries available.
  • Xapian and Omega  Xapian is a full-text search library, and Omega a search engine built on Xapian.  In fact, Xapian is more than that; it can do range searches on strings, numbers, and dates as well, and can store arbitrary documents.  It's quite good for searches, but not really suited to general database work.

Posted by: Pixy Misa at 01:17 AM | Comments (67) | Add Comment | Trackbacks (Suck)
Post contains 1381 words, total size 10 kb.

1   In my project at work, I've basically resigned myself to using one from column A (SQL) and one from column B (NoSQL), using the strengths of each. I'm bootstrapping with just SQL, since emulating a NoSQL in SQL is pretty trivial in most ways (and is definitely trivial for the purpose I forsee needing it), but I expect to have to switch out that part with a NoSQL solution sooner or later and am making preparations for that.

One additional comment, per your parenthetical on Project Voldemort; the generic NoSQL answer to "order" is that you have a key that you store the order in: "blog_posts_(blog_id)_order_chrono_asc" => [1, 2, 4, 8, 3, ...]. Every time you add a new post, you have to update all the relevant index keys. Yes, this sucks and is prone to bugs... there's a reason I'm going with the hybrid approach. smile

Nice summary, though; this is the most complete listing of these technologies I've seen.

Posted by: Jeremy Bowers at Sunday, February 07 2010 04:31 AM (yu3og)

2 Progress has been 32-bit for years, altho there are still some screwy things about it. I use it for my day job. It works well enough, screwy pricing model aside, that a LOT of Fortune 500 companies have applications written in it. Casinos tend to use it, as do companies that write anything from ERP/MRP software to HR/Payroll software. If you've ever heard of DTS (the Defense Travel System) that's written in Progress too. Further you can always write compute-intensive stuff in other languages; the interop features are pretty good.

Posted by: RickC at Sunday, February 07 2010 04:56 AM (8GbPX)

3 Even though Progress has been theoretically 32-bit (or 64-bit) since forever, there's still a bunch of stuff that's 16-bit limited that's never been fixed (or hadn't as of version 9) - record sizes, string sizes, code segments and so on.

Glad to see another Progress programmer - I worked with it from 1988 to 2006, and I still haven't found anything better for ad-hoc data manipulation.

Posted by: Pixy Misa at Sunday, February 07 2010 05:34 AM (PiXy!)

4 Yeah, I followed your link to MongoDB and my brain aborted when I hit the remarks about endian issues. I've dipped my toe just a bit into Tokyo Cabinet, but just haven't had the time to apply it to some of our work projects; I managed to avoid some development work by molesting a Perl/MySQL script into running 10 times faster, which let me add the new features without a rewrite.


Posted by: J Greely at Sunday, February 07 2010 11:52 AM (Pue/B)

5 Why do you need 1 database that does everything? You seemed to like Postgres except that it doesn't do full text. Solr is super awesome at full text, why not use Postgres + Solr?

Posted by: BJ Clark at Wednesday, February 10 2010 05:57 AM (1SMNd)

6 You mean, SQL, baked beans, Java, and SQL?


Posted by: Pixy Misa at Wednesday, February 10 2010 06:41 AM (PiXy!)

7 Aha, you're the one that wrote the if-only-it-was-that-easy article.  I read that while I was searching for a database that doesn't suck.

Posted by: Pixy Misa at Wednesday, February 10 2010 07:06 AM (PiXy!)

8 Just found a bug in Minx too - comment previews don't get properly HTML/BBCode sanitised.

Posted by: Pixy Misa at Wednesday, February 10 2010 07:07 AM (PiXy!)

9 Great roundup but it left me wondering: What *are* your purposes?

Posted by: Guillaume Theoret at Wednesday, February 10 2010 09:47 AM (F3Kuu)

10 (Waves hands.)  This.

Minx is a blog/forum/wiki/social networking system and an evolving application platform.  I need a database suited to that.

So primary use cases are -

- Document store, where either the documents don't change much or versioning is actively desirable.
- Fixed-length randomly accessed data, like statistics.
- Variable-length randomly accessed data that fits nicely in memory (user tables, session data, etc).

What I do need -

- Fast operation on complex sorts and selects.

What I don't need -

- Transactions.

I guess Pita (see more recent posts) is a database for mid-scale social networks.  My goal is for it to be suitable for systems from hundreds of thousands of documents to hundreds of millions (I currently have around three million posts and comments).  I don't plan to even consider the problem of billions of documents at this point (that's what I do at my day job).  I do plan to keep it small and simple enough that anyone starting a social web project would think of using it.

Posted by: Pixy Misa at Wednesday, February 10 2010 09:55 AM (PiXy!)

11 <quote>
And the full-text indexing sucks.

What don't you like about PostgreSQL's full text search? It works for me; curious why you don't like it..

Posted by: Eugene Koontz at Wednesday, February 10 2010 02:02 PM (0ivZI)

12 In MySQL all you do is say "put a full text index on these fields", and lo, it is done.  In PostgreSQL you have to prepare a separate field specifically for the data to be indexed.  So while full-text-indexing a blog in MySQL roughly doubles your database size, it's more like tripled in PostgreSQL.  It's also a pain.

Posted by: Pixy Misa at Wednesday, February 10 2010 02:21 PM (PiXy!)

13 Have you checked Scalaris[1]? Not that I think it will get any good credits from you ;-) It's an Erlang+RAM+NoOdering ShowStopper for you, but I would rather like to know how you would argue; and also to make your list look more complete.

Posted by: Djui at Wednesday, February 10 2010 05:46 PM (NyKge)

14 I forgot about Scalaris.  I'd seen it mentioned, but didn't really know anything about it.

I took a quick look - given all its constraints, I'm not sure who it is good for!

Posted by: Pixy Misa at Wednesday, February 10 2010 05:53 PM (PiXy!)

15 Hi there, have you taken a look at Depending on what you are trying to do, a graph is VERY flexible as a data structure, and you get fast traversals, a REST/JSON API and a very capable query language that will soon even work for Document DBs like CouchDB and MongoDB,

Posted by: Peter Neubauer at Wednesday, February 10 2010 06:24 PM (QZXb3)

16 I did take a quick look at Neo4j.  I'll take a close look and do a little writeup of it as well.  (From my perspective, of course.)

So, who's linked to me?

Posted by: Pixy Misa at Wednesday, February 10 2010 06:27 PM (PiXy!)

17 Everyone who writes these kind of NoSQL articles always forgets GT.M - I've no idea why. See the FOSDEM talks at and

Posted by: Rob at Wednesday, February 10 2010 09:34 PM (JU/Kj)

18 I looked at GT.M, but the site didn't do a good job of explaining what it was or why I should use it.  I'll take a look at those talks, thanks.

Posted by: Pixy Misa at Wednesday, February 10 2010 09:46 PM (PiXy!)

19 Oh, yes, that's what I didn't like about GT.M - MUMPS.


I try to avoid programming languages that are as old as me.

Except Algol.  Algol rules.

Posted by: Pixy Misa at Thursday, February 11 2010 12:41 AM (PiXy!)

20 Just remove fulltext-search from your criteria list and use Lucene for that?! Fulltext means more than just being able to index ASCII text, but PDF, DOC,...
Are you now able to choose a database?

Posted by: Mike at Thursday, February 11 2010 12:41 AM (SK6Rk)

21 If, for whatever your reasons, you don't want to use the Mumps language, you can use still use GT.M as a back-end NoSQL database to pretty much anything. See ie the database and language are separate issues, so don't throw the baby out with your bathwater.

Posted by: Rob at Thursday, February 11 2010 12:54 AM (JU/Kj)

22 Mike - I'm not sure what Lucene does with PDF, DOC and such beyond indexing the text, but even so you're right that there's more to it - stemming and weighting and so on.  And you really need a query language for full text, though it doesn't need to be Turing-complete.  My choice there would be Xapian - I've indexed a few TB of documents with it at my day job, and it has a neat Python interface.

And no, that doesn't make my problem easier.  It makes it worse.

Posted by: Pixy Misa at Thursday, February 11 2010 01:02 AM (PiXy!)

23 Rob - GT.M does look like a powerful back-end, that's for sure.  But I do want a "Pythonic" interface (though I don't want an object database).  And that ain't it. sad

Posted by: Pixy Misa at Thursday, February 11 2010 01:09 AM (PiXy!)

24 If you'd care to contact me by email I'd be interested to discuss if/how a more agreeable interface could be developed for GT.M. It really is a powerful and immensely flexible and adaptable back-end db so it would be good to figure this out.

Posted by: Rob at Thursday, February 11 2010 01:19 AM (JU/Kj)

25 Over the next couple of days I'm going to be writing some detailed posts on how I'd do a Python interface to a document-oriented database.  The purpose is to get my thoughts in order, whether I write my own database or a wrapper for an existing one, but if it offers value to anyone else then that's just another bonus.

(Mind you, there are those who would also accuse me of writing un-Pythonic code, but I think a two-line for loop is more readable than something like
print sum(map(float, open('test.numbers').readlines()))
Human beings read sequentially, not from the middle of the line outwards in both directions.  Well, except for German, which is a stack-based language.)

There are two particular interfaces I plan to steal from - MongoDB's PyMongo and MongoEngine and Metakit's Mk4py - plus the TreeDict data structure.

Posted by: Pixy Misa at Thursday, February 11 2010 01:44 AM (PiXy!)

26 Sounds like a great plan - I shall be very interested in what you come up with and see how I can implement it.

Posted by: Rob at Thursday, February 11 2010 02:14 AM (JU/Kj)

27 Thanks for the note about the BerkelyDB license. I followed up with Oracle just to verify you are correct and you are.

Posted by: Dave at Thursday, February 11 2010 06:17 AM (SCYxm)

28 In MySQL all you do is say "put a full text index on these fields", and lo, it is done.  In PostgreSQL you have to prepare a separate field specifically for the data to be indexed.  So while full-text-indexing a blog in MySQL roughly doubles your database size, it's more like tripled in PostgreSQL.  It's also a pain.

I agree it is hard to set up the first time, but once you understand it you can script it - you could even use Python if you like:

As far as the size increase, I guess my databases are too small for that to be a factor for me, and the growth is linear with the number of rows..

Posted by: Eugene Koontz at Thursday, February 11 2010 09:12 AM (0ivZI)

29 Pixy Misa, why is age alone a negative?  MUMPS, SQL, UNIX and C all had origins circa 1970.  All have evolved since.  (I too like Algol - I wrote many programs in Algol W, but never wrapped my head around Algol 68.)

One important thing to realize about GT.M is that while the key-value database and the language are well integrated, either is usable without the other.  So, you can easily define and create your own preferred Python or other API to connect to the database.  The database engine itself is extremely scalable.  There are production sites with individual database files in the 100s of GB and aggregate (logical) databases in the TB.  The largest production site I know of serves approximately 10,000 concurrent users during peak usage.  Yet, the footprint of the database engine itself on disk is on the order of 20MB, so it also fits nicely into virtual machines and software appliances.  GT.M also has excellent functionality for building applications that must remain available 24x365 in the face of not only unplanned events (like crashes) but also planned events (such as system and application upgrades).  And yes, you can create an API that accesses all this from Python or anything else that uses a C compatible calling interface.

By the way, for an example of MUMPS code see (the header has a pointer to the problem that the program solves).  Notice the support for multiprocess synchronization, international characters, and of course key-value access to the database that is as simple as accessing an array.

-- Bhaskar
(full disclosure - I manage GT.M)

Posted by: K.S. Bhaskar at Thursday, February 11 2010 09:59 AM (jcSjr)

30 It's not age per se.  People in 2010 are perfectly capable of designing horrible programming languages, and indeed, new and horrible languages are implemented every day.

But back then, horrible was the rule, because (a) language designers didn't have a lot of experience and (b) writing a compiler at all was on the edge of what was even possible.

Hence Cobol, early Fortran, Lisp, Snobol, and so on.  All of them are thoroughly horrible languages.  But all of them are a hell of a lot better than assembler.  Likewise C, which came somewhat later and is concomittantly somewhat cleaner; the description of it as "a portable PDP-11 assembler" is still apt.

None of which excuses modern efforts like Go, which has an inferior syntax and largely inferior semantics to decades-old efforts like Algol.  (I think that the entire family of curly-bracket languages should cease to exist, with the one exception of C, which at least had an excuse....  But that's a whole separate rant.)

Now, to the specific issue of MUMPS: When I look at that code, the first thing that strikes me is that I have no visual indication of program flow.  I can glance at a short C, Pascal, Java, Python or Ruby program and get an immediate idea of where it comes in, where it goes, and what comes out of this.  Not so with MUMPS.

The second thing I notice is that there's punctuation where I'm not expecting punctuation.  Most programming languages fail at this; certainly C does, Perl and PHP with their $, Python and Ruby with their respective @.  MUMPS isn't as bad as line-noise Perl, but it is opaque to even an expert programmer who is new to the language.  That's typical of older languages too.

Posted by: Pixy Misa at Thursday, February 11 2010 01:17 PM (PiXy!)

31 ..which simply goes to prove that when it comes to computer languages, one person's meat is another person's poison. To get back to the important point: the great thing about GT.M (unlike most other earlier Mumps database implementations) is that you don't have to use the Mumps language to access the database, and it's the database that is the cool and powerful part (and it's arguments about the language that have traditionally prevented more people discovering the powerfully simple and unique features of this database). I'd love more people to discover the kinds of things I can do with ease and amazing speed with a GT.M database, and if we can apply Pixy's work on a more "Pythonic" syntax, then I think we may have a chance of getting many more people to become familiar with this important member of the NoSQL technologies.

Posted by: Rob at Thursday, February 11 2010 06:42 PM (JU/Kj)

32 The human mind has a need to manage information.  Just give a child a box of trinkets and watch them.  They will organize, and reorganize the objects based on colour, size or any other characteristic that they can identify.   Each time there will be resultant groupings that accomplish the goal of classification to a degree, yet none exist that work universally for all purposes.  Database structures are attempts to impose structure for a purpose.  Spreadsheets have existed since Phonecian traders needed to keep records of their shipping.  When the first spreadsheet was computerized the very idea was rejected outright.  Because it didn't make sense to the world at large and it took years to catch on, and more years to settle into a useful tool.  Database science has been around since the 1930's, a much shorter time frame, and a much more difficult problem scope.  Computer science embraced relational databases in the 1970's because it was the most applicable tool for the business related problems that computer science was trying to solve.  MUMPS was written specifically to solve medical database problems.  The language is different because the database is different.  The language operators represent specific actions that provide a mechanism for the database to be useful.  It is one of the very few software tools that has an integrated language and database.  Rather than being a crocodile it was a glimpse of a future where the language and the database can be integrated into a cohesive whole.

Posted by: stuffduff at Friday, February 12 2010 05:15 AM (+7HGt)

33 Well, no.

As I noted in my recent post regarding Progress, there's a right way to do this, and it's been done, and it's not the MUMPS way.  MUMPS is still useful, but there are still crocodiles.

The fact that most languages are actually worse at this sort of data handline than MUMPS doesn't mean that there aren't languages that are better.  And that's purely on semantics; syntax-wise, MUMPS is just plain horrible.

Posted by: Pixy Misa at Friday, February 12 2010 05:35 AM (PiXy!)

34 That's this post, Syntax Matters.  The point I'm making regarding MUMPS is similar to the problem of Lisp and Smalltalk.  Semantically powerful.  Syntactically horrible.  Essentially gone from the marketplace.

Syntax matters.  If programmers back away when they see the sample code, your language is not going to be a success, no matter what semantic advantages it brings to the table.  They might be wrong, in a sense, to reject your language for its surface syntax, but it's far easier to evaluate syntax than semantics.  And I'd argue that bad syntax has a permanent cost, particularly in maintenance.

Posted by: Pixy Misa at Friday, February 12 2010 05:40 AM (PiXy!)

35 I think I am the guilty party for morphing this discussion from one about databases to one about languages.  While a language discussion can be objective in parts, it is also inevitably subjective as well.

A major point in my message is that the language and API are very separable from the database / datastore with GT.M.  One of the lessons from Rob is that with m_python, M/DB and M/DB:X (discussed at his site), he is able to rapidly create a variety of APIs/intefaces into the same key-value database.

The content of databases lives even as languages evolve and the ability to access the same long-lived data easily from multiple APIs and interfaces is essential.

Posted by: K.S. Bhaskar at Friday, February 12 2010 06:30 AM (jcSjr)

36 Quite so Bhaskar. Take M/DB for example. Internally its logic is written using the Mumps language and it uses the GT.M database to store and index the data. However, no user of M/DB will ever need to be aware of that fact because, to them, it looks and behaves like SimpleDB and can be accessed by your pet client using whatever your favoured language is. Rather than crocodile, chameleon seems more apt to me. It would be quite feasible and pretty straightforward to emulate the behaviour of pretty much any other database, both NoSQL and SQL, if you so desired using GT.M and make it available and accessible via whatever people's current pet (and in vogue) languages and syntax. The language used internally for development then doesn't matter to anyone else but the guy(s) who developed it.

Posted by: Rob at Friday, February 12 2010 08:06 PM (JU/Kj)

37 True to a degree.  But then you get into the area of leaky abstractions - and all abstractions leak.

Sorry, I've been busy hating XML parsers today and haven't had time to hate databases. wink

Posted by: Pixy Misa at Friday, February 12 2010 09:01 PM (PiXy!)

38 Actually you miss my real point. Take CouchDB and SimpleDB - written in Erlang. Does anyone but the developers of those databases really care? You don't need to understand Erlang to use them. Now let's say I want to develop a brand new NoSQL database, newDB, that has all the cool features I believe people will need and is accessible in any language that enthusiasts want to use for it. Why would I go to the bother of building an entire database from scratch to implement it? All that architectural and system management stuff that needs to be developed under the water line (guess what the CouchDB folks are spending most of their time working on these days?) - sheesh! Why not develop newDB with something tried and tested as the underlying engine that has all that stuff already in place? Call me old-fashioned, but that seems a lot more sensible use of people's time than building a complete new mousetrap from the ground up. The emulations I've done, leaky or not) serve to provide proof that GT.M is a great "raw pallette" on which to build other database functionality and projections. So if anyone out there is considering building newDB, check GT.M out and save yourself a ton of time to market.

Posted by: Rob at Friday, February 12 2010 09:49 PM (JU/Kj)

Does anyone but the developers of those databases really care?
Yes!  Because the abstractions of the language and the mindset of the developers leak out into the public API and from there into the applications built on it.

This is one of the great invariants of computer science - all abstractions leak.

Of course this does not make CouchDB or SimpleDB or other Erlang applications useless, or GT.M either.  But it actually matters, and it really does affect the adoption of these products.

You can build a wonderful system in Smalltalk, and its users don't need to know that it's written in Smalltalk (except of course that it doesn't look quite like any of the other applications they run), and everyone will be happy (mostly) - until something goes wrong (and something always goes wrong) and you have to find a Smalltalk developer to fix it.  There's a lot of different things going on here - simple practicality, network effects and so on - but it's very real.

Posted by: Pixy Misa at Friday, February 12 2010 10:42 PM (PiXy!)

40 Can GT.M be deployed on more than one, replicated node?

And, don't forget Keyspace.

Posted by: W-Mark Kubacki at Saturday, February 13 2010 05:31 AM (mjrUB)

41 W-Mark Kubacki asks whether GT.M can be deployed on more than one, replicated node.

With GT.M replication, all business logic (updates) are processed on one node.  From that one node, updates are streamed in real time to as many as sixteen replicating nodes, each of which can in turn stream to 16 more (for a total of 256 at the third level) and so on without limit.  The replicating nodes can be used for queries but not updates.  In the event the original node goes down for an unplanned event, or a planned event, any downstream node can become the new originating node (although network considerations probably suggest making one of the 16 second level nodes the new originating node).  When the original node is returned to service, it becomes a replicating node.

That's a start but there's actually a lot more to it (for example, the nodes have eventual Consistency not just eventual consistency, the difference being that the design provides for all nodes ultimately having the same path through the database state space, not just having the same state).

Posted by: K.S. Bhaskar at Saturday, February 13 2010 07:53 AM (bLn04)

42 Yes, that's a valuable advantage of master-slave replication systems - you never see local inconsistency.  You can be out-of-date, but the data you view from a given replication node will be consistent in itself.

I'm aiming (if I go ahead with this project) to offer a choice, either eventual consistency (fine for most web app requirements) or state space consistency.  So the comments on a blog can be eventually consistent, but the payment history for your accounts database needs to do better than that!

Posted by: Pixy Misa at Saturday, February 13 2010 12:40 PM (PiXy!)

43 Hi Pixy,

you may want to check out Terrastore as well:
It's a (brand new) distributed document store with per-record strong consistency.

I'd be curious to know where it sucks: I'll take your feedback as on opportunity to improve it wink


Sergio B.

Posted by: Sergio Bossa at Saturday, February 13 2010 06:58 PM (ZVAu2)

44 MonetDB.  As always, suitability does depend on use case.

Posted by: hedgehog at Saturday, February 13 2010 07:37 PM (3JNr9)

45 Sergio - I was looking at Terrastore today.  I don't know that there's anything wrong with it, but I don't think it fits my requirements (strong ordering in particular).

hedgehog - I completly forgot MonetDB, and I have looked at it in the past.  Thanks for reminding me!  I'll add it to my list.

Posted by: Pixy Misa at Saturday, February 13 2010 10:04 PM (PiXy!)

46 Ah, now I remember the problem with MonetDB: The documentation.  Makes me want to shoot myself.  It manages to tell you everything except what you actually need to know.

Posted by: Pixy Misa at Saturday, February 13 2010 10:21 PM (PiXy!)

47 Hi Pixy,

glad to know you're looking into Terrastore.

What do you mean, more specifically, with "strong ordering"?
You can perform dynamically ordered range queries (on keys) with optional predicates (on values): does it fit your needs?

Posted by: Sergio Bossa at Saturday, February 13 2010 11:32 PM (ZVAu2)

48 What I need to do is this sort of thing:

- Find the last 20 comments from a given user on a given blog
- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia

And I need to do it across millions of documents in under a millisecond.  Just about every page produced by the system is performing queries like this, and with the new features coming up it's only going to increase.

So one of my main concerns is having a flexible selection of indexing methods (B-tree, R-tree, Quadtree, Trie etc).  And I'm a lot less worried about insert time than the ability to zoom in on specific subsets of my data.  With the MySQL, this sort of query is the second most time-consuming operation in the entire system (number one is HTML sanitisation on comments; on-demand image processing is even more CPU intensive, but happens far less often).

There is another approach, and it's something I've implemented at my day job, where the documents number in the billions and most other approaches simply don't work.  And that's to pre-define the complex subset operations and prebuild them.  That's certainly something that can be done with a key/value store, and it's the only way to go for really high volumes.  But I have a dynamic template language (and in the upcoming version, a matching API and scripting language), and I don't want to put arbitrary restrictions on users if I can avoid it.  Optimising the common case, though - that I can certainly spend some time on.

Posted by: Pixy Misa at Sunday, February 14 2010 01:44 AM (PiXy!)

49 Pixy said:

"... that's a valuable advantage of master-slave replication systems - you never see local inconsistency.  You can be out-of-date, but the data you view from a given replication node will be consistent in itself."

Actually, it's even a little stronger than that.  If you use transaction processing on the originating instance, and you transfer a micropayment from an account to view an article under an ACID transaction, and this is replicated, then no process on any instance will ever see an intermediate database state with the account debited but the article not paid for.  So, the replication preserves the ACIDity.

Posted by: K.S. Bhaskar at Sunday, February 14 2010 02:27 AM (eyb9K)

50 Yep, good point.

Posted by: Pixy Misa at Sunday, February 14 2010 04:02 AM (PiXy!)

51 Pixy said:

"- Find the last 20 comments from a given user on a given blog
- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia"

A quick comment - when storing, use the negative of time as the time index (array subscript, key - whatever you want to call it).  So, when inserting, you're always inserting at the front, since a later post has a smaller index than an earlier post.  It may take a few millibleems longer to insert, but if most of what you are doing is reading, the extra effort taken to insert is more than offset by faster reads.  Many (maybe even most?) database engines move faster front to back than back to front.

Posted by: K.S. Bhaskar at Sunday, February 14 2010 09:28 AM (eyb9K)

52 Another one worth mentioning is SQLite. Embedded and pretty basic, but damn useful.

Posted by: Stefano Rivera at Monday, February 15 2010 04:34 PM (rwGpJ)

53 Definitely worth adding to the list.  If you need an embedded SQL database with a minimum of fuss, it's the way to go.

Of course, what I need is neither SQL nor embedded, so... wink

Posted by: Pixy Misa at Monday, February 15 2010 04:45 PM (PiXy!)

54 I'm a big fan of SQLite, but I'd never shove a terabyte of data into it. A gig or two, maybe, for something that's mostly read-only. I think the coolest thing you can do with it is create in-memory databases, which lets you replace hard-coded text-processing logic with a few simple queries.


Posted by: J Greely at Tuesday, February 16 2010 03:36 AM (2XtN5)

55 Ask a proper question and answer will come to you. I agree with some of the posters, fulltext search better be done externally, suddenly your options open up. I heard good things on mongodb. Only caveat that it eats all ram you got, not a problem if you have a separate database server.
You are running a database on separate server, right?

Best of luck to you with finding the right question(s) to ask.

Posted by: Pavel Zaitsev at Wednesday, February 17 2010 03:57 AM (MciT2)

56 "- Find the last 20 comments from a given user on a given blog
- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia"

I think there is a perfect fit for such kind of queries:
Full text search engine with QL support and MySQL integration.
and it's fact, really fast.

Posted by: Vlad at Wednesday, February 17 2010 07:52 AM (czPrG)

57 Vlad - I'm not sure how Sphinx help me there.  It's a full text search engine.  It's not ordered at all.  Sure, you can sort the results, but then I'm selecting irrelevant results, sorting them, taking the last 20 or whatever, and throwing away all the rest of the work.

What I need is an index that spans levels of the document hierachy.  MongoDB can do this, but MongoDB has other problems.  As Pavel said, it eats all the ram you've got.

Posted by: Pixy Misa at Wednesday, February 17 2010 10:11 AM (PiXy!)

58 Pavel - well, I could ask questions like "How do I run MySQL with instant schema updates, online index rebuilds, and updateable views with real cross-table indexes?"  But I already know that I can't do any of that, and I need all of that.  So I need a different database.

I hate SQL anyway.

Posted by: Pixy Misa at Wednesday, February 17 2010 10:15 AM (PiXy!)

59 Oh, and my database is on a VM.  (OpenVZ.)  That simply doesn't work with MongoDB because of OpenVZ's out-of-memory semantics.  MongoDB on OpenVZ will reliably run out of memory and crash.

Posted by: Pixy Misa at Wednesday, February 17 2010 10:17 AM (PiXy!)

60 Did you look at the Solid DB project?
I also suspect you would have the same complaints about Informix and Informix XPS that you have about Oracle and DB2?

Posted by: Noir at Sunday, February 21 2010 05:06 AM (BPBpw)

61 I'm sure Solid DB is a fine product, but a two-core license costs more than all my other operational expenses for two years.  Informix pricing is nearly as bad.

Posted by: Pixy Misa at Sunday, February 21 2010 12:46 PM (PiXy!)

62 Just a follow-up on GT.M. There's now a low-level TCP-based wire protocol for it at on top of which it should be possible to build any kind of abstraction of its scheme-free hierarchical database in whatever your favourite language.

Posted by: Rob at Monday, February 22 2010 04:13 AM (JU/Kj)


There are certain topics which make zealots come out of the woodwork to defend their particular religious point of view. Alternate Energy is one of those, as I've found out by personal experience.

It's surprising to learn that databases are like that.

Posted by: Steven Den Beste at Monday, February 22 2010 08:40 AM (+rSRq)

64 . All databases suck because you cant find 1 that does "everything" for you. If I were to do a writeup on why Windows 7 doesn't cook me fried beans, it'd be as sensible, but more worthwhile.

Posted by: sofar at Tuesday, February 23 2010 10:40 PM (EIONZ)

65 All databases suck because none of them deliver an acceptable combination of price, performance, scalability and convenience for my application.

I don't want them to "do everything".  I want them not to require me to jump through flaming hoops in my programming, or spend triple my current operating budget, or both.

Go away.

Posted by: Pixy Misa at Wednesday, February 24 2010 01:11 AM (PiXy!)

Hide Comments | Add Comment

Comments are disabled. Post is locked.
100kb generated in CPU 0.0211, elapsed 0.1024 seconds.
56 queries taking 0.0864 seconds, 391 records returned.
Powered by Minx 1.1.6c-pink.