Sunday, February 07
Anyone?
Quick recap of databases that suck - or at least, suck for my purposes - and some that I'm still investigating.
SQL
- MySQL Lacks INTERSECT and EXCEPT, lacks array support, has only so-so full-text indexing, and makes you choose between concurrency (InnoDB) and full-text indexes and GIS (MyISAM) - you can't have both on the same table. (A workaround for the missing INTERSECT is sketched just after this list.)
- PostgreSQL Provides arrays, concurrency and full-text indexes and GIS, but a lot of this is lost in a twisty maze of plugins and thoroughly non-standard operators. And the full-text indexing sucks.
- Ingres Ingres is free these days (if you don't need a support contract). It's a good, solid database, but doesn't actually offer anything I can't get from MySQL with InnoDB.
- Firebird Doesn't seem to offer anything more than MySQL or PostgreSQL or Ingres. Which doesn't mean that it's bad, but doesn't really help me either.
- SQL Server Needs Windows, which is worth an automatic 6 demerits, even though I can get enterprise-level Windows and SQL Server products for free. (My company is a Microsoft BizSpark member.) Full-text and GIS, INTERSECT and EXCEPT are all there, but still no arrays.
- IBM DB2 Costs too much.
- Oracle Costs way too much.
- Progress / OpenEdge Solid database, lovely 4GL, but the last time I looked at it (2006) it was still mired in 16-bitness (!), and the 4GL is too slow for anything complicated. It's also expensive, with a screwed-up pricing model. I'd use it if I could.
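As an aside, the missing INTERSECT can usually be faked with a self-join. A sketch only, using the MySQLdb client - the table and column names here are made up for illustration:

import MySQLdb

# Intersection of "posts tagged 'databases'" and "posts tagged 'rant'",
# without INTERSECT: join the tag table against itself.
conn = MySQLdb.connect(db='minx')
cur = conn.cursor()
cur.execute("""
    SELECT a.post_id
    FROM post_tags a
    JOIN post_tags b ON a.post_id = b.post_id
    WHERE a.tag = 'databases' AND b.tag = 'rant'
""")
for (post_id,) in cur.fetchall():
    print post_id

It works, but having to hand-roll basic relational algebra is exactly the sort of thing I'm complaining about.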
NoSQL
- Redis Nice feature set, and looks very useful for small systems, but the current version is strictly memory-based. (It's persistent through snapshots and logging, but the whole database must fit in memory.) The developers are working on this, though. The API could do with a tidy-up too; it has different calls for the same operation on different data structures.
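To show what I mean about the API - a quick sketch with the third-party redis-py client; the same conceptual operation gets a different verb for each data structure:

import redis

r = redis.Redis(host='localhost', port=6379)
r.rpush('queue', 'job1')   # append to a list
r.sadd('tags', 'anime')    # add to a set
print r.llen('queue')      # length of a list...
print r.scard('tags')      # ...but a different call for the size of a set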
- MongoDB Very nice feature set. It's a document database, but it stores the documents in a JSON-like structure (called BSON) and can nest documents arbitrarily and inspect the fields within a document and build indexes on them. But its memory handling is lousy; while it's not explicitly memory-based, I wouldn't want to run it on anything but a dedicated physical server with more memory than my total database size. I could just throw money at it and put another 24GB of RAM in the server (far cheaper than a commercial RDBMS license) which would last us for a while, but I have serious doubts about its robustness as well.
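For reference, here's what those features look like through PyMongo - a sketch only, with made-up collection and field names:

from pymongo import Connection, ASCENDING

db = Connection('localhost')['minx']
db.posts.insert({'title': 'Databases', 'date': '2010-02-07',
                 'author': {'name': 'Pixy Misa'},    # nested document
                 'tags': ['databases', 'rant']})
db.posts.create_index([('author.name', ASCENDING)])  # index on an inner field
for post in db.posts.find({'author.name': 'Pixy Misa'}):
    print post['title']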
- CouchDB Written in Erlang, which is always a warning sign. Erlang programmers seem to care about performance and reliability far more than they care about making a product that anyone would want to use. In this case, instead of MongoDB's elegant query-by-example (with extensions) I write map/reduce functions in JavaScript and send them to the server. In what universe is that an improvement on SQL? On the plus side, it apparently has replication. On the minus side, it's an Apache project, and I have yet to meet an Apache project that didn't suck in some major way.
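To make that concrete - with the couchdb-python package (assuming a 'minx' database), even a trivial ad-hoc query means shipping a JavaScript map function to the server:

import couchdb

db = couchdb.Server('http://localhost:5984/')['minx']
map_fun = '''function(doc) {
  if (doc.user == 'Pixy Misa') emit(doc.date, doc.title);
}'''
for row in db.query(map_fun):   # temporary view, executed server-side
    print row.key, row.value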
- HBase Looks good if you have billions of very regular rows (which I do at my day job, but not here). Nothing wrong with it, but not a good fit.
- Project Voldemort Pure evil. No, wait. This one came out of LinkedIn. It's one of the recent flock of inherently scalable (automatic sharding and multi-master replication) key/value databases. In their own words, [it] is basically just a big, distributed, persistent, fault-tolerant hash table. That's a very useful thing, but I need defined ordering (multiple defined orderings for the same dataset, in fact) which a hash table can't give me.
- Cassandra This is Facebook's distributed hash table thingy (it's like the old days when every server company designed their own CPU). May have some vague concept of ordering, so I'll take a closer look.
- Jackrabbit It's a Java/XML datastore from the Apache Foundation. Uh-uh. I've used ActiveMQ, guys. You can't fool me twice. I'd sooner chew rusty nails.
- Riak Bleah. Another key/value/map/reduce thing. In their own words:
A "map phase" is essentially just a function ("F") and an argument ("A") that is defined as part of a series of phases making up a given map/reduce query. The phase will receive a stream of inputs ("I"), each of which consists of a key identifying a Riak object and an optional additional data element to accompany that object. As each input is received by the phase, a node that already contains the document ("D") corresponding to "I" will run
Clear? Right. Not interested at all.F(D,A)
and stream along the results to the next phase. The point here is that your function can be executed over many data items, but instead of collecting all of the data items in one place it will execute wherever the data is already placed. - LightCloud A distributed key-value store from Plurk. On the plus side, it's written in Python, supports Tokyo Tyrant and/or Redis for a back end, and "plurk" is fun to say. On the downside, it seems to be just a key/value database and not all that fast; it doesn't seem to expose the more interesting features of Tokyo Cabinet or Redis. It does at least have some update-in-place operations.
- GT.M GT.M is a crocodile. That's not an aspersion, exactly. Crocodiles were contemporaries of the dinosaurs, but when the dinosaurs went extinct, the crocodiles survived, and they are still around and occasionally snacking on jumped-up bipeds today. It's a hierarchical key-value store with a variety of access mechanisms. It's unquestionably powerful, but it looks clunky; the MUMPS code reminds me of the systems I was employed to replace as a little boy programmer in the '80s, and the Python interface doesn't actually look like Python, but more like some odd offspring of Cobol and Pascal.
- Neo4j Neo4j is a graph database, which is not something I've worked with before. Graphs are a very general data structure for mapping relationships; while relational databases model parent-child relationships well, graphs are the natural model for networks of friends (for example) where you can end up back where you started by any number of differing paths. The shortcoming of graphs is that they do not have a defined order, which is something I need a lot of around here.
- Berkeley DB An oldie but a goodie. An embedded, transactional database. You can shove pretty much anything into it; it doesn't care. No query language, but does have indexes. One problem is this clause from the license:
Redistributions in any form must be accompanied by information on how to obtain complete source code for the DB software and any accompanying software that uses the DB software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions.
Any code that uses the DB software? I assume they mean direct code embedding/linking, but that's pretty broad. And it's really just a library, albeit a good one; it could serve as the basis for a database server, but it isn't that by itself.
- Metakit Metakit is a column-oriented database library, with a very nice, clean Python interface. For example, to display all posts by user 'Pixy Misa', you could simply write:
for post in posts.select(user = 'Pixy Misa'):
    print post.title, post.date
The problem is, it doesn't scale. I tried using it for the first pass at Minx, about four years ago, and it broke long before it reached our current database size. Like MongoDB, nice semantics, not so great on the implementation.
- Tokyo Cabinet / Tokyo Tyrant / Tokyo Dystopia, VertexDB Tokyo Cabinet is a database library similar to Berkeley DB, but licensed under the no-worries LGPL. Tyrant is a "lightweight" database server built on Cabinet, Dystopia a full-text search engine built on Cabinet, and VertexDB a graph database built on Cabinet. I haven't explored these in depth yet because the standard Tokyo Cabinet distribution doesn't include Python libraries (Perl, Ruby, Java and Lua, but no Python?), but there are third-party libraries available.
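For the record, here's what Cabinet looks like through pytc, one of those third-party bindings - a sketch only, using the hash-table engine and made-up keys:

import pytc

db = pytc.HDB()   # Tokyo Cabinet hash database
db.open('posts.tch', pytc.HDBOWRITER | pytc.HDBOCREAT)
db.put('post:1', 'Anyone?')
print db.get('post:1')
db.close()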
- Xapian and Omega Xapian is a full-text search library, and Omega a search engine built on Xapian. In fact, Xapian is more than that; it can do range searches on strings, numbers, and dates as well, and can store arbitrary documents. It's quite good for searches, but not really suited to general database work.
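A rough sketch of Xapian's Python bindings, indexing and then searching a single document - the database path and text are illustrative:

import xapian

db = xapian.WritableDatabase('search.db', xapian.DB_CREATE_OR_OPEN)
doc = xapian.Document()
doc.set_data('Databases that suck')          # payload returned with matches
indexer = xapian.TermGenerator()
indexer.set_document(doc)
indexer.index_text('databases that suck')    # full-text terms
db.add_document(doc)

enquire = xapian.Enquire(db)
enquire.set_query(xapian.QueryParser().parse_query('databases'))
for match in enquire.get_mset(0, 10):        # top ten matches
    print match.document.get_data()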
Posted by: Pixy Misa at 01:17 AM | Comments (67) | Add Comment | Trackbacks (Suck)
Post contains 1381 words, total size 10 kb.
One additional comment, per your parenthetical on Project Voldemort; the generic NoSQL answer to "order" is that you have a key that you store the order in: "blog_posts_(blog_id)_order_chrono_asc" => [1, 2, 4, 8, 3, ...]. Every time you add a new post, you have to update all the relevant index keys. Yes, this sucks and is prone to bugs... there's a reason I'm going with the hybrid approach.
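A sketch of that pattern - 'store' here stands in for any key/value client with get/put, and yes, the read-modify-write is racy without some kind of compare-and-set:

def add_post(store, blog_id, post_id):
    key = 'blog_posts_%s_order_chrono_asc' % blog_id
    order = store.get(key) or []
    order.append(post_id)   # chronological: new posts go on the end
    store.put(key, order)   # racy: needs compare-and-set to be safe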
Nice summary, though; this is the most complete listing of these technologies I've seen.
Posted by: Jeremy Bowers at Sunday, February 07 2010 04:31 AM (yu3og)
Posted by: RickC at Sunday, February 07 2010 04:56 AM (8GbPX)
Glad to see another Progress programmer - I worked with it from 1988 to 2006, and I still haven't found anything better for ad-hoc data manipulation.
Posted by: Pixy Misa at Sunday, February 07 2010 05:34 AM (PiXy!)
-j
Posted by: J Greely at Sunday, February 07 2010 11:52 AM (Pue/B)
Posted by: BJ Clark at Wednesday, February 10 2010 05:57 AM (1SMNd)
Posted by: Pixy Misa at Wednesday, February 10 2010 06:41 AM (PiXy!)
Posted by: Pixy Misa at Wednesday, February 10 2010 07:06 AM (PiXy!)
Posted by: Pixy Misa at Wednesday, February 10 2010 07:07 AM (PiXy!)
Posted by: Guillaume Theoret at Wednesday, February 10 2010 09:47 AM (F3Kuu)
Minx is a blog/forum/wiki/social networking system and an evolving application platform. I need a database suited to that.
So primary use cases are -
- Document store, where either the documents don't change much or versioning is actively desirable.
- Fixed-length randomly accessed data, like statistics.
- Variable-length randomly accessed data that fits nicely in memory (user tables, session data, etc).
What I do need -
- Fast operation on complex sorts and selects.
What I don't need -
- Transactions.
I guess Pita (see more recent posts) is a database for mid-scale social networks. My goal is for it to be suitable for systems from hundreds of thousands of documents to hundreds of millions (I currently have around three million posts and comments). I don't plan to even consider the problem of billions of documents at this point (that's what I do at my day job). I do plan to keep it small and simple enough that anyone starting a social web project would think of using it.
Posted by: Pixy Misa at Wednesday, February 10 2010 09:55 AM (PiXy!)
"And the full-text indexing sucks."
What don't you like about PostgreSQL's full text search? It works for me; curious why you don't like it.
Posted by: Eugene Koontz at Wednesday, February 10 2010 02:02 PM (0ivZI)
Posted by: Pixy Misa at Wednesday, February 10 2010 02:21 PM (PiXy!)
Posted by: Djui at Wednesday, February 10 2010 05:46 PM (NyKge)
I took a quick look - given all its constraints, I'm not sure who it is good for!
Posted by: Pixy Misa at Wednesday, February 10 2010 05:53 PM (PiXy!)
Posted by: Peter Neubauer at Wednesday, February 10 2010 06:24 PM (QZXb3)
So, who's linked to me?
Posted by: Pixy Misa at Wednesday, February 10 2010 06:27 PM (PiXy!)
Posted by: Rob at Wednesday, February 10 2010 09:34 PM (JU/Kj)
Posted by: Pixy Misa at Wednesday, February 10 2010 09:46 PM (PiXy!)
MUMPS!
I try to avoid programming languages that are as old as me.
Except Algol. Algol rules.
Posted by: Pixy Misa at Thursday, February 11 2010 12:41 AM (PiXy!)
Are you now able to choose a database?
Posted by: Mike at Thursday, February 11 2010 12:41 AM (SK6Rk)
Posted by: Rob at Thursday, February 11 2010 12:54 AM (JU/Kj)
And no, that doesn't make my problem easier. It makes it worse.
Posted by: Pixy Misa at Thursday, February 11 2010 01:02 AM (PiXy!)
Posted by: Pixy Misa at Thursday, February 11 2010 01:09 AM (PiXy!)
Posted by: Rob at Thursday, February 11 2010 01:19 AM (JU/Kj)
(Mind you, there are those who would also accuse me of writing un-Pythonic code, but I think a two-line for loop is more readable than something like
print sum(map(float, open('test.numbers').readlines()))
Human beings read sequentially, not from the middle of the line outwards in both directions. Well, except for German, which is a stack-based language.)
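For comparison, here's the loop version I'd write - it reads top to bottom:

total = 0
for line in open('test.numbers'):
    total += float(line)
print total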
There are two particular interfaces I plan to steal from - MongoDB's PyMongo and MongoEngine and Metakit's Mk4py - plus the TreeDict data structure.
Posted by: Pixy Misa at Thursday, February 11 2010 01:44 AM (PiXy!)
Posted by: Rob at Thursday, February 11 2010 02:14 AM (JU/Kj)
Posted by: Dave at Thursday, February 11 2010 06:17 AM (SCYxm)
I agree it is hard to set up the first time, but once you understand it you can script it - you could even use Python if you like:
http://www.postgresql.org/docs/8.4/interactive/plpython-database.html
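For instance, a rough sketch of scripting the tsvector setup with psycopg2 - table and column names made up for illustration:

import psycopg2

conn = psycopg2.connect('dbname=minx')
cur = conn.cursor()
cur.execute("ALTER TABLE posts ADD COLUMN body_tsv tsvector")
cur.execute("UPDATE posts SET body_tsv = to_tsvector('english', body)")
cur.execute("CREATE INDEX posts_tsv_idx ON posts USING gin(body_tsv)")
conn.commit()
cur.execute("SELECT title FROM posts "
            "WHERE body_tsv @@ to_tsquery('english', 'database')")
for (title,) in cur.fetchall():
    print title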
As for the size increase, I guess my databases are too small for that to be a factor for me, and the growth is linear with the number of rows.
Posted by: Eugene Koontz at Thursday, February 11 2010 09:12 AM (0ivZI)
One important thing to realize about GT.M is that while the key-value database and the language are well integrated, either is usable without the other. So, you can easily define and create your own preferred Python or other API to connect to the database. The database engine itself is extremely scalable. There are production sites with individual database files in the 100s of GB and aggregate (logical) databases in the TB. The largest production site I know of serves approximately 10,000 concurrent users during peak usage. Yet, the footprint of the database engine itself on disk is on the order of 20MB, so it also fits nicely into virtual machines and software appliances. GT.M also has excellent functionality for building applications that must remain available 24x365 in the face of not only unplanned events (like crashes) but also planned events (such as system and application upgrades). And yes, you can create an API that accesses all this from Python or anything else that uses a C compatible calling interface.
By the way, for an example of MUMPS code see http://docs.google.com/View?id=dd5f3337_15hssw6bdh (the header has a pointer to the problem that the program solves). Notice the support for multiprocess synchronization, international characters, and of course key-value access to the database that is as simple as accessing an array.
-- Bhaskar
(full disclosure - I manage GT.M)
Posted by: K.S. Bhaskar at Thursday, February 11 2010 09:59 AM (jcSjr)
But back then, horrible was the rule, because (a) language designers didn't have a lot of experience and (b) writing a compiler at all was on the edge of what was even possible.
Hence Cobol, early Fortran, Lisp, Snobol, and so on. All of them are thoroughly horrible languages. But all of them are a hell of a lot better than assembler. Likewise C, which came somewhat later and is concomitantly somewhat cleaner; the description of it as "a portable PDP-11 assembler" is still apt.
None of which excuses modern efforts like Go, which has an inferior syntax and largely inferior semantics to decades-old efforts like Algol. (I think that the entire family of curly-bracket languages should cease to exist, with the one exception of C, which at least had an excuse.... But that's a whole separate rant.)
Now, to the specific issue of MUMPS: When I look at that code, the first thing that strikes me is that I have no visual indication of program flow. I can glance at a short C, Pascal, Java, Python or Ruby program and get an immediate idea of where it comes in, where it goes, and what comes out of this. Not so with MUMPS.
The second thing I notice is that there's punctuation where I'm not expecting punctuation. Most programming languages fail at this; certainly C does, Perl and PHP with their $, Python and Ruby with their respective @. MUMPS isn't as bad as line-noise Perl, but it is opaque to even an expert programmer who is new to the language. That's typical of older languages too.
Posted by: Pixy Misa at Thursday, February 11 2010 01:17 PM (PiXy!)
Posted by: Rob at Thursday, February 11 2010 06:42 PM (JU/Kj)
Posted by: stuffduff at Friday, February 12 2010 05:15 AM (+7HGt)
As I noted in my recent post regarding Progress, there's a right way to do this, and it's been done, and it's not the MUMPS way. MUMPS is still useful, but there are still crocodiles.
The fact that most languages are actually worse at this sort of data handling than MUMPS doesn't mean that there aren't languages that are better. And that's purely on semantics; syntax-wise, MUMPS is just plain horrible.
Posted by: Pixy Misa at Friday, February 12 2010 05:35 AM (PiXy!)
Syntax matters. If programmers back away when they see the sample code, your language is not going to be a success, no matter what semantic advantages it brings to the table. They might be wrong, in a sense, to reject your language for its surface syntax, but it's far easier to evaluate syntax than semantics. And I'd argue that bad syntax has a permanent cost, particularly in maintenance.
Posted by: Pixy Misa at Friday, February 12 2010 05:40 AM (PiXy!)
A major point in my message is that the language and API are very separable from the database / datastore with GT.M. One of the lessons from Rob is that with m_python, M/DB and M/DB:X (discussed at his http://mgatgeway.com site), he is able to rapidly create a variety of APIs/interfaces into the same key-value database.
The content of databases lives on even as languages evolve, and the ability to access the same long-lived data easily from multiple APIs and interfaces is essential.
Posted by: K.S. Bhaskar at Friday, February 12 2010 06:30 AM (jcSjr)
Posted by: Rob at Friday, February 12 2010 08:06 PM (JU/Kj)
Sorry, I've been busy hating XML parsers today and haven't had time to hate databases.
Posted by: Pixy Misa at Friday, February 12 2010 09:01 PM (PiXy!)
Posted by: Rob at Friday, February 12 2010 09:49 PM (JU/Kj)
"Does anyone but the developers of those databases really care?" Yes! Because the abstractions of the language and the mindset of the developers leak out into the public API and from there into the applications built on it.
This is one of the great invariants of computer science - all abstractions leak.
Of course this does not make CouchDB or SimpleDB or other Erlang applications useless, or GT.M either. But it actually matters, and it really does affect the adoption of these products.
You can build a wonderful system in Smalltalk, and its users don't need to know that it's written in Smalltalk (except of course that it doesn't look quite like any of the other applications they run), and everyone will be happy (mostly) - until something goes wrong (and something always goes wrong) and you have to find a Smalltalk developer to fix it. There's a lot of different things going on here - simple practicality, network effects and so on - but it's very real.
Posted by: Pixy Misa at Friday, February 12 2010 10:42 PM (PiXy!)
Posted by: W-Mark Kubacki at Saturday, February 13 2010 05:31 AM (mjrUB)
With GT.M replication, all business logic (updates) is processed on one node. From that one node, updates are streamed in real time to as many as sixteen replicating nodes, each of which can in turn stream to 16 more (for a total of 256 at the third level) and so on without limit. The replicating nodes can be used for queries but not updates. In the event the original node goes down for an unplanned event, or a planned event, any downstream node can become the new originating node (although network considerations probably suggest making one of the 16 second level nodes the new originating node). When the original node is returned to service, it becomes a replicating node.
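(Spelling out the fan-out arithmetic: counting the originating node as level zero, level n of the tree can hold 16**n nodes:)

for level in range(4):
    print level, 16 ** level   # 1 originator, then 16, 256, 4096 nodes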
That's a start but there's actually a lot more to it (for example, the nodes have eventual Consistency not just eventual consistency, the difference being that the design provides for all nodes ultimately having the same path through the database state space, not just having the same state).
Posted by: K.S. Bhaskar at Saturday, February 13 2010 07:53 AM (bLn04)
I'm aiming (if I go ahead with this project) to offer a choice, either eventual consistency (fine for most web app requirements) or state space consistency. So the comments on a blog can be eventually consistent, but the payment history for your accounts database needs to do better than that!
Posted by: Pixy Misa at Saturday, February 13 2010 12:40 PM (PiXy!)
you may want to check out Terrastore as well: http://code.google.com/p/terrastore
It's a (brand new) distributed document store with per-record strong consistency.
I'd be curious to know where it sucks: I'll take your feedback as an opportunity to improve it.
Cheers,
Sergio B.
Posted by: Sergio Bossa at Saturday, February 13 2010 06:58 PM (ZVAu2)
Posted by: hedgehog at Saturday, February 13 2010 07:37 PM (3JNr9)
hedgehog - I completely forgot MonetDB, and I have looked at it in the past. Thanks for reminding me! I'll add it to my list.
Posted by: Pixy Misa at Saturday, February 13 2010 10:04 PM (PiXy!)
Posted by: Pixy Misa at Saturday, February 13 2010 10:21 PM (PiXy!)
glad to know you're looking into Terrastore.
What do you mean, more specifically, by "strong ordering"?
You can perform dynamically ordered range queries (on keys) with optional predicates (on values): does it fit your needs?
Posted by: Sergio Bossa at Saturday, February 13 2010 11:32 PM (ZVAu2)
- Find the last 20 comments from a given user on a given blog
- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia
And I need to do it across millions of documents in under a millisecond. Just about every page produced by the system is performing queries like this, and with the new features coming up it's only going to increase.
So one of my main concerns is having a flexible selection of indexing methods (B-tree, R-tree, Quadtree, Trie etc). And I'm a lot less worried about insert time than the ability to zoom in on specific subsets of my data. With MySQL, this sort of query is the second most time-consuming operation in the entire system (number one is HTML sanitisation on comments; on-demand image processing is even more CPU intensive, but happens far less often).
There is another approach, and it's something I've implemented at my day job, where the documents number in the billions and most other approaches simply don't work. And that's to pre-define the complex subset operations and prebuild them. That's certainly something that can be done with a key/value store, and it's the only way to go for really high volumes. But I have a dynamic template language (and in the upcoming version, a matching API and scripting language), and I don't want to put arbitrary restrictions on users if I can avoid it. Optimising the common case, though - that I can certainly spend some time on.
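A sketch of that prebuilding approach, with 'store' again standing in for any key/value client - maintain a capped, ready-sorted list at write time so the read is a single key fetch:

def on_new_comment(store, blog_id, user_id, comment_id):
    key = 'recent_comments:%s:%s' % (blog_id, user_id)
    recent = store.get(key) or []
    recent.insert(0, comment_id)   # newest first
    store.put(key, recent[:20])    # keep only the 20 we ever display

def last_20_comments(store, blog_id, user_id):
    # "last 20 comments from a given user on a given blog" as one fetch
    return store.get('recent_comments:%s:%s' % (blog_id, user_id)) or []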
Posted by: Pixy Misa at Sunday, February 14 2010 01:44 AM (PiXy!)
"... that's a valuable advantage of master-slave replication systems - you never see local inconsistency. You can be out-of-date, but the data you view from a given replication node will be consistent in itself."
Actually, it's even a little stronger than that. If you use transaction processing on the originating instance, and you transfer a micropayment from an account to view an article under an ACID transaction, and this is replicated, then no process on any instance will ever see an intermediate database state with the account debited but the article not paid for. So, the replication preserves the ACIDity.
Posted by: K.S. Bhaskar at Sunday, February 14 2010 02:27 AM (eyb9K)
Posted by: Pixy Misa at Sunday, February 14 2010 04:02 AM (PiXy!)
"- Find the last 20 comments from a given user on a given blog
- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia"
A quick comment - when storing, use the negative of time as the time index (array subscript, key - whatever you want to call it). So, when inserting, you're always inserting at the front, since a later post has a smaller index than an earlier post. It may take a few millibleems longer to insert, but if most of what you are doing is reading, the extra effort taken to insert is more than offset by faster reads. Many (maybe even most?) database engines move faster front to back than back to front.
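A sketch of the idea - in GT.M, numeric subscripts collate numerically, so you can use the literal negative; for a store with string keys, subtracting from a far-future constant keeps fixed-width keys sorting the same way:

import time

FAR_FUTURE = 10 ** 10   # epoch seconds, roughly the year 2286

def reverse_time_key(ts=None):
    if ts is None:
        ts = int(time.time())
    return '%010d' % (FAR_FUTURE - ts)   # later post -> smaller key

# An ascending scan over these keys yields newest-first,
# and every insert lands at the front of the range.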
Posted by: K.S. Bhaskar at Sunday, February 14 2010 09:28 AM (eyb9K)
Posted by: Stefano Rivera at Monday, February 15 2010 04:34 PM (rwGpJ)
Of course, what I need is neither SQL nor embedded, so...
Posted by: Pixy Misa at Monday, February 15 2010 04:45 PM (PiXy!)
-j
Posted by: J Greely at Tuesday, February 16 2010 03:36 AM (2XtN5)
You are running the database on a separate server, right?
Best of luck to you with finding the right question(s) to ask.
Posted by: Pavel Zaitsev at Wednesday, February 17 2010 03:57 AM (MciT2)
"- Find the last 20 comments on posts in a given category on a given blog
- Find the last 20 posts with a given tag across all blogs
- Find the 20 most commented posts in the last 48 hours by users in Australia"
I think there is a perfect fit for such kind of queries:
http://www.sphinxsearch.com/index.html
Full text search engine with QL support and MySQL integration.
And it is, in fact, really fast.
Posted by: Vlad at Wednesday, February 17 2010 07:52 AM (czPrG)
What I need is an index that spans levels of the document hierarchy. MongoDB can do this, but MongoDB has other problems. As Pavel said, it eats all the RAM you've got.
Posted by: Pixy Misa at Wednesday, February 17 2010 10:11 AM (PiXy!)
I hate SQL anyway.
Posted by: Pixy Misa at Wednesday, February 17 2010 10:15 AM (PiXy!)
Posted by: Pixy Misa at Wednesday, February 17 2010 10:17 AM (PiXy!)
I also suspect you would have the same complaints about Informix and Informix XPS that you have about Oracle and DB2?
Posted by: Noir at Sunday, February 21 2010 05:06 AM (BPBpw)
Posted by: Pixy Misa at Sunday, February 21 2010 12:46 PM (PiXy!)
Posted by: Rob at Monday, February 22 2010 04:13 AM (JU/Kj)
There are certain topics which make zealots come out of the woodwork to defend their particular religious point of view. Alternate Energy is one of those, as I've found out by personal experience.
It's surprising to learn that databases are like that.
Posted by: Steven Den Beste at Monday, February 22 2010 08:40 AM (+rSRq)
Posted by: sofar at Tuesday, February 23 2010 10:40 PM (EIONZ)
I don't want them to "do everything". I want them not to require me to jump through flaming hoops in my programming, or spend triple my current operating budget, or both.
Go away.
Posted by: Pixy Misa at Wednesday, February 24 2010 01:11 AM (PiXy!)