Is NoSQL Finally Going Mainstream?

February 12th, 2010

It’s been a while since I enjoyed my adventures with CouchDB. I sure wish I had some extra time to pick this up again, but getting some sleep at night is nice too once in a while. I noticed that OO databases and document/key-value stores have been getting more and more traction lately, and I must say that it’s about time.

Rob Conery hits the nail right on the head in his post on Reporting in NoSQL.

Put as gently as I can – relational systems are an answer to a problem that we faced 30 years ago. What you’re doing now is nothing other than compensating for a lack of imagination from the platform developers. Think about it – we code using Object Oriented approaches, we store those objects in a relational system.

What we should all learn in this industry is to stop assuming that a relational database is the default option for storing the data of every solution we build. This is what we have been doing for a long time and it’s pure madness, plain and simple.

RDBMS don’t fit for holding your application’s data, and they don’t fit for reporting. They’re a solution for a problem that doesn’t exist anymore. Time to kick them to the curb.

The most typical setup you see is a single relational database that is used both for storing the data of an application and for reporting on this data. The relational schema usually sits somewhere between normalized and denormalized tables, which means a compromise for both needs. You can get away with this for small to medium-sized applications, but when you start working on mission-critical solutions with higher volumes, this compromise isn’t going to cut it anymore. This is why Greg Young, Udi Dahan and Mark Nijhof, amongst others, are advocating command query separation. For these kinds of solutions, you want the best option for handling commands, which could be an OO database or a document/key-value store (with or without event sourcing), and for reporting you’d want the best option available as well, such as an OLAP system. What I’m describing here is just the elevator pitch, so if you want to learn more, do check out the resources that these gentlemen have already put on their blogs.
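
As a rough illustration of that separation, here is a minimal sketch in Python. Everything in it is hypothetical: the names `OrderPlaced`, `CommandSide` and `SalesReport` are made up for this example, and plain in-memory lists and dictionaries stand in for a document store and a reporting store. It only shows the shape of the idea: commands record events on the write side, and a separate read model, shaped for one specific report, is derived from those events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event recorded by the command side."""
    order_id: str
    customer: str
    amount: float


class CommandSide:
    """Handles writes; a list stands in for an event or document store."""

    def __init__(self):
        self.events = []
        self.subscribers = []

    def place_order(self, order_id, customer, amount):
        # Business rules are checked here, before anything is stored.
        if amount <= 0:
            raise ValueError("amount must be positive")
        event = OrderPlaced(order_id, customer, amount)
        self.events.append(event)
        # Push the event to every read model that subscribed.
        for notify in self.subscribers:
            notify(event)


class SalesReport:
    """Read model shaped for one query: total sales per customer."""

    def __init__(self):
        self.totals = {}

    def apply(self, event):
        self.totals[event.customer] = (
            self.totals.get(event.customer, 0.0) + event.amount
        )

    def total_for(self, customer):
        return self.totals.get(customer, 0.0)


commands = CommandSide()
report = SalesReport()
commands.subscribers.append(report.apply)

commands.place_order("o-1", "alice", 40.0)
commands.place_order("o-2", "bob", 15.0)
commands.place_order("o-3", "alice", 10.0)

print(report.total_for("alice"))
```

In a real system the two sides would live in different stores and the read model would typically be updated asynchronously; the direct callback here is only there to keep the sketch short.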

I hope that one day we realize that a relational database was just a means for optimizing file storage, which is hardly a need anymore these days. We shouldn’t be struggling with how to solve the impedance mismatch between relational databases and OO programming in any kind of application. The one thing we should care about is how to provide solid and clean solutions to our businesses without having to worry about tables and those zealots with their holy database schemas. Just store the objects you want and worry about other things like the so-called ‘ilities’ and being able to respond to business needs in a timely manner. 
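
To make the impedance mismatch a bit more tangible, here is a small sketch in Python using only the standard library. The `Order` and `OrderLine` classes are hypothetical, a plain dictionary of JSON strings stands in for a document store, and an in-memory SQLite database stands in for an RDBMS. It stores the same object both ways: once as a whole document, and once split over two tables that have to be reassembled on the way back.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict, field


@dataclass
class OrderLine:
    product: str
    quantity: int


@dataclass
class Order:
    order_id: str
    customer: str
    lines: list = field(default_factory=list)


order = Order("o-1", "alice", [OrderLine("piston", 4), OrderLine("gasket", 1)])

# Document style: the whole object is stored and loaded as one unit.
# A dict keyed by id stands in for a document store.
documents = {}
documents[order.order_id] = json.dumps(asdict(order))

loaded = json.loads(documents["o-1"])
restored = Order(
    loaded["order_id"],
    loaded["customer"],
    [OrderLine(**line) for line in loaded["lines"]],
)

# Relational style: the same object is split over two tables and
# has to be reassembled from rows when it is read back.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, customer TEXT)")
db.execute(
    "CREATE TABLE order_lines (order_id TEXT, product TEXT, quantity INTEGER)"
)
db.execute("INSERT INTO orders VALUES (?, ?)", (order.order_id, order.customer))
db.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [(order.order_id, line.product, line.quantity) for line in order.lines],
)

customer = db.execute(
    "SELECT customer FROM orders WHERE order_id = ?", ("o-1",)
).fetchone()[0]
rows = db.execute(
    "SELECT product, quantity FROM order_lines WHERE order_id = ? ORDER BY rowid",
    ("o-1",),
).fetchall()
reassembled = Order("o-1", customer, [OrderLine(p, q) for p, q in rows])

print(restored == order, reassembled == order)
```

Both routes get the object back; the difference is how much mapping code sits between the object and the store, and that mapping code grows with every nested collection in the model.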

  • Silvano

    @Jan, sorry but in the real world data usually survives the apps. You’re not paid to just persist an object somewhere… you need to create data models that will be used in the future by different applications, using different programming platforms… and that may be used in completely different ways from how they were collected… today, and for a long time to come, relational data models are the best way to achieve this goal. I know, it’s hard to understand for a “pure” developer, but life is a little bit more complicated than “persist an object…” :)

  • http://www.drawntoscalehq.com Bradford

    “Mainstream” is a stretch, but when I hear people on the street complaining about how their SQL Database is a nightmare to deal with (and they’re not doing complex things with it), it’s certainly progress.

    Our startup, Drawn To Scale, is building a platform around the points you’ve made: we’re solving different problems today than we did 30 years ago when the RDBMS was designed. It turns out that when you build something from the ground up to handle what we *really* do with data, it becomes scalable, fast, and easy to use :)

  • http://elegantcode.com Jan Van Ryswyck

    @Silvano If it’s just about the data, then why don’t we just hand out Excel? The data is associated with the app, not the other way around. Having multiple apps on the same database smells like a shared database:

    http://elegantcode.com/2009/03/28/about-a-shared-database/

    I never ever saw this work properly. And to quote Jeremy D. Miller on this

    “With very few exceptions, I’d say that at this point that if you’re writing ADO.Net code or SQL by hand, you’re stealing money from your employer.”

  • PT

    *sigh*

    Yet another developer who doesn’t understand the difference between the Relational Model and almost-Relational RDBMSes.

    Coupled with the common trait in developers that they always get “props” for working out new tech, and never for fully understanding old tech, it’s no wonder that this newest of database strawmen will die slowly, like XML and OO databases before it.

    Let’s keep it simple: the Relational Model is nothing more than a logical way to understand and therefore access/validate a data model. It has nothing to do with files, disks, tables, key-value stores, etc. That’s the physical implementation thereof.

    There is no such thing as semi- or unstructured data. There is always a structure. The question is how detailed it needs to be. I could create a “relational” database with one “table” and two attributes: Key and Value. I shove everything in it and call it ArmchairDB. I think the NoSQL people, like almost all developers, conflate the physical layer with the logical one all too often. I also find it funny when some NoSQL databases offer some sort of “logical querying language” that eventually grows up into some knock-off of SQL.

    Lastly, am I the only guy who’s never struggled with the “impedance mismatch”? Or, said another way, doesn’t see it as such, but rather as the semantic mismatch that inevitably exists between any two systems built with a different vision? In fact, the Relational Model, if well implemented, does have an easy solution for this so-called mismatch. If one offered proper domain support, RDBMSes would be even easier to use.

    • http://elegantcode.com David Starr

      I don’t suppose you could man up enough to make your point without being personally insulting to a man who has done so much for the craft of software development and our community in general?

      Oh, we won’t remove your comment, because it should stand in posterity to your rudeness. You should also note at this point that whatever technical message you were trying to make is now lost in this little drama.

      Good job, Mr. Professional.

  • Nick

    OODB – You order a car and it arrives outside your house.

    RDB – You order a car, and it’s posted piece by piece (field) through your letter box where it has to be reassembled.

  • http://elegantcode.com Jan Van Ryswyck

    @Nick I like that analogy. Thx for sharing.

  • ozzebolleO

    @David Starr

    PT’s rudeness was unnecessary but then again, no excuse to dismiss the technical message. I’m not disagreeing with that.

  • PT

    @David:

    Insulting? My apologies. It was not intended to be personally insulting.

    But, if you want to take the aggressive approach that “relational database was just a means for optimizing file storage”, then you best be prepared for like responses *on that topic*, because I will stand by the assertion that that is completely and utterly wrong.

    RM, and its derivative language SQL, is very powerful. Good for all scenarios? Absolutely not, but still one of the most powerful tools in any dev’s arsenal. I guess if that is perceived as insulting to Jan to find some*thing* they said or believe wrong, then I am at a loss.

    I could just as easily say “NoSQL databases answer the difficulty developers have with designing a suitable structure to data by letting anarchy reign and allowing them to store anything in any unstructured way”, but that wouldn’t be true, would it?

    I meant no disrespect to Jan as a person. I’m sure that if we met for a beer, we’d all have a good time, although you probably now would refuse. :) I shouldn’t have let my frustrations with the rampant NoSQL hype color my post.

    @Nick:

    Follow up to your analogy:

    RDB – After you assemble your car, which the dealer could have assembled for you before shipping, you then order a missing piston. You get the missing piston.

    OODB – You want to store just a piston in the warehouse, you can’t. You have to create a fake car around it. Once you store the piston, you order a piston, and you get a car, and instructions on how to trace down the part chart to get to the piston.

    I’m sure we can go on and on…

  • http://elegantcode.com Jan Van Ryswyck

    @PT
    Did you ever take a serious look into a specific NoSQL data store (e.g. CouchDB)? There is no ‘data model’ that constrains the documents one stores. Map-reduce isn’t a ‘knock-off of SQL’ either.

    I agree that you can use an RDBMS with a key/value table, and it has been done, but that somewhat defeats the point of buying Oracle, and SQL isn’t going to be of much help either.

    When I develop an app, I start with the domain model because that’s where the interesting functionality of the business domain lives. This is the place where I provide a model that corresponds with the feedback of the domain experts. I don’t care about a data model or database because it doesn’t interest the business (nor should it), so it doesn’t interest me either. When I’m done with a fully fledged model, the shape of the domain is almost always different from the tables laid out by the DB folks. I just need to be able to store these things and move on. NHibernate reduces some of the pain, but the mismatch is still there, and a NoSQL data store reduces it even further.

    I want to conclude by saying that the largest apps/websites in the world all run from a NoSQL DB so why wouldn’t we learn, pick this up and move on?

    PS: I’m sure you meant no disrespect. Discussions like these can get passionate and intense. Been there :-)

  • PT

    Hmm. So many interesting topics flying at once in the conversation that it’s hard to keep it brief and focused! :)

    The root of the problem I have is getting devs to differentiate the logical from the physical in the mixed-up world of SQL, RDBMSes, NoSQL, CAP, BASE, and other such complex beasts. There’s so much conflating going on. And, the movement’s extremely poor choice of a name only compounds the issue.

    First point: You will not get an argument from me that NoSQL (ugh!) databases currently address the physical, “CAP Theorem” problem better than ol’skool RDBMSes. Many RDBMSes (esp. startup-friendly, popular DBs like MySQL) weren’t built to address distributed petabytes of simply structured data with limited consistency needs but overwhelming partitioning ones.

    That doesn’t mean that:
    a) all systems have those kinds of FBish needs. In fact, FB and its ilk lie at the far end of the bell curve, and no, the intertubes won’t change that. :)

    b) that RDBMSes can’t be made adaptable to these scenarios. (ex: Drizzle)

    c) that we need to push down those CAP needs on to simpler web applications (which I am not sure if that is what you mean by “the largest apps/websites in the world all run from a NoSQL DB so why wouldn’t we learn, pick this up and move on”).

    Second point: SQL, and the Relational Model that fathers it, is a different topic. I am not sure why one would think that the Relational Model is not layerable[sic] over distributed key-value stores, but I am open to being beat on the head with a good reason! :) I am far from a guru in the world of developing NoSQL implementations.

    PS: Your comment on your practices of the domain as a starting point is interesting. I naturally gravitate to the exact opposite. I start from both the absolute “bottom” and absolute “top” of a typical web app, namely I use ORM (Object-Role Modeling) to capture the conceptual structure of the data (and many of its constraints), and use cases to capture the behaviors in the system, and from there I use the domain model as the Play-Doh layer to merge those two worlds.