To NoSQL or not to NoSQL?

As part of Total ReCal we’ve been taking a look at the so-called NoSQL approach to databases. I gave a quick overview of NoSQL and why we were looking at it in a previous blog post, so I’m going to skip all the gory details of what NoSQL actually is (and why we’re using it), and leap straight into the discussion on if it’s any good, if it’s ready for prime-time, and if it’s ready for the HE sector to actually use in production.

Is it any good?

In a word, yes. In slightly more words, yes, but only if you use it in the right place. NoSQL is excellent at providing fast, direct access to massive sets of unstructured data. By ‘fast’ I mean ‘thousandths of a second’, and by ‘massive’ I mean ‘billions of items’. On the other hand, if you’re after rock-solid data integrity and the ability to perform functions like JOIN queries then you’re out of luck and you should stick to an RDBMS. The two approaches aren’t competing, but offer complementary functionality. A corkscrew and a bottle opener both let you into your drink, but it’ll be amazingly awkward to open your beer with a corkscrew.

Continue reading