reliability

I’m guessing your grandmother probably didn’t have an API permissions control layer, but if she did this wouldn’t be it.

This post is mostly about Nucleus, our name for the storage layer which drives the Total ReCal components. The only way to communicate with Nucleus is over our RESTful API. This comes as somewhat of a shock to some people who believe that the way to move data around is a batch script with direct database access, but I digress…

What I’m going to try to do here is summarise just how epically confusing our permissions handling system for Nucleus is, mostly for the benefit of Alex and myself who (over the next week or so) will be trying to implement this layer without breaking anything important. It’s really, really essential that we get this done before we start promoting the service because of a few simple reasons:

Data security is important, and we don’t want anybody being able to read everything without permission.
Data security is important, and we don’t want anybody being able to write all over the place without permission.
Changing this kind of thing on a live service is like trying to change the engine block on a Formula 1 car whilst it’s racing.
We need to be able to guarantee the system can hold up to DoS attacks or runaway processes hammering the APIs.
People are already asking for access to this data for important things, like their final year projects.

So, where to go from here? Let’s take a look at everything which will be going on in the finished version.

Server Rate Limiting

Even before the Nucleus code kicks in, the server is fine-tuned to avoid overloading from any IP address or hostname. Using a combination of the OS firewall and the web server configuration overall request rates and bandwidth usage is kept below thresholds to ensure that the server is never overloaded. Due to the RESTful nature of the API (in which each request must represent a complete transaction) we have no requirement to ensure server affinity, so if the load gets too heavy we can easily scale horizontally using pretty much any load balancer.

To keep the pipes clear for our ‘essential’ services we do maintain a whitelist of IPs which have higher (but still not uncapped) limits.

Key Based Access

The only way to access any data in Nucleus is with an access token, issued by our OAuth system. These come in two flavours, either a user token (which grants permission for a specific user), or an autonomous token (which is issued at an application level, and is ‘anonymous’). The very first thing that happens with any request is that the token it gives is validated. No token, no access. Invalid token, no access. Revoked token, no access. To keep things nice and fast we store the token lookup table in memory with a cache of a few minutes, since most requests occur in ‘bursts’.

Continue reading →

After looking at the initial brief for Total ReCal, we realised that it would be necessary to build a new data storage layer to handle the time/space information which drives the project. There are many reasons for this both technical and political, but the key reason is that since we are running what is effectively an abstraction and amalgamation service we want to be able to interface directly with our own copy of the data; here’s why.

Speed is often considered a luxury when dealing with large data sets, and especially in larger institutions it’s common to think nothing of waiting a few minutes for a report to finish building or for your operation to finish processing, but we wanted to offer something where you could happily hit it with 20-30 queries a second over an API. This is particularly relevant given our larger Nucleus un-project to expose public (and some private) data over APIs to allow mashups. In short, we don’t want to have to wait for even half a second whilst another service gets the data we’re after, and we especially don’t want to have to waste more time parsing the data into a useful format.

We looked at several possibilities for how to store the data. An obvious one to take a look at is a traditional RDBMS ((Relational Database Management System)) such as PostgreSQL or MS-SQL. In this instance we would most likely have been using MySQL, since it fits smoothly into the almost universally supported LAMP ((Linux, Apache, MySQL, PHP)) stack which is available on our key development server. Alex and myself are both well-versed in using MySQL as a database and interfacing with it using PHP, so should we have opted for an RDBMS it would be the obvious choice despite the rest of the University standardising on MS-SQL.

Continue reading →

Total ReCal

This isn’t your grandmother’s API permissions control layer…

Server Rate Limiting

Key Based Access

Why NoSQL?