Friday 29 July 2011

Why you should care about: idempotence and nullipotence

(The information in the post is somewhat basic and will probably be known to a lot of people, but it's meant more to get my own thoughts straight on the matter than anything else.)

In mathematics, an operation is called idempotent if it can be applied over and over without changing the answer. We've all seen such operations: a familiar example is the absolute value function. (Apply it to a number like -5 and you get back 5. Apply it again and you still get back 5, and so on.) An even simpler example is the identity function: applying it twice is the same as applying it once, which interestingly is the same as not applying it at all.
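To make the definition concrete, here's a minimal Python sketch that spot-checks f(f(x)) = f(x) on a few sample values. The helper name `is_idempotent_on` and the sample list are my own inventions for illustration, and passing on a handful of samples obviously doesn't prove idempotence in general.

```python
def is_idempotent_on(f, samples):
    """Return True if applying f twice equals applying it once for every sample."""
    return all(f(f(x)) == f(x) for x in samples)


def identity(x):
    return x


def negate(x):
    return -x


samples = [-5, -1, 0, 2, 7.5]

print(is_idempotent_on(abs, samples))       # True
print(is_idempotent_on(identity, samples))  # True
print(is_idempotent_on(negate, samples))    # False: negating twice undoes the first call
```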

In computer science, the word can either retain this meaning, and then be used to describe functions like Math.abs(), or take on a slightly different meaning: a function call is idempotent if calling it again leaves the system in the same state as calling it once, i.e. repeated calls have no further side effects. So, for example, a database query saying "insert or update row 42 with the given details" is idempotent -- whether it's executed once or a hundred times, the end result is going to be that the table has a row 42 with the given details.
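Here's a rough sketch of the side-effect flavour of idempotence, with a plain dict standing in for the database table. The function names (`upsert_row`, `append_row`) and the row contents are hypothetical, not any real database API.

```python
def upsert_row(table, row_id, details):
    """Insert or update: idempotent, the end state depends only on the arguments."""
    table[row_id] = dict(details)


def append_row(log, details):
    """Plain append: not idempotent, every call adds another row."""
    log.append(dict(details))


table = {}
log = []
details = {"name": "Ada", "city": "London"}

for _ in range(100):
    upsert_row(table, 42, details)
for _ in range(3):
    append_row(log, details)

print(table)     # {42: {'name': 'Ada', 'city': 'London'}}
print(len(log))  # 3
```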

Idempotence is also important when you're dealing with transactions and trying to redo one on failure: if you know a particular operation in a transaction is idempotent, you can redo it without worrying about whether it already happened the first time.
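A small sketch of why that matters for retries: suppose an operation takes effect but the acknowledgement gets lost, so the caller tries again. Everything here (`LostAcknowledgement`, `flaky`, `retry`, the account dict) is made up for illustration and isn't how any particular transaction system works.

```python
class LostAcknowledgement(Exception):
    """Raised when the operation was applied but the caller never heard back."""


def flaky(operation):
    """Wrap an operation so that its first call takes effect and then raises."""
    state = {"failed_once": False}

    def wrapper(*args):
        operation(*args)
        if not state["failed_once"]:
            state["failed_once"] = True
            raise LostAcknowledgement()

    return wrapper


def retry(operation, *args, attempts=3):
    """Call operation until it returns without raising, up to `attempts` times."""
    for _ in range(attempts):
        try:
            operation(*args)
            return
        except LostAcknowledgement:
            continue
    raise RuntimeError("gave up after %d attempts" % attempts)


account = {"balance": 100}


def set_balance(amount):      # idempotent: same end state however often it runs
    account["balance"] = amount


def add_to_balance(amount):   # not idempotent: every run adds again
    account["balance"] += amount


retry(flaky(set_balance), 150)
print(account["balance"])  # 150

account["balance"] = 100
retry(flaky(add_to_balance), 50)
print(account["balance"])  # 200, not the intended 150
```

With the idempotent operation the blind retry is harmless; with the non-idempotent one you'd need some extra bookkeeping to know whether the first attempt already happened.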

Related is the idea of nullipotence: a function is nullipotent if not calling it at all has the same side effects as calling it once or more. In practice, this simply means that the function doesn't have any side effects at all. A database query saying "get row 42" is a good example. Nullipotence is clearly a stronger condition than idempotence.
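As a sketch, a nullipotent read can be checked by comparing the state before and after any number of calls. The `table` dict and `get_row` function below are stand-ins I've invented, not a real database interface.

```python
import copy

table = {42: {"name": "Ada"}}


def get_row(row_id):
    """Read-only lookup; returns a copy so callers cannot mutate the table."""
    return copy.deepcopy(table.get(row_id))


before = copy.deepcopy(table)
for _ in range(10):
    get_row(42)

print(table == before)  # True: ten calls left the same state as zero calls
```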

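So why am I posting this on a Web-centric aggregator like Planet Mozilla? Well, it turns out that the ideas are fundamental to HTTP. Of the four basic HTTP verbs, GET is nullipotent (the HTTP spec's word for this is "safe"), PUT and DELETE are idempotent but not nullipotent, and POST isn't guaranteed to be idempotent. (It's straightforward to see why: GET only reads, PUT sets a resource to a given state, DELETE makes sure it's gone, and POST typically creates something new each time.) This means that: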
  • GET requests can be cached, since it doesn't matter whether the server sees the request at all.
  • It's safe for web crawlers to make GET requests.
  • You get to see an annoying but unavoidable dialog box if you refresh a POST request, asking whether you really want to send the form data again.

Of course, none of these requirements are actually enforced by anyone -- it's really easy to write a server-side script that modifies database rows on a GET request, for instance. They're definitely assumed by others that interact with your service, though, which means you really should adhere to them. On the flip side, you shouldn't use POST for an operation that would work with GET, since that'd lead to your users having an unnecessarily bad UX.
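To see the classification of the four verbs in action, here's a toy in-memory model. It's not a real HTTP server; `TinyStore`, `state_after` and the resource names are all hypothetical, and the methods just mimic what a well-behaved service would do for each verb.

```python
import itertools


class TinyStore:
    """A toy in-memory resource store; not a real HTTP server."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def get(self, key):
        return self.resources.get(key)

    def put(self, key, value):
        self.resources[key] = value

    def delete(self, key):
        self.resources.pop(key, None)

    def post(self, value):
        key = next(self._ids)  # a new resource every time
        self.resources[key] = value
        return key


def state_after(calls, operation):
    """Return the store contents after running `operation` `calls` times on a fresh store."""
    store = TinyStore()
    store.put("a", "original")
    for _ in range(calls):
        operation(store)
    return dict(store.resources)


operations = {
    "GET": lambda s: s.get("a"),
    "PUT": lambda s: s.put("a", "updated"),
    "DELETE": lambda s: s.delete("a"),
    "POST": lambda s: s.post("new item"),
}

for verb, op in operations.items():
    zero, once, twice = (state_after(n, op) for n in (0, 1, 2))
    print(verb, "nullipotent:", zero == once == twice, "idempotent:", once == twice)
```

Only GET leaves the state the same as making no call at all; PUT and DELETE settle after the first call; POST keeps changing things on every call.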

4 comments:

Xav said...

> On the flip side, you shouldn't use POST for
> an operation that would work with GET

Sometimes a POST ends up being used simply so that the parameters aren't exposed in the query string as they are with a GET. We recently had some code audited by a third party security firm who insisted that all calls to the back-end should be POSTs for precisely that reason.

Perhaps there's a need for a form of GET which puts the parameters in the request body, as with a POST, yet retains the nullipotency of GET.

ignorante said...

The "canonical" solution is for operations that "change the database" to use a POST request that returns a redirect response.

That way, you get the benefits of POST, and pressing F5 on the returned page doesn't give you that horrible message: instead of repeating the POST request, the browser repeats the GET request it made after being redirected.

That means handling GET and POST differently on the server side, which many web frameworks make really easy.

Plain old PHP is not one of them.

Sid said...

Yeah, that's the standard solution. It's not a hack either, because you're dividing the operation into a non-idempotent and an idempotent (well, nullipotent) part.
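A bare-bones sketch of that split, with no real networking: `handle` is a hypothetical function that returns a (status, headers, body) tuple, and the `/comments` path and 303 status are just one plausible way to set it up.

```python
comments = []


def handle(method, path, form=None):
    """Return (status, headers, body) for a toy request; no real networking."""
    if method == "POST" and path == "/comments":
        comments.append(form["text"])        # the non-idempotent part
        return 303, {"Location": "/comments"}, ""
    if method == "GET" and path == "/comments":
        return 200, {}, "\n".join(comments)  # the nullipotent part
    return 404, {}, ""


# The form submission changes state and answers with a redirect.
status, headers, _ = handle("POST", "/comments", {"text": "hello"})

# A browser would follow the redirect with a GET ...
status, headers, body = handle("GET", headers["Location"])

# ... so pressing F5 only repeats that GET, never the POST.
status, headers, body = handle("GET", "/comments")

print(status, comments)  # 200 ['hello']
```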

Ted Mielczarek said...

If you only care about modern browsers, you can also use the history.replaceState method to turn your POSTed page into a GETtable page client-side, without a redirect. This got fixed in Bugzilla recently, so after you make changes to a bug you wind up with a usable URL in the location bar:
https://bugzilla.mozilla.org/show_bug.cgi?id=577720