Summary

  • Idempotence ensures that repeated operations produce the same result, making systems more reliable and resilient.
  • It’s present in everyday life (e.g., elevator buttons, crosswalk signals) and is crucial for APIs to handle duplicate requests safely.
  • In HTTP, methods like GET, PUT, and DELETE are idempotent, while POST and PATCH are not guaranteed to be.
  • Implementing idempotency improves resilience, scalability, performance, and simplicity in distributed systems.
  • Solutions include server-side hashing (e.g., Powertools for AWS) or client-generated idempotency keys.
  • Companies like Airbnb use structured idempotency to ensure safe financial transactions.

This strange word you’ve just read is present in everyday life and makes your financial transactions safe. How so? Let’s start from the top.

In math…

The term idempotence, introduced in 1870 by the American mathematician Benjamin Peirce, means “the quality of having the same power.” It describes elements of an algebra that remain invariant when raised to a positive integer power. In the context of unary operations, an idempotent function on a set X is a function such that f(f(x)) = f(x) for each x ∈ X. In other words, applying it twice gives the same result as applying it once. Some examples of such functions are absolute value, floor, ceiling, constant function, and identity function.

It also applies to binary operations: an operation ∗ is idempotent if x ∗ x = x for each x ∈ X. Examples? Union and intersection of sets or logical conjunction and disjunction.
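
If you prefer code to formulas, here is a minimal sketch of both definitions in TypeScript. The helper name isIdempotentOn and the sample values are my own choices for illustration, and checking a handful of samples is of course a spot check, not a proof.

```typescript
// Spot-check f(f(x)) === f(x) on a few sample inputs.
function isIdempotentOn(f: (x: number) => number, samples: number[]): boolean {
  return samples.every((x) => f(f(x)) === f(x));
}

const samples = [-2.5, -1, 0, 0.4, 3, 7.9];

const absIsIdempotent = isIdempotentOn(Math.abs, samples);
const floorIsIdempotent = isIdempotentOn(Math.floor, samples);
// Counter-example: adding 1 twice is not the same as adding it once.
const incrementIsIdempotent = isIdempotentOn((x) => x + 1, samples);

// Binary case: the union of a set with itself is the same set.
const s = new Set([1, 2, 3]);
const union = new Set([...s, ...s]);
const unionIsIdempotent =
  union.size === s.size && [...s].every((x) => union.has(x));
```

Here absIsIdempotent, floorIsIdempotent, and unionIsIdempotent come out true, while incrementIsIdempotent is false.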

And now the hard part’s behind us.

…And everywhere else

Imagine an elevator. You enter and press a button with the floor number. This time you want to go to the fifth floor. You press the button again, just to be sure. And a few more times so that the doors close faster. You have pressed the button with number 5 seven times. Do you get off on the 35th floor? No, right?

Or take the button at a crosswalk. Once you press it, any following pushes will not make anything happen. They will not speed up the process, cancel the request, or change anything.

Modern CPU architectures like ARM are built on a load-store instruction set architecture. In this architecture, instructions that might cause a page fault are idempotent. So, if a page fault occurs, the operating system can load the page from the disk and then simply re-execute the faulted instruction. Dealing with page faults is much more complex in a processor where such instructions are not idempotent.

Amazon Web Services Simple Queue Service offers a secure, durable, and available hosted queue for integrating and decoupling distributed software systems and components. One of the features of SQS is the so-called “at-least-once delivery.” Amazon SQS stores copies of messages on multiple servers to ensure redundancy and high availability. In rare instances, one of the servers storing a copy of the message may be unavailable when receiving or deleting the message. If this happens, the copy of the message will not be deleted on the unavailable server – you may receive this copy again. This can lead to a situation where one message is handled several times. That’s why the AWS SQS Developer guide encourages you to “design your applications to be idempotent.”
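
What could such an idempotent consumer look like? Below is a minimal sketch, assuming every message carries a unique ID. The QueueMessage shape, the in-memory Set and Map, and the function names are hypothetical stand-ins; this is not the SQS SDK, and a real consumer would keep the processed IDs in a durable store and make the check and the update atomic.

```typescript
// Hypothetical message shape; real queue messages carry more fields.
interface QueueMessage {
  messageId: string;
  body: { account: string; amount: number };
}

// In-memory stand-ins for a durable store.
const processedIds = new Set<string>();
const balances = new Map<string, number>();

// Applies a message at most once, no matter how often it is delivered.
function handleMessage(msg: QueueMessage): "applied" | "skipped" {
  if (processedIds.has(msg.messageId)) {
    return "skipped"; // duplicate delivery: leave the state untouched
  }
  const current = balances.get(msg.body.account) ?? 0;
  balances.set(msg.body.account, current + msg.body.amount);
  processedIds.add(msg.messageId);
  return "applied";
}

const deposit: QueueMessage = {
  messageId: "m-1",
  body: { account: "alice", amount: 50 },
};
const firstDelivery = handleMessage(deposit);
const secondDelivery = handleMessage(deposit);
```

The first delivery is applied and the second one is skipped, so the balance is credited only once.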

Looking at all of those examples might raise a question: if these systems work this way, why do your APIs, when requested with precisely identical payloads, modify the server state more than once?

The state of HTTP

To safeguard your APIs, first, we need to discuss the protocol your clients will most likely use. As you know, HTTP defines several methods for handling requests. Although each method has its own semantics, three properties describe every HTTP method: safety, idempotency, and cacheability.

What do these features mean?

  • Safety (sometimes loosely called security) determines whether a method changes the server state. In other words, safe HTTP methods are read-only methods: GET, HEAD, OPTIONS, and TRACE.
  • Idempotence determines whether the intended effect of a single request on the server is the same as that of several identical requests. Let me emphasise again: EFFECT ON THE SERVER. All safe methods, as well as PUT and DELETE, are idempotent.
  • Cacheability speaks for itself and applies to GET and HEAD methods and, under certain conditions, POST and PATCH.
A table comparing HTTP methods (GET, HEAD, OPTIONS, TRACE, PUT, DELETE, POST, PATCH, CONNECT) and their properties: Secure, Idempotent, and Cacheable. Green checkmarks indicate 'yes', red X's indicate 'no', and yellow question marks indicate uncertainty.
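
The same properties can be written down as a small lookup table. This is a sketch: the values follow the method definitions in the HTTP semantics specification (RFC 9110), with "conditional" standing in for the question marks, and the names HTTP_METHODS and isSafeToRetry are mine.

```typescript
type Cacheable = "yes" | "no" | "conditional";

interface MethodProperties {
  safe: boolean;
  idempotent: boolean;
  cacheable: Cacheable;
}

const HTTP_METHODS: Record<string, MethodProperties> = {
  GET:     { safe: true,  idempotent: true,  cacheable: "yes" },
  HEAD:    { safe: true,  idempotent: true,  cacheable: "yes" },
  OPTIONS: { safe: true,  idempotent: true,  cacheable: "no" },
  TRACE:   { safe: true,  idempotent: true,  cacheable: "no" },
  PUT:     { safe: false, idempotent: true,  cacheable: "no" },
  DELETE:  { safe: false, idempotent: true,  cacheable: "no" },
  POST:    { safe: false, idempotent: false, cacheable: "conditional" },
  PATCH:   { safe: false, idempotent: false, cacheable: "conditional" },
  CONNECT: { safe: false, idempotent: false, cacheable: "no" },
};

// A client may blindly retry a request only if its method is idempotent.
// Unknown methods are treated as not retryable.
function isSafeToRetry(method: string): boolean {
  return HTTP_METHODS[method.toUpperCase()]?.idempotent ?? false;
}
```

A retry policy built on this would resend a failed DELETE without hesitation, but would refuse to resend a POST unless the API offers extra protection.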

Idempotence of a method concerns only the state of the server; the response returned by each request may differ. For example, the first DELETE call will return 200 OK, while subsequent ones will return 404 Not Found.

A sequence diagram illustrating an idempotent DELETE request. The diagram shows a User, Client, and Server, with the Client making two DELETE requests for the same resource. The first request returns a 200 OK, and the second returns a 404 Not Found, demonstrating that subsequent identical requests don't change the server state.
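
The DELETE scenario is easy to simulate. In this minimal sketch, an in-memory Map plays the role of the server's storage, and handleDelete is a hypothetical handler that returns only the status code.

```typescript
// In-memory stand-in for the server's resources.
const resources = new Map<string, string>([["42", "invoice"]]);

// Returns the status code a server might send for DELETE /resources/:id.
function handleDelete(id: string): number {
  if (!resources.has(id)) {
    return 404; // nothing to delete, state stays as it is
  }
  resources.delete(id);
  return 200;
}

const firstStatus = handleDelete("42");
const sizeAfterFirst = resources.size;
const secondStatus = handleDelete("42");
const sizeAfterSecond = resources.size;
```

The two status codes differ (200, then 404), but the server state after the second call is exactly the same as after the first one, and that is what idempotence is about.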

Safeguarding your APIs

Why should you consider idempotence?

As you can see, HTTP methods give you just a bit of safety. To fully protect your APIs, additional steps need to be taken. Let’s focus on the whys before we jump into the implementation details.

Idempotency gives our systems the following benefits:

Resilience

Network failures are inevitable. Idempotency is crucial in handling these failures by allowing clients to resend requests without fear of unintended consequences. This approach ensures data consistency and avoids duplicate operations, making systems more robust.

Scalability

Distributed systems often spread tasks across multiple servers. Idempotency is vital in such environments as it enables different servers to process the same request without altering the outcome. This not only enhances scalability but also improves fault tolerance.

Performance

Idempotency can significantly optimise performance by improving caching mechanisms. When operations are guaranteed to lead to the same final state, servers can cache results effectively, resulting in quicker response times.

Simplicity

Designing operations to be idempotent simplifies code and reduces the complexity of error handling, allowing developers to concentrate on core functionalities without being concerned about the side effects of repeated executions.

How to implement idempotence?

Today, I want to show you two different methods of protecting your APIs. The first method is implemented in Powertools for AWS, a set of tools in different programming languages that implement Serverless best practices and increase developer velocity.

This sequence diagram illustrates the idempotency mechanism of a Lambda function. It shows how a DynamoDB table, keyed by a hash of the request payload, tracks the status of requests. This prevents duplicate executions by checking for existing 'COMPLETE' records before processing a new request, ensuring that retried requests with the same payload return the same result without re-execution.

It uses Amazon DynamoDB to store the idempotency keys, which are calculated as a hash of the whole payload (or part of it; it’s configurable!) passed to the Lambda function.
If a record for the key exists and has not expired, the stored response is returned without running the function again.
If there is no record, or it has expired, the response is calculated and stored before it is returned to the client.
This simple mechanism protects the system well.
And with Powertools for AWS, it is a matter of a couple of lines of code.
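
To make the mechanism concrete, here is a rough sketch of the idea. It is not the Powertools API: an in-memory Map stands in for the DynamoDB table, a tiny FNV-1a-style function stands in for a proper cryptographic hash, and the names runIdempotent, hashText, and TTL_MS are made up for this example. It also skips the handling of requests that are still in progress.

```typescript
// Toy 32-bit FNV-1a-style hash; a real system would use a cryptographic hash.
function hashText(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

interface StoredRecord {
  response: unknown;
  expiresAt: number; // epoch milliseconds
}

const records = new Map<string, StoredRecord>(); // stand-in for the table
const TTL_MS = 60000;

function runIdempotent<P, R>(
  payload: P,
  handler: (payload: P) => R,
  now: number = Date.now(),
): R {
  // Note: JSON.stringify is sensitive to property order.
  const key = hashText(JSON.stringify(payload));
  const existing = records.get(key);
  if (existing !== undefined && existing.expiresAt > now) {
    return existing.response as R; // replay the stored response
  }
  const response = handler(payload);
  records.set(key, { response, expiresAt: now + TTL_MS });
  return response;
}

let executions = 0;
const charge = (order: { orderId: string }) => {
  executions++;
  return `charged ${order.orderId}`;
};
const firstResult = runIdempotent({ orderId: "A1" }, charge);
const retriedResult = runIdempotent({ orderId: "A1" }, charge);
```

Both calls return the same response, and the handler runs only once.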

However, I am not the biggest fan of this approach. Hashing adds work to every invocation, and the whole payload of the Lambda function is a rather big object that is prone to change.

A simpler approach can be implemented if you own both the server and the client.

Sequence diagram showing an idempotent request using a custom header (X-Idempotency-Key) for cache control. The first request results in a cache miss, while subsequent requests with the same key result in a cache hit.

It requires the client to calculate the idempotency key and send it to the server, most likely in the form of a custom HTTP header. The caching mechanism works in the same way as in the first implementation. Let’s take a look at the simplest implementation in TypeScript, using the Hono framework:

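import { Hono } from "hono";
import { HTTPException } from "hono/http-exception";
import { validator } from "hono/validator";

// Assumed in-memory store; its definition is not part of the original snippet.
const IDEMPOTENCY_CACHE = new Map();

const app = new Hono().post(
  "/idempotence",
  // Reject requests that do not carry the idempotency key header.
  validator("header", (value, ctx) => {
    const idempotencyKey = value["x-idempotency-key"];
    if (idempotencyKey == undefined || idempotencyKey === "") {
      throw new HTTPException(400, {
        message: "X-Idempotency-Key is required",
      });
    }
    return { idempotencyKey };
  }),
  async (ctx) => {
    const { idempotencyKey } = ctx.req.valid("header");
    const body = await ctx.req.json();
    if (IDEMPOTENCY_CACHE.has(idempotencyKey)) {
      // Cache hit: replay the stored response without redoing the work.
      return ctx.json(IDEMPOTENCY_CACHE.get(idempotencyKey));
    } else {
      // Cache miss: compute the response (typically from `body`), store it, return it.
      const response = {/* … */};
      IDEMPOTENCY_CACHE.set(idempotencyKey, response);
      return ctx.json(response);
    }
  },
);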

In this example, we’re using the Hono validator to reject requests that arrive without the header. If the idempotency key is already present in the cache, we return the response from the cache; otherwise, we compute the response, store it, and return it. Of course, the implementation here is simplified and has no time-to-live mechanism for the cache values.
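
A time-to-live could be bolted on with a small wrapper that exposes the same has, get, and set calls as the cache used above. The class below is a minimal sketch under that assumption; the name TtlCache and the injectable clock are my own additions, and expired entries are evicted lazily, only when they are looked up.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be tested with a fake clock.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  has(key: string): boolean {
    const entry = this.entries.get(key);
    if (entry === undefined) return false;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict the expired entry
      return false;
    }
    return true;
  }

  get(key: string): V | undefined {
    return this.has(key) ? this.entries.get(key)!.value : undefined;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Example: keep responses for 24 hours.
const responseCache = new TtlCache<object>(24 * 60 * 60 * 1000);
```

How long to keep the keys is a design decision: long enough to cover your clients’ retry window, short enough to keep the store small.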

Introducing a small overhead on the client’s side removes the need to hash the whole payload on the server.
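
On the client, the important part is that the key is created once per logical operation (a UUID is a common choice, but that is an assumption on my side) and then reused for every retry. Here is a minimal, synchronous sketch; the Send type is a hypothetical transport that returns a status code or throws on a network failure, so no real HTTP client is involved.

```typescript
// Hypothetical transport: returns a status code or throws on a network failure.
type Send = (headers: Record<string, string>, body: string) => number;

function sendWithRetry(
  send: Send,
  idempotencyKey: string,
  body: string,
  maxAttempts: number = 3,
): number {
  let lastError: unknown = new Error("no attempts were made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Every attempt carries the same key, so the server can deduplicate.
      return send({ "X-Idempotency-Key": idempotencyKey }, body);
    } catch (error) {
      lastError = error; // network failure: try again with the same key
    }
  }
  throw lastError;
}
```

If the first attempt reached the server but the response got lost, the retry hits the cache instead of repeating the operation.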

In some cases, you will want to implement a hybrid approach – if the client does not provide an idempotency key, you can calculate it on the server.
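
A sketch of that fallback could look like this. The function name, the prefixes, and the injected hash function are all hypothetical; the prefixes are there only so that a client-chosen key can never collide with a server-computed one.

```typescript
// Picks the client's key when present, otherwise derives one from the body.
function resolveIdempotencyKey(
  headerKey: string | undefined,
  rawBody: string,
  hash: (text: string) => string,
): string {
  if (headerKey !== undefined && headerKey !== "") {
    return `client:${headerKey}`;
  }
  return `server:${hash(rawBody)}`;
}
```

The rest of the handler stays the same: it looks up whatever key this function returns.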

How do others do it?

The example below shows how Airbnb separates network communication from database transactions, allowing only two possible outcomes: success or failure, with consistency.

A flowchart depicting the idempotency process within the Airbnb Payments Service. It shows the flow of a payment request, including idempotency key generation, database updates, and interaction with external services, ensuring that duplicate requests are handled correctly.

Airbnb has developed a robust approach to idempotency within its payment processing system, which can be divided into three distinct phases: pre-RPC, RPC, and post-RPC.

Before initiating any remote procedure call (RPC), Airbnb records the details of the payment request in its database. This step guarantees the creation of a persistent record of the transaction request before any network communication. This allows Airbnb to maintain a consistent state and refer back to the original request details if necessary.

During the RPC phase, the request is sent to an external service via the network, and the response is received. This phase is vital for performing idempotent computations or RPCs. For example, if a transaction is being retried, the system may initially query the external service to ascertain its status. This process guarantees that duplicate transactions are not processed, maintaining the integrity of the operation.

Once a response has been received from the external service, Airbnb records the details of this response in the database. This encompasses data regarding the success or failure of the transaction and the possibility of a retry in the event of a failed request. By logging these details, Airbnb can make informed decisions about subsequent actions, such as whether to retry a transaction.

To maintain data integrity throughout these phases, Airbnb adheres to two fundamental rules:

  1. No service interactions over networks during pre- and post-RPC phases. This rule prevents any network communication from interfering with the database operations, thereby reducing the risk of data inconsistencies.
  2. No database interactions during the RPC phase. By separating database operations from network communications, Airbnb avoids potential conflicts and ensures that each phase operates independently and efficiently.
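
To visualise the three phases and the two rules, here is a rough sketch. It is not Airbnb’s code: the status names, the in-memory Map standing in for the database, and the Charge callback standing in for the external service are all my assumptions, and handling of requests that are still pending is left out.

```typescript
// Hypothetical status names for this sketch.
type Status = "PENDING" | "SUCCEEDED" | "FAILED" | "FAILED_RETRYABLE";

interface PaymentRecord {
  key: string;
  amount: number;
  status: Status;
}

const db = new Map<string, PaymentRecord>(); // stand-in for the database

// Stand-in for the external service; throws on a network failure.
type Charge = (key: string, amount: number) => "ok" | "declined";

function processPayment(key: string, amount: number, charge: Charge): PaymentRecord {
  // Pre-RPC: database only. Record the request, or replay a finished one.
  const existing = db.get(key);
  if (existing !== undefined && existing.status !== "FAILED_RETRYABLE") {
    return existing;
  }
  db.set(key, { key, amount, status: "PENDING" });

  // RPC: network only, no database access in this phase.
  let outcome: Status = "FAILED_RETRYABLE";
  try {
    outcome = charge(key, amount) === "ok" ? "SUCCEEDED" : "FAILED";
  } catch {
    // Network failure: keep the retryable status.
  }

  // Post-RPC: database only. Record the outcome, including retryability.
  const record: PaymentRecord = { key, amount, status: outcome };
  db.set(key, record);
  return record;
}
```

Calling processPayment twice with the same key charges the customer at most once, unless the first attempt ended with a retryable failure.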

Summary

As you can see, idempotency, much like all other mathematical principles, is all around us, regardless of whether we notice it or even know what it means. And it’s as important in our APIs as it is in getting elevators where they need to go. Implementing idempotency is not just a best practice; it’s a necessity for building resilient, scalable, and efficient systems. By safeguarding against duplicate operations, handling network failures, and simplifying error management, idempotence saves you as a developer a lot of headaches. And if that’s not reason enough to implement it, I don’t know what is.


Dominik is the Chief Innovation Officer at Gorrion and a full-stack software developer by both heart and trade. He is passionate about new technologies, teaching, and open-source. Sharing knowledge is what truly drives him, so you’ll often find him speaking at conferences and meet-ups. After work, he tends to work even more, but he also likes boxing, cycling, and bartending.
