This strange word you’ve just read is present in everyday life and makes your financial transactions safe. How so? Let’s start from the top.
The term idempotence, introduced in 1870 by the American mathematician Benjamin Peirce, means “the quality of having the same power.” It is related to elements of algebras that remain invariant when raised to a positive integer power. In the context of unary operations, an idempotent function is a function such that f(f(x)) = f(x) for each x ∈ X. In other words, it is an operation that, when applied twice, gives the same result as applying it once. Some examples of such functions are absolute value, floor, ceiling, constant function, and identity function.
It also applies to binary operations, such that x • x = x for each x ∈ X. Examples? Union and intersection of sets or logical conjunction and disjunction.
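To make these definitions a bit more tangible, here is a small TypeScript sketch that spot-checks f(f(x)) = f(x) on a handful of sample values. The helper names `isIdempotent` and `union` and the sample list are made up for illustration, and passing on a few samples is a sanity check, not a proof.

```typescript
// Spot-checks f(f(x)) === f(x) on the given samples (not a proof).
function isIdempotent(f: (x: number) => number, samples: number[]): boolean {
  return samples.every((x) => f(f(x)) === f(x));
}

const samples = [-3.7, -1, 0, 2.5, 10];

const absIsIdempotent = isIdempotent(Math.abs, samples);
const floorIsIdempotent = isIdempotent(Math.floor, samples);
const constantIsIdempotent = isIdempotent(() => 7, samples);
// Counter-example: adding one changes the result on every application.
const incrementIsIdempotent = isIdempotent((x) => x + 1, samples);

// Binary case: the union of a set with itself is the same set.
function union(a: Set<number>, b: Set<number>): Set<number> {
  return new Set(Array.from(a).concat(Array.from(b)));
}

const s = new Set([1, 2, 3]);
const selfUnion = union(s, s);
const unionIsIdempotent =
  selfUnion.size === s.size && Array.from(s).every((x) => selfUnion.has(x));

console.log({
  absIsIdempotent,
  floorIsIdempotent,
  constantIsIdempotent,
  incrementIsIdempotent,
  unionIsIdempotent,
});
```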
And now the hard part’s behind us.
Imagine an elevator. You enter and press a button with the floor number. This time you want to go to the fifth floor. You press the button again, just to be sure. And a few more times so that the doors close faster. You have pressed the button with number 5 seven times. Do you get off on the 35th floor? No, right?
Or think of the button at a crosswalk. Once you have pressed it, any following pushes will not make anything happen. They will not speed up the process, cancel the request, or change anything.
Modern CPU architectures like ARM are built on top of a load-store instruction set architecture. In this architecture, instructions that might cause a page fault are idempotent. So, if a page fault occurs, the operating system can load the page from the disk and then simply re-execute the faulted instruction. Dealing with page faults is much more complex in a processor where such instructions are not idempotent.
Amazon Web Services Simple Queue Service offers a secure, durable, and available hosted queue for integrating and decoupling distributed software systems and components. One of the features of SQS is the so-called “at-least-once delivery.” Amazon SQS stores copies of messages on multiple servers to ensure redundancy and high availability. In rare instances, one of the servers storing a copy of the message may be unavailable when receiving or deleting the message. If this happens, the copy of the message will not be deleted on the unavailable server – you may receive this copy again. This can lead to a situation where one message is handled several times. That’s why the AWS SQS Developer guide encourages you to “design your applications to be idempotent.”
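What could an idempotent consumer look like? Below is a minimal sketch, assuming each message carries a unique ID and that we remember the IDs we have already handled. The `QueueMessage` shape, the `handle` function, and the in-memory `Set` are illustrative stand-ins (this is not the AWS SDK); a real consumer would keep the handled IDs in a durable store.

```typescript
// Simplified, hypothetical message shape; not the AWS SDK type.
type QueueMessage = { messageId: string; body: string };

// In-memory stand-ins for durable storage.
const handledIds = new Set<string>();
const balances = new Map<string, number>();

// Applies a deposit at most once per message ID.
// Returns true if the message was applied, false if it was a duplicate.
function handle(message: QueueMessage): boolean {
  if (handledIds.has(message.messageId)) {
    return false; // already processed: skip the side effect
  }
  const { account, amount } = JSON.parse(message.body) as {
    account: string;
    amount: number;
  };
  balances.set(account, (balances.get(account) ?? 0) + amount);
  handledIds.add(message.messageId);
  return true;
}

// The same message delivered twice, as at-least-once delivery allows.
const deposit: QueueMessage = {
  messageId: "msg-1",
  body: JSON.stringify({ account: "alice", amount: 50 }),
};
const firstDelivery = handle(deposit);
const secondDelivery = handle(deposit);
console.log(firstDelivery, secondDelivery, balances.get("alice"));
```

The second delivery is recognised and skipped, so the balance is credited only once.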
Looking at all of those examples might raise a question: if these systems work this way, why do your APIs, when requested with precisely identical payloads, modify the server state more than once?
To safeguard your APIs, first, we need to discuss the protocol your clients will most likely use. As you know, HTTP defines several methods for handling requests. Although each method has its own semantics, three properties describe all HTTP methods: safety, idempotency, and cacheability.
What do these features mean?
Regarding the idempotence of methods, only the state of the server is taken into account; the response returned by each request may differ. For example, the first DELETE call may return 200 OK, while subsequent ones return 404 Not Found.
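The DELETE case can be sketched in a few lines. This is a toy in-memory model, not a real HTTP server: the `resources` map and the `handleDelete` function are made up for illustration, and the function simply returns the status code a server might send.

```typescript
// Toy resource store standing in for server state.
const resources = new Map<string, string>([["42", "invoice"]]);

// Returns the HTTP status code a server might send for DELETE /resources/:id.
function handleDelete(id: string): number {
  // Map.delete returns true only if the entry existed.
  return resources.delete(id) ? 200 : 404;
}

const firstStatus = handleDelete("42");
const sizeAfterFirst = resources.size;
const secondStatus = handleDelete("42");
const sizeAfterSecond = resources.size;

console.log({ firstStatus, secondStatus, sizeAfterFirst, sizeAfterSecond });
```

The status codes differ between the two calls, but the server state after the second call is identical to the state after the first, which is exactly what idempotence of a method requires.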
As you can see, HTTP methods give you just a bit of safety. To fully protect your APIs, additional steps need to be taken. Let’s focus on the whys before we jump into the implementation details.
Idempotency gives our systems the following benefits:
Network failures are inevitable. Idempotency is crucial in handling these failures by allowing clients to resend requests without fear of unintended consequences. This approach ensures data consistency and avoids duplicate operations, making systems more robust.
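Here is a rough sketch of why retries become safe. It simulates a server that does the work but loses the first response, and a client that retries with the same key. Everything in it (`charge`, `chargeWithRetry`, the `dropNextResponse` flag, the key `order-42`) is invented for the example and runs synchronously to keep it short.

```typescript
type Charge = { id: number; amount: number };

// In-memory stand-in for the server's state.
const charges: Charge[] = [];
const processed = new Map<string, Charge>();
let dropNextResponse = true; // simulate one lost response

// Server side: performs the charge once per key, then may "lose" the response.
function charge(key: string, amount: number): Charge {
  let result = processed.get(key);
  if (result === undefined) {
    result = { id: charges.length + 1, amount };
    charges.push(result);
    processed.set(key, result);
  }
  if (dropNextResponse) {
    dropNextResponse = false;
    // The work is done, but the client never sees the answer.
    throw new Error("connection reset");
  }
  return result;
}

// Client side: retries with the SAME key, so a repeat cannot double-charge.
function chargeWithRetry(key: string, amount: number, maxAttempts = 3): Charge {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return charge(key, amount);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

const receipt = chargeWithRetry("order-42", 100);
console.log(receipt, charges.length);
```

Without the key lookup in `charge`, the retry after the lost response would have created a second charge.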
Distributed systems often spread tasks across multiple servers. Idempotency is vital in such environments as it enables different servers to process the same request without altering the outcome. This not only enhances scalability but also improves fault tolerance.
Idempotency can significantly optimise performance by improving caching mechanisms. When operations are guaranteed to lead to the same final state, servers can cache results effectively, resulting in quicker response times.
Designing operations to be idempotent simplifies code and reduces the complexity of error handling, allowing developers to concentrate on core functionalities without being concerned about the side effects of repeated executions.
Today, I want to show you two different methods of protecting your APIs. The first method is implemented in Powertools for AWS, a set of tools in different programming languages that implement Serverless best practices and increase developer velocity.
It uses Amazon DynamoDB to store the idempotency keys, which are calculated as a hash of the whole payload (or part of it; it’s configurable!) passed to the Lambda function.
If such a key has not expired, the response is returned from the cache.
If the key is missing or has expired, the response is calculated and stored before it returns to the client.
This simple mechanism protects the system well.
And with Powertools for AWS, it is a matter of a couple of lines of code.
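To show the idea without pulling in any AWS dependency, here is a minimal sketch of the mechanism itself. It is not the Powertools API: `makeIdempotent`, `payloadKey`, the in-memory `Map` (standing in for the DynamoDB table), and the injectable `now` clock are all assumptions made for this example.

```typescript
import { createHash } from "node:crypto";

type CacheEntry = { response: unknown; expiresAt: number };

// Hashes the serialized payload. Note that JSON.stringify is sensitive to
// property order, so equal objects with reordered keys hash differently.
function payloadKey(payload: unknown): string {
  return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
}

// Wraps a handler so that repeated calls with the same payload reuse the
// stored response until the entry expires.
function makeIdempotent<P, R>(
  handler: (payload: P) => R,
  ttlMs: number,
  now: () => number = Date.now,
): (payload: P) => R {
  const store = new Map<string, CacheEntry>(); // stand-in for a DynamoDB table
  return (payload: P): R => {
    const key = payloadKey(payload);
    const entry = store.get(key);
    if (entry !== undefined && entry.expiresAt > now()) {
      return entry.response as R; // key still valid: serve from the cache
    }
    const response = handler(payload);
    store.set(key, { response, expiresAt: now() + ttlMs });
    return response;
  };
}

// Usage with a fake clock so the expiry is easy to follow.
let calls = 0;
let clock = 0;
const createOrder = makeIdempotent(
  (p: { sku: string }) => ({ orderId: ++calls, sku: p.sku }),
  1000,
  () => clock,
);

const first = createOrder({ sku: "A" });
const repeat = createOrder({ sku: "A" }); // served from the cache
clock = 2000; // the entry has expired by now
const afterExpiry = createOrder({ sku: "A" }); // handler runs again
console.log(first, repeat, afterExpiry, calls);
```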
However, I am not the biggest fan of this approach. Hashing costs CPU time on every invocation, and the whole payload of the Lambda function is a rather big object that is prone to change.
A rather simplified approach can be implemented if you own both the server and the client.
It requires the client to calculate the idempotency key and send it to the server, most likely in the form of a custom HTTP header. The caching mechanism works in the same way as in the first implementation. Let’s take a look at the simplest implementation in TypeScript, using the Hono framework:
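```typescript
import { Hono } from "hono";
import { HTTPException } from "hono/http-exception";
import { validator } from "hono/validator";

// In-memory cache of responses keyed by idempotency key (no TTL, demo only).
const IDEMPOTENCY_CACHE = new Map<string, Record<string, unknown>>();

const app = new Hono().post(
  "/idempotence",
  // Reject requests that do not carry the idempotency header.
  validator("header", (value, ctx) => {
    const idempotencyKey = value["x-idempotency-key"];
    if (idempotencyKey === undefined || idempotencyKey === "") {
      throw new HTTPException(400, {
        message: "X-Idempotency-Key is required",
      });
    }
    return { idempotencyKey };
  }),
  async (ctx) => {
    const { idempotencyKey } = ctx.req.valid("header");
    const body = await ctx.req.json();
    // Replay the stored response if this key was already handled.
    const cached = IDEMPOTENCY_CACHE.get(idempotencyKey);
    if (cached !== undefined) {
      return ctx.json(cached);
    }
    // First time we see this key: compute the response and remember it.
    const response = {/* … */};
    IDEMPOTENCY_CACHE.set(idempotencyKey, response);
    return ctx.json(response);
  },
);
```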
In this example, we’re using the Hono validator for the header value. If the idempotency key is valid and present in the cache, we return the response from the cache. Of course, the implementation here is simplified and has no time-to-live mechanism for the cache values.
Introducing some small overhead on the client’s side reduces the complexity of hashing the whole payload.
In some cases, you will want to implement a hybrid approach – if the client does not provide an idempotency key, you can calculate it on the server.
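A hybrid key resolver could look roughly like the sketch below. The function name `resolveIdempotencyKey` and the `client:` / `server:` prefixes are my own assumptions; the prefixes are there only so that a client-supplied key can never collide with a server-computed hash.

```typescript
import { createHash } from "node:crypto";

// Prefers the client's key; falls back to a hash of the raw request body.
function resolveIdempotencyKey(
  headerValue: string | undefined,
  rawBody: string,
): string {
  const trimmed = headerValue?.trim();
  if (trimmed) {
    return `client:${trimmed}`;
  }
  const digest = createHash("sha256").update(rawBody).digest("hex");
  return `server:${digest}`;
}

const rawBody = JSON.stringify({ sku: "A", quantity: 2 });

const fromClient = resolveIdempotencyKey("abc-123", rawBody);
const fromServer = resolveIdempotencyKey(undefined, rawBody);
const fromEmptyHeader = resolveIdempotencyKey("", rawBody);
console.log(fromClient, fromServer, fromEmptyHeader);
```

An empty header is treated the same as a missing one, so identical bodies without a key still map to the same server-side key.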
The example below shows how Airbnb separates network communication from database transactions, allowing only two possible outcomes: success or failure, with consistency.
Airbnb has developed a robust approach to idempotency within its payment processing system, which can be divided into three distinct phases: pre-RPC, RPC, and post-RPC.
Before initiating any remote procedure call (RPC), Airbnb records the details of the payment request in its database. This step guarantees the creation of a persistent record of the transaction request before any network communication. This allows Airbnb to maintain a consistent state and refer back to the original request details if necessary.
During the RPC phase, the request is sent to an external service via the network, and the response is received. This phase is vital for performing idempotent computations or RPCs. For example, if a transaction is being retried, the system may initially query the external service to ascertain its status. This process guarantees that duplicate transactions are not processed, maintaining the integrity of the operation.
Once a response has been received from the external service, Airbnb records the details of this response in the database. This encompasses data regarding the success or failure of the transaction and the possibility of a retry in the event of a failed request. By logging these details, Airbnb can make informed decisions about subsequent actions, such as whether to retry a transaction.
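The three phases can be sketched as follows. This is not Airbnb's code, only a toy model of the flow described above: the `db` map stands in for the database, `callGateway` is a made-up external service, and the record shape is invented. The case where a previous attempt crashed mid-RPC and the external service has to be asked for its status is left out to keep the sketch short.

```typescript
type Status = "pending" | "succeeded" | "failed";
type PaymentRecord = {
  key: string;
  amount: number;
  status: Status;
  retryable: boolean;
};

const db = new Map<string, PaymentRecord>(); // stand-in for the database
let gatewayCalls = 0;

// Hypothetical external service. It never touches the database.
function callGateway(amount: number): { ok: boolean; retryable: boolean } {
  gatewayCalls++;
  return { ok: amount > 0, retryable: false };
}

function processPayment(key: string, amount: number): PaymentRecord {
  // Pre-RPC (database only): return a finished result or record the request.
  const existing = db.get(key);
  if (existing !== undefined) {
    const finished =
      existing.status === "succeeded" ||
      (existing.status === "failed" && !existing.retryable);
    if (finished) {
      return existing;
    }
  }
  db.set(key, { key, amount, status: "pending", retryable: false });

  // RPC (network only): talk to the external service.
  const result = callGateway(amount);

  // Post-RPC (database only): record the outcome and whether a retry is allowed.
  const record: PaymentRecord = {
    key,
    amount,
    status: result.ok ? "succeeded" : "failed",
    retryable: result.retryable,
  };
  db.set(key, record);
  return record;
}

const firstAttempt = processPayment("payment-1", 100);
const secondAttempt = processPayment("payment-1", 100); // answered from the record
console.log(firstAttempt.status, secondAttempt.status, gatewayCalls);
```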
To maintain data integrity throughout these phases, Airbnb adheres to two fundamental rules:
As you can see, idempotency, much like all other mathematical principles, is all around us, regardless of whether we notice it or even know what it means. And it’s as important in our APIs as it is in getting elevators where they need to go. Implementing idempotency is not just a best practice; it’s a necessity for building resilient, scalable, and efficient systems. By safeguarding against duplicate operations, handling network failures, and simplifying error management, idempotence saves you as a developer a lot of headaches. And if that’s not reason enough to implement it, I don’t know what is.
Dominik is the Chief Innovation Officer at Gorrion and a full-stack software developer by both heart and trade. He is passionate about new technologies, teaching, and open-source. Sharing knowledge is what truly drives him, so you’ll often find him speaking at conferences and meet-ups. After work, he tends to work even more, but he also likes boxing, cycling, and bartending.