Authentication in Web Apps, Part 2: Tokens
As you can read in the title, this is the second part of a multi-part series on famous authentication methods present on the web. In the last article, we discussed sessions, and today we’re talking about the new kid in town: tokens.
I’ll probably write other articles about different methods of authentication as I learn them myself, but this article, alongside the previous one about sessions, covers the last of what I consider to be the most basic and widely used methods.
Without further ado, let’s get started.
What are tokens?
I started out the last article by emphasizing the stateless nature of the HTTP protocol and how these more basic authentication methods differentiate themselves by how they store their data. To be more precise:
“This is where sessions and tokens come in. They provide a way to maintain a stateful communication while using a stateless protocol. The differences, advantages and disadvantages of each one come from how they store this state…”
This paves the way nicely for a definition of tokens: they are an authentication method that stores the state of the interaction within itself (note that I said authentication method, as, unlike sessions, tokens are strictly used for authenticating parties). Instead of the state being stored on the server, like it is with sessions, a token is said to be self-contained and carries with it all the information necessary to make a particular interaction stateful.
Let’s make this a bit easier to grasp by comparing it with sessions. Remember how we would map a particular session id to an object on the server? The object could contain pretty much all the information we needed it to, like a user id, an expiry date, etc. When using tokens, all of the relevant data is stored inside the token itself; the server doesn’t hold on to anything.
Some people might be confused right now. If the server doesn’t store anything, how does it know what is a valid request? With sessions it was simple, we just had a big list that told us what session ids were valid. How is this done with tokens?
The answer is cryptography. Without getting into too much detail (this can become quite extensive and an article of its own), when the server creates the token, it produces a signature by applying a cryptographic function to both the data (also called the payload) and a secret (a string provided by you). The result of this mathematical wizardry is a very long and seemingly random string that is sent as part of the token, and it gives us two guarantees:
- No one can tamper with the content of the token without invalidating the signature (remember, the signature was built using the payload itself). Therefore, fields like expiration date, user id and so forth cannot be changed without the server noticing.
- When a token comes in, we can use our secret in order to validate it. That’s the secret’s job — it signs and validates tokens. This technique is called symmetric cryptography.
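To make the two guarantees above a bit more concrete, here is a minimal sketch using Python's standard hmac module. The secret value, the payload and the function names sign and is_valid are made up for illustration; this is not how any particular library does it, just the general idea of signing and validating with the same secret.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical secret; a real one should be long and random


def sign(payload: bytes, secret: bytes = SECRET) -> str:
    """Return a hex signature computed from the payload and the secret."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_valid(payload: bytes, signature: str, secret: bytes = SECRET) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(payload, secret).encode(), signature.encode())


payload = b'{"user_id": 42, "exp": 1700000000}'
signature = sign(payload)

print(is_valid(payload, signature))                               # True
print(is_valid(b'{"user_id": 1, "exp": 1700000000}', signature))  # False: payload changed
```

Changing a single byte of the payload, or using a different secret, produces a different signature, which is why the server can trust the data without storing it.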
With these fundamentals out of the way, let’s take a brief look at how a token-based authentication flow works:
- The user enters their credentials and sends them to the server.
- If the data is valid, then the server creates a token and sends it back to the user.
- The user sends the token along with each subsequent request.
- The server validates the token (expiry date, signature, etc.) and responds accordingly.
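The flow above could be sketched roughly like this. Everything here is an assumption made for illustration: the USERS dictionary stands in for a real user store, SECRET is a placeholder, login and authenticate are hypothetical names, and the token format is a simplified one (an encoded payload plus a signature), not a JWT.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"            # hypothetical server-side secret
USERS = {"alice": "wonderland"}  # hypothetical user store (plain text only for brevity)


def _sign(data: bytes) -> str:
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()


def login(username: str, password: str, ttl: int = 3600) -> Optional[str]:
    """Check the credentials and, if they are valid, create a token."""
    if USERS.get(username) != password:
        return None
    payload = json.dumps({"sub": username, "exp": int(time.time()) + ttl}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    return f"{body}.{_sign(body.encode())}"


def authenticate(token: str) -> Optional[dict]:
    """Validate the signature and the expiry date, then return the payload."""
    body, _, signature = token.partition(".")
    if not hmac.compare_digest(_sign(body.encode()).encode(), signature.encode()):
        return None  # tampered with, or signed with another secret
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < time.time():
        return None  # expired
    return payload


token = login("alice", "wonderland")
print(authenticate(token)["sub"])  # alice
```

Note that authenticate never looks anything up: the signature and the expiry date inside the token are all it needs.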
When discussing tokens on the modern-day web, people are probably talking about JWTs, but they’re not the only (nor the first) type of token used to authenticate.
They are, however, one of the most famous alternatives. Everybody’s using them on their brand new modern front end app. The good thing is: if you’ve followed along nicely so far, it’s pretty much all the same — after all, it’s still a token.
The jwt.io website defines a JWT as:
JSON Web Token (JWT) is an open standard (RFC 7519) that defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. JWTs can be signed using a secret (with the HMAC algorithm) or a public/private key pair using RSA or ECDSA.
Let’s break this definition apart.
- Open standard: The format is defined by a public specification (RFC 7519) that anyone can read and implement.
- Compact: JWTs are considerably easier to transmit over networks when compared to their token peers (we’ll see why this happens when discussing the anatomy of a JWT).
- Self-contained: We’ve already defined this, but it means that the data for the interaction is stored within the token.
- Digitally signed: JWTs use the signature strategy explained earlier to validate tokens and their payloads.
- Secret or public/private key pair: The cryptography strategy. Using a secret (the most common choice with JWTs) refers to symmetric cryptography, while key pairs refer to asymmetric cryptography.
Nothing new so far, it’s still just a token. Now let’s talk about some details.
This is what a JWT looks like:
This seemingly random string is the form in which JWTs are transmitted across applications. Note that we have three distinct parts, each highlighted by a different color and separated by a dot: the header (red), the payload (blue) and the signature (green). We’ll talk in detail about each in a moment.
Also notice how this just looks like a bunch of gibberish. This is because each part of the token is Base64URL encoded so that it can be safely carried in URLs and HTTP headers. Encoding is not encryption, though: when decoded, the header and the payload are just JSON objects.
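To see that the gibberish really is just encoded JSON, here is a small sketch that builds an illustrative token and then reads it back without knowing any secret. The helper names (b64url_encode, b64url_decode, peek) and the payload values are assumptions for the example, and the third part is a placeholder rather than a real signature.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    """Base64URL-encode and strip the '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Put back the stripped padding, then decode."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def peek(token: str) -> tuple:
    """Read header and payload WITHOUT checking the signature."""
    encoded_header, encoded_payload, _signature = token.split(".")
    return (json.loads(b64url_decode(encoded_header)),
            json.loads(b64url_decode(encoded_payload)))


# Build an illustrative token; the third part is a placeholder, not a real signature.
token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"sub": "42", "exp": 1700000000}).encode()),
    "placeholder-signature",
])

header, payload = peek(token)
print(header)   # {'alg': 'HS256', 'typ': 'JWT'}
print(payload)  # {'sub': '42', 'exp': 1700000000}
```

Anyone holding the token can do what peek does, which is worth keeping in mind when deciding what goes inside it.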
Let’s talk about each part individually.
The header of the JWT is its most boring part; in most applications, you won’t even deal with it. But, just like any header, it contains metadata about the main object itself (in this case, the token). Normally, it has a field typ that identifies the object as a JWT, plus other fields regarding the cryptographic algorithm used (such as alg).
The payload is where you store the data you want to send back to the server. We need to be careful about two things when choosing what will go in it:
- Is it sensitive data?
- Is the data compact?
Regarding sensitivity, this should be obvious. In most cases, the token isn’t encrypted, it’s only encoded. Anyone who intercepts it will be able to read its content. So, no storing addresses, phone numbers and such.
Also, a token needs to be compact. We’re not trying to send a huge document on each request to the server, as this would make our application much slower. It’s even recommended to keep the field names as short as possible. Keep it lean!
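As a rough illustration of the compactness point, the sketch below compares the encoded size of the same data with long and short field names. The verbose field names are invented for the example; sub and exp are the short names the JWT specification registers for the subject and the expiration time.

```python
import base64
import json


def encoded_size(payload: dict) -> int:
    """Length of the Base64URL-encoded payload, in characters."""
    raw = json.dumps(payload, separators=(",", ":")).encode()
    return len(base64.urlsafe_b64encode(raw).rstrip(b"="))


verbose = {"user_identifier": "42", "expiration_timestamp": 1700000000}
compact = {"sub": "42", "exp": 1700000000}

# The compact version is noticeably shorter, and it travels on every request.
print(encoded_size(verbose), encoded_size(compact))
```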
There’s not much new to talk about JWT’s signature. It still follows the same principles we’ve discussed earlier: it works as a way of checking the integrity of the data sent with it; if the payload was tampered with in any way, the signature will tell us.
To produce the JWT signature with the common HS256 algorithm, HMAC-SHA256 is applied to the encoded header and the encoded payload (joined by a dot), using your secret as the key. The result is then Base64URL encoded as well.
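Putting the three parts together, a signed token could be produced and checked along these lines. This is a minimal sketch of the HMAC-SHA256 (HS256) case using only the standard library; make_jwt, verify_jwt and the secret are hypothetical names and values, and the sketch skips things a real library handles for you, such as checking the alg field and the expiry date.

```python
import base64
import hashlib
import hmac
import json


def b64url(raw: bytes) -> str:
    """Base64URL-encode without '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    # The signature covers "<encoded header>.<encoded payload>".
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> bool:
    """Recompute the signature over the first two parts and compare."""
    signing_input, _, signature = token.rpartition(".")
    expected = b64url(hmac.new(secret, signing_input.encode(), hashlib.sha256).digest())
    return hmac.compare_digest(expected.encode(), signature.encode())


token = make_jwt({"sub": "42"}, b"my-secret")
print(verify_jwt(token, b"my-secret"))     # True
print(verify_jwt(token, b"wrong-secret"))  # False
```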
JWTs vs Traditional Sessions
The sessions/cookies vs JWT debate is an ongoing one in the developer community. Some people (like myself) prefer to go with the more traditional approach of sessions, while some like using JWTs for more modern web applications.
The good thing about these discussions is that we’re talking about tools, which have a particular use case, just like a hammer or a corkscrew. Therefore, for those cases, we can objectively say what’s better and what’s not. So let’s do just that — let’s see the good and bad sides of JWTs and compare them with the previous alternative.
First, we need to realize that there are two main scenarios in which we can use JWTs. Those are:
- Server-to-server communication
- Sessions in web applications
I’ll be talking about these two.
JWTs and sessions aren’t a good match. To put it in a few words, you’re going to end up with something that is either very similar to sessions or unnecessarily more complicated than them. Let me be more specific.
joepie91 said pretty much everything there is to say about this subject, but I’ll give my main reason why I think this isn’t a good idea.
Imagine a common authentication workflow with tokens. A user authenticates and then receives a token that goes along with each request to the server. Now let’s also imagine that the user has logged out or changed their password. The token, which is still accepted by the server, should no longer be valid. To solve this, you’re gonna need to store things on the server, which goes against the whole self-contained thing, but let’s dive a bit deeper.
There are two solutions to this:
- Store valid tokens
- Store invalid tokens
Storing Valid Tokens
Congratulations, you’ve just invented traditional sessions!
This is essentially the same thing, but worse, since JWTs take up a lot more space than a session id.
Storing Invalid Tokens
Suppose that in our fictional web application, a token has a lifespan of one week. Let’s also suppose that our user logged out halfway through their token’s lifespan. Therefore, there are about three and a half days during which the token could (but shouldn’t) be accepted by the server.
The basic strategy here is to store the invalid token on every logout or similar event, keep it until it would have expired anyway, and periodically erase the expired entries from the server. Personally, I think this adds unnecessary complexity to the process and is much more error-prone.
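Here is a minimal in-memory sketch of that strategy, just to show the moving parts. The TokenDenylist class and its method names are hypothetical; a real application would more likely keep this in a shared store, and something would have to call purge on a schedule.

```python
import time
from typing import Dict, Optional


class TokenDenylist:
    """Remembers revoked tokens only until they would have expired anyway."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # token -> expiry timestamp

    def revoke(self, token: str, expires_at: float) -> None:
        """Call on logout, password change or similar events."""
        self._revoked[token] = expires_at

    def is_revoked(self, token: str) -> bool:
        """Must be checked on every request, next to the signature check."""
        return token in self._revoked

    def purge(self, now: Optional[float] = None) -> int:
        """Drop entries whose token has expired; return how many were removed."""
        now = time.time() if now is None else now
        expired = [t for t, exp in self._revoked.items() if exp <= now]
        for t in expired:
            del self._revoked[t]
        return len(expired)


denylist = TokenDenylist()
denylist.revoke("some.token.value", expires_at=time.time() + 3600)
print(denylist.is_revoked("some.token.value"))  # True
```

Even in this tiny form, every request now involves a server-side lookup, which is the extra state the self-contained token was supposed to avoid.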
To me, JWTs shine brightest in authentication schemes with lots of different servers and in specific one-time actions. Allow me to illustrate.
Consider a web system that has a lot of different APIs, each with a specific task. You may have an authentication service, a content service, an authorization service and so forth. Using JWTs in these situations is very beneficial, as we may need to make multiple requests that require authentication. By sharing our secret (or, with a public/private key pair, just the public key) among the services, each one can validate tokens on its own, so we can decrease network requests and make the whole thing much more efficient.
Another interesting use case is one-time actions. As an example, you might have a server that sends bills of sale to customers’ emails after each purchase. You could issue very short-lived tokens to perform that one-time task.
By now, you probably have a really solid understanding of how tokens, and more specifically JWTs, work. At the end of the day, I don’t think they’re a good alternative to sessions in web applications, but they do have their use cases.