Authentication in Web Apps, Part 1: Sessions

Mateus Melo
Aug 26, 2021

The current authentication landscape is vast. We have options ranging from a simple login form with an email and a password all the way to multi-factor authentication. It is still evolving, with new protocols being created and older ones being improved or abandoned, and we, as developers, need to keep up.

It can be overwhelming (it was for me, at least), so I’m here today to start talking about the basic authentication strategies that most web apps use: sessions and tokens. In this first part, I’ll explain how sessions work and what their pros and cons are, and I’ll also implement a basic authentication flow using NodeJS. Part 2, about tokens, should be coming really soon!

The need for stateful communication

HTTP is a stateless protocol. This basically means that the server can’t remember who it talks to: every request is treated as a brand new one, even if it comes from someone who has already spoken to the server before. It’s like having a friend who forgets about you after every encounter, so you have to introduce yourself again every single time.

For authentication, this means that if we kept this characteristic of HTTP, any user who wants to access protected information would have to authenticate on every single request. Their credentials would be flying across the network each time, which is really bad from both a security standpoint and a user experience one.

This is where sessions and tokens come in. They provide a way to maintain stateful communication while using a stateless protocol. The differences, advantages and disadvantages of each one come from how they store this state, which is what we’ll discuss now.

Sessions

Sessions are the old-school way of keeping track of a user on a website. With sessions, the interaction becomes stateful through data that is stored on the server (database, filesystem, cache, etc.). Each session is identified by a unique ID, which is sent to the user (browser) via a cookie. By sending the cookie back on later requests, the browser enables the server to “remember” the previous interactions.

Note that this scheme is not limited to authentication. In fact, sessions are used as a general way of tracking user activity on a website. You could have a session id “123” mapped to a JSON object on the server with the following data:

[Figure: Session data unrelated to authentication. A JSON object with the fields “cart”, “searchedItems” and “layoutPreference”, which represent a user’s activity on a website.]
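To make the mapping concrete, here is a minimal sketch of a session id pointing to activity data on the server. The in-memory Map, the getSessionData helper and the field values are assumptions made up for illustration; only the field names come from the figure.

```javascript
// Hypothetical in-memory session store: session id -> session data.
// A real app would usually keep this in a database or cache instead.
const sessions = new Map();

sessions.set("123", {
  cart: ["keyboard", "mouse"],            // illustrative values only
  searchedItems: ["monitor", "usb hub"],
  layoutPreference: "dark",
});

// Look up the data that belongs to a session id received from the browser.
function getSessionData(sessionId) {
  return sessions.get(sessionId) ?? null;
}

console.log(getSessionData("123").layoutPreference); // prints "dark"
console.log(getSessionData("unknown"));              // prints null
```

The browser only ever holds the string “123”; everything else stays on the server.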

Nevertheless, we’re here to talk about authentication with sessions. The good thing is, if you understood the basic premise of sessions so far, then the authentication part becomes super simple. The only difference is that, when dealing with authentication, the state stored on the server holds information about the logged-in user (an identifier, name, email, etc.) and the duration of the session (when the user will have to log in again).

Here’s a brief rundown of what an authentication flow with sessions looks like:

  • The user enters their credentials (maybe username and password) and sends them to your back end.
  • Your server validates the data and, if successful, creates a session and sends its ID (a random string) back in a cookie.
  • This cookie is then sent along with each subsequent request to your back end, which will validate and identify the user based on the session ID and grant access to the protected data.
  • When the user logs out (or the session expires), the session is destroyed on the server, rendering the previous id useless.
[Figure: Authentication workflow with sessions, illustrating the steps discussed above.]
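The four steps above can be sketched in a few lines of Node.js. This is a simplified illustration under stated assumptions, not the code from my repository: the users table, the plain-text password comparison and the function names are all hypothetical, and a real app would store salted password hashes and keep sessions in something like Redis.

```javascript
const crypto = require("node:crypto");

// Hypothetical user table. Plain-text passwords are used ONLY to keep the sketch short.
const users = new Map([
  ["alice", { password: "correct-horse", email: "alice@example.org" }],
]);
const sessions = new Map(); // session id -> data about the logged-in user

// Steps 1 and 2: validate the credentials and, on success, create a session.
function login(username, password) {
  const user = users.get(username);
  if (!user || user.password !== password) return null;
  const sessionId = crypto.randomBytes(32).toString("hex"); // random, opaque id
  sessions.set(sessionId, { username, email: user.email });
  return sessionId; // the server would send this back in a Set-Cookie header
}

// Step 3: identify the user from the session id sent back in the cookie.
function authenticate(sessionId) {
  return sessions.get(sessionId) ?? null;
}

// Step 4: destroy the session, which makes the old id useless.
function logout(sessionId) {
  sessions.delete(sessionId);
}

const sid = login("alice", "correct-horse");
console.log(authenticate(sid).username); // prints "alice"
logout(sid);
console.log(authenticate(sid));          // prints null
```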

Cookies

Since cookies are the main way to store session ids, it’s important to have a solid understanding of how they work, because most of the particularities of session-based authentication come from them.

We all have an intuitive notion of what cookies are, since they have become so prevalent in today’s era of targeted advertising and privacy scandals. But cookies are just small blocks of data that store a key-value pair along with some attributes for that specific cookie. They are set and transported via HTTP headers. When a server wants your browser to set a cookie, it does so via a Set-Cookie header, and when the cookie needs to be sent back, your browser does so via the Cookie header.

Here are some examples of HTTP responses and requests using these headers (both taken from Wikipedia):

HTTP/1.0 200 OK
Content-type: text/html
Set-Cookie: theme=light
Set-Cookie: sessionToken=abc123; Expires=Wed, 09 Jun 2021 10:18:14 GMT
...
GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: theme=light; sessionToken=abc123
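On the server side, the Cookie header from the request above has to be split back into key-value pairs. The following is a minimal sketch of that step; parseCookieHeader is a hypothetical helper written for this article, and in a real app you would normally rely on your framework or a maintained cookie-parsing library instead.

```javascript
// Parse a Cookie request header ("a=1; b=2") into an object of name -> value.
function parseCookieHeader(header) {
  const cookies = Object.create(null); // no prototype, so odd cookie names are harmless
  for (const pair of header.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip malformed pairs without "="
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    if (name) cookies[name] = value;
  }
  return cookies;
}

const cookies = parseCookieHeader("theme=light; sessionToken=abc123");
console.log(cookies.theme);        // prints "light"
console.log(cookies.sessionToken); // prints "abc123"
```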

Cookies can also have attributes that change their behavior. Some notable ones are:

  • Domain: determines which domain the cookie can be sent back to.
  • Path: determines which path the cookie can be sent back to.
  • Expires: determines when the browser should delete the cookie.
  • Secure: when set, the cookie is only transmitted over encrypted connections (HTTPS).
  • HttpOnly: when set, the cookie cannot be read by JavaScript code running in the browser.
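To show how these attributes end up in the header, here is a small sketch that builds a Set-Cookie value by hand. The buildSetCookie helper and its option names are assumptions for illustration; frameworks usually offer their own way of setting cookies.

```javascript
// Build the value of a Set-Cookie header from a name, a value and some attributes.
function buildSetCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.expires) parts.push(`Expires=${options.expires.toUTCString()}`);
  if (options.secure) parts.push("Secure");     // only sent over HTTPS
  if (options.httpOnly) parts.push("HttpOnly"); // hidden from client-side JavaScript
  return parts.join("; ");
}

const header = buildSetCookie("sessionToken", "abc123", {
  path: "/",
  expires: new Date(Date.UTC(2021, 5, 9, 10, 18, 14)), // month 5 is June
  secure: true,
  httpOnly: true,
});
console.log(header);
// prints "sessionToken=abc123; Path=/; Expires=Wed, 09 Jun 2021 10:18:14 GMT; Secure; HttpOnly"
```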

There’s still one topic we need to discuss with sessions, and that’s security. When dealing with authentication, this must always be on your mind. After all, we’re handling very sensitive information that, if stolen, could cause serious damage to our app or to the user.

Like anything in software development, sessions have a good side and a bad side. But, in my opinion, the good outweighs the bad here. Let’s start by listing the good.

The Good

Session ids are an opaque reference to the user’s data. Outside of the server, the session id is basically a meaningless random string. Since all of the state is stored in the server, attackers can’t extract any valuable data from the session id alone.

Cookie flags increase the security of sessions. Attributes like HttpOnly keep the cookie out of reach of malicious client-side JavaScript code, such as code injected through XSS attacks (we’ll discuss them when talking about tokens). Other flags, such as Secure, add another layer of protection to our data.

It’s a more flexible approach. With all of your state already stored on your server, you have a lot more control over the entire app. As we’ll see when discussing tokens, stale data/expired credentials can become a problem. This doesn’t happen with sessions, since we already keep a list of valid non-expired identifiers.
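As a rough sketch of that control: because the server owns the list of sessions, it can expire or revoke any of them at any moment. The store, the function names and the explicit now parameter (used so the example does not depend on the real clock) are assumptions for illustration.

```javascript
const sessionStore = new Map(); // session id -> { username, expiresAt }

// Create a session that is valid for ttlMs milliseconds.
function createSession(sessionId, username, ttlMs, now = Date.now()) {
  sessionStore.set(sessionId, { username, expiresAt: now + ttlMs });
}

// Return the session only if it exists and has not expired yet.
function getValidSession(sessionId, now = Date.now()) {
  const session = sessionStore.get(sessionId);
  if (!session) return null;
  if (now >= session.expiresAt) {
    sessionStore.delete(sessionId); // clean up the expired session
    return null;
  }
  return session;
}

// Revoke a session immediately, for example on logout or suspicious activity.
function revokeSession(sessionId) {
  return sessionStore.delete(sessionId);
}

createSession("abc", "alice", 1000, 0);            // created at t=0, valid for 1000 ms
console.log(getValidSession("abc", 500).username); // prints "alice"
console.log(getValidSession("abc", 1000));         // prints null (expired)

createSession("def", "bob", 1000, 0);
revokeSession("def");
console.log(getValidSession("def", 1));            // prints null (revoked)
```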

Sessions have been around for a while. As I said, sessions are an old-school way of implementing authentication, which means they are a battle-tested approach that has been used for 20+ years in multiple frameworks and languages. Even though we developers might have a soft spot for what’s new and shiny, it’s important to recognize approaches that just work.

The Bad

Sessions require more resources on the server. Because sessions are stored on the server, they increase the demand for resources such as memory, and every request involves a session lookup. Some people also say this makes scaling your app harder (you need to synchronize state across many servers), while others say that this claim doesn’t even make sense. Personally, I’ve never needed to build a system large enough for this to become a problem.

Sessions are vulnerable to CSRF (cross-site request forgery) attacks. This is probably the biggest security concern when it comes to sessions/cookies. If you want to implement session-based authentication, you must understand what a CSRF attack is and how to protect your app from it.

CSRF Attacks

CSRF attacks essentially boil down to attackers inducing you to perform a certain action (change email, change password, transfer money) that you didn’t want to in the first place. But how does this happen?

Imagine that the server you’re talking to has a particular POST route that changes your email. Somehow, a savvy attacker finds out about this route and also figures out the parameter it receives to change the email. Let’s call this parameter newEmail.

Now, the only way for this request to go through is with your session id. So, for an attacker to be able to actually change your email (or do something else) they’d have to, somehow, steal that cookie. But things aren’t so simple. A hacker can’t just go into your browser and steal your cookies so easily (especially with already discussed cookie flags such as HttpOnly and Secure).

The thing is: they don’t have to.

Remember how the app server answers your initial authentication request with the Set-Cookie header? This header is basically the server saying: “Hey, when talking to me, always send this cookie, alright?”, which makes your browser send the cookie automatically with every request to that particular server. So, if the attacker can trick your browser into making that request, the browser will send the session id along, effectively performing the action!

But how would he trick my browser into doing this? Well, this is the easy part. Your browser is already making tons of requests on every single web page you visit. All the attacker has to do is induce you to open a specific URL that makes that request.

Let me illustrate this with an example. Our savvy attacker found out about the POST route that changes your email and also about its parameter (newEmail). He then creates a simple web page containing a form that makes a POST request to that route, with the new email in an input tag. The form can be submitted as soon as the page loads, but you have to open the page first. If he can make you click a link to it (in an email, text message, anything), the form on this other website (this is why it’s called cross-site request forgery) would change your email. To the main server, everything looks fine, since it was your valid session id that was sent with the request. It’s a subtle attack, but it has happened to major companies like Netflix and YouTube.

[Figure: Diagram explaining a CSRF attack. Taken from this website.]
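From the server’s point of view, the problem can be sketched like this. The handler below is deliberately vulnerable: it trusts any request that carries a valid session cookie. The request objects, the handleChangeEmail function and the example addresses are hypothetical stand-ins for a real HTTP framework.

```javascript
// Hypothetical server state: one logged-in user and their current email.
const sessions = new Map([["abc123", { username: "alice" }]]);
const emails = new Map([["alice", "alice@example.org"]]);

// VULNERABLE handler for the POST route: the session cookie is the only check.
function handleChangeEmail(request) {
  const session = sessions.get(request.cookies.sessionToken);
  if (!session) return { status: 401 };
  emails.set(session.username, request.body.newEmail);
  return { status: 200 };
}

// The request triggered by the form on the attacker's page.
// The attacker never saw the cookie; the victim's browser attached it automatically.
const forgedRequest = {
  origin: "https://attacker.example", // the handler never looks at this
  cookies: { sessionToken: "abc123" },
  body: { newEmail: "attacker@evil.example" },
};

console.log(handleChangeEmail(forgedRequest).status); // prints 200
console.log(emails.get("alice"));                     // prints "attacker@evil.example"
```

Nothing in the request tells the server that it came from another site, which is exactly why the session id alone is not enough to authorize the change.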

Protection from CSRF attacks

The reason this attack works is that the hacker has all the necessary pieces to complete the puzzle. He knows about the route, he knows the parameter of the route, and he also “has” your session id. What if we just added a new random parameter to the route that the hacker can’t guess? This way, the hacker would never be able to complete the puzzle.

This is the strategy behind CSRF tokens. They are essentially random strings that become mandatory on certain requests. The server creates and keeps track of these tokens, and sends them to the user when they authenticate. Your client-side JavaScript code (not the browser itself) stores this token and attaches it to subsequent requests in a custom HTTP header. Because the header is added by your own JavaScript code rather than automatically by the browser, other sites can’t force your browser to send it. And since the token is a random string, it becomes nearly impossible for the attacker to forge a request with the correct token.
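A minimal sketch of the server side of this idea could look like the following. It assumes the token is stored next to the session and arrives in a custom header (something like x-csrf-token, a name chosen here purely for illustration); the function names are hypothetical too.

```javascript
const crypto = require("node:crypto");

const sessions = new Map(); // session id -> { username, csrfToken }

// On login, create the session together with a random CSRF token.
function createSession(username) {
  const sessionId = crypto.randomBytes(32).toString("hex");
  const csrfToken = crypto.randomBytes(32).toString("hex");
  sessions.set(sessionId, { username, csrfToken });
  return { sessionId, csrfToken }; // cookie for the browser, token for the client-side code
}

// On protected requests, compare the header value with the stored token.
function isCsrfTokenValid(sessionId, tokenFromHeader) {
  const session = sessions.get(sessionId);
  if (!session || typeof tokenFromHeader !== "string") return false;
  const expected = Buffer.from(session.csrfToken);
  const received = Buffer.from(tokenFromHeader);
  // timingSafeEqual throws if the lengths differ, so check the length first.
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const { sessionId, csrfToken } = createSession("alice");
console.log(isCsrfTokenValid(sessionId, csrfToken)); // prints true
console.log(isCsrfTokenValid(sessionId, undefined)); // prints false (forged request, no header)
console.log(isCsrfTokenValid(sessionId, "guess"));   // prints false
```

A forged request still carries the session cookie, but without the matching token the server can reject it.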

I’ve put together a small NodeJS server that implements authentication using Redis as a session store. You can find the link to the GitHub repository right here. This repository will eventually contain code for the future articles I write on authentication as well.

Conclusion

Sessions are a solid method for authentication. They have been around for a long time and are battle-tested. Personally, they’re my preferred method of authentication when it comes to web applications, but they’re also not the only one.

Another very popular alternative, tokens, has recently been gaining a lot of traction. This has also led to a heated discussion about which method is better.

In the next part, I’ll explain what tokens are and how they work. I’ll also compare both strategies and see the pros and cons of each one. See you there!

Mateus Melo

I’m a computer science student that loves to learn and writes to share knowledge.