How authentication on the web has evolved in the last 20 years - Part 1

In this post, I describe how I have seen authentication on the web change over the last 20 years. I have divided the post into two parts.

The first time I created authentication for a website, I checked for the presence of an isLoggedIn flag and a username in the cookies to detect whether the request was coming from a logged-in user. Thankfully, this program was short-lived and never left my local computer.

I quickly realized this was disastrous, but it just so happened that I was onto something good even in this mistake.

I have seen numerous bad implementations from friends and colleagues: the user name and logged-in status in the URL, or the user name and password in the URL, come to mind.

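Surprisingly, developers still consider sending the user name and password in an HTTP header acceptable as long as the credentials are base64 encoded (that's basic HTTP authentication for you). Base64 is an encoding, not encryption, so anyone who can read the header can read the password.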
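To make the point about base64 concrete, here is a small Python sketch that builds a Basic Authorization header value and then recovers the credentials from it without any key. The function names and the sample credentials are made up for illustration.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple:
    """Recover (username, password) from a Basic header value; no secret is needed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # The user name ends at the first colon; the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")
print(header)
print(decode_basic_auth(header))
```

Anyone who can see the header can run the second function, which is why this scheme is only reasonable over an encrypted connection.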

I have since learned better ways of transporting credentials and sessions in HTTP requests after login, and the storage of sessions, passwords, etc. has also improved a lot.

Storing passwords

If you have spent any significant time online, you probably already know that storing passwords in clear text is not great. Thankfully, none of the companies I worked with committed this sin to my knowledge.

However, the gold standard for storing passwords at that time was modeled on the /etc/passwd file on Unix and Linux systems: everyone had read permission, and the passwords were hashed using md5.

Some of my friends knew about rainbow tables and so did not assume our accounts on the department servers were safe, yet they still chose to store user passwords as md5 hashes for their own websites and customers. It was as secure as the industry standard. After all, unlike some people who didn't chroot their servers and exposed their /etc/passwd files to the world, their database was not public.

Things soon improved: md5 was replaced with sha256, and web developers started salting the passwords. Not everyone learned, though; I have seen companies using unsalted sha256 in production. I hope they never get hacked.
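Here is a rough Python sketch of what salting changes. Without a salt, two users with the same password get the same hash, which is exactly what a rainbow table exploits. With a random per-user salt, the stored values differ. The function names are mine, and a single fast sha256 pass is shown only because it matches that era; I would not use it for new systems.

```python
import hashlib
import hmac
import secrets


def unsalted_sha256(password: str) -> str:
    """Hash with no salt: identical passwords produce identical digests."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_sha256(password: str, salt: bytes = b"") -> tuple:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_salted(password: str, salt: bytes, expected_digest: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = salted_sha256(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two users pick the same password.
print(unsalted_sha256("hunter2") == unsalted_sha256("hunter2"))  # same digest
salt_a, digest_a = salted_sha256("hunter2")
salt_b, digest_b = salted_sha256("hunter2")
print(digest_a == digest_b)  # different salts, so the digests differ
print(verify_salted("hunter2", salt_a, digest_a))
```

The salt is stored next to the digest; it does not need to be secret, it only needs to be different for every user.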

Authenticating sessions in 1999

Most early web developers quickly realized that they needed to send an opaque session id in a cookie and, when the cookie came back, look up that session id in a database to find the user's details and login status.
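The idea can be sketched in a few lines of Python. The dictionary stands in for the session table in the database, and all names here are hypothetical; the point is only that the cookie carries a random id and everything else stays on the server.

```python
import secrets

# Stand-in for the session table: session id -> user details.
SESSIONS = {}


def create_session(username: str) -> str:
    """Called after a successful login; the returned id goes into the cookie."""
    session_id = secrets.token_urlsafe(32)  # random and unguessable
    SESSIONS[session_id] = {"username": username}
    return session_id


def lookup_session(session_id: str):
    """Called on every request; returns the user details or None."""
    return SESSIONS.get(session_id)


def destroy_session(session_id: str) -> None:
    """Called on logout."""
    SESSIONS.pop(session_id, None)


sid = create_session("alice")
print(lookup_session(sid))
print(lookup_session("made-up-id"))
destroy_session(sid)
print(lookup_session(sid))
```

Compare this with my isLoggedIn cookie: a forged cookie value here simply fails the lookup, because the client never gets to state who it is.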

PHP 4, which was in development in 1999 and released in 2000, introduced built-in sessions with the PHPSESSID cookie, which did exactly this. The sessions could be stored in text files for single-server applications; they could also be stored in a database for multi-server applications.

The Java world wasn't far behind, and in 1999 it also introduced JSESSIONID with the servlet specification as part of J2EE. Developers now had a standard and secure way to determine if someone was logged in.

Developers even started overloading the session database with other information like subscription status and the user's display name, and some went as far as storing cart information in the session.

The database was soon a bottleneck

As you would have expected, databases were soon a bottleneck. Applications were hitting the database multiple times for every request: once for session information and then again for the actual application-related queries.

I was developing a large-scale application with millions of customers at this point. My team knew that we couldn't run sessions on a database. At this point, I was a big fan of W. Richard Stevens' Unix Network Programming book, and my team was already using shared memory and other IPC methods in applications.

I had also explored RPC in C, and I was convinced that my application needed distributed memory, much like the shared memory system my team already used. Unfortunately, there were no popular open-source solutions yet; Memcached and Redis only came out in 2003 and 2009, respectively.

My team decided to store session information in an in-memory database. Any web server could contact the session server with a session token and find out if someone was logged in. After a couple of years, this became the popular way of doing things.
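The core of such a session server can be sketched in Python as below. This is not the code my team wrote; the class name, the 30-minute lifetime, and the injectable clock are assumptions for illustration, and the network layer that let web servers call it is left out.

```python
import secrets
import time


class SessionStore:
    """In-memory session store with expiry; tokens map to small dicts."""

    def __init__(self, ttl_seconds: float = 1800.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._sessions = {}  # token -> (expires_at, data)

    def put(self, data: dict) -> str:
        """Store session data and return a fresh random token."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (self._clock() + self._ttl, dict(data))
        return token

    def get(self, token: str):
        """Return the session data, or None if unknown or expired."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._sessions[token]  # drop expired sessions lazily
            return None
        return data


store = SessionStore()
token = store.put({"username": "alice"})
print(store.get(token))
```

Because nothing touches disk, a lookup costs a dictionary access, and the main database is left to serve the application's own queries. The price is that a restart logs everyone out.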

A detour on password storage

When we last discussed password storage, it was in bad shape. But things were improving. New algorithms like bcrypt and argon2 became available, and open source solutions like Apache Shiro made it easy for application developers like me to get genuinely world-class security with very little code.

With new password hashing algorithms like scrypt, we can be sure that someone with access to the database won't be able to brute-force guess passwords easily.
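As a minimal sketch of what this looks like in practice, Python's standard library exposes scrypt through hashlib.scrypt. The cost parameters below are illustrative assumptions rather than a recommendation, and the helper names are mine.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (an assumption); tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple:
    """Return (salt, derived_key) to store alongside the user record."""
    salt = secrets.token_bytes(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))
print(verify_password("wrong guess", salt, key))
```

Each guess now costs the attacker a deliberately expensive, memory-hungry computation instead of a single cheap hash, which is what makes brute-forcing a leaked table so much slower.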

In the second part of this post, I will talk about JWT, OAuth, TOTP, and many other exciting technologies I have used since the session storage era.

What does your authentication stack look like today?