Web SSO across multiple subdomains using the Magic Cookie Recipe

One of the greatest things about the web is that it’s stateless. One of the worst things about the web is that it’s stateless. To deal with this, Lou Montulli built “cookies” into Netscape in 1994. Cookies let a browser store a bit of information and send it in the HTTP headers with every request to its corresponding domain. This became the backbone of authentication on the web. On nearly every site where you authenticate, the site creates a cookie (often just a session ID that looks like an incomprehensible hash) that corresponds to more information on the server.

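Cookies have some limitations for security purposes. Your browser only sends a cookie back to the host that set it or, if the cookie was scoped to a parent domain, to that domain and its subdomains. It never sends the cookie to an unrelated domain.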

There’s a problem with this that the web hasn’t solved, though. What happens if you want to have Single Sign-On across two different platforms? Single Sign-On is the idea that once you sign in, you’re signed in everywhere (or at least in all the places you should be). Since the web is stateless and cookies only work within a single domain, how do websites accomplish this?

The most common way to get around this is as follows: there’s one identity provider where users sign in; let’s call it id.example.com. A user goes to a.example.com, where the server says “you’re not signed in, so I’m redirecting you to id.example.com, and on success come back to a.example.com/auth-handler and give me a token”. Once the user signs in successfully to id.example.com, the identity provider redirects the user back to a.example.com/auth-handler, where that endpoint does a server-side lookup back to id.example.com to see if the token is valid. If it is, a.example.com generates a user session and redirects the user back to the a.example.com homepage.
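To make the round trip concrete, here is a minimal Python sketch of the two pieces site “a” needs on its side: the redirect URL it sends the browser to, and the code that pulls the token out of the callback. The /login path and the return_to and token parameter names are made up for illustration; a real identity provider defines its own. The server-side lookup that checks the token is not shown.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical login endpoint on the identity provider.
IDP_LOGIN_URL = "https://id.example.com/login"


def build_login_redirect(return_to: str) -> str:
    """URL that site "a" redirects an anonymous visitor to."""
    # "return_to" is an assumed parameter name, not part of any standard.
    return IDP_LOGIN_URL + "?" + urlencode({"return_to": return_to})


def token_from_callback(callback_url: str) -> Optional[str]:
    """Pull the token out of the URL the identity provider redirects back to."""
    # "token" is likewise an assumed parameter name.
    values = parse_qs(urlparse(callback_url).query).get("token")
    return values[0] if values else None
```

For example, build_login_redirect("https://a.example.com/auth-handler") produces the URL the browser is sent to, and token_from_callback is what the /auth-handler endpoint would run when the browser comes back.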

That’s a lot of work just to get signed in, but what happens when the user goes to b.example.com, which also requires authentication? It has no idea that the user has previously signed in. Only id.example.com and a.example.com know the user has signed in, so when the user goes to b.example.com, they are redirected to id.example.com just as they were when they went to site “a”. That is a poor user experience. After all, the user doesn’t need to know that these are two separate systems, yet they are passed through id.example.com just so b.example.com can get a token like site “a” did. Typically, the user doesn’t have to provide their password again since id.example.com already knows they’re signed in, but it’s still a jarring and often slow experience.

It turns out there is a better way to sign in that can cover any subdomain of an existing domain, regardless of the technology behind the sites. It’s called the Magic Cookie Recipe (MCR).

The Magic Cookie Recipe is an architectural pattern that I developed in 2014 to create a seamless experience between a user portal site which was built on Drupal and an eCommerce site which was built on Magento. Here are the basic constructs you’ll need to know to understand this:

  1. The MCR is a cookie created by the identity provider (or login handler) with a “.example.com” domain, which means the browser sends the cookie to any subdomain.
  2. There is a server-side session variable that stores the exact value of the MCR.
  3. There is a server-side lookup to the identity provider to validate the token within the MCR.
  4. We trust the MCR cookie before the server-side session, since the user controls the cookie. The cookie controls the server-side variable.
  5. If the cookie and server-side session are in agreement, we trust the cookie. If they disagree, we must validate the MCR cookie.
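The first construct is the one that makes everything else possible, so here is a small Python sketch of it using the standard library’s http.cookies module. The cookie name “mcr” is my placeholder; the recipe doesn’t require any particular name.

```python
from http.cookies import SimpleCookie


def build_mcr_header(token: str, parent_domain: str = ".example.com") -> str:
    """Return the value of a Set-Cookie header that creates the MCR."""
    cookie = SimpleCookie()
    cookie["mcr"] = token                    # "mcr" is an assumed cookie name
    cookie["mcr"]["domain"] = parent_domain  # scope the cookie to every subdomain
    cookie["mcr"]["path"] = "/"              # send it for every path on those hosts
    return cookie["mcr"].OutputString()
```

The identity provider (or login handler) would send this as a Set-Cookie header on a successful sign-in. Because of the Domain attribute, the browser then includes the cookie on requests to a.example.com, b.example.com, and so on.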

I’ll walk through it step by step, but first here’s the flow diagram.

SSO Magic Cookie Recipe Flow Diagram

The identical logic runs on every site. Let’s walk through some examples.

Example: Guest User (guest happy path)
A user arrives at the site without the MCR.

SSO Magic Cookie Guest example

This is the easiest scenario. If the cookie doesn’t exist, the user is not signed in. We always trust the cookie first because the user controls the cookie, so if there is a server-side session, we destroy it no questions asked. Why? Because if the cookie doesn’t exist, that means either the session expired or the user signed out on another site. Always trust the cookie first. This implicitly accomplishes “single sign-out”.

The red line indicates “happy path” for guest users, meaning that until the user signs in, all subsequent page requests should go through this simple logic with low overhead.

Note that the first check is labeled “Does a valid Magic Cookie exist”. This is where we can check that the cookie is valid, not just that it exists. For example, if you use OIDC as your identity architecture and store the ID token as the value of the MCR, you can open up the JWT and make sure it’s from a trusted source, that it’s not expired, and so on. This spot in the logic is not intended for web service calls, since they would run on every page load and hurt performance.
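Here is a rough Python sketch of what that local check could look like, using only the standard library. It assumes the ID token is signed with HS256 and a shared secret, which is my simplification; many identity providers sign with RS256 instead, in which case you would use a proper JWT library and the provider’s public key. The function name is hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def mcr_passes_local_checks(token: str, secret: bytes, issuer: str,
                            now: Optional[float] = None) -> bool:
    """Cheap check with no network calls: signature, issuer, and expiry."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError):
        return False  # not shaped like a JWT at all
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return False
    # Assumption: HS256 only. Anything else is rejected rather than guessed at.
    if header.get("alg") != "HS256":
        return False
    signed_part = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signed_part, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # not signed by a party holding our secret
    if claims.get("iss") != issuer:
        return False  # not from the identity provider we trust
    exp = claims.get("exp")
    current = time.time() if now is None else now
    return isinstance(exp, (int, float)) and exp > current
```

A token that fails this check is treated exactly like a missing cookie, so the request follows the guest path. A token that passes still goes through the rest of the flow; this check only filters out cookies that are obviously bad.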

Example: Signed-in user (happy path)
A user arrives at the site with the MCR and there is a corresponding server-side session.

SSO Magic Cookie Sign-in happy path example

In this scenario, the user has signed in and a server-side session has already been created. We just validated that they both exist and match each other. We call this the “happy path” for logged-in users because if both MCR and server session match, we know the user is A) signed in, and B) signed in as the same user so we don’t have to do any special validations.

Example: Visiting site B for the first time
A user signs in on site A, but visits site B for the first time while still signed in.

SSO Magic Cookie first time visiting site b example

In this example, the MCR exists but there is no server-side session. We must validate our MCR cookie. Here, the cookie value is your OIDC ID token and we just pass that to our ID provider. Our ID provider tells us if the token is still valid. If it’s not, we destroy the cookie and send the user on as a guest. If the token is valid, we create a local session for the user, and now our MCR cookie and server-side session match. From here on, subsequent page requests should follow either the “happy path for guests” or the “happy path for signed-in users” flow.

Example: Signing in as a different user
A user signs in as user Foo and visits sites A and B. The user navigates away from site B and back to site A, where they log out and sign in as user Bar. They then visit site B again.

The crux here is that Site B has a server-side session that recognizes user Foo, but the MCR says we’re user Bar.

SSO Magic Cookie logged in as different user example

In this example, the MCR and server-side session exist but don’t match. Here, we trust the MCR over the server-side session, but since they don’t match we have to validate the MCR. First we destroy the local session since we know it’s not correct. After validation of the MCR, we follow the same logic as the previous example.
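The four examples above can be condensed into one small function. This is a Python sketch, not the code from my Drupal or Magento implementations: the session is modeled as a plain dict, and validate stands in for the server-side lookup to the identity provider (it returns a user name for a valid token and None otherwise). All names here are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Session = Dict[str, str]


def apply_mcr(mcr: Optional[str],
              session: Optional[Session],
              validate: Callable[[str], Optional[str]]
              ) -> Tuple[Optional[Session], bool]:
    """Return (session to keep, whether to delete the MCR cookie)."""
    # Guest happy path: no cookie means not signed in, so any
    # leftover server-side session is dropped, no questions asked.
    if not mcr:
        return None, False
    # Signed-in happy path: cookie and session agree, nothing to validate.
    if session is not None and session.get("mcr") == mcr:
        return session, False
    # Out of sync: either there is no session yet (first visit to this site)
    # or the session belongs to someone else. The old session is discarded
    # and the cookie is validated with the identity provider.
    user = validate(mcr)
    if user is None:
        return None, True  # invalid token: destroy the cookie, continue as guest
    return {"mcr": mcr, "user": user}, False
```

Calling apply_mcr(None, old_session, validate) is the guest example, apply_mcr(token, None, validate) is the first visit to site B, and apply_mcr(bar_token, foo_session, validate) is the different-user example. Note that validate is only reached when the cookie and session are out of sync.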

What does this solve?

The Magic Cookie Recipe allows us to have multiple different technologies under different subdomains and allow the user to sign in once without jarring redirects. The login experience is seamless and the user won’t even know they’ve jumped from one application to another.

This also allows us to have very little overhead. We only call to validate the token when the server-side session and MCR are out of sync, which should only be once per property per session.

What does this not solve?

This does not solve for different technologies under multiple different domains, only subdomains (meaning it works for a.example.com and b.example.com, but not example.com and example.net). This is due to the limitations with how cookies work. Later, I will write up a specification that browser makers could use to allow SSO across multiple domains, but that’s for a different day.
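To see why, here is a simplified Python version of the domain matching a browser applies before sending a cookie. It deliberately ignores IP addresses and public-suffix rules, so treat it as an illustration of the limitation rather than a faithful copy of what browsers do.

```python
def cookie_domain_matches(host: str, cookie_domain: str) -> bool:
    """Would a cookie scoped to cookie_domain be sent to host? (simplified)"""
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")  # a leading dot is ignored
    # Either the host is the domain itself, or it ends with ".<domain>".
    return host == domain or host.endswith("." + domain)
```

With a cookie scoped to “.example.com”, this returns True for a.example.com and b.example.com but False for example.net, which is exactly where the MCR stops working.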

What other things should I consider before implementing this?

First and foremost, do you have the capacity to accomplish this? Often with SaaS products you don’t get a lot of control, so make sure you can do this.

Second, understand that each implementation is slightly different. If you’re using a system that combines customer and administrator (employee) logins in one table, like WordPress or Drupal, you may need to add in logic to see if the user is an employee and ignore them. Platforms like Magento separate out admins and customers to their own authentication processes.

Third, you should make this code run as early as possible. If other modules and plugins rely on you being authenticated, you’ll want to make sure the user is who they say they are early.

Fourth, this is abstract. It does not cover the login process itself, logout process, or what you do when a user is determined to be a guest or authenticated user. This assumes those processes exist and you’ll have to consider this implementation during each of those scenarios, like creating the MCR on login, removing the MCR on logout etc.

Is this secure?

The MCR does not directly address security. If you’re going to include potentially sensitive data in a cookie, like an ID token, consider setting the HttpOnly and Secure attributes on it. It’s already insecure to add sensitive info into a JWT, and almost all authentication on the web relies on cookies and sessions anyway. This is no different.
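As a quick illustration, this is how those two attributes can be set with Python’s standard http.cookies module. The function name is hypothetical, and this is a sketch rather than a hardening checklist.

```python
from http.cookies import SimpleCookie


def protected_cookie_header(name: str, value: str) -> str:
    """Return a Set-Cookie header value with HttpOnly and Secure set."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["httponly"] = True  # hide the cookie from page JavaScript
    cookie[name]["secure"] = True    # only send the cookie over HTTPS
    return cookie[name].OutputString()
```

Since the MCR is only ever read on the server, hiding it from JavaScript costs nothing.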

Is this only for OpenID Connect?

No. Since the MCR is an architectural pattern, it should work with any authentication mechanism, including OpenID Connect (OIDC) and SAML, as long as there’s a verification endpoint and your app servers can handle sessions. In fact, I’ve gotten it to work consistently with OIDC, SAML, and custom token-based sign-in solutions on Drupal, WordPress, and Magento.
