Performance is critical to today's successful applications and web sites. If you design with an awareness of the session state management challenges you can always change your strategies to match your performance needs.
In the first part of this series, a discussion was presented on what performance is and some of the techniques that can be used to improve or monitor performance in your application. In this article the focus is specifically on managing session state and the things that you can do to maintain performance in your application.
There are two key areas to understand in session state management. First, you need to understand the options you have for maintaining session state. Second, you have to consider the different kinds of information that need to be managed in session state and how the different needs for maintaining session state impact how you might choose to manage it.
Just as we had to review some background concepts in order to understand the broad performance discussion, there are a few key concepts related to the communication between the client and the server that we need to review here. These include the weight of the request in terms of the bytes transferred to the server and from the server to the client. We'll talk about the request/response weight as well as the benefits and weaknesses of various encryption techniques that may allow you to leverage the users' machines for some session state management.
Some of the techniques for managing session state rely upon an increase in page weight. It's possible to transmit some data in the form of hidden fields on a form or as a cookie to the browser. The browser will dutifully transmit this data back to the server when it's asked to. This increases the communications in both directions. However, if the amount of data being transmitted back and forth isn't that large, it can be a good technique for limiting the amount of back end IO on a database server.
An argument that was made a while ago, but which isn't really an issue today, is that users may turn off cookies. It's certainly true that you need to check for this, because if they do they'll break any cookie-based mechanism you use to send data to them. However, because most sites require cookies, most users leave them turned on, meaning that you generally don't have to be concerned about sending them session data via cookies. This is much more practical than the URL encoding strategies and form posting strategies that can be used to make sure you can get posted data back. I won't go into these two techniques, but they essentially encode the information into the URL or force each page to be a post so that the client browser posts back hidden form fields.
With any data transmitted to the client there's the concern for its sensitivity. You wouldn't want a cookie with the user's password to be transmitted because, even if you're running the site over SSL, the cookies can get persisted to disk and thereby snooped. It's possible to specify that cookies are session cookies and shouldn't be persisted, but not every browser honors these requests. As a result you have to assume that whatever you send as a cookie can end up on a disk, where it can be snooped or tampered with.
Encrypting Transmitted Data
Frequently when speaking with web developers, I hear that they don't need to worry about security because they're using SSL. While SSL is designed to prevent snooping and tampering during transmission, it doesn't do anything about securing the information that is on the user's computer. Securing the information that's on the user's computer has two distinct dimensions.
The first dimension is the protection of the user's data from snooping eyes. In other words, a cookie that has the user's social security number that they just entered on a form should be encrypted so that it's protected on the user's computer.
The other dimension is the malicious user who's trying to tamper with the information they have so that they can gain access to things that they shouldn't have. For instance, storing an unencrypted user id value in a cookie is a bad idea because a malicious person could just change the value and become someone else — if you were using that for their authentication.
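The tampering problem can be illustrated with a minimal sketch in Python, chosen here only for brevity. It signs a cookie value with a keyed hash (HMAC) so that a changed value is detected. The names `SECRET_KEY`, `sign_value`, and `verify_value` are hypothetical, and signing alone addresses tampering only; the value is still readable on the user's machine.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice it would come from protected configuration.
SECRET_KEY = b"example-secret-key-not-for-production"


def sign_value(value: str) -> str:
    """Return 'value|signature' so later changes to the value can be detected."""
    mac = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}|{mac}"


def verify_value(cookie: str):
    """Return the original value if the signature matches, otherwise None."""
    value, sep, mac = cookie.rpartition("|")
    if not sep:
        return None  # no signature present at all
    expected = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched through timing.
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None


cookie = sign_value("user_id=1042")
# A malicious user edits the id but cannot produce a matching signature.
tampered = cookie.replace("1042", "1", 1)
```

Here `verify_value(cookie)` returns the original value, while `verify_value(tampered)` returns `None` because the signature no longer matches the edited value.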
Microsoft ASP.NET handles the session key encryption process for you if you use the ASP.NET Membership provider, and it has built-in tamper resistance for the session ID that is created. However, if you step outside of this to manage your own client storage, you'll definitely need to be concerned with how you're going to secure the information being stored on the local system. That means understanding a few encryption basics.
In computer encryption there are two basic types: symmetric and asymmetric. Symmetric encryption is fast but requires that both the sender (encrypter) and the receiver (decrypter) have the same key. This key management problem, making sure that only the right people have the keys necessary to encrypt and decrypt data, is where most encryption mechanisms throughout history have fallen apart. Once an enemy (or bad guy) has the encryption key, your messages are no longer safe.
Technologies like Secure Sockets Layer (SSL) do the bulk of their encryption with symmetric encryption because it is so fast. The keys for the symmetric encryption are exchanged through asymmetric encryption which isn't as fast but doesn't have the same problem that symmetric encryption has with key distribution.
Asymmetric encryption relies on what's called a public-private key pair. One key is kept secret and the other key is made public. For instance, let's take a simplified exchange where we want to exchange some symmetric keys.
The client (requestor) requests a copy of the public key from the server (responder). The client then creates a random symmetric key and encrypts the information with the public key of the server. The server decrypts the package with its private key and gets the symmetric key the client randomly created. Now that both sides have the symmetric key they can exchange information using a symmetric encryption mechanism which is smaller and faster to decode than an asymmetric approach.
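The exchange just described can be sketched with a toy version of RSA. This is an illustration of the flow only: the primes are tiny textbook values chosen as an assumption for readability, there is no padding, and nothing here is secure. Real systems use keys thousands of bits long and a vetted library.

```python
import secrets

# Server side: a toy RSA key pair built from tiny textbook primes.
# These numbers only illustrate the flow; they offer no real security.
p, q = 61, 53
n = p * q                              # public modulus
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent, kept on the server

public_key = (e, n)                    # this is what the client requests

# Client side: pick a random symmetric key and encrypt it with the public key.
symmetric_key = secrets.randbelow(n - 2) + 2   # toy key, must be smaller than n
encrypted_key = pow(symmetric_key, public_key[0], public_key[1])

# Server side: recover the symmetric key using the private exponent.
recovered_key = pow(encrypted_key, d, n)

# Both sides now hold the same symmetric key for the rest of the conversation.
```

Only the holder of `d` can recover the key from `encrypted_key`, which is why the public key can be handed out freely.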
This understanding of encryption is necessary because it's impractical to encrypt all of the information going to the client with asymmetric approaches; the processing required to do asymmetric encryption and decryption just isn't practical with current hardware. It's therefore necessary to use a symmetric encryption mechanism for data sent to the client, and this introduces a key management problem.
If you're going to send data to the client using a symmetric key, that key will need to change periodically in order to keep the data secure from prying eyes and from tampering. Changing the keys is difficult because it's mostly done on the fly, and any data existing on the client when the change is made will end up being invalidated because the old key it was encrypted with won't match the new key.
It is issues like these which push folks away from storing information on the client workstation and toward keeping all session state on the server.
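The invalidation effect can be seen in a small sketch. Python's standard library has no symmetric cipher, so this example substitutes a keyed signature (HMAC) for encryption; the point it shows is the same, namely that data issued under the old key stops validating once the server switches keys. The key values and the `sign` and `is_valid` helpers are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, data: str) -> str:
    """Produce a keyed signature for data sent to the client."""
    return hmac.new(key, data.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid(key: bytes, data: str, signature: str) -> bool:
    """Check the signature that came back from the client against a key."""
    return hmac.compare_digest(sign(key, data), signature)


old_key = b"hypothetical-key-before-rotation"
new_key = b"hypothetical-key-after-rotation"

data = "cart=3-items"
signature = sign(old_key, data)        # issued to the client before the rotation

valid_before_rotation = is_valid(old_key, data, signature)
valid_after_rotation = is_valid(new_key, data, signature)
```

After the rotation the client's existing data fails validation even though nobody tampered with it, which is the cost described above.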
Solutions for Session State Data
If you've got to maintain data it has to be stored somewhere. The question is where should it be stored? Each location has its own advantages and disadvantages. In the following sections we'll look at what the advantages and disadvantages of the different storage options are.
The easiest answer for storing session state in ASP.NET is in process. This is, in fact, the default configuration for ASP.NET. If you don't do anything else you'll get a session object that will allow you to store any kind of session information you want. The benefits of this are that it's easy and it's fast. You can store practically any object you want in the session and retrieval doesn't require anything special. However, the disadvantages are that it's not shared between servers and it's vulnerable to application pool recycling, which throws away all session state data. It's also in-process, so it counts against the maximum address space that a process can have. If you've got a lot of session state data this can be a problem.
Out of Process (Memory)
If you can't live with the memory constraints of being in the same process you can use the out-of-process, in-memory state provider that ASP.NET ships with. It works like a persisted (i.e. database) provider except that it's not actually persisted; it's still in memory. This can be important if you're running out of memory, or you want to have your session data survive the application pool recycling. The benefits are that it's still pretty fast because the information never has to be written to disk. However, the disadvantages are that all of the objects that are placed into the session must be serializable and, because the sessions are not saved anywhere, the session data is lost if the session state service is stopped (by, say, a server crash).
The next step in the progression to protect the session state is to persist it to a database. While this addresses the problems with a failure, it does so at a relatively high cost. All of the operations to get to the session are now database operations, so they're a few orders of magnitude slower than accessing an in-memory object. So while a persisted session state does protect against a server failure and reboot, it does so at a very significant performance cost. As with the out-of-process session state, all of the objects must be serializable.
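The serialization requirement is easy to trip over. As a rough analogy, the sketch below uses Python's `pickle` module in place of .NET serialization: plain data round-trips without trouble, while an object tied to the running process (a lock, in this example) cannot leave it. The variable names are illustrative only.

```python
import pickle
import threading

# A plain data object survives serialization and can leave the process.
cart = {"items": ["book", "pen"], "total": 23.5}
restored = pickle.loads(pickle.dumps(cart))

# An object tied to the running process (here a lock) cannot be serialized,
# so it could not be stored in an out-of-process session.
session_value = {"cart": cart, "lock": threading.Lock()}
try:
    pickle.dumps(session_value)
    serializable = True
except TypeError:
    serializable = False
```

Code that works against the in-process session can therefore fail when the same application is switched to an out-of-process provider.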
The final option for managing session state data is to transmit it to the client as described earlier in this document. This is really only suitable for a small amount of data as cookies have size limits and they end up being transmitted on each request whether or not they're needed. This can have a big impact on the amount of time it takes the site to respond to a user's click. It's also sensitive to the tampering and snooping we discussed earlier unless encryption (or signing) is used. While this may be required for some types of information like a user's identity or the session identifier, it's not a silver bullet to the problems that persisted session management creates. However, when used in moderation it can be helpful in limiting the amount of data that must be managed via a persisted storage.
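To get a feel for the cost, the following sketch measures the size of a cookie as it would travel with each request and checks it against a limit. The 4096-byte figure is a commonly cited per-cookie limit and is used here as an assumption; the helper names are hypothetical.

```python
from http.cookies import SimpleCookie

# 4096 bytes is a commonly cited per-cookie limit; treat it as an assumption.
MAX_COOKIE_BYTES = 4096


def cookie_size(name: str, value: str) -> int:
    """Return the size in bytes of the name=value pair sent with each request."""
    cookie = SimpleCookie()
    cookie[name] = value
    # OutputString() gives the name=value text (quoted when needed).
    return len(cookie[name].OutputString().encode("utf-8"))


def fits_in_cookie(name: str, value: str) -> bool:
    """Check whether the pair stays within the assumed per-cookie limit."""
    return cookie_size(name, value) <= MAX_COOKIE_BYTES


small = cookie_size("theme", "dark")
too_big = fits_in_cookie("draft", "x" * 5000)
```

Because the browser returns the cookie on every request to the site, including requests for images and scripts, even a cookie well under the limit is multiplied many times per page view.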
Types of Session State Data
Session state is information about the user and their experience with the system. There are a handful of types of information that fall into session state. The first two are authentication information (who the user is) and session identification (unique identifier). These form the basis for identifying who the user is both when they're logged in and when they're anonymous. With that foundation, you can maintain profile information about the user, which is typically persisted somewhere, and user entered data, which may or may not be persisted.
The final kind of information that is typically kept in session state is user cache information. That is information about the user which is just cached and can be regenerated if necessary. It's only being held as an optimization to prevent load on the system. Let's look at each of these kinds of information in turn.
Depending upon whether you're using HTTP authentication or a forms based authentication, the authentication problem may be easy or difficult. When you use HTTP authentication for your site every request to the server includes the information necessary to authenticate the user. That is to say that their username and password are transmitted with each request. Using Kerberos for HTTP authentication is actually slightly different from this, as we'll get to in a moment. Using HTTP authentication for a web site means that either the authentication information sent to the server must be inherently encrypted and unbreakable, or the transport should always be encrypted.
This is why basic authentication (which is unencrypted) should only be used over HTTPS. Similarly, the NTLM authentication that most systems use should only be used over HTTPS. Although NTLM authentication is encrypted, the encryption mechanisms were designed decades ago and aren't really strong enough to stand against the brute force attacks that can be leveled against them by today's computing power. Kerberos uses a slightly different approach.
Kerberos based authentication doesn't actually contain the user's credentials. Instead it's a cryptographic package that includes a server's certification that the user is who they say they are. In other words, the username appears but instead of the user's password, there's a note from another server that they are who they say they are and for how long the web server servicing the request should trust that they are who the package says they are. This completely eliminates the need to retransmit the user's password over and over again which limits the extent of the exposure should the package be intercepted.
This is essentially the technique that is used in forms authentication. When the user authenticates via a form, ASP.NET will create an encrypted token that, when read back in, will confirm that the user is a certain identity. Rather than containing the user's password, the package will have been encrypted by the server, so if the server decrypts the package successfully it knows that it was the one who created it and thus it's not been tampered with.
The difference between Kerberos and forms based authentication is that Kerberos is transmitted as a part of the authentication header, whereas the forms based authentication is transmitted down to the client as a cookie which is returned to the server by the browser as a part of the cookies. Kerberos also relies on a third party server for the certification of who the user is, where forms based authentication relies on itself having previously validated the user's credentials.
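The idea of a server-issued ticket that stands in for the password can be sketched as follows. This is not how ASP.NET builds its token: the sketch signs the ticket with HMAC rather than encrypting it, so the contents are readable but cannot be altered undetected. `SERVER_KEY`, `issue_ticket`, and `read_ticket` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical server secret; only the server that issues tickets knows it.
SERVER_KEY = b"example-ticket-key"


def issue_ticket(username: str, lifetime_seconds: int, now: Optional[float] = None) -> str:
    """Create a signed ticket naming the user and when the ticket expires."""
    now = time.time() if now is None else now
    payload = json.dumps({"user": username, "expires": now + lifetime_seconds})
    body = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    mac = hmac.new(SERVER_KEY, body.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{body}.{mac}"


def read_ticket(ticket: str, now: Optional[float] = None) -> Optional[str]:
    """Return the username if the ticket is untampered and unexpired, else None."""
    now = time.time() if now is None else now
    body, sep, mac = ticket.partition(".")
    expected = hmac.new(SERVER_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not sep or not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return None  # missing or mismatched signature: not issued by this server
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["user"] if claims["expires"] > now else None
```

The password never appears in the ticket, and the expiry limits how long an intercepted ticket remains useful.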
In some cases, anonymous access is needed for the site and yet it's useful to keep track of some of the user's settings. In this case it's not authentication that's being transmitted but the tamper resistant session key. This key is designed to allow the system to determine which session is in use and is designed to be tamper resistant so a user couldn't just change the cookie and snoop in on someone else's session.
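One common way to make a session key hard to guess is simply to make it long and random. The sketch below is an assumption about the general approach, not a description of ASP.NET internals; the `sessions` dictionary and both functions are hypothetical.

```python
import secrets

# In-process store mapping session ids to per-session data (hypothetical).
sessions = {}


def create_session() -> str:
    """Create a session under a long random id that is impractical to guess."""
    session_id = secrets.token_urlsafe(32)   # 32 random bytes as URL-safe text
    sessions[session_id] = {}
    return session_id


def get_session(session_id: str):
    """Return the session's data, or None for unknown (guessed or altered) ids."""
    return sessions.get(session_id)


sid = create_session()
```

A user who edits the cookie ends up with an id that matches no session, rather than landing in someone else's.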
ASP.NET does an effective job of managing the session ids internally, so this isn't generally something that an ASP.NET developer needs to worry about.
User Profile Data
One could argue whether user profile data is a part of session data or not. Most systems that you'll integrate with will have a way of storing the user profile data for a user. There's no question that this kind of data needs to be persisted somehow because user profile information is expected to remain around for a long time. Things like a name, email address, and other information are just a part of knowing who the user is.
User Entered Data
User entered data, like data on page one of a multi-page form, is data that's necessary for a small amount of time and then is no longer needed. The biggest issue here is remembering to clear the values when they are no longer needed, such as when the form is complete. Failure to clear these values out can mean that the amount of data being carried around in the user's session gets larger and larger until it becomes unwieldy.
Depending upon the sensitivity and volume of the information being entered, this kind of information can be a good candidate for pushing to the client. If pushed via additional hidden fields on the form, the process is relatively straightforward when filling out a multi-page form because the transitions will be HTTP posts. However, this can get complex if the user is allowed to go to pages in different orders. Writing these values to cookies is an acceptable workaround provided that the information isn't too sensitive to be stored locally on the user's computer. Sensitive data can still be pushed to the user's machine but only if encryption is managed as well.
The largest area of data that ends up in session state tends to be user cache information. This information can be regenerated if necessary. For instance, you could cache user specific promotions, pricing, or menu options. These items could be regenerated if necessary, but it's more efficient to hold on to the assembled form for use on the next request. Because it can be regenerated, this kind of data doesn't need to be protected at all from a server failure.
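One way to make the cleanup hard to forget is to keep every page of the form under a single session key and remove that key in the same step that completes the form. The sketch below uses a plain dictionary as a stand-in for the session; `FORM_KEY`, `save_page`, and `complete_form` are hypothetical.

```python
# A plain dict stands in for the real session object (hypothetical).
session = {"user_cache": {"menu": ["Home", "Orders"]}}

FORM_KEY = "signup_form"   # all data for the multi-page form lives under one key


def save_page(session: dict, page: int, fields: dict) -> None:
    """Store one page of the multi-page form inside the session."""
    session.setdefault(FORM_KEY, {})[page] = fields


def complete_form(session: dict) -> dict:
    """Merge all pages into one record and remove them from the session."""
    pages = session.pop(FORM_KEY, {})   # pop clears the data as it is read
    record = {}
    for page in sorted(pages):
        record.update(pages[page])
    return record


save_page(session, 1, {"name": "Pat"})
save_page(session, 2, {"email": "pat@example.com"})
record = complete_form(session)
```

After `complete_form` runs, the form data is gone from the session while unrelated entries are left alone.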
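The defining property of user cache data is that losing it costs time, not information. A minimal sketch of that pattern, with a dictionary standing in for the session store and `build_menu` standing in for an expensive operation (both hypothetical), looks like this:

```python
# A dict stands in for whichever session store is in use (hypothetical).
user_cache = {}
build_count = 0


def build_menu(user_id: str) -> list:
    """Stand-in for an expensive operation such as several database queries."""
    global build_count
    build_count += 1
    return ["Home", "Orders", f"Account ({user_id})"]


def get_menu(user_id: str) -> list:
    """Return the cached menu, rebuilding it only when it is missing."""
    if user_id not in user_cache:
        user_cache[user_id] = build_menu(user_id)
    return user_cache[user_id]


first = get_menu("u1")
second = get_menu("u1")      # served from the cache, no rebuild
user_cache.clear()           # simulate a server failure or recycle
third = get_menu("u1")       # rebuilt on demand with the same result
```

The expensive work runs only when the cached copy is missing, so a lost cache shows up as one slower request rather than as lost data.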
Making Decisions about Session Data
With all of the different options for doing session state and the different kinds of data that might be stored in session state how do you make an informed decision on where to store session information? The short answer is that you'll probably be leveraging several different kinds of solutions. For authentication and session information you'll likely transmit the information to the client in the form of a cookie and expect it back each time. For data like user cache you'll want to use an out of process memory cache for volume -- but smaller volumes of user cache data might have you leaving this in the same process memory. User entered data may end up encrypted and transmitted to the user and back or it might end up in a persisted store depending on the volume and sensitivity.
The key to session management is realizing that there isn't a one-size-fits-all answer, particularly when it comes to making the tradeoff between performance and resiliency. Some organizations may take a look at the impact of a server failure and the loss of session data and decide that it's acceptable to lose session data in the event of a server failure because they expect that a server won't fail frequently and therefore the overhead of saving session into a central place isn't warranted. Other organizations may believe that no session data can ever be lost because it's too precious.
Session data isn't one size fits all, and the decisions you make in the design of your session state management can have an impact on the flexibility that you have on the infrastructure side and a substantial impact on the performance of your application. If you design with an awareness of the session state management challenges you can always change your strategies to match your performance needs. The next topic in our review of performance improvement techniques is caching, since next to session state management it can have the largest impact on overall system performance.
About the Author
Robert Bogue, MS MVP Microsoft Office SharePoint Server, MCSE, MCSA:Security, etc., has contributed to more than 100 book projects and numerous other publishing projects. Robert's latest book is The SharePoint Shepherd's Guide for End Users. You can find out more about the book at http://www.SharePointShepherd.com. Robert blogs at http://www.thorprojects.com/blog. You can reach Robert at Rob.Bogue@thorprojects.com.