Sunday, December 11, 2016

Understanding Distributable Java EE Applications

Strictly speaking, a Java EE web application is considered distributable as long as the <distributable/> tag is present in its deployment descriptor and in all /META-INF/web-fragment.xml descriptors in any JAR files packaged in its /WEB-INF/lib directory. If a JAR file does not contain a web-fragment.xml file, it is not considered a web fragment, so it doesn’t count. If a single JAR with a web-fragment.xml lacking <distributable/> is present in an application, or if the deployment descriptor is lacking <distributable/>, the application is considered non-distributable.
This doesn’t mean anything without understanding what, precisely, a distributable Java EE application is. The <distributable/> element indicates that the web application was written to deploy to multiple JVMs running on the same host or on different hosts. In almost all cases, this means that all attributes written to HttpSessions are Serializable. Containers are allowed to support non-Serializable session attributes in distributable applications, but they are not required to, and most simple Servlet containers do not. Tomcat, for example, throws an IllegalArgumentException if a <distributable/> application adds a non-Serializable attribute to a session. The point of using Serializable session attributes is that it permits HttpSessions to be shared among servers in a cluster, which is perhaps the most important reason for using <distributable/>. If two or more containers are configured to work in a cluster, they may share HttpSession data across the cluster only for distributable applications.


Understanding HTTP Sessions, Stickiness, and Serialization


Why would you want to have HttpSession data shared across servers in a cluster? The answer to that is quite simple and boils down to the reasons you use a cluster to begin with: scalability and availability. A single server cannot service an infinitely increasing number of users. When a server fails, you want your users transparently shuffled to other servers. Because the majority of applications interact with users using sessions, it’s important for that session data to be shareable across a cluster. This can serve two purposes:

- If a server fails, a user can be sent to a different server and that server will have all the same session data for the user that the failed server had.

- In an ideal world, consecutive user requests can be handled independently of which server receives the request, so a user’s requests can conceivably be handled on different servers every time without losing session information.

For either of these scenarios to work properly, you must make your session attributes Serializable. Because Java objects cannot live beyond the confines of a single Java Virtual Machine, HttpSessions must be serialized before being sent to other servers in the cluster whether over shared memory, a file system, or a network connection. This pattern presents two interesting challenges with solutions for both.
First, sometimes session attributes simply cannot be 100 percent Serializable. This is especially true for legacy applications that are upgraded and refactored. For example, a session attribute may (for whatever reason, however bad) hold on to a database connection or open file handle. These attributes obviously cannot be serialized and shared across the cluster or can they? The javax.servlet.http.HttpSessionActivationListener interface specifies a special type of attribute that knows when it is about to be serialized and sent to other servers in a cluster or has just been deserialized on a server. Any session attribute that implements this interface is notified when it is about to be sent to other servers (via the sessionWillPassivate method) and when it has just been received from another server (via the sessionDidActivate method). In the aforementioned examples, a session attribute could re-open a database connection marked transient in sessionDidActivate.
A more common problem with clustered sessions is performance. For complete server independence to be possible, the server must serialize and share an HttpSession every time you update a session attribute as well as on every request (so that the lastAccessTime property stays up-to-date). This can present a real performance issue in some cases. (Although it still has a net benefit over using only one server.)
For this reason and others, the concept of session stickiness is an important consideration whenever you deploy a distributable application. How session stickiness is configured varies from one container and load balancer to the next, but the concept is the same: Within a single session, all that session’s requests are handled in the same JVM. Sessions are serialized and shared across the cluster only periodically, as often as is practical by the performance standards set by the deployment team.
Sharing sessions less often can cause problems: Imagine having added several items to your shopping cart, and on the next request, the items are no longer in your shopping cart. With sticky sessions, you are always sent to the same server unless your session ends or the server fails, and the problem of session state inconsistency is mitigated. This is not without its downside. Server failures are not predictable, and sessions are rarely left in a consistent state when a failure occurs. When possible, it is always best to maintain session state across the cluster every time a session is updated.

No comments:

Post a Comment