This week we're getting back to an architecture topic.
Caching is one of the most neglected architecture concerns in enterprise applications. Not surprisingly, most people think of it simply as a cross-cutting concern, one you don't need to consider until late in the cycle because it's not really driven by business needs, drivers and strategy, but is rather a purely architectural consideration. Caching can, however, be a huge contributor to success on a project.
The main architectural drivers behind caching are two quality attributes: performance and scalability. How much attention you pay to these quality attributes will in large part depend on the architecture method you're applying as well as the maturity of your architecture organization. In organizations that have a more mature architecture method, quality attributes are a core part of both the architecture development and the architecture evaluation processes (see ATAM). Those organizations follow a process designed to catch the risks of failing to satisfy these quality attributes early and mitigate them accordingly. This is where caching comes into play.
Caching is one of the general approaches (or solutions) to mitigate the risks of poor performance or scalability. But caching is a very broad term. What follows is an overview of the main architecture considerations around creating a technical solution for a cache in an enterprise application.
First of all, we can't always address both performance and scalability effectively. So it's important to separate these two quality attributes and clearly identify them, or rather clearly identify the risks associated with them. The meaning of these two quality attributes is well known, but it doesn't hurt to remind yourself. Performance is about response time, or how fast your application can process requests. Scalability is about the ability of the application to grow in terms of key resources (such as users, concurrent requests, data) while maintaining satisfactory performance.
Some caching solutions will address performance, some scalability, and some both. Keep that in mind and concentrate on the quality attributes that present the higher probability/impact risk in your application.
There are different types of caching. Most types are available on all technology stacks. Most types are equally well aligned with all architectural styles. So you have a few to choose from, no matter what technology you're working on.
By far the predominant type of caching in enterprise applications today is distributed caching. Distributed caching involves storing the cached data on a tier separate from your application tier. This offers several advantages: you're offloading the storage of the cached data to a separate tier, leaving more resources in the application tier for your application; and you're allowing the cache tier to scale out independently of the application tier, maximizing scalability while minimizing cost.
When talking about cache organization, two main types are in use: replicated and partitioned. Each has a very specific purpose. A replicated cache keeps copies of the same data on multiple servers, while a partitioned cache spreads the data across the cluster so that each entry lives on one server. Replicated makes more sense when you need to cache data that is mostly static and used frequently, and you want it cached very near the place where it's used, for instance in a process on the same server. On the other hand, most business data falls into the category of semi-volatile data that changes on occasion and is accessed on occasion. This type of data fits best in a partitioned distributed cache, where data is stored on a separate tier, equally accessible to all servers in the application tier. And then there are hybrid approaches, or multi-tier caching, where data can move from one cache tier to the next depending on its use.
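To make the difference between the two organizations concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: plain in-process dictionaries stand in for cache servers, and the class names ReplicatedCache and PartitionedCache are made up for this post, not taken from any product.

```python
import hashlib


class ReplicatedCache:
    """Every node holds a full copy: writes go to all nodes, reads stay local."""

    def __init__(self, node_count):
        # Each dict is a stand-in for one cache server (an assumption of this sketch).
        self.nodes = [dict() for _ in range(node_count)]

    def put(self, key, value):
        for node in self.nodes:
            node[key] = value

    def get(self, key, local_node=0):
        # Any node can answer, so the caller reads from the one nearest to it.
        return self.nodes[local_node].get(key)


class PartitionedCache:
    """Each key lives on exactly one node, chosen by hashing the key."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]

    def _owner(self, key):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)
```

Notice the trade-off: the replicated version pays for every write on every node but can serve any read locally, while the partitioned version stores each entry once, so total capacity grows as you add nodes.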
What are some of the solutions you should consider? The answer depends on the technology stack. Some stacks have solutions that fit naturally, for instance Oracle Coherence on the Java stack. If you're looking for commercial off-the-shelf (COTS) solutions, Microsoft is coming out with its own distributed cache server called Velocity. By far my favorite distributed cache solution is based on the open-source tool called memcached.
Memcached is an ultra-fast distributed hash implementation. It works with streams of bytes and is accessed over a TCP-based protocol. It's implemented on both Unix-like OSes and Windows and is very commonly used to implement a general-purpose distributed cache. Here's how a memcached-based distributed cache works: it's based on a two-level hash. All servers in the cache tier are organized into the first-level hash, commonly a consistent hash, which decides which server owns a given key. All data within one memcached server is organized into the second-level hash. This way the entire cache cluster behaves like one big distributed hash. Communication is TCP-based and very fast, and it scales almost linearly (up to the point, of course, where network resources become the bottleneck). It's not perfect, though. It's missing some very important features that other COTS distributed cache solutions provide. For instance, there's no inherent locking. While this promotes good performance, it also presents a challenge when working with scaled-out application clusters. Some type of locking logic typically needs to be implemented on top (at the application layer). Another disadvantage is that there is no built-in failover. When a memcached server goes down, the entire chunk of cache it held is lost. There are known techniques to introduce backup capabilities to memcached by doubling the cache space; this can solve the failover issue, but it does require more resources.
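The two-level hash described above can be sketched in a few lines of Python. This is a toy model, not memcached's or any client library's actual code: the class name ConsistentHashCache, the server names and the points_per_server parameter are assumptions made for the illustration, and ordinary dictionaries play the role of the memcached servers.

```python
import bisect
import hashlib


def _hash(value):
    # MD5 is used here only to spread keys evenly, not for security.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashCache:
    """First level: a consistent-hash ring picks the server.
    Second level: a per-server dict (standing in for one memcached server)."""

    def __init__(self, server_names, points_per_server=100):
        self.stores = {name: {} for name in server_names}
        self.ring = []  # sorted list of (ring position, server name)
        for name in server_names:
            for i in range(points_per_server):
                bisect.insort(self.ring, (_hash("%s#%d" % (name, i)), name))

    def server_for(self, key):
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect(self.ring, (_hash(key), ""))
        if idx == len(self.ring):
            idx = 0  # wrap around the ring
        return self.ring[idx][1]

    def set(self, key, value):
        self.stores[self.server_for(key)][key] = value

    def get(self, key):
        return self.stores[self.server_for(key)].get(key)

    def remove_server(self, name):
        # Simulates a server going down: its share of the cache is simply lost.
        self.ring = [point for point in self.ring if point[1] != name]
        del self.stores[name]
```

The reason a consistent hash is the common choice for the first level shows up in remove_server: only the keys owned by the failed server become cache misses, while every other key still maps to the same server as before. With a plain "hash modulo number of servers" scheme, most keys would move and the whole cache would effectively be invalidated.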
So, how do you use something like memcached? Again, it depends on the technology stack. Typically, some support for caching would be implemented as a cross-cutting layer in your application: a facility that allows your application to use the cache both for business objects and for supporting data. If we're talking about application architecture, more specifically layered architecture, one would usually provide some kind of caching at the entity or data object level. Retrieving objects from the distributed cache rather than from the database can reduce the load on your database by an order of magnitude. It may or may not also improve performance, depending on how well your application performed to begin with.
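As a rough sketch of what caching at the entity level can look like, here is a small Python example of a repository that checks the cache before going to the database. Everything in it is hypothetical: InMemoryCache is a stand-in for a real distributed cache client, and CustomerRepository and the load_from_db callback are names invented for this illustration.

```python
class InMemoryCache:
    """Stand-in for a distributed cache client exposing get/set/delete."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class CustomerRepository:
    """Data-access object that consults the cache before the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache
        self.load_from_db = load_from_db  # hypothetical function: id -> entity or None

    def find(self, customer_id):
        key = "customer:%s" % customer_id
        entity = self.cache.get(key)
        if entity is None:
            # Cache miss: go to the database, then remember the result.
            entity = self.load_from_db(customer_id)
            if entity is not None:
                self.cache.set(key, entity)
        return entity

    def invalidate(self, customer_id):
        # Call after an update so the next read reloads fresh data.
        self.cache.delete("customer:%s" % customer_id)
```

With this in place, repeated reads of the same entity hit the database only once until the entry is invalidated, which is where the reduction in database load comes from. The rest of the application keeps calling the repository and never needs to know the cache exists.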
This was just a brief overview of some of the architectural considerations involved in creating a cache solution. Which product or solution you apply is going to depend on many factors: organizational policies, cost, architectural alignment and so on.