Sunday, October 24, 2010

Enterprise-Scale Application Architecture Course

The Enterprise-Scale Application Architecture Course from Pragmatic Skills is available for purchase. You can find the detailed syllabus for the course and a link to purchase at http://pragmaticskills.com/Training/Courses/CSA201EnterpriseScaleApplicationArchitecture.aspx.

The course code is CSA201. The course is divided into four parts. Part one is an overview of the application architecture practice. Part two covers architecture patterns. Part three is about designing for scale. Part four is a demonstration on a teaching software system.
New dates for 2010 for the live/interactive version, delivered over the GoToMeeting online service, have been posted. Here's the breakdown:

Date 1
November 6 1:30PM to 5:30PM Part 1 and Part 2
November 7 1:30PM to 5:30PM Part 3 and Part 4

Date 2

November 16 6:00PM to 8:00PM Part 1
November 17 6:00PM to 8:00PM Part 2
November 18 6:00PM to 8:00PM Part 3
November 19 6:00PM to 8:00PM Part 4


Date 3

December 4 1:30PM to 5:30PM Part 1 and Part 2
December 5 1:30PM to 5:30PM Part 3 and Part 4

You can also purchase a pre-recorded version for a lower price.

Make sure to register on the website to be able to purchase.

Saturday, September 11, 2010

Pragmatic Skills Software Architecture Training Center

Pragmatic Skills is an online training center for software architecture and software development.
Visit the Pragmatic Skills website at http://pragmaticskills.com to find out more about the webinars and courses they offer.
Their main subject areas are software development, software architecture, agile & lean, and technology. Training is delivered online, so there's no need to travel, and prices are very affordable. All webinars and courses can be purchased pre-recorded or attended live through GoToMeeting.

Thursday, June 24, 2010

Simple Economics of Agile Planning

Professor Philippe Kruchten, in his teaching, gives what is to me the most brilliant explanation of the simple economics of agile planning I've ever run into. If you haven't had a chance to hear him speak, I strongly suggest doing so.

I'll try to briefly go over the principles, trying not to do him too much injustice.

The Four Elements

Professor Kruchten starts from a simple postulate that everything in agile can be explained with four categories of work (hence my analogy to four elements of nature):

  1. Feature
  2. Issue
  3. Architecture/Infrastructure
  4. Technical Debt

A feature is the type of work that provides customer value in the sense of delivering new functionality. A feature is something that the technical product manager would define, based on interviews with clients and other stakeholders. It is arguably the most important type of task.

An issue is something that is introduced during development and represents a failure in the delivered functionality that needs to be corrected.

Architecture/infrastructure type of work typically represents core framework or plumbing improvements, not directly visible to the customer, but still in support of the features driving the agile project.

Technical debt, probably the most neglected type of work in agile, represents the work that needs to be done to fix the problems caused by taking shortcuts in technical design, shortcuts that eventually prevent us from implementing new features.

Value vs. Cost

So far we have established a base framework for observing the metrics of the agile project without really looking at any measures.

The key measures that come into play are value and cost.

Cost is a little easier to define. Cost represents the measure of how much effort is required to complete a particular task.

Value, on the other hand, is significantly trickier to define. How we determine the value depends on the type of task we're considering.

A feature-type task has positive value, typically determined by how much the feature delivers to the customer. It can often be measured quite precisely, considering the indirect gains a feature may produce for the customer, such as monetary gain, strategic gain etc.

An issue is said to have negative value: delivering an issue lowers the value of the delivered features. It can be measured similarly to feature value, but in terms of how much gain it indirectly prevents.

Architecture/infrastructure tasks also have value, but that value is typically driven by the value of the features they enable. Often these tasks are imposed by technical and non-technical requirements and therefore cannot be avoided.

The value of technical debt I find the most interesting of all. We already said that technical debt is caused by taking shortcuts in technical design: delivering features without building out the right frameworks in support of extensibility and flexibility, thus making it harder to add new value down the road. This is why it is referred to as debt. By taking shortcuts we borrow against future work, and we need to pay the debt back before we can deliver that future value. But it has another interesting characteristic: it grows with interest, another thing that makes it similar to financial debt. It grows because the more features we implement on the legacy architecture, the harder it is to refactor, and therefore the more it will cost to pay off. Its cost, then, grows over time.

Features, on the other hand, can depreciate in value over time. Postponing a feature for a later release cycle may lower its value to the customer. This is where we see the dynamic nature of simple agile economics: value in the backlog can change, existing tasks can depreciate and new value can be added, while at the same time costs change too, as the cost of technical debt grows.

The Art of Planning

Given what we said so far about the model of economics of agile projects, how does one approach agile planning? Typically from the value/cost perspective.

In each sprint you have a certain amount of time and resources (people, really). The goal is to fit as much value as possible in that box, while making sure the cost can fit. This is where you need to find the right balance between all four types of tasks, making sure to balance out the needs of stakeholders such as technical product managers, support, developers and architects. Each group will try to inflate the value of the tasks they see as having high importance, so being realistic and determining the true value of each task is key.
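
To make the balancing act concrete, here is a minimal sketch of value/cost-driven sprint selection: a greedy fill by value-to-cost ratio until capacity runs out. This is my illustration, not part of Professor Kruchten's material, and the types and numbers are made up:

using System.Collections.Generic;
using System.Linq;

public class BacklogItem
{
  public BacklogItem(string name, double value, double cost)
  {
    Name = name;
    Value = value;
    Cost = cost;
  }

  public string Name { get; private set; }
  public double Value { get; private set; }  // estimated value delivered
  public double Cost { get; private set; }   // estimated effort; assumed > 0
}

public static class SprintPlanner
{
  // Greedily fill the sprint by value/cost ratio until capacity is used up.
  // A real plan must also weigh future value (e.g. paying back technical
  // debt), which a single-sprint greedy pass cannot see.
  public static IList<BacklogItem> Plan(IEnumerable<BacklogItem> backlog, double capacity)
  {
    var sprint = new List<BacklogItem>();

    foreach (var item in backlog.OrderByDescending(i => i.Value / i.Cost))
    {
      if (item.Cost <= capacity)
      {
        sprint.Add(item);
        capacity -= item.Cost;
      }
    }

    return sprint;
  }
}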

Another very important thing is to keep an eye on the future; don't just focus on the current sprint. Delivering less value in the current sprint, trading it off for paying back some technical debt, may enable the delivery of far more value in the next couple of sprints, thereby maximizing the total value delivered over several sprints.

That's it. It's as simple as that!

Sunday, May 9, 2010

Construction Interfaces - Why We Need Them

Most object-oriented languages have a concept of an interface: the contract for a type, usually a class, that defines its public operations, i.e. behavior.

But not all of them have the ability to define a construction interface. What do I mean by a construction interface? A construction interface is a contract that specifies how an object can be constructed.

Object construction is as essential to the object-oriented paradigm as are other tenets such as polymorphism or inheritance. Object construction defines constraints on an object-oriented model that better describe the objects involved. It adds semantics to the object life-cycle. For instance, who can create some object and how?

Take C#, for example. How would you express the following in C#: object Car can only be constructed with a collection of CarParts? It sounds reasonable, doesn't it? Why wouldn't I be able to say that some object can only be constructed out of some other object? The closest I can get to it in C# is to define something like this:

public interface ICarFactory {
    ICar Create( IEnumerable< ICarPart > parts );
}
So, I would define a factory object for a Car and that factory object is responsible for constructing my Car objects, and it has a strict interface that defines that a Car object can only be constructed out of CarParts. OK, so far so good. It seems like there's a solution. It's not great, though. What if I don't know about the CarFactory, what if someone passes me the Car object and I have no idea how it was constructed? How can I be sure it was constructed out of CarParts? The factory interface doesn't really put any constraints on the implementation of the Car, only the implementation of the Car factory. The only thing restricting the construction of the implementation of Car is its class constructor, but there's no interface for it, is there?

The construction interface is essential to being able to define object dependencies in a type strict manner. Say object A depends on object B, object B on object C. There's no way to define the constraint that the dependency chain needs to be satisfied on object construction.

Syntax for defining a construction interface could be something like this:

public interface ICar {
    Constructor( IEnumerable< ICarPart > parts );
}

This would require every type that implements ICar to have a public constructor that takes a collection of car parts. No need for a factory anymore. Furthermore, I could use the ICar interface as a generic type parameter constraint for a generic method or class and call that constructor from it, because it's guaranteed to exist.
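
For comparison, here is roughly how far you can get in today's C#: the new() generic constraint covers only parameterless constructors, so a factory delegate has to stand in for the missing construction contract. This is a sketch of the workaround, not of the proposed feature:

using System;
using System.Collections.Generic;

public interface ICarPart { }
public interface ICar { }

// A Car that honors the desired construction contract only by convention;
// nothing in the language enforces it.
public class Car : ICar
{
  public Car(IEnumerable<ICarPart> parts) { /* assemble the car */ }
}

public static class Garage
{
  // The factory delegate stands in for the missing construction interface:
  // the caller has to prove it knows how to build a TCar out of car parts.
  public static TCar Build<TCar>(
    IEnumerable<ICarPart> parts,
    Func<IEnumerable<ICarPart>, TCar> constructor) where TCar : ICar
  {
    return constructor(parts);
  }
}

// Usage: ICar car = Garage.Build(parts, p => new Car(p));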

Now, if only the guys in Redmond could read this.

Saturday, April 17, 2010

Overloads On Interfaces = Code Vomit

Whenever I see overloaded methods on an interface I puke in my mouth a little bit. Imagine seeing something like this:

public interface IPlayer {
  void Play(IMusic music);
  void Play(IMusic music, int volume);
}

This code declares an interface for a player that supports playing music at a given volume, or at some default volume level. So, why is that on the interface? Why should the fact that there is such a thing as a "default" value for volume pollute an otherwise simple interface for a player?

When overloads are used to hide default values for parameters, they have no business being on an interface! Choose the most elaborate way to perform an action and make that the designated method. Leave only that designated method on the interface and take out all the others. Implement default values outside of it, as shown in the sketch below. Think about it: why would every implementer of that interface have to supply exactly the same default value? And wouldn't it be confusing if they each supplied a different one? And how is it the responsibility of a player to know what the default value for volume is? It doesn't make any sense.
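
Here is a sketch of what that looks like: the designated method stays on the interface and an extension method supplies the convenience overload (the default volume value is of course illustrative):

public interface IMusic { }

public interface IPlayer {
  // the single, most elaborate form is the designated method
  void Play(IMusic music, int volume);
}

public static class PlayerExtensions {
  private const int DefaultVolume = 50; // illustrative; owned by the caller side, not the player

  // the convenience overload lives outside the interface
  public static void Play(this IPlayer player, IMusic music) {
    player.Play(music, DefaultVolume);
  }
}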

Now, overloads are also used for actually overloading functionality, meaning a method that can be performed on completely different inputs with different logic. Take this for example:

public interface IPlayer {
  void Play(IMusic music);
  void Play(IVideo video);
}

Make sense? NO! Interface segregation is one of the principles of SOLID design. Split that interface into two, one for each type of input. In fact, better make it a generic one since there's nothing else on it:

public interface IPlayer<T> {
  void Play(T t);
}

You can still implement IPlayer<IMusic> and IPlayer<IVideo> in one class if you wish. But keep single responsibility in mind: one object should be responsible for one thing only, and playing music and playing videos are two different things.

Sunday, March 21, 2010

Technology Review: Architecture Analysis in Visual Studio 2010

I recently had a chance to review Visual Studio 2010 RC. I was mostly interested in new architecture analysis capabilities that come in the form of Architecture Explorer and dependency graph support.

I started by downloading the Visual Studio 2010 RC virtual machine, available from Microsoft. Note that the unpacked archive is around 30GB and that the VM needs at least 1.5GB of memory to run smoothly, so keep that in mind before giving it a try.

The download also comes with a couple of hands-on labs. I was particularly interested in the ones that cover dependency graph generation and architecture explorer.

The starting point for using dependency graphs is the generator, available from the Architecture menu. You can generate graphs by assembly, namespace, type or method, as well as specify visibility options. The generated graph is in DGML form (Directed Graph Markup Language).

While I wasn't too crazy about the graphics of the generated diagrams (they could use some polish), I was quite pleasantly surprised by the features of the DGML viewer. I found the different diagram layouts (top-down, bottom-up, butterfly, clustering) particularly useful. Clustering organizes the nodes around the nodes with the highest number of adjacent edges, which gives you a quick and easy way to identify "significant" or "critical" nodes. Butterfly organizes the nodes around a selected node into ones with incoming and ones with outgoing edges, which is particularly useful for identifying wrong-direction dependencies in the architecture. I also liked the neighborhood mode, where you can start from a node and only traverse up to a selected number of levels deep.

On the other hand, I found hovering over an edge until the two arrows and a plus show up a little clunky. I would have liked to see a mode where you can select two adjacent nodes and it automatically selects the edge that connects them and shows some kind of icon to access the context menu. Still, it proved useful for creating new diagrams by concentrating on the details of connected nodes. The grouping feature in the graph viewer works well, although like I said the graphics could use some polish. It allows you to show or hide detail by expanding and collapsing different groups such as assemblies, classes etc.

The Architecture Explorer feature complements the graph viewer well. It's a tool that allows you to explore the solution and even external assemblies, traversing various relationships between objects: fetching classes, references, files, references to types, building call graphs etc. The end result of the traversal is a path that you can simply plot as a graph in the graph viewer.

Again, my biggest complaint is the user interface. It seems they tried to stuff a gigantic piece of UI into a panel, completely unnecessarily. Furthermore, it features auto-expanding and collapsing vertical areas that are only used for filtering the next node in the traversal; it resembles an accordion. Traversal goes from left to right, with each step pushing the UI further to the left to make room for the next node, with that awkward vertical expand/collapse filter area between each step. In one word: a disaster. Whatever happened to not cluttering the user interface? I would like to have seen a top-down UI, with each entire step collapsible like in MS Outlook and the filter staying in one spot, that I could dock to the left side of the window and show/hide when needed. Left to right just doesn't make any sense.

So, how useful are these features? If you’re doing a lot of static code analysis, somewhat. There have been other tools on the market for a long time, but still it’s nice to see a tool that comes with Visual Studio. It’s by no means any match for something like Resharper, but it can come in handy for quick architecture review. That said, that’s about the extent of its usefulness. I would love to have seen refactoring support built right into the graph viewer. Reversing the direction of dependencies is perhaps one of the most common and most important refactoring operations. It seems like the logical next step for Microsoft.

Tuesday, March 16, 2010

Value Proposition in Distributed Version Control Systems

While distributed version control systems have been around for a while, they haven't really come into focus until recently. It seems they're becoming more and more mainstream, threatening to completely replace centralized version control systems. Are we, then, witnessing a major paradigm shift in source control?

So, what brought this on? What triggered a wide-spread adoption of distributed version control systems? What caused companies to abandon their mature and well-established processes and seek new solutions for managing their codebases?

The answer could be as simple as critical mass. We may very well be at the tipping point. The way it stands today, it's a matter of time before all centralized version control systems become obsolete. Linus Torvalds called the Subversion project "the most pointless project in history" and went on to say that if you're using SVN you're "ugly and stupid". I wouldn't go that far, myself.

In order to understand where the need for a paradigm shift is coming from, we need to look at the value proposition it offers and the key business drivers that led to its adoption.

The main difference between a distributed version control system and a centralized one is the lack of a central repository used by everyone. In fact, in distributed version control systems, each node can have a number of repositories. What that means is that you can put the repository on your laptop, take it on the road, have full source control capabilities locally and very fast, and then synchronize with other repositories at will. You can do check-ins, logs, diffs, all locally, without the need to be connected to a central repo. Another key differentiator is that, due to its distributed nature, you now have the ability to perform non-linear version control right in the local repo. This is a game-changer.
You really need to adapt your mindset to the new way of doing source control. Maintaining feature branches in a central repo is replaced by cloning, pulling/pushing and merging in a distributed version control system.

If we think about the key business drivers behind this shift, we're mostly talking about the fact that larger and larger enterprises are becoming increasingly agile. While agile has been around for a long time, it hasn't really become mainstream in larger enterprises or highly regulated verticals, whether regulated legislatively or by the industry. And for a good reason: a high level of regulation means a rigid organizational process designed to ensure compliance. But even highly regulated industries are changing ever more rapidly, and with that came the need for businesses to respond by adapting their rigid organizational processes. It makes all the sense in the world, then, that centrally managed, strictly governed version control processes are being abandoned in favor of more flexible, agile-promoting, distributed version control systems that offer far less friction and encourage feature-driven development.

So, how does Joe, chief decision-maker (CDM for short) of ACME Corp., decide to launch an initiative to investigate, evaluate, implement or adopt a distributed version control system?

He thinks in terms of short-term vs. long-term strategies. Short-term, Joe would like to somehow marry the two in order to: a. make the transition from centralized to distributed frictionless; and b. get the best of both worlds, as he sees that centralized has worked well for so long and would hate to give up some of the great things it provides, but he would really like all the good stuff distributed has to offer. And Joe would be right: keeping the transition in mind is key for short-term success. Luckily, most distributed version control systems in use offer integration with popular centralized version control systems. Long-term, however, Joe sees the need to completely replace the existing centralized version control system in order to reduce the cost of managing two completely separate systems.

Best practice around piloting a project to adopt a distributed VCS starts from the analysis of your source control workflow. Every company has their own process or workflow around source control. They typically have commonalities, like branching strategies, release management etc., but on the other hand each is tailored to the needs of the organization. So, look at how you do version control at your organization and try to envision how it could be implemented using a distributed VCS.
Now that you've figured that part out, start by defining the transitional process that includes both centralized and distributed version control. There you would typically define how code from the centralized VCS is imported into the distributed VCS: when, where from, which nodes, how often it is pushed back, by whom etc. An example would be: continue to create feature branches in SVN, import them on each node that needs to do check-ins, do local distributed check-ins, and push back to the centralized repo once a day.
Once you have that in place, start to define the lateral processes, i.e. start connecting the nodes more directly, in islands first, without having to push to the central repo. A natural way to group nodes is by teams/products. Try to assign root nodes to each group. While root nodes are against the distributed paradigm, they help manage the complexity of the network of repos while transitioning. What you want to do long-term is let the nodes assume leadership naturally.
Eventually, your network will consist of isolated islands of fully distributed version control, with central repo used for release management and such. Some companies will stop here and continue to use this process. That's perfectly reasonable. Others will see the need for further distribution of central repositories, offloading them to other "significant nodes".

Another thing you need to consider is which distributed VCS to use. You pretty much have two options: Git or Mercurial. These two are the most popular, and for good reason. I'm not going to go into the details of the pros and cons of each; there are plenty of online resources on the topic. I will say that I've worked with both, and while Git is more powerful, it is also harder to adopt and transition to from SVN, especially on Windows. Mercurial, on the other hand, works a little better out of the box but is slightly less flexible, and is also easier to transition to. By all means, evaluate both.

Monday, February 22, 2010

Technology Review: NServiceBus

This week I'll do the first in the Technology Review series. In this article I'll give you an overview of NServiceBus, an open source service bus framework for .Net. The man behind this framework is Udi Dahan, a well-known name in the industry. You can find it at http://www.nservicebus.com/. So let's start with the intro.

1. Introduction to NServiceBus

Like I said, NServiceBus is an open source enterprise service bus implementation in .NET. If you're not familiar with the concepts around enterprise service bus, service bus architecture or service oriented architecture, I'll try to give you a brief introduction here. But I do encourage you to read up on these topics as this will be a very high-level description.

An enterprise service bus is an architecture that allows distributed message exchange between applications or services while at the same time loosely coupling those services, by providing facilities such as message routing, reliability and failover, and by supporting different message exchange patterns such as publish/subscribe, request/response etc.

The backbone of every service bus implementation is the communication/reliability infrastructure, i.e. the queuing infrastructure. NServiceBus is written for MSMQ, the Microsoft queuing service on the Windows Server platform.

So, what do you do with this framework? In essence, you build communicating services on top of it. An example would be services in support of an online store such as ordering, billing, manufacturing, shipping etc.

2. Features

The core features of NServiceBus are the messaging capabilities. These include the publish/subscribe and request/response message exchange patterns. Pub/sub support also allows you to manage subscriptions and persist them in a built-in subscription store, such as a database or MSMQ, or to implement your own. Request/response support allows you to use addressing to route messages to specific recipients and to send responses back to the originators of requests.

It also supports a long-running message exchange pattern called a saga. A saga is essentially a long-running, persisted, message exchange protocol between services. You can think of it as service orchestration.

The messages that are sent over the bus can be implemented as .NET classes or interfaces.

Besides the core features, NServiceBus also allows for high configurability, building and wiring up objects using a dependency injection framework from an XML config file.

Along with the framework come a few utilities to help you get started. These include the generic host process for message handlers and the distributor utility used for load balancing.

3. API

The central interface in the NServiceBus API is IBus. This is your entry point to start messaging over the bus. IBus interface has methods that support all message exchange capabilities:
  • Publish - publishes a message on the bus
  • Send - sends a message to the destination
  • Reply - replies with a message to the sender
  • Subscribe - subscribes to a message
All of the methods on IBus are templated by the message type.

The main interface for message handlers is IHandleMessages<T>, templated by the message type T. IHandleMessages<T> derives from IMessageHandler<T>, which has one method, Handle. Implementers perform message handling in this method.
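
To give a feel for the API, here is a minimal handler sketch based on the interfaces described above. The message types and their contents are illustrative, and exact signatures vary between NServiceBus versions:

using NServiceBus;

// Illustrative message types; messages can be plain classes implementing IMessage.
public class OrderPlaced : IMessage
{
  public string OrderId { get; set; }
}

public class OrderAccepted : IMessage
{
  public string OrderId { get; set; }
}

// NServiceBus discovers handler types by scanning for IHandleMessages<T>
// and injects IBus so the handler can reply to the originator.
public class OrderPlacedHandler : IHandleMessages<OrderPlaced>
{
  public IBus Bus { get; set; } // injected by the framework

  public void Handle(OrderPlaced message)
  {
    // process the order, then reply to the sender
    Bus.Reply(new OrderAccepted { OrderId = message.OrderId });
  }
}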

To start sending messages you need to get an instance of IBus. This is done with dependency injection. Based on the endpoint configuration in the config file, NServiceBus will build up an IBus object and inject it into your endpoint configuration object.

When it comes to configuring endpoints, you have a choice of a few built-in types to derive from; AsA_Publisher, for instance, provides the endpoint configuration for a message publisher. The dependency injection is typically done based on the configuration in the XML config file. There you would configure the MSMQ transport properties, such as the name of the queue and the number of worker threads, as well as, for clients, the mapping of message types to queues.
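
Based on the roles described above, a publisher endpoint configuration can then be as small as this (a sketch; the exact role types and their names vary by NServiceBus version):

using NServiceBus;

// The class body stays empty: the role markers tell the generic host what
// this endpoint does, and the rest comes from the XML config file.
public class EndpointConfig : IConfigureThisEndpoint, AsA_Publisher
{
}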

When it comes to saga support, NServiceBus comes with a Saga base class, templated by the saga data type. The saga data type contains the state of the long-running message exchange orchestration.

4. Usability

This is where I have a slight issue with this framework. While the design is very clean and abstracted, the API takes getting used to. You definitely don't want to give this to a junior developer.

It relies heavily on dependency injection to bootstrap the core objects and configuration, and it is not super-clear what's happening behind the scenes. For instance, you always get the IBus object from DI, and what it does internally is scan the assembly for types that implement certain marker interfaces in order to perform the wire-up.

There's quite a bit of friction, too. For instance, the pub/sub example implements the endpoint configuration for a publisher by inheriting from AsA_Publisher and implementing IConfigureThisEndpoint and ISpecifyMessageHandlerOrdering (the last one being optional, used to specify handler ordering). IConfigureThisEndpoint is a marker, and it isn't very clear what its purpose is until you read the documentation.

Similar concerns go for the IWantToRunAtStartup interface. Implementers of this interface are invoked on startup, if hosted in the utility host process and they are created using DI.

Another interesting thing in the design of the NServiceBus API is the message handler interface IHandleMessages<T>. It defines no operations itself, simply deriving from IMessageHandler<T>, which defines the Handle operation.

The messages-as-interfaces feature is interesting and, I would argue, perhaps unnecessary. It allows you to define the message type as a .NET interface and use the CreateInstance method of the IBus interface to create it. CreateInstance doesn't allow for immutable message interfaces, as message properties are assigned either from an action passed in or after the fact.

The saga support is, again, a little weird. Using the base Saga class creates tight coupling with the framework. I would rather have seen a marker interface and dependency injection providing the base implementation as an object.

5. Overall Rating

I'm not a huge fan of MSMQ. So, just based on that, I can't in good conscience give a super-high rating to this framework. Not that this has anything to do with the quality of NServiceBus, but in general, MSMQ as an enterprise queuing infrastructure is questionable at best.

On one hand, the wide scope of message exchange patterns supported makes it a really good candidate for implementing a generic ESB solution. If you need a framework that can withstand change in your environment, be it the introduction of new services or new messaging patterns, it is well suited. On the other hand, I feel that all the little utilities that are supposed to make hosting solutions easier actually make manageability worse, so this should be a consideration too. Not to mention the issues around OS-level security and MSMQ configuration.

There's somewhat of a learning curve with NServiceBus, especially if you're new to concepts such as dependency injection, reflection, extension methods and generics. From that standpoint it isn't too well suited to a small-scale, quick-and-dirty project.

Here are my scores for NServiceBus:

Features: 7/10
Quality: 8/10
Usability: 5/10

6. Conclusion

I hope this review helps in choosing the right solution for your application. Keep in mind the other considerations you need to make, like cost, risks, technology and architectural alignment etc.

Sunday, February 14, 2010

Using Transactions with .Net/SQL Server

I was recently debugging a piece of .Net code that called into a legacy stored procedure that managed its own transaction. Turns out the .Net code was recently changed to utilize the .Net transaction manager, more specifically TransactionScope. And this is where things started falling apart...

If you're working on the .Net/SQL Server technology stack, you basically have two choices when it comes to transactions: use the .Net transaction manager or use native SQL Server transactions (i.e. T-SQL). Both are valid options, each more suitable in certain scenarios.

Here's how the .Net transaction manager works. In the .Net transaction manager framework, the SQL Server data provider acts as a durable transacted resource manager, which means the transaction manager can do a two-phase commit with it if it needs to. The transaction manager generally has two modes of operation: the so-called LTM, or Lightweight Transaction Manager, and a full-blown MSDTC-based distributed transaction. An LTM-based transaction is essentially a single-resource-manager transaction, which can benefit from the optimization of a single-phase commit. The transaction manager also has a special mode in which it promotes a transaction to a distributed transaction should another durable resource manager get enlisted, and only when needed.

The .Net transaction manager can be used in two ways: explicitly and implicitly. Explicit transaction management is done using the SqlTransaction class; the caller has to create and commit or roll back the transaction. Implicit transaction management is done using the TransactionScope class, which marks a block of code as part of a transaction. Depending on whether there is nesting (i.e. the method was itself called inside a transaction scope) or an ambient transaction was created in some other way, as well as the options passed to the constructor, the transaction scope may create a new transaction in the transaction manager or attach to the existing one. The transaction manager then works with the resource managers for all connections established from the transaction scopes to manage the native resource manager transactions. When more than one resource manager is used in a single scope, the transaction is promoted to a distributed transaction and a full two-phase commit protocol is used. When using a single connection to a database, no distributed transaction is created.
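
Here is what implicit transaction management typically looks like in code. The connection string and stored procedure names are illustrative; note that the single connection keeps the transaction local, per the promotion rules described above:

using System.Data;
using System.Data.SqlClient;
using System.Transactions;

public static class Transfers
{
  public static void Transfer(string connectionString, decimal amount)
  {
    // TransactionScope creates (or joins) the ambient transaction
    using (var scope = new TransactionScope())
    {
      using (var connection = new SqlConnection(connectionString))
      {
        connection.Open(); // the connection enlists in the ambient transaction

        using (var withdraw = new SqlCommand("dbo.Withdraw", connection))
        {
          withdraw.CommandType = CommandType.StoredProcedure;
          withdraw.Parameters.AddWithValue("@Amount", amount);
          withdraw.ExecuteNonQuery();
        }

        using (var deposit = new SqlCommand("dbo.Deposit", connection))
        {
          deposit.CommandType = CommandType.StoredProcedure;
          deposit.Parameters.AddWithValue("@Amount", amount);
          deposit.ExecuteNonQuery();
        }
      }

      // without Complete() the transaction rolls back when the scope is disposed
      scope.Complete();
    }
  }
}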

On the other hand, T-SQL based transaction management is done using BEGIN TRAN and COMMIT and ROLLBACK statements. Very important note here: T-SQL based transaction management applies only to local SQL Server transactions. No real nesting exists between .Net transaction manager and T-SQL based transactions!

So when do you use one and when do you use the other?

Generally, I recommend using implicit transactions with the .Net transaction manager. They provide a lot of flexibility in where the transaction is defined and hide all the complexity of managing a native resource manager transaction. However, there are cases when T-SQL transactions are required. Imagine the same T-SQL code needing to be shared by two different applications written on different technology stacks. Bringing transaction management down to the database is the only way to ensure consistency, especially if one or more of the technology stacks don't support transactions.

A typical scenario in using the TransactionScope would be when implementing a business layer that consumes multiple DAOs from the data access layer to perform an atomic operation. A good example would be a component for processing banking withdrawals and deposits. Placing the transaction scope where the data access operations are invoked makes it clear what the purpose of that scope is, while at the same time allowing for the possibility that the whole business layer operation is composed into a larger transaction scope in another business component.

Distributed transactions should be avoided. The performance and reliability implications simply negate the benefits. Not to mention there's a dependency on MSDTC middleware.

Microsoft does not recommend combining T-SQL transaction management with .Net transaction management as it may lead to inconsistent results. One of the common problems is that a sproc written with a T-SQL transaction performs a ROLLBACK, effectively setting @@TRANCOUNT to 0, followed by a transaction scope attempting to perform another rollback and failing. Another typical problem would be a transaction in a sproc doing a commit while the transaction scope performs a rollback due to an application level error.

If you're going to write a sproc in T-SQL that manages its own transaction, here's a good way to do it:

DECLARE @trancount INT;

BEGIN TRY

  SET @trancount = @@TRANCOUNT;

  -- Only start our own transaction if one isn't already active
  IF @trancount = 0
    BEGIN TRANSACTION;

  -- Perform some work

  IF @trancount = 0
    COMMIT TRANSACTION;

END TRY
BEGIN CATCH

  IF XACT_STATE() <> 0 AND @trancount = 0
    ROLLBACK TRANSACTION;

  -- Re-raise the original error to the caller (RAISERROR needs arguments)
  DECLARE @ErrorMessage NVARCHAR(4000), @ErrorSeverity INT, @ErrorState INT;
  SELECT @ErrorMessage = ERROR_MESSAGE(),
         @ErrorSeverity = ERROR_SEVERITY(),
         @ErrorState = ERROR_STATE();
  RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);

END CATCH
 
The previous T-SQL code will only start and commit/rollback its transaction if upon entering it the @@TRANCOUNT was equal to 0. If you call this sproc from .Net inside a transaction scope @@TRANCOUNT will be 1 and the sproc will let the .Net transaction manager handle the transaction. If you call it directly without an outer transaction it will manage its own transaction. Use this method only if you absolutely have to write T-SQL transactions. Otherwise, stick to .Net.

Saturday, February 6, 2010

Distributed Caching in Enterprise Applications

This week we're getting back to an architecture topic.

Caching is one of the most neglected architectural concerns in enterprise applications. Not surprisingly, most people think of it simply as a cross-cutting concern, one not to be considered until late in the cycle, as it's not really driven by business needs, drivers and strategy, but is rather a purely architectural consideration. Caching can, however, be a huge contributor to the success of a project.

The main architectural drivers behind caching are the quality attributes of performance and scalability. How much attention you pay to these quality attributes will in large part depend on the architecture method you're applying, as well as on the maturity of your architecture organization. In organizations with a more mature architecture method, quality attributes are a core part of both the architecture development and the architecture evaluation processes (see ATAM). Needless to say, those organizations follow a process that ensures the risks of failing to satisfy these quality attributes are caught early and mitigated accordingly. This is where caching comes into play.

Caching is one of the general approaches (or solutions) to mitigate the risks of poor performance or scalability. But caching is a very broad term. Following is an overview of main architecture considerations around creating a technical solution for a cache in an enterprise application.

First of all, we can't always address both performance and scalability effectively, so it's important to separate these two quality attributes and clearly identify them, or rather clearly identify the risks associated with them. The meaning of these two quality attributes is well known, but it doesn't hurt to remind yourself. Performance is about response time: how fast your application can process requests. Scalability is about the ability of the application to grow in terms of key resources (such as users, concurrent requests, data) while maintaining satisfactory performance.
Some caching solutions will address performance, some scalability, and some both. Keep that in mind, and concentrate on the quality attributes that present the higher probability/impact risk in your application.

There are different types of caching. Most types are available on all technology stacks. Most types are equally well aligned with all architectural styles. So you have a few to choose from, no matter what technology you're working on.

By far the predominant type of caching in enterprise applications today is distributed caching. Distributed caching involves storing the cached data on a tier separate from your application tier. This offers several advantages: you're offloading the storage of the cached data to a separate tier, leaving more resources in the application tier for your application, and you're allowing for scale-out approaches in the cache tier independent of the application tier, maximizing scalability while minimizing cost.

When talking about cache organizations, two main types are in use: replicated and partitioned. Both have a very specific purpose. A replicated cache causes copies of the data to exist on multiple servers, while a partitioned cache spreads the data across the cluster. Replicated makes more sense when you need to cache data that is mostly static and used frequently, and you want it cached very near the place where it's used, for instance in a process on the same server. Most business data, on the other hand, falls into the category of semi-volatile data that changes on occasion and is accessed on occasion. This type of data fits best in a partitioned distributed cache, where data is stored on a separate tier, equally accessible to all servers in the application tier. And then there are hybrid approaches, or multi-tier caching, where data can move from one cache tier to the next depending on its use.

What are some of the solutions you should consider? The answer depends on the technology stack. Some stacks have solutions that fit naturally, for instance Oracle Coherence on the Java stack. If you're looking for commercial off-the-shelf solutions, Microsoft is coming out with its own distributed cache server, called Velocity. By far my favorite distributed cache solution is based on the open-source tool called memcached.

Memcached is an ultra-fast distributed hash implementation. It works with streams of bytes and is accessed over a TCP-based protocol. It's implemented on both Unix-like OSes and Windows and is very commonly used to implement a general-purpose distributed cache.

Here's how a memcached-based distributed cache works: it's based on a two-level hash. All servers in the cache tier are organized into the first-level hash, commonly a consistent hash. All data within one memcached server is organized into the second-level hash. This way the entire cache cluster behaves like one big distributed hash. Communication is TCP-based, so it's ultra-fast, and it scales almost linearly (up to the point, of course, where network resources become the bottleneck).

It's not perfect, though. It's missing some very important features that other COTS distributed cache solutions provide. For instance, there's no inherent locking. While this promotes good performance, it also presents a challenge when working with scaled-out application clusters; some type of locking logic typically needs to be implemented on top, at the application layer. Another disadvantage is that there is no built-in failover: when a memcached server goes down, that entire chunk of the cache is invalidated. There are known techniques for introducing backup capabilities to memcached by doubling the cache space, which can solve the failover issue, but it does require more resources.
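
To make the two-level hash concrete, here is a minimal sketch of the first level, the client-side server selection. The addresses are illustrative, and plain modulo is shown for brevity; real memcached clients use consistent hashing so that adding or removing a server only remaps a fraction of the keys:

using System;

public static class CacheCluster
{
  // Illustrative addresses; 11211 is the default memcached port.
  private static readonly string[] Servers =
  {
    "10.0.0.1:11211",
    "10.0.0.2:11211",
    "10.0.0.3:11211"
  };

  // First-level hash: map a key to a server. The second level is the hash
  // table inside the chosen memcached process itself.
  public static string PickServer(string key)
  {
    uint hash = (uint)key.GetHashCode();
    return Servers[(int)(hash % (uint)Servers.Length)];
  }
}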

So, how do you use something like memcached? Again, it depends on the technology stack. Typically, some support for caching would be implemented as a cross-cutting layer in your application: a facility that allows your application to utilize the cache both for business objects and for supporting data. In terms of application architecture, more specifically layered architecture, one would usually provide some kind of caching at the entity or data object level. Retrieving objects from the distributed cache rather than from the database can reduce the load on your database by an order of magnitude. It may or may not improve performance as well, depending on how well your application performed to begin with.
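
At the entity level this usually takes the shape of the cache-aside pattern, sketched below. The ICache interface is a stand-in for whatever cache client API you use, and the expiration value is illustrative:

using System;

// Stand-in for a real cache client API (e.g. a memcached client library).
public interface ICache
{
  object Get(string key);
  void Set(string key, object value, TimeSpan expiration);
}

public class CachingRepository
{
  private readonly ICache m_cache;

  public CachingRepository(ICache cache)
  {
    m_cache = cache;
  }

  // Cache-aside: try the cache first, fall back to the database, then
  // populate the cache so subsequent reads skip the database entirely.
  public T GetOrLoad<T>(string key, Func<T> loadFromDatabase) where T : class
  {
    var cached = m_cache.Get(key) as T;
    if (cached != null)
      return cached;

    T value = loadFromDatabase();
    m_cache.Set(key, value, TimeSpan.FromMinutes(10)); // illustrative expiration
    return value;
  }
}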

This was just a brief overview of some of the architectural considerations involved in creating a cache solution. Which product/solution you apply is going to depend on many factors: organizational policies, cost, architectural alignment etc.

Saturday, January 30, 2010

Common Mistakes in Agile Development

You'll hear a lot of project managers nowadays swearing they are applying an agile methodology. But are they really?

I came across a number of mistakes people commonly make when applying agile, or when they think they're applying agile.

One of the most common ones is that people think of agile as iterative development. It is not! While some agile variants do consist of iterations, agile is not iterative in the classical sense. Iterative development is about continuous refinement towards the end goal. Agile, on the other hand, is about achieving short, feature-complete, scope-limited milestones in the project. There's a big difference.

So what's the most common consequence of this mistake? People will, under the excuse of being agile, simply adopt an ad-hoc approach in the early stages of the SDLC, expecting things to change frequently and not bothering to spend enough time on architecture and design. What they should be doing is setting up milestones based on customer value (a concept from the agile project management triangle, which I'll touch on later), and then working towards those milestones following the usual SDLC processes.

Another very frequent mistake is that some people tend to stick to a known agile process to the letter, without even bothering to validate that it works well in their organization. Agile methodologies were meant to be tailored. What you typically want to do is tailor the process to the organizational policies, competencies and communication style of your organization. Being too rigid about applying a known agile process can lead to people rejecting it; being flexible allows you to appeal to the needs and constraints of your organization, making the adoption effort more likely to work.

Developers, as well as other participants in software development projects, like most people resist change unless they have a clear incentive. Adopting agile often doesn't present that clear incentive, especially if management doesn't really believe in it. And how could they, when they're not the subject matter experts? Instead of forcing change on their team members, project managers are far more likely to succeed by bringing in external agile SMEs to facilitate the process.

I mentioned the agile triangle previously. Some people think of it as the new holy grail of agile project management. The classical project management triangle of scope-time-cost with quality in the center is replaced in agile by different facets: customer value, quality, and the classical triangle as a single third facet.
What that means is that in agile you need to continuously deliver new value to the customer, while maintaining high quality and staying within the constraints of the classical project management approach.

Not thinking about this new metric is another common mistake people trying to adopt agile make. Naturally, not keeping it in mind can lead to either poor quality or a piece of software that no one is going to use.

Some agile SMEs think that one of the ways to improve an agile development process is to keep learning the right lessons. I definitely agree. Some organizations will even go as far as making the learning aspect a core part of their culture, which results in a complete change in the mindset of the employees, making them more oriented towards improving their competencies than towards succeeding on projects. Both are equally important in my view, and finding a way to balance the two is key.

Last but not least, an agile process is not a purpose in itself but rather a delivery method that ensures the organization can quickly adapt to change. It's important not to become too attached to any particular process, especially an agile one. There are downsides to having such a flexible development process: if your organization routinely deals with change, change will become expected, eventually even desired. The more your organization becomes able to rapidly respond to change, the more relaxed it will feel about accepting it, and the better the chance of a high-risk change having a negative impact. This is to say: don't ever get too comfortable with change.

SCRUM: What Works and What Doesn't

Let's try to illustrate this on one of the most commonly used agile development processes, SCRUM.

The roles in the SCRUM process include the product owner, the team and the SCRUM master (who is often a member of the team). Some of the key features of the SCRUM process (and this is an over-simplification) are:

  • short sprints, usually a couple of weeks in duration
  • daily 15-minute stand-up meetings to briefly check in with the team
  • sprint planning and sprint review meetings at the beginning and end of each sprint
  • the product owner adds tasks to the product backlog in order of priority
  • the team does the pick'n'choose in each sprint
  • the SCRUM master reports the daily burndown chart

Now, one of the things you may want to tailor in your organization is the set of actors in this process. I've seen SCRUM work with a team member acting as product owner and SCRUM master, and I've seen SCRUM with no SCRUM master work, too.

The key to having a successful SCRUM process is a well balanced team. You don't want team members who can't take on a task from the backlog, or who will fight over taking a task. Good communication is key to ensuring all risks are caught early and mitigated accordingly.

Sprints may or may not be fixed in duration. If you see you can't achieve significant enough value in a sprint, extend it. Don't artificially create more sprints to make up the time. Don't leave features unfinished or partially done. It's not iterative, it's agile.

Don't make architecture decisions lightly, expecting they will change. They don't have to change, unless they're wrong. Agile architecture development is tricky, since it's hard to figure out the value it provides. One way to go about it is to vertically partition the solution and split architecture development into phases.

Again, learning is an integral part of SCRUM. A common mistake organizations make is not spending enough time in sprint review meetings and concentrating mostly on sprint planning. Sprint review is just as important and should not be neglected. Successful sprint review techniques include things like success and failure stories, communication matrices, various types of root-cause analysis etc.

I'm going to leave you with an analogy to a car accident: they say that when in a car accident you're far better off keeping your eyes on where you want to go instead of on what you might run into.

Saturday, January 23, 2010

Primer - How To Write Testable Code

This is the first article in the Primer series. The Primer series will mostly be devoted to practical advice on various topics in software architecture and development.

This week I'll show you how to write code that is SOLID and testable. The reason for writing testable code is, of course, the maintainability of your application. Maintainability results in fewer issues and therefore a lower true cost of ownership. Needless to say, it is one of the key considerations for any architect.

Before you start writing unit tests and integration tests and automating it all in a continuous integration environment, you have to make sure that the technical design principles you apply make your code easily testable. If you can't cleanly test your code, your tests will not have any value, as bugs will be easy to miss or the tests will end up being too fragile. Whether you apply test-driven development (i.e. write tests first) or not, the complexity of your technical design will ultimately affect the complexity of your tests.

Here are a few suggestions to follow, before you even start:
  • Follow the SOLID principles. They will ensure that you reduce complexity by creating single-responsibility classes and therefore reduce complexity of your tests as you'll be testing one responsibility at a time.
  • Apply the inversion of control pattern. Inversion of dependencies will allow you to build up the mocks more easily when writing unit tests.
I'll demonstrate some of these principles with an example: an airline ticket ordering framework.

I'll start by defining the public API of the framework, that consists of a few interfaces and classes.


public interface IOrder
{
  OrderConfirmation Submit(string flightNumber, AccountInformation account);
}


public interface IFlightSchedule
{
  FlightInformation GetFlightInformation(string flightNumber);
}


public interface IReservations
{
  Reservation Reserve(FlightInformation flightInformation);
  void ConfirmReservation(Reservation reservation);
  void CancelReservation(Reservation reservation);
}


public interface IBilling
{
  Invoice Charge(AccountInformation account);
}


public class AccountInformation
{
  public AccountInformation(
    string creditCardNumber,
    string expirationDate,
    string fullName,
    string address)
  {
    CreditCardNumber = creditCardNumber;
    ExpirationDate = expirationDate;
    FullName = fullName;
    Address = address;
  }


  public string CreditCardNumber { get; private set; }
  public string ExpirationDate { get; private set; }
  public string FullName { get; private set; }
  public string Address { get; private set; }
}


public class FlightInformation
{
  public FlightInformation(
    string flightNumber,
    DateTime departureDateTime,
    DateTime landingDateTime,
    string origin,
    string destination)
  {
    FlightNumber = flightNumber;
    DepartureDateTime = departureDateTime;
    LandingDateTime = landingDateTime;
    Origin = origin;
    Destination = destination;
  }


  public string FlightNumber { get; private set; }
  public DateTime DepartureDateTime { get; private set; }
  public DateTime LandingDateTime { get; private set; }
  public string Origin { get; private set; }
  public string Destination { get; private set; }
}


public class Reservation
{
  public Reservation(string reservationId, string flightNumber, string seat)
  {
    ReservationId = reservationId;
    FlightNumber = flightNumber;
    Seat = seat;
  }


  public string ReservationId { get; private set; }
  public string FlightNumber { get; private set; }
  public string Seat { get; private set; }
}


public class OrderConfirmation
{
  public OrderConfirmation(FlightInformation flight, Invoice orderInvoice, Reservation flightReservation)
  {
    Flight = flight;
    OrderInvoice = orderInvoice;
    FlightReservation = flightReservation;
  }


  public FlightInformation Flight { get; private set; }
  public Invoice OrderInvoice { get; private set; }
  public Reservation FlightReservation { get; private set; }
}

First, notice a few things. The business components that make up the public API are abstracted as interfaces; I always make all my public APIs interface-based. Secondly, all the classes that are part of the public API are entity-like classes and are intentionally written to be immutable. That way, if you get an object from a business component interface, you know it hasn't been modified. Notice also how I factored the business logic of a ticket order into separate responsibilities: schedule, reservations, billing and the order itself. This is done to reduce complexity and make each of these individual components more testable.

Now let's look at the implementation of the Order component:

internal class Order : IOrder
{
  private readonly IFlightSchedule m_flightSchedule;
  private readonly IReservations m_reservations;
  private readonly IBilling m_billing;


  public Order(IFlightSchedule flightSchedule, IReservations reservations, IBilling billing)
  {
    m_flightSchedule = flightSchedule;
    m_reservations = reservations;
    m_billing = billing;
  }


  #region IOrder Members


  OrderConfirmation IOrder.Submit(string flightNumber, AccountInformation account)
  {
    FlightInformation flightInfo = m_flightSchedule.GetFlightInformation(flightNumber);


    Reservation reservation = null;


    try
    {
      reservation = m_reservations.Reserve(flightInfo);


      Invoice invoice = m_billing.Charge(account);


      m_reservations.ConfirmReservation(reservation);


      return new OrderConfirmation(
flightInfo,
invoice,
reservation);
    }
    catch (BillingException)
    {
      m_reservations.CancelReservation(reservation);


      throw;
    }
  }


  #endregion
}
 
Now, clearly, in order to complete an order, the Order component must communicate with the related components: flight schedule, reservations and billing. This creates dependencies between Order and the other components. These dependencies are, however, inverted in the design of the Order class. Notice that Order holds references to IFlightSchedule, IReservations and IBilling, and that those are passed to Order in the constructor. This way I'm ensuring that Order doesn't know anything about the implementation of these components and that it is given the implementation through the constructor. This also allows for dependency injection; constructor-based injection is preferred over property injection as it keeps the API of the Order class immutable as well. The Order class, like all other implementations of the public API, is made internal so it is hidden.
 
In the implementation of the IOrder interface, the Order class simply calls the dependent components, handles the business logic of it, and builds up the result OrderConfirmation object.
 
The way you would work with this framework is by getting an IOrder from it somehow. There are a number of ways you can implement this, a dependency injection framework being one, a handcoded factory being another. I prefer to handcode the factory unless I need the flexibility of re-wiring the components at deployment/configuration time.
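
A handcoded factory for this framework could be as simple as the sketch below. The concrete FlightSchedule, Reservations and Billing classes are assumed internal implementations of the interfaces shown earlier; they're not part of the listing above:

public static class OrderFactory
{
  // The hand-wired composition root: the one place that knows the concrete types.
  public static IOrder Create()
  {
    return new Order(
      new FlightSchedule(),   // hypothetical internal implementations
      new Reservations(),
      new Billing());
  }
}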
 
Let's say you want to test the Order class. There's only one method to test here, the Submit method. Its business logic can be described as: get flight information, try to create a reservation, if that fails bail out, else continue on to charge the credit card, if that fails cancel the reservation, else confirm it and return the order confirmation. Pretty simple. When writing the unit test for this method, you want to mock all the dependent components: IFlightSchedule, IReservations and IBilling. You can use any of the mocking frameworks out there. Make sure to set up the mocks to validate the calls and input parameters, to ensure that Order calls the right methods and passes the right things. In this case the business logic is such that it makes sense to validate the sequence of the calls to the mocks, too.
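
Here is what such a test could look like using NUnit and a mocking framework like Moq. The names are illustrative, and since Order is internal the test assembly would need access to it, e.g. via InternalsVisibleTo:

using System;
using Moq;
using NUnit.Framework;

[TestFixture]
public class OrderTests
{
  [Test]
  public void Submit_ReservesChargesAndConfirms()
  {
    var flight = new FlightInformation(
      "UA100", DateTime.Now, DateTime.Now.AddHours(5), "JFK", "LAX");
    var reservation = new Reservation("R-1", "UA100", "12A");
    var account = new AccountInformation(
      "4111111111111111", "12/14", "John Doe", "1 Main St");

    var schedule = new Mock<IFlightSchedule>();
    schedule.Setup(s => s.GetFlightInformation("UA100")).Returns(flight);

    var reservations = new Mock<IReservations>();
    reservations.Setup(r => r.Reserve(flight)).Returns(reservation);

    // Charge returns null here; the Invoice isn't needed for this path
    var billing = new Mock<IBilling>();

    IOrder order = new Order(schedule.Object, reservations.Object, billing.Object);
    OrderConfirmation confirmation = order.Submit("UA100", account);

    Assert.AreEqual(reservation, confirmation.FlightReservation);
    reservations.Verify(r => r.ConfirmReservation(reservation), Times.Once());
    reservations.Verify(r => r.CancelReservation(It.IsAny<Reservation>()), Times.Never());
  }
}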
 
This was a very simple example, but it demonstrates some of the key principles to follow to make your code more testable. In reality, these concepts are mostly applied when implementing business layers, since those are often the most complex part of an application due to the business logic they encapsulate. The same principles apply, though, whenever you communicate across layer boundaries.

Saturday, January 16, 2010

The Future of eLearning

I'm going to try to put on my visionary hat for a second and talk about the future of the eLearning industry.

In order to understand where we're heading, we need to understand where we are today. What we have today is a rather boring landscape of eLearning applications, with a few, mostly unsuccessful, attempts to leap forward to a next-generation architecture. The major divide is between enterprise eLearning solutions (i.e. learning management systems) and the open-source offerings. Orthogonal to these two main directions are many weak attempts to integrate various tools, web applications and services into the eLearning suite, with the goal of enhancing the capabilities of the baseline solution.

The Promise of the Virtual Learning Environment

At one point, there was a lot of hype about the VLE as the vision for the future of eLearning, adopted by a large number of professionals in the software and academic communities. So what was the VLE about? A virtual learning environment goes beyond a simple learning management system. It is a learning environment, a system that encompasses an entire range of capabilities: infrastructure, learning content, management and administration, delivery, collaboration and so on. The VLE would be the mother of all eLearning architectures. Except it wasn't.

The Future is Cloudy

The vision of any kind of centralized learning hub is fundamentally flawed. The future, quite literally, is cloudy. Only when eLearning solutions start utilizing decentralized architectures such as cloud computing, mashups, and rich multimedia content delivery and interactivity (web 3.0) will they achieve their full potential. That will truly be a leap forward.

Most people think that the eLearning solution of tomorrow will be offered as SaaS (software as a service). While SaaS does lend itself as the right architectural choice for an increasing number of applications across a wide range of industries, it is only part of the vision. The eLearning solutions of tomorrow will undoubtedly have to preserve some of their core capabilities, such as student enrollment, courses and grades. Those capabilities will clearly remain under the tight control of the centralized eLearning system, whether offered as SaaS or not. But what about the learning experience? We are already seeing an increasing number of attempts to integrate content delivery tools into central LMSes. This trend will continue to the point where 90% of the capabilities of an LMS will be under the control of the end user, mashed up into what will inevitably become a personalized learning experience. What's going to be crucial to the success of this vision is an openness oriented towards widening the integration landscape, i.e. open standards.

Let's try to illustrate the vision with a hypothetical eLearning solution of the future:

Imagine logging on to your university website with your Yahoo account. After you log in, you create your personal profile by entering personal metadata, preferences, educational background, interests etc. You then proceed to the administrative part, where you enroll in a couple of courses. When you visit the website of one of your courses, you realize you're taken to a Google Wave where the instructor has invited you to an open discussion about the course material. He plays a video for the participants, then invites everyone to a chat session. While you're chatting away with your peers, you realize that some posts are coming straight from Twitter, along with photos from the "scene". You immediately change the direction of your discussion to analyze the new information, and decide to switch over to a Wikipedia page to research the background of the events that are unfolding, while at the same time continuing to exchange information with your fellow students. You then decide to blog about what you've discovered, and you let everyone in your course know to follow up on it. Later that day you receive an email from the system notifying you that your instructor has posted a comment on your blog post as well as entered a grade for you in the gradebook. You did pretty well.

A few key points in the illustration:
  • The learning experience was delivered through a mashup of online services, some commercial and some free, ranging from collaborative environments such as Google Wave, through more traditional delivery methods such as HTML content, to direct integration with online micro-blogging and macro-blogging services.
  • There is no centralized LMS. While some administrative capabilities were managed by a core service, the end-user was never bound to a particular application or web site.
  • The quality of the learning experience very much relies on the capabilities of the learning content delivery tools and services that are utilized, whether they're collaborative or micro-blogging. The better these services are, the better the learning experience. This really shows how inter-connected the future of the eLearning industry is with the future direction of the web in general.
Last, but not least, a key quality improvement over today's learning experience is evident in the illustration: the learner is in the driver's seat, driving his own learning experience and influencing his own learning outcomes.

Saturday, January 9, 2010

The Broken Promise of ORM Frameworks

How many times have you heard the words: "NHibernate writes better SQL than I do"?

I've heard those words a number of times, mostly from people who pride themselves on being masters of domain-driven design. So what's wrong with that statement?

It's actually pretty accurate. People who say something like that probably do write bad SQL, so it is very likely that a framework writes better SQL than they do. But there's an often-overlooked fact hiding in that statement, one that demands architectural consideration: not only is NHibernate writing SQL for the developer, it's doing it at runtime. For an architect, that's a game-changer.

There really is no such thing as "better SQL". Seemingly well-written statements can take down the database server if the schema is not optimized for them, including those written by NHibernate. So saying that a framework writes better SQL than a developer is more often an excuse not to look at query plans. Query plan analysis is one of the core competencies of developers; some would say only database developers, but I'd say all developers. Every developer needs to be able to recognize a poorly optimized schema when they see a key lookup or an index scan operator, let alone a table scan or unnecessary joins.

But ask yourself: how can a developer look at query plans if they don't know which SQL queries execute? And why wouldn't they know? Well, if they let NHibernate generate the queries at runtime, they have no way of telling at build time which SQL statements will execute. They would have to trace the execution at runtime, capture the SQL, and then figure out whether it's optimal. While that can catch all the errors, there's a fundamental problem with the process: SQL code review is typically done by people with specific competencies, and those people cost more and there are generally fewer of them. Having to trace the execution of an application just to figure out which statements it runs isn't very efficient, especially in large-scale development projects.
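For illustration, here's roughly what that runtime capture looks like with NHibernate; a minimal sketch, assuming a standard hibernate.cfg.xml bootstrap (the builder class name is hypothetical). The show_sql property echoes every generated statement so it can be collected and run through the query plan analyzer:

public static class SessionFactoryBuilder
{
  public static NHibernate.ISessionFactory Build()
  {
    var configuration = new NHibernate.Cfg.Configuration();
    configuration.Configure();                      // reads hibernate.cfg.xml
    configuration.SetProperty("show_sql", "true");  // echo every generated statement
    return configuration.BuildSessionFactory();
  }
}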

It all comes down to quality and risk. In order to ensure high quality, a project should follow a well-established process, and whether industry-standard or company-specific, that process will surely involve a review of the performance of the data access components. I haven't yet seen an efficient and precise review process in which the target keeps moving. And runtime SQL generation is a moving target from the code review perspective.

But there's another point here. Just because a framework like NHibernate generates SQL at runtime doesn't mean it prevents people from creating ad-hoc queries; in the case of NHibernate, those queries are expressed using its object query syntax. Ad-hoc SQL is bad in general, mostly because it violates the principle of predictability: a database schema is only optimized for the set of SQL statements it was designed against. Running arbitrary SQL against a statically designed database can lead to undesirable consequences. That represents a high-impact risk, and it becomes a high-probability risk when the architecture doesn't allow for an efficient quality review process to mitigate it.

So, is all SQL generation bad? Absolutely not. But there needs to be a way to guarantee the optimal performance of the database. Certain types of SQL statements are "safe", mostly the ones involved in object graph traversal. Everything else needs to be statically compiled. SQL statements outside of the optimized and approved set should simply not be allowed.
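To make the distinction concrete, here's a hypothetical illustration (the Customer entity, its Address property and the mappings are assumptions, not NHibernate specifics). The first method is a predictable, key-based graph traversal that the ORM generates the same way every time; the second builds a query at runtime that the schema may never have been reviewed against:

public class CustomerReader
{
  private readonly NHibernate.ISession m_session;

  public CustomerReader(NHibernate.ISession session)
  {
    m_session = session;
  }

  // "Safe": a primary-key lookup followed by mapped graph traversal.
  public string GetCustomerCity(int customerId)
  {
    Customer customer = m_session.Get<Customer>(customerId);
    return customer.Address.City;
  }

  // Ad-hoc: runtime-generated SQL outside any approved, reviewed set.
  public System.Collections.IList FindByCity(string city)
  {
    return m_session.CreateQuery("from Customer c where c.Address.City = :city")
                    .SetParameter("city", city)
                    .List();
  }
}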

However, most ORM frameworks violate this rule. They do, in fact, support programmatic query construction, entity retrieval based on ad-hoc criteria and so on. They are simply overly generic and flexible. There's nothing wrong with the ORM pattern itself; it is the most common data access pattern in use today. It's just used poorly. ORM is not, as some people think, a way to work with the database without ever writing a line of SQL.

So, what can we do to address these concerns?

When used the right way, ORM frameworks can be very powerful. They can cut development time, improve consistency and therefore quality, and they promote clean design, being themselves designed around SOLID principles.

Here are a few tips on how to avoid the pitfalls:
  1. Keep the ORM constrained to the data access layer. Use it as a data access layer, not as a domain.
  2. Limit ad-hoc queries. Concentrate on where the payoff is highest, boilerplate object traversal, and code the more complicated SQL by hand.
  3. Review the query plans of all possible SQL statements that can execute by writing integration tests targeted at the data access methods. Try to achieve near-100% coverage.
  4. Don't forget to build domains on top of the data access layer. Apply SOLID principles and prevent people from calling into the data access layer directly by keeping it internal, as in the sketch after this list.
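Here's a minimal sketch of tips 1 and 4, assuming a Customer entity with an NHibernate mapping; the repository names are hypothetical. The ORM stays behind an internal class, and the rest of the application sees only a narrow, reviewable API:

public interface ICustomerRepository
{
  Customer GetById(int customerId);  // safe: primary-key lookup
  void Save(Customer customer);      // safe: boilerplate persistence
  // Deliberately no generic Query(criteria) method: ad-hoc queries stay out.
}

internal class CustomerRepository : ICustomerRepository
{
  private readonly NHibernate.ISession m_session;

  public CustomerRepository(NHibernate.ISession session)
  {
    m_session = session;
  }

  public Customer GetById(int customerId)
  {
    return m_session.Get<Customer>(customerId);
  }

  public void Save(Customer customer)
  {
    m_session.SaveOrUpdate(customer);
  }
}

Because every statement the ORM can generate is reachable only through these few methods, the integration tests from tip 3 have a finite, reviewable set of query plans to cover.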
One last thing: if you're considering dynamic runtime schema optimization, don't! It's a migration nightmare, and no project manager in their right mind would allow it.