Saturday, January 30, 2010

Common Mistakes in Agile Development

You'll hear a lot of project managers nowadays swearing they are applying an agile methodology. But are they really?

I came across a number of mistakes people commonly make when applying agile, or when they think they're applying agile.

One of the most common ones is that people think of agile as iterative development. It is not! While some agile variants do consist of iterations, it is not iterative in the classical sense. Iterative development is about continuous refinement towards the end goal. Agile, on the other hand, is about achieving short feature-complete, scope-limited milestones in the project. There's a big difference.

So what's the most common consequence of this mistake? People will, under the excuse of being agile, simply adopt an ad-hoc approach in the early stages of the SDLC, expecting that things will change frequently, and not bother to spend enough time on architecture and design. What they should be doing is setting up milestones based on customer value (a concept from the agile project management triangle, which I'll touch on later), and then working towards those milestones, following the usual SDLC processes.

Another very frequent mistake is that some people tend to stick to a known agile process to the letter, not even bothering to validate that it works well in their organization. Agile methodologies were meant to be tailored. What you typically want to do is tailor the process to the policies, competencies and communication style in your organization. Being too rigid about applying a known agile process can lead to people rejecting it. Being flexible allows you to appeal to the needs and constraints of your organization, making the adoption effort more likely to work.

Developers, as well as other participants in software development projects, like most people resist change unless they have a clear incentive. Adopting agile often doesn't present that clear incentive, especially if management doesn't really believe in it. And how could they, when they're not the subject matter experts? Instead of forcing change on their team members, project managers are far more likely to succeed by bringing in external agile SMEs to facilitate the process.

I mentioned the agile triangle previously. Some people think of it as the new holy grail of agile project management. The classical project management triangle of scope, time and cost, with quality in the center, is in agile replaced by three different facets: customer value, quality, and the classical triangle itself as a single facet.
What that means is that in agile you need to continuously deliver new value to the customer, while maintaining high quality and staying within the constraints of the classical project management approach.

Not keeping this new metric in mind is another common mistake people trying to adopt agile make. Naturally, neglecting it can lead to either poor quality or to producing a piece of software that no one is going to use.

Some agile SMEs think that one of the ways to improve an agile development process is to keep learning the right lessons. I definitely agree here. Some organizations will even go as far as making the learning aspect a core part of their culture, which results in a complete change in the mind-set of the employees, making them more oriented towards improving their competencies rather than succeeding on projects. Both are equally important in my view, but finding a way to balance the two is key.

Last but not least, an agile process is not a purpose in itself but rather a delivery method that ensures the organization can quickly adapt to change. It's important not to become too attached to any particular process, especially agile. There are downsides to having such a flexible development process. If your organization routinely deals with change, change will become expected, eventually even desired. The more your organization becomes able to rapidly respond to change, the more relaxed it will feel about accepting it, and the better the chance of a high-risk change having a negative impact. This is to say, don't ever get too comfortable with change.

SCRUM: What Works and What Doesn't

Let's try to illustrate this on one of the most commonly used agile development processes, SCRUM.

The roles in the SCRUM process include the product owner, the team and the SCRUM master (who is often a member of the team). Some of the key features of the SCRUM process (and this is an over-simplification) are:
  • short sprints, usually a couple of weeks in duration;
  • daily 15-minute stand-up meetings to briefly check in with the team;
  • sprint planning and sprint review meetings at the beginning and end of each sprint;
  • the product owner adds tasks to the product backlog in order of priority;
  • the team picks and chooses tasks from the backlog in each sprint;
  • the SCRUM master reports the daily burndown chart.
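To make the burndown chart concrete, here is a toy sketch of the idea; all the numbers are made up for illustration:

```csharp
using System;

// Toy burndown chart: remaining hours per day versus the ideal line.
// The figures here are invented for illustration only.
public static class BurndownDemo
{
  public static void Main()
  {
    double totalWork = 80;   // hours committed at sprint planning
    int sprintDays = 10;
    double[] remaining = { 80, 74, 70, 61, 55, 47, 36, 28, 15, 4 };

    for (int day = 1; day <= sprintDays; day++)
    {
      // The ideal line burns down the same amount of work every day.
      double ideal = totalWork - (totalWork / sprintDays) * day;
      Console.WriteLine("Day {0}: remaining {1}h, ideal {2}h",
        day, remaining[day - 1], ideal);
    }
  }
}
```

If the "remaining" line consistently sits above the ideal line, the team is behind and the sprint scope needs to be revisited.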

Now, one of the things you may want to tailor in your organization is the set of actors in this process. I've seen SCRUM work with a team member being both the product owner and the SCRUM master, and I've seen SCRUM with no SCRUM master work, too.

The key to having a successful SCRUM process is a well balanced team. You don't want team members who can't take on a task from the backlog, or who will fight over taking a task. Good communication is key to ensuring all risks are caught early and mitigated accordingly.

Sprints may or may not be fixed in duration. If you see you can't achieve significant enough value in a sprint, extend it. Don't artificially create more sprints to make up the time. Don't leave features unfinished or partially done. It's not iterative, it's agile.

Don't make architecture decisions lightly, expecting they will change. They don't have to change, unless they're wrong. Agile architecture development is tricky since it's hard to quantify the value that it provides. One way to go about it is to vertically partition the solution and split architecture development into phases.

Again, learning is an integral part of SCRUM. A common mistake organizations make is not spending enough time in sprint review meetings and concentrating mostly on sprint planning instead. The sprint review is just as important and should not be neglected. Successful sprint review techniques include success and failure stories, communication matrices, various types of root-cause analysis etc.

I'm going to leave you with an analogy to a car accident: they say that when in a car accident you're far better off keeping your eyes on where you want to go instead of on what you might run into.

Saturday, January 23, 2010

Primer - How To Write Testable Code

This is the first article in the Primer series, which will mostly be devoted to practical advice on various topics in software architecture and development.

This week I'll show you how to write code that is SOLID and testable. The reason for writing testable code is, of course, the maintainability of your application. Maintainability means fewer issues and therefore a lower true cost of ownership. Needless to say, it is one of the key considerations for any architect.

Before you start writing unit tests and integration tests and automating it all in a continuous integration environment, you have to make sure that the technical design principles you apply make your code easily testable. If you can't cleanly test your code, your tests will not have much value, as bugs will be easy to miss or the tests will end up being too fragile. Whether you apply test-driven development (i.e. write tests first) or not, the complexity of your technical design will ultimately affect the complexity of your tests.

Here are a few suggestions to follow, before you even start:
  • Follow the SOLID principles. They will ensure that you reduce complexity by creating single-responsibility classes and therefore reduce complexity of your tests as you'll be testing one responsibility at a time.
  • Apply the inversion of control pattern. Inversion of dependencies will allow you to build up the mocks more easily when writing unit tests.
I'll demonstrate some of these principles with an example: an airline ticket ordering framework.

I'll start by defining the public API of the framework, that consists of a few interfaces and classes.


public interface IOrder
{
  OrderConfirmation Submit(string flightNumber, AccountInformation account);
}


public interface IFlightSchedule
{
  FlightInformation GetFlightInformation(string flightNumber);
}


public interface IReservations
{
  Reservation Reserve(FlightInformation flightInformation);
  void ConfirmReservation(Reservation reservation);
  void CancelReservation(Reservation reservation);
}


public interface IBilling
{
  Invoice Charge(AccountInformation account);
}


public class AccountInformation
{
  public AccountInformation(
    string creditCardNumber,
    string expirationDate,
    string fullName,
    string address)
  {
    CreditCardNumber = creditCardNumber;
    ExpirationDate = expirationDate;
    FullName = fullName;
    Address = address;
  }


  public string CreditCardNumber { get; private set; }
  public string ExpirationDate { get; private set; }
  public string FullName { get; private set; }
  public string Address { get; private set; }
}


public class FlightInformation
{
  public FlightInformation(
    string flightNumber,
    DateTime departureDateTime,
    DateTime landingDateTime,
    string origin,
    string destination)
  {
    FlightNumber = flightNumber;
    DepartureDateTime = departureDateTime;
    LandingDateTime = landingDateTime;
    Origin = origin;
    Destination = destination;
  }


  public string FlightNumber { get; private set; }
  public DateTime DepartureDateTime { get; private set; }
  public DateTime LandingDateTime { get; private set; }
  public string Origin { get; private set; }
  public string Destination { get; private set; }
}


public class Reservation
{
  public Reservation(string reservationId, string flightNumber, string seat)
  {
    ReservationId = reservationId;
    FlightNumber = flightNumber;
    Seat = seat;
  }


  public string ReservationId { get; private set; }
  public string FlightNumber { get; private set; }
  public string Seat { get; private set; }
}


public class OrderConfirmation
{
  public OrderConfirmation(FlightInformation flight, Invoice orderInvoice, Reservation flightReservation)
  {
    Flight = flight;
    OrderInvoice = orderInvoice;
    FlightReservation = flightReservation;
  }


  public FlightInformation Flight { get; private set; }
  public Invoice OrderInvoice { get; private set; }
  public Reservation FlightReservation { get; private set; }
}

First, notice a few things. The business components that make up my public API are abstracted as interfaces; I always make my entire public API interface-based. Secondly, all the classes that are part of the public API are entity-like classes and are intentionally written to be immutable. That way, if you get an object from a business component interface, you know it hasn't been modified. Notice also how I factored the business logic of a ticket order into separate responsibilities: schedule, reservations, billing and the order itself. This is done to reduce complexity and make each of these individual components more testable.

Now let's look at the implementation of the Order component:

internal class Order : IOrder
{
  private readonly IFlightSchedule m_flightSchedule;
  private readonly IReservations m_reservations;
  private readonly IBilling m_billing;


  public Order(IFlightSchedule flightSchedule, IReservations reservations, IBilling billing)
  {
    m_flightSchedule = flightSchedule;
    m_reservations = reservations;
    m_billing = billing;
  }


  #region IOrder Members


  OrderConfirmation IOrder.Submit(string flightNumber, AccountInformation account)
  {
    FlightInformation flightInfo = m_flightSchedule.GetFlightInformation(flightNumber);


    Reservation reservation = null;


    try
    {
      reservation = m_reservations.Reserve(flightInfo);


      Invoice invoice = m_billing.Charge(account);


      m_reservations.ConfirmReservation(reservation);


      return new OrderConfirmation(
        flightInfo,
        invoice,
        reservation);
    }
    catch (BillingException)
    {
      m_reservations.CancelReservation(reservation);


      throw;
    }
  }


  #endregion
}
 
Now, clearly, in order to complete an order, the Order component must communicate with other related components: flight schedule, reservations and billing. This creates dependencies between Order and the other components. These dependencies are, however, inverted in the design of the Order class. Notice that Order holds references to IFlightSchedule, IReservations and IBilling, and that those are passed to Order in the constructor. This way I'm ensuring that Order doesn't know anything about the implementation of these components and that it is given the implementations through the constructor. This also allows for dependency injection; constructor-based injection is preferred over property injection as it keeps the API of the Order class immutable as well. The Order class, as well as all other implementations of the public API, is made internal so it is hidden.
 
In the implementation of the IOrder interface, the Order class simply calls the dependent components, handles the business logic of it, and builds up the result OrderConfirmation object.
 
The way you would work with this framework is by getting an IOrder from it somehow. Now, there are a number of ways you can implement this, dependency injection framework being one, handcoded factory being another. I prefer to handcode the factory unless I need the flexibility of re-wiring the components at deployment/configuration time.
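To sketch what such a handcoded factory could look like: here, FlightSchedule, Reservations and Billing are assumed internal implementations of the interfaces above; they don't appear in the listings, so their names are my own.

```csharp
// Hypothetical handcoded factory that wires the internal implementations
// together and exposes only the public IOrder interface. Re-wiring the
// components then happens in exactly one place.
public static class TicketOrderFactory
{
  public static IOrder CreateOrder()
  {
    // FlightSchedule, Reservations and Billing are assumed internal
    // implementations of IFlightSchedule, IReservations and IBilling.
    IFlightSchedule schedule = new FlightSchedule();
    IReservations reservations = new Reservations();
    IBilling billing = new Billing();

    return new Order(schedule, reservations, billing);
  }
}
```

Client code then only ever sees the interface: IOrder order = TicketOrderFactory.CreateOrder();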
 
Let's say you want to test the Order class. There's only one method you need to test here, the Submit method. Its business logic can be described as: get the flight information, try to create a reservation, if that fails bail out, else continue on to charge the credit card, if that fails cancel the reservation, else confirm it and return the order confirmation. Pretty simple. When writing the unit test for this method, what you want to do is mock all the dependent components: IFlightSchedule, IReservations and IBilling. You can use any of the mocking frameworks out there. Make sure to set up the mocks to validate the calls and input parameters, to ensure that Order calls the right methods and passes the right things. In this case the business logic is such that it makes sense to validate the sequence of those calls to the mocks, too.
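As a sketch, here is what such a test could look like using NUnit and Moq; both are my own choices here, not a requirement, and the test assumes the test assembly can see the internal Order class (e.g. via an InternalsVisibleTo attribute).

```csharp
// Sketch of a unit test for the billing-failure path of Order.Submit.
[Test]
public void Submit_CancelsReservation_WhenBillingFails()
{
  var flightInfo = new FlightInformation(
    "LH123", DateTime.Now, DateTime.Now.AddHours(2), "FRA", "JFK");
  var reservation = new Reservation("R1", "LH123", "12A");
  var account = new AccountInformation(
    "4111111111111111", "01/12", "John Doe", "Somewhere");

  var schedule = new Mock<IFlightSchedule>();
  schedule.Setup(s => s.GetFlightInformation("LH123")).Returns(flightInfo);

  var reservations = new Mock<IReservations>();
  reservations.Setup(r => r.Reserve(flightInfo)).Returns(reservation);

  // Billing fails, so the order must cancel the reservation and rethrow.
  var billing = new Mock<IBilling>();
  billing.Setup(b => b.Charge(account)).Throws(new BillingException());

  IOrder order = new Order(schedule.Object, reservations.Object, billing.Object);

  Assert.Throws<BillingException>(() => order.Submit("LH123", account));

  // The reservation must be cancelled, and never confirmed.
  reservations.Verify(r => r.CancelReservation(reservation), Times.Once());
  reservations.Verify(r => r.ConfirmReservation(It.IsAny<Reservation>()), Times.Never());
}
```

The happy path and the reservation-failure path would get analogous tests, each exercising exactly one branch of the business logic.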
 
This was a very simple example, but it demonstrates some of the key principles to follow to make your code more testable. In reality, these concepts are mostly applied when implementing business layers in applications as those are often most complex due to the complex nature of the business logic they encapsulate. Same principles apply, though, when communicating across layer boundaries.

Saturday, January 16, 2010

The Future of eLearning

I'm going to try to put on my visionary hat for a second and talk about the future of the eLearning industry.

In order to understand where we're heading, we need to understand where we are today. What we have today is a rather boring landscape of eLearning applications, with a few attempts to leap forward to the next generation architecture, mostly unsuccessful. The major divide is between enterprise eLearning solutions (i.e. learning management systems) and the open-source offering. Orthogonal to the two main directions are a lot of weak attempts to integrate various tools, web applications and services into the eLearning suite, with the goal to enhance the capabilities of the baseline solution.

The Promise of the Virtual Learning Environment

At one point, there was a lot of hype about the VLE as the vision for the future of eLearning, adopted by a large number of professionals in the software and academic communities. So what was the VLE about? A virtual learning environment goes beyond a simple learning management system. It is a learning environment, a system that encompasses an entire range of capabilities, from infrastructure, through learning content, management and administration, to delivery, collaboration and more. The VLE would be the mother of all eLearning architectures. Except it wasn't.

The Future is Cloudy

The vision of any kind of centralized learning hub is fundamentally flawed. The future, quite literally, is cloudy. Only when eLearning solutions start utilizing decentralized architectures such as cloud computing, mashups, and rich multimedia content delivery and interactivity (web 3.0) will they achieve their full potential. This will truly be a leap forward.

Most people think that the eLearning solution of tomorrow will be offered as SaaS (software as a service). While SaaS does lend itself as the right architectural choice for an increasing number of applications across a wide range of industries, it is only a part of the vision. eLearning solutions of tomorrow will undoubtedly have to preserve some of their core capabilities such as student enrollment, courses, grades etc. Those capabilities will clearly remain under tight control of the centralized eLearning system, whether offered as SaaS or not. But what about the learning experience? We are already seeing an increasing number of attempts to integrate content delivery tools into central LMSes. This trend will continue to a point where 90% of the capabilities of an LMS will be under the control of the end-user, mashed up into what will inevitably become a personalized learning experience. What's going to be crucial to the success of this vision is openness aimed at widening the integration landscape, i.e. open standards.

Let's try to illustrate the vision with a hypothetical eLearning solution of the future:

Imagine logging on to your university website with your Yahoo account. After you log in you create your personal profile by entering personal metadata, preferences, educational background, interests etc. You then proceed to the administrative part where you enroll into a couple of courses. When you visit the website of one of your courses you realize you're taken to a Google Wave where the instructor has invited you to an open discussion about the course material. He plays a video for participants, then invites everyone to a chat session. While you're chatting away with your peers, you realize that some posts are coming straight from Twitter along with photos from the "scene". You immediately change the direction of your discussion to analyze the new information and decide to switch over to a Wikipedia page to research a little bit more on the background of the events that are unfolding, while at the same time continuing to exchange information with your fellow students. You then decide to blog about what you've discovered and you let everyone in your course know to follow up on it. Later that day you receive an email from the system notifying you that your instructor has posted a comment on your blog post as well as entered a grade for you in the gradebook. You did pretty well.

A few key points in the illustration:
  • The learning experience was delivered through a mashup of online services, some commercial and some free, ranging from collaborative environments such as Google Wave, a more traditional delivery method such as HTML content, to direct integration with online micro-blogging and macro-blogging services.
  • There is no centralized LMS. While some administrative capabilities were managed by a core service, the end-user was never bound to a particular application or web site.
  • The quality of the learning experience very much relies on the capabilities of the learning content delivery tools and services that are utilized, whether they're collaborative or micro-blogging. The better these services are, the better the learning experience. This really shows how inter-connected the future of the eLearning industry is with the future direction of the web in general.
  • Last, but not least, a key quality improvement in the learning experience of tomorrow over what we have today, evident in the illustration: the learner is in the driver's seat. The learner is driving his own learning experience, affecting his own learning outcomes.

Saturday, January 9, 2010

The Broken Promise of ORM Frameworks

How many times have you heard the words: "NHibernate writes better SQL than I do"?

I've heard those words a number of times, mostly from people who pride themselves on being masters of domain-driven development. So what's wrong with that statement?

It's actually pretty accurate. People who would say something like that probably do write bad SQL, so it is very likely that a framework would write better SQL than them. But there's an often overlooked fact hiding in that statement, one that demands architectural consideration. And that is, not only is NHibernate writing SQL for the developer, it's doing it at runtime. For an architect, that's a game-changer.

There really is no such thing as "better SQL". Seemingly well-written statements can take down the database server if the schema is not optimized for them, including those written by NHibernate. So saying that a framework writes better SQL than a developer is more often than not an excuse to not look at query plans. Query plan analysis is one of the core competencies of developers; some would say database developers, I'd actually say all developers. Every developer needs to be able to recognize a poorly optimized schema when they see a key lookup or an index scan operator, let alone a table scan or unnecessary joins.

But ask yourself, how can a developer look at query plans if they don't know which SQL queries execute? Why wouldn't they know? Well, if they let NHibernate generate them at runtime, they have no way of telling at build time which SQL queries will execute. They would have to trace the execution at runtime, capture the SQL, then figure out if it's optimal. While that can result in catching all errors, there's a fundamental problem with this process: SQL code review is typically done by people with specific competencies, and those people cost more and there are generally fewer of them. So, having to trace the execution of applications to figure out which statements execute isn't very efficient, especially in large-scale development projects.

It all comes down to quality and risk. In order to ensure high quality, a project should follow a certain well established process, and whether industry standard or company-specific, that process will surely involve a review of the performance of data access components. I haven't yet seen an efficient and precise process in which the target keeps moving. And runtime SQL generation is a moving target from the code review perspective.

But there's another point here. Just because a framework like NHibernate generates SQL at runtime doesn't mean that it prevents people from creating ad-hoc queries. In the case of NHibernate, those queries are expressed using its object query syntax. Ad-hoc SQL is bad in general, mostly because it violates the principle of predictability: a database is designed, and optimized, for the SQL that is expected to run against it. Running arbitrary SQL against a statically built database can lead to undesirable consequences. And that represents a high-impact risk. That risk becomes a high-probability risk when the architecture doesn't allow for an efficient quality review process to mitigate it.

So, is all SQL generation bad? Absolutely not. But there needs to be a way to guarantee the optimal performance of the database. Certain types of SQL statements are "safe", mostly the ones involved in object graph traversal. Everything else needs to be statically compiled. SQL statements outside of the optimized and approved set should simply not be allowed.

However, most ORM frameworks violate this rule. They do, in fact, support programmable query construction, entity retrieval based on ad-hoc criteria etc. They are simply overly generic and flexible. There's nothing wrong with the ORM pattern. It is the most common data access pattern in use today. It's just used poorly. ORM is not, as some people think, a way to work with the database without writing a line of SQL.

So, what can we do to address these concerns?

When used a certain way, ORM frameworks can be very powerful. They can cut the development time, improve consistency, therefore quality, and they promote clean design by being themselves designed based on SOLID principles.

Here are a few tips on how to avoid the pitfalls:
  1. Keep it constrained to the data access layer. Use it as a data access layer, not a domain.
  2. Limit ad-hoc queries. Concentrate on where the payoff is highest, boilerplate object traversal. Code more complicated SQL by hand.
  3. Review all query plans for all possible SQL statements that can execute by writing integration tests, targeted at the data access methods. Try to achieve near 100% coverage.
  4. Don't forget to build domains on top of the data access. Apply SOLID principles and prevent people from calling into the data access layer directly by hiding it internally.
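To sketch that last tip, the domain might expose behavior while the ORM-backed data access stays hidden behind an interface. CustomerService, ICustomerRepository and Customer are hypothetical names invented for this illustration:

```csharp
// Hypothetical domain service. The ORM-backed implementation of
// ICustomerRepository lives in the data access assembly and is internal;
// the domain, and everything above it, only ever sees this interface.
public interface ICustomerRepository
{
  Customer GetById(int customerId);
  void Save(Customer customer);
}

public class Customer
{
  public int CustomerId { get; set; }
  public bool IsActive { get; set; }
}

public class CustomerService
{
  private readonly ICustomerRepository m_repository;

  public CustomerService(ICustomerRepository repository)
  {
    m_repository = repository;
  }

  public void Deactivate(int customerId)
  {
    // The repository exposes a fixed, reviewable set of operations;
    // callers can't construct ad-hoc queries through it.
    Customer customer = m_repository.GetById(customerId);
    customer.IsActive = false;
    m_repository.Save(customer);
  }
}
```

Because the repository's operations are a closed set, every SQL statement that can ever run is known at build time and can be covered by the integration tests from tip 3.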
One last thing: if you're considering dynamic runtime schema optimization, don't. It's a migration nightmare. No project manager in their right mind would allow it.