Coding, is? Fun!

Wednesday, July 04, 2007

Reengineering a website

I have been working on rearchitecting a public website for the last few months. The experience has been instructive.
The website has a large class library of VJ++ components, at least a dozen databases, and hundreds of ASP pages. Our goals were:
1. To shift the code base to .NET and use an application framework designed by us.
2. To use this opportunity to redesign the business logic and split it into a set of services, and thus simplify maintenance and improve performance.

There were several major modules in the website. We decided to approach each of these separately and to organize the work on each module into several phases. In doing so, we consciously tried to decouple the modules - if module A depended on module B, we would try to isolate that dependency behind a service layer that implemented a facade. For example, the member information was used by all modules, so we built a read-only service that provided the other modules with the member information.
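To make the facade idea concrete, here is a minimal sketch of a read-only member service. It is only an illustration, written in Python for brevity rather than in .NET; the names MemberStore, MemberReadService and MemberInfo, and the fields they carry, are made up for the example and are not taken from the actual project.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MemberInfo:
    """Immutable view of a member, as exposed to the other modules."""
    member_id: int
    display_name: str
    email: str


class MemberStore:
    """Hypothetical stand-in for the member module's own storage."""

    def __init__(self) -> None:
        self._rows: Dict[int, dict] = {}

    def save(self, member_id: int, display_name: str, email: str,
             password_hash: str) -> None:
        self._rows[member_id] = {
            "display_name": display_name,
            "email": email,
            "password_hash": password_hash,
        }

    def load(self, member_id: int) -> Optional[dict]:
        return self._rows.get(member_id)


class MemberReadService:
    """Facade: the only entry point other modules use for member data."""

    def __init__(self, store: MemberStore) -> None:
        self._store = store

    def get_member(self, member_id: int) -> Optional[MemberInfo]:
        row = self._store.load(member_id)
        if row is None:
            return None
        # Expose only what other modules need; there are no write methods
        # and internal fields such as the password hash stay hidden.
        return MemberInfo(member_id, row["display_name"], row["email"])
```

Other modules would depend only on MemberReadService, so the member module can change its storage without touching them.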

The approach we took for each major module was this:
Step I - Discovery
1. Collect as many of the available internal documents as possible - both business and technical - and keep them in a repository.
2. Navigate through the website ourselves and collect database profiler data (such as the stored procedures executed) and Fiddler data (Fiddler is an HTTP debugger from Microsoft). This gave us the client-to-middle tier interface and the middle tier-to-database interface. We put these together in a sequence diagram.
3. List the database tables and try to infer their role in the application flow. Tie these to the higher level workflow in the application.
4. If the existing middle tier is well designed, we may be lucky enough to find a set of manager or service interfaces that encapsulate the business logic. List such manager classes and their interfaces.
5. Talk to the business users. Walk through the backend administration systems, if any. Big websites usually also have administrative systems (such as accounting or billing systems) that write some data to the database tables. Taking a look at them helps us understand the tables better and keeps those back-office systems in mind when redesigning the main website.
At this stage the goal is to collect as much data as possible. Vague ideas about the future architecture of the system and the service boundaries float around and are captured. But the main goal is to get our heads around the system.
In itself, the Discovery phase is useful, because it lets us collect and organize information in a single place. In our case we dumped this information into a huge document with many links to sub-documents. It was raw data.
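As a rough illustration of how the Fiddler and profiler data from item 2 can be put together, the sketch below merges two lists of captured calls by timestamp into the lines of a simple sequence. It assumes the captures have already been reduced to small records by hand or by some export step; the record types, field names, URLs and stored procedure names are invented for the example and do not reflect the tools' actual export formats.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HttpCall:
    """One request seen by the HTTP debugger (hypothetical record)."""
    at: float      # seconds since the start of the capture
    method: str
    url: str


@dataclass(frozen=True)
class DbCall:
    """One stored procedure seen by the database profiler (hypothetical record)."""
    at: float      # seconds since the start of the capture
    procedure: str


def build_sequence(http_calls: List[HttpCall], db_calls: List[DbCall]) -> List[str]:
    """Interleave both captures by time into sequence-diagram style lines."""
    events = [(c.at, 0, f"Browser -> Web tier: {c.method} {c.url}")
              for c in http_calls]
    events += [(c.at, 1, f"Web tier -> Database: EXEC {c.procedure}")
               for c in db_calls]
    # Sort by timestamp; on a tie the HTTP request is listed first.
    return [text for _, _, text in sorted(events)]


if __name__ == "__main__":
    http = [HttpCall(1.0, "GET", "/login.asp"), HttpCall(3.0, "POST", "/cart.asp")]
    db = [DbCall(1.5, "usp_GetMember"), DbCall(3.2, "usp_AddCartItem")]
    for line in build_sequence(http, db):
        print(line)
```

This only works if both captures share a clock, which is one reason to record them on the same machine while walking through the site.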

Step II - Analysis
In the second phase we block out each of the subsystems in the overall website and discuss their inner workings. The data we collected in Phase I is useful for this (particularly the Fiddler and Profiler information). We also organize the discovery documentation and enhance it with sequence diagrams and flow charts. This may require copying information from some existing documents.
The Analysis is best done as a team with at least 2 members. It is a good idea to include someone who currently works on the maintenance of the legacy website.
The output of analysis is a better organized document.
At this time, we can also document the rearchitecture ideas that we have, including any model that we would like the new design to be built on. For example, we may think that a subsystem can be split into a read-only service and a read-write service. We may also have concrete ideas on the database model. It is better to document these and start discussing their merits and demerits. Such discussions tend to vanish if not documented.
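An idea such as the read-only / read-write split is easier to discuss when it is written down as a small sketch. The one below is purely hypothetical: the catalog subsystem, the class names and the in-memory dictionary are placeholders chosen for the example, not part of the real design.

```python
from typing import Dict, Optional


class CatalogReadService:
    """Read-only side: safe to hand to any module, easy to cache or scale out."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self._prices = prices

    def price_of(self, sku: str) -> Optional[float]:
        return self._prices.get(sku)


class CatalogWriteService:
    """Read-write side: used only by the few callers allowed to change data."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self._prices = prices

    def set_price(self, sku: str, price: float) -> None:
        # Validation lives in one place because all writes come through here.
        if price < 0:
            raise ValueError("price must not be negative")
        self._prices[sku] = price
```

Writing it out this way makes the trade-offs visible: for example, who is allowed to hold the write service, and whether readers may see slightly stale data.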

Step III - Building Specifications
For such reengineering projects, it is unusual to get concrete functional specifications (other than the single line "it should look and feel the same as the existing site").
Therefore, in this phase we can move to building an architecture document. This document will cover each of the subsystems that we are redesigning. If new subsystems are being introduced, it will cover those too. It will cover the rollout plan - this is critical for a website that cannot afford downtime. It will also cover any integration plans.
Some of the subsystems could be straight ports from J++ to .NET. These have to be mentioned in the architecture document.
Some of these redesigns may require:
1. Digging deeper into the existing code to uncover business logic or any violations of our model.
2. Building a Proof-Of-Concept (POC) - in our case we had to do this for the Flash-to-service layer integration.

Step IV - Technical Specifications and Estimation
From this point on the project feels like a development project and goes through similar iterations.

The collection of as much data as possible about the existing (legacy) system is crucial for the rearchitecture to be a success.

Thanks to Mr. Jorge Gonzalez for sharing some of these approaches and ideas with me.
