Varun asked me some details on a database access layer. This post is an attempt to introduce developers to the complex field of Object Relational(OR) Mapping. I conducted a presentation on this subject with reference to EF internal frameworks. I have also given a set of links at the end.
For the purpose of this article let us consider that we are asked to design the middle tier and database structure of an online book store. A customer can order different books in multiple quantities. There is a Customer table; an Order table linked to the Customer; a Book_lkp table that contains the list of books displayed in the website. A Customer places multiple Orders of different books.
The Middle Tier has to provide the ability to:
1. Load a set of books for display
2. Load a Customer data
3. Save an Order for a Customer
One of the ways to design a data access layer for the above problem (in .NET) is:
Load a Dataset with the Customer and Orders and then save after making changes.
This approach is fine for simple websites - but you have to handle business logic in a set of external classes. Strongly Typed Datasets help you add type safety.
The problem with the simple approach is that for complex websites with intricate relations, your business logic will be very distributed and incoherent.
Typically your middle tier should model your business - you will need an object model. Unfortunately, databases are relational. They are not object based. For example, databases do not capture inheritance. In the above example, let us say a Customer can be a corporation also. In the middle tier, you can describe a Customer base class and derive an Individual class and a Corporate class from it. It is difficult to do this in a databse structure. Databases also do not have object identity; they do not handle relations intrinsically.
So, fundamentally, a relational database represents data and its relations completely differently from a object model. The Middle tier needs a good object model. Therefore, the problem is to map the object data to relational data. This is the problem that Object Relational mapping patterns try to solve.
One of the patterns for solving this problem is called Active Record.
Active RecordWhen you use this pattern, you decide that your object model in the middle tier is going to mirror the database model - that is for every table you create a class and then handle foreign key relations through child collections.
So, in the Book Store example above, you will have a Customer class, an Order class and Product class. The Customer class will have a child collection of Orders.
When you use Active Record, each class is responsible for saving and loading itself. The Customer communicates with Customer table, Order with Order table and so forth.
Commonly, in frameworks, the code for these classes are generated from the database itself. For example, a Customer class will be generated with a property for each column in the Customer table. Also, since you do not want to embed the actual mechanism of communicating with the database in each class, all these classes will derive from a base class - which handles the connections and executes the queries. (Ruby on Rails uses this pattern).
Using this pattern, you will first start with a database design and you will not spend much time on an object model design.
Since the database structure is replicated in the middle tier, the design of your object model cannot include the subtleties of a pure object model. For example, Active Record makes subclassing difficult to handle.
If you are a purist and do want to design a object model that mirrors your business instead of the database, you can use another set of patterns called Mapper.
Data MapperWhen you design an object model independent of your database model, you will still want some mechanism of saving the data to the database. Using Mapper, it is the responsibility of external classes to "map" your object structure to the database. The Mappers will handle database access and executing queries.
In the above example with the Individual and Corporate classes, you will write a base mapper for the Customer class and then derive a mapper each for the Individual and Corporate classes. The Individual and Corporate classes themselves will have no knowledge of the database. They can contain business logic; but they will be passed to their Mapper classes for saving and loading. The mappers would contain the logic to map the properties to columns. The mappers will contain the database stored procedure names, typically.
Contrast this with Active Record where the mapping is done by the middle tier objects themselves.
One way to understand this is to contrast the code given below. In the first line, a Customer object saves itself. In the second, a mapper saves the Customer object:
Customer c = new Customer();
c.name = "Ram";
c.save();
Customer
c = new Customer();
c.name = "Ram";
CustomerMapper.Save(c);
OR mapping has different shades of the above patterns. There are many open source OR mapper frameworks such as Hibernate. All of them promise configuration driven mapping; but Ted Neward and others think there cannot be a perfect mapper and that each of these need some customization. Ted Neward has even compared OR mapping to the Vietnam war
here.
There are lot more details in the Wikipedia article
here.
Developing your own mapper framework is not an easy task. You may be better off working with an open source framework or use Active Record with code generation.
There a whole set of patterns related to these mappers such as Lazy Loader (for loading child collections at the point of usage); Assembler (for assembling the objects); Separated Interface (for isolating the mappers). The best book reference for these is "Patterns of Enterprise Application Architecture" by Martin Fowler.