Recently, my employer has instituted a program called “Innovator of the Quarter”. The idea is for employees to submit ideas, such as new product features, process improvements, or tools to increase productivity, to a committee. The committee will then pick the idea it thinks has the most merit, and the person who submits the winning idea will get $500 and the chance to implement their idea, or at least build a proof-of-concept.
When this program was first announced, several ideas occurred to me, but the one I thought would result in the greatest benefit for the company (and, admittedly, the greatest reduction of development pain for me) was to use NHibernate for our data access instead of the hand-rolled DAL we currently use. I put together a proof-of-concept project based on one of our smaller systems (which I unfortunately can’t share since it’s proprietary), and wrote up a short explanation of the benefits I thought adopting NHibernate would provide.
I wanted to post the contents of that entire document here (slightly sanitized for product and team names) so that readers could point out any glaring mistakes or make suggestions for additions. So here it is:
The Case for NHibernate
One of the most fundamental parts of a system is its data-access strategy. IT organizations building on the .NET platform have a multitude of options for addressing this part of their systems. Many of these options are provided by Microsoft itself, while others have grown out of the open-source community. They range from thin wrappers over the ADO.NET API, such as the Enterprise Library Data Access Application Block, which provide functionality similar to plain ADO.NET with an easier-to-use programming model, to full-fledged object-relational mapping (ORM) tools, such as the Entity Framework or LINQ to SQL.
On our team, data access has traditionally been implemented using tools that operate very close to the ADO.NET “metal”, using stored procedures for much of the application logic. The reasons for adopting these tools and practices at the time our first system was created included both developer skillsets and performance benefits. Not long ago, there was a substantial performance benefit to using stored procedures as opposed to “inline” SQL. However, with the improvements made to SQL Server, which can cache and reuse execution plans for parameterized queries much as it does for stored procedures, these advantages have been eroded to the point that stored procedures no longer carry much performance advantage over parameterized SQL queries.
Since the performance differences between the two are now far less significant, other advantages of the various data-access strategies, and specifically those of ORMs, should be examined. I will outline here the multiple advantages I believe using NHibernate would bring to our projects. Examples will be taken from the proof-of-concept project stored in source control.
Chief among the advantages of using any ORM tool is the reduction in development time that is possible if the tool is used appropriately. In particular, the standard Create, Read, Update, and Delete (CRUD) operations that commonly need to be performed on any business entity are easier to implement using an ORM tool. Take, for example, the Contact entity from the proof of concept system, which looks something like this:
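(What follows is a minimal, hypothetical sketch. The property names are assumptions chosen for illustration rather than the real class from the proprietary project. The members are virtual because NHibernate needs overridable members in order to create lazy-loading proxies.)

```csharp
using System;

// Hypothetical Contact entity; property names are illustrative assumptions.
public class Contact
{
    // Members are virtual so that NHibernate can create lazy-loading proxies.
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
    public virtual string Email { get; set; }
    public virtual string Phone { get; set; }
    public virtual DateTime CreatedOn { get; set; }
}
```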
In order to implement persistence for an entity like this using stored procedures, we would need a minimum of four procedures, one each for INSERT, UPDATE, SELECT and DELETE. Application code must also be written to copy the data into the appropriate properties of a .NET object when it is retrieved from the database, and from the .NET object properties into stored procedure parameters when it is saved back to the database. This sort of “right-hand, left-hand” code is tedious and time-consuming to write, and is really just specifying the same idea four or five times over. The CRUD procedures for the Contact object in the project the proof-of-concept was based on are about 350 lines of T-SQL put together, and the .NET code to move the properties and parameters back and forth is about 180 lines.
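To give a feel for that “right-hand, left-hand” code, here is a small sketch written against the plain ADO.NET interfaces. It is not our actual DAL: the trimmed-down Contact class, the column names, the ContactDal helper, and the procedure name usp_Contact_Update are all made up for illustration.

```csharp
using System;
using System.Data;

// Hypothetical, trimmed-down Contact entity (names are illustrative).
public class Contact
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public static class ContactDal
{
    // "Left hand": copy result-set columns into object properties.
    public static Contact FromRecord(IDataRecord record)
    {
        return new Contact
        {
            Id = record.GetInt32(record.GetOrdinal("ContactID")),
            FirstName = record.GetString(record.GetOrdinal("FirstName")),
            LastName = record.GetString(record.GetOrdinal("LastName")),
            // Nullable column: each one needs its own DBNull check.
            Email = record.IsDBNull(record.GetOrdinal("Email"))
                ? null
                : record.GetString(record.GetOrdinal("Email"))
        };
    }

    // "Right hand": copy object properties into stored procedure parameters.
    // The procedure name is a made-up example.
    public static void FillUpdateCommand(IDbCommand command, Contact contact)
    {
        command.CommandType = CommandType.StoredProcedure;
        command.CommandText = "usp_Contact_Update";
        AddParameter(command, "@ContactID", contact.Id);
        AddParameter(command, "@FirstName", contact.FirstName);
        AddParameter(command, "@LastName", contact.LastName);
        AddParameter(command, "@Email", contact.Email);
    }

    private static void AddParameter(IDbCommand command, string name, object value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value ?? DBNull.Value; // null must be sent as DBNull
        command.Parameters.Add(parameter);
    }
}
```

Every property appears once in each direction here, and again in the parameter list and body of each stored procedure, which is where the “four or five times over” comes from.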
On the other hand, using NHibernate mappings, we specify the relationships between our object properties and database table columns in only one place. Through the use of conventions provided by Fluent NHibernate, these mappings can be specified concisely and clearly. In the case of the Contact entity, the mapping can be specified in about 30 lines of C# (you can see this in the file ContactMap.cs in the proof of concept project).
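For readers without access to the project, a mapping of this kind might look roughly like the sketch below. It assumes Fluent NHibernate 1.x method names, a trimmed-down hypothetical Contact class, and made-up table and column names and lengths. It also spells out each property explicitly rather than relying on conventions, so the real ContactMap.cs will differ.

```csharp
using FluentNHibernate.Mapping;

// Hypothetical, trimmed-down Contact entity (names are illustrative).
public class Contact
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
    public virtual string Email { get; set; }
}

// The single place where properties and columns are tied together.
public class ContactMap : ClassMap<Contact>
{
    public ContactMap()
    {
        Table("Contact");                     // assumed table name

        Id(x => x.Id)
            .Column("ContactID")              // assumed primary key column
            .GeneratedBy.Identity();          // assumes an IDENTITY column

        Map(x => x.FirstName).Not.Nullable().Length(50);
        Map(x => x.LastName).Not.Nullable().Length(50);
        Map(x => x.Email).Length(255);        // nullable by default
    }
}
```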
Once these mappings have been specified, CRUD operations can be executed in the application. For example, to retrieve a particular Contact from the database, we can simply write:
Contact con = session.Get<Contact>(contactID);
Here, “contactID” is the primary key value for the contact we want to retrieve. The “session” object is a key part of the NHibernate infrastructure; all interactions with the database are executed within a session. The other three CRUD operations are similarly simple. Examples of each can be seen in the ContactsUC.ascx.cs control in the proof of concept project.
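As a rough sketch of what the other three operations might look like: the ContactCrud class, its method names, and the trimmed-down Contact entity below are made up for illustration, and only the ISession and ITransaction calls are NHibernate's own. The code in ContactsUC.ascx.cs may be organized quite differently.

```csharp
using NHibernate;

// Hypothetical, trimmed-down Contact entity (names are illustrative).
public class Contact
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
}

public static class ContactCrud
{
    // Create: Save() schedules an INSERT and returns the generated identifier.
    public static object Create(ISession session, Contact contact)
    {
        using (ITransaction tx = session.BeginTransaction())
        {
            object id = session.Save(contact);
            tx.Commit();
            return id;
        }
    }

    // Update: modify a loaded entity; the change is detected and written
    // when the session is flushed as part of the commit.
    public static void Rename(ISession session, int contactId, string newLastName)
    {
        using (ITransaction tx = session.BeginTransaction())
        {
            Contact contact = session.Get<Contact>(contactId);
            if (contact != null)
            {
                contact.LastName = newLastName;
            }
            tx.Commit();
        }
    }

    // Delete: Get() returns null when no row has the given key.
    public static void Delete(ISession session, int contactId)
    {
        using (ITransaction tx = session.BeginTransaction())
        {
            Contact contact = session.Get<Contact>(contactId);
            if (contact != null)
            {
                session.Delete(contact);
            }
            tx.Commit();
        }
    }
}
```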
Besides simple retrieval and modification by primary key, the main way that we interact with persistent data is retrieval by a set of more complex criteria. This is exemplified by the numerous search pages in our systems. Currently, these searches are accomplished using dynamic SQL generated within a stored procedure. As anyone who’s ever worked on a search page can attest, these SPs (along with the pages themselves) can get very large very quickly. This approach necessitates the use of significant conditional logic within the T-SQL code, which can make the procedure quite confusing due to the limited methods of encapsulation provided by the language and environment.
There are a couple of options for doing these complex searches using NHibernate: the Criteria API and the new LINQ provider. Both of these have their strong points, but it’s what they have in common that’s most important. They provide a way to execute dynamic queries using only application code without resorting to inline SQL.
There are a couple of advantages to specifying searches in C# rather than in T-SQL stored procedures. The first is in the kinds of tools we have available for achieving logical separation. Using both native language constructs and refactoring tools, we can break search logic down into much more easily understandable pieces, rather than having to navigate through a sea of IF blocks in T-SQL. The second advantage is a reduction of duplication. In our current search implementations, checking search values for nulls, empty strings, etc. occurs in both application and database code. Using NHibernate, we cut the number of these checks in half, since we only need to interrogate those values in the application.
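To illustrate the idea (this is not the code from the proof of concept), here is a small self-contained sketch of breaking a search into named pieces with LINQ. The Contact and ContactSearch classes and the filter names are hypothetical. The sketch runs against an in-memory list; with NHibernate, the IQueryable would instead come from the session's LINQ provider, and how well a given provider version translates each expression would need to be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down Contact entity (names are illustrative).
public class Contact
{
    public virtual int Id { get; set; }
    public virtual string LastName { get; set; }
    public virtual string City { get; set; }
}

// Hypothetical bag of values collected from a search page.
public class ContactSearch
{
    public string LastNamePrefix { get; set; }
    public string City { get; set; }
}

public static class ContactQueries
{
    // Each filter is a small named step that does nothing when its value is
    // empty, so the null/empty check lives in exactly one place.
    public static IQueryable<Contact> WithLastNameStartingWith(
        this IQueryable<Contact> query, string prefix)
    {
        return string.IsNullOrEmpty(prefix)
            ? query
            : query.Where(c => c.LastName.StartsWith(prefix));
    }

    public static IQueryable<Contact> InCity(
        this IQueryable<Contact> query, string city)
    {
        return string.IsNullOrEmpty(city)
            ? query
            : query.Where(c => c.City == city);
    }

    // The search as a whole reads as a list of its criteria.
    public static IQueryable<Contact> Matching(
        this IQueryable<Contact> query, ContactSearch search)
    {
        return query
            .WithLastNameStartingWith(search.LastNamePrefix)
            .InCity(search.City);
    }
}
```

A page would then call something like contacts.Matching(search).ToList(), and adding a new criterion means adding one small method rather than another IF block.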
An example of how a search page might be implemented can be seen in the proof of concept project in Search.aspx.cs. Note that this page would certainly benefit from additional refactoring, but it should give the reader a general idea of how better separation of concerns and elimination of duplication can be achieved using this method.
As was stated earlier, one of the primary motivations for the data access strategy used by our systems currently was performance, particularly in the area of search. In order to consider adoption of a new data access strategy, it must be shown that it performs comparably to the current strategy. Fortunately, SQL Profiler can help us determine the performance impact by showing us the resulting queries produced both by the stored procedures and by NHibernate and the time they take to execute.
Though nowhere near exhaustive, the tests I performed showed that the queries produced by NHibernate, when not nearly textually identical to the ones produced by the stored procedures, executed with negligible differences in time (e.g., 1885 ms vs. 1886 ms). In fact, in several instances, the NHibernate queries actually performed better than their stored-procedure-produced counterparts.
It would be prudent to note, however, that this is only one search page. Different entity relationships may give rise to situations where SQL hints provided by a developer would have a significant impact on query performance, but this would have to be approached on a case-by-case basis to determine if such optimization is worthwhile. If such a case does present itself, nothing would prevent us from using a stored procedure to perform that particular operation. While it is best to be consistent with the data-access methods used in a project, using NHibernate is certainly not an all-or-nothing proposition.
When it comes right down to it, NHibernate is an abstraction on top of ADO.NET. Abstractions are created to increase productivity, not performance. In certain cases, delegating responsibility to the abstraction layer may even result in greater performance by eliminating human error or ignorance. However, a determined developer with intimate knowledge of the underlying technology will often be able to write code that outperforms the abstraction. In other words, there is no question that an abstraction comes at a price. The challenge is not to eliminate all abstractions that can be outperformed by hand-tuned code in order to eke out the absolute best performance, but to weigh the benefits and costs of each abstraction to determine if it is, overall, beneficial to the project.
The software development industry as a whole is becoming more and more willing to adopt the ORM abstraction. This can be seen on a number of different platforms, from Ruby on Rails’ ActiveRecord to the Java Hibernate project (upon which NHibernate was originally based). Microsoft itself has acknowledged the benefits this kind of technology provides, evidenced by their offering of not one, but two ORM solutions: LINQ to SQL and the Entity Framework.
NHibernate is hardly the only player in the .NET Object-Relational Mapping space. Microsoft has two different offerings, LINQ to SQL and the Entity Framework, and there are multiple other commercial and open-source frameworks that offer similar functionality. So why use NHibernate over these other technologies?
LINQ to SQL
As Microsoft’s first foray into ORM, LINQ to SQL is a fairly lightweight framework for turning your database tables into entities usable by your application. For example, if you had a table named “Contact”, you could simply drag that table from the Server Explorer in Visual Studio onto the LINQ to SQL design surface to create a persistent “Contact” class. This works well for simple scenarios, but LINQ to SQL has several significant limitations. Among these are only supporting one-to-one mapping between classes and tables, limited support for inherited classes, and a less than stellar workflow for modification of mappings. In short, LINQ to SQL is likely not a good fit for our existing systems.
Entity Framework
The ADO.NET Entity Framework is Microsoft’s full-fledged ORM solution. While it boasts a larger set of features than LINQ to SQL, it also has a number of shortcomings. Rather than allowing the user to map tables to an existing set of entity classes, EF generates its own new classes that must be used in order to persist data. Also, lazy-loading of associated entities (e.g. waiting to load a set of Contacts belonging to a Location until and only if they are needed) is poorly supported. In addition to these and other problems, the XML mapping files themselves are tremendously complex, and consequently are very difficult to modify when needed. While there is hope that some of the problems with EF may be addressed in the upcoming version, its release is tied to Visual Studio 2010, which is still a while off.
Other Third-Party ORMs
There are a multitude of ORM options for the .NET platform other than the ones already discussed, including SubSonic, LLBLGen, and Telerik’s OpenAccess, among others. Their feature sets vary widely, and while each has its merits, they all share a disadvantage against NHibernate: the size of their user base. Due to its widespread adoption, there are simply more resources for learning about and troubleshooting NHibernate than for any other .NET ORM. When faced with a technical challenge, the availability of online resources can be the difference between solving the problem in a matter of minutes and solving it in a matter of days.
Bottom line, adding the capabilities of NHibernate to our projects will mean increased productivity. Less time will be spent on repetitive tasks, leaving more time to focus on the problem domain and the needs of our clients.
Any feedback you’d like to provide in the comments is welcome!
UPDATE: Chris Benard suggested that posting the CRUD operations and the mapping would be helpful to people without access to the codebase, and wouldn’t expose any IP. I agree, so now it’s there, in the form of GitHub gists. Hopefully that’s an improvement!