N+1 has leaked into my service interfaces
October 11th, 2007 by Oscar Huseyin
About two years ago, I (like many others) bought and read Hibernate In Action. It was definitely the most definitive book on the Object-Relational Mapping tool that I could find. Apart from being really well written, the book was a great reference text which could be used in the trenches to configure and use Hibernate.
Now, a few years after using Hibernate in angst, I have realised that the decision to use ORM has large drawbacks; larger than I initially envisioned. Notably, the infamous N+1 selects problem can increase development times, as developers spend a large portion of their time “ironing out” the performance issues related to the constraints that ORM implies.
At first appearance, the N+1 problem presented performance issues that would often starve the JVM of memory whilst fetching and hydrating objects from the database. The solution seems trivial: defining lazy associations restricts the loading of objects until they are needed, consequently restricting the number of objects in the object graph. However, this is not the be-all and end-all of problems relating to N+1.
Looking back, I had a sense of victory after we performed a first pass to detail the domain associations with a view to implementing a more performant database abstraction. However, as the system functionality increased, the requirements on the domain model unfolded to create more and more “scenario based” object associations. To illustrate this point, let me give a Hello World! example. If my domain object Customer has associations to Order, Item and Address, then my associations could be represented as:
Customer (1) ---------- (*) Order (1) --------- (*) Item
|
--------------(*) Address
Now, if I don’t define any lazy associations, then when I load my Customer from the database, Hibernate will retrieve all Orders, Items and Addresses. By adding lazy associations to each relationship, I can control the loading of Orders, Items and Addresses as I need them. Typically, this is achieved by “touching” the Collection that I need to load from the Customer. Simple, right?
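To make the “touching” behaviour concrete, here is a minimal plain-Java sketch of it. It does not use Hibernate at all: FakeDatabase, Lazy and the string-based orders are hypothetical stand-ins invented for illustration, and the only thing being modelled is that a lazy collection issues its SELECT the first time it is touched. Under those assumptions, loading three Customers and touching each one’s Orders costs one SELECT for the Customers plus one per Customer, which is the N+1 pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class LazyLoadingSketch {

    /** Hypothetical stand-in for the database; it only counts the SELECTs it runs. */
    static class FakeDatabase {
        int selects = 0;

        List<String> selectOrdersFor(int customerId) {
            selects++;
            return Arrays.asList("order-" + customerId + "-a", "order-" + customerId + "-b");
        }

        List<Customer> selectCustomers(int count) {
            selects++; // one SELECT returns all customers, with orders left unloaded
            List<Customer> customers = new ArrayList<>();
            for (int id = 1; id <= count; id++) {
                final int customerId = id;
                customers.add(new Customer(customerId, () -> selectOrdersFor(customerId)));
            }
            return customers;
        }
    }

    /** Holder that runs its loader only the first time it is touched. */
    static class Lazy<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded = false;

        Lazy(Supplier<T> loader) { this.loader = loader; }

        T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
            }
            return value;
        }
    }

    static class Customer {
        final int id;
        private final Lazy<List<String>> orders;

        Customer(int id, Supplier<List<String>> orderLoader) {
            this.id = id;
            this.orders = new Lazy<>(orderLoader);
        }

        List<String> getOrders() { return orders.get(); }
    }

    /** Loads `count` customers, touches every orders collection, returns the SELECT count. */
    static int selectsWhenTouchingAllOrders(int count) {
        FakeDatabase db = new FakeDatabase();
        for (Customer customer : db.selectCustomers(count)) {
            customer.getOrders().size(); // the "touch" that triggers the lazy load
        }
        return db.selects;
    }

    public static void main(String[] args) {
        System.out.println(selectsWhenTouchingAllOrders(3)); // prints 4 (1 + 3)
    }
}
```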
In my example, one possible “scenario” is the non-lazy one; e.g. when you load a Customer, all associated objects are also loaded. Another scenario is a lazy one: Customers with Orders and no Addresses. I’ve described two scenarios here. Can you see any more? I can. Customer with Orders only (e.g. no Items). Any more? Yep, Customer with Addresses only. I can go on and on. So, the number of possible object loading scenarios is a function of the number of associations in the domain model. Now, that can be a very big number! Exponential, actually.
Given that this aspect of ORM must be solved to increase the performance and scalability of the domain, how is it typically implemented? That’s the focus of this blog: N+1 leaking into the service methods.
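As a rough check on that count, the sketch below enumerates the scenarios for the three associations in the example. It treats each association as either loaded or not, which gives the 2^n upper bound, and then, as an assumption of mine rather than anything Hibernate enforces, drops the combinations where Items are loaded without the Orders they hang off. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LoadingScenarios {

    // The associations from the example, in a fixed order.
    static final String[] ASSOCIATIONS = {"Orders", "Items", "Addresses"};

    /** Every subset of the associations: the 2^n upper bound. */
    static List<List<String>> allSubsets() {
        List<List<String>> subsets = new ArrayList<>();
        int n = ASSOCIATIONS.length;
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> subset = new ArrayList<>();
            for (int bit = 0; bit < n; bit++) {
                if ((mask & (1 << bit)) != 0) {
                    subset.add(ASSOCIATIONS[bit]);
                }
            }
            subsets.add(subset);
        }
        return subsets;
    }

    /** Subsets that form a connected object graph, assuming Items are only reachable via Orders. */
    static List<List<String>> reachableScenarios() {
        List<List<String>> valid = new ArrayList<>();
        for (List<String> subset : allSubsets()) {
            if (subset.contains("Items") && !subset.contains("Orders")) {
                continue; // Items without their parent Orders is not a loadable graph
            }
            valid.add(subset);
        }
        return valid;
    }

    public static void main(String[] args) {
        System.out.println("upper bound: " + allSubsets().size());       // upper bound: 8
        System.out.println("reachable: " + reachableScenarios().size()); // reachable: 6
        for (List<String> scenario : reachableScenarios()) {
            System.out.println("Customer + " + scenario);
        }
    }
}
```

Even with the dependency between Items and Orders trimming the list, each independent association added to Customer doubles the number of graphs a client might ask for.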
To continue with the above example, if I create a service to retrieve Customers, I could (without considering the lazy associations) create a service named CustomerManager with a single method named getCustomer(int customerId). However, the service interface is certain to be non-performant, as my Customer will be loaded with Orders, Items and Addresses. Now, if I want to specialise the object graph that is returned, I need to add more methods to the CustomerManager service: getCustomerWithOrders(int customerId), getCustomerWithOrdersAndItems(int customerId), getCustomerWithOrdersItemsAndAddresses(int customerId), and so forth.
So, from a single method in my CustomerManager service to four! That’s what I call an abstraction leakage. My clients are now exposed to the shortcomings of ORM and I have severely polluted my service interface.
Avoiding this service pollution is not a concern of development. This responsibility rests squarely with the application architect. Constraints applied by the application architecture, mandated by the use of, say, EJB, are the primary cause of the abstraction leakage. Retrieving object graphs from Stateless Session Beans directly presents the ORM shortcomings for clients to deal with. However, services deployed in, say, the Servlet container remove the need to pollute service interfaces, but create other issues, like holding onto resources (such as a database connection) for lengthy periods of time so that the service implementations can “retrieve” the lazily loaded associations as needed.
In conclusion, living with Hibernate is costly. The semantic definitions of your interfaces will resemble the object graphs that are being fetched and returned. I’ve found this to be really messy, and it forces unwanted constraints on otherwise simple service definitions.
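Written out, the polluted interface looks something like the sketch below. The four method names are the ones used above; everything else (the Customer stand-in that merely records which associations were fetched, and the StubCustomerManager) is hypothetical scaffolding so the example runs on its own. A real implementation would issue a different fetch for each method.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class ServicePollutionSketch {

    /** Hypothetical result type: records which associations were fetched. */
    static class Customer {
        final int id;
        final Set<String> loaded;

        Customer(int id, String... loaded) {
            this.id = id;
            this.loaded = new LinkedHashSet<>(Arrays.asList(loaded));
        }
    }

    /** One method per object graph: the fetch strategy is now part of the contract. */
    interface CustomerManager {
        Customer getCustomer(int customerId);
        Customer getCustomerWithOrders(int customerId);
        Customer getCustomerWithOrdersAndItems(int customerId);
        Customer getCustomerWithOrdersItemsAndAddresses(int customerId);
    }

    /** Stand-in implementation that only reports what each method would have loaded. */
    static class StubCustomerManager implements CustomerManager {
        public Customer getCustomer(int customerId) {
            return new Customer(customerId);
        }
        public Customer getCustomerWithOrders(int customerId) {
            return new Customer(customerId, "Orders");
        }
        public Customer getCustomerWithOrdersAndItems(int customerId) {
            return new Customer(customerId, "Orders", "Items");
        }
        public Customer getCustomerWithOrdersItemsAndAddresses(int customerId) {
            return new Customer(customerId, "Orders", "Items", "Addresses");
        }
    }

    public static void main(String[] args) {
        CustomerManager manager = new StubCustomerManager();
        System.out.println(manager.getCustomer(42).loaded);                   // []
        System.out.println(manager.getCustomerWithOrdersAndItems(42).loaded); // [Orders, Items]
    }
}
```

The client has to pick the method that matches the graph it is about to walk, which is exactly the knowledge the service was supposed to hide.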
December 9th, 2007 at 2:23 am
Hi,
Is this problem specific to ORM? No. What if you are using plain JDBC? Even then you need to write four methods in the service layer or DAO, so it’s plainly wrong to blame ORM. Secondly, take a look at session-per-conversation in the best ever ORM book, Java Persistence with Hibernate, as it does not hold a connection; do consider a stateful architecture as well, and have a look at JBoss Seam. Thirdly, you may have a 2nd level cache as well, or introduce batch fetching. Or, if you still don’t agree, then what about proposing an alternative?
Regards,
Shoaib
December 9th, 2007 at 6:02 pm
Shoaib,
On your point about using plain JDBC, I would not necessarily expect to retrieve more objects than what I called for from my service interfaces. E.g. I should not have to constrain my object retrieval using semantics from my service interfaces. That was the point. ORM forces clients to have knowledge of your returned object graph; without this knowledge, how can you ever obtain a performant database abstraction using ORM? So, I’m sorry, but the problem surely rests on ORM’s shoulders.
Secondly, how can a second level cache possibly reduce service pollution? It’s only a tool that is used to stop one or many database calls, which does not stop me from extending my service methods to cater for all possible object graph retrieval scenarios.
December 10th, 2007 at 12:22 pm
“In conclusion, living with Hibernate is costly.”
Try living without it
December 10th, 2007 at 1:06 pm
Hi,
I can see your pain - I’ve been through it as well. Perhaps what you may want to consider is a separate service for each entity you are interested in. Meaning, a separate service for Orders (retrievable by customer id) - you probably would want to get the Items with the Order (although there are cases where that’s not needed, or where you just need Items and don’t care about the order they were purchased on). And you could have an Address service (also retrievable by customer id). This is the model my company has gone with and I think it works well for us.
So, perhaps you may want to re-approach your notion of services - maybe reading the RESTful Web Services book may give some interesting insights, or at least food for thought (I’m not advocating a REST-only world, but the ideas are certainly worth exploring).
-Jason
December 10th, 2007 at 1:09 pm
I’ve always felt that Hibernate should provide a way for client code to identify a “use case” at session creation time, and then allow the developer to override mappings etc. per use case, falling back on the default mappings if no override exists for that given use case. This would allow developers to stand the app up quickly using a single set of default mappings, but then add overrides as needed to tune the behaviour of each use case.
AFAICT this wouldn’t break the “idiomatic Java” ideal that Hibernate holds so dear. It also has minimal compatibility issues (simply add a new version of SessionFactory.openSession() that takes a single String as a parameter) and doesn’t increase the cognitive burden for developers who do not need this kind of capability.
I’ve also pondered the idea of having Hibernate automatically track the behaviour of each use case and modify the mappings itself based on those statistics. As with all self modifying code, it would be hell to reason about of course…
December 10th, 2007 at 1:14 pm
Ever heard about the YAGNI [1] principle? Are you really going to need all possible combinations of lazy and eager loading? I think not.
For instance, if you already know you want to show a Customer’s orders, why not call an OrderService with a method getOrdersForCustomer(int customerId) which just retrieves all the orders and the customer?
My 2 cents.
[1] http://en.wikipedia.org/wiki/YAGNI
December 10th, 2007 at 8:16 pm
N+1 has leaked into my service interfaces…
[...]Using Hibernate exposes more than just the N+1 problem. Optimisations of your database layer require lazy associations to provide a performant database abstraction. Service pollution is inevitable to solve the unsolvable problem in ORM.[...]…
December 11th, 2007 at 7:59 pm
Some points:
- If your service interface documentation clearly defined the contracts exposed, then it could say something like: will return one customer with all orders, order items and addresses. Or any other variant. But there would be a fixed, clearly understood definition.
- ORM’s lazy loading adds additional behavior and state to object graphs, namely lazy loading as behavior and a stateful unit-of-work object as state. This violates the only-do-one-thing rule, with all its consequences.
- You don’t necessarily have to use your mapped domain classes in every corner of your application.
- OOP is about state, and side effects are sometimes hard and expensive to avoid. It’s the job of architects to recognize this fact and deal with it in all its occurrences.
October 11th, 2008 at 9:50 am
[...] list would still hold strong. This made me think about my previous blog entry on a similar topic; N+1 has leaked into my service interfaces. Does the Tree Walker stop the ORM constraints from leaking in to my service interfaces? Well, no. [...]