Lazy Loading is… well… lazy

By: Jarod Ferguson. Category: General

I find it ironic that we allow our tooling to mask the biggest bottleneck in the system: the database. Having worked with the majority of the ORM tools out there, I can tell you the most common contributor to slow system performance is lazy loading in the data access layer.

I’ve seen this one all too many times:

Business guy: The order checkout screen is running slow in production

Dev: It shouldn’t be, it runs great on my machine

Business guy: Well, it is. It takes 10-15 seconds to load order 123

Dev: Let me check. Yeah, hmm, it is slow. I’ll look into it

The developer proceeds to crack open the source, where he finds some LOD (Law of Demeter) nightmare of a data binding mess in the view, or in the mapping to the presentation model: “Customer.Order -> LineItem.Product.Catalog.Vendor.Name”

Because someone wanted to display the vendor name on the line detail, they just hopped across the object graph and loaded the catalog, and then the vendor. So here comes the entire catalog! (And of course it’s N + 1: one query for the line items, then more queries for every single item.)
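To make the N + 1 cost concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not any particular ORM: the schema, the `CountingDb` wrapper, and `vendor_names_lazy` are hypothetical names invented for illustration, and the catalog hop is left out to keep it short. The function hand-simulates what a lazy-loading mapper tends to do when a view walks from line item to product to vendor.

```python
import sqlite3


def build_db(n_items):
    """Create a toy schema (hypothetical) with n_items line items on order 123."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (id INTEGER PRIMARY KEY, vendor_id INTEGER);
        CREATE TABLE line_item (id INTEGER PRIMARY KEY, order_id INTEGER,
                                product_id INTEGER);
    """)
    for i in range(1, n_items + 1):
        conn.execute("INSERT INTO vendor VALUES (?, ?)", (i, f"Vendor {i}"))
        conn.execute("INSERT INTO product VALUES (?, ?)", (i, i))
        conn.execute("INSERT INTO line_item VALUES (?, 123, ?)", (i, i))
    return conn


class CountingDb:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def vendor_names_lazy(db, order_id):
    """Walk line item -> product -> vendor one row at a time, as lazy loading would."""
    names = []
    items = db.query(
        "SELECT product_id FROM line_item WHERE order_id = ? ORDER BY id",
        (order_id,),
    )
    for (product_id,) in items:
        # Each hop across the graph is its own round trip to the database.
        (vendor_id,) = db.query(
            "SELECT vendor_id FROM product WHERE id = ?", (product_id,)
        )[0]
        (name,) = db.query(
            "SELECT name FROM vendor WHERE id = ?", (vendor_id,)
        )[0]
        names.append(name)
    return names


if __name__ == "__main__":
    for n in (3, 50):
        db = CountingDb(build_db(n))
        vendor_names_lazy(db, 123)
        print(n, "line items ->", db.queries, "queries")  # 1 + 2 * n queries
```

With 3 line items on a developer machine this is 7 queries and nobody notices. The count grows linearly with the data, which is why the problem tends to show up only in production.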

Why did they do that? Because it is EASY! (and because your boss wanted it yesterday)

The team never knows about it until it’s too late. Yep, it runs fast locally, because we developers only have a small catalog on our machine. I mean, “we don’t want all that data on our box”.

Now I know there are many folks out there saying “Well that’s just stupid, I would never do that”. Maybe you wouldn’t, but that guy sitting next to you, he will. I tell you, we all do it, again and again.

The result over time is an interconnected web of queries that can bring a system to a crawl.

See my follow-up post for some tips I use when designing a data access layer with an ORM.

7 thoughts on “Lazy Loading is… well… lazy”


I’m confused. In your example it doesn’t look like eager loading would make it any better.

Sam, if I eager load “Order.LineItem.Product.Catalog.Vendor” I can get all the data in one query, instead of potentially many N + 1 queries. (EF query.Include() or NH query.Expand())

Also, now that it is in one query, there may be performance tuning benefits, such as indexing different columns.
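As a rough illustration of the eager alternative, the sketch below fetches the same vendor names with a single JOIN. It uses plain SQL over Python's `sqlite3` module as a stand-in for what an ORM's eager-fetch option would generate; it is not the EF or NHibernate API. The schema, the index name `idx_line_item_order`, and `vendor_names_eager` are assumptions made up for the example.

```python
import sqlite3


def build_db(n_items):
    """Create a toy schema (hypothetical) with n_items line items on order 123."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (id INTEGER PRIMARY KEY, vendor_id INTEGER);
        CREATE TABLE line_item (id INTEGER PRIMARY KEY, order_id INTEGER,
                                product_id INTEGER);
        -- Once the access path is one known query, it is easy to see what to index.
        CREATE INDEX idx_line_item_order ON line_item(order_id);
    """)
    for i in range(1, n_items + 1):
        conn.execute("INSERT INTO vendor VALUES (?, ?)", (i, f"Vendor {i}"))
        conn.execute("INSERT INTO product VALUES (?, ?)", (i, i))
        conn.execute("INSERT INTO line_item VALUES (?, 123, ?)", (i, i))
    return conn


# One statement that joins across the whole path the view needs.
EAGER_SQL = """
    SELECT li.id, v.name
    FROM line_item AS li
    JOIN product AS p ON p.id = li.product_id
    JOIN vendor AS v ON v.id = p.vendor_id
    WHERE li.order_id = ?
    ORDER BY li.id
"""


def vendor_names_eager(conn, order_id):
    """Fetch every vendor name for the order in a single round trip."""
    rows = conn.execute(EAGER_SQL, (order_id,)).fetchall()
    return [name for (_line_item_id, name) in rows]


if __name__ == "__main__":
    conn = build_db(50)
    names = vendor_names_eager(conn, 123)
    print(len(names), "vendor names from 1 query")
```

The number of statements stays at one no matter how many line items the order has, and the single query gives the database something concrete to optimize.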

Right, but it’s hard to say lazy is lazy (i.e. bad) all the time.
What you are describing is a situation where you need both the root’s data and child.child.child.property data at the same time. Many times you’d want a child object’s data at a different time, such as in a drill-down scenario. For example, listing 50 orders on a page, but only showing the product image(s) if the user selects an order’s product detail. You probably wouldn’t want to grab a product’s image data for a list-of-orders screen unless you know the user is going to need it (later).

I think needing the rootaggregate.child.child.child.propertyvalue in a list of those roots *may* be indicative of unusual design.

I agree with you 100%, bad design, in a contrived example. Ideally in this scenario the necessary data could be ‘rolled up’ into a value on the product, as in the VendorName. Except that it is *much* harder to change a production schema and its data than to just walk the graph and lazy load 🙂

That is the point of this post: in my experience these types of scenarios come up a lot, and the easiness of lazy loading tends to be detrimental over time.

That is why I choose to be explicit about my data access.

Thanks for your comments Sam!

Lazy loading is largely used in conjunction with the “Open Session in View” pattern. This is unfortunate, IMHO, because it leads to (ab)using database access in the view layer.

The more architecturally correct approach is to have the session (and thus the transaction) closed by the time your model classes hit the view. Any necessary mapped properties should be initialized at the service layer. This, however, can lead to substantially more complexity, so people usually prefer the “quick ‘n dirty” way.
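A minimal sketch of that idea, again in Python with `sqlite3` rather than a real ORM session: the service function loads everything the view needs into plain objects and closes the connection before returning. `LineDetail`, `open_demo_session`, `load_order_lines`, and `render` are hypothetical names, and the connection only stands in for an ORM session.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LineDetail:
    """Plain value object handed to the view; it holds no database handle."""
    line_item_id: int
    vendor_name: str


def open_demo_session():
    """Stand-in for opening a session: an in-memory DB with a toy schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (id INTEGER PRIMARY KEY, vendor_id INTEGER);
        CREATE TABLE line_item (id INTEGER PRIMARY KEY, order_id INTEGER,
                                product_id INTEGER);
        INSERT INTO vendor VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO product VALUES (10, 1), (11, 2);
        INSERT INTO line_item VALUES (100, 123, 10), (101, 123, 11);
    """)
    return conn


def load_order_lines(open_session, order_id):
    """Service layer: fetch everything the view needs, then close the session."""
    conn = open_session()
    try:
        rows = conn.execute(
            """
            SELECT li.id, v.name
            FROM line_item AS li
            JOIN product AS p ON p.id = li.product_id
            JOIN vendor AS v ON v.id = p.vendor_id
            WHERE li.order_id = ?
            ORDER BY li.id
            """,
            (order_id,),
        ).fetchall()
        return [LineDetail(item_id, name) for item_id, name in rows]
    finally:
        # Closed before the data reaches the view, so the view cannot
        # trigger further queries by walking the graph.
        conn.close()


def render(lines):
    """View layer: formats already-loaded data and never touches the database."""
    return [f"{line.line_item_id}: {line.vendor_name}" for line in lines]


if __name__ == "__main__":
    print(render(load_order_lines(open_demo_session, 123)))
```

The extra type and the explicit query are the added complexity the comment mentions. In exchange, every query the screen runs is visible in one place.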

