Sitecore 8 MVC - When the errors are not telling you what is wrong

While recently working on a Sitecore 8 MVC project, I ran into an issue that only surfaced when visiting certain pages on the site. The error appeared when I tried to edit the faulting page using the Experience Editor.

Judging from the error alone, my impression was that something went wrong while rendering something within the main-content placeholder on my standard page layout, because the parameter args.Item was null when it was not supposed to be. However, no information was given about where the error occurred. In my case, the page throwing the error contained a large number of complex renderings with different data-sources and the like, which made it rather tedious to go through all of them one by one.

To get to the bottom of the error, I generated debug symbols for the Sitecore assemblies, which allowed me to debug the piece of code that was throwing the error. For this I used dotPeek, a really useful free .NET decompiler from the team behind ReSharper that can function as a symbol server. That is, it can supply a debugger with the information required to debug compiled assemblies - for more information on this, see the dotPeek documentation on its symbol server feature.

After generating the debug symbols for the specific Sitecore assemblies containing the code for the failing pipeline, I was able to step through the code and locate the real source of the problem - which, I must admit, I don't think is very obvious based on the exception being thrown.

When digging into the code where the trail ends in the stack trace you find the following:

At first glance it looked to me like the only thing the Process method does is tell Sitecore to insert the edit buttons onto each rendering. However, looking closely, I noticed that the first assertion checks whether the item for the given rendering is null - if it is, the Process method throws an exception. I went back to the renderings within the placeholder, checking whether any of them referred to a data-source and verifying that the referenced data-source actually existed. To my surprise, I found that one of my renderings referred to a data-source that had somehow been removed (by the looks of it, forcibly removed). After I re-added the missing data-source, the rendering worked again and the page loaded without any errors.
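The manual check described above can be sketched in a few lines. This is only an illustration in plain Python, not Sitecore code: the Rendering class, the find_broken_datasources function and the item IDs are all hypothetical stand-ins for the renderings on a page and the items that exist in the content tree.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Rendering:
    """Hypothetical stand-in for a rendering placed on a page."""
    name: str
    placeholder: str
    # None means the rendering has no explicit data-source set.
    datasource_id: Optional[str] = None


def find_broken_datasources(
    renderings: Iterable[Rendering], existing_item_ids: Set[str]
) -> List[Rendering]:
    """Return the renderings whose data-source ID does not match any existing item."""
    return [
        r
        for r in renderings
        if r.datasource_id is not None and r.datasource_id not in existing_item_ids
    ]


# Made-up example data: "item-2" has been removed from the content tree.
page_renderings = [
    Rendering("Hero Banner", "main-content", "item-1"),
    Rendering("Teaser List", "main-content", "item-2"),
    Rendering("Breadcrumb", "main-content"),
]
existing_items = {"item-1"}

for broken in find_broken_datasources(page_renderings, existing_items):
    print(f"{broken.name} in '{broken.placeholder}' points at missing item {broken.datasource_id}")
```

In a real solution the list of renderings and the lookup of existing items would come from Sitecore itself; the point here is only that the check is a simple "does every referenced ID still resolve" loop.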

Lesson learned

Looking back on my initial approach to figuring out the cause of the error, perhaps I should have started my investigation by checking whether any of the renderings within the placeholder were missing their data-sources - after all, I had been given a lead that an item was null where it was not supposed to be.

However, when I got the error, a missing data-source was not the first thing that came to mind. As stated above, by simply looking at the error you don't get any specific details about where the null reference occurred, other than that it is somewhere inside the main-content placeholder. Also, I knew that the renderings placed within the main-content placeholder were quite complex, and that debugging each of them could end up being a rather time-consuming task.

What makes me wonder is that when you look through Sitecore's code, all the information that Sitecore could have bubbled up when throwing the exception is there; they just don't make any use of it. In the case of this error, I would have preferred an error message stating that the rendering within the specified placeholder was failing because it was missing its referenced data-source (reflecting that the item was null). This information is available within the code that throws the error, since you have the rendering, and thus its name, as well as the item that it expects to be able to use - and if I remember correctly, the ID of that item is supplied a couple of steps back in the calling methods. Had this information been propagated to the user in the exception message, it would have given me a much better view of what went wrong, and in my case it would have saved me a couple of hours of work.
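To make the wish concrete, here is a rough sketch of the kind of null check I have in mind, written in plain Python rather than against Sitecore's actual code. The MissingDataSourceError class, the require_datasource_item function and its parameters are hypothetical; the only idea taken from the situation above is that the rendering name, the placeholder and the item ID are all known at the point where the check fails.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


class MissingDataSourceError(Exception):
    """Hypothetical exception for a rendering whose data-source item is gone."""


def require_datasource_item(
    item: Optional[T], rendering_name: str, placeholder: str, datasource_id: str
) -> T:
    """Return the item, or fail with a message naming the rendering and the missing item."""
    if item is None:
        raise MissingDataSourceError(
            f"Rendering '{rendering_name}' in placeholder '{placeholder}' "
            f"could not be rendered: data-source item {datasource_id} was not found."
        )
    return item


# Made-up example: the lookup for "item-2" returned nothing.
try:
    require_datasource_item(None, "Teaser List", "main-content", "item-2")
except MissingDataSourceError as error:
    print(error)
```

With a message like this, the failing rendering and the missing item would be identifiable straight from the error page, without attaching a debugger.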

To sum up, if you are experiencing an issue similar to the one described in this blog post, chances are that you have a rendering that is failing because its referenced data-source has been removed.