How to deal with malfunctioning publishing jobs in Sitecore

If you search for the keywords Sitecore, publishing, hangs and stuck, you'll quickly realize that publishing items in Sitecore can result in a publishing job that hangs during initialization or gets stuck while processing a set of items.

In this blog post I'll explain what you need to be aware of when dealing with these sorts of issues, the different flavors that are commonly seen, and what causes them in the first place - which can be a bit of a no man's land.

Disclaimer: The issues and causes described in this blog post relate to the traditional publishing engine that ships with Sitecore, and do not apply to the Publishing Service module.

Publishing hangs during initialization

One of the most common issues you might encounter while trying to publish one or more items is that the publishing hangs during initialization. In practice this means that the spinner in the publishing dialog keeps spinning indefinitely and the text keeps showing '... initializing ...'.

So far I've found at least two things that might cause this type of issue to appear:

  1. Your scalability settings are misconfigured, or
  2. MongoDB isn't running

If you are using scalability settings in your Sitecore environment setup, one of the first things you should verify is whether the configuration is correct.

There are many folks in the Sitecore community who have experienced the same problem, as described here and here.

If you are in doubt whether you have set up the scalability settings correctly, there are many great resources, such as Sitecore Publish Instance and Sitecore Scalability Setting.

If your scalability settings are set up correctly (or aren't being used), check whether your MongoDB instance is running and accessible from your Sitecore instance. The best way to go about this is to verify that the ConnectionStrings.config file found under App_Config specifies the correct server, and that you can connect to that server from your machine, e.g. using a utility such as Robomongo.
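
If you don't have a MongoDB client at hand, a quick reachability check can be scripted. The following is only a minimal sketch, assuming a single-host mongodb:// connection string; the function names and the example connection string are made up for illustration, and an open TCP port only tells you that something is listening, not that MongoDB is healthy.

```python
import socket
from urllib.parse import urlsplit


def mongo_host_port(connection_string, default_port=27017):
    """Extract (host, port) from a single-host mongodb:// connection string."""
    parts = urlsplit(connection_string)
    if parts.scheme != "mongodb":
        raise ValueError("not a mongodb:// connection string")
    if parts.hostname is None:
        raise ValueError("no host found in connection string")
    # 27017 is MongoDB's default port when none is given explicitly.
    return parts.hostname, parts.port or default_port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical value; copy the real one from ConnectionStrings.config.
    conn = "mongodb://localhost:27017/analytics"
    host, port = mongo_host_port(conn)
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {status}")
```

Run it on the server hosting Sitecore, so that the check goes through the same network path and firewall rules as the Sitecore instance itself.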

I've yet to understand why a missing connection to MongoDB can cause the publishing mechanism to break, since the publishing operation shouldn't strictly need to persist any data in MongoDB - but it's definitely something you should be aware of, and of course MongoDB has to work in order for Sitecore to function properly.

Publishing is stuck after processing a specific number of items

Another issue you might encounter is that the publishing job starts, publishes a certain number of items, and then finally gives in and gets stuck.

By stuck I mean that the counter of published items reaches a certain number and then stops increasing, while the spinner keeps spinning indefinitely.

As with the "publishing hangs during initialization" issues described above, I've been able to identify some of the root causes of this type of issue, including:

  1. The database configuration is wrong, i.e. the database user lacks the correct user roles and (execute) rights on the master, web and core databases

  2. The publishing pipeline processor may be using a wrongly configured parallel publishing queue

  3. The master and/or web database contains malformed items, causing the publishing to stop working when it processes a malformed item

The first cause is not something you see very often, but it can happen that manually configured databases lack the proper user roles and rights for the database user intended to be used on the Sitecore databases. To verify this, you can use SQL Management Studio to check that the database user is set up correctly and has the execute permission set on the databases - if this is not the case, the publishing job will get stuck after processing the first item.

The second cause seems to occur more often than the first, yet it is still one of the rarer causes, at least in my experience. Parallel publishing has existed in Sitecore since version 7.2 and it enables features and optimizations that can increase the performance of publishing operations - by default, this feature is disabled. What seems to happen sometimes is that developers enable the parallel publishing feature, but either don't consider the prerequisites for using it or end up configuring it wrong.

In either case, this can lead to a situation where the publishing job becomes flaky and gets stuck after processing a certain number of items (which might differ between publishing runs in a non-deterministic way). If parallel publishing has been enabled, do make sure that your underlying hardware supports it and that all configurations have been fully (and correctly) set up.

As for the third cause, I've seen web databases containing items that caused issues when being published from the master database. As a result, the publishing job simply stopped responding and got stuck while trying to publish the malformed items. If you are in the situation of having one or more items causing this behaviour, it can be very tricky to work out which items are causing havoc, since there are no errors in the log files that point you in the right direction of what went wrong.

In order to deal with this kind of issue, the best approach is a divide and conquer strategy: keep narrowing down the set of items being published until you have located the affected items in the web database. Once the malformed items are deleted from the web database, the publishing of items should work again.
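
The narrowing-down itself is plain bisection, which can be sketched as below. This is an illustration of the strategy only, not something that talks to Sitecore: publishes_cleanly is a hypothetical callback that you would have to back with a real publish of the given batch (manual or scripted), and the sketch assumes that a batch gets stuck if and only if it contains at least one malformed item.

```python
def find_bad_items(items, publishes_cleanly):
    """Return the items that make a publish fail, found by repeated halving.

    items             -- list of item identifiers (paths or IDs)
    publishes_cleanly -- hypothetical callback: takes a list of items and
                         returns True if publishing that batch completes
    """
    if not items or publishes_cleanly(items):
        # Nothing to check, or the whole batch is fine: no culprit in here.
        return []
    if len(items) == 1:
        # A failing batch of one item is a culprit.
        return list(items)
    mid = len(items) // 2
    # The batch fails, so search both halves; either may hold a culprit.
    return (find_bad_items(items[:mid], publishes_cleanly)
            + find_bad_items(items[mid:], publishes_cleanly))


if __name__ == "__main__":
    # Made-up item paths and a simulated publish, for demonstration only.
    all_items = ["/home", "/home/news", "/home/news/a", "/home/news/b",
                 "/home/about", "/home/contact"]
    malformed = {"/home/news/b", "/home/contact"}

    def simulated_publish(batch):
        return not any(item in malformed for item in batch)

    print(find_bad_items(all_items, simulated_publish))
```

Since every check is a full publish attempt that may hang, halving matters: a clean half is ruled out with a single attempt instead of one attempt per item.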

If you are (un)lucky enough to encounter this type of issue and survive to tell the tale, I promise that when you finally find the items causing you grief, remove them, and watch the publishing job run through without any errors, this is how you are going to feel:

[Image: Won the publishing game]

Final notes

In this blog post, I've highlighted some of the issues you might encounter when publishing items in Sitecore. However, I feel that there are most likely other issues, which I have yet to discover.

On that note, if you have additional details to add to this blog post, or have encountered another issue and its cause, please drop me a note in the comment section below.