December 6, 2013

Issue: Starting a service crashes SharePoint site with unknown error


Issue:

After you stop a service (such as the User Profile Synchronization service or the Search service) and try to start it again in Central Administration, the SharePoint 2010 site pages crash with an unknown error.



Analysis:

Before the issue appeared, the password for the service account had been changed in SharePoint. All services kept working after the change, however.

Searching through the numerous cascading errors in the ULS logs for the root cause kept my head spinning for hours. I knew the issue was caused by the password change on the service account, but I couldn't spot where to start fixing it.
Some of the error messages I found in the IIS log:
- "Application pool 15f65d1adf954fa68aee81f7bcb9cec8 has been disabled. Windows Process Activation Service (WAS) encountered a failure when it started a worker process to serve the application pool."
- Another message said that the account may need batch logon permissions.

Then I had a clear picture of what happened:

  1. The service (User Profile Synchronization service or Search service) tries to start.
  2. The service fails to start because its application pool in IIS cannot log in.
  3. Because the application pool is down, every service running on it goes down too. If the User Profile service runs on this application pool, it stops. If the Metadata service runs on it, it stops as well and takes the User Profile service down with it, since User Profile depends on Metadata.
  4. All content on SharePoint pages (pages, web parts, and so on) requires a valid user login so that it can be displayed according to the user's permissions. With the User Profile service down, the current user cannot be verified and SharePoint shows an unknown error.
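
The cascade above can be illustrated with a small sketch. This is only a toy model of my reasoning, not anything SharePoint exposes: the dictionaries, the function names and the choice of which services sit on the faulty pool are assumptions made for illustration.

```python
# Toy model of the failure cascade described above.
# Which services run on the pool is an assumption for illustration.
POOL_SERVICES = {
    "15f65d1adf954fa68aee81f7bcb9cec8": ["Metadata", "User Profile"],
}

# User Profile depends on Metadata (as stated in step 3).
DEPENDS_ON = {"User Profile": ["Metadata"]}


def services_down(failed_pool, pool_services, depends_on):
    """Return the set of services that stop when failed_pool is disabled."""
    # Everything hosted on the failed pool stops immediately.
    down = set(pool_services.get(failed_pool, []))
    # Then any service that depends on a stopped service stops as well.
    changed = True
    while changed:
        changed = False
        for service, deps in depends_on.items():
            if service not in down and any(d in down for d in deps):
                down.add(service)
                changed = True
    return down


def site_renders(down):
    """Pages need User Profile to verify the current user (step 4)."""
    return "User Profile" not in down
```

Even if only the Metadata service were hosted on the faulty pool, `services_down` would still report User Profile as down, which is why the site pages fail in both cases.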

I had already made sure that the service account has the necessary permissions, that the services and service applications are set up correctly in CA, and that the application pool account uses the correct password.
Well, it didn't work! Every time I tried to start the application pool in IIS, it stopped after a couple of seconds. Argh...

It appeared that the faulty application pool "15f65d1adf954fa68aee81f7bcb9cec8" in IIS had become a corrupted object and would not work.



Solution:

Relocate the applications on the corrupt application pool to another one with exactly the same settings.
First, back up the WFE and APP servers.
Depending on your current configuration, make the following changes in IIS on the WFE and APP servers:

  1. Create a new AppPool “MetadataCustomAppPool” as a copy of the corrupt AppPool, but using the new password for the service account.
  2. Move the “metadata” and “Profile” applications from the corrupt AppPool to the “MetadataCustomAppPool” AppPool. This also helps when another service goes down (and subsequently its AppPool goes down too): it will not affect Metadata/User Profile, and the SharePoint site will continue to work.
  3. If other service applications are in the corrupt AppPool, relocate them into MetadataCustomAppPool or create a new AppPool for them as appropriate.
  4. Restart IIS.
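
If you prefer the command line to IIS Manager, the steps above could be scripted roughly as follows. This is a sketch, not what I actually ran: it only builds `appcmd.exe` and `iisreset` command lines for you to review before running them on each server. The function name, the account name, the password placeholder and the application paths are hypothetical. It also does not copy the other settings of the corrupt AppPool (such as the .NET version or pipeline mode), so those still have to be matched by hand.

```python
# Sketch: build (not execute) the IIS commands for the relocation steps.
APPCMD = r"%windir%\system32\inetsrv\appcmd.exe"


def relocation_commands(new_pool, account, app_paths, password="CHANGE_ME"):
    """Return the command lines, in order, for steps 1 to 4.

    app_paths are IIS application identifiers ("Site name/application").
    The password default is a placeholder; never hard-code a real one.
    """
    cmds = [
        # Step 1: create the new pool and set its identity and new password.
        f'{APPCMD} add apppool /name:"{new_pool}"',
        f'{APPCMD} set apppool "{new_pool}" '
        f"/processModel.identityType:SpecificUser "
        f'/processModel.userName:"{account}" '
        f'/processModel.password:"{password}"',
    ]
    # Steps 2 and 3: point each application at the new pool.
    for path in app_paths:
        cmds.append(f'{APPCMD} set app "{path}" /applicationPool:"{new_pool}"')
    # Step 4: restart IIS.
    cmds.append("iisreset")
    return cmds


if __name__ == "__main__":
    # Hypothetical account and application paths; take the real ones
    # from IIS Manager on your own servers.
    for line in relocation_commands(
        "MetadataCustomAppPool",
        r"DOMAIN\svc_account",
        ["SharePoint Web Services/metadata-app", "SharePoint Web Services/profile-app"],
    ):
        print(line)
```

Printing the commands first keeps the change reviewable, which matters here because the same steps have to be repeated on every WFE and APP server.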


