Friday, November 2, 2012

SharePoint 2010 Search: Items not removed from the index after removing the permissions for the content crawler account

 

Recently I had a customer who used SharePoint 2010 to crawl a large file share with over 9 TB of data. The content crawling account was set up with access to all documents on the file share, so all of them landed in the SharePoint search index.

So far so good.

However, after a while, the owners of several folders containing sensitive data decided to remove their documents from the search index, so they simply denied the crawler account permission to access their folders. The TechNet article Best practices for using crawl logs (SharePoint Server 2010) states the following:

“When a crawler cannot find an item that exists in the index because the URL is obsolete or it cannot be accessed due to a network outage, the crawler reports an error for that item in that crawl. If this continues during the next three crawls, the item is deleted from the index. For file-share content sources, items are immediately deleted from the index when they are deleted from the file share.”

Sounds simple: the crawler account cannot access the files any more, so it should assume they are gone and remove them from the index, right?

Wrong. We ran several incremental crawls followed by a full crawl, but to no avail: the items were still in the index and still showed up in the search center. Only an index reset helped, but that is not a feasible solution when you have 9 TB of data to crawl.

As it turns out, the crawler started getting “Access Denied” errors when attempting to recrawl the documents, which is actually expected, since the files are still there but no longer accessible. In this case SharePoint 2010 behaves a little differently: it keeps trying for 30 crawls AND 30 days, and only then gives up and removes the items from the index.

So what if you don’t want to wait 30 days for the items to be removed? Thankfully, several policies tell the crawler when to remove items after it encounters an error while crawling, and you can adjust them by changing a few properties using PowerShell:

$sa = Get-SPEnterpriseSearchServiceApplication -Identity "<SearchServiceApplicationName>"
$sa.SetProperty("ErrorDeleteCountAllowed", 3)
$sa.SetProperty("ErrorDeleteIntervalAllowed", 10)
$sa.SetProperty("DeleteUnvisitedMethod", 0)
$sa.Update()

You will have to repeat this for all search service applications that you might have.
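
To avoid repeating the snippet by hand, the same settings could be applied in a loop. This is only a minimal sketch: it assumes it runs in the SharePoint 2010 Management Shell on a farm server, that calling Get-SPEnterpriseSearchServiceApplication without -Identity returns every search service application in the farm, and that all of them should get the same three values.

```powershell
# Assumption: the SharePoint snap-in is installed on this server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: every search service application should get identical values.
foreach ($sa in Get-SPEnterpriseSearchServiceApplication) {
    $sa.SetProperty("ErrorDeleteCountAllowed", 3)
    $sa.SetProperty("ErrorDeleteIntervalAllowed", 10)
    $sa.SetProperty("DeleteUnvisitedMethod", 0)
    $sa.Update()
    Write-Host "Updated index cleanup policy on" $sa.Name
}
```

If some search service applications crawl flaky sources, it is probably safer to set their values individually instead of using one loop for everything.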


When the crawler encounters an access denied or a file not found error, the item is deleted from the index if the error was encountered in more than ErrorDeleteCountAllowed (the default is 30) consecutive crawls AND the duration since the first error is greater than ErrorDeleteIntervalAllowed hours (the default is 720 hours).


Unless both conditions are met, the item is retried; once both are met, it is deleted from the index. In this example the item will be removed from the index once it has failed in more than 3 consecutive crawls with an access denied or file not found error AND more than 10 hours have passed since the first failed crawl (both conditions must be met). You might need to read it once more to get it :)
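
The rule is easier to see as code. The function below is a hypothetical helper written for illustration, not a SharePoint cmdlet and not how the crawler is actually implemented; it simply models the "count AND interval" condition as described above, with the defaults of 30 crawls and 720 hours.

```powershell
# Hypothetical helper for illustration only; it does not talk to SharePoint.
function Test-IndexItemDeletion {
    param(
        [int]$ConsecutiveErrorCrawls,            # crawls in a row that failed for this item
        [double]$HoursSinceFirstError,           # hours elapsed since the first of those failures
        [int]$ErrorDeleteCountAllowed = 30,      # default taken from the text above
        [int]$ErrorDeleteIntervalAllowed = 720   # default taken from the text above, in hours
    )
    # Both conditions must hold; otherwise the crawler keeps retrying the item.
    return ($ConsecutiveErrorCrawls -gt $ErrorDeleteCountAllowed) -and
           ($HoursSinceFirstError -gt $ErrorDeleteIntervalAllowed)
}

# Using the example values (3 crawls, 10 hours):
# enough failed crawls, but too early
Test-IndexItemDeletion -ConsecutiveErrorCrawls 4 -HoursSinceFirstError 2 -ErrorDeleteCountAllowed 3 -ErrorDeleteIntervalAllowed 10    # False
# long enough, but too few failed crawls
Test-IndexItemDeletion -ConsecutiveErrorCrawls 2 -HoursSinceFirstError 48 -ErrorDeleteCountAllowed 3 -ErrorDeleteIntervalAllowed 10   # False
# both conditions met
Test-IndexItemDeletion -ConsecutiveErrorCrawls 4 -HoursSinceFirstError 12 -ErrorDeleteCountAllowed 3 -ErrorDeleteIntervalAllowed 10   # True
```

The practical consequence is that lowering only one of the two properties changes nothing: an item that fails every crawl still stays in the index until the interval has also elapsed.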


The ErrorDeleteCountAllowed and ErrorDeleteIntervalAllowed properties apply to both incremental and full crawls.


In addition, you can use the DeleteUnvisitedMethod property to specify which items get deleted during a full crawl only. Setting this property to 0 immediately removes from the index all items that are not found during the current full crawl.


Be careful when setting these properties if your file share suffers from occasional network connection problems. Setting the values too low might cause a large number of items to be removed from the index because of a temporary network outage, only to be reindexed once the connection is working again. This could kill your search performance, so be careful!


You can find the full explanation of these policies and other properties that you can use to fine tune the index cleanup logic under Manage deletion of index items (SharePoint Server 2010).


Thanks to Sorin Stanila and our escalation engineers for helping solve this problem!

Thursday, March 8, 2012

Troubleshooting the .NET 3.5 Feature installation on Windows 8 Server Beta

If you attempt to activate the .NET 3.5 feature on the new Windows 8 Server Beta, you might find that the feature installation wizard gets stuck at around 60% and eventually (after some 20 minutes) dies with the generic error “The request to add or remove features on the specified server failed”. Hovering the mouse over the error message in the new Notifications pane (rant: why isn’t there an easier way to see the full error message?) reveals a few more details about this error, namely that the installation sources for this feature could not be found and that you can specify them using the /source parameter. Huh?

It turns out that although I had my installation ISO attached as the D:\ drive, the installation wizard by default tries to download the .NET 3.5 installation sources from the internet and dies after a while if that fails for any reason (e.g. no internet connection, failed proxy server authentication etc.). So how can we install the .NET 3.5 feature using the sources on the installation drive?

One way would be to use the dism.exe command line tool and pass it the installation source folder using the “/source” parameter (remember the error message above mentioning /source? This is the one).
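
For reference, the DISM route might look like the following. This is a sketch and not something I ran during this troubleshooting session: it assumes an elevated prompt, that the installation media is mounted as D:, and that DISM knows the feature as NetFx3 (the name used on released Windows 8 and Windows Server 2012 builds; the beta may differ).

```powershell
# Assumption: D: is the mounted installation ISO; adjust the drive letter as needed.
# /LimitAccess tells DISM not to contact Windows Update for the sources.
dism.exe /Online /Enable-Feature /FeatureName:NetFx3 /All /Source:D:\sources\sxs /LimitAccess
```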

I opted to use PowerShell though, simply because the new PowerShell ISE is so sexy with its Command panel and the IntelliSense support :)

So first, let’s see which Windows features are available. Start the PowerShell ISE (rant: there’s no Start orb any more, so you have to press the Windows key on your keyboard to switch to the Metro Dashboard and simply start typing “PowerShell” –> the shortcut to the PowerShell ISE will be shown on the left side) and type the following:

PS C:\Users\Administrator> Get-WindowsFeature


This will give you a long list of features available on the server, along with their internal names and the install state. Scroll down the list until you find the feature with the display name “.NET Framework 3.5 (includes .NET 2.0 and 3.0)”. This is what I saw there:



[ ] .NET Framework 3.5 (includes .NET 2.0 and 3.0)  NET-Framework-Core  Removed


See this “Removed” in red? Let’s try to fix it:


PS C:\Users\Administrator> Install-WindowsFeature -Name NET-Framework-Core -Source "D:\sources\sxs"


Replace the “D:\” with the drive where your Windows 8 Server installation is.


Shortly after that we’ll get:


Success Restart Needed Exit Code      Feature Result
------- -------------- ---------      --------------
True    No             Success        {.NET Framework 3.5 (includes .NET 2.0 and...


That’s it. You can double check that it succeeded using:


PS C:\Users\Administrator> Get-WindowsFeature -Name NET-Framework-Core

Display Name                                            Name                       Install State
------------                                            ----                       -------------
[X] .NET Framework 3.5 (includes .NET 2.0 and 3.0)      NET-Framework-Core         Installed


Now go ahead and install SharePoint 2010 :)

Thursday, February 2, 2012

Troubleshooting workflow visualization in SharePoint 2010

SharePoint 2010 introduced a very nice feature that enables visual representation of workflow status on the workflow status page using Visio diagrams hosted in a Silverlight web part.

Recently I’ve created a simple declarative workflow in SharePoint Designer 2010 and configured it to display workflow visualization:

image

I then published and started the workflow and went to the workflow status page hoping that I would see the visual of the workflow status. Instead, all I saw was the error message “The server failed to process the request.”:

image

First I went to Central Administration > System Settings > Manage services on server and checked whether the Visio Services were running:

image

So all fine there. Next step: go to Central Administration>Application Management>Service Application Association to check if the Visio Services was part of the Service Application Proxy associated with the affected web app:

image

No problem there either, so next I checked the ULS log to see if there were any errors. And indeed, this is the stack trace I found:

System.Data.SqlClient.SqlException: Cannot open database "WSS_Content_Team" requested by the login. The login failed.  Login failed for user 'CONTOSO\sp_serviceapp'.     
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlInternalConnectionTds.CompleteLogin(Boolean enlistOK)
at System.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, Boolean ignoreSniOpenTimeout, Int64 timerExpire, SqlConnection owningObject)
at System.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(String host, String newPassword, Boolean redirectedUserInstance, SqlConnection owningObject, SqlConnectionString connectionOptions, Int64 timerStart)
at System.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(SqlConnection owningObject, SqlConnectionString connectionOptions, String newPassword, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, Object providerInfo, String newPassword, SqlConnection owningObject, Boolean redirectedUserInstance)
at System.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection(DbConnection owningConnection, DbConnectionPool pool, DbConnectionOptions options)
at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject)
at System.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject)
at System.Data.ProviderBase.DbConnectionPool.GetConnection(DbConnection owningObject)
at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)
at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)
at System.Data.SqlClient.SqlConnection.Open()
at Microsoft.SharePoint.Utilities.SqlSession.OpenConnection()
at Microsoft.SharePoint.Utilities.SqlSession.ExecuteReader(SqlCommand command, CommandBehavior behavior, SqlQueryData monitoringData, Boolean retryForDeadLock)
at Microsoft.SharePoint.Utilities.SqlSession.ExecuteReader(SqlCommand command, Boolean retryForDeadLock)
at Microsoft.SharePoint.Utilities.SqlSession.ExecuteReader(SqlCommand command)
at Microsoft.SharePoint.Upgrade.SPDatabaseSequence.GetVersion(SPDatabase database, Guid id, Version defaultVersion, SqlSession session, SPDatabaseSequence sequence)
at Microsoft.SharePoint.Upgrade.SPDatabaseSequence.get_SchemaVersion()
at Microsoft.SharePoint.Upgrade.SPSequence.get_IsBackwardsCompatible()
at Microsoft.SharePoint.Upgrade.SPUpgradeSession.IsBackwardsCompatible(Object o, Boolean bRecurse)
at Microsoft.SharePoint.Administration.SPPersistedUpgradableObject.get_IsBackwardsCompatible()
at Microsoft.SharePoint.Administration.SPPersistedUpgradableObject.ValidateBackwardsCompatibility()
at Microsoft.SharePoint.SPSite.PreinitializeServer(SPRequest request)
at Microsoft.SharePoint.SPWeb.InitializeSPRequest()
at Microsoft.SharePoint.SPWeb.BypassUseRemoteApis()
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.GetFileContents()
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.Open(Boolean readOnly)
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.OpenFile(DiagramRequest request, Boolean needXml)
at Microsoft.Office.Visio.Server.GraphicsServer.Core.VectorizeDiagram()
at Microsoft.Office.Visio.Server.GraphicsServer.Core.get_VectorDiagram()
at Microsoft.Office.Visio.Server.GraphicsServer.VisioGraphicsService.GetVectorDiagram(VectorDiagramRequestContract vectorDiagramRequest)
at Microsoft.Office.Visio.Server.GraphicsServer.VisioGraphicsService.VectorDiagramCallback(Object state)
at System.Threading.ExecutionContext.runTryCode(Object userData)
at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack)
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)
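
Instead of opening the log files by hand, entries like this one can also be pulled out with PowerShell. The following is a rough sketch rather than the exact steps I took: it assumes the SharePoint 2010 Management Shell on the server that hosts the Visio Services, that the error was reproduced within the last 10 minutes, and that the message filters (which are my own guesses based on the errors in this post) match what your log contains.

```powershell
# Assumption: the SharePoint snap-in is installed on this server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Get-SPLogEvent reads the ULS logs of the local server only.
# Assumption: the failure happened within the last 10 minutes.
Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-10) |
    Where-Object { $_.Message -like "*Login failed*" -or $_.Message -like "*Vector diagram*" } |
    Select-Object Timestamp, Area, Category, Level, Message |
    Format-List
```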


So it seems that the service account under which my Visio Services are running (CONTOSO\sp_serviceapp) couldn’t connect to my content database (WSS_Content_Team). I went to the SQL Server Management Studio and indeed, the CONTOSO\sp_serviceapp did not have permissions in my content database. So I went ahead and added the CONTOSO\sp_serviceapp to the dbo role in the content database:

image

I went back to my workflow status page and pressed Ctrl+F5 to reload, but still got the same error as before: “The server failed to process the request.”


Back to the ULS log. This time a different exception popped up:


Failed to generate Vector diagram for file http://team.contoso.dev/sites/test/Workflows/Engagements Workflow/Engagements Workflow_V1.vdw Error : 
Microsoft.SharePoint.Upgrade.SPUpgradeCompatibilityException: There is a compatibility range mismatch between the Web server and database "WSS_Content_Team", and connections to the data have been blocked to due to this incompatibility. This can happen when a content database has not been upgraded to be within the compatibility range of the Web server, or if the database has been upgraded to a higher level than the web server. The Web server and the database must be upgraded to the same version and build level to return to compatibility range.
at Microsoft.SharePoint.Administration.SPPersistedUpgradableObject.ValidateBackwardsCompatibility()
at Microsoft.SharePoint.SPSite.PreinitializeServer(SPRequest request)
at Microsoft.SharePoint.SPWeb.InitializeSPRequest()
at Microsoft.SharePoint.SPWeb.BypassUseRemoteApis()
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.GetFileContents()
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.Open(Boolean readOnly)
at Microsoft.Office.Visio.Server.GraphicsServer.FileHandler.OpenFile(DiagramRequest request, Boolean needXml)
at Microsoft.Office.Visio.Server.GraphicsServer.Core.VectorizeDiagram()
at Microsoft.Office.Visio.Server.GraphicsServer.Core.get_VectorDiagram()
at Microsoft.Office.Visio.Server.GraphicsServer.VisioGraphicsService.GetVectorDiagram(VectorDiagramRequestContract vectorDiagramRequest)


Huh? The error message implies that there is a potential compatibility problem with my database, which typically occurs when you attach a SharePoint 2007 database and attempt to use 2010 features in it before upgrading it to 2010. However, this was a brand new 2010 database, so this couldn’t be the case.


Then I remembered the golden rule: never touch SharePoint databases directly.


Never.


There’s always a better way. In my case I had granted dbo rights on the content DB to the service account directly, but that was obviously not enough to make it work. Looking for a solution, I stumbled upon this article, where I found the following PowerShell snippet, which I ran to give the service account the necessary DB access rights:



$webApp = Get-SPWebApplication "http://team.contoso.dev"

$webApp.GrantAccessToProcessIdentity("CONTOSO\sp_serviceapp")

I went back to my workflow’s status page, did a refresh and voilà:


image


Note:


You might need to restart the Visio Services from Central Administration and do an iisreset if it still doesn’t work.

The obligatory first post.

Ok, I've started a blog. Let's see how long it lasts.