Monitoring SharePoint 2013 Search Top Level Errors

Intro

A common request from SharePoint Administrators and SharePoint Search Administrators is some sort of alert notification when top level errors occur during crawls. The SharePoint 2013 UI provides an impressive look at crawl statistics; however, validating that no errors occurred during a given crawl is a manual effort. While the SharePoint 2013 Management Pack for SCOM may very well provide this sort of alert notification and monitoring, I was more interested in how one might accomplish monitoring and notification of top level errors in an environment where SCOM isn’t set up. Top Level Errors are worth monitoring for two reasons.

First, Top Level Errors will prevent crawling any new, changed, or deleted content within that given host. For example, if my Content Source contains the URL https://wfe.contoso.com, nothing within that namespace will be crawled until a subsequent crawl gets past the top level error. This is especially important for SharePoint 2013 on-premises installations that leverage Host Name Site Collections. When you configure crawling for an HNSC (host name site collection), you usually input only one URL in a Content Source. That URL is usually the top level or root site collection in the associated web application. The crawler will discover and crawl all of the HNSCs on its own, assuming no top level error occurs. If a top level error occurs against the root site collection, which is the only URL defined in the content source, newly added, changed, or deleted content will not be picked up by the crawler, and this applies to all HNSCs in the specified Web Application.

Second, if enough time passes and the crawl is retried over and over yet still errors, the crawler could potentially delete all associated items from the index. This is because the Search Service Application has deletion policies with default values. For example, if a top level URL or any lower level URL hits errors during crawls a set number of consecutive times and a set number of hours has passed, that content will be purged from the index. From a top level error perspective with an HNSC setup, this only applies to items within the URL namespace defined in the content source. The remaining HNSCs would be preserved, and deletion policies wouldn’t impact their content. This is interesting to think about: one could say that deletion policies are ineffective against Host Name Site Collections not defined in a Content Source when Top Level Errors occur. If those non top level Host Name Site Collections encounter errors of their own during crawls, that is an entirely different story; however, I’m focused on top level errors in this example.
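To see which deletion policy values are in effect on a farm, something like the following sketch may help. It assumes a single Search Service Application, and the property names are recalled from the SharePoint 2013 "manage deletion of index items" guidance rather than taken from this post, so verify them against that documentation before relying on the output.

```powershell
# Sketch only: assumes one Search Service Application in the farm.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# Assumed property names (not from the original post): consecutive-error
# counts and hour intervals that must both be exceeded before items are deleted.
$policyProps = 'ErrorDeleteCountAllowed', 'ErrorDeleteIntervalAllowed',
               'ErrorCountAllowed', 'ErrorIntervalAllowed'

foreach ($name in $policyProps)
{
    # Print each policy name with its current value
    '{0} = {1}' -f $name, $ssa.GetProperty($name)
}
```

Knowing these thresholds gives a rough idea of how long a top level error can go unnoticed before content starts dropping out of the index.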

 

I authored a simple PowerShell script that checks all content sources and, if Top Level Errors are found, reports them in an email sent to the recipients of your choosing. This is something I would schedule to run nightly via a Scheduled Task. For more information on setting up that part, see here.
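As a rough sketch of the scheduling piece, the built-in schtasks.exe tool can register a nightly task. The script path, task name, start time, and service account below are hypothetical placeholders. The path is deliberately free of spaces to avoid quoting problems, and the account is assumed to have rights to the Search Service Application.

```powershell
# Hypothetical values; adjust for your environment.
$scriptPath = 'C:\Scripts\Check-TopLevelErrors.ps1'   # no spaces in the path
$taskName   = 'SharePoint Top Level Error Check'
$runAs      = 'CONTOSO\svc-spsearch'                  # hypothetical account

# Command line the task will execute
$action = "powershell.exe -NoProfile -File $scriptPath"

# Create a task that runs every day at 2:00 AM.
# With /RU supplied and /RP omitted, schtasks prompts for the password.
schtasks.exe /Create /TN $taskName /TR $action /SC DAILY /ST 02:00 /RU $runAs
```

Because a scheduled task starts plain powershell.exe rather than the SharePoint 2013 Management Shell, the script has to load the SharePoint snap-in itself before calling any SharePoint cmdlets.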

The following is an example email generated by the script when a top level error occurs:

[Image: example alert email generated by the script]

Requirements

The requirements are as follows:

1. Tested against a SharePoint 2013 on-premises installation
2. You’ll need to know the SMTP address of your email server
3. You’ll need to edit the script below to match your desired settings:

a. $smtpServer (Enter your SMTP server address here)
b. $from (Enter your from email address here)
c. $to (Enter your to email address here)

Updated 8/3/2015: I updated the script and added logic to include the original date of a Top Level Error by leveraging a new custom property on each content source. For example, if a content source errors for 5 consecutive days, the original date of the error will be used in each email sent. The recorded date is the first time the PowerShell script captures a top level error for a given content source, so it will not match the actual date and time of the top level error in the crawl log. Once that top level error has cleared, the custom property for that content source is reset back to healthy.
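The property handling described above boils down to a small state transition. The following is a standalone sketch of that logic only. Get-NextTlDate is a hypothetical helper that does not appear in the script, and it works on plain values so it can be tried without a SharePoint farm.

```powershell
# Hypothetical helper illustrating the "tldate" logic; not part of the script.
function Get-NextTlDate
{
    param(
        [string]   $Current,          # current "tldate" value: 'healthy', a date string, or empty
        [int]      $ErrorCount,       # top level error count for the content source
        [datetime] $Now = (Get-Date)  # time of this check
    )

    # No top level errors: reset to healthy
    if ($ErrorCount -le 0) { return 'healthy' }

    # First time the error is seen: record the date of this check
    if ([string]::IsNullOrEmpty($Current) -or $Current -eq 'healthy')
    {
        return $Now.ToString()
    }

    # Error already recorded on an earlier run: keep the original date
    return $Current
}

# Day 1: error first captured, so the date of the check is stored
$day1 = Get-NextTlDate -Current 'healthy' -ErrorCount 2 -Now ([datetime]'2015-08-03')
# Day 2: still erroring, so the original date is kept
$day2 = Get-NextTlDate -Current $day1 -ErrorCount 2 -Now ([datetime]'2015-08-04')
# Day 3: error cleared, so the value goes back to healthy
$day3 = Get-NextTlDate -Current $day2 -ErrorCount 0
```

Keeping the first captured date means every follow-up email points back to when the problem was first noticed rather than to the most recent run.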

 

Instructions and the PowerShell Script:

I recommend testing by running the script manually first. After testing is complete, I recommend setting it up as a scheduled task to run nightly or at a desired interval. The steps for manually testing the script are below:

1. Copy the script below and paste it into Notepad
2. Save it with a .ps1 extension (for example, anyfilename.ps1)
3. To run, copy the file to a SharePoint Server
4. Select Start\Microsoft SharePoint 2013 Products\SharePoint 2013 Management Shell
5. Browse to the directory holding the copied script file
6. Run the script: .\anyfilename.ps1 (assuming anyfilename is the name of the file)

# Microsoft provides programming examples for illustration only,
# without warranty either expressed or implied, including, but not
# limited to, the implied warranties of merchantability and/or
# fitness for a particular purpose.
#
# This sample assumes that you are familiar with the programming
# language being demonstrated and the tools used to create and debug
# procedures. Microsoft support professionals can help explain the
# functionality of a particular procedure, but they will not modify
# these examples to provide added functionality or construct
# procedures to meet your specific needs. If you have limited
# programming experience, you may want to contact a Microsoft
# Certified Partner or the Microsoft fee-based consulting line at
# (800) 936-5200
# For more information about Microsoft Certified Partners, please
# visit the following Microsoft Web site:
# https://partner.microsoft.com/global/30000104

# Version 2.0

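# Load the SharePoint cmdlets when running outside the SharePoint 2013
# Management Shell (for example, from a scheduled task)
if (-not (Get-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue))
{
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}
[Void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
Start-SPAssignment -Global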

## Function ensures the "tldate" custom property exists on the content source ##
## It reads $cs from the calling foreach loop ##
function propcheck
{
    $temptldate = $cs.GetProperty("tldate")
    if($null -eq $temptldate)
    {
        $cs.SetProperty("tldate","healthy")
        $cs.Update()
    }
}

## Function records the date of the first top level error, or resets the property once errors clear ##
function checkdates
{
    if($cs.LevelHighErrorCount -gt 0)
    {
        # Check the top level error property.
        # If it's healthy, this is the first capture of the error, so store the current date.
        # Otherwise keep the existing date.
        $tlprop = $cs.GetProperty("tldate")
        if($tlprop -eq "healthy")
        {
            $date = Get-Date
            $da = $date.ToString()
            $cs.SetProperty("tldate", $da)
            $cs.Update()
        }
    }
    elseif($cs.LevelHighErrorCount -eq 0)
    {
        $cs.SetProperty("tldate","healthy")
        $cs.Update()
    }
}

## Script Starts Here ##
$ctr = 0
$ssa = Get-SPEnterpriseSearchServiceApplication
# Display the Search Service Application being checked
$ssa
# $crawlLog is not used further in this version of the script
$crawlLog = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog $ssa
$contentSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa

$msgbody = @'
One or more Content Sources have reported a Top Level Error in the last Crawl.

'@

foreach($cs in $contentSource)
{
    ## Check to ensure tldate property exists ##
    propcheck

    ## Process dates ##
    checkdates
    if($cs.LevelHighErrorCount -gt 0)
    {
        # Append one entry per failing content source, ending with a blank line
        $msgbody += "Content Source: $($cs.Name)`r`n"
        $msgbody += "        Number of Errors: $($cs.LevelHighErrorCount)`r`n"
        $msgbody += "        Date of Original Error: $($cs.GetProperty('tldate'))`r`n`r`n"
        $ctr++
    }
}

if($ctr -gt 0)
{
    Write-Host "Top Level Errors found, sending email now" -ForegroundColor Green
    ## Email the results in the message body ##
    $smtpServer = "app3.contoso.local"
    $from = New-Object System.Net.Mail.MailAddress "SharePointAlert@Contoso.local"
    $to = New-Object System.Net.Mail.MailAddress "admins@contoso.local"
    $subject = "Content Source Error Alert!"
    $msg = New-Object System.Net.Mail.MailMessage($from, $to, $subject, $msgbody)
    $smtp = New-Object System.Net.Mail.SmtpClient($smtpServer)
    try
    {
        $smtp.Send($msg)
        Write-Host "Message Sent"
    }
    catch
    {
        # $_ holds the error caught by this catch block
        Write-Host ("Exception caught during message submission: {0}" -f $_.Exception.Message)
    }
}
else
{
    Write-Host "No Top Level Errors Occurred" -ForegroundColor Green
}

Stop-SPAssignment -Global

###Script Ends Here###

 

Thanks!

Russ Maxwell, MSFT