I know troubleshooting Search Query issues in SharePoint 2010 can be a difficult tasks. Usually, the problem I’m asked to tackle is why are my queries slow? I wanted to provide some additional detail and a behind the scenes look at what a Search Query looks like. I recommend reviewing my original Search Query blog if you need to brush up on the related components and how they work together.
Note: This scenario is strictly SharePoint 2010 without Fast
I figured the best way to diagram what happens behind the scenes is to use the Query Flow steps from my original blog and break them apart with a series of logging techniques. For Example (Network traces and ULS Logs)
When troubleshooting issues with Search Queries, It’s extremely important to identify how these moving parts operate together within the ULS logs and Network traces. Also, within the ULS logs we do a very good job of logging the execution times of each particular function “monitored scope”. The intent of this email is to provide you with a road map of how to walk a query behind the scenes with ULS logs and Network traces..
The main steps during Query Flow are the following:
1. A search is performed by a user
2. The WFE serving the call uses the associated search service application proxy to connect to a server running the Query and Site Settings Service also known as the Query Processor. It uses WCF for this communication.
3. The QP will connect to the following components to gather results merges\security trims and return results back to WFE:
4. WFE displays search results to the user
Below is a walkthrough of what a Query transaction looks like behind the scenes broken apart by the main steps above.
My current environment is setup like the following:
- WFE – Hosting the public facing sites for end users
- APP – Hosts both the Query Component and the Query Processor
Step 1 – A search is performed by a user
The following query was for “Dirk”
The following Request is submitted to the Web Front End the user is currently connected to:
Http: Request, GET /_layouts/OSSSearchResults.aspx, Query:k=Dirk&cs=This%20Site&u=http%3A%2F%2Fwfe, Using NTLM Authorization
Capturing this request with a network trace looks like:
Step 2 – The WFE serving the call uses the associated search service application proxy to connect to a server running the Query and Site Settings Service also known as the Query Processor. It uses WCF for this communication.
Behind the scenes, a query is submitted against the designated server running the Query Processor via a web service call to that servers SearchService.svc web service. Where is that located within IIS?
ULS Logs set to Verbose on the WFE
01/27/2012 11:30:45.18 w3wp.exe (0x1140) 0x120C SharePoint Foundation Logging Correlation Data xmnv Medium Name=Request (GET:http://wfe:80/_layouts/OSSSearchResults.aspx?k=Dirk&cs=This%20Site&u=http%3A%2F%2Fwfe) 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
Call to the Query Processor via the SearchService.svc web service to execute the request:
01/27/2012 11:31:17.12 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring nasq Verbose Entering monitored scope (SharePointSearchRuntime::GetQueryResult) 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:17.15 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring nasq Verbose Entering monitored scope (Execute) 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:17.15 w3wp.exe (0x1140) 0x120C SharePoint Foundation Topology e5mc Medium WcfSendRequest: RemoteAddress: ‘http://app:32843/6c270a2117544235b521a8e5a864c5f6/SearchService.svc‘ Channel: ‘Microsoft.Office.Server.Search.Administration.ISearchServiceApplication’ Action: ‘http://tempuri.org/ISearchQueryServiceApplication/Execute‘ MessageId: ‘urn:uuid:5c0fbea8-1ca8-4146-8150-c6d8b735adae’ 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:17.15 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring nasq Verbose Entering monitored scope (ExecuteWcfOperation:http://tempuri.org/ISearchQueryServiceApplication/Execute) 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
Network Monitor
We send multiple Post request to the QP along with HTTP Payload. The Post request with HTTP Payload that matters is the last one sent in the network conversation. For example, the HTTP Payload contains the following:
117 12:31:05 PM 1/27/2012 4.0724607 w3wp.exe 10.1.0.12 10.1.0.11 HTTP HTTP:Request, POST /6c270a2117544235b521a8e5a864c5f6/SearchService.svc {HTTP:11, TCP:10, IPv4:9}
119 12:31:05 PM 1/27/2012 4.0746818 w3wp.exe 10.1.0.12 10.1.0.11 HTTP HTTP:HTTP Payload, URL: /6c270a2117544235b521a8e5a864c5f6/SearchService.svc {HTTP:11, TCP:10, IPv4:9}
Frame 119 is important as the payload information contains the method “Execute” and the QueryText “dirk”:
Step 3 – The QP will connect to the following components to gather results merges\security trims and return results back to WFE
ULS Logs set to Verbose on the Server hosting the Query Processor
Here is the inbound request from the WFE.
01/27/2012 09:31:16.93 w3wp.exe (0x1544) 0x1644 SharePoint Foundation Topology e5mb Medium WcfReceiveRequest: LocalAddress: ‘http://app.contoso.local:32843/6c270a2117544235b521a8e5a864c5f6/SearchService.svc‘ Channel: ‘System.ServiceModel.Channels.ServiceChannel’ Action: ‘http://tempuri.org/ISearchQueryServiceApplication/Execute‘ MessageId: ‘urn:uuid:5c0fbea8-1ca8-4146-8150-c6d8b735adae’ 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 09:31:16.93 w3wp.exe (0x1544) 0x1644 SharePoint Foundation Monitoring nasq Medium Entering monitored scope (ExecuteWcfServerOperation) 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
When the Query Processor starts on the request, it will get a new correlation ID:
01/27/2012 09:31:17.13 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2ni Verbose Calling CQPCache::GetQPCache. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Here is where Query Processor will start fetching results from the Query Component\s holding the index to retrieve results as well as the Search Property Store Database on the SQL server:
01/27/2012 09:31:18.02 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2pp Verbose Entering Split Command Tree parser. [splitter.cxx:184] d:\office\source\search\native\ytrip\tripoli\icommand\splitter.cxx 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.43 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2ps Verbose Tree only had both FT and SQL constraints… split apart. [splitter.cxx:230] d:\office\source\search\native\ytrip\tripoli\icommand\splitter.cxx 599cd6ce-7c0c-44e8-8983-0be66f7176ad
QP Fetching results from the Index:
01/27/2012 09:31:18.43 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jx Verbose Executing SubQuery FT Part: op: (116) DBOP_and wKind: (3) DBVALUEKIND_I4: 0 first child: op: (145) DBOP_content_freetext wKind: (259) DBVALUEKIND_CONTENT: ‘Dirk ‘, method 0, weight 100000000, lcid 0x409 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 2147450879 next sibling: op: (68) DBOP_equal wKind: (3) DBVALUEKIND_I4: 0 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 2147418095 next sibling: op: (0) DBOP_scalar_constant wKind: (12) DBVALUEKIND_VARIANT VT_LPWSTR ‘http://wfe‘ 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.66 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2kp Verbose Got 14 candidates from FT. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.66 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2ks Verbose Got some.. Now joining with SQL. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.66 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2et Verbose Candidates Query to SQL: /* {599CD6CE-7C0C-44E8-8983-0BE66F7176AD} */ SET NOCOUNT ON;DECLARE @joinData VARBINARY(MAX) ; SET @joinData = ? ; DECLARE @joinRows INT ; SET @joinRows = DATALENGTH(@joinData) / 8 ;;WITH DocIds(DocId, Value) AS ( SELECT TOP(@joinRows) CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 1, 4) AS INT), CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 5, 4) AS INT) FROM dbo.MSSOrdinal AS ord WITH(NOLOCK) WHERE ord.n <= @joinRows ) SELECT A.DuplicateHashes, A.HasPluggableSecurityTrimming, A.DocId, A.Sdid FROM DocIds AS T INNER LOOP JOIN MssDocSdids AS A ON T.DocId = A.DocId ORDER BY T.Value DESC OPTION (MAXDOP 1) 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.68 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2l2 Verbose Got 14 candidates after join. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Now Security Trimming and Dup removal of the results:
01/27/2012 09:31:18.68 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2p4 Verbose In CSecurityTrimmer::AccessCheck there were 1 unique sdids among the candidates. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.69 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2ke Verbose Had 14 candidates before sec-trimming, have 14 candidates after. ‘fMoreCandidates’ = 0. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.69 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2la Verbose Had 14 candidates before near dup removal, have 5 candidates after. ‘fMoreCandidates’ = 0. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
QP Fetching results from SQL Server:
01/27/2012 09:31:18.69 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jx Verbose Executing SubQuery SQL Part: op: (68) DBOP_equal wKind: (3) DBVALUEKIND_I4: 0 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 7 next sibling: op: (0) DBOP_scalar_constant wKind: (12) DBVALUEKIND_VARIANT VT_LPWSTR ‘http://Dirk‘ FT Part: op: (68) DBOP_equal wKind: (3) DBVALUEKIND_I4: 0 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 2147418095 next sibling: op: (0) DBOP_scalar_constant wKind: (12) DBVALUEKIND_VARIANT VT_LPWSTR ‘http://wfe‘ 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.69 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2et Verbose Candidates Query to SQL: /* {599CD6CE-7C0C-44E8-8983-0BE66F7176AD} */ SET NOCOUNT ON; SELECT TOP 2797 A.DuplicateHashes, A.HasPluggableSecurityTrimming, A.DocId, A.Sdid FROM MssDocSdids AS A WITH (NOLOCK) LEFT OUTER JOIN dbo.MSSDocProps AS P7 WITH (NOLOCK) ON ( P7.DocId = A.DocId AND P7.Pid = 7 ) WHERE P7.llVal = ? 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.71 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jd Verbose Got 0 candidates from this sub-query. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Sort and Rank the Search Results:
01/27/2012 09:31:18.71 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jk Verbose Performing ‘Multi Sub-Query sort-by-rank’ final candidates code-path. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.71 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jw Verbose Fetching Search-Results from property store. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Query has been Completed by the Query Processor and Query Timings are logged:
01/27/2012 09:31:18.74 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jt Medium Completed query execution with timings: total:718 dup:0 sec:16 join:0 ft:234 sql:31. Join Retry: 0. Security Trimming Retry: 0. Duplicate removal Retry: 0. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
01/27/2012 09:31:18.76 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor g2jn Verbose Verbose query timings: total: 766, other: 454, ft: 0, sql: 31, join: 234, dup: 0, sec: 16, final sort: 0, best bets: 0, results retr: 0, high conf: 0, populate: 31. 599cd6ce-7c0c-44e8-8983-0be66f7176ad
QP Execute is completed
01/27/2012 09:31:18.99 w3wp.exe (0x1544) 0x1644 SharePoint Foundation Monitoring b4ly Medium Leaving Monitored Scope (ExecuteWcfServerOperation). Execution Time=2069.4805 599cd6ce-7c0c-44e8-8983-0be66f7176ad
The original inbound request is completed (Notice the Correlation ID matches the original execute call to the Web Service – original call above in red)
01/27/2012 09:31:18.99 w3wp.exe (0x1544) 0x1644 SharePoint Foundation Monitoring b4ly Medium Leaving Monitored Scope (ExecuteWcfServerOperation). Execution Time=2069.4805 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Important: You might notice that the Query Processor decides to execute against SQL first and then the Full-Text index. The decision making on why this happens is outside the scope of this blog but it’s dependent on the type of Search Query performed.
Step 4 – WFE displays search results to the user
ULS Logs set to Verbose on the WFE that originally serviced the request from Step 1
When the Query Processor returns the results back to the WFE (Notice this is the completion of the monitored scope in green) The beginning of this request is at the top in green.
01/27/2012 11:31:19.52 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring b4ly Verbose Leaving Monitored Scope (ExecuteWcfOperation:http://tempuri.org/ISearchQueryServiceApplication/Execute). Execution Time=2400.0826 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:19.81 w3wp.exe (0x1140) 0x120C SharePoint Server Search Query dk91 High SearchServiceApplicationProxy::Execute–Id: Elapsed Time: 2664 QP Time: 1792 Sql Time: 31 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:19.81 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring b4ly Verbose Leaving Monitored Scope (Execute). Execution Time=2703.0805 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:19.95 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring b4ly Verbose Leaving Monitored Scope (SharePointSearchRuntime::GetQueryResult). Execution Time=2856.5793 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:20.09 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring b4ly Verbose Leaving Monitored Scope (SharePointSearchRuntime::SendRequest). Execution Time=3025.3754 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
01/27/2012 11:31:20.09 w3wp.exe (0x1140) 0x120C SharePoint Server Search Query dln3 Verbose GetResults on location LocalSearchIndex for Dirk completed with 5 results 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
The WFE prepares the results for the User
The WFE takes the above results and builds up the ossearchresults.aspx page by using the following monitored scopes:
RefinementManager.GetRefinement
OnPreRender
CreateChildControls
AddAssemblyReference
DataBinding DataFormWebPart
GetXPathNavigator
UserPreference.GetUserPreference
UserPreference.GetFromCache
Render Ribbon
SearchCommon::GetCurrentSiteCollectionProperty
RenderWebPart
PostRequestExecuteHandler
EndRequestHandler SPRequest Disposal
Finally when the page is rendered we leave the original Monitored Scope
01/27/2012 11:31:33.40 w3wp.exe (0x1140) 0x120C SharePoint Foundation Monitoring b4ly Medium Leaving Monitored Scope (Request (GET:http://wfe:80/_layouts/OSSSearchResults.aspx?k=Dirk&cs=This%20Site&u=http%3A%2F%2Fwfe)). Execution Time=48385.3272 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
Q&A Portion
Question: So how were you able to even identify the start of the Query conversation within Step 2 on the Web Front-End?
Answer: First, download ULSViewer here: http://archive.msdn.microsoft.com/ULSViewer
Next, once you open the associated ULS log on the Web Front-End, you can hit (CTRL+M) to bring up the filter window. Then do a query like the following:
Message Contains dirk. (In my case, I submitted a query dirk.)
Question: How did you find the other side of the conversation on the Query Processor under Step 3?
Answer: I used ULS Viewer and opened the associated ULS log on the Query Processor. I then filtered (CTRL+M) for the following:
Correlation ID Contains 9d7bacd8-0e70-4ffe-a293-cb2c9c41f205
Question: How did you find the associated Query Processor entries assuming more than one query is firing against the QP?
Answer: I used ULS Viewer and did a CTRL+F and queried for dirk. I found it immediately from this entry:
01/27/2012 09:31:18.43 w3wp.exe (0x1544) 0x1644 SharePoint Server Search Query Processor e2jx Verbose Executing SubQuery FT Part: op: (116) DBOP_and wKind: (3) DBVALUEKIND_I4: 0 first child: op: (145) DBOP_content_freetext
wKind: (259) DBVALUEKIND_CONTENT: ‘Dirk ‘, method 0, weight 100000000, lcid 0x409 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 2147450879 next sibling: op: (68) DBOP_equal
wKind: (3) DBVALUEKIND_I4: 0 first child: op: (5) DBOP_column_name wKind: (258) DBVALUEKIND_ID: 012357bd-1113-171d-1f25-292bb0b0b0b0 2147418095 next sibling: op: (0) DBOP_scalar_constant wKind: (12) DBVALUEKIND_VARIANT VT_LPWSTR ‘http://wfe‘ 599cd6ce-7c0c-44e8-8983-0be66f7176ad
With this log entry, you can see the category is Query Processor and the query term is defined. I know from the entry this is request against the Full-Text Index. Since this is my matching entry, I can grab the correlation ID off the end.
Now that I have the correlation ID, I can filter the ULS log with ULS viewer to see the entire conversation:
Correlation ID Contains 599cd6ce-7c0c-44e8-8983-0be66f7176ad
Thanks!
Russ Maxwell, MSFT