Search 2010 Architecture and Scale – Part 2 Query


Several things have changed in SharePoint 2010 Query.  The query infrastructure is now componentized, so you only provision what you need.  This post defines the query components, explains how they work together, and shows how to provision them.

Special shout-out goes to Jon Waite for his valuable technical input\review…

Query Basics

Just like crawl, Query has been componentized as well and the following goals are met:

  • Sub-second query latency
  • Index is no longer a single point of failure and is stored on Query servers
  • Query consists of components which can be scaled out among multiple servers to improve performance

 

Query Flow

  1. A search is performed by a user
  2. The WFE serving the call uses the associated search service application proxy to connect to a server running the Query and Site Settings Service also known as the Query Processor.  It uses WCF for this communication.
  3. The Query Processor (QP) connects to the following components to gather results, merges and security trims them, and returns them to the WFE:
      • Query Component – holds entire index or partition of an index
      • Property Store DB – holds metadata\properties of indexed content
      • Search Admin DB – holds Security Descriptors\Configuration data

  4. The WFE displays search results to the user

 

Several Query components can be scaled out as an index\property store grows.  A single search service application can have multiples of the following:

      • Property Store DB
      • Query Components
      • Query Processors

 

Query Component and Property Store DB

I’ll use Query Server and Query Component interchangeably throughout this post.  A Query Server is a server that runs one or more Query Components.  These servers hold a full or partial copy of the search index, and they are now the sole owners of the index on the file system.  As stated in the previous post, the indexer crawls content and builds a temporary index.  The indexer then propagates portions of the temporary index to the Query Servers to be indexed.  The full or partial copy of the index held on a Query Server is referred to as an Index Partition.  Query Components run under the context of an Index Partition and are responsible for serving search queries.  A Query Component runs under MSSearch.exe and is mapped to only one Property Store DB.  By now, you should have noticed that we split up the databases (for example, Property Store DB and Crawl DB).  Separating these databases accomplishes the following:

      • Overall Database size is reduced
      • Database performance is improved

 

Also, by carving out the databases, performance hits such as writing crawled data to the Crawl Store DB won’t affect query performance, which depends heavily on the Property Store DB.

It’s possible to provision multiple Property Store databases and Query Components for a single Search service application.  There are plenty of reasons for doing this, and most of them are explained throughout this post.  Query Components can be provisioned to partition an index and\or mirror an index in order to provide fault tolerance.  Both Property Store databases and Query Components can be created by using either Central Administration or PowerShell.  To keep things simple, I’ll cover how to do it in Central Administration.  To make changes to the search topology, access the Search Administration page as follows:

 

Central Administration\Application Management\Manage Service Applications\select the Search Service Application and select Manage from the Ribbon

Scroll to the bottom of the page and this is where you can view\change the search topology. 

[Screenshot: Search Application Topology section of the Search Administration page]

 

Provisioning happens in 3 stages:

  1. Click the Modify button
  2. Select New Property Database or New Query Component and enter the options that fit your environment
  3. Apply Topology Changes

 

Fault tolerance + Performance

 

Query Component (Fault tolerance)

It’s highly recommended to make your index fault tolerant.  This is accomplished by mirroring a Query Component onto a different server.  Under Search Application Topology, select the Query Component and choose Add Mirror:

[Screenshot: Add Mirror option for a Query Component]

The end result is a second query component within the same Index Partition.

[Screenshot: two Query Components within the same Index Partition]

Note:  The Query Processor will distribute requests across both Query Components. 

 

Question: What if I don’t want queries served by one of my mirrored Query Components?

Answer:  On the Add mirror query component page, you can check the following option:

[Screenshot: failover-only option on the Add mirror query component page]

This doesn’t completely prevent the failover Query Component from receiving queries.  The Query Processor prefers Query Components that are not marked as failover (the active ones).  If all active Query Components are down, the Query Processor submits requests to the Query Components flagged as failover.
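The preference described above can be sketched in a few lines of Python.  This is only an illustration of the selection rule, not the actual Query Processor code: the QueryComponent class, the pick_component function, and the use of a random choice among healthy mirrors are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class QueryComponent:
    """Hypothetical stand-in for one query component in an index partition."""
    name: str
    failover_only: bool = False
    online: bool = True


def pick_component(mirrors, rng=random):
    """Pick a component for one partition, preferring active mirrors.

    Failover-only components are used only when no active one is online.
    """
    active = [c for c in mirrors if c.online and not c.failover_only]
    if active:
        return rng.choice(active)
    standby = [c for c in mirrors if c.online and c.failover_only]
    if standby:
        return rng.choice(standby)
    raise RuntimeError("no query component available for this partition")
```

With one active and one failover-only mirror, every request goes to the active mirror until it goes offline; only then does the failover-only mirror start receiving requests.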

 

Property Store (Fault tolerance)

We fully support SQL mirroring to achieve fault tolerance for Property Store DBs on the back end.

 

Query Component (Performance)

In previous versions of SharePoint, every query server stored the entire index.  While this achieved fault tolerance, it didn’t help with performance.  There is a direct correlation between the size of an index and query latency, so the size of an index can easily become a bottleneck for query performance.

For Example:

  • Index contains 10 million documents =  Average of 2 seconds per query
  • Index contains 20 million documents = Average of 4 seconds per query

This problem has been solved in SharePoint 2010.  An Index Partition can contain the entire index or a portion of it.  Creating an additional (non-mirrored) Query Component creates a new Index Partition that owns a portion of the index.

For Example:

If the entire index is 8 GB and contains 20 million documents:

  • Query Server 1 – Index Partition 1: holds 50% (4 GB of index\10 million documents)
  • Query Server 2 – Index Partition 2: holds 50% (4 GB of index\10 million documents)

Partitioning a large index reduces query times and removes this type of bottleneck.  Partitioning an index is as simple as provisioning new Query Components from the Search Application Topology section in Central Administration.
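To make the arithmetic concrete, here is a rough Python sketch.  It assumes that query latency grows linearly with the number of documents in a partition, using the illustrative figure above of 2 seconds per 10 million documents, and it ignores the cost of merging results across partitions.  The function names are made up for this example.

```python
def per_partition(total_docs, total_index_gb, partitions):
    """Split an index evenly across partitions: (docs, GB) per partition."""
    return total_docs / partitions, total_index_gb / partitions


def estimated_latency_seconds(docs_in_partition, seconds_per_10m_docs=2.0):
    """Assumed linear model taken from the example figures in this post."""
    return docs_in_partition / 10_000_000 * seconds_per_10m_docs


docs, gb = per_partition(20_000_000, 8, 2)
print(docs, gb, estimated_latency_seconds(docs))
```

Under those assumptions, each of the two partitions holds 10 million documents and 4 GB, and the estimated latency drops from 4 seconds back to 2 seconds.  Real numbers depend on hardware and content, so treat this as a way to reason about the trend rather than a sizing formula.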

For Example:

[Screenshot: Search Application Topology with multiple Index Partitions]

Question: If an index is partitioned out with multiple Query Components, how does the crawler distribute the indexed content?

Answer: The crawler evenly distributes crawled content to Index Partitions using a hash algorithm based on Doc IDs.
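The exact hash algorithm isn’t covered in this post, so the following Python sketch uses a simple modulo on an integer Doc ID as a stand-in.  The partition_for function is hypothetical; it only shows why hashing on the Doc ID spreads documents evenly without anyone having to decide where each document goes.

```python
from collections import Counter


def partition_for(doc_id: int, partition_count: int) -> int:
    """Map a Doc ID to a partition number (modulo used as a stand-in hash)."""
    return doc_id % partition_count


# Distribute 1,000 sequential Doc IDs across two index partitions.
counts = Counter(partition_for(doc_id, 2) for doc_id in range(1, 1001))
print(counts[0], counts[1])
```

Because the partition is derived from the Doc ID alone, the same document always lands in the same partition, and sequential IDs split evenly across the partitions.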

 

Property Store DB (Performance)

Just like Query Components, the Property Store DB can be scaled out to share the load of the metadata it stores.  If the Property Store DB becomes a bottleneck because of its size and\or because it strains the disk subsystem with high I/O latency on the back end, a new Property Store DB can be provisioned to share the load.  Just like the Crawl DB, the Property Store DB is useless unless it’s mapped to something.  In this case, a Property Store DB must be mapped to a Query Component.  If you decide to provision an additional Property Store DB to boost performance, an additional non-mirrored Query Component must be provisioned and mapped to it.

The following is a true statement:

“Creating an additional Property Store DB requires the index to be partitioned, because provisioning a new Query Component is required.”
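One way to picture this rule is as a small topology check.  The Python sketch below is a hypothetical model, not a SharePoint API: the QueryComponent class and the unmapped_property_dbs function are invented for illustration, and the database names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryComponent:
    """Hypothetical model: a component belongs to one partition, maps to one DB."""
    name: str
    partition: int
    property_db: str


def unmapped_property_dbs(property_dbs, components):
    """Return Property Store DBs that no query component is mapped to."""
    mapped = {c.property_db for c in components}
    return sorted(set(property_dbs) - mapped)


topology = [
    QueryComponent("QC1", partition=0, property_db="PropDB1"),
    QueryComponent("QC1-mirror", partition=0, property_db="PropDB1"),
]
print(unmapped_property_dbs(["PropDB1", "PropDB2"], topology))
```

In this model, adding a mirror doesn’t help a new Property Store DB, because the mirror lives in the same partition and maps to the same database.  The new database only becomes useful once a non-mirrored Query Component, and therefore a new partition, is mapped to it.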

 

Query Processor

Great, so understanding Property Store DB and Query Component scale-out is only half of the battle.  The Query Processor remains and still plays a vital role in Search 2010.  The Query Processor is responsible for processing a query and runs under the w3wp.exe process.  It retrieves results from the Property Store DB and the Index\Query Components.  Once results are retrieved, they are packaged\security trimmed and delivered back to the requester, which is the WFE that initiated the request.  The Query Processor load balances requests if more than one (mirrored) Query Component exists within the same Index Partition.  The exception to this rule is when one of the Query Components is marked as failover only.

Question: What if I partitioned off my index and I have multiple Query Components provisioned each serving a partition of the index?  How does Query Processor know which partition to connect to in order to accurately retrieve results?

Answer:  It doesn’t!  The Query Processor will connect to every single non-mirrored Query component that contains a partition of the Index to retrieve results.  

Question:  What if I created multiple Property Store Databases for performance reasons?   How does Query Processor know which Property store to connect to in order to accurately retrieve results?

Answer: It doesn’t!  The Query Processor will connect to every single Property Store DB to retrieve results.  
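Both answers describe a fan-out pattern: ask everything, then merge.  Here is a minimal Python sketch of that idea.  The data shapes are assumptions made for the example (each partition is a dict of term to (doc_id, score) hits, and each property store is a dict of doc_id to properties), the run_query function is hypothetical, and security trimming is left out.

```python
import heapq


def run_query(term, partitions, property_stores, top_n=10):
    """Query every partition, merge by score, then attach stored properties."""
    hits = []
    for partition in partitions:  # every index partition is asked
        hits.extend(partition.get(term, []))

    best = heapq.nlargest(top_n, hits, key=lambda hit: hit[1])

    results = []
    for doc_id, score in best:
        props = {}
        for store in property_stores:  # every property store is asked
            props.update(store.get(doc_id, {}))
        results.append({"doc_id": doc_id, "score": score, **props})
    return results


partition_1 = {"sharepoint": [(1, 0.9), (3, 0.4)]}
partition_2 = {"sharepoint": [(2, 0.7)]}
store_1 = {1: {"title": "A"}, 3: {"title": "C"}}
store_2 = {2: {"title": "B"}}
results = run_query("sharepoint", [partition_1, partition_2], [store_1, store_2])
print([r["doc_id"] for r in results])
```

The point of the sketch is that no routing decision is needed: since a matching document could live in any partition, and its metadata in any Property Store DB, the simplest correct approach is to ask all of them and merge the answers.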

 

In SharePoint 2007, the Query Processor ran on any WFE.  In SharePoint 2010, any server can run the Query Processor.  It’s no longer tied to a server running the Query role.  You provision the Query Processor role on a server by performing the following steps:

  1. In Central Administration, go to System Settings, then Manage services on server
  2. Start the Search Query and Site Settings Service

[Screenshot: Search Query and Site Settings Service on the Services on Server page]

Note:  After provisioning, a new web service is created within IIS on that server.

[Screenshot: the new web service in IIS]

 

 

Query Processor Scale Out

Just like the Query Component and Property Store DB, the Query Processor role can be scaled out to multiple servers.  The Query Processor can become a bottleneck, for example:

  • It can’t keep up with inbound requests, or the box and\or the W3WP.exe process hosting the Query Processor is CPU\memory bound.

In this case, provision additional Query Processors as needed.  Once additional Query Processors are provisioned, requests are load balanced in round-robin fashion across the servers hosting a Query Processor.

The same case can be made for achieving fault tolerance.  By having two servers hosting Query Processor role, if one goes down, the other will be used. 
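Round-robin with failover is easy to picture in code.  The Python sketch below is a simplified illustration, not how the search service application proxy is actually implemented: the QueryProcessorPool class, its down set, and the server names are all hypothetical.

```python
import itertools


class QueryProcessorPool:
    """Hand out Query Processor servers in turn, skipping ones marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._cycle = itertools.cycle(self._servers)
        self.down = set()  # servers currently considered unavailable

    def next_server(self):
        # Try each server at most once per request.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server not in self.down:
                return server
        raise RuntimeError("no Query Processor available")


pool = QueryProcessorPool(["APP1", "APP2"])
print([pool.next_server() for _ in range(4)])
```

With both servers up, requests alternate between them.  If one is added to the down set, every request goes to the remaining server, which covers both the performance and the fault tolerance cases.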

 

Query Processor functions in Parent\Child Farm

In a Publishing/Consumer farm scenario, the Query Processor always runs in the farm where the Search Service Application resides.  So if the Search Service Application resides in the Publishing farm, the Query Processor only runs in the Publishing farm.  The Consumer farm uses the associated Search Service Application proxy to connect over WCF to a Query Processor in the Publishing farm.

 

Observe is Step 1 and Taking Action is Step 2

Before arbitrarily provisioning new Query Components and Property Store DBs, observe the current environment\query health so that some evidence can be gathered before making this important decision.  The obvious reasons of fault tolerance and query latency are covered in the previous sections, so I won’t discuss those further.  Checking for system\hardware bottlenecks is a good first step before considering adding more Query Components\Property Store DBs.

 

Monitoring Query Server

Observation:  The Query server is almost maxed on CPU and\or is at the peak of available physical memory and query latency has increased as a result.

Action Taken:  Provision a new query component 

Monitoring SQL Server

Observation:  Property Store DB is I/O bound on SQL and disk latency is unexpectedly high.

Action Taken:  Provision a new Property Store on same/different SQL server

Important:  These are very basic approaches to system bottlenecks.  For example, don’t assume that a general observation of a CPU spike automatically requires provisioning additional Query Components.  More analysis is required, such as finding answers to the following questions:

  1. Does CPU only spike during crawl times?
  2. Which process is spiking?
  3. Has the overall size of the index/Property Store DB increased?
  4. Does SP health monitoring or Performance monitor reveal anything of use?
  5. Etc.

 

Thanks,

Russ Maxwell, MSFT