Here’s how Office 365 crawls to improve search efficiency

04 Feb 2019

The cloud now plays a major role in organizations, and everyone wants to move toward it. This has led to some confusion around managing and improving search. When we talk with customers about the continued need for a search strategy, somebody asks, “Doesn’t the Microsoft Graph handle everything? Why do I still need to worry about search?”

Search is one of the critical parts of SharePoint, whether online or on-premises. You should always ensure that content is properly tagged, classified, and indexed. Crawling and indexing are what make content findable, and SharePoint Online (Office 365) and SharePoint on-premises differ in key ways in how content is crawled and indexed.

Crawler Analysis in SharePoint Online

The crawler connects securely to a given content source, maps the content from the source system to the crawled properties of the search engine, and finally feeds the engine in either a full crawl or an incremental crawl (which picks up only changes). The results you see in the Content Search web part or the Search Results web part come from the search index, not directly from your lists and libraries.
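The difference between a full crawl and an incremental crawl can be sketched with a toy model. This is only an illustration, not how the SharePoint crawler is implemented: the function names and the sample library are hypothetical, and modification timestamps stand in for whatever change detection the real crawler uses.

```python
from datetime import datetime


def full_crawl(source):
    """Return every item id in the source, regardless of when it changed."""
    return sorted(source)


def incremental_crawl(source, last_crawl):
    """Return only the ids of items modified after the previous crawl."""
    return sorted(
        item_id for item_id, modified in source.items() if modified > last_crawl
    )


# Hypothetical library: item id -> last modified time.
source = {
    "doc1": datetime(2019, 2, 1, 9, 0),
    "doc2": datetime(2019, 2, 3, 14, 30),
    "doc3": datetime(2019, 2, 4, 8, 15),
}
last_crawl = datetime(2019, 2, 2, 0, 0)

print(full_crawl(source))                     # every item is processed
print(incremental_crawl(source, last_crawl))  # only items changed since last_crawl
```

In this sketch the incremental crawl skips doc1 because it has not changed since the previous crawl, which is why incremental crawls are cheaper than full crawls.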

If you change managed properties or add new ones, the changes take effect only after the content has been re-crawled. Crawling in SharePoint Online happens automatically based on the defined crawl schedule, but because your changes are made in the search schema and not to the actual site, the search index will not automatically re-crawl the list or library.

To make sure the changes are picked up, you can explicitly re-index a list or library. Its content is then re-crawled during the next scheduled crawl, and you can start using your new managed properties in queries, query rules, and display templates.
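Why a schema change needs a re-index can be shown with a small toy model. Everything here is an assumption made for illustration: `ToySearchIndex` is a hypothetical class, not a SharePoint API, and the `ows_`-prefixed names and the `ProductType` managed property are made-up examples.

```python
class ToySearchIndex:
    """Hypothetical stand-in for a search index; not a SharePoint API."""

    def __init__(self, mapping):
        # mapping: crawled property name -> managed property name
        self.mapping = dict(mapping)
        self.entries = {}

    def crawl(self, items):
        """(Re-)index items using whatever mapping is current at crawl time."""
        for item_id, crawled in items.items():
            self.entries[item_id] = {
                self.mapping[name]: value
                for name, value in crawled.items()
                if name in self.mapping
            }

    def search(self, managed_property, value):
        """Return ids of items whose indexed managed property equals value."""
        return sorted(
            item_id
            for item_id, entry in self.entries.items()
            if entry.get(managed_property) == value
        )


library = {
    "doc1": {"ows_Title": "Q3 budget", "ows_ProductType": "Hardware"},
    "doc2": {"ows_Title": "Roadmap", "ows_ProductType": "Software"},
}

index = ToySearchIndex({"ows_Title": "Title"})
index.crawl(library)

# Schema change: map a new crawled property to a managed property.
index.mapping["ows_ProductType"] = "ProductType"
before = index.search("ProductType", "Hardware")  # index still holds the old entries

index.crawl(library)  # re-index the library
after = index.search("ProductType", "Hardware")

print(before, after)
```

The library content never changed in this sketch, only the mapping did, so nothing would prompt a change-based crawl to revisit the items. The new managed property becomes searchable only after the explicit re-crawl.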

Search engines differ from one another in the breadth of their connectors and in their coverage of security models and data types. These differences affect performance (both throughput and latency), robustness, and ease of administration.
SharePoint is one of the most popular content management systems (CMS). A CMS enables many users to publish, edit, and modify content, and many such systems also provide ways to manage workflows in a collaborative environment. SharePoint 2016, for example, supports multiple crawl components, crawl databases, and content sources, including:

• SharePoint sites (from SPS2003 through SP2016)
• HTTP (websites)
• File shares
• Business Data Connectivity (BDC) Framework — This includes the connectors that are built on the BDC framework:
• Exchange Public Folders
• Lotus Notes
• Documentum
• Taxonomy Connector (connects to MMS)
• People Profile Connector

There are two types of SharePoint crawling:

1) The continuous crawl, which runs every 15 minutes and picks up new and changed documents or items
2) The incremental crawl, which follows a Microsoft-defined schedule to pick up any changes in the search configuration

Crawl Control in SharePoint Online

In a SharePoint on-premises environment, administrators can control the type and frequency of crawls, whereas SharePoint Online uses an automated schedule that cannot be changed. These crawls are managed by Microsoft and typically run every 4 to 8 hours after the previous incremental crawl.

The search index contains information from all documents and pages on your site. Only managed properties are kept in the index, which means that users can search only on managed properties. Crawled properties must be mapped to managed properties to get the content and metadata from your documents into the search index.

Administrators can re-index a site, a document library, or a list in SharePoint Online. The Microsoft-controlled crawl schedule becomes a concern when you’ve mapped a crawled property to a managed property and want the managed property updated to reflect this change.

In an on-premises SharePoint environment, you can initiate a full crawl to capture the change and re-index your environment. With SharePoint Online, however, there is no option to start a full crawl yourself. Beyond re-indexing individual sites, libraries, or lists, you’ll need to open a support ticket with Microsoft to re-index your tenant.

Customizing the search experience can have a direct impact on end-user adoption. The search schema controls what you can search for, how you search, and how you present the results on your website or intranet. Search “discovers” information by crawling items on your website. The discovered content and metadata are referred to as “properties” of the item.

The search schema contains a list of crawled properties that helps the crawler decide what content and metadata to extract. By changing the search schema, you can customize the search experience in SharePoint Online. Why would you want to change your search schema? So that you can provide a search experience that best matches your organization’s unique requirements.

It also helps your end users find the data they’re looking for more easily. For example, you might modify your schema to sort search results based on Managed Metadata columns that are specific to your organization, such as prioritizing results based on Product Type or Template.
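Once a column is available as a sortable managed property, a query can sort on it. The sketch below builds a URL for the SharePoint Search REST endpoint (`/_api/search/query`) with its `querytext`, `selectproperties`, `sortlist`, and `rowlimit` parameters. The helper function, the `contoso` tenant URL, and the assumption that `RefinableString00` has been mapped to a Product Type column are all hypothetical. The code only builds the URL and does not send a request or handle authentication.

```python
from urllib.parse import quote


def build_search_url(site_url, query_text, select_properties=(), sort_list=(), row_limit=10):
    """Build a Search REST query URL (hypothetical helper, URL construction only)."""
    # The REST endpoint expects string values wrapped in single quotes.
    params = [("querytext", f"'{query_text}'")]
    if select_properties:
        params.append(("selectproperties", "'" + ",".join(select_properties) + "'"))
    if sort_list:
        # Each sort entry is "<managed property>:<ascending|descending>".
        sort_value = ",".join(f"{prop}:{direction}" for prop, direction in sort_list)
        params.append(("sortlist", f"'{sort_value}'"))
    params.append(("rowlimit", str(row_limit)))
    query = "&".join(f"{name}={quote(value, safe='')}" for name, value in params)
    return f"{site_url.rstrip('/')}/_api/search/query?{query}"


# Assumption: RefinableString00 is mapped to the Product Type column and is sortable.
url = build_search_url(
    "https://contoso.sharepoint.com/",
    "budget",
    select_properties=["Title", "Path"],
    sort_list=[("RefinableString00", "ascending")],
    row_limit=5,
)
print(url)
```

Sorting works only on managed properties that are marked sortable in the search schema, which is one reason to plan schema changes before building queries and display templates on top of them.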

On-Premises vs. Online Search

Moving to Office 365 and SharePoint Online can require a major shift in how you manage not just search, but all aspects of your SharePoint environment. Many features of SharePoint Central Administration have been streamlined, and granular controls have been removed.
