Tuesday, April 30, 2013

Meet Your New Best Friends in Sitecore 7: SearchLog and CrawlingLog


It’s super important to understand how both search and indexing work under the hood, and for that reason, the CrawlingLog and SearchLog facilities should be very helpful in mastering Sitecore 7.

SearchLog

The SearchLog provides an ultimate insight on how your search queries are executed.
Similar to other loggers, the SearchLog is defined in the <log4net /> section of the web.config, with the logging level set to INFO by default:
    <logger name="Sitecore.Diagnostics.Search" additivity="false">
      <level value="INFO" />
      vappender-ref ref="SearchLogFileAppender" />
    </logger>
The appender for this log is defined in the same section, and by default is setup to write to a txt file under the data folder/logs:
    <appender name="SearchLogFileAppender" ... >
      <file value="$(dataFolder)/logs/Search.log.{date}.txt" />
      <appendToFile value="true" />
      <layout type="log4net.Layout.PatternLayout">
         <conversionPattern value="%4t %d{ABSOLUTE} %-5p %m%n" />
      </layout>
    </appender>
Since the implementation is based on Log4Net, you can tweak the appender to log to a Windows Event Log or database, or any other location. Please refer to the Log4Net documentation for this.
Here is a sample output of the SearchLog:
3212 19:31:56 INFO  ExecuteQueryAgainstLucene : +_datasource:sitecore 
+(+(+_path:11111111111111111111111111111111 +_latestversion:1) 
+mileagehwy:[1 TO 4mileagecity]) - Filter : 
3212 19:31:56 INFO  Results from web database :8818
If you need to enable full level debug of the ContentSearch API/Linq, enable this setting in Sitecore.ContentSearch.config and change the logging level for the SearchLogger to DEBUG:
ContentSearch.EnableSearchDebug = true
 <logger name="Sitecore.Diagnostics.Search" additivity="false">
      <level value="DEBUG" />
      <appender-ref ref="SearchLogFileAppender" />
    </logger>
It is important to understand the value of the SearchLog in context of a particular persona:
  • For developers, it is important to understand how the Linq code translates into native search queries passed onto the Search Provider (Lucene or Solr out of the box).
  • For the administrators, it is important to understand which search queries are performed on a particular server. This information can be used for further optimization, such as inclusion into the queryWarmup pipeline.

CrawlingLog

The CrawlingLog’s mission is to provide more insight on what’s going on during the indexing process.
Similar to other loggers, the Crawling Log is defined in the <log4net /> section of the web.config, with the logging level set to INFO by default:
    <logger name="Sitecore.Diagnostics.Crawling" additivity="false">
      <level value="INFO" />
      <appender-ref ref="CrawlingLogFileAppender" />
    </logger>
The appender for this log is defined in the same section, and by default is setup to write to a txt file under the data folder/logs:
    <appender name="CrawlingLogFileAppender" ... >
      <file value="$(dataFolder)/logs/Crawling.log.{date}.txt" />
      <appendToFile value="true" />
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%4t %d{ABSOLUTE} %-5p %m%n" />
      </layout>
    </appender>
Since the implementation is based on Log4Net, you can tweak the appender to log to a Windows Event Log or database, or any other location. Please refer to the Log4Net documentation for this.
Here is a sample output of the CrawlingLog. As you can see, upon the startup of the Sitecore application, the log will render the information on how each index is configured and initialized.
INFO  [Index=sitecore_core_index] Initializing IntervalAsynchronousUpdateStrategy with interval '00:01:00'
INFO  [Index=sitecore_core_index] Initializing LuceneDatabaseCrawler. DB:core / Root:/sitecore
INFO  [Index=sitecore_master_index] Initializing SynchronousStrategy.
INFO  [Index=sitecore_master_index] Initializing LuceneDatabaseCrawler. DB:master / Root:/sitecore
INFO  [Index=sitecore_web_index] Initializing OnPublishEndAsynchronousStrategy.
INFO  [Index=sitecore_web_index] Initializing LuceneDatabaseCrawler. DB:web / Root:/sitecore
INFO  [Index=custom_master] Initializing IntervalAsynchronousUpdateStrategy with interval '00:00:05'.
INFO  [Index=custom_master] Initializing LuceneDatabaseCrawler. DB:master /
                            Root:{D70CBEED-6DCF-483F-978F-6FC3C8049512}
INFO  [Index=custom_web] Initializing OnPublishEndAsynchronousStrategy.
INFO  [Index=custom_web] Initializing LuceneDatabaseCrawler. DB:web / 
                         Root:{D70CBEED-6DCF-483F-978F-6FC3C8049512}
INFO  [Index=custom_web] Creating primary and secondary directories
INFO  [Index=custom_web] Resolving directories from index property store for index 'custom_web'
INFO  [Index=custom_master] IntervalAsynchronousUpdateStrategy executing.
The CrawlingLog will also output every time an index update strategy is hit for a particular index, when a full rebuild is triggered on a particular index, etc.
Since the CrawlingLog has the INFO level by default, the output information will be limited.
If you require to see more indexing activity for troubleshooting purpose, you can change the logging level to DEBUG:
    <logger name="Sitecore.Diagnostics.Crawling" additivity="false">
      <level value="DEBUG" />
      <appender-ref ref="CrawlingLogFileAppender" />
    </logger>
This configuration change along with the VerboseLogger enabled will produce a very detailed, item level indexing log. For obvious reasons, this setting is meant to be only at very short times in any environment.
Here are some use cases of the CrawlingLog:
  1. One of my indexes is not getting updated.
  2. I have determined that some items are not being indexed, but I don’t know why.
  3. For some reason, full rebuild is triggered for my index. I need to understand why.
  4. I would like to explore indexing activity on a particular server.
  5. I am setting up a multi-server environment, which is heavily relying on ContentSearch. I need to test how my index is being updated across all servers.
Hope these two new facilities will be helping you out in configuring and monitoring your Sitecore 7 environment.

0 comments: