
How Indexing Works

To illustrate the indexing process, consider what happens when a new document is added to an indexed location (a location that is configured to be indexed) on an NTFS volume. The following high-level description explains the steps that take place when new file system content is indexed:

  1. NTFS records the change to the file system in the NTFS change journal (also called the USN journal). The main indexer process (SearchIndexer.exe) listens for these notifications and is notified of the change. To view whether a file is flagged for content indexing (the FANCI attribute), open the file's properties in Windows Explorer and click Advanced.
  2. The indexer process starts the Search Filter Host process (SearchFilterHost.exe) if it isn't currently running, and the system protocol handler loads the file protocol handler and Protocol Host.
  3. The file's URL is sent to the gatherer's queue. When the indexer retrieves the URL from the queue, it picks the file protocol handler to access the item (based on the file: scheme in the URL). The file protocol handler accesses the system properties (for example, name and size), calls the property handler if one is available, and then reads the content stream from the file system and sends it to the Search Filter Host.
  4. In the Search Filter Host, the appropriate IFilter is loaded and the filter returns text and property chunks to the indexer.
  5. Back in the indexer process, the chunks are tokenized using the appropriate language wordbreaker (each chunk has a locale ID), and the text is sent into the indexing pipeline.
  6. In the pipeline, the indexing plug-in sees the data and creates the in-memory word lists (an index that maps each word to item IDs and occurrence counts). Periodically, these word lists are written to shadow indexes, and the shadow indexes are then combined into the master index through a master merge.
  7. Another plug-in reads the property values and stores them in the property cache.
  8. If you have a Tablet PC, you may have activated another plug-in that looks for text you write and uses it to help augment handwriting recognition.
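
Steps 5 through 7 above can be pictured with a small model. The following Python sketch is an illustration only, not the Windows Search implementation: the class name ToyIndex, the flush_threshold parameter, and the regular-expression wordbreaker are all hypothetical, and real wordbreakers, shadow indexes, and master merges are far more involved. It shows text chunks being broken into words, counted into an in-memory word list, flushed to shadow indexes, and finally merged into a master index, while property values go to a separate property cache.

```python
import re
from collections import Counter, defaultdict


def break_words(text):
    """Hypothetical stand-in for a language wordbreaker."""
    return [word.lower() for word in re.findall(r"\w+", text)]


# Assumption: one breaker per locale ID (1033 is the en-US LCID).
WORDBREAKERS = {1033: break_words}


class ToyIndex:
    """Toy model of word lists, shadow indexes, and a master index."""

    def __init__(self, flush_threshold=2):
        self.word_list = defaultdict(Counter)  # word -> {item_id: count}
        self.shadow_indexes = []               # flushed word lists
        self.master = {}                       # result of the last merge
        self.property_cache = {}               # item_id -> properties
        self.flush_threshold = flush_threshold
        self._pending_items = 0

    def add_item(self, item_id, chunks, properties):
        """Index (locale_id, text) chunks and cache the item's properties."""
        for locale_id, text in chunks:
            breaker = WORDBREAKERS.get(locale_id, break_words)
            for word in breaker(text):
                self.word_list[word][item_id] += 1
        self.property_cache[item_id] = dict(properties)
        self._pending_items += 1
        if self._pending_items >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the in-memory word list out as a new shadow index."""
        if self.word_list:
            self.shadow_indexes.append(
                {word: dict(counts) for word, counts in self.word_list.items()}
            )
        self.word_list = defaultdict(Counter)
        self._pending_items = 0

    def master_merge(self):
        """Combine the master index and all shadow indexes into one."""
        self.flush()
        merged = defaultdict(Counter)
        for source in [self.master, *self.shadow_indexes]:
            for word, postings in source.items():
                merged[word].update(postings)
        self.master = {word: dict(counts) for word, counts in merged.items()}
        self.shadow_indexes = []

    def lookup(self, word):
        """Return {item_id: count} for a word across all index tiers."""
        result = Counter()
        for source in [self.master, *self.shadow_indexes, self.word_list]:
            result.update(source.get(word.lower(), {}))
        return dict(result)
```

In this model, a lookup has to consult the master index, every shadow index, and the in-memory word list, which suggests why merging many small indexes into one master index is worthwhile.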

Note In Windows 7, both NTFS and FAT32 volumes support notification-based (push-type) indexing, so ongoing crawling (pull-type indexing) is not required. For NTFS volumes, the NTFS change journal enables notification-based indexing. For FAT volumes, an initial crawl is performed when the location is added, and a recrawl is done whenever the location is disconnected (for example, when using an external universal serial bus (USB) drive formatted with FAT) or when the system is rebooted. Once the crawl is complete, however, the ReadDirectoryChangesW application programming interface (API) can be used to listen for any updates.
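
The crawl-and-recrawl behavior described for FAT volumes can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the function names crawl and diff_snapshots are hypothetical, and comparing two crawl snapshots is only a stand-in for what a recrawl has to discover. It does not call ReadDirectoryChangesW, which delivers change notifications as they happen rather than requiring a comparison.

```python
import os
from pathlib import Path


def crawl(root):
    """Walk a location and record (modified time, size) for each file URL."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = (Path(dirpath) / name).resolve()
            stat = path.stat()
            # Items are keyed by a file: URL, as in the gatherer's queue.
            snapshot[path.as_uri()] = (stat.st_mtime_ns, stat.st_size)
    return snapshot


def diff_snapshots(old, new):
    """Return the URLs that were added, removed, or changed between crawls."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        url for url in new.keys() & old.keys() if new[url] != old[url]
    )
    return added, removed, changed
```

A caller would run crawl once when the location is added, run it again after the drive is reconnected or the system restarts, and pass both snapshots to diff_snapshots to find the items that need to be indexed again.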

