Overview
Index updates are generally processed incrementally to efficiently apply changes to existing data. However, under certain conditions, the system automatically performs a full refresh to maintain data consistency, accuracy, and optimal indexing performance.
A full refresh involves rebuilding the entire index from the source dataset rather than applying partial updates. This approach is preferred when incremental processing would be inefficient or insufficient, or could lead to inconsistencies.
Scenarios That Trigger a Full Refresh
1. Full Feed Submission
When a complete data feed is submitted, it is interpreted as a request to rebuild the index from the entire source dataset. As a result, the subsequent update is executed as a full refresh.
2. Large Delta Updates
If a delta update impacts a significant portion of the dataset (typically more than 75% of documents), the system switches to a full refresh. In such cases, rebuilding the index is generally more efficient than applying extensive incremental changes.
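The threshold rule above can be illustrated with a minimal sketch. It is not the system's actual implementation: the function name, the parameter names, and the handling of an empty index are assumptions made for illustration, and the 0.75 default simply mirrors the "typically more than 75%" figure, which may vary in practice.

```python
# Hypothetical sketch of the large-delta rule; names are illustrative only.
FULL_REFRESH_THRESHOLD = 0.75  # assumed from the "more than 75%" figure


def should_switch_to_full_refresh(
    changed_docs: int,
    total_docs: int,
    threshold: float = FULL_REFRESH_THRESHOLD,
) -> bool:
    """Return True when a delta touches more than `threshold` of all documents."""
    if total_docs <= 0:
        # Assumption: with no indexed documents there is nothing to patch
        # incrementally, so a rebuild is treated as the only sensible option.
        return True
    return changed_docs / total_docs > threshold


if __name__ == "__main__":
    # A delta touching 800 of 1,000 documents (80%) exceeds the threshold.
    print(should_switch_to_full_refresh(800, 1000))
    # A delta touching 100 of 1,000 documents (10%) stays incremental.
    print(should_switch_to_full_refresh(100, 1000))
```

The comparison is strict, so a delta touching exactly 75% of documents stays incremental in this sketch; whether the real system treats the boundary the same way is not stated here.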
3. Changes to Dynamic Categories
Modifications to dynamic category definitions require re-evaluating category assignments across the entire catalog. To ensure accuracy, the system performs a full refresh during the next update cycle.
4. Updates to Attribute Configuration or Catalog Settings
Changes at the catalog configuration level, such as attribute definitions or indexing settings, can influence how documents are processed and indexed. Because these changes may affect a large portion or the entirety of the dataset, a full refresh is triggered.
5. AI/ML Signal Updates Impacting Indexing
Updates driven by AI/ML signals (e.g., relevance tuning or behavioral signals) may require reprocessing the index to apply changes consistently. When feasible, these updates are aligned with scheduled full feed refreshes to avoid unnecessary rebuilds.
6. Disaster Recovery Events
If an incident requires Bloomreach to restore or rebuild indexed data during disaster recovery, the system runs a full refresh to ensure data integrity and consistency.
Summary
A full refresh is triggered when incremental updates are no longer efficient or sufficient to maintain index integrity. These situations typically involve large-scale changes, configuration updates, or system-level events that require rebuilding the index from scratch.
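As a rough illustration, the six scenarios can be modeled as a single decision function. This is a sketch under stated assumptions, not the platform's API: `RefreshTrigger`, `UpdateContext`, `full_refresh_reasons`, and every field name are hypothetical, and the 0.75 default reflects only the typical delta threshold mentioned earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class RefreshTrigger(Enum):
    """Hypothetical labels for the six scenarios described in this document."""
    FULL_FEED = "full feed submission"
    LARGE_DELTA = "large delta update"
    DYNAMIC_CATEGORY_CHANGE = "dynamic category change"
    CONFIG_CHANGE = "attribute or catalog configuration change"
    SIGNAL_UPDATE = "AI/ML signal update affecting indexing"
    DISASTER_RECOVERY = "disaster recovery event"


@dataclass
class UpdateContext:
    """Hypothetical snapshot of what is known about a pending index update."""
    is_full_feed: bool = False
    changed_docs: int = 0
    total_docs: int = 0
    dynamic_categories_changed: bool = False
    catalog_config_changed: bool = False
    signals_require_reindex: bool = False
    disaster_recovery: bool = False


def full_refresh_reasons(
    ctx: UpdateContext, delta_threshold: float = 0.75
) -> List[RefreshTrigger]:
    """Return every reason the update should run as a full refresh.

    An empty list means the update can be applied incrementally.
    """
    reasons: List[RefreshTrigger] = []
    if ctx.is_full_feed:
        reasons.append(RefreshTrigger.FULL_FEED)
    if ctx.total_docs > 0 and ctx.changed_docs / ctx.total_docs > delta_threshold:
        reasons.append(RefreshTrigger.LARGE_DELTA)
    if ctx.dynamic_categories_changed:
        reasons.append(RefreshTrigger.DYNAMIC_CATEGORY_CHANGE)
    if ctx.catalog_config_changed:
        reasons.append(RefreshTrigger.CONFIG_CHANGE)
    if ctx.signals_require_reindex:
        reasons.append(RefreshTrigger.SIGNAL_UPDATE)
    if ctx.disaster_recovery:
        reasons.append(RefreshTrigger.DISASTER_RECOVERY)
    return reasons


if __name__ == "__main__":
    ctx = UpdateContext(changed_docs=900, total_docs=1000)
    for reason in full_refresh_reasons(ctx):
        print(reason.value)
```

Returning a list of reasons rather than a single boolean keeps the sketch useful for logging, since more than one trigger can apply to the same update cycle.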