Calculating the average lead time: event_x and event_y
This article explains how to calculate the average time difference between two customer events, such as event_x and event_y. This metric, often called lead time, helps you analyze customer behavior, and you can use the result later in segmentation.
Use this approach when both events share a common identifier. In this article, that identifier is ID_event_attribute. That identifier lets you match the correct event_x with the correct event_y, even if the events do not appear in sequential order.
The main challenge is that the events may be mixed in the event stream, so you need to pair them by identifier rather than by position in the timeline.
Figure 1 below shows an example timeline:
x1 → y1 → x2 → x3 → x4 → y2 → y3 → y5
Figure 1: Match event_x and event_y by ID_event_attribute, not by their order in the event timeline.
The calculation logic
Start by identifying the pairs that share the same ID_event_attribute. In this example, that means the calculation works with these matched pairs:
(x1, y1), (x2, y2), (x3, y3)
Events such as x4 and y5 are excluded from the result because no event of the other type shares their identifier. The filters in the setup below restrict both averages to matched pairs.
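The pairing rule can be sketched in Python. This is an illustration only, not platform code: the timestamps are made-up values in seconds, the tuple layout is hypothetical, and the second field stands in for ID_event_attribute.

```python
# Timeline from Figure 1 as (event_type, identifier, timestamp_in_seconds).
# Timestamps are illustrative; the identifier stands in for ID_event_attribute.
TIMELINE = [
    ("event_x", "1", 100),
    ("event_y", "1", 400),
    ("event_x", "2", 500),
    ("event_x", "3", 600),
    ("event_x", "4", 700),   # x4: no event_y with identifier "4"
    ("event_y", "2", 900),
    ("event_y", "3", 1400),
    ("event_y", "5", 1600),  # y5: no event_x with identifier "5"
]


def match_pairs(events):
    """Return {identifier: (x_timestamp, y_timestamp)} for identifiers present on both event types."""
    x_by_id, y_by_id = {}, {}
    for event_type, identifier, timestamp in events:
        target = x_by_id if event_type == "event_x" else y_by_id
        # Keep the first occurrence per identifier (the list is in timeline order).
        target.setdefault(identifier, timestamp)
    return {i: (x_by_id[i], y_by_id[i]) for i in x_by_id if i in y_by_id}


pairs = match_pairs(TIMELINE)
print(pairs)
```

Position in the timeline plays no role in the match: x2 is paired with y2 even though x3 and x4 occur between them.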
Then calculate the average of the individual time differences:
((y1 - x1) + (y2 - x2) + (y3 - x3)) / 3
You can also write the same result like this:
(y1 + y2 + y3) / 3 - (x1 + x2 + x3) / 3
In Bloomreach terms, this means:
avg(event_y.timestamp) - avg(event_x.timestamp)
This works only if both averages use the same set of matched pairs.
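A short numeric check makes the condition concrete. The sketch below uses made-up timestamps in seconds and hypothetical dictionary names; it compares the average of the differences with the difference of the averages, first over matched pairs and then over all events.

```python
from statistics import mean

# Illustrative timestamps in seconds, keyed by identifier (hypothetical data).
x_times = {"1": 100, "2": 500, "3": 600, "4": 700}    # "4" has no event_y
y_times = {"1": 400, "2": 900, "3": 1400, "5": 1600}  # "5" has no event_x

# Identifiers that appear on both event types.
matched = sorted(x_times.keys() & y_times.keys())

# Average of the individual differences over matched pairs.
avg_of_differences = mean(y_times[i] - x_times[i] for i in matched)

# Difference of the two averages over the same matched pairs.
difference_of_averages = mean(y_times[i] for i in matched) - mean(x_times[i] for i in matched)

# Difference of the two averages when unmatched events are left in.
unfiltered = mean(y_times.values()) - mean(x_times.values())

print(avg_of_differences, difference_of_averages, unfiltered)
```

With this data the two matched calculations agree, while the unfiltered version gives a different value because x4 and y5 shift the two averages independently.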
Implementing the solution with expressions and aggregates
To achieve this in a platform that uses event expressions and aggregates, you need to create:
Two event expressions to identify valid pairs
Two aggregates to calculate the average timestamps for matched events
One customer expression to subtract the two averages
Step 1: Create linking event expressions
Create one event expression for each event type. Each expression returns a list of identifiers from the opposite event type. This lets you check whether the current event has a valid pair.
For event_y, create an event expression such as Event_y_Has_Event_x_Pair with:
distinct_values(event_x.ID_event_attribute)
Figure 2: The event_y expression collects distinct ID_event_attribute values from event_x.
For event_x, create a similar event expression such as Event_x_Has_Event_y_Pair with:
distinct_values(event_y.ID_event_attribute)
Figure 3: The event_x expression collects distinct ID_event_attribute values from event_y.
Step 2: Calculate paired averages with aggregates
Now create two aggregates: one for the average event_y timestamp and one for the average event_x timestamp.
Calculate the average event_y timestamp
Create an aggregate:
avg(event_y.timestamp)
Then add a filter so the aggregate keeps only event_y records whose identifier exists in the expression that points to matching event_x events:
WHERE Event_y_Has_Event_x_Pair any item equals ID_event_attribute
If the same identifier can appear more than once, group event_y events by ID_event_attribute and aggregate only the first event in each group. This prevents duplicate counting.
Figure 4: Aggregate setup for event_y
Calculate the average event_x timestamp
Create the matching aggregate:
avg(event_x.timestamp)
Then add the same filter logic in reverse so the aggregate keeps only event_x records whose identifier exists in the expression that points to matching event_y events:
WHERE Event_x_Has_Event_y_Pair any item equals ID_event_attribute
If the same identifier can appear more than once, group event_x events by ID_event_attribute and aggregate only the first event in each group. This prevents duplicate counting.
Figure 5: Aggregate setup for event_x
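To see what Steps 1 and 2 compute together, the following Python sketch imitates them for a single customer. It is not platform code: the function names, the tuple layout, and the timestamps are hypothetical, and it assumes that "the first event in each group" means the earliest event by timestamp.

```python
from statistics import mean

# Hypothetical events for one customer: (event_type, identifier, timestamp_in_seconds).
# The identifier stands in for ID_event_attribute; identifier "2" repeats on event_y.
EVENTS = [
    ("event_x", "1", 100),
    ("event_y", "1", 400),
    ("event_x", "2", 500),
    ("event_x", "3", 600),
    ("event_x", "4", 700),
    ("event_y", "2", 900),
    ("event_y", "2", 950),   # repeated identifier, must not be counted twice
    ("event_y", "3", 1400),
    ("event_y", "5", 1600),
]


def distinct_ids(events, event_type):
    """Step 1: the distinct identifiers seen on one event type."""
    return {identifier for etype, identifier, _ in events if etype == event_type}


def paired_average(events, event_type, opposite_type):
    """Step 2: average timestamp of matched events, first event per identifier only."""
    opposite_ids = distinct_ids(events, opposite_type)
    first_per_id = {}
    # Sort by timestamp so setdefault keeps the earliest event in each group.
    for etype, identifier, timestamp in sorted(events, key=lambda e: e[2]):
        if etype == event_type and identifier in opposite_ids:
            first_per_id.setdefault(identifier, timestamp)
    return mean(first_per_id.values()) if first_per_id else None


avg_y = paired_average(EVENTS, "event_y", "event_x")
avg_x = paired_average(EVENTS, "event_x", "event_y")
print(avg_y, avg_x)
```

Without the grouping step, the repeated event_y with identifier "2" would add a fourth value to the event_y average while the event_x average still had three, and the two averages would no longer cover the same pairs.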
Step 3: Create the final customer expression
Finally, create a Customer Expression that subtracts the average event_x timestamp from the average event_y timestamp:
avg(event_y.timestamp) - avg(event_x.timestamp)
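Figure 6: Final customer expression for the lead time calculation (in seconds)
To output the result in days rather than seconds, divide the difference by 86,400, the number of seconds in a day:
(avg(event_y.timestamp) - avg(event_x.timestamp)) / 86400
Figure 7: Final customer expression for the lead time calculation (in days)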
This final expression gives you the average number of days between event_x and event_y for each customer.
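As a rough illustration of the final step, the Python sketch below subtracts two averaged timestamps and converts the result from seconds to days. The function name is hypothetical, and returning None for a customer without matched pairs is an assumption of this sketch, not documented platform behavior.

```python
SECONDS_PER_DAY = 86_400  # 24 hours * 60 minutes * 60 seconds


def lead_time_days(avg_y_seconds, avg_x_seconds):
    """Subtract the averaged timestamps and convert the difference from seconds to days.

    Returns None when either average is missing. This is an assumption of the
    sketch for customers who have no matched pairs.
    """
    if avg_y_seconds is None or avg_x_seconds is None:
        return None
    return (avg_y_seconds - avg_x_seconds) / SECONDS_PER_DAY


# Example: averages that are 259,200 seconds apart correspond to 3 days.
print(lead_time_days(1_700_259_200, 1_700_000_000))
```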
This approach gives you a reusable lead-time metric that you can use directly in customer segmentation and analysis.