Returns a window definition that aggregates events into session windows.
Events and windows under different grouping keys are treated
independently.
The functioning of session windows is easiest to explain in terms of the
event interval: the range from the event's timestamp to
timestamp + session timeout. Initially, an event causes a new session window
to be created, covering exactly the event interval. A following event under
the same key belongs to this window iff its interval overlaps it. The
window is extended to cover the entire interval of the new event. The
event may happen to belong to two existing windows if its interval
bridges the gap between them; in that case they are combined into one.
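The merging rule described above can be sketched in plain Java. This is only an illustration for a single grouping key, not the library's implementation: the class `SessionWindowSketch`, the record `Window`, and the method `add` are hypothetical names, and treating intervals that merely touch as overlapping is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class SessionWindowSketch {
    // Hypothetical window type; boundaries are treated as inclusive (assumption).
    record Window(long start, long end) {}

    // Windows for one grouping key, kept mutually non-overlapping.
    private final List<Window> windows = new ArrayList<>();

    // Adds an event and returns the window it ends up in.
    Window add(long timestamp, long sessionTimeout) {
        // The event interval: from the timestamp to timestamp + session timeout.
        long start = timestamp;
        long end = timestamp + sessionTimeout;
        List<Window> kept = new ArrayList<>();
        for (Window w : windows) {
            boolean overlaps = w.start() <= end && start <= w.end();
            if (overlaps) {
                // Extend the result to cover both the window and the event interval.
                start = Math.min(start, w.start());
                end = Math.max(end, w.end());
            } else {
                kept.add(w);
            }
        }
        Window merged = new Window(start, end);
        kept.add(merged);
        windows.clear();
        windows.addAll(kept);
        return merged;
    }

    List<Window> windows() {
        return List.copyOf(windows);
    }

    public static void main(String[] args) {
        SessionWindowSketch sketch = new SessionWindowSketch();
        sketch.add(0, 10);   // creates a window from 0 to 10
        sketch.add(15, 10);  // no overlap: creates a second window from 15 to 25
        sketch.add(8, 10);   // interval 8..18 bridges the gap, leaving one window
        System.out.println(sketch.windows());
    }
}
```

In this sketch the third event arrives out of order and its interval overlaps both existing windows, so they are combined into a single window from 0 to 25.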
Behavior when changing session timeout on job update
Changing the session timeout in an updated pipeline is allowed. Windows
are stored in the snapshot with the end time equal to the time of the
latest event + session timeout. A new event after the update will be
merged into the old window using the new timeout. As a result, windows
after the update will have varying timeouts until all windows from
before the update are emitted.
For example: say E(n) is an event with timestamp n and W(m, n) is a
window with startTime=m and endTime=n. The session timeout is 10. When we
receive E(50), we store it in a window W(50, 60). Then the job is updated
and the session timeout changes to 20. If we then receive E(45), we handle
it as a merge of the restored W(50, 60) with W(45, 65), which is created
from the new event and the new timeout. The result is W(45, 65). Thus, the
actual session timeout in this window will be 15: the window ends at 65,
while its latest event has timestamp 50.
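The arithmetic of this example can be checked with a small Java sketch. The names `TimeoutChangeExample`, `Window`, `windowFor`, and `merge` are hypothetical and exist only for illustration; the sketch models a window as a plain pair of start and end times and does not use the library's API.

```java
public class TimeoutChangeExample {
    // Hypothetical stand-in for W(m, n): startTime=m, endTime=n.
    record Window(long start, long end) {}

    // Window created by a single event: from its timestamp to timestamp + session timeout.
    static Window windowFor(long timestamp, long sessionTimeout) {
        return new Window(timestamp, timestamp + sessionTimeout);
    }

    // Merging two overlapping windows yields the smallest window covering both.
    static Window merge(Window a, Window b) {
        return new Window(Math.min(a.start(), b.start()), Math.max(a.end(), b.end()));
    }

    public static void main(String[] args) {
        Window restored = windowFor(50, 10); // W(50, 60), stored before the update
        Window fresh = windowFor(45, 20);    // W(45, 65), built with the new timeout
        Window result = merge(restored, fresh);

        long latestEventTimestamp = 50;      // E(50) is the latest event in the window
        long effectiveTimeout = result.end() - latestEventTimestamp;
        System.out.println(result.start() + ".." + result.end()
                + ", effective timeout " + effectiveTimeout);
    }
}
```

The merged window runs from 45 to 65, and the distance from the latest event (50) to the window end is 15, which matches neither the old timeout of 10 nor the new timeout of 20.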