Time Windows and Leakage Safety

GraphReduce is designed for point-in-time-correct feature generation.

[Figure: GraphReduce leakage-safe time windows]

Core timing controls

Graph and node parameters define what historical data is visible at transform time:

  • cut_date: reference time for feature/label splitting.
  • compute_period_val + compute_period_unit: lookback window for feature computation.
  • label_period_val + label_period_unit: forward window for label generation.
  • date_key on nodes: required for time-based filtering on that node.
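How these controls combine into concrete window boundaries can be sketched in plain Python. This is a minimal illustration, not GraphReduce's API: the `TimeConfig` class, the `UNIT_MINUTES` table, and the string unit names are hypothetical stand-ins that only reuse the parameter names listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical unit table for this sketch; not GraphReduce's own unit type.
UNIT_MINUTES = {"minute": 1, "hour": 60, "day": 1440, "week": 10080}


@dataclass(frozen=True)
class TimeConfig:
    """Illustrative container mirroring the timing controls listed above."""
    cut_date: datetime
    compute_period_val: int
    compute_period_unit: str
    label_period_val: int
    label_period_unit: str

    def feature_window(self):
        """Return (start, end) of the lookback window that ends at cut_date."""
        minutes = self.compute_period_val * UNIT_MINUTES[self.compute_period_unit]
        return self.cut_date - timedelta(minutes=minutes), self.cut_date

    def label_window(self):
        """Return (start, end) of the forward window that starts at cut_date."""
        minutes = self.label_period_val * UNIT_MINUTES[self.label_period_unit]
        return self.cut_date, self.cut_date + timedelta(minutes=minutes)


config = TimeConfig(
    cut_date=datetime(2023, 6, 30),
    compute_period_val=365, compute_period_unit="day",
    label_period_val=30, label_period_unit="day",
)
feature_start, feature_end = config.feature_window()
label_start, label_end = config.label_window()
```

With these example values the feature window runs from 2022-06-30 to 2023-06-30 and the label window from 2023-06-30 to 2023-07-30.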

Base-class definitions (window boundaries)

In GraphReduceNode, the base methods enforce these windows:

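  • prep_for_features: keeps rows where date_key < cut_date and date_key > cut_date - compute_period_minutes()
  • prep_for_labels: keeps rows where date_key > cut_date and date_key < cut_date + label_period_minutes()

Both bounds are strict, so a row stamped exactly at cut_date falls in neither window.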
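The two filters can be mimicked on a list of records to see which rows each one keeps. This is a standalone sketch of the inequalities stated above, not the GraphReduceNode source: `filter_for_features`, `filter_for_labels`, and the `ts` column are hypothetical names, and the period lengths are passed in directly as minutes.

```python
from datetime import datetime, timedelta


def filter_for_features(rows, date_key, cut_date, compute_minutes):
    """Keep rows strictly inside (cut_date - lookback, cut_date)."""
    start = cut_date - timedelta(minutes=compute_minutes)
    return [r for r in rows if start < r[date_key] < cut_date]


def filter_for_labels(rows, date_key, cut_date, label_minutes):
    """Keep rows strictly inside (cut_date, cut_date + horizon)."""
    end = cut_date + timedelta(minutes=label_minutes)
    return [r for r in rows if cut_date < r[date_key] < end]


cut = datetime(2023, 6, 30)
events = [
    {"id": 1, "ts": datetime(2023, 6, 1)},   # inside the 90-day lookback
    {"id": 2, "ts": cut},                    # exactly at cut_date
    {"id": 3, "ts": datetime(2023, 7, 10)},  # inside the 30-day label horizon
    {"id": 4, "ts": datetime(2023, 9, 1)},   # beyond the label horizon
    {"id": 5, "ts": datetime(2022, 1, 1)},   # older than the lookback
]

feature_rows = filter_for_features(events, "ts", cut, compute_minutes=90 * 1440)
label_rows = filter_for_labels(events, "ts", cut, label_minutes=30 * 1440)
feature_ids = [r["id"] for r in feature_rows]
label_ids = [r["id"] for r in label_rows]
```

Here only event 1 reaches the feature set and only event 3 reaches the label set; events 2, 4, and 5 are dropped by both filters.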

Why this matters

Without strict time windows, training features can include future information and inflate offline metrics. GraphReduce pushes shared time configuration through the graph so reductions and joins remain leakage-safe.

Practical checklist

  • Set a valid date_key on all time-varying nodes.
  • Confirm feature windows end before cut_date; the base-class filter is strict (date_key < cut_date).
  • Compute labels only in the label horizon after cut_date.
  • Validate output grain and time boundaries in a small sample before full runs.
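The last two checks in this list can be automated on a small sample before a full run. The helpers below are a hypothetical sketch rather than part of GraphReduce: `check_time_boundaries` and `check_grain` are illustrative names, and they assume the sample's timestamps and entity keys have already been pulled out into plain Python lists.

```python
from datetime import datetime


def check_time_boundaries(feature_dates, label_dates, cut_date):
    """Return a list of problems found; an empty list means the sample passed."""
    problems = []
    late = [d for d in feature_dates if d >= cut_date]
    if late:
        problems.append(f"{len(late)} feature row(s) at or after cut_date")
    early = [d for d in label_dates if d <= cut_date]
    if early:
        problems.append(f"{len(early)} label row(s) at or before cut_date")
    return problems


def check_grain(keys):
    """Return the keys that appear more than once in the output sample."""
    seen, dupes = set(), set()
    for key in keys:
        if key in seen:
            dupes.add(key)
        else:
            seen.add(key)
    return sorted(dupes)


cut = datetime(2023, 6, 30)
clean = check_time_boundaries([datetime(2023, 6, 1)], [datetime(2023, 7, 10)], cut)
leaky = check_time_boundaries([datetime(2023, 7, 1)], [datetime(2023, 7, 10)], cut)
duplicate_keys = check_grain([101, 102, 101])
```

An empty result from both helpers suggests the sample respects the cut and has one row per key. A non-empty result names the boundary that was crossed or the keys that repeat.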

See Tutorial: temporal setup for a step-by-step example.