Skip to content Skip to sidebar Skip to footer

Postgresql: Identifying Return Visitors Based On Date - Joins Or Window Functions?

I am looking to identify return visitors to a website within a 7 day window. A data sample and attempt at solving are included below: visitor_id(integer) session_id(integer) event_

Solution 1:

First, I remove event_sequence with a DISTINCT (assuming that all events are on the same day), then I use the window function lead and compare with the date of the next visit:

SELECT visitor_id,
       session_id,
       COALESCE(
          lead(d_date) OVER w - d_date,
          10
       ) < 7 AS revisited,
       d_date
FROM (SELECT DISTINCT visitor_id,
                      session_id,
                      d_date
      FROM "table"
     ) t
WINDOW w AS (PARTITION BY visitor_id
             ORDER BY d_date
             ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
            )
ORDER BY visitor_id, session_id;

┌────────────┬────────────┬───────────┬────────────┐
│ visitor_id │ session_id │ revisited │   d_date   │
├────────────┼────────────┼───────────┼────────────┤
│          11 │ t         │ 2017-01-01 │
│          12 │ f         │ 2017-01-05 │
│          13 │ t         │ 2017-01-20 │
│          14 │ f         │ 2017-01-25 │
│          21 │ t         │ 2017-01-02 │
│          22 │ f         │ 2017-01-02 │
│          23 │ f         │ 2017-01-18 │
└────────────┴────────────┴───────────┴────────────┘
(7 rows)

Post a Comment for "Postgresql: Identifying Return Visitors Based On Date - Joins Or Window Functions?"