🏝️ Gap and Island SQL – Advanced Query Pattern Explained

Learn how to solve Gap and Island SQL problems using window functions like ROW_NUMBER, LAG, and LEAD — an advanced SQL pattern for analyzing consecutive dates, detecting streaks, and answering popular interview questions with clear examples and practice problems.



🌟 Why Learn the Gap and Island SQL Pattern?

Learning how to solve the Gap and Island SQL problem is essential for anyone preparing for SQL interview questions or working with time-based data. These challenges test advanced SQL queries beyond simple joins and aggregations to analyze consecutive dates, streaks, and gaps between events.

In real-world analytics, this skill helps you tackle complex SQL queries such as:

  • Detect consecutive dates or weeks of activity (islands), such as customers ordering every week for several months.
  • Measure inactivity periods, like the longest gap a supplier took between fulfilling orders.
  • Work with irregular or missing data to find meaningful streaks and breaks.
  • Use SQL window functions such as ROW_NUMBER(), LAG(), and LEAD() to uncover hidden streak analysis patterns.

By mastering these SQL gaps and islands techniques, you’ll gain the ability to analyze churn, reliability, and consistency — key insights in data analytics, performance tracking, and operations.




❓ What Are Gap and Island SQL Problems?

The Gap and Island SQL problem refers to a family of analytical queries that uncover consecutive dates, streaks of activity, or gaps between events — frequent SQL interview questions and real-world analytics challenges. These problems are a favorite in interviews because they test how well you can use SQL window functions to work with sequential data. Let’s look at both types:


SQL Gap Examples – Measuring Breaks Between Events

A Gap problem asks you to measure the break or inactivity period between consecutive rows, which is classic advanced SQL work that often appears as complex SQL queries or in technical interviews. Using LAG() and LEAD(), you compare the current row’s date or sequence number with the previous one to calculate how much time has passed — a core concept in SQL streak analysis.

  • Customer Breaks in Ordering: Xorthax suspects some of his loyal customers disappear for long stretches. For each customer, find the longest gap between two consecutive orders.
  • Supplier Delays: The merchants’ guild is worried about supplier reliability. Identify the longest gap between two shipments from each supplier.
  • Inventory Restocking Lapses: Xorthax wants to know how long shelves stay empty. For each product, calculate the longest gap with no recorded restock events.

SQL Island Examples – Finding Consecutive Streaks

An Island problem flips the idea: instead of measuring breaks, you look for continuous streaks using a slightly different Gap and Island SQL pattern (ROW_NUMBER with a date or sequence grain).

  • Customer Streaks: Xorthax wants to celebrate his most consistent shoppers. For each customer, find the longest streak of consecutive months with at least one order.
  • Category Sales Runs: On Zlacksys 19, Xorthax wonders which product categories sell steadily. Find the longest streak of consecutive weeks where each category sold 10 or more completed orders.
  • Supplier Reliability Streaks: The intergalactic merchants’ guild honors reliable suppliers. For each supplier, identify the longest streak of consecutive days where they fulfilled at least one order.






🛠️ Core Window Function Techniques for Advanced Gap and Island Queries

We’ll walk through both a Gap and an Island problem, showing how window functions like ROW_NUMBER, LAG, and LEAD power advanced SQL queries step by step.

Gap & Island: Step-by-StepA practical 4-step workflow using ROW_NUMBER, LAG, and LEAD.

  1. Create a time-grained series (day/week/month) or group by the relevant grain.
    • Gaps: Normalize to a date grain (::date, DATE_TRUNC(...)) and de-dupe if needed.
    • Islands: Choose the streak grain (day/week/month) and ensure one row per period per entity.
  2. Use a window ordered by date, partitioned by entity.
    • Gaps: LAG(date) to get the previous row’s date.
    • Islands: ROW_NUMBER() to assign a running index.
  3. Transform the Data: Gap vs. Island Logic
    • Gaps: Compute the gap gap_days = date - prev_date.
    • Islands: Build a group key like grp = date - rn * interval '1 day' (or week/month).
  4. Summarize:
    • Gaps: Rank gaps and pick the largest (e.g. WHERE rn = 1).
    • Islands: Aggregate each group (MIN/MAX/COUNT), then rank to choose the longest streak.

Let's see what that looks like for a gap problem!

"Xorthax suspects some of his loyal customers disappear for long stretches. For each customer, find the longest gap between two consecutive orders."


  1. Create a time-grained series or group by the relevant grain.

    WITH ordered AS (
      SELECT DISTINCT
        customerid,
        orderdate::date AS orderdate
      FROM orders
    ),
    

    If we were to select * from ordered we would get results that look like:

    CustomerIDOrderDate
    12245-01-02
    12245-01-03
    12245-01-05
    22245-01-20
    22245-01-15
  2. Use LAG() partitioned by entity and ordered by date.

    sequenced AS (
    SELECT
      customerid,
      orderdate,
      LAG(orderdate) OVER (
        PARTITION BY customerid
        ORDER BY orderdate
      ) AS prevdate
    FROM ordered
    ),
    

    The output from that CTE would look like this:

    CustomerIDOrderDatePrevDate
    12245-01-02NULL
    12245-01-032245-01-02
    12245-01-052245-01-03
    22245-01-15NULL
    22245-01-202245-01-15
  3. Compute the gap.

    gaps AS (
      SELECT
        customerid,
        (orderdate - prevdate) AS gapdays,
        prevdate AS gapstart,
        orderdate AS gapend
      FROM sequenced
      WHERE prev_date IS NOT NULL
    ),
    

    The output from "gaps" could look like:

    CustomerIDGapDaysGapStartGapEnd
    112245-01-022245-01-03
    122245-01-032245-01-05
    252245-01-152245-01-20
    372245-02-012245-02-08
    332245-02-082245-02-11
  4. Pick the desired gap per entity (in this case longest gap).

ranked AS (
  SELECT g.*, ROW_NUMBER() OVER (
    PARTITION BY customerid ORDER BY gapdays DESC, gapend DESC
  ) AS rn
  FROM gaps g
)

SELECT customerid, gapdays AS longestgapdays, gapstart, gapend
FROM ranked
WHERE rn = 1;

This is what the final output would look like:

CustomerID LongestGapDays GapStart GapEnd
1 2 2245-01-03 2245-01-05
2 5 2245-01-15 2245-01-20
3 7 2245-02-01 2245-02-08

Click here for the complete gap query
WITH monthly AS (
  SELECT
    o.customerid,
    DATE_TRUNC('month', o.orderdate)::date AS MonthStart
  FROM orders o
  GROUP BY o.customerid, DATE_TRUNC('month', o.orderdate)::date
),
Numbered AS (
   SELECT
       Category,
       MonthStart,
       ROW_NUMBER() OVER (PARTITION BY Category ORDER BY MonthStart) AS rn
   FROM monthly
),
grouped AS (
  SELECT
    customerid,
    MonthStart,
    MonthStart - (rn * INTERVAL '1 month') AS grp
  FROM numbered
),
streaks AS (
  SELECT
    customerid,
    MIN(MonthStart) AS StartMonth,
    MAX(MonthStart) AS EndMonth,
    COUNT(*)         AS MonthCount
  FROM grouped
  GROUP BY customerid, grp
),
ranked AS (
  SELECT
    s.*,
    ROW_NUMBER() OVER (
      PARTITION BY customerid
      ORDER BY monthcount DESC, endmonth DESC
    ) AS rnk
  FROM streaks s
)
SELECT
  customerid,
  startmonth,
  endmonth,
  monthcount AS longest_streak_months
FROM ranked
WHERE rnk = 1
ORDER BY customerid;



Now let's walk through an Island problem ...

"Xorthax wants to celebrate his most consistent shoppers. For each customer, find the longest streak of consecutive months in which they placed at least one order."

  1. Create a time-grained series or group by the relevant grain.
    WITH monthly AS (
      SELECT
        o.customerid,
        DATE_TRUNC('month', o.orderdate)::date AS MonthStart
      FROM orders o
      GROUP BY o.customerid, DATE_TRUNC('month', o.orderdate)::date
    ),
    

    Here is what the output of step 1 would look like:

    CustomerID MonthStart
    1 2245-01-01
    1 2245-02-01
    1 2245-03-01
    2 2245-01-01
    2 2245-03-01
  2. Use a window ordered by date, partitioned by entity.

    Numbered AS (
       SELECT
           Category,
           MonthStart,
           ROW_NUMBER() OVER (PARTITION BY Category ORDER BY MonthStart) AS rn
       FROM monthly
    ),
    

    The output from the second CTE would look like:

    CustomerID MonthStart RN
    1 2245-01-01 1
    1 2245-02-01 2
    1 2245-03-01 3
    2 2245-01-01 1
    2 2245-03-01 2
  3. Normalize with row number to form a stable group key 
    grouped AS (
      SELECT
        customerid,
        MonthStart,
        MonthStart - (rn * INTERVAL '1 month') AS grp
      FROM numbered
    ),
    

    Here is the sample output from grouped:

    CustomerID MonthStart Grp
    1 2245-01-01 2245-01-01
    1 2245-02-01 2245-01-01
    1 2245-03-01 2245-01-01
    2 2245-01-01 2245-01-01
    2 2245-03-01 2245-02-01
  4. Aggregate per group to get length and then pick the longest per entity.  This will take an additional CTE.
  5. streaks AS (
      SELECT
        customerid,
        MIN(MonthStart) AS StartMonth,
        MAX(MonthStart) AS EndMonth,
        COUNT(*)         AS MonthCount
      FROM grouped
      GROUP BY customerid, grp
    ),
    

    The results from streaks look like this:

    CustomerID StartMonth EndMonth MonthCount
    1 2245-01-01 2245-03-01 3
    2 2245-01-01 2245-01-01 1
    2 2245-03-01 2245-03-01 1
    3 2245-02-01 2245-04-01 3
    3 2245-05-01 2245-06-01 2

And finally …
ranked AS (
  SELECT
    s.*,
    ROW_NUMBER() OVER (
      PARTITION BY customerid
      ORDER BY monthcount DESC, endmonth DESC
    ) AS rnk
  FROM streaks s
)
SELECT
  customerid,
  startmonth,
  endmonth,
  monthcount AS longest_streak_months
FROM ranked
WHERE rnk = 1
ORDER BY customerid;

The final results would look like this:

CustomerID StartMonth EndMonth LongestStreakMonths
1 2245-01-01 2245-03-01 3
2 2245-01-01 2245-01-01 1
3 2245-02-01 2245-04-01 3
4 2245-01-01 2245-02-01 2
5 2245-03-01 2245-05-01 3

Click here for the complete island query
WITH monthly AS (
  SELECT
    o.customerid,
    DATE_TRUNC('month', o.orderdate)::date AS month_start
  FROM orders o
  GROUP BY o.customerid, DATE_TRUNC('month', o.orderdate)::date
),
numbered AS (
  SELECT
    customerid,
    month_start,
    ROW_NUMBER() OVER (
      PARTITION BY customerid
      ORDER BY month_start
    ) AS rn
  FROM monthly
),
grouped AS (
  SELECT
    customerid,
    month_start,
    -- Stable group key: shift each month back by rn months
    month_start - (rn * INTERVAL '1 month') AS grp
  FROM numbered
),
streaks AS (
  SELECT
    customerid,
    MIN(month_start) AS start_month,
    MAX(month_start) AS end_month,
    COUNT(*) AS month_count
  FROM grouped
  GROUP BY customerid, grp
),
ranked AS (
  SELECT
    s.*,
    ROW_NUMBER() OVER (
      PARTITION BY customerid
      ORDER BY month_count DESC, end_month DESC
    ) AS rnk
  FROM streaks s
)
SELECT
  customerid,
  start_month,
  end_month,
  month_count AS longest_streak_months
FROM ranked
WHERE rnk = 1
ORDER BY customerid;;



🔍 Consecutive Dates and Streak Analysis in Complex SQL Queries

Once you understand the Gap and Island SQL pattern, the next step is choosing the right SQL window function to solve complex SQL queries on consecutive dates, streaks, and gaps.

Use this quick guide to decide which function helps you analyze consecutive dates, streaks, and gaps between events.

Choosing the Right Window Function

  • Need to measure time between events? → Use LAG() (or LEAD() for “next”).
  • Need to form consecutive islands (streaks)? → Use ROW_NUMBER() + date grain + normalized key.
  • Need to rank streaks or gaps? → Use RANK() or DENSE_RANK() to find the top sequence per entity.

Each function supports a different angle of SQL streak analysis — finding breaks, identifying runs, or ranking their lengths. If you want the syntax-level breakdowns and typical combinations, expand the section below.

📜 Detailed Window Function Reference (Click to Expand)

Function-by-Function (Gaps vs Islands)

LAG(expr) OVER (PARTITION BY entity ORDER BY date)
  • Best for Gaps: get prev_date, then compute date - prev_date.
  • Also useful for “change since last event” (price jumps, inventory deltas).
  • Tip: De-dup the grain first so same-day repeats don’t create 0-day gaps.
LEAD(expr) OVER (PARTITION BY entity ORDER BY date)
  • Mirror of LAG().
  • Reads more naturally to compare current → next (e.g., “days until next order”).
  • Uses the same partitioning and ordering rules as LAG().
ROW_NUMBER() OVER (PARTITION BY entity ORDER BY date)
  • Core for Islands: create a running index (rn).
  • Build a stable group key, e.g. date - rn * interval, so consecutive rows fall in the same island.
  • Also handy for deduping: keep the first row per (entity, date).
RANK() / DENSE_RANK()
  • Use after forming islands to pick the longest streak or gap.
  • RANK() penalizes ties (1,2,2,4), DENSE_RANK() doesn’t (1,2,2,3).
  • Combine with ROW_NUMBER() to break ties deterministically for “Top 1 per entity.”

Framing & ORDER BY Gotchas

  • Always PARTITION BY the entity you care about.
  • Always ORDER BY a deterministic key (date + tiebreaker).
  • Default window frame is usually fine; what matters most is row order.

Typical Combos

  • Gaps: grain → LAG(date) → compute gap_daysROW_NUMBER() to pick max per entity.
  • Islands: grain → ROW_NUMBER() → normalize (date - rn * interval) → group/aggregate → rank longest streak.



⚠️ Common Pitfalls When Solving Gap and Island SQL Problems

These pitfalls frequently appear in SQL interview questions and real-world projects, making it essential to understand how to troubleshoot and explain your Gap and Island SQL solutions clearly.

1) Wrong time grain

Symptom: “My streaks skip months / gaps look huge.”

Fix: Explicitly set grain: ::date, DATE_TRUNC('week', ...), DATE_TRUNC('month', ...).

2) Non-deterministic window ordering

Symptom: Different answers run-to-run.

Fix: Add a tie-breaker: ORDER BY orderdate, orderid.

3) Duplicates at the grain

Symptom: Zero-day gaps; broken islands.

Fix: De-dup before windows (SELECT DISTINCT customerid, orderdate::date) or keep one row via ROW_NUMBER() and filter (e.g., QUALIFY rn = 1 or a subquery).

4) Missing scaffold for “no activity”

Symptom: Islands undercounted because dates with no rows vanish.

Fix: For rules like “every day/week”, build a calendar/time series and join, then apply the condition.

5) Off-by-one boundaries

Symptom: Streak length miscounts by 1.

Fix: Be explicit: for monthly streaks, count rows; for day streaks, MAX(date) - MIN(date) + 1 only if every day is present.

6) Mixing “any order” vs “Completed orders only”

Symptom: Streaks drop unexpectedly.

Fix: Apply the same filter in every CTE (WHERE status = 'Completed'), not just the last step.

7) Partition mismatch

Symptom: Streaks/gaps bleed across entities.

Fix: Double-check PARTITION BY (customer vs supplier vs category) matches the question.

8) Island grouping key done on wrong grain

Symptom: Consecutive months still split.

Fix: Create rn at the same grain you normalized (e.g., month), then grp = month_start - rn * interval '1 month'.

9) Time zones & timestamps

Symptom: Gaps differ by 1 day around midnight.

Fix: Normalize to date (or a consistent timezone) before windows.

10) Picking the “longest” incorrectly

Symptom: Not the most recent or not truly the max.

Fix: Rank inside entity: ORDER BY month_count DESC, end_month DESC (or your tie-break), then filter to the top (e.g., WHERE rnk = 1).

11) Non-performant (slow) query

Symptom: Slow queries on big tables.

Fix: Pre-filter date range; index (customerid, orderdate); avoid unnecessary DISTINCTs; materialize intermediate CTEs if needed.




🧪 Practice Problems by Concept

Here are some problems you can try right away:




🔗 Related Topics

See also: LEAD / LAG, ROWNUMBER() & RANK()