From: Tom Lane
Date: Fri, 12 Feb 2010 17:33:21 +0000 (+0000)
Subject: Extend the set of frame options supported for window functions.
X-Git-Tag: REL9_0_ALPHA4~69
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7;p=postgresql

Extend the set of frame options supported for window functions.

This patch allows the frame to start from CURRENT ROW (in either RANGE or ROWS mode), and it also adds support for ROWS n PRECEDING and ROWS n FOLLOWING start and end points. (RANGE value PRECEDING/FOLLOWING isn't there yet --- the grammar works, but that's all.)

Hitoshi Harada, reviewed by Pavel Stehule
---

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fed003c4d0..a4ea1462a8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1,4 +1,4 @@
-
+
 Functions and Operators
@@ -10559,21 +10559,23 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
 nth_value consider only the rows within the window frame, which by default contains the rows from the start of the partition through the last peer of the current row. This is
- likely to give unhelpful results for nth_value and
- particularly last_value. You can redefine the frame as
- being the whole partition by adding ROWS BETWEEN UNBOUNDED
- PRECEDING AND UNBOUNDED FOLLOWING to the OVER clause.
- See for more information.
+ likely to give unhelpful results for last_value and
+ sometimes also nth_value. You can redefine the frame by
+ adding a suitable frame specification (RANGE or
+ ROWS) to the OVER clause.
+ See for more information
+ about frame specifications.
 When an aggregate function is used as a window function, it aggregates
- over the rows within the current row's window frame. To obtain
- aggregation over the whole partition, omit ORDER BY or use
- ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
+ over the rows within the current row's window frame.
An aggregate used with ORDER BY and the default window frame definition produces a running sum type of behavior, which may or - may not be what's wanted. + may not be what's wanted. To obtain + aggregation over the whole partition, omit ORDER BY or use + ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. + Other frame specifications can be used to obtain other effects. diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 804a697e49..395ca79604 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -1,5 +1,5 @@ @@ -616,27 +616,66 @@ WINDOW window_name AS ( The optional frame_clause defines the window frame for window functions that depend on the - frame (not all do). It can be one of + frame (not all do). The window frame is a set of related rows for + each row of the query (called the current row). + The frame_clause can be one of + + +[ RANGE | ROWS ] frame_start +[ RANGE | ROWS ] BETWEEN frame_start AND frame_end + + + where frame_start and frame_end can be + one of + -RANGE UNBOUNDED PRECEDING -RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -ROWS UNBOUNDED PRECEDING -ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING +UNBOUNDED PRECEDING +value PRECEDING +CURRENT ROW +value FOLLOWING +UNBOUNDED FOLLOWING - The first two are equivalent and are also the default: they set the - frame to be all rows from the partition start up through the current row's - last peer in the ORDER BY ordering (which means all rows if - there is no ORDER BY). The options - RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING and - ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING - are also equivalent: they always select all rows in the partition. 
- Lastly, ROWS UNBOUNDED PRECEDING or its verbose equivalent - ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW select - all rows up through the current row (regardless of duplicates). - Beware that this option can produce implementation-dependent results - if the ORDER BY ordering does not order the rows uniquely. + + If frame_end is omitted it defaults to CURRENT + ROW. Restrictions are that + frame_start cannot be UNBOUNDED FOLLOWING, + frame_end cannot be UNBOUNDED PRECEDING, + and the frame_end choice cannot appear earlier in the + above list than the frame_start choice — for example + RANGE BETWEEN CURRENT ROW AND value + PRECEDING is not allowed. + + + + The default framing option is RANGE UNBOUNDED PRECEDING, + which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND + CURRENT ROW; it sets the frame to be all rows from the partition start + up through the current row's last peer in the ORDER BY + ordering (which means all rows if there is no ORDER BY). + In general, UNBOUNDED PRECEDING means that the frame + starts with the first row of the partition, and similarly + UNBOUNDED FOLLOWING means that the frame ends with the last + row of the partition (regardless of RANGE or ROWS + mode). In ROWS mode, CURRENT ROW + means that the frame starts or ends with the current row; but in + RANGE mode it means that the frame starts or ends with + the current row's first or last peer in the ORDER BY ordering. + The value PRECEDING and + value FOLLOWING cases are currently only + allowed in ROWS mode. They indicate that the frame starts + or ends with the row that many rows before or after the current row. + value must be an integer expression not + containing any variables, aggregate functions, or window functions. + The value must not be null or negative; but it can be zero, which + selects the current row itself. + + + + Beware that the ROWS options can produce unpredictable + results if the ORDER BY ordering does not order the rows + uniquely. 
The RANGE options are designed to ensure that + rows that are peers in the ORDER BY ordering are treated + alike; any two peer rows will be both in or both not in the frame. diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 64aec4f684..83ecb8653c 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1,4 +1,4 @@ - + SQL Syntax @@ -1667,14 +1667,21 @@ SELECT array_agg(a ORDER BY b DESC) FROM table; and the optional frame_clause can be one of -RANGE UNBOUNDED PRECEDING -RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -ROWS UNBOUNDED PRECEDING -ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING +[ RANGE | ROWS ] frame_start +[ RANGE | ROWS ] BETWEEN frame_start AND frame_end + where frame_start and frame_end can be + one of + +UNBOUNDED PRECEDING +value PRECEDING +CURRENT ROW +value FOLLOWING +UNBOUNDED FOLLOWING + + + Here, expression represents any value expression that does not itself contain window function calls. The PARTITION BY and ORDER BY lists have @@ -1699,19 +1706,35 @@ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING The frame_clause specifies the set of rows constituting the window frame, for those window functions that act on the frame instead of the whole partition. + If frame_end is omitted it defaults to CURRENT + ROW. Restrictions are that + frame_start cannot be UNBOUNDED FOLLOWING, + frame_end cannot be UNBOUNDED PRECEDING, + and the frame_end choice cannot appear earlier in the + above list than the frame_start choice — for example + RANGE BETWEEN CURRENT ROW AND value + PRECEDING is not allowed. The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND - CURRENT ROW; it selects rows up through the current row's last - peer in the ORDER BY ordering (which means all rows if - there is no ORDER BY). 
The options - RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING and - ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING - are also equivalent: they always select all rows in the partition. - Lastly, ROWS UNBOUNDED PRECEDING or its verbose equivalent - ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW select - all rows up through the current row (regardless of duplicates). - Beware that this option can produce implementation-dependent results - if the ORDER BY ordering does not order the rows uniquely. + CURRENT ROW; it sets the frame to be all rows from the partition start + up through the current row's last peer in the ORDER BY + ordering (which means all rows if there is no ORDER BY). + In general, UNBOUNDED PRECEDING means that the frame + starts with the first row of the partition, and similarly + UNBOUNDED FOLLOWING means that the frame ends with the last + row of the partition (regardless of RANGE or ROWS + mode). In ROWS mode, CURRENT ROW + means that the frame starts or ends with the current row; but in + RANGE mode it means that the frame starts or ends with + the current row's first or last peer in the ORDER BY ordering. + The value PRECEDING and + value FOLLOWING cases are currently only + allowed in ROWS mode. They indicate that the frame starts + or ends with the row that many rows before or after the current row. + value must be an integer expression not + containing any variables, aggregate functions, or window functions. + The value must not be null or negative; but it can be zero, which + selects the current row itself. 
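Editorial note (not part of the patch): the ROWS-mode rules documented above, where "value PRECEDING" and "value FOLLOWING" pick the row that many rows before or after the current row, can be sketched in standalone C. This is only an illustration, loosely modeled on the clamping that the patch performs in update_frameheadpos and update_frametailpos. The function names rows_frame_head, rows_frame_tail and rows_frame_size are hypothetical. The sketch also assumes that the partition size (nrows) is known up front, whereas the executor spools rows lazily, and that offsets are small enough that currentpos + offset cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: first row of a ROWS-mode frame whose start bound is
 * "offset PRECEDING" (preceding = true) or "offset FOLLOWING".
 * Rows are numbered 0 .. nrows-1. The result is clamped to [0, nrows];
 * a result of nrows means the frame starts past the end of the partition.
 */
int64_t
rows_frame_head(int64_t currentpos, int64_t offset, bool preceding,
                int64_t nrows)
{
    int64_t head;

    assert(offset >= 0);    /* the patch rejects negative offsets */
    assert(currentpos >= 0 && currentpos < nrows);

    head = currentpos + (preceding ? -offset : offset);
    if (head < 0)
        head = 0;           /* frame head can't go before first row */
    else if (head > nrows)
        head = nrows;       /* nor past the end of the partition */
    return head;
}

/*
 * Hypothetical helper: last row of a ROWS-mode frame whose end bound is
 * "offset PRECEDING" or "offset FOLLOWING". The result is clamped to
 * [-1, nrows-1]; -1 means the frame ends before the first row.
 */
int64_t
rows_frame_tail(int64_t currentpos, int64_t offset, bool preceding,
                int64_t nrows)
{
    int64_t tail;

    assert(offset >= 0);
    assert(currentpos >= 0 && currentpos < nrows);

    tail = currentpos + (preceding ? -offset : offset);
    if (tail < 0)
        tail = -1;          /* smallest allowable tail position */
    else if (tail >= nrows)
        tail = nrows - 1;   /* not past last row of partition */
    return tail;
}

/* Number of rows in the frame; zero when the bounds leave it empty. */
int64_t
rows_frame_size(int64_t head, int64_t tail)
{
    return (tail >= head) ? tail - head + 1 : 0;
}
```

For a five-row partition, ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING yields a two-row frame at the first and last rows and a three-row frame elsewhere. An offset of zero selects the current row itself, matching the documentation text in the hunks above.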
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index ea722f1ee3..9f748ca6c0 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -71,7 +71,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/executor/nodeAgg.c,v 1.172 2010/02/08 20:39:51 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/executor/nodeAgg.c,v 1.173 2010/02/12 17:33:19 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1999,7 +1999,7 @@ AggCheckCallContext(FunctionCallInfo fcinfo, MemoryContext *aggcontext) if (fcinfo->context && IsA(fcinfo->context, WindowAggState)) { if (aggcontext) - *aggcontext = ((WindowAggState *) fcinfo->context)->wincontext; + *aggcontext = ((WindowAggState *) fcinfo->context)->aggcontext; return AGG_CONTEXT_WINDOW; } diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c index 0b90992b80..c2c0af3cde 100644 --- a/src/backend/executor/nodeWindowAgg.c +++ b/src/backend/executor/nodeWindowAgg.c @@ -27,7 +27,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/executor/nodeWindowAgg.c,v 1.9 2010/01/02 16:57:45 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/executor/nodeWindowAgg.c,v 1.10 2010/02/12 17:33:19 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -165,6 +165,7 @@ static void release_partition(WindowAggState *winstate); static bool row_is_in_frame(WindowAggState *winstate, int64 pos, TupleTableSlot *slot); +static void update_frameheadpos(WindowObject winobj, TupleTableSlot *slot); static void update_frametailpos(WindowObject winobj, TupleTableSlot *slot); static WindowStatePerAggData *initialize_peragg(WindowAggState *winstate, @@ -193,7 +194,7 @@ initialize_windowaggregate(WindowAggState *winstate, peraggstate->transValue = 
peraggstate->initValue; else { - oldContext = MemoryContextSwitchTo(winstate->wincontext); + oldContext = MemoryContextSwitchTo(winstate->aggcontext); peraggstate->transValue = datumCopy(peraggstate->initValue, peraggstate->transtypeByVal, peraggstate->transtypeLen); @@ -258,10 +259,10 @@ advance_windowaggregate(WindowAggState *winstate, * already checked that the agg's input type is binary-compatible * with its transtype, so straight copy here is OK.) * - * We must copy the datum into wincontext if it is pass-by-ref. We + * We must copy the datum into aggcontext if it is pass-by-ref. We * do not need to pfree the old transValue, since it's NULL. */ - MemoryContextSwitchTo(winstate->wincontext); + MemoryContextSwitchTo(winstate->aggcontext); peraggstate->transValue = datumCopy(fcinfo->arg[1], peraggstate->transtypeByVal, peraggstate->transtypeLen); @@ -294,7 +295,7 @@ advance_windowaggregate(WindowAggState *winstate, newVal = FunctionCallInvoke(fcinfo); /* - * If pass-by-ref datatype, must copy the new value into wincontext and + * If pass-by-ref datatype, must copy the new value into aggcontext and * pfree the prior transValue. But if transfn returned a pointer to its * first input, we don't need to do anything. 
*/ @@ -303,7 +304,7 @@ advance_windowaggregate(WindowAggState *winstate, { if (!fcinfo->isnull) { - MemoryContextSwitchTo(winstate->wincontext); + MemoryContextSwitchTo(winstate->aggcontext); newVal = datumCopy(newVal, peraggstate->transtypeByVal, peraggstate->transtypeLen); @@ -390,6 +391,7 @@ eval_windowaggregates(WindowAggState *winstate) int i; MemoryContext oldContext; ExprContext *econtext; + WindowObject agg_winobj; TupleTableSlot *agg_row_slot; numaggs = winstate->numaggs; @@ -398,10 +400,14 @@ eval_windowaggregates(WindowAggState *winstate) /* final output execution is in ps_ExprContext */ econtext = winstate->ss.ps.ps_ExprContext; + agg_winobj = winstate->agg_winobj; + agg_row_slot = winstate->agg_row_slot; /* * Currently, we support only a subset of the SQL-standard window framing - * rules. In all the supported cases, the window frame always consists of + * rules. + * + * If the frame start is UNBOUNDED_PRECEDING, the window frame consists of * a contiguous group of rows extending forward from the start of the * partition, and rows only enter the frame, never exit it, as the current * row advances forward. This makes it possible to use an incremental @@ -413,6 +419,10 @@ eval_windowaggregates(WindowAggState *winstate) * damage the running transition value, but we have the same assumption * in nodeAgg.c too (when it rescans an existing hash table). * + * For other frame start rules, we discard the aggregate state and re-run + * the aggregates whenever the frame head row moves. We can still + * optimize as above whenever successive rows share the same frame head. + * * In many common cases, multiple rows share the same frame and hence the * same aggregate value. (In particular, if there's no ORDER BY in a RANGE * window, then all rows are peers and so they all have window frame equal @@ -424,63 +434,90 @@ eval_windowaggregates(WindowAggState *winstate) * accumulated into the aggregate transition values. 
Whenever we start a * new peer group, we accumulate forward to the end of the peer group. * - * TODO: In the future, we should implement the full SQL-standard set of - * framing rules. We could implement the other cases by recalculating the - * aggregates whenever a row exits the frame. That would be pretty slow, - * though. For aggregates like SUM and COUNT we could implement a - * "negative transition function" that would be called for each row as it - * exits the frame. We'd have to think about avoiding recalculation of - * volatile arguments of aggregate functions, too. + * TODO: Rerunning aggregates from the frame start can be pretty slow. + * For some aggregates like SUM and COUNT we could avoid that by + * implementing a "negative transition function" that would be called for + * each row as it exits the frame. We'd have to think about avoiding + * recalculation of volatile arguments of aggregate functions, too. */ /* - * If we've already aggregated up through current row, reuse the saved - * result values. NOTE: this test works for the currently supported - * framing rules, but will need fixing when more are added. + * First, update the frame head position. */ - if (winstate->aggregatedupto > winstate->currentpos) + update_frameheadpos(agg_winobj, winstate->temp_slot_1); + + /* + * Initialize aggregates on first call for partition, or if the frame + * head position moved since last time. 
+ */ + if (winstate->currentpos == 0 || + winstate->frameheadpos != winstate->aggregatedbase) { + /* + * Discard transient aggregate values + */ + MemoryContextResetAndDeleteChildren(winstate->aggcontext); + for (i = 0; i < numaggs; i++) { peraggstate = &winstate->peragg[i]; wfuncno = peraggstate->wfuncno; - econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue; - econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull; + initialize_windowaggregate(winstate, + &winstate->perfunc[wfuncno], + peraggstate); } - return; + + /* + * If we created a mark pointer for aggregates, keep it pushed up + * to frame head, so that tuplestore can discard unnecessary rows. + */ + if (agg_winobj->markptr >= 0) + WinSetMarkPosition(agg_winobj, winstate->frameheadpos); + + /* + * Initialize for loop below + */ + ExecClearTuple(agg_row_slot); + winstate->aggregatedbase = winstate->frameheadpos; + winstate->aggregatedupto = winstate->frameheadpos; } - /* Initialize aggregates on first call for partition */ - if (winstate->currentpos == 0) + /* + * In UNBOUNDED_FOLLOWING mode, we don't have to recalculate aggregates + * except when the frame head moves. In END_CURRENT_ROW mode, we only + * have to recalculate when the frame head moves or currentpos has advanced + * past the place we'd aggregated up to. Check for these cases and if + * so, reuse the saved result values. 
+ */ + if ((winstate->frameOptions & (FRAMEOPTION_END_UNBOUNDED_FOLLOWING | + FRAMEOPTION_END_CURRENT_ROW)) && + winstate->aggregatedbase <= winstate->currentpos && + winstate->aggregatedupto > winstate->currentpos) { for (i = 0; i < numaggs; i++) { peraggstate = &winstate->peragg[i]; wfuncno = peraggstate->wfuncno; - initialize_windowaggregate(winstate, - &winstate->perfunc[wfuncno], - peraggstate); + econtext->ecxt_aggvalues[wfuncno] = peraggstate->resultValue; + econtext->ecxt_aggnulls[wfuncno] = peraggstate->resultValueIsNull; } + return; } /* * Advance until we reach a row not in frame (or end of partition). * * Note the loop invariant: agg_row_slot is either empty or holds the row - * at position aggregatedupto. The agg_ptr read pointer must always point - * to the next row to read into agg_row_slot. + * at position aggregatedupto. We advance aggregatedupto after processing + * a row. */ - agg_row_slot = winstate->agg_row_slot; for (;;) { /* Fetch next row if we didn't already */ if (TupIsNull(agg_row_slot)) { - spool_tuples(winstate, winstate->aggregatedupto); - tuplestore_select_read_pointer(winstate->buffer, - winstate->agg_ptr); - if (!tuplestore_gettupleslot(winstate->buffer, true, true, - agg_row_slot)) + if (!window_gettupleslot(agg_winobj, winstate->aggregatedupto, + agg_row_slot)) break; /* must be end of partition */ } @@ -544,11 +581,11 @@ eval_windowaggregates(WindowAggState *winstate) pfree(DatumGetPointer(peraggstate->resultValue)); /* - * If pass-by-ref, copy it into our global context. + * If pass-by-ref, copy it into our aggregate context. 
*/ if (!*isnull) { - oldContext = MemoryContextSwitchTo(winstate->wincontext); + oldContext = MemoryContextSwitchTo(winstate->aggcontext); peraggstate->resultValue = datumCopy(*result, peraggstate->resulttypeByVal, @@ -624,11 +661,12 @@ begin_partition(WindowAggState *winstate) int i; winstate->partition_spooled = false; + winstate->framehead_valid = false; winstate->frametail_valid = false; winstate->spooled_rows = 0; winstate->currentpos = 0; + winstate->frameheadpos = 0; winstate->frametailpos = -1; - winstate->aggregatedupto = 0; ExecClearTuple(winstate->agg_row_slot); /* @@ -654,18 +692,39 @@ begin_partition(WindowAggState *winstate) winstate->buffer = tuplestore_begin_heap(false, false, work_mem); /* - * Set up read pointers for the tuplestore. The current and agg pointers - * don't need BACKWARD capability, but the per-window-function read - * pointers do. + * Set up read pointers for the tuplestore. The current pointer doesn't + * need BACKWARD capability, but the per-window-function read pointers do, + * and the aggregate pointer does if frame start is movable. */ winstate->current_ptr = 0; /* read pointer 0 is pre-allocated */ /* reset default REWIND capability bit for current ptr */ tuplestore_set_eflags(winstate->buffer, 0); - /* create a read pointer for aggregates, if needed */ + /* create read pointers for aggregates, if needed */ if (winstate->numaggs > 0) - winstate->agg_ptr = tuplestore_alloc_read_pointer(winstate->buffer, 0); + { + WindowObject agg_winobj = winstate->agg_winobj; + int readptr_flags = 0; + + /* If the frame head is potentially movable ... */ + if (!(winstate->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)) + { + /* ... 
create a mark pointer to track the frame head */ + agg_winobj->markptr = tuplestore_alloc_read_pointer(winstate->buffer, 0); + /* and the read pointer will need BACKWARD capability */ + readptr_flags |= EXEC_FLAG_BACKWARD; + } + + agg_winobj->readptr = tuplestore_alloc_read_pointer(winstate->buffer, + readptr_flags); + agg_winobj->markpos = -1; + agg_winobj->seekpos = -1; + + /* Also reset the row counters for aggregates */ + winstate->aggregatedbase = 0; + winstate->aggregatedupto = 0; + } /* create mark and read pointers for each real window function */ for (i = 0; i < numfuncs; i++) @@ -694,8 +753,8 @@ begin_partition(WindowAggState *winstate) } /* - * Read tuples from the outer node, up to position 'pos', and store them - * into the tuplestore. If pos is -1, reads the whole partition. + * Read tuples from the outer node, up to and including position 'pos', and + * store them into the tuplestore. If pos is -1, reads the whole partition. */ static void spool_tuples(WindowAggState *winstate, int64 pos) @@ -789,7 +848,8 @@ release_partition(WindowAggState *winstate) * any aggregate temp data). We don't rely on retail pfree because some * aggregates might have allocated data we don't have direct pointers to. 
*/ - MemoryContextResetAndDeleteChildren(winstate->wincontext); + MemoryContextResetAndDeleteChildren(winstate->partcontext); + MemoryContextResetAndDeleteChildren(winstate->aggcontext); if (winstate->buffer) tuplestore_end(winstate->buffer); @@ -809,108 +869,303 @@ release_partition(WindowAggState *winstate) static bool row_is_in_frame(WindowAggState *winstate, int64 pos, TupleTableSlot *slot) { - WindowAgg *node = (WindowAgg *) winstate->ss.ps.plan; - int frameOptions = node->frameOptions; + int frameOptions = winstate->frameOptions; Assert(pos >= 0); /* else caller error */ - /* We only support frame start mode UNBOUNDED PRECEDING for now */ - Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING); + /* First, check frame starting conditions */ + if (frameOptions & FRAMEOPTION_START_CURRENT_ROW) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + /* rows before current row are out of frame */ + if (pos < winstate->currentpos) + return false; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* preceding row that is not peer is out of frame */ + if (pos < winstate->currentpos && + !are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot)) + return false; + } + else + Assert(false); + } + else if (frameOptions & FRAMEOPTION_START_VALUE) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + int64 offset = DatumGetInt64(winstate->startOffsetValue); - /* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */ - if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING) - return true; + /* rows before current row + offset are out of frame */ + if (frameOptions & FRAMEOPTION_START_VALUE_PRECEDING) + offset = -offset; - /* Else frame tail mode must be CURRENT ROW */ - Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW); + if (pos < winstate->currentpos + offset) + return false; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* parser should have rejected this */ + elog(ERROR, "window frame with value offset is not implemented"); + } + else + 
Assert(false); + } - /* if row is current row or a predecessor, it must be in frame */ - if (pos <= winstate->currentpos) - return true; + /* Okay so far, now check frame ending conditions */ + if (frameOptions & FRAMEOPTION_END_CURRENT_ROW) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + /* rows after current row are out of frame */ + if (pos > winstate->currentpos) + return false; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* following row that is not peer is out of frame */ + if (pos > winstate->currentpos && + !are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot)) + return false; + } + else + Assert(false); + } + else if (frameOptions & FRAMEOPTION_END_VALUE) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + int64 offset = DatumGetInt64(winstate->endOffsetValue); - /* In ROWS mode, *only* such rows are in frame */ - if (frameOptions & FRAMEOPTION_ROWS) - return false; + /* rows after current row + offset are out of frame */ + if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING) + offset = -offset; - /* Else must be RANGE mode */ - Assert(frameOptions & FRAMEOPTION_RANGE); + if (pos > winstate->currentpos + offset) + return false; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* parser should have rejected this */ + elog(ERROR, "window frame with value offset is not implemented"); + } + else + Assert(false); + } - /* In frame iff it's a peer of current row */ - return are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot); + /* If we get here, it's in frame */ + return true; } /* - * update_frametailpos - * make frametailpos valid for the current row + * update_frameheadpos + * make frameheadpos valid for the current row * - * Uses the winobj's read pointer for any required fetches; the winobj's - * mark must not be past the currently known frame tail. Also uses the - * specified slot for any required fetches. 
+ * Uses the winobj's read pointer for any required fetches; hence, if the + * frame mode is one that requires row comparisons, the winobj's mark must + * not be past the currently known frame head. Also uses the specified slot + * for any required fetches. */ static void -update_frametailpos(WindowObject winobj, TupleTableSlot *slot) +update_frameheadpos(WindowObject winobj, TupleTableSlot *slot) { WindowAggState *winstate = winobj->winstate; WindowAgg *node = (WindowAgg *) winstate->ss.ps.plan; - int frameOptions = node->frameOptions; - int64 ftnext; + int frameOptions = winstate->frameOptions; - if (winstate->frametail_valid) + if (winstate->framehead_valid) return; /* already known for current row */ - /* We only support frame start mode UNBOUNDED PRECEDING for now */ - Assert(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING); - - /* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */ - if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING) + if (frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) { - spool_tuples(winstate, -1); - winstate->frametailpos = winstate->spooled_rows - 1; - winstate->frametail_valid = true; - return; + /* In UNBOUNDED PRECEDING mode, frame head is always row 0 */ + winstate->frameheadpos = 0; + winstate->framehead_valid = true; } + else if (frameOptions & FRAMEOPTION_START_CURRENT_ROW) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + /* In ROWS mode, frame head is the same as current */ + winstate->frameheadpos = winstate->currentpos; + winstate->framehead_valid = true; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + int64 fhprev; - /* Else frame tail mode must be CURRENT ROW */ - Assert(frameOptions & FRAMEOPTION_END_CURRENT_ROW); + /* If no ORDER BY, all rows are peers with each other */ + if (node->ordNumCols == 0) + { + winstate->frameheadpos = 0; + winstate->framehead_valid = true; + return; + } - /* In ROWS mode, exactly the rows up to current are in frame */ - if (frameOptions & FRAMEOPTION_ROWS) + /* 
+ * In RANGE START_CURRENT mode, frame head is the first row that + * is a peer of current row. We search backwards from current, + * which could be a bit inefficient if peer sets are large. + * Might be better to have a separate read pointer that moves + * forward tracking the frame head. + */ + fhprev = winstate->currentpos - 1; + for (;;) + { + /* assume the frame head can't go backwards */ + if (fhprev < winstate->frameheadpos) + break; + if (!window_gettupleslot(winobj, fhprev, slot)) + break; /* start of partition */ + if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot)) + break; /* not peer of current row */ + fhprev--; + } + winstate->frameheadpos = fhprev + 1; + winstate->framehead_valid = true; + } + else + Assert(false); + } + else if (frameOptions & FRAMEOPTION_START_VALUE) { - winstate->frametailpos = winstate->currentpos; - winstate->frametail_valid = true; - return; + if (frameOptions & FRAMEOPTION_ROWS) + { + /* In ROWS mode, bound is physically n before/after current */ + int64 offset = DatumGetInt64(winstate->startOffsetValue); + + if (frameOptions & FRAMEOPTION_START_VALUE_PRECEDING) + offset = -offset; + + winstate->frameheadpos = winstate->currentpos + offset; + /* frame head can't go before first row */ + if (winstate->frameheadpos < 0) + winstate->frameheadpos = 0; + else if (winstate->frameheadpos > winstate->currentpos) + { + /* make sure frameheadpos is not past end of partition */ + spool_tuples(winstate, winstate->frameheadpos - 1); + if (winstate->frameheadpos > winstate->spooled_rows) + winstate->frameheadpos = winstate->spooled_rows; + } + winstate->framehead_valid = true; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* parser should have rejected this */ + elog(ERROR, "window frame with value offset is not implemented"); + } + else + Assert(false); } + else + Assert(false); +} - /* Else must be RANGE mode */ - Assert(frameOptions & FRAMEOPTION_RANGE); +/* + * update_frametailpos + * make frametailpos valid for the 
current row + * + * Uses the winobj's read pointer for any required fetches; hence, if the + * frame mode is one that requires row comparisons, the winobj's mark must + * not be past the currently known frame tail. Also uses the specified slot + * for any required fetches. + */ +static void +update_frametailpos(WindowObject winobj, TupleTableSlot *slot) +{ + WindowAggState *winstate = winobj->winstate; + WindowAgg *node = (WindowAgg *) winstate->ss.ps.plan; + int frameOptions = winstate->frameOptions; - /* If no ORDER BY, all rows are peers with each other */ - if (node->ordNumCols == 0) + if (winstate->frametail_valid) + return; /* already known for current row */ + + if (frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING) { + /* In UNBOUNDED FOLLOWING mode, all partition rows are in frame */ spool_tuples(winstate, -1); winstate->frametailpos = winstate->spooled_rows - 1; winstate->frametail_valid = true; - return; } + else if (frameOptions & FRAMEOPTION_END_CURRENT_ROW) + { + if (frameOptions & FRAMEOPTION_ROWS) + { + /* In ROWS mode, exactly the rows up to current are in frame */ + winstate->frametailpos = winstate->currentpos; + winstate->frametail_valid = true; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + int64 ftnext; - /* - * Else we have to search for the first non-peer of the current row. We - * assume the current value of frametailpos is a lower bound on the - * possible frame tail location, ie, frame tail never goes backward, and - * that currentpos is also a lower bound, ie, current row is always in - * frame. - */ - ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1; - for (;;) + /* If no ORDER BY, all rows are peers with each other */ + if (node->ordNumCols == 0) + { + spool_tuples(winstate, -1); + winstate->frametailpos = winstate->spooled_rows - 1; + winstate->frametail_valid = true; + return; + } + + /* + * Else we have to search for the first non-peer of the current + * row. 
We assume the current value of frametailpos is a lower + * bound on the possible frame tail location, ie, frame tail never + * goes backward, and that currentpos is also a lower bound, ie, + * frame end always >= current row. + */ + ftnext = Max(winstate->frametailpos, winstate->currentpos) + 1; + for (;;) + { + if (!window_gettupleslot(winobj, ftnext, slot)) + break; /* end of partition */ + if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot)) + break; /* not peer of current row */ + ftnext++; + } + winstate->frametailpos = ftnext - 1; + winstate->frametail_valid = true; + } + else + Assert(false); + } + else if (frameOptions & FRAMEOPTION_END_VALUE) { - if (!window_gettupleslot(winobj, ftnext, slot)) - break; /* end of partition */ - if (!are_peers(winstate, slot, winstate->ss.ss_ScanTupleSlot)) - break; /* not peer of current row */ - ftnext++; + if (frameOptions & FRAMEOPTION_ROWS) + { + /* In ROWS mode, bound is physically n before/after current */ + int64 offset = DatumGetInt64(winstate->endOffsetValue); + + if (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING) + offset = -offset; + + winstate->frametailpos = winstate->currentpos + offset; + /* smallest allowable value of frametailpos is -1 */ + if (winstate->frametailpos < 0) + winstate->frametailpos = -1; + else if (winstate->frametailpos > winstate->currentpos) + { + /* make sure frametailpos is not past last row of partition */ + spool_tuples(winstate, winstate->frametailpos); + if (winstate->frametailpos >= winstate->spooled_rows) + winstate->frametailpos = winstate->spooled_rows - 1; + } + winstate->frametail_valid = true; + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* parser should have rejected this */ + elog(ERROR, "window frame with value offset is not implemented"); + } + else + Assert(false); } - winstate->frametailpos = ftnext - 1; - winstate->frametail_valid = true; + else + Assert(false); } @@ -953,6 +1208,73 @@ ExecWindowAgg(WindowAggState *winstate) 
winstate->ss.ps.ps_TupFromTlist = false; } + /* + * Compute frame offset values, if any, during first call. + */ + if (winstate->all_first) + { + int frameOptions = winstate->frameOptions; + ExprContext *econtext = winstate->ss.ps.ps_ExprContext; + Datum value; + bool isnull; + int16 len; + bool byval; + + if (frameOptions & FRAMEOPTION_START_VALUE) + { + Assert(winstate->startOffset != NULL); + value = ExecEvalExprSwitchContext(winstate->startOffset, + econtext, + &isnull, + NULL); + if (isnull) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg("frame starting offset must not be NULL"))); + /* copy value into query-lifespan context */ + get_typlenbyval(exprType((Node *) winstate->startOffset->expr), + &len, &byval); + winstate->startOffsetValue = datumCopy(value, byval, len); + if (frameOptions & FRAMEOPTION_ROWS) + { + /* value is known to be int8 */ + int64 offset = DatumGetInt64(value); + + if (offset < 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("frame starting offset must not be negative"))); + } + } + if (frameOptions & FRAMEOPTION_END_VALUE) + { + Assert(winstate->endOffset != NULL); + value = ExecEvalExprSwitchContext(winstate->endOffset, + econtext, + &isnull, + NULL); + if (isnull) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg("frame ending offset must not be NULL"))); + /* copy value into query-lifespan context */ + get_typlenbyval(exprType((Node *) winstate->endOffset->expr), + &len, &byval); + winstate->endOffsetValue = datumCopy(value, byval, len); + if (frameOptions & FRAMEOPTION_ROWS) + { + /* value is known to be int8 */ + int64 offset = DatumGetInt64(value); + + if (offset < 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("frame ending offset must not be negative"))); + } + } + winstate->all_first = false; + } + restart: if (winstate->buffer == NULL) { @@ -964,7 +1286,8 @@ restart: { /* Advance current row within partition */ 
winstate->currentpos++; - /* This might mean that the frame tail moves, too */ + /* This might mean that the frame moves, too */ + winstate->framehead_valid = false; winstate->frametail_valid = false; } @@ -1099,10 +1422,18 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags) winstate->tmpcontext = tmpcontext; ExecAssignExprContext(estate, &winstate->ss.ps); - /* Create long-lived context for storage of aggregate transvalues etc */ - winstate->wincontext = + /* Create long-lived context for storage of partition-local memory etc */ + winstate->partcontext = AllocSetContextCreate(CurrentMemoryContext, - "WindowAggContext", + "WindowAgg_Partition", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + + /* Create mid-lived context for aggregate trans values etc */ + winstate->aggcontext = + AllocSetContextCreate(CurrentMemoryContext, + "WindowAgg_Aggregates", ALLOCSET_DEFAULT_MINSIZE, ALLOCSET_DEFAULT_INITSIZE, ALLOCSET_DEFAULT_MAXSIZE); @@ -1229,7 +1560,7 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags) perfuncstate->numArguments = list_length(wfuncstate->args); fmgr_info_cxt(wfunc->winfnoid, &perfuncstate->flinfo, - tmpcontext->ecxt_per_query_memory); + econtext->ecxt_per_query_memory); perfuncstate->flinfo.fn_expr = (Node *) wfunc; get_typlenbyval(wfunc->wintype, &perfuncstate->resulttypeLen, @@ -1264,6 +1595,30 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags) winstate->numfuncs = wfuncno + 1; winstate->numaggs = aggno + 1; + /* Set up WindowObject for aggregates, if needed */ + if (winstate->numaggs > 0) + { + WindowObject agg_winobj = makeNode(WindowObjectData); + + agg_winobj->winstate = winstate; + agg_winobj->argstates = NIL; + agg_winobj->localmem = NULL; + /* make sure markptr = -1 to invalidate. 
It may not get used */ + agg_winobj->markptr = -1; + agg_winobj->readptr = -1; + winstate->agg_winobj = agg_winobj; + } + + /* copy frame options to state node for easy access */ + winstate->frameOptions = node->frameOptions; + + /* initialize frame bound offset expressions */ + winstate->startOffset = ExecInitExpr((Expr *) node->startOffset, + (PlanState *) winstate); + winstate->endOffset = ExecInitExpr((Expr *) node->endOffset, + (PlanState *) winstate); + + winstate->all_first = true; winstate->partition_spooled = false; winstate->more_partitions = false; @@ -1297,7 +1652,8 @@ ExecEndWindowAgg(WindowAggState *node) node->ss.ps.ps_ExprContext = node->tmpcontext; ExecFreeExprContext(&node->ss.ps); - MemoryContextDelete(node->wincontext); + MemoryContextDelete(node->partcontext); + MemoryContextDelete(node->aggcontext); outerPlan = outerPlanState(node); ExecEndNode(outerPlan); @@ -1315,6 +1671,7 @@ ExecReScanWindowAgg(WindowAggState *node, ExprContext *exprCtxt) node->all_done = false; node->ss.ps.ps_TupFromTlist = false; + node->all_first = true; /* release tuplestore et al */ release_partition(node); @@ -1566,7 +1923,7 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot) * There's no API to refetch the tuple at the current position. We have to * move one tuple forward, and then one backward. (We don't do it the * other way because we might try to fetch the row before our mark, which - * isn't allowed.) + * isn't allowed.) XXX this case could stand to be optimized. 
*/ if (winobj->seekpos == pos) { @@ -1616,8 +1973,8 @@ WinGetPartitionLocalMemory(WindowObject winobj, Size sz) { Assert(WindowObjectIsValid(winobj)); if (winobj->localmem == NULL) - winobj->localmem = MemoryContextAllocZero(winobj->winstate->wincontext, - sz); + winobj->localmem = + MemoryContextAllocZero(winobj->winstate->partcontext, sz); return winobj->localmem; } @@ -1791,7 +2148,30 @@ WinGetFuncArgInPartition(WindowObject winobj, int argno, if (isout) *isout = false; if (set_mark) - WinSetMarkPosition(winobj, abs_pos); + { + int frameOptions = winstate->frameOptions; + int64 mark_pos = abs_pos; + + /* + * In RANGE mode with a moving frame head, we must not let the + * mark advance past frameheadpos, since that row has to be + * fetchable during future update_frameheadpos calls. + * + * XXX it is very ugly to pollute window functions' marks with + * this consideration; it could for instance mask a logic bug + * that lets a window function fetch rows before what it had + * claimed was its mark. Perhaps use a separate mark for + * frame head probes? 
+ */ + if ((frameOptions & FRAMEOPTION_RANGE) && + !(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)) + { + update_frameheadpos(winobj, winstate->temp_slot_2); + if (mark_pos > winstate->frameheadpos) + mark_pos = winstate->frameheadpos; + } + WinSetMarkPosition(winobj, mark_pos); + } econtext->ecxt_outertuple = slot; return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno), econtext, isnull, NULL); @@ -1838,7 +2218,8 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno, abs_pos = winstate->currentpos + relpos; break; case WINDOW_SEEK_HEAD: - abs_pos = relpos; + update_frameheadpos(winobj, slot); + abs_pos = winstate->frameheadpos + relpos; break; case WINDOW_SEEK_TAIL: update_frametailpos(winobj, slot); @@ -1866,7 +2247,30 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno, if (isout) *isout = false; if (set_mark) - WinSetMarkPosition(winobj, abs_pos); + { + int frameOptions = winstate->frameOptions; + int64 mark_pos = abs_pos; + + /* + * In RANGE mode with a moving frame head, we must not let the + * mark advance past frameheadpos, since that row has to be + * fetchable during future update_frameheadpos calls. + * + * XXX it is very ugly to pollute window functions' marks with + * this consideration; it could for instance mask a logic bug + * that lets a window function fetch rows before what it had + * claimed was its mark. Perhaps use a separate mark for + * frame head probes? 
+ */ + if ((frameOptions & FRAMEOPTION_RANGE) && + !(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING)) + { + update_frameheadpos(winobj, winstate->temp_slot_2); + if (mark_pos > winstate->frameheadpos) + mark_pos = winstate->frameheadpos; + } + WinSetMarkPosition(winobj, mark_pos); + } econtext->ecxt_outertuple = slot; return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno), econtext, isnull, NULL); diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 4de0f7c87e..271a7a2129 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -15,7 +15,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.460 2010/01/28 23:21:11 petere Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.461 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -718,6 +718,8 @@ _copyWindowAgg(WindowAgg *from) COPY_POINTER_FIELD(ordOperators, from->ordNumCols * sizeof(Oid)); } COPY_SCALAR_FIELD(frameOptions); + COPY_NODE_FIELD(startOffset); + COPY_NODE_FIELD(endOffset); return newnode; } @@ -1848,6 +1850,8 @@ _copyWindowClause(WindowClause *from) COPY_NODE_FIELD(partitionClause); COPY_NODE_FIELD(orderClause); COPY_SCALAR_FIELD(frameOptions); + COPY_NODE_FIELD(startOffset); + COPY_NODE_FIELD(endOffset); COPY_SCALAR_FIELD(winref); COPY_SCALAR_FIELD(copiedOrder); @@ -2076,6 +2080,8 @@ _copyWindowDef(WindowDef *from) COPY_NODE_FIELD(partitionClause); COPY_NODE_FIELD(orderClause); COPY_SCALAR_FIELD(frameOptions); + COPY_NODE_FIELD(startOffset); + COPY_NODE_FIELD(endOffset); COPY_LOCATION_FIELD(location); return newnode; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 319070add1..001c2096b4 100644 --- a/src/backend/nodes/equalfuncs.c +++ b/src/backend/nodes/equalfuncs.c @@ -22,7 +22,7 @@ * Portions Copyright (c) 1994, Regents of the 
University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.381 2010/01/28 23:21:11 petere Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.382 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -2056,6 +2056,8 @@ _equalWindowDef(WindowDef *a, WindowDef *b) COMPARE_NODE_FIELD(partitionClause); COMPARE_NODE_FIELD(orderClause); COMPARE_SCALAR_FIELD(frameOptions); + COMPARE_NODE_FIELD(startOffset); + COMPARE_NODE_FIELD(endOffset); COMPARE_LOCATION_FIELD(location); return true; @@ -2205,6 +2207,8 @@ _equalWindowClause(WindowClause *a, WindowClause *b) COMPARE_NODE_FIELD(partitionClause); COMPARE_NODE_FIELD(orderClause); COMPARE_SCALAR_FIELD(frameOptions); + COMPARE_NODE_FIELD(startOffset); + COMPARE_NODE_FIELD(endOffset); COMPARE_SCALAR_FIELD(winref); COMPARE_SCALAR_FIELD(copiedOrder); diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 76b22b4518..c8c3202df4 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/nodeFuncs.c,v 1.45 2010/01/02 16:57:46 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/nodeFuncs.c,v 1.46 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1298,6 +1298,10 @@ expression_tree_walker(Node *node, return true; if (walker(wc->orderClause, context)) return true; + if (walker(wc->startOffset, context)) + return true; + if (walker(wc->endOffset, context)) + return true; } break; case T_CommonTableExpr: @@ -1950,6 +1954,8 @@ expression_tree_mutator(Node *node, FLATCOPY(newnode, wc, WindowClause); MUTATE(newnode->partitionClause, wc->partitionClause, List *); MUTATE(newnode->orderClause, wc->orderClause, List *); + MUTATE(newnode->startOffset, wc->startOffset, Node *); + MUTATE(newnode->endOffset, wc->endOffset, Node *); return (Node *) 
newnode; } break; @@ -2475,6 +2481,10 @@ bool return true; if (walker(wd->orderClause, context)) return true; + if (walker(wd->startOffset, context)) + return true; + if (walker(wd->endOffset, context)) + return true; } break; case T_RangeSubselect: diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 7095080790..593fe397d3 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.381 2010/01/28 23:21:12 petere Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.382 2010/02/12 17:33:20 tgl Exp $ * * NOTES * Every node type that can appear in stored rules' parsetrees *must* @@ -610,6 +610,8 @@ _outWindowAgg(StringInfo str, WindowAgg *node) appendStringInfo(str, " %u", node->ordOperators[i]); WRITE_INT_FIELD(frameOptions); + WRITE_NODE_FIELD(startOffset); + WRITE_NODE_FIELD(endOffset); } static void @@ -2035,6 +2037,8 @@ _outWindowClause(StringInfo str, WindowClause *node) WRITE_NODE_FIELD(partitionClause); WRITE_NODE_FIELD(orderClause); WRITE_INT_FIELD(frameOptions); + WRITE_NODE_FIELD(startOffset); + WRITE_NODE_FIELD(endOffset); WRITE_UINT_FIELD(winref); WRITE_BOOL_FIELD(copiedOrder); } @@ -2326,6 +2330,8 @@ _outWindowDef(StringInfo str, WindowDef *node) WRITE_NODE_FIELD(partitionClause); WRITE_NODE_FIELD(orderClause); WRITE_INT_FIELD(frameOptions); + WRITE_NODE_FIELD(startOffset); + WRITE_NODE_FIELD(endOffset); WRITE_LOCATION_FIELD(location); } diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index de0d25bb84..35ba22203f 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/nodes/readfuncs.c,v 1.230 2010/01/02 16:57:46 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/nodes/readfuncs.c,v 1.231 2010/02/12 17:33:20 tgl Exp $ * * NOTES * Path and Plan nodes do not have any readfuncs support, because we 
@@ -279,6 +279,8 @@ _readWindowClause(void) READ_NODE_FIELD(partitionClause); READ_NODE_FIELD(orderClause); READ_INT_FIELD(frameOptions); + READ_NODE_FIELD(startOffset); + READ_NODE_FIELD(endOffset); READ_UINT_FIELD(winref); READ_BOOL_FIELD(copiedOrder); diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 41e048359f..112c2389d1 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -10,7 +10,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.270 2010/01/02 16:57:47 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.271 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -3364,7 +3364,8 @@ make_windowagg(PlannerInfo *root, List *tlist, int numWindowFuncs, Index winref, int partNumCols, AttrNumber *partColIdx, Oid *partOperators, int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators, - int frameOptions, Plan *lefttree) + int frameOptions, Node *startOffset, Node *endOffset, + Plan *lefttree) { WindowAgg *node = makeNode(WindowAgg); Plan *plan = &node->plan; @@ -3379,6 +3380,8 @@ make_windowagg(PlannerInfo *root, List *tlist, node->ordColIdx = ordColIdx; node->ordOperators = ordOperators; node->frameOptions = frameOptions; + node->startOffset = startOffset; + node->endOffset = endOffset; copy_plan_costsize(plan, lefttree); /* only care about copying size */ cost_windowagg(&windowagg_path, root, diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 3748c83fd6..77e9d65ae7 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.264 2010/02/10 03:38:35 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.265 2010/02/12 17:33:20 tgl Exp $ * 
*------------------------------------------------------------------------- */ @@ -398,7 +398,10 @@ subquery_planner(PlannerGlobal *glob, Query *parse, root->hasPseudoConstantQuals = false; /* - * Do expression preprocessing on targetlist and quals. + * Do expression preprocessing on targetlist and quals, as well as other + * random expressions in the querytree. Note that we do not need to + * handle sort/group expressions explicitly, because they are actually + * part of the targetlist. */ parse->targetList = (List *) preprocess_expression(root, (Node *) parse->targetList, @@ -413,6 +416,17 @@ subquery_planner(PlannerGlobal *glob, Query *parse, parse->havingQual = preprocess_expression(root, parse->havingQual, EXPRKIND_QUAL); + foreach(l, parse->windowClause) + { + WindowClause *wc = (WindowClause *) lfirst(l); + + /* partitionClause/orderClause are sort/group expressions */ + wc->startOffset = preprocess_expression(root, wc->startOffset, + EXPRKIND_LIMIT); + wc->endOffset = preprocess_expression(root, wc->endOffset, + EXPRKIND_LIMIT); + } + parse->limitOffset = preprocess_expression(root, parse->limitOffset, EXPRKIND_LIMIT); parse->limitCount = preprocess_expression(root, parse->limitCount, @@ -1513,6 +1527,8 @@ grouping_planner(PlannerInfo *root, double tuple_fraction) ordColIdx, ordOperators, wc->frameOptions, + wc->startOffset, + wc->endOffset, result_plan); } } diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index aa4fd4e1eb..18e721a581 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -9,7 +9,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/setrefs.c,v 1.157 2010/01/15 22:36:32 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/setrefs.c,v 1.158 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -466,10 +466,26 @@ set_plan_refs(PlannerGlobal *glob, Plan *plan, int rtoffset) } 
break; case T_Agg: - case T_WindowAgg: case T_Group: set_upper_references(glob, plan, rtoffset); break; + case T_WindowAgg: + { + WindowAgg *wplan = (WindowAgg *) plan; + + set_upper_references(glob, plan, rtoffset); + + /* + * Like Limit node limit/offset expressions, WindowAgg has + * frame offset expressions, which cannot contain subplan + * variable refs, so fix_scan_expr works for them. + */ + wplan->startOffset = + fix_scan_expr(glob, wplan->startOffset, rtoffset); + wplan->endOffset = + fix_scan_expr(glob, wplan->endOffset, rtoffset); + } + break; case T_Result: { Result *splan = (Result *) plan; diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c index 2e99cc6b4b..d75981467b 100644 --- a/src/backend/optimizer/plan/subselect.c +++ b/src/backend/optimizer/plan/subselect.c @@ -7,7 +7,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/optimizer/plan/subselect.c,v 1.158 2010/01/18 18:17:45 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/optimizer/plan/subselect.c,v 1.159 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -2098,9 +2098,15 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params, locally_added_param); break; + case T_WindowAgg: + finalize_primnode(((WindowAgg *) plan)->startOffset, + &context); + finalize_primnode(((WindowAgg *) plan)->endOffset, + &context); + break; + case T_Hash: case T_Agg: - case T_WindowAgg: case T_Material: case T_Sort: case T_Unique: diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 2541d02128..da70ee089c 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -11,7 +11,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/gram.y,v 2.707 2010/02/08 04:33:54 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/gram.y,v 2.708 2010/02/12 17:33:20 tgl Exp $ * * HISTORY * AUTHOR DATE 
MAJOR EVENT @@ -434,8 +434,8 @@ static TypeName *TableFuncTypeName(List *columns); %type window_clause window_definition_list opt_partition_clause %type window_definition over_clause window_specification + opt_frame_clause frame_extent frame_bound %type opt_existing_window_name -%type opt_frame_clause frame_extent frame_bound /* @@ -578,8 +578,18 @@ static TypeName *TableFuncTypeName(List *columns); * RANGE, ROWS to support opt_existing_window_name; and for RANGE, ROWS * so that they can follow a_expr without creating * postfix-operator problems. + * + * The frame_bound productions UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING + * are even messier: since UNBOUNDED is an unreserved keyword (per spec!), + * there is no principled way to distinguish these from the productions + * a_expr PRECEDING/FOLLOWING. We hack this up by giving UNBOUNDED slightly + * lower precedence than PRECEDING and FOLLOWING. At present this doesn't + * appear to cause UNBOUNDED to be treated differently from other unreserved + * keywords anywhere else in the grammar, but it's definitely risky. We can + * blame any funny behavior of UNBOUNDED on the SQL standard, though. 
*/ -%nonassoc IDENT PARTITION RANGE ROWS +%nonassoc UNBOUNDED /* ideally should have same precedence as IDENT */ +%nonassoc IDENT PARTITION RANGE ROWS PRECEDING FOLLOWING %left Op OPERATOR /* multi-character ops and user-defined operators */ %nonassoc NOTNULL %nonassoc ISNULL @@ -9907,6 +9917,8 @@ over_clause: OVER window_specification n->partitionClause = NIL; n->orderClause = NIL; n->frameOptions = FRAMEOPTION_DEFAULTS; + n->startOffset = NULL; + n->endOffset = NULL; n->location = @2; $$ = n; } @@ -9922,7 +9934,10 @@ window_specification: '(' opt_existing_window_name opt_partition_clause n->refname = $2; n->partitionClause = $3; n->orderClause = $4; - n->frameOptions = $5; + /* copy relevant fields of opt_frame_clause */ + n->frameOptions = $5->frameOptions; + n->startOffset = $5->startOffset; + n->endOffset = $5->endOffset; n->location = @1; $$ = n; } @@ -9947,58 +9962,100 @@ opt_partition_clause: PARTITION BY expr_list { $$ = $3; } ; /* + * For frame clauses, we return a WindowDef, but only some fields are used: + * frameOptions, startOffset, and endOffset. + * * This is only a subset of the full SQL:2008 frame_clause grammar. - * We don't support PRECEDING, FOLLOWING, - * nor yet. + * We don't support yet. 
*/ opt_frame_clause: RANGE frame_extent { - $$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE | $2; + WindowDef *n = $2; + n->frameOptions |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_RANGE; + if (n->frameOptions & (FRAMEOPTION_START_VALUE_PRECEDING | + FRAMEOPTION_END_VALUE_PRECEDING)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("RANGE PRECEDING is only supported with UNBOUNDED"), + parser_errposition(@1))); + if (n->frameOptions & (FRAMEOPTION_START_VALUE_FOLLOWING | + FRAMEOPTION_END_VALUE_FOLLOWING)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("RANGE FOLLOWING is only supported with UNBOUNDED"), + parser_errposition(@1))); + $$ = n; } | ROWS frame_extent { - $$ = FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS | $2; + WindowDef *n = $2; + n->frameOptions |= FRAMEOPTION_NONDEFAULT | FRAMEOPTION_ROWS; + $$ = n; } | /*EMPTY*/ - { $$ = FRAMEOPTION_DEFAULTS; } + { + WindowDef *n = makeNode(WindowDef); + n->frameOptions = FRAMEOPTION_DEFAULTS; + n->startOffset = NULL; + n->endOffset = NULL; + $$ = n; + } ; frame_extent: frame_bound { + WindowDef *n = $1; /* reject invalid cases */ - if ($1 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING) + if (n->frameOptions & FRAMEOPTION_START_UNBOUNDED_FOLLOWING) ereport(ERROR, (errcode(ERRCODE_WINDOWING_ERROR), errmsg("frame start cannot be UNBOUNDED FOLLOWING"), parser_errposition(@1))); - if ($1 & FRAMEOPTION_START_CURRENT_ROW) + if (n->frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING) ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("frame start at CURRENT ROW is not implemented"), + (errcode(ERRCODE_WINDOWING_ERROR), + errmsg("frame starting from following row cannot end with current row"), parser_errposition(@1))); - $$ = $1 | FRAMEOPTION_END_CURRENT_ROW; + n->frameOptions |= FRAMEOPTION_END_CURRENT_ROW; + $$ = n; } | BETWEEN frame_bound AND frame_bound { + WindowDef *n1 = $2; + WindowDef *n2 = $4; + /* form merged options */ + int frameOptions = n1->frameOptions; + /* shift 
converts START_ options to END_ options */ + frameOptions |= n2->frameOptions << 1; + frameOptions |= FRAMEOPTION_BETWEEN; /* reject invalid cases */ - if ($2 & FRAMEOPTION_START_UNBOUNDED_FOLLOWING) + if (frameOptions & FRAMEOPTION_START_UNBOUNDED_FOLLOWING) ereport(ERROR, (errcode(ERRCODE_WINDOWING_ERROR), errmsg("frame start cannot be UNBOUNDED FOLLOWING"), parser_errposition(@2))); - if ($2 & FRAMEOPTION_START_CURRENT_ROW) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("frame start at CURRENT ROW is not implemented"), - parser_errposition(@2))); - if ($4 & FRAMEOPTION_START_UNBOUNDED_PRECEDING) + if (frameOptions & FRAMEOPTION_END_UNBOUNDED_PRECEDING) ereport(ERROR, (errcode(ERRCODE_WINDOWING_ERROR), errmsg("frame end cannot be UNBOUNDED PRECEDING"), parser_errposition(@4))); - /* shift converts START_ options to END_ options */ - $$ = FRAMEOPTION_BETWEEN | $2 | ($4 << 1); + if ((frameOptions & FRAMEOPTION_START_CURRENT_ROW) && + (frameOptions & FRAMEOPTION_END_VALUE_PRECEDING)) + ereport(ERROR, + (errcode(ERRCODE_WINDOWING_ERROR), + errmsg("frame starting from current row cannot have preceding rows"), + parser_errposition(@4))); + if ((frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING) && + (frameOptions & (FRAMEOPTION_END_VALUE_PRECEDING | + FRAMEOPTION_END_CURRENT_ROW))) + ereport(ERROR, + (errcode(ERRCODE_WINDOWING_ERROR), + errmsg("frame starting from following row cannot have preceding rows"), + parser_errposition(@4))); + n1->frameOptions = frameOptions; + n1->endOffset = n2->startOffset; + $$ = n1; } ; @@ -10010,15 +10067,43 @@ frame_extent: frame_bound frame_bound: UNBOUNDED PRECEDING { - $$ = FRAMEOPTION_START_UNBOUNDED_PRECEDING; + WindowDef *n = makeNode(WindowDef); + n->frameOptions = FRAMEOPTION_START_UNBOUNDED_PRECEDING; + n->startOffset = NULL; + n->endOffset = NULL; + $$ = n; } | UNBOUNDED FOLLOWING { - $$ = FRAMEOPTION_START_UNBOUNDED_FOLLOWING; + WindowDef *n = makeNode(WindowDef); + n->frameOptions = 
FRAMEOPTION_START_UNBOUNDED_FOLLOWING; + n->startOffset = NULL; + n->endOffset = NULL; + $$ = n; } | CURRENT_P ROW { - $$ = FRAMEOPTION_START_CURRENT_ROW; + WindowDef *n = makeNode(WindowDef); + n->frameOptions = FRAMEOPTION_START_CURRENT_ROW; + n->startOffset = NULL; + n->endOffset = NULL; + $$ = n; + } + | a_expr PRECEDING + { + WindowDef *n = makeNode(WindowDef); + n->frameOptions = FRAMEOPTION_START_VALUE_PRECEDING; + n->startOffset = $1; + n->endOffset = NULL; + $$ = n; + } + | a_expr FOLLOWING + { + WindowDef *n = makeNode(WindowDef); + n->frameOptions = FRAMEOPTION_START_VALUE_FOLLOWING; + n->startOffset = $1; + n->endOffset = NULL; + $$ = n; } ; @@ -10981,7 +11066,8 @@ unreserved_keyword: * looks too much like a function call for an LR(1) parser. */ col_name_keyword: - BIGINT + BETWEEN + | BIGINT | BIT | BOOLEAN_P | CHAR_P @@ -11040,7 +11126,6 @@ col_name_keyword: */ type_func_name_keyword: AUTHORIZATION - | BETWEEN | BINARY | CONCURRENTLY | CROSS diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index 9b01047b91..e883e283e0 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.90 2010/01/02 16:57:49 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.91 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -258,7 +258,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc, continue; if (equal(refwin->partitionClause, windef->partitionClause) && equal(refwin->orderClause, windef->orderClause) && - refwin->frameOptions == windef->frameOptions) + refwin->frameOptions == windef->frameOptions && + equal(refwin->startOffset, windef->startOffset) && + equal(refwin->endOffset, windef->endOffset)) { /* found a duplicate window specification */ wfunc->winref = winref; @@ -505,6 +507,7 @@ parseCheckWindowFuncs(ParseState *pstate, 
Query *qry) parser_errposition(pstate, locate_windowfunc(expr)))); } + /* startOffset and endOffset were checked in transformFrameOffset */ } } diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 54c5cb39e8..54bb867631 100644 --- a/src/backend/parser/parse_clause.c +++ b/src/backend/parser/parse_clause.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/parser/parse_clause.c,v 1.196 2010/02/07 20:48:10 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/parser/parse_clause.c,v 1.197 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -72,6 +72,8 @@ static Node *transformFromClauseItem(ParseState *pstate, Node *n, Relids *containedRels); static Node *buildMergedJoinVar(ParseState *pstate, JoinType jointype, Var *l_colvar, Var *r_colvar); +static void checkExprIsVarFree(ParseState *pstate, Node *n, + const char *constructName); static TargetEntry *findTargetlistEntrySQL92(ParseState *pstate, Node *node, List **tlist, int clause); static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node, @@ -85,6 +87,8 @@ static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle, List *grouplist, List *targetlist, int location, bool resolveUnknown); static WindowClause *findWindowClause(List *wclist, const char *name); +static Node *transformFrameOffset(ParseState *pstate, int frameOptions, + Node *clause); /* @@ -1177,10 +1181,28 @@ transformLimitClause(ParseState *pstate, Node *clause, qual = coerce_to_specific_type(pstate, qual, INT8OID, constructName); - /* - * LIMIT can't refer to any vars or aggregates of the current query - */ - if (contain_vars_of_level(qual, 0)) + /* LIMIT can't refer to any vars or aggregates of the current query */ + checkExprIsVarFree(pstate, qual, constructName); + + return qual; +} + +/* + * checkExprIsVarFree + * Check that given expr has no Vars of the current query level + * (and no aggregates or window 
functions, either). + * + * This is used to check expressions that have to have a consistent value + * across all rows of the query, such as a LIMIT. Arguably it should reject + * volatile functions, too, but we don't do that --- whatever value the + * function gives on first execution is what you get. + * + * constructName does not affect the semantics, but is used in error messages + */ +static void +checkExprIsVarFree(ParseState *pstate, Node *n, const char *constructName) +{ + if (contain_vars_of_level(n, 0)) { ereport(ERROR, (errcode(ERRCODE_INVALID_COLUMN_REFERENCE), @@ -1188,10 +1210,10 @@ transformLimitClause(ParseState *pstate, Node *clause, errmsg("argument of %s must not contain variables", constructName), parser_errposition(pstate, - locate_var_of_level(qual, 0)))); + locate_var_of_level(n, 0)))); } if (pstate->p_hasAggs && - checkExprHasAggs(qual)) + checkExprHasAggs(n)) { ereport(ERROR, (errcode(ERRCODE_GROUPING_ERROR), @@ -1199,10 +1221,10 @@ transformLimitClause(ParseState *pstate, Node *clause, errmsg("argument of %s must not contain aggregate functions", constructName), parser_errposition(pstate, - locate_agg_of_level(qual, 0)))); + locate_agg_of_level(n, 0)))); } if (pstate->p_hasWindowFuncs && - checkExprHasWindowFuncs(qual)) + checkExprHasWindowFuncs(n)) { ereport(ERROR, (errcode(ERRCODE_WINDOWING_ERROR), @@ -1210,10 +1232,8 @@ transformLimitClause(ParseState *pstate, Node *clause, errmsg("argument of %s must not contain window functions", constructName), parser_errposition(pstate, - locate_windowfunc(qual)))); + locate_windowfunc(n)))); } - - return qual; } @@ -1664,6 +1684,11 @@ transformWindowDefinitions(ParseState *pstate, windef->refname), parser_errposition(pstate, windef->location))); wc->frameOptions = windef->frameOptions; + /* Process frame offset expressions */ + wc->startOffset = transformFrameOffset(pstate, wc->frameOptions, + windef->startOffset); + wc->endOffset = transformFrameOffset(pstate, wc->frameOptions, + 
windef->endOffset); wc->winref = winref; result = lappend(result, wc); @@ -2166,3 +2191,47 @@ findWindowClause(List *wclist, const char *name) return NULL; } + +/* + * transformFrameOffset + * Process a window frame offset expression + */ +static Node * +transformFrameOffset(ParseState *pstate, int frameOptions, Node *clause) +{ + const char *constructName = NULL; + Node *node; + + /* Quick exit if no offset expression */ + if (clause == NULL) + return NULL; + + /* Transform the raw expression tree */ + node = transformExpr(pstate, clause); + + if (frameOptions & FRAMEOPTION_ROWS) + { + /* + * Like LIMIT clause, simply coerce to int8 + */ + constructName = "ROWS"; + node = coerce_to_specific_type(pstate, node, INT8OID, constructName); + } + else if (frameOptions & FRAMEOPTION_RANGE) + { + /* + * this needs a lot of thought to decide how to support in the + * context of Postgres' extensible datatype framework + */ + constructName = "RANGE"; + /* error was already thrown by gram.y, this is just a backstop */ + elog(ERROR, "window frame with value offset is not implemented"); + } + else + Assert(false); + + /* Disallow variables and aggregates in frame offsets */ + checkExprIsVarFree(pstate, node, constructName); + + return node; +} diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 6e81a88b30..202d68f589 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -9,7 +9,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/utils/adt/ruleutils.c,v 1.320 2010/01/21 06:11:45 itagaki Exp $ + * $PostgreSQL: pgsql/src/backend/utils/adt/ruleutils.c,v 1.321 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -3160,6 +3160,16 @@ get_rule_windowspec(WindowClause *wc, List *targetList, appendStringInfoString(buf, "UNBOUNDED PRECEDING "); else if (wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW) appendStringInfoString(buf, "CURRENT ROW "); + 
else if (wc->frameOptions & FRAMEOPTION_START_VALUE) + { + get_rule_expr(wc->startOffset, context, false); + if (wc->frameOptions & FRAMEOPTION_START_VALUE_PRECEDING) + appendStringInfoString(buf, " PRECEDING "); + else if (wc->frameOptions & FRAMEOPTION_START_VALUE_FOLLOWING) + appendStringInfoString(buf, " FOLLOWING "); + else + Assert(false); + } else Assert(false); if (wc->frameOptions & FRAMEOPTION_BETWEEN) @@ -3169,6 +3179,16 @@ get_rule_windowspec(WindowClause *wc, List *targetList, appendStringInfoString(buf, "UNBOUNDED FOLLOWING "); else if (wc->frameOptions & FRAMEOPTION_END_CURRENT_ROW) appendStringInfoString(buf, "CURRENT ROW "); + else if (wc->frameOptions & FRAMEOPTION_END_VALUE) + { + get_rule_expr(wc->endOffset, context, false); + if (wc->frameOptions & FRAMEOPTION_END_VALUE_PRECEDING) + appendStringInfoString(buf, " PRECEDING "); + else if (wc->frameOptions & FRAMEOPTION_END_VALUE_FOLLOWING) + appendStringInfoString(buf, " FOLLOWING "); + else + Assert(false); + } else Assert(false); } diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h index 4a4ea6b492..9a1d19380c 100644 --- a/src/include/catalog/catversion.h +++ b/src/include/catalog/catversion.h @@ -37,7 +37,7 @@ * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.583 2010/02/07 20:48:11 tgl Exp $ + * $PostgreSQL: pgsql/src/include/catalog/catversion.h,v 1.584 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -53,6 +53,6 @@ */ /* yyyymmddN */ -#define CATALOG_VERSION_NO 201002071 +#define CATALOG_VERSION_NO 201002121 #endif diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 866cd3ef3f..4d9dfc4c82 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2010, 
PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/nodes/execnodes.h,v 1.217 2010/01/05 23:25:36 tgl Exp $ + * $PostgreSQL: pgsql/src/include/nodes/execnodes.h,v 1.218 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1595,23 +1595,36 @@ typedef struct WindowAggState FmgrInfo *ordEqfunctions; /* equality funcs for ordering columns */ Tuplestorestate *buffer; /* stores rows of current partition */ int current_ptr; /* read pointer # for current */ - int agg_ptr; /* read pointer # for aggregates */ int64 spooled_rows; /* total # of rows in buffer */ int64 currentpos; /* position of current row in partition */ + int64 frameheadpos; /* current frame head position */ int64 frametailpos; /* current frame tail position */ + /* use struct pointer to avoid including windowapi.h here */ + struct WindowObjectData *agg_winobj; /* winobj for aggregate fetches */ + int64 aggregatedbase; /* start row for current aggregates */ int64 aggregatedupto; /* rows before this one are aggregated */ - MemoryContext wincontext; /* context for partition-lifespan data */ + int frameOptions; /* frame_clause options, see WindowDef */ + ExprState *startOffset; /* expression for starting bound offset */ + ExprState *endOffset; /* expression for ending bound offset */ + Datum startOffsetValue; /* result of startOffset evaluation */ + Datum endOffsetValue; /* result of endOffset evaluation */ + + MemoryContext partcontext; /* context for partition-lifespan data */ + MemoryContext aggcontext; /* context for each aggregate data */ ExprContext *tmpcontext; /* short-term evaluation context */ + bool all_first; /* true if the scan is starting */ bool all_done; /* true if the scan is finished */ bool partition_spooled; /* true if all tuples in current * partition have been spooled into * tuplestore */ - bool more_partitions;/* true if there's more 
partitions after this - * one */ - bool frametail_valid;/* true if frametailpos is known up to date - * for current row */ + bool more_partitions; /* true if there's more partitions after + * this one */ + bool framehead_valid; /* true if frameheadpos is known up to date + * for current row */ + bool frametail_valid; /* true if frametailpos is known up to date + * for current row */ TupleTableSlot *first_part_slot; /* first tuple of current or next * partition */ diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index ffa6055a57..0c3aecfa6e 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -13,7 +13,7 @@ * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/nodes/parsenodes.h,v 1.428 2010/02/08 04:33:54 tgl Exp $ + * $PostgreSQL: pgsql/src/include/nodes/parsenodes.h,v 1.429 2010/02/12 17:33:20 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -393,6 +393,8 @@ typedef struct WindowDef List *partitionClause; /* PARTITION BY expression list */ List *orderClause; /* ORDER BY (list of SortBy) */ int frameOptions; /* frame_clause options, see below */ + Node *startOffset; /* expression for starting bound, if any */ + Node *endOffset; /* expression for ending bound, if any */ int location; /* parse location, or -1 if none/unknown */ } WindowDef; @@ -414,6 +416,15 @@ typedef struct WindowDef #define FRAMEOPTION_END_UNBOUNDED_FOLLOWING 0x00080 /* end is U. F. */ #define FRAMEOPTION_START_CURRENT_ROW 0x00100 /* start is C. R. */ #define FRAMEOPTION_END_CURRENT_ROW 0x00200 /* end is C. R. */ +#define FRAMEOPTION_START_VALUE_PRECEDING 0x00400 /* start is V. P. */ +#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */ +#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. 
*/ +#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */ + +#define FRAMEOPTION_START_VALUE \ + (FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING) +#define FRAMEOPTION_END_VALUE \ + (FRAMEOPTION_END_VALUE_PRECEDING | FRAMEOPTION_END_VALUE_FOLLOWING) #define FRAMEOPTION_DEFAULTS \ (FRAMEOPTION_RANGE | FRAMEOPTION_START_UNBOUNDED_PRECEDING | \ @@ -799,6 +810,8 @@ typedef struct WindowClause List *partitionClause; /* PARTITION BY list */ List *orderClause; /* ORDER BY list */ int frameOptions; /* frame_clause options, see WindowDef */ + Node *startOffset; /* expression for starting bound, if any */ + Node *endOffset; /* expression for ending bound, if any */ Index winref; /* ID referenced by window functions */ bool copiedOrder; /* did we copy orderClause from refname? */ } WindowClause; diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h index 48f7578858..b6640cfab3 100644 --- a/src/include/nodes/plannodes.h +++ b/src/include/nodes/plannodes.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/nodes/plannodes.h,v 1.115 2010/01/02 16:58:04 momjian Exp $ + * $PostgreSQL: pgsql/src/include/nodes/plannodes.h,v 1.116 2010/02/12 17:33:21 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -552,6 +552,8 @@ typedef struct WindowAgg AttrNumber *ordColIdx; /* their indexes in the target list */ Oid *ordOperators; /* equality operators for ordering columns */ int frameOptions; /* frame_clause options, see WindowDef */ + Node *startOffset; /* expression for starting bound, if any */ + Node *endOffset; /* expression for ending bound, if any */ } WindowAgg; /* ---------------- diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h index 024142250a..ed00f86d27 100644 --- a/src/include/optimizer/planmain.h +++ 
b/src/include/optimizer/planmain.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.124 2010/01/15 22:36:35 tgl Exp $ + * $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.125 2010/02/12 17:33:21 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -62,7 +62,8 @@ extern WindowAgg *make_windowagg(PlannerInfo *root, List *tlist, int numWindowFuncs, Index winref, int partNumCols, AttrNumber *partColIdx, Oid *partOperators, int ordNumCols, AttrNumber *ordColIdx, Oid *ordOperators, - int frameOptions, Plan *lefttree); + int frameOptions, Node *startOffset, Node *endOffset, + Plan *lefttree); extern Group *make_group(PlannerInfo *root, List *tlist, List *qual, int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators, double numGroups, diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h index e9c25e8d83..5065bd609e 100644 --- a/src/include/parser/kwlist.h +++ b/src/include/parser/kwlist.h @@ -11,7 +11,7 @@ * Portions Copyright (c) 1994, Regents of the University of California * * IDENTIFICATION - * $PostgreSQL: pgsql/src/include/parser/kwlist.h,v 1.11 2010/02/08 04:33:55 tgl Exp $ + * $PostgreSQL: pgsql/src/include/parser/kwlist.h,v 1.12 2010/02/12 17:33:21 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -53,7 +53,7 @@ PG_KEYWORD("authorization", AUTHORIZATION, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("backward", BACKWARD, UNRESERVED_KEYWORD) PG_KEYWORD("before", BEFORE, UNRESERVED_KEYWORD) PG_KEYWORD("begin", BEGIN_P, UNRESERVED_KEYWORD) -PG_KEYWORD("between", BETWEEN, TYPE_FUNC_NAME_KEYWORD) +PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD) PG_KEYWORD("bigint", BIGINT, COL_NAME_KEYWORD) PG_KEYWORD("binary", BINARY, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("bit", BIT, COL_NAME_KEYWORD) diff --git 
a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out index c14011ce0e..0481cc6dd8 100644 --- a/src/test/regress/expected/window.out +++ b/src/test/regress/expected/window.out @@ -728,6 +728,193 @@ FROM (select distinct ten, four from tenk1) ss; 3 | 2 | 4 | 2 (20 rows) +SELECT sum(unique1) over (order by four range between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four +-----+---------+------ + 45 | 0 | 0 + 45 | 8 | 0 + 45 | 4 | 0 + 33 | 5 | 1 + 33 | 9 | 1 + 33 | 1 | 1 + 18 | 6 | 2 + 18 | 2 | 2 + 10 | 3 | 3 + 10 | 7 | 3 +(10 rows) + +SELECT sum(unique1) over (rows between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four +-----+---------+------ + 45 | 4 | 0 + 41 | 2 | 2 + 39 | 1 | 1 + 38 | 6 | 2 + 32 | 9 | 1 + 23 | 8 | 0 + 15 | 5 | 1 + 10 | 3 | 3 + 7 | 7 | 3 + 0 | 0 | 0 +(10 rows) + +SELECT sum(unique1) over (rows between 2 preceding and 2 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four +-----+---------+------ + 7 | 4 | 0 + 13 | 2 | 2 + 22 | 1 | 1 + 26 | 6 | 2 + 29 | 9 | 1 + 31 | 8 | 0 + 32 | 5 | 1 + 23 | 3 | 3 + 15 | 7 | 3 + 10 | 0 | 0 +(10 rows) + +SELECT sum(unique1) over (rows between 2 preceding and 1 preceding), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four +-----+---------+------ + | 4 | 0 + 4 | 2 | 2 + 6 | 1 | 1 + 3 | 6 | 2 + 7 | 9 | 1 + 15 | 8 | 0 + 17 | 5 | 1 + 13 | 3 | 3 + 8 | 7 | 3 + 10 | 0 | 0 +(10 rows) + +SELECT sum(unique1) over (rows between 1 following and 3 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four +-----+---------+------ + 9 | 4 | 0 + 16 | 2 | 2 + 23 | 1 | 1 + 22 | 6 | 2 + 16 | 9 | 1 + 15 | 8 | 0 + 10 | 5 | 1 + 7 | 3 | 3 + 0 | 7 | 3 + | 0 | 0 +(10 rows) + +SELECT sum(unique1) over (rows between unbounded preceding and 1 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + sum | unique1 | four 
+-----+---------+------ + 6 | 4 | 0 + 7 | 2 | 2 + 13 | 1 | 1 + 22 | 6 | 2 + 30 | 9 | 1 + 35 | 8 | 0 + 38 | 5 | 1 + 45 | 3 | 3 + 45 | 7 | 3 + 45 | 0 | 0 +(10 rows) + +SELECT sum(unique1) over (w range between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four); + sum | unique1 | four +-----+---------+------ + 45 | 0 | 0 + 45 | 8 | 0 + 45 | 4 | 0 + 33 | 5 | 1 + 33 | 9 | 1 + 33 | 1 | 1 + 18 | 6 | 2 + 18 | 2 | 2 + 10 | 3 | 3 + 10 | 7 | 3 +(10 rows) + +-- fail: not implemented yet +SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding), + unique1, four +FROM tenk1 WHERE unique1 < 10; +ERROR: RANGE PRECEDING is only supported with UNBOUNDED +LINE 1: SELECT sum(unique1) over (order by four range between 2::int... + ^ +SELECT first_value(unique1) over w, + nth_value(unique1, 2) over w AS nth_2, + last_value(unique1) over w, unique1, four +FROM tenk1 WHERE unique1 < 10 +WINDOW w AS (order by four range between current row and unbounded following); + first_value | nth_2 | last_value | unique1 | four +-------------+-------+------------+---------+------ + 0 | 8 | 7 | 0 | 0 + 0 | 8 | 7 | 8 | 0 + 0 | 8 | 7 | 4 | 0 + 5 | 9 | 7 | 5 | 1 + 5 | 9 | 7 | 9 | 1 + 5 | 9 | 7 | 1 | 1 + 6 | 2 | 7 | 6 | 2 + 6 | 2 | 7 | 2 | 2 + 3 | 7 | 7 | 3 | 3 + 3 | 7 | 7 | 7 | 3 +(10 rows) + +SELECT sum(unique1) over + (rows (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING), + unique1 +FROM tenk1 WHERE unique1 < 10; + sum | unique1 +-----+--------- + 4 | 4 + 6 | 2 + 3 | 1 + 7 | 6 + 15 | 9 + 17 | 8 + 13 | 5 + 8 | 3 + 10 | 7 + 7 | 0 +(10 rows) + +CREATE TEMP VIEW v_window AS + SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows + FROM generate_series(1, 10) i; +SELECT * FROM v_window; + i | sum_rows +----+---------- + 1 | 3 + 2 | 6 + 3 | 9 + 4 | 12 + 5 | 15 + 6 | 18 + 7 | 21 + 8 | 24 + 9 | 27 + 10 | 19 +(10 rows) + +SELECT pg_get_viewdef('v_window'); + 
pg_get_viewdef +--------------------------------------------------------------------------------------------------------------------------------- + SELECT i.i, sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows FROM generate_series(1, 10) i(i); +(1 row) + -- with UNION SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0; count diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 43794122f5..7cdf872df5 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -1,5 +1,5 @@ # ---------- -# $PostgreSQL: pgsql/src/test/regress/parallel_schedule,v 1.60 2010/02/07 22:40:33 tgl Exp $ +# $PostgreSQL: pgsql/src/test/regress/parallel_schedule,v 1.61 2010/02/12 17:33:21 tgl Exp $ # # By convention, we put no more than twenty tests in any one parallel group; # this limits the number of connections needed to run the tests. @@ -78,18 +78,19 @@ test: select_into select_distinct select_distinct_on select_implicit select_havi test: privileges test: misc +# rules cannot run concurrently with any test that creates a view +test: rules # ---------- # Another group of parallel tests # ---------- -test: select_views portals_p2 rules foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap +test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap # ---------- # Another group of parallel tests # NB: temp.sql does a reconnect which transiently uses 2 connections, # so keep this parallel group to at most 19 tests # ---------- -# "plpgsql" cannot run concurrently with "rules", nor can "plancache" test: plancache limit plpgsql copy2 temp domain rangefuncs prepare without_oid conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml # run stats by itself because its delay may be insufficient under 
heavy load diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule index 037abf2341..c404d54206 100644 --- a/src/test/regress/serial_schedule +++ b/src/test/regress/serial_schedule @@ -1,4 +1,4 @@ -# $PostgreSQL: pgsql/src/test/regress/serial_schedule,v 1.55 2010/01/28 23:21:13 petere Exp $ +# $PostgreSQL: pgsql/src/test/regress/serial_schedule,v 1.56 2010/02/12 17:33:21 tgl Exp $ # This should probably be in an order similar to parallel_schedule. test: tablespace test: boolean @@ -89,9 +89,9 @@ test: namespace test: prepared_xacts test: privileges test: misc +test: rules test: select_views test: portals_p2 -test: rules test: foreign_key test: cluster test: dependency diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql index fc62a6fd6e..1cfc64bd8b 100644 --- a/src/test/regress/sql/window.sql +++ b/src/test/regress/sql/window.sql @@ -161,6 +161,58 @@ SELECT four, ten/4 as two, last_value(ten/4) over (partition by four order by ten/4 rows between unbounded preceding and current row) FROM (select distinct ten, four from tenk1) ss; +SELECT sum(unique1) over (order by four range between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (rows between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (rows between 2 preceding and 2 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (rows between 2 preceding and 1 preceding), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (rows between 1 following and 3 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (rows between unbounded preceding and 1 following), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT sum(unique1) over (w range between current row and unbounded following), + unique1, four +FROM tenk1 WHERE unique1 < 10 WINDOW w AS 
(order by four); + +-- fail: not implemented yet +SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding), + unique1, four +FROM tenk1 WHERE unique1 < 10; + +SELECT first_value(unique1) over w, + nth_value(unique1, 2) over w AS nth_2, + last_value(unique1) over w, unique1, four +FROM tenk1 WHERE unique1 < 10 +WINDOW w AS (order by four range between current row and unbounded following); + +SELECT sum(unique1) over + (rows (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING), + unique1 +FROM tenk1 WHERE unique1 < 10; + +CREATE TEMP VIEW v_window AS + SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows + FROM generate_series(1, 10) i; + +SELECT * FROM v_window; + +SELECT pg_get_viewdef('v_window'); + -- with UNION SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;
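The ROWS-mode regression cases above can be cross-checked by hand with a small standalone model of the frame arithmetic. The sketch below is not the executor code from this patch: `BoundKind`, `FrameBound`, `bound_position` and `rows_frame_sum` are hypothetical names introduced only for illustration, and the model assumes a single partition whose rows are already in output order. It computes the frame head and tail for the current row, clamps them to the partition, and reports an empty frame as NULL, which is what `sum()` returns in the expected output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bound kinds for this sketch only; these are NOT the
 * FRAMEOPTION_* bits defined in parsenodes.h.
 */
typedef enum
{
    BOUND_UNBOUNDED_PRECEDING,
    BOUND_VALUE_PRECEDING,
    BOUND_CURRENT_ROW,
    BOUND_VALUE_FOLLOWING,
    BOUND_UNBOUNDED_FOLLOWING
} BoundKind;

typedef struct
{
    BoundKind   kind;
    int64_t     offset;     /* used only by the two VALUE kinds */
} FrameBound;

/*
 * Row position that a bound refers to for the given current row.
 * The result may lie outside [0, nrows); the caller clamps it.
 */
static int64_t
bound_position(FrameBound b, int64_t current, int64_t nrows)
{
    switch (b.kind)
    {
        case BOUND_UNBOUNDED_PRECEDING:
            return 0;
        case BOUND_VALUE_PRECEDING:
            return current - b.offset;
        case BOUND_CURRENT_ROW:
            return current;
        case BOUND_VALUE_FOLLOWING:
            return current + b.offset;
        case BOUND_UNBOUNDED_FOLLOWING:
            return nrows - 1;
    }
    return current;             /* not reached; keeps compilers quiet */
}

/*
 * Sum of vals[] over the ROWS frame of row "current" in a partition of
 * nrows rows.  *is_null is set to 1 when the frame contains no rows
 * (SQL sum() yields NULL there) and to 0 otherwise.
 */
int64_t
rows_frame_sum(const int64_t *vals, int64_t nrows, int64_t current,
               FrameBound start, FrameBound end, int *is_null)
{
    int64_t     head;
    int64_t     tail;
    int64_t     sum = 0;
    int64_t     i;

    assert(current >= 0 && current < nrows);
    assert(start.offset >= 0 && end.offset >= 0);

    head = bound_position(start, current, nrows);
    tail = bound_position(end, current, nrows);

    /* Clamp the frame to the partition boundaries */
    if (head < 0)
        head = 0;
    if (tail > nrows - 1)
        tail = nrows - 1;

    *is_null = 1;
    for (i = head; i <= tail; i++)
    {
        sum += vals[i];
        *is_null = 0;
    }
    return sum;
}
```

Fed the unique1 values in the order the ROWS-mode tests print them (4, 2, 1, 6, 9, 8, 5, 3, 7, 0), this model gives 7 for the first row of `rows between 2 preceding and 2 following`, NULL for the first row of `rows between 2 preceding and 1 preceding`, and 0 followed by NULL for the last two rows of `rows between 1 following and 3 following`, matching the expected output. It models ROWS mode only; RANGE with a value offset is rejected by the patch, and peer-row handling for RANGE is deliberately left out.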