Thomas Munro [Sat, 13 Jul 2019 01:55:10 +0000 (13:55 +1200)]
Forward received condition variable signals on cancel.
After a process decides not to wait for a condition variable, it can
still consume a signal before it reaches ConditionVariableCancelSleep().
In that case, pass the signal on to another waiter if possible, so that
a signal doesn't go missing when there is another process ready to
receive it.
Author: Thomas Munro Reviewed-by: Shawn Debnath
Discussion: https://postgr.es/m/CA%2BhUKGLQ_RW%2BXs8znDn36e-%2Bmq2--zrPemBqTQ8eKT-VO1OF4Q%40mail.gmail.com
Thomas Munro [Fri, 12 Jul 2019 22:35:34 +0000 (10:35 +1200)]
Warn if wal_level is too low when creating a publication.
Provide a hint to users that they need to increase wal_level before
subscriptions can work.
Author: Lucas Viecelli, with some adjustments by Thomas Munro Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAPjy-57rn5Y9g4e5u--eSOP-7P4QrE9uOZmT2ZcUebF8qxsYhg%40mail.gmail.com
Tom Lane [Fri, 12 Jul 2019 20:24:59 +0000 (16:24 -0400)]
Fix get_actual_variable_range() to cope with broken HOT chains.
Commit 3ca930fc3 modified get_actual_variable_range() to use a new
"SnapshotNonVacuumable" snapshot type for selecting tuples that it
would consider valid. However, because that snapshot type can accept
recently-dead tuples, this caused a bug when using a recently-created
index: we might accept a recently-dead tuple that is an early member
of a broken HOT chain and does not actually match the index entry.
Then, the data extracted from the heap tuple would not necessarily be
an endpoint value of the column; it could even be NULL, leading to
get_actual_variable_range() itself reporting "found unexpected null
value in index". Even without an error, this could lead to poor
plan choices due to an erroneous notion of the endpoint value.
We can improve matters by changing the code to use the index-only
scan technique (which didn't exist when get_actual_variable_range was
originally written). If any of the tuples in a HOT chain are live
enough to satisfy SnapshotNonVacuumable, we take the data from the
index entry, ignoring what is in the heap. This fixes the problem
without changing the live-vs-dead-tuple behavior from what was
intended by commit 3ca930fc3.
A side benefit is that for static tables we might not have to touch
the heap at all (when the extremal value is in an all-visible page).
In addition, we can save some overhead by not having to create a
complete ExecutorState, and we don't need to run FormIndexDatum,
avoiding more cycles as well as the possibility of failure for
indexes on expressions. (I'm not sure that this code would ever
be used to determine the extreme value of an expression, in the
current state of the planner; but it's definitely possible that
lower-order columns of the selected index could be expressions.
So one could construct perhaps-artificial examples in which the
old code unexpectedly failed due to trying to compute an
expression's value for a now-dead row.)
Per report from Manuel Rigger. Back-patch to v11 where commit 3ca930fc3 came in.
David Rowley [Fri, 12 Jul 2019 07:12:38 +0000 (19:12 +1200)]
Fix RANGE partition pruning with multiple boolean partition keys
match_clause_to_partition_key incorrectly would return
PARTCLAUSE_UNSUPPORTED if a bool qual could not be matched to the current
partition key. This was a problem, as it causes the calling function to
discard the qual and not try to match it to any other partition key. If
there was another partition key which did match this qual, then the qual
would not be checked again and we could fail to prune some partitions.
The worst this could do was to cause partitions not to be pruned when they
could have been, so there was no danger of incorrect query results here.
Fix this by changing match_boolean_partition_clause to have it return a
PartClauseMatchStatus rather than a boolean value. This allows it to
communicate if the qual is unsupported or if it just does not match this
particular partition key, previously these two cases were treated the
same. Now, if match_clause_to_partition_key is unable to match the qual
to any other qual type then we can simply return the value from the
match_boolean_partition_clause call so that the calling function properly
treats the qual as either unmatched or unsupported.
Reported-by: Rares Salcudean Reviewed-by: Amit Langote
Backpatch-through: 11 where partition pruning was introduced
Discussion: https://postgr.es/m/CAHp_FN2xwEznH6oyS0hNTuUUZKp5PvegcVv=Co6nBXJ+mC7Y5w@mail.gmail.com
Tom Lane [Wed, 10 Jul 2019 18:32:28 +0000 (14:32 -0400)]
Reduce memory consumption for multi-statement query strings.
Previously, exec_simple_query always ran parse analysis, rewrite, and
planning in MessageContext, allowing all the data generated thereby
to persist until the end of processing of the whole query string.
That's fine for single-command strings, but if a client sends many
commands in a single simple-Query message, this strategy could result
in annoying memory bloat, as complained of by Andreas Seltenreich.
To fix, create a child context to do this work in, and reclaim it
after each command. But we only do so for parsetrees that are not
last in their query string. That avoids adding any memory management
overhead for the typical case of a single-command string. Memory
allocated for the last parsetree would be freed immediately after
finishing the command string anyway.
Similarly, adjust extension.c's execute_sql_string() to reclaim memory
after each command. In that usage, multi-command strings are the norm,
so it's a bit surprising that no one has yet complained of bloat ---
especially since the bloat extended to whatever data ProcessUtility
execution might leak.
Michael Paquier [Wed, 10 Jul 2019 06:14:54 +0000 (15:14 +0900)]
Fix variable initialization when using buffering build with GiST
This can cause valgrind to complain, as the flag marking a buffer as a
temporary copy was not getting initialized.
While on it, fill in with zeros newly-created buffer pages. This does
not matter when loading a block from a temporary file, but it makes the
push of an index tuple into a new buffer page safer.
This has been introduced by 1d27dcf, so backpatch all the way down to
9.4.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org
Backpatch-through: 9.4
David Rowley [Wed, 10 Jul 2019 04:03:04 +0000 (16:03 +1200)]
Fix missing calls to table_finish_bulk_insert during COPY, take 2
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With
pluggable storage, we can't make any assumptions about what the table AM
might want to do in its equivalent function, so we'd better ensure we
always call table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
This also changes things so that we no longer needlessly call
table_finish_bulk_insert when performing a COPY FROM for a non-partitioned
table when not using multi-inserts.
Reported-by: Robert Haas
Backpatch-through: 12
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
Amit Kapila [Wed, 10 Jul 2019 02:22:51 +0000 (07:52 +0530)]
Fix few typos and minor wordsmithing in tableam comments.
Reported-by: Ashwin Agrawal
Author: Ashwin Agrawal Reviewed-by: Amit Kapila
Backpatch-through: 12, where it was introduced
Discussion: https://postgr.es/m/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA@mail.gmail.com
Thomas Munro [Tue, 9 Jul 2019 22:15:32 +0000 (10:15 +1200)]
Pass QueryEnvironment down to EvalPlanQual's EState.
Otherwise the executor can't see trigger transition tables during
EPQ evaluation. Fixes bug #15900 and almost certainly also #15720.
Back-patch to 10, where trigger transition tables landed.
Author: Alex Aktsipetrov Reviewed-by: Thomas Munro, Tom Lane
Discussion: https://postgr.es/m/15900-bc482754fe8d7415%40postgresql.org
Discussion: https://postgr.es/m/15720-38c2b29e5d720187%40postgresql.org
We were creating the cloned triggers with an empty list of arguments,
losing the ones that had been specified by the user when creating the
trigger in the partitioned table. Repair.
Author: Patrick McHardy Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20190709130027.amr2cavjvo7rdvac@access1.trash.net
Discussion: https://postgr.es/m/15752-123bc90287986de4@postgresql.org
Robert Haas [Mon, 8 Jul 2019 18:51:53 +0000 (14:51 -0400)]
tableam: Provide helper functions for relation sizing.
Most block-based table AMs will need the exact same implementation of
the relation_size callback as the heap, and if they use a standard
page layout, they will likely need an implementation of the
relation_estimate_size callback that is very similar to that of the
heap. Rearrange to facilitate code reuse.
Patch by me, reviewed by Michael Paquier, Daniel Gustafsson, and
Álvaro Herrera.
Michael Paquier [Mon, 8 Jul 2019 04:15:09 +0000 (13:15 +0900)]
Fix inconsistencies in the code
This addresses a couple of issues in the code:
- Typos and inconsistencies in comments and function declarations.
- Removal of unreferenced function declarations.
- Removal of unnecessary compile flags.
- A cleanup error in regressplans.sh.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com
Tom Lane [Sat, 6 Jul 2019 15:25:37 +0000 (11:25 -0400)]
In pg_log_generic(), be more paranoid about preserving errno.
This code failed to account for the possibility that malloc() would
change errno, resulting in wrong output for %m, not to mention the
possibility of message truncation. Such a change is obviously
expected when malloc fails, but there's reason to fear that on some
platforms even a successful malloc call can modify errno.
In normal interactive mode, psql's log messages accidentally got a
"psql:" prefix that was not supposed to be there. This only happened
if there was no .psqlrc file being read, so it wasn't discovered for a
while. Fix this by adding the appropriate logging format
configuration call in the right code path.
Amit Kapila [Sat, 6 Jul 2019 06:11:23 +0000 (11:41 +0530)]
Add missing assertions for required table am callbacks.
Reported-by: Ashwin Agrawal
Author: Ashwin Agrawal Reviewed-by: Amit Kapila
Backpatch-through: 12, where it was introduced
Discussion: https://postgr.es/m/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA@mail.gmail.com
Tom Lane [Sat, 6 Jul 2019 03:56:34 +0000 (23:56 -0400)]
Add some test cases to improve test coverage of parse_expr.c.
I chanced to notice while thumbing through lcov reports that we had
exactly no coverage of BETWEEN SYMMETRIC, nor of current_time(N) and
localtime(N). Improve that.
parse_expr.c still has a pretty awful coverage number, but a large part
of that is due to lack of coverage of the operator_precedence_warning
logic. I have zero desire to write tests for that; I think ripping it
out would be more sensible at this point.
Tom Lane [Fri, 5 Jul 2019 18:17:27 +0000 (14:17 -0400)]
Remove dead encoding-conversion functions.
The code for conversions SQL_ASCII <-> MULE_INTERNAL and
SQL_ASCII <-> UTF8 was unreachable, because we long ago changed
the wrapper functions pg_do_encoding_conversion() et al so that
they have hard-wired behaviors for conversions involving SQL_ASCII.
(At least some of those fast paths date back to 2002, though it
looks like we may not have been totally consistent about this until
later.) Given the lack of complaints, nobody is dissatisfied with
this state of affairs. Hence, let's just remove the unreachable code.
Also, change CREATE CONVERSION so that it rejects attempts to
define such conversions. Since we consider that SQL_ASCII represents
lack of knowledge about the encoding in use, such a conversion would
be semantically dubious even if it were reachable.
Adjust a couple of regression test cases that had randomly decided
to rely on these conversion functions rather than any other ones.
Tomas Vondra [Fri, 5 Jul 2019 16:06:02 +0000 (18:06 +0200)]
Remove unused variable in statext_mcv_serialize()
The itemlen variable used to be referenced in multiple places, but since
reworking the serialization code it's used only in one assert. Fixed by
removing the variable and calling the macro from the assert directly.
Backpatch to 12, where this code was introduced.
Reported-by: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1zc_ovH9NZd_9ovuiEWkF9yX06URUDdXCmgDydf-bqB5A@mail.gmail.com
Tom Lane [Fri, 5 Jul 2019 16:32:36 +0000 (12:32 -0400)]
Add \warn command to psql.
This is like \echo except that the text is sent to stderr not stdout.
In passing, fix a pre-existing bug in \echo and \qecho: per documentation
the -n switch should only be recognized when it is the first argument,
but actually any argument matching "-n" was treated as a switch.
(Should we back-patch that?)
David Fetter (bug fix by me), reviewed by Fabien Coelho
Michael Paquier [Fri, 5 Jul 2019 01:47:32 +0000 (10:47 +0900)]
Update hardcoded DH parameters to IANA standards
The source defining the current fallback and hardcoded DH parameters
has disappeared from the web a long time ago, and RFC 3526 defines the
most current Diffie-Hellman MODP groups, so update to those new values.
Author: Daniel Gustafsson Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/5E60AC9A-CB10-4851-9EF2-7209490A164C@yesql.se
Tomas Vondra [Thu, 4 Jul 2019 22:45:20 +0000 (00:45 +0200)]
Simplify pg_mcv_list (de)serialization
The serialization format of multivariate MCV lists included alignment in
order to allow direct access to part of the serialized data, but despite
multiple fixes (see for example commits d85e0f366a and ea4e1c0e8f) this
proved to be problematic.
This commit abandons alignment in the serialized format, and just copies
everything during deserialization. We now also track amount of memory
needed after deserialization (including alignment), which allows us to
deserialize the MCV list in a single pass.
Bump catversion, as this affects contents of pg_statistic_ext_data.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/2201.1561521148@sss.pgh.pa.us
Tomas Vondra [Thu, 4 Jul 2019 21:43:04 +0000 (23:43 +0200)]
Fix pg_mcv_list_items() to produce text[]
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.
Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.
Requires changes to pg_proc entry, so bump catversion.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
Tomas Vondra [Thu, 4 Jul 2019 21:02:02 +0000 (23:02 +0200)]
Speed-up build of MCV lists with many distinct values
When building multi-column MCV lists, we compute base frequency for each
item, i.e. a product of per-column frequencies for values from the item.
As a value may be in multiple groups, the code was scanning the whole
array of groups while adding items to the MCV list. This works fine as
long as the number of distinct groups is small, but it's easy to trigger
trigger O(N^2) behavior, especially after increasing statistics target.
This commit precomputes frequencies for values in all columns, so that
when computing the base frequency it's enough to make a simple bsearch
lookup in the array.
Backpatch to 12, where multi-column MCV lists were introduced.
Unwind some workarounds for lack of portable int64 format specifier
Because there is no portable int64/uint64 format specifier and we
can't stick macros like INT64_FORMAT into the middle of a translatable
string, we have been using various workarounds that put the number to
be printed into a string buffer first. Now that we always use our own
sprintf(), we can rely on %lld and %llu to work, so we can use those.
This patch undoes this workaround in a few places where it was
egregiously verbose.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAH2-Wz%3DWbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A%40mail.gmail.com
Michael Paquier [Thu, 4 Jul 2019 07:08:09 +0000 (16:08 +0900)]
Introduce safer encoding and decoding routines for base64.c
This is a follow-up refactoring after 09ec55b and b674211, which has
proved that the encoding and decoding routines used by SCRAM have a
poor interface when it comes to check after buffer overflows. This adds
an extra argument in the shape of the length of the result buffer for
each routine, which is used for overflow checks when encoding or
decoding an input string. The original idea comes from Tom Lane.
As a result of that, the encoding routine can now fail, so all its
callers are adjusted to generate proper error messages in case of
problems.
On failure, the result buffer gets zeroed.
Author: Michael Paquier Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20190623132535.GB1628@paquier.xyz
Michael Paquier [Thu, 4 Jul 2019 02:33:42 +0000 (11:33 +0900)]
Simplify TAP tests of pg_dump for connection strings
The last set of scenarios did an initialization of nodes followed by an
extra command to set up the authentication policy with pg_regress
--config-auth. This configuration step can be integrated directly using
the option auth_extra from PostgresNode::init when initializing the
node, saving from one extra command. On Windows, this also restricts
more pg_ident.conf for the SSPI user mapping by removing the entry of
the OS user running the test, which is not needed anyway.
Note that IPC::Run mishandles double quotes, hence the restore user name
is changed to map with that. This was already done in the test as a
later step, but not in a consistent way, causing the switch to use
auth_extra to fail.
David Rowley [Thu, 4 Jul 2019 01:01:13 +0000 (13:01 +1200)]
Use appendStringInfoString and appendPQExpBufferStr where possible
This changes various places where appendPQExpBuffer was used in places
where it was possible to use appendPQExpBufferStr, and likewise for
appendStringInfo and appendStringInfoString. This is really just a
stylistic improvement, but there are also small performance gains to be
had from doing this.
Tom Lane [Wed, 3 Jul 2019 22:08:53 +0000 (18:08 -0400)]
Ensure plpgsql result tuples have the right composite type marking.
A function that is declared to return a named composite type must
return tuple datums that are physically marked as having that type.
The plpgsql code path that allowed directly returning an expanded-record
datum forgot to check that, so that an expanded record marked as type
RECORDOID could be returned if it had a physically-compatible tupdesc.
This'd be harmless, I think, if the record value never escaped the
current session --- but it's possible for it to get stored into a table,
and then subsequent sessions can't interpret the anonymous record type.
Fix by flattening the record into a tuple datum and overwriting its
type/typmod fields, if its declared type doesn't match the function's
declared type. (In principle it might be possible to just change the
expanded record's stored type ID info, but there are enough tricky
consequences that I didn't want to mess with that, especially not in
a back-patched bug fix.)
Per bug report from Steve Rogerson. Back-patch to v11 where the bug
was introduced.
David Rowley [Wed, 3 Jul 2019 11:44:54 +0000 (23:44 +1200)]
Don't remove surplus columns from GROUP BY for inheritance parents
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by. This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations. These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation. In this case, those additional GROUP BY columns are required.
Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.
Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
postgres_fdw: Remove redundancy in postgresAcquireSampleRowsFunc().
Previously, in the loop in postgresAcquireSampleRowsFunc() to iterate
fetching rows from a given remote table, we redundantly 1) determined the
fetch size by parsing the table's server/table-level options and then 2)
constructed the fetch command; remove that redundancy.
Tom Lane [Tue, 2 Jul 2019 18:04:42 +0000 (14:04 -0400)]
Don't treat complete_from_const as equivalent to complete_from_list.
Commit 4f3b38fe2 supposed that complete_from_const() is equivalent to
the one-element-list case of complete_from_list(), but that's not
really true at all. complete_from_const() supposes that the completion
is certain enough to justify wiping out whatever the user typed, while
complete_from_list() will only provide completions that match the
word-so-far.
In practice, given the lame parsing technology used by tab-complete.c,
it's fairly hard to believe that we're *ever* certain enough about
a completion to justify auto-correcting user input that doesn't match.
Hence, remove the inappropriate unification of the two cases.
As things now stand, complete_from_const() is used only for the
situation where we have no matches and we need to keep readline
from applying its default complete-with-file-names behavior.
This (mis?) behavior actually exists much further back, but
I'm hesitant to change it in released branches. It's not too
late for v12, though, especially seeing that the aforesaid
commit is new in v12.
Tom Lane [Tue, 2 Jul 2019 17:35:14 +0000 (13:35 -0400)]
Fix tab completion of "SET variable TO|=" to not offer bogus completions.
Don't think that the context "UPDATE tab SET var =" is a GUC-setting
command.
If we have "SET var =" but the "var" is not a known GUC variable,
don't offer any completions. The most likely explanation is that
we've misparsed the context and it's not really a GUC-setting command.
Per gripe from Ken Tanzer. Back-patch to 9.6. The issue exists
further back, but before 9.6 the code looks very different and it
doesn't actually know whether the "var" name matches anything,
so I desisted from trying to fix it.
Tom Lane [Tue, 2 Jul 2019 16:32:49 +0000 (12:32 -0400)]
Simplify psql \d's rule for ordering the indexes of a table.
The previous rule was "primary key (if any) first, then other unique
indexes in name order, then all other indexes in name order".
But the preference for unique indexes seems a bit obsolete since the
introduction of exclusion constraints. It's no longer the case
that unique indexes are the only ones that constrain what data can
be in the table, and it's hard to see what other rationale there is
for separating out unique indexes. Other new features like the
possibility for some indexes to be INVALID (hence, not constraining
anything) make this even shakier.
Hence, simplify the sort order to be "primary key (if any) first,
then all other indexes in name order".
No documentation change, since this was never documented anyway.
A couple of existing regression test cases change output, though.
Michael Paquier [Tue, 2 Jul 2019 05:02:33 +0000 (14:02 +0900)]
Add support for Visual Studio 2019 in build scripts
This fixes at the same time a set of inconsistencies in the
documentation and the scripts related to the versions of Windows SDK
supported.
Author: Haribabu Kommi Reviewed-by: Andrew Dunstan, Juan José Santamaría Flecha, Michael
Paquier
Discussion: https://postgr.es/m/CAJrrPGcfqXhfPyMrny9apoDU7M1t59dzVAvoJ9AeAh5BJi+UzA@mail.gmail.com
Michael Paquier [Tue, 2 Jul 2019 02:36:53 +0000 (11:36 +0900)]
Refactor code of reindexdb for query generation
This merges the portion related to REINDEX SYSTEM into the routine
already available for all the other reindex types, making the query
generation cleaner. While on it, change the handling of the reindex
types using an enum, which allows to get rid of the hardcoded strings
used directly in the query generation present for the same purpose (aka
"TABLE", "DATABASE", etc.).
Per discussion with Julien Rouhaud, Tom Lane, Alvaro Herrera and me.
David Rowley [Mon, 1 Jul 2019 15:07:15 +0000 (03:07 +1200)]
Remove surplus call to table_finish_bulk_insert
4de60244e added the call to table_finish_bulk_insert to the
CopyMultiInsertBufferCleanup function. We use a CopyMultiInsertBuffer even
for non-partitioned tables, so having the cleanup do that meant we would
call table_finsh_bulk_insert twice when performing COPY FROM with
a non-partitioned table.
Here we can just remove the direct call in CopyFrom and let
CopyMultiInsertBufferCleanup handle the call instead.
David Rowley [Mon, 1 Jul 2019 13:23:26 +0000 (01:23 +1200)]
Fix missing call to table_finish_bulk_insert during COPY
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With pluggable
storage, we can't make any assumptions about what the table AM might want
to do in its equivalent function, so we'd better ensure we always call
table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
In passing, move the table_finish_bulk_insert for the target of the COPY
command so that it's only called when we're actually performing bulk
inserts. We don't need to call this when inserting 1 row at a time.
Reported-by: Robert Haas
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
Don't read fields of a misaligned ExpandedObjectHeader or AnyArrayType.
UBSan complains about this. Instead, cast to a suitable type requiring
only 4-byte alignment. DatumGetAnyArrayP() already assumes one can cast
between AnyArrayType and ArrayType, so this doesn't introduce a new
assumption. Back-patch to 9.5, where AnyArrayType was introduced.
Andrew Gierth [Sun, 30 Jun 2019 22:49:13 +0000 (23:49 +0100)]
Repair logic for reordering grouping sets optimization.
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.
Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.
Backpatch back to 9.5 where grouping sets were introduced.
Tom Lane [Sun, 30 Jun 2019 17:34:45 +0000 (13:34 -0400)]
Blind attempt to fix SSPI-auth case in 010_dump_connstr.pl.
Up to now, pg_regress --config-auth had a hard-wired assumption
that the target cluster uses the default bootstrap superuser name.
pg_dump's 010_dump_connstr.pl TAP test uses non-default superuser
names, and was klugily getting around the restriction by listing
the desired superuser name as a role to "create". This is pretty
confusing (or at least, it confused me). Let's make it clearer by
allowing --config-auth mode to be told the bootstrap superuser name.
Repurpose the existing --user switch for that, since it has no
other function in --config-auth mode.
Per buildfarm. I don't have an environment at hand in which I can
test this fix, but the buildfarm should soon show if it works.
Tom Lane [Sun, 30 Jun 2019 16:51:08 +0000 (12:51 -0400)]
Move rolenames test out of the core regression tests.
This test script is unsafe to run in "make installcheck" mode for
(at least) two reasons: it creates and destroys some role names
that don't follow the "regress_xxx" naming convention, and it
sets and then resets the application_name GUC attached to every
existing role. While we've not had complaints, these surely are
not good things to do within a production installation, and
regress.sgml pretty clearly implies that we won't do them.
Rather than lose test coverage altogether, let's just move this
script somewhere where it will get run by "make check" but not
"make installcheck". src/test/modules/ already has that property.
Since it seems likely that we'll want other regression tests in
future that also exceed the constraints of "make installcheck",
create a generically-named src/test/modules/unsafe_tests/
directory to hold them.
Peter Eisentraut [Sun, 30 Jun 2019 08:15:25 +0000 (10:15 +0200)]
Don't call data type input functions in GUC check hooks
Instead of calling pg_lsn_in() in check_recovery_target_lsn and
timestamptz_in() in check_recovery_target_time, reorganize the
respective code so that we don't raise any errors in the check hooks.
The previous code tried to use PG_TRY/PG_CATCH to handle errors in a
way that is not safe, so now the code contains no ereport() calls and
can operate safely within the GUC error handling system.
Moreover, since the interpretation of the recovery_target_time string
may depend on the time zone, we cannot do the final processing of that
string until all the GUC processing is done. Instead,
check_recovery_target_time() now does some parsing for syntax
checking, but the actual conversion to a timestamptz value is done
later in the recovery code that uses it.
Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190611061115.njjwkagvxp4qujhp%40alap3.anarazel.de
Peter Eisentraut [Wed, 12 Jun 2019 09:29:53 +0000 (11:29 +0200)]
Remove explicit error handling for obsolete date/time values
The date/time values 'current', 'invalid', and 'undefined' were
removed a long time ago, but the code still contains explicit error
handling for the transition. To simplify the code and avoid having to
handle these values everywhere, just remove the recognition of these
tokens altogether now.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Tom Lane [Sat, 29 Jun 2019 15:34:00 +0000 (11:34 -0400)]
Add an enforcement mechanism for global object names in regression tests.
In commit 18555b132 we tentatively established a rule that regression
tests should use names containing "regression" for databases, and names
starting with "regress_" for all other globally-visible object names, so
as to circumscribe the side-effects that "make installcheck" could have
on an existing installation.
This commit adds a simple enforcement mechanism for that rule: if the code
is compiled with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS defined, it
will emit a warning (not an error) whenever a database, role, tablespace,
subscription, or replication origin name is created that doesn't obey the
rule. Running one or more buildfarm members with that symbol defined
should be enough to catch new violations, at least in the regular
regression tests. Most TAP tests wouldn't notice such warnings, but
that's actually fine because TAP tests don't execute against an existing
server anyway.
Since it's already the case that running src/test/modules/ tests in
installcheck mode is deprecated, we can use that as a home for tests
that seem unsafe to run against an existing server, such as tests that
might have side-effects on existing roles. Document that (though this
commit doesn't in itself make it any less safe than before).
Update regress.sgml to define these restrictions more clearly, and
to clean up assorted lack-of-up-to-date-ness in its descriptions of
the available regression tests.
Tom Lane [Sat, 29 Jun 2019 15:09:03 +0000 (11:09 -0400)]
Fix regression tests to use only global names beginning with "regress_".
In commit 18555b132 we tentatively established a rule that regression
tests should use names containing "regression" for databases, and names
starting with "regress_" for all other globally-visible object names, so
as to circumscribe the side-effects that "make installcheck" could have on
an existing installation. However, no enforcement mechanism was created,
so it's unsurprising that some new violations have crept in since then.
In fact, a whole new *category* of violations has crept in, to wit we now
also have globally-visible subscription and replication origin names, and
"make installcheck" could very easily clobber user-created objects of
those types. So it's past time to do something about this.
This commit sanitizes the tests enough that they will pass (i.e. not
generate any visible warnings) with the enforcement mechanism I'll add
in the next commit. There are some TAP tests that still trigger the
warnings, but the warnings do not cause test failure. Since these tests
do not actually run against a pre-existing installation, there's no need
to worry whether they could conflict with user-created objects.
The problem with rolenames.sql testing special role names like "user"
is still there, and is dealt with only very cosmetically in this patch
(by hiding the warnings :-(). What we actually need to do to be safe is
to take that test script out of "make installcheck" altogether, but that
seems like material for a separate patch.
Tom Lane [Sat, 29 Jun 2019 14:30:08 +0000 (10:30 -0400)]
Disallow user-created replication origins named "pg_xxx".
Since we generate such names internally, it seems like a good idea
to have a policy of disallowing them for user use, as we do for many
other object types. Otherwise attempts to use them will randomly
fail due to collisions with internally-generated names.
Alvaro Herrera [Fri, 28 Jun 2019 18:51:08 +0000 (14:51 -0400)]
Fix for dropped columns in a partitioned table's default partition
We forgot to map column numbers to/from the default partition for
various operations, leading to valid cases failing with spurious
errors, such as
ERROR: attribute N of type some_partition has been dropped
It was also possible that the search for conflicting rows in the default
partition when attaching another partition would fail to detect some.
Secondarily, it was also possible that such a search should be skipped
(because the constraint was implied) but wasn't.
Fix all this by mapping column numbers when necessary.
Reported by: Daniel Wilches
Author: Amit Langote
Discussion: https://postgr.es/m/15873-8c61945d6b3ef87c@postgresql.org
Thomas Munro [Thu, 27 Jun 2019 23:11:26 +0000 (11:11 +1200)]
Fix misleading comment in nodeIndexonlyscan.c.
The stated reason for acquiring predicate locks on heap pages hasn't
existed since commit c01262a8, so fix the comment. Perhaps in a later
release we'll also be able to change the code to use tuple locks.
Tomas Vondra [Thu, 27 Jun 2019 15:41:29 +0000 (17:41 +0200)]
Update reference to sampling algorithm in analyze.c
Commit 83e176ec1 moved row sampling functions from analyze.c to
utils/misc/sampling.c, but failed to update comment referring to
the sampling algorithm from Jeff Vitter's paper. Correct the
comment by pointing to utils/misc/sampling.c.
Alvaro Herrera [Wed, 26 Jun 2019 22:38:51 +0000 (18:38 -0400)]
Fix partitioned index creation with foreign partitions
When a partitioned tables contains foreign tables as partitions, it is
not possible to implement unique or primary key indexes -- but when
regular indexes are created, there is no reason to do anything other
than ignoring such partitions. We were raising errors upon encountering
the foreign partitions, which is unfriendly and doesn't protect against
any actual problems.
Relax this restriction so that index creation is allowed on partitioned
tables containing foreign partitions, becoming a no-op on them. (We may
later want to redefine this so that the FDW is told to create the
indexes on the foreign side.) This applies to CREATE INDEX, as well as
ALTER TABLE / ATTACH PARTITION and CREATE TABLE / PARTITION OF.
Backpatch to 11, where indexes on partitioned tables were introduced.
Discussion: https://postgr.es/m/15724-d5a58fa9472eef4f@postgresql.org
Author: Álvaro Herrera Reviewed-by: Amit Langote
Michael Paquier [Wed, 26 Jun 2019 01:44:46 +0000 (10:44 +0900)]
Add support for OpenSSL 1.1.0 and newer versions in MSVC scripts
Up to now, the MSVC build scripts are able to support only one fixed
version of OpenSSL, and they lacked logic to detect the version of
OpenSSL a given compilation of Postgres is linking to (currently 1.0.2,
the latest LTS of upstream which will be EOL'd at the end of 2019).
This commit adds more logic to detect the version of OpenSSL used by a
build and makes use of it to add support for compilation with OpenSSL
1.1.0 which requires a new set of compilation flags to work properly.
The supported OpenSSL installers have changed their library layer with
various library renames with the upgrade to 1.1.0, making the logic a
bit more complicated. The scripts are now able to adapt to the new
world order.
Reported-by: Sergey Pashkov
Author: Juan José Santamaría Flecha, Michael Paquier Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15789-8fc75dea3c5a17c8@postgresql.org
Michael Paquier [Tue, 25 Jun 2019 00:09:27 +0000 (09:09 +0900)]
Add toast-level reloption for vacuum_index_cleanup
a96c41f has introduced the option for heap, but it still lacked the
variant to control the behavior for toast relations.
While on it, refactor the tests so as they stress more scenarios with
the various values that vacuum_index_cleanup can use. It would be
useful to couple those tests with pageinspect to check that pages are
actually cleaned up, but this is left for later.
Author: Masahiko Sawada, Michael Paquier Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAD21AoCqs8iN04RX=i1KtLSaX5RrTEM04b7NHYps4+rqtpWNEg@mail.gmail.com