Tom Lane [Sat, 5 Jun 2004 01:55:05 +0000 (01:55 +0000)]
Make the world very nearly safe for composite-type columns in tables.
1. Solve the problem of not having TOAST references hiding inside composite
values by establishing the rule that toasting only goes one level deep:
a tuple can contain toasted fields, but a composite-type datum that is
to be inserted into a tuple cannot. Enforcing this in heap_formtuple
is relatively cheap and it avoids a large increase in the cost of running
the tuptoaster during final storage of a row.
2. Fix some interesting problems in expansion of inherited queries that
reference whole-row variables. We never really did this correctly before,
but it's now relatively painless to solve by expanding the parent's
whole-row Var into a RowExpr() selecting the proper columns from the
child.
If you dike out the preventive check in CheckAttributeType(),
composite-type columns now seem to actually work. However, we surely
cannot ship them like this --- without I/O for composite types, you
can't get pg_dump to dump tables containing them. So a little more
work still to do.
Tom Lane [Fri, 4 Jun 2004 20:35:21 +0000 (20:35 +0000)]
Resurrect heap_deformtuple(), this time implemented as a singly nested
loop over the fields instead of a loop around heap_getattr. This is
considerably faster (O(N) instead of O(N^2)) when there are nulls or
varlena fields, since those prevent use of attcacheoff. Replace loops
over heap_getattr with heap_deformtuple in situations where all or most
of the fields have to be fetched, such as printtup and tuptoaster.
Profiling done more than a year ago shows that this should be a nice
win for situations involving many-column tables.
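A toy sketch of why the single loop wins, assuming a made-up row format (a null flag per field, and each non-null field stored as one length byte plus payload). This is not PostgreSQL's tuple layout, and encode, get_attr and deform are illustrative names only. Since no offset can be cached, fetching field n alone means walking past the n fields before it.

```python
def encode(fields):
    """Pack fields into (null flags, data); non-null fields are 1 length byte + payload."""
    nulls, data = [], bytearray()
    for value in fields:
        nulls.append(value is None)
        if value is not None:
            data.append(len(value))
            data += value
    return nulls, bytes(data)


def get_attr(nulls, data, n, steps):
    """Fetch field n alone; must walk past every earlier field first."""
    offset = 0
    for i in range(n):
        steps[0] += 1
        if not nulls[i]:
            offset += 1 + data[offset]
    steps[0] += 1
    if nulls[n]:
        return None
    length = data[offset]
    return data[offset + 1 : offset + 1 + length]


def deform(nulls, data, steps):
    """Fetch every field in one pass, carrying the offset forward."""
    values, offset = [], 0
    for is_null in nulls:
        steps[0] += 1
        if is_null:
            values.append(None)
            continue
        length = data[offset]
        values.append(data[offset + 1 : offset + 1 + length])
        offset += 1 + length
    return values
```

For a five-field row, calling get_attr once per field visits 1+2+3+4+5 = 15 fields in total, while deform visits 5.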
Bruce Momjian [Fri, 4 Jun 2004 13:30:04 +0000 (13:30 +0000)]
The attached patch will create a dummy pg_config_paths.h. Additionally,
ENABLE_THREAD_SAFETY is supported by the makefile (but not by the
sources, which need some rework)
Tom Lane [Fri, 4 Jun 2004 03:24:04 +0000 (03:24 +0000)]
Remove some long-obsolete code that was causing a strange error message
when someone attempts to create a column of a composite datatype. For
now, just make sure we produce a reasonable error at the 'right place'.
Not sure if this will be made to work before 7.5, but make it act
reasonably in case nothing more gets done.
Tom Lane [Fri, 4 Jun 2004 02:37:06 +0000 (02:37 +0000)]
Support assignment to whole-row variables in plpgsql; also fix glitch
with using a trigger's NEW or OLD record as a whole-row variable in an
expression. Fixes several long-standing complaints.
Tom Lane [Fri, 4 Jun 2004 00:07:52 +0000 (00:07 +0000)]
Allow plpgsql to pass composite-type arguments (ie, whole-row variables)
into SQL expressions. At present this only works usefully for variables
of named rowtypes, not RECORD variables, since the SQL parser can't infer
anything about datatypes from a RECORD Param. Still, it's a step forward.
Tom Lane [Thu, 3 Jun 2004 22:56:43 +0000 (22:56 +0000)]
Restructure plpgsql's parsing of datatype declarations to unify the
scalar and composite (rowtype) cases a little better. This commit is
just a code-beautification operation and shouldn't make any real
difference in behavior, but it's an important preliminary step for
trying to improve plpgsql's handling of rowtypes.
Teodor Sigaev [Thu, 3 Jun 2004 12:26:10 +0000 (12:26 +0000)]
- Add alignment of variable data types
- Add alignment for interval data types
- Avoid floating point overflow in penalty functions
Janko Richter <jankorichter@yahoo.de> and teodor
Tom Lane [Thu, 3 Jun 2004 02:08:07 +0000 (02:08 +0000)]
Adjust our timezone library to use pg_time_t (typedef'd as int64) in
place of time_t, as per prior discussion. The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038). The system will now treat
times over the full supported timestamp range as being in your local
time zone. It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.
I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038. Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported. We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
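For reference, the 1901 and 2038 edges mentioned above follow from the limits of a signed 32-bit count of seconds since 1970-01-01 UTC. A small sketch of that arithmetic; the helper names are illustrative and have nothing to do with the timezone library's code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def from_epoch_seconds(seconds):
    """Convert a seconds-since-epoch count to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(n):
    """Value n would take if stored in a signed 32-bit integer."""
    return (n + 2**31) % 2**32 - 2**31


earliest = from_epoch_seconds(INT32_MIN)   # falls in December 1901
latest = from_epoch_seconds(INT32_MAX)     # falls in January 2038
# One second past the upper edge wraps around to the lower edge.
wrapped = from_epoch_seconds(wrap_int32(INT32_MAX + 1))
```

A 64-bit count has no such edge anywhere near the supported timestamp range.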
Bruce Momjian [Thu, 3 Jun 2004 00:25:47 +0000 (00:25 +0000)]
Win32 regression fixes:
. only use the -W flag on pwd for $pkglibdir. All the other paths need
to be seen as MSys type paths, whereas $pkglibdir needs to be expressed
as a genuine windows path.
. run single tests in the background and explicitly wait for them -
solves the problem of the MSys shell not waiting properly for the copy
test to finish.
. use pg_ctl to shut down the test postmaster - no more use of ad hoc
kill programs or the task manager.
Bruce Momjian [Thu, 3 Jun 2004 00:07:38 +0000 (00:07 +0000)]
Add PGETC (for pg_service.conf) and PGLOCALE (for locale dir)
environment variable processing to libpq.
The patch also adds code to our client apps so we set the environment
variable directly based on our binary location, unless it is already
set. This will allow our applications to emit proper locale messages
that are generated in libpq.
Bruce Momjian [Wed, 2 Jun 2004 21:34:49 +0000 (21:34 +0000)]
Small patch that adds some documentation for the area() function.
Specifically, point out that intersecting points in a path will most
likely yield unexpected results. Visually these are identical paths,
but mathematically they're not the same. Ex:
 area | plan
------+----------------------------------------------------------
   -0 | ((0,0),(0,1),(2,1),(2,2),(1,2),(1,0),(0,0))
    2 | ((0,0),(0,1),(1,1),(1,2),(2,2),(2,1),(1,1),(1,0),(0,0))
The current algorithm for area(PATH) is very quick, but only handles
non-intersecting paths. I'm going to work on two other functions for
the PATH data type: one that determines whether a PATH is intersecting,
and one that returns the area() for an intersecting PATH. The
intersecting area() function will be considerably slower (I think it's
going to be O(n!) or worse instead of the current O(n), but that comes
with the territory).
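To see how a one-pass O(n) method can produce the two results in the example, here is a sketch using the shoelace formula. That the real area(PATH) works this way is an assumption; the message only says the algorithm is quick and handles non-intersecting paths. The sketch returns an absolute value, so it gives 0 where the example shows -0.

```python
def path_area(points):
    """Shoelace formula over consecutive vertices of a closed path; one pass, O(n)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# The two paths from the example; both end at their starting point.
CROSSING = [(0, 0), (0, 1), (2, 1), (2, 2), (1, 2), (1, 0), (0, 0)]
TOUCHING = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 1), (1, 1), (1, 0), (0, 0)]
```

In CROSSING the two unit squares are traversed in opposite directions, so their signed contributions cancel. TOUCHING has an explicit vertex at (1,1) and traverses both squares in the same direction.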
Bruce Momjian [Wed, 2 Jun 2004 21:29:29 +0000 (21:29 +0000)]
Per previous discussions, here are two functions to send INT and TERM
(cancel and terminate) signals to other backends. They permit only INT
and TERM, and permit sending only to PostgreSQL backends.
Tom Lane [Wed, 2 Jun 2004 17:28:18 +0000 (17:28 +0000)]
Adjust btree index build to not use shared buffers, thereby avoiding the
locking conflict against concurrent CHECKPOINT that was discussed a few
weeks ago. Also, if not using WAL archiving (which is always true ATM
but won't be if PITR makes it into this release), there's no need to
WAL-log the index build process; it's sufficient to force-fsync the
completed index before commit. This seems to gain about a factor of 2
in my tests, which is consistent with writing half as much data. I did
not try it with WAL on a separate drive though --- probably the gain would
be a lot less in that scenario.
Tom Lane [Tue, 1 Jun 2004 21:49:23 +0000 (21:49 +0000)]
Align GRANT/REVOKE behavior more closely with the SQL spec, per discussion
of bug report #1150. Also, arrange that the object owner's irrevocable
grant-option permissions are handled implicitly by the system rather than
being listed in the ACL as self-granted rights (which was wrong anyway).
I did not take the further step of showing these permissions in an
explicit 'granted by _SYSTEM' ACL entry, as that seemed more likely to
bollix up existing clients than to do anything really useful. It's still
a possible future direction, though.
Tom Lane [Mon, 31 May 2004 20:31:33 +0000 (20:31 +0000)]
Additional mop-up for sync-to-fsync changes: avoid issuing fsyncs for
temp tables, and avoid WAL-logging truncations of temp tables. Do issue
fsync on truncated files (not sure this is necessary but it seems like
a good idea).
Tom Lane [Mon, 31 May 2004 19:24:05 +0000 (19:24 +0000)]
Minor code rationalization: FlushRelationBuffers just returns void,
rather than an error code, and does elog(ERROR) not elog(WARNING)
when it detects a problem. All callers were simply elog(ERROR)'ing on
failure return anyway, and I find it hard to envision a caller that would
not, so we may as well simplify the callers and produce the more useful
error message directly.
Tom Lane [Mon, 31 May 2004 18:31:51 +0000 (18:31 +0000)]
I think I've finally identified the cause of the off-by-one-second
issue in timestamp conversion that we hacked around for so long by
ignoring the seconds field from localtime(). It's simple: you have
to watch out for platform-specific roundoff error when reducing a
possibly-fractional timestamp to integral time_t form. In particular
we should subtract off the already-determined fractional fsec field.
This should be enough to get an exact answer with int64 timestamps;
with float timestamps, throw in a rint() call just to be sure.
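A minimal sketch of the reduction described above, assuming a float timestamp in seconds whose fractional part fsec was determined earlier. Python's round() stands in for rint(), and whole_seconds is an illustrative name, not a function in the source tree.

```python
def whole_seconds(timestamp, fsec):
    """Integral seconds of a float timestamp whose fraction fsec is already known."""
    # Subtract the known fraction first, so the remainder is (nearly) integral,
    # then round to nearest to absorb any leftover floating-point roundoff.
    return round(timestamp - fsec)
```

Because the subtraction leaves a value within a tiny roundoff error of an integer, rounding to nearest recovers the intended second regardless of which side of the integer the error falls on.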
Tom Lane [Mon, 31 May 2004 03:48:10 +0000 (03:48 +0000)]
Per previous discussions, get rid of use of sync(2) in favor of
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint. In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends. (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors. Anyone have a problem with that?) This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.
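The bookkeeping this implies can be sketched roughly as follows: remember each non-temp file written since the last checkpoint, then fsync each of them at checkpoint time through a freshly opened descriptor. DirtyFileTracker and its methods are hypothetical names for illustration; only os.open, os.fsync and os.close are real calls.

```python
import os


class DirtyFileTracker:
    """Remembers files written since the last checkpoint (illustrative only)."""

    def __init__(self):
        self._dirty = set()

    def write(self, path, data, temp=False):
        """Append data to path; temp files are never queued for fsync."""
        with open(path, "ab") as f:
            f.write(data)
        if not temp:
            self._dirty.add(path)

    def checkpoint(self):
        """fsync every remembered file and return the list of paths synced."""
        synced = []
        for path in sorted(self._dirty):
            # A different descriptor than the writer used, mirroring the
            # assumption stated in the message above.
            fd = os.open(path, os.O_RDWR)
            try:
                os.fsync(fd)   # returns only once the kernel reports completion
            finally:
                os.close(fd)
            synced.append(path)
        self._dirty.clear()
        return synced
```

Unlike sync(2), each fsync call blocks until the I/O for that file is done, which is the waiting semantics the checkpoint needs.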
Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC: heap_delete_redo: no block'
later on in xlog replay.
Neil Conway [Sun, 30 May 2004 23:40:41 +0000 (23:40 +0000)]
Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
Tom Lane [Sat, 29 May 2004 22:48:23 +0000 (22:48 +0000)]
Separate out bgwriter code into a logically separate module, rather
than being random pieces of other files. Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses. While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
Tom Lane [Sat, 29 May 2004 05:55:13 +0000 (05:55 +0000)]
Fix another place that assumed 'x = lcons(y, z)' would not have any
side-effect on the original list z. I fear we have a few more of these
to track down yet :-(.
Bruce Momjian [Fri, 28 May 2004 18:37:10 +0000 (18:37 +0000)]
When checking for thread safety with src/tools/thread/thread_test.c, the
mktemp function wants an argument that contains 6 X, while the current
version only supplies 5 X which will fail on my SuSE 8.1.
Tom Lane [Fri, 28 May 2004 16:17:14 +0000 (16:17 +0000)]
Fix thinko in recent patch to change temp-table permissions behavior:
this is an aclmask function and does not have the same return convention
as aclcheck functions. Also adjust the behavior so that users without
CREATE TEMP permission still have USAGE permission on their session's
temp schema. This allows privileged code to create a temp table and
make it accessible to code that's not got the same privilege. (Since
the default permissions on a table are no-access, an explicit grant on
the table will still be needed; but I see no reason that the temp schema
itself should prohibit such access.)
Teodor Sigaev [Fri, 28 May 2004 10:43:32 +0000 (10:43 +0000)]
New version. Add support for int2, int8, float4, float8, timestamp with/without time zone, time with/without time zone, date, interval, oid, money and macaddr, char, varchar/text, bytea, numeric, bit, varbit, inet/cidr types for GiST
Tom Lane [Fri, 28 May 2004 05:13:32 +0000 (05:13 +0000)]
Code review for EXEC_BACKEND changes. Reduce the number of #ifdefs by
about a third, make it work on non-Windows platforms again. (But perhaps
I broke the WIN32 code, since I have no way to test that.) Fold all the
paths that fork postmaster child processes to go through the single
routine SubPostmasterMain, which takes care of resurrecting the state that
would normally be inherited from the postmaster (including GUC variables).
Clean up some places where there's no particularly good reason for the
EXEC and non-EXEC cases to work differently. Take care of one or two
FIXMEs that remained in the code.
Tom Lane [Thu, 27 May 2004 17:12:57 +0000 (17:12 +0000)]
Get rid of the former rather baroque mechanism for propagating the values
of ThisStartUpID and RedoRecPtr into new backends. It's a lot easier just
to make them all grab the values out of shared memory during startup.
This helps to decouple the postmaster from checkpoint execution, which I
need since I'm intending to let the bgwriter do it instead, and it also
fixes a bug in the Win32 port: ThisStartUpID wasn't getting propagated at
all AFAICS. (Doesn't give me a lot of faith in the amount of testing that
port has gotten.)
Tom Lane [Thu, 27 May 2004 03:30:11 +0000 (03:30 +0000)]
Recommend ALTER TABLE ... TYPE as the best way to reclaim space occupied by deleted columns. The old method involving UPDATE and VACUUM FULL will be considerably less efficient.
Tom Lane [Wed, 26 May 2004 19:44:15 +0000 (19:44 +0000)]
Reduce the minimum allocable chunk size to 8 bytes (from 16). Now that
ListCells are only 8 bytes instead of 12 (on 4-byte-pointer machines
anyway), it's worth maintaining a separate freelist for 8-byte objects.
Remembering that alloc chunks carry 8 bytes of overhead, this should
reduce the net storage requirement for a long List by about a third.
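The "about a third" figure can be checked with a little arithmetic. The sketch assumes requests are rounded up to a power-of-two chunk size no smaller than the minimum, which is an assumption about the allocator; the 8 bytes of per-chunk overhead come from the message itself.

```python
CHUNK_OVERHEAD = 8  # bytes of bookkeeping carried by every chunk


def chunk_size(request, min_chunk):
    """Round request up to a power-of-two chunk size, at least min_chunk."""
    size = min_chunk
    while size < request:
        size *= 2
    return size


def footprint(request, min_chunk):
    """Total bytes consumed by one allocation, overhead included."""
    return chunk_size(request, min_chunk) + CHUNK_OVERHEAD


before = footprint(8, min_chunk=16)   # 8-byte cell forced into a 16-byte chunk
after = footprint(8, min_chunk=8)     # 8-byte cell in an 8-byte chunk
saving = 1 - after / before
```

Each list cell goes from 24 bytes to 16, so a long List needs one third less memory for its cells.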
Bruce Momjian [Wed, 26 May 2004 18:51:43 +0000 (18:51 +0000)]
AIX doc addition:
> FWIW, the section on configuring kernel resources under various
> Unixen[1] doesn't have any documentation for AIX. If someone out there
> knows which knobs need to be tweaked, would they mind sending in a doc
> patch? (Or just specifying what needs to be done, and I'll add the
> SGML.)
After verifying that nobody wound up messing with the kernel
parameters, here's a docs patch...
Bruce Momjian [Wed, 26 May 2004 18:35:51 +0000 (18:35 +0000)]
*) inet_(client|server)_(addr|port)() and necessary documentation for
the four functions.
> Also, please justify the temp-related changes. I was not aware that we
> had any breakage there.
patch-tmp-schema.txt contains the following bits:
*) Changes pg_namespace_aclmask() so that the superuser is always able
to create objects in the temp namespace.
*) Changes pg_namespace_aclmask() so that if this is a temp namespace,
objects are only allowed to be created in the temp namespace if the
user has TEMP privs on the database. This encompasses all object
creation, not just TEMP tables.
*) InitTempTableNamespace() checks to see if the current user, not the
session user, has access to create a temp namespace.
The first two changes are necessary to support the third change. Now
it's possible to revoke all temp table privs from non-super users and
limit all creation of temp tables/schemas to a function that's
executed with elevated privs (security definer). Before this change,
it was not possible to have a setuid function to create a temp
table/schema if the session user had no TEMP privs.
patch-area-path.txt contains:
*) Can now determine the area of a closed path.
patch-dfmgr.txt contains:
*) Small tweak to add the library path that's being expanded.
I was using $lib/foo.so and couldn't easily figure out what the error
message, "invalid macro name in dynamic library path" meant without
looking through the source code. With the path in there, at least I
know where to start looking in my config file.
Bruce Momjian [Wed, 26 May 2004 15:26:28 +0000 (15:26 +0000)]
The added aggregates are:
(1) boolean-and and boolean-or aggregates named bool_and and bool_or.
they (SHOULD;-) correspond to standard sql every and some/any aggregates.
they do not have the right name as there is a problem with
the standard and the parser for some/any. Tom also thinks that
the standard name is misleading because NULLs are ignored.
Also add 'every' aggregate.
(2) bitwise integer aggregates named bit_and and bit_or for
int2, int4, int8 and bit types. They are not standard, but I find
them useful. I needed them once.
The patch adds:
- 2 new very short strict functions for boolean aggregates in
src/backend/utils/adt/bool.c,
src/include/utils/builtins.h and src/include/catalog/pg_proc.h
- the new aggregates declared in src/include/catalog/pg_proc.h and
src/include/catalog/pg_aggregate.h
- some documentation and validation about these new aggregates.
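A rough model of the semantics described: because the transition functions are strict, NULL inputs are skipped. Here None stands in for SQL NULL, and returning None when there are no non-NULL inputs is an assumption of the sketch, not something the message states.

```python
def aggregate(values, step):
    """Fold non-NULL values with step; NULL (None) inputs are ignored."""
    state = None
    for v in values:
        if v is None:
            continue                      # strict: skip NULL inputs
        state = v if state is None else step(state, v)
    return state


def bool_and(values):
    return aggregate(values, lambda a, b: a and b)


def bool_or(values):
    return aggregate(values, lambda a, b: a or b)


def bit_and(values):
    return aggregate(values, lambda a, b: a & b)


def bit_or(values):
    return aggregate(values, lambda a, b: a | b)
```

For example, bool_and over (true, NULL, true) yields true here, which is the behavior the remark about ignored NULLs refers to.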
Bruce Momjian [Wed, 26 May 2004 15:07:41 +0000 (15:07 +0000)]
The patch addresses the TODO list item "Allow external interfaces to
extend the GUC variable set".
Plugin modules like the pl<lang> modules need a way to declare
configuration parameters. The postmaster has no knowledge of such
modules when it reads the postgresql.conf file. Rather than allowing
totally unknown configuration parameters, the concept of a variable
"class" is introduced. Variables that belong to a declared class will
create a placeholder value of string type and will not generate an
error. When a module is loaded, it will declare variables for such a
class and make those variables "consume" any placeholders that have been
defined. Finally, the module will generate warnings for unrecognized
placeholders defined for its class.
More detail:
The design is outlined after the suggestions made by Tom Lane and Joe
Conway in this thread:
A new string variable 'custom_variable_classes' is introduced. This
variable is a comma separated string of identifiers. Each identifier
denotes a 'class' that will allow its members to be added without error.
This variable must be defined in postgresql.conf.
The lexer (guc_file.l) is changed so that it can accept a qualified name
in the form <ID>.<ID> as the name of a variable. I also changed it so that
the 'custom_variable_classes', if found, is added first of all variables
in order to remove the order of declaration issue.
The guc_variables table is made more dynamic. It is originally created
with 20% slack and can grow dynamically. A capacity is introduced to
avoid resizing every time a new variable is added. guc_variables and
num_guc_variables become static (hidden).
The GucInfoMain now uses the new function get_guc_variables() and
GetNumConfigOptions instead of using the guc_variables directly.
The find_option() function, when passed a missing name, will check if
the name is qualified. If the name is qualified and if the qualifier
denotes a class included in the 'custom_variable_classes', a placeholder
variable will be created. Such a placeholder will not participate in a
list operation but will otherwise function as a normal string variable.
Define<type>GucVariable() functions will be added, one for each variable
type. They are intended to be used by add-on modules like the pl<lang>
mappings. Example:
(I created typedefs for the assign-hook and show-hook functions). A call
to these functions will define a new GUC-variable. If a placeholder
exists it will be replaced but its value will be used in place of the
default value. The valueAddr is assumed to point at a default value when
the define function is called. The only constraint that is imposed on a
Custom variable is that its name is qualified.
was added. This function should be called when a module has completed
its variable definitions. At that time, no placeholders should remain
for the class that the module uses. If they do, elog(INFO, ...) messages
will be issued to inform the user that unrecognized variables are
present.
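The placeholder life cycle described in this message can be modeled roughly as follows. GucRegistry, its method names and the plperl.* variable names used with it are all hypothetical; the sketch only mirrors the described behavior and is not the GUC code.

```python
class GucRegistry:
    """Toy model of placeholders for custom variable classes (illustrative only)."""

    def __init__(self, custom_variable_classes):
        self.classes = set(custom_variable_classes)
        self.variables = {}      # defined variables: name -> value
        self.placeholders = {}   # undeclared qualified names: name -> string

    def set_from_config(self, name, value):
        """Handle one setting read from the configuration file."""
        if name in self.variables:
            self.variables[name] = value
            return
        qualifier, dot, _ = name.partition(".")
        if dot and qualifier in self.classes:
            # Unknown but qualified by a declared class: keep a string placeholder.
            self.placeholders[name] = str(value)
            return
        raise KeyError("unrecognized configuration parameter: " + name)

    def define_string_variable(self, name, default):
        """Called by a module; consumes a placeholder's value if one exists."""
        if "." not in name:
            raise ValueError("custom variable names must be qualified")
        self.variables[name] = self.placeholders.pop(name, default)
        return self.variables[name]

    def finish_class(self, class_name):
        """Report placeholders left over once the module has defined its variables."""
        prefix = class_name + "."
        return ["unrecognized variable " + name
                for name in sorted(self.placeholders)
                if name.startswith(prefix)]
```

With the class "plperl" declared, a setting such as plperl.use_strict read before the module loads becomes a placeholder, is consumed when the module defines that variable, and anything left under plperl. is reported by finish_class.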
Bruce Momjian [Wed, 26 May 2004 13:57:04 +0000 (13:57 +0000)]
This patch implements the TODO item [ALTER DATABASE foo OWNER TO bar].
It was necessary to touch the grammar and create a new node to
accommodate the new syntax. The command is also supported in ECPG.
Doc updates are attached too. Only superusers can change the owner
of the database. New owners don't need any additional privileges.
Neil Conway [Wed, 26 May 2004 04:41:50 +0000 (04:41 +0000)]
Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
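A compact sketch of the structure described, with a header holding the length plus head and tail pointers. The field names, the use of None for the empty list, and the helper to_python are assumptions made for illustration; lappend and lcons are the operations named in these messages.

```python
class ListCell:
    """One node of the chain: a datum and a link to the next cell."""

    def __init__(self, data, next_cell=None):
        self.data = data
        self.next = next_cell


class List:
    """Header carrying the meta-data: length plus head and tail cells."""

    def __init__(self):
        self.length = 0
        self.head = None
        self.tail = None


def lappend(lst, datum):
    """Append in constant time via the tail pointer; None means the empty list."""
    if lst is None:
        lst = List()
    cell = ListCell(datum)
    if lst.tail is None:
        lst.head = cell
    else:
        lst.tail.next = cell
    lst.tail = cell
    lst.length += 1
    return lst


def lcons(datum, lst):
    """Prepend in constant time; note that the header is updated in place."""
    if lst is None:
        lst = List()
    lst.head = ListCell(datum, lst.head)
    if lst.tail is None:
        lst.tail = lst.head
    lst.length += 1
    return lst


def to_python(lst):
    """Collect the data into a Python list, for inspection only."""
    out, cell = [], (lst.head if lst is not None else None)
    while cell is not None:
        out.append(cell.data)
        cell = cell.next
    return out
```

Because the header is shared, x = lcons(y, z) in this model returns the same object as z with y now at its front, which is the kind of side effect on the original list that later fixes had to account for.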