Tom Lane [Fri, 12 Aug 2005 18:23:56 +0000 (18:23 +0000)]
Code & docs review for server instrumentation patch. File timestamps
should surely be timestamptz not timestamp; fix some but not all of the
holes in check_and_make_absolute(); other minor cleanup. Also put in
the missed catversion bump.
Tom Lane [Fri, 12 Aug 2005 14:34:14 +0000 (14:34 +0000)]
Change a couple of "can't happen" error messages to be a shade more
verbose when they do happen. The "left link changed unexpectedly"
one in particular has been seen more than once in the field.
Tom Lane [Fri, 12 Aug 2005 05:05:51 +0000 (05:05 +0000)]
Remove BufferBlockPointers array in favor of a base + (bufnum) * BLCKSZ
computation. On modern machines this is as fast if not faster, and we
don't have to clog the CPU's L2 cache with a tens-of-KB pointer array.
If we ever decide to adopt a more dynamic allocation method for shared
buffers, we'll probably have to revert this patch, but in the meantime
we might as well save a few bytes and nanoseconds. Per Qingqing Zhou.
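As a rough illustration of the computation described above (not the actual bufmgr code): the helper name buffer_block_addr, the nbuffers bounds check, and the BLCKSZ value of 8192 are assumptions made for this sketch, and buffers are taken to be numbered from zero inside one contiguous arena.

```c
#include <assert.h>
#include <stddef.h>

#define BLCKSZ 8192             /* assumed block size for this sketch */

/*
 * Hypothetical helper: return the address of buffer number "bufnum"
 * (0-based) inside one contiguous arena that starts at "base".
 * No per-buffer pointer table is needed, only a multiply and an add.
 */
static inline char *
buffer_block_addr(char *base, size_t bufnum, size_t nbuffers)
{
    assert(bufnum < nbuffers);
    return base + bufnum * (size_t) BLCKSZ;
}
```

The arithmetic only works while all buffers live in one contiguous allocation, which is why a more dynamic allocation scheme would bring back some kind of lookup table.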
Tom Lane [Fri, 12 Aug 2005 01:36:05 +0000 (01:36 +0000)]
Solve the problem of OID collisions by probing for duplicate OIDs
whenever we generate a new OID. This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files. Per my recent proposal.
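The probing idea can be sketched roughly as follows. This is a simplified model, not the server's implementation: the in-use check is a linear search over an array rather than an index probe, and the names next_unused_oid, oid_in_use and FIRST_NORMAL_OID (with its value) are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

#define FIRST_NORMAL_OID 16384u /* assumed start of the non-reserved range */

/* Hypothetical stand-in for "does a row with this OID already exist?" */
static bool
oid_in_use(Oid oid, const Oid *used, size_t nused)
{
    for (size_t i = 0; i < nused; i++)
        if (used[i] == oid)
            return true;
    return false;
}

/*
 * Advance the counter until it yields an OID that is not already taken.
 * After wraparound the counter re-enters the reserved range, so skip
 * ahead to the first normal OID. Loops forever if every OID is in use.
 */
static Oid
next_unused_oid(Oid *counter, const Oid *used, size_t nused)
{
    for (;;)
    {
        Oid candidate = (*counter)++;

        if (candidate < FIRST_NORMAL_OID)
        {
            *counter = FIRST_NORMAL_OID;
            continue;
        }
        if (!oid_in_use(candidate, used, nused))
            return candidate;
    }
}
```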
Tom Lane [Thu, 11 Aug 2005 22:53:41 +0000 (22:53 +0000)]
EINTR return from connect() should be treated exactly the same as
EINPROGRESS, according to Florian Hars. I'm not completely convinced
but the spec does seem to read that way.
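A minimal sketch of the resulting check, assuming a POSIX errno environment; the helper name connect_still_in_progress is hypothetical and this is not the libpq source.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: given the errno left by a failed connect() call,
 * decide whether the connection attempt is still being established.
 * EINTR is treated the same as EINPROGRESS: the caller should wait for
 * the socket to become writable and then check the socket error status,
 * rather than calling connect() again.
 */
static bool
connect_still_in_progress(int err)
{
    return err == EINPROGRESS || err == EINTR;
}
```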
Tom Lane [Thu, 11 Aug 2005 21:11:50 +0000 (21:11 +0000)]
Autovacuum loose end mop-up. Provide autovacuum-specific vacuum cost
delay and limit, both as global GUCs and as table-specific entries in
pg_autovacuum. stats_reset_on_server_start is now OFF by default,
but a reset is forced if we did WAL replay. XID-wrap vacuums do not
ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
Tom Lane [Tue, 9 Aug 2005 22:47:03 +0000 (22:47 +0000)]
Extend pg_config to be able to report the build-time values of CC,
CPPFLAGS, CFLAGS, CFLAGS_SL, LDFLAGS, LDFLAGS_SL, and LIBS. Change it
so that invoking pg_config with no arguments reports all available
information, rather than just giving an error message. Per discussion.
Tom Lane [Tue, 9 Aug 2005 21:14:55 +0000 (21:14 +0000)]
Make backends that are reading the pgstats file verify each backend PID
against the PGPROC array. Anything in the file that isn't in PGPROC
gets rejected as being a stale entry. This should solve complaints about
stale entries in pg_stat_activity after a BETERM message has been dropped
due to overload.
Bruce Momjian [Tue, 9 Aug 2005 20:24:31 +0000 (20:24 +0000)]
Combine entries:
< inheritance
< * Allow enable_constraint_exclusion to work for UPDATE and DELETE queries
> inheritance, and allow it to work for UPDATE and DELETE queries
Tom Lane [Mon, 8 Aug 2005 23:39:01 +0000 (23:39 +0000)]
Fix crash when reading 'timezone = unknown' from postgresql.conf during
SIGHUP; it's not OK for an assign_hook to return a non-malloc'd string.
Problem was introduced during timezone library rewrite.
Tom Lane [Mon, 8 Aug 2005 19:17:23 +0000 (19:17 +0000)]
Modify AtEOXact_CatCache and AtEOXact_RelationCache to assume that the
ResourceOwner mechanism already released all reference counts for the
cache entries; therefore, we do not need to scan the catcache or relcache
at transaction end, unless we want to do it as a debugging crosscheck.
Do the crosscheck only in Assert mode. This is the same logic we had
previously installed in AtEOXact_Buffers to avoid overhead with large
numbers of shared buffers. I thought it'd be a good idea to do it here
too, in view of Kari Lavikka's recent report showing a real-world case
where AtEOXact_CatCache is taking a significant fraction of runtime.
Tom Lane [Mon, 8 Aug 2005 03:12:16 +0000 (03:12 +0000)]
Cause ShutdownPostgres to do a normal transaction abort during backend
exit, instead of trying to take shortcuts. Introduce some additional
shutdown callback routines to eliminate kluges like having ProcKill
be responsible for shutting down the buffer manager. Ensure that the
order of operations during shutdown is predictable and what you would
expect given the module layering.
Tom Lane [Sun, 7 Aug 2005 19:02:08 +0000 (19:02 +0000)]
Set shlib naming convention on Cygwin to 'cygFOO.dll', which appears
to be the platform standard. This should fix recursive-rule breakage
due to recent Makefile changes. Per discussion.
Tom Lane [Sun, 7 Aug 2005 18:47:19 +0000 (18:47 +0000)]
Fix count_usable_fds() to stop trying to open files once it reaches
max_files_per_process. Going further than that is just a waste of
cycles, and it seems that current Cygwin does not cope gracefully
with deliberately running the system out of FDs. Per Andrew Dunstan.
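The capped probe could look roughly like this. It is a sketch under assumptions: the function name count_usable_fds_sketch and its single parameter are invented here, and the real function has a different signature and reports more than one number.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical sketch: count how many extra descriptors can be opened,
 * but never probe beyond "max_to_probe" (standing in for
 * max_files_per_process), so the system is not deliberately driven
 * out of file descriptors.
 */
static int
count_usable_fds_sketch(int max_to_probe)
{
    int    *fds;
    int     used = 0;

    if (max_to_probe <= 0)
        return 0;
    fds = malloc(sizeof(int) * (size_t) max_to_probe);
    if (fds == NULL)
        return 0;

    while (used < max_to_probe)
    {
        int     fd = dup(0);    /* duplicate stdin as a cheap probe */

        if (fd < 0)
            break;              /* hit a system or process limit first */
        fds[used++] = fd;
    }

    /* release every descriptor opened during the probe */
    for (int i = 0; i < used; i++)
        close(fds[i]);
    free(fds);
    return used;
}
```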
Tom Lane [Sat, 6 Aug 2005 20:41:58 +0000 (20:41 +0000)]
COPY performance improvements. Avoid calling CopyGetData for each input
character, tighten the inner loops of CopyReadLine and CopyReadAttribute,
arrange to parse out all the attributes of a line in just one call instead
of one CopyReadAttribute call per attribute, be smarter about which client
encodings require slow pg_encoding_mblen() loops. Also, clean up the
mishmash of static variables and overly-long parameter lists in favor of
passing around a single CopyState struct containing all the state data.
Original patch by Alon Goldshuv, reworked by Tom Lane.
Tom Lane [Thu, 4 Aug 2005 01:09:29 +0000 (01:09 +0000)]
ALTER TABLE OWNER must change the ownership of the table's rowtype too.
This was not especially critical before, but it is now that we track
ownership dependencies --- the dependency for the rowtype *must* shift
to the new owner. Spotted by Bernd Helmle.
Also fix a problem introduced by recent change to allow non-superusers
to do ALTER OWNER in some cases: if the table had a toast table, ALTER
OWNER failed *even for superusers*, because the test being applied would
conclude that the new would-be owner had no create rights on pg_toast.
A side-effect of the fix is to disallow changing the ownership of indexes
or toast tables separately from their parent table, which seems a good
idea on the whole.
Tom Lane [Tue, 2 Aug 2005 20:52:08 +0000 (20:52 +0000)]
Tweak BgBufferSync() so that a persistent write error on a dirty buffer
doesn't block the bgwriter from making progress writing out other buffers.
This was a hard problem in the context of the ARC/2Q design, but it's
trivial in the context of clock sweep ... just advance the sweep counter
before we try to write, not after.
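A toy model of the ordering, not the real BgBufferSync(): the SketchBuffer struct, the write_fails flag and the function bg_sync_pass are all hypothetical, and a failed write is modeled by returning false where the server would abort the pass with an error.

```c
#include <stdbool.h>

typedef struct
{
    bool    dirty;          /* needs to be written out */
    bool    write_fails;    /* models a persistent write error */
} SketchBuffer;

/*
 * One background-writer pass over at most nbuffers buffers. "sweep" is
 * the shared position that survives across passes. It is advanced
 * BEFORE the write is attempted, so a pass that aborts on a bad buffer
 * leaves the position already past it and the next pass starts with
 * the following buffer. Returns false if the pass was aborted.
 */
static bool
bg_sync_pass(SketchBuffer *bufs, int nbuffers, int *sweep, int *written)
{
    for (int visited = 0; visited < nbuffers; visited++)
    {
        SketchBuffer *buf = &bufs[*sweep];

        *sweep = (*sweep + 1) % nbuffers;   /* advance first */

        if (!buf->dirty)
            continue;
        if (buf->write_fails)
            return false;                   /* pass aborted by the error */
        buf->dirty = false;
        (*written)++;
    }
    return true;
}
```

If the counter were advanced only after a successful write, every pass would start on the same failing buffer and never reach the others.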
Tom Lane [Tue, 2 Aug 2005 19:02:32 +0000 (19:02 +0000)]
Clean up CREATE DATABASE processing to make it more robust and get rid
of special case for Windows port. Put a PG_TRY around most of createdb()
to ensure that we remove copied subdirectories on failure, even if the
failure happens while creating the pg_database row. (I think this explains
Oliver Siegmar's recent report.) Having done that, there's no need for
the fragile assumption that copydir() mustn't ereport(ERROR), so simplify
its API. Eliminate the old code that used system("cp ...") to copy
subdirectories, in favor of using copydir() on all platforms. This not
only should allow much better error reporting, but allows us to fsync
the created files before trusting that the copy has succeeded.
Tom Lane [Tue, 2 Aug 2005 15:16:27 +0000 (15:16 +0000)]
Add ERROR_NO_MORE_FILES workaround to check_data_dir(). This may or
may not be obsolete, but since every other readdir loop in our code
has it, I think this should too.
Bruce Momjian [Mon, 1 Aug 2005 14:05:03 +0000 (14:05 +0000)]
Done:
< o Allow objects to be moved to different schemas
> o -Allow objects to be moved to different schemas
Fix word wrap:
< * Allow GRANT/REVOKE permissions to be applied to all schema objects with one
< command
> o Allow GRANT/REVOKE permissions to be applied to all schema objects
> with one command
Tom Lane [Mon, 1 Aug 2005 04:03:59 +0000 (04:03 +0000)]
Add ALTER object SET SCHEMA capability for a limited but useful set of
object kinds (tables, functions, types). Documentation is not here yet.
Original code by Bernd Helmle, extensive rework by Bruce Momjian and
Tom Lane.
Bruce Momjian [Mon, 1 Aug 2005 00:52:27 +0000 (00:52 +0000)]
Add description:
< This would require a new global table that is dumped to flat file for
< use by the postmaster. We do a similar thing for pg_shadow currently.
> This would add a function to load the SQL table from
> pg_hba.conf, and one to write its contents to the flat file.
> The table should have a line number that is a float so rows
> can be inserted between existing rows, e.g. row 2.5 goes
> between row 2 and row 3.
Tom Lane [Sun, 31 Jul 2005 17:19:22 +0000 (17:19 +0000)]
Add per-user and per-database connection limit options.
This patch also includes preliminary update of pg_dumpall for roles.
Petr Jelinek, with review by Bruce Momjian and Tom Lane.
Bruce Momjian [Sun, 31 Jul 2005 13:54:52 +0000 (13:54 +0000)]
Suggest syntax:
< o Allow postgresql.conf file values to be changed via an SQL API
> o Allow postgresql.conf file values to be changed via an SQL
> API, perhaps using SET GLOBAL
Bruce Momjian [Sat, 30 Jul 2005 14:52:04 +0000 (14:52 +0000)]
Please find attached diffs for documentation and simple regression
tests for the new interval->day changes. I added tests for
justify_hours() and justify_days() to interval.sql, as they take
interval input and produce interval output. If there's a more
appropriate place for them, please let me know.
Bruce Momjian [Sat, 30 Jul 2005 04:05:17 +0000 (04:05 +0000)]
Add constraint exclusion items:
<
> * Allow EXPLAIN to identify tables that were skipped because of
> enable_constraint_exclusion
> * Allow EXPLAIN output to be more easily processed by scripts
> * Allow enable_constraint_exclusion to work for UPDATE and DELETE queries
Bruce Momjian [Sat, 30 Jul 2005 03:15:22 +0000 (03:15 +0000)]
Add:
> * Add TRUNCATE permission
>
> Currently only the owner can TRUNCATE a table because triggers are not
> called, and the table is locked in exclusive mode.
>
Tom Lane [Fri, 29 Jul 2005 21:40:02 +0000 (21:40 +0000)]
Fix an oversight I introduced on 2003-12-28: find_nots/push_nots should
continue to recurse after eliminating a NOT-below-a-NOT, since the
contained subexpression will now be part of the top-level AND/OR structure
and so deserves to be simplified. The real-world impact of this is
probably minimal, since it'd require at least three levels of NOT to make
a difference, but it's still a bug.
Also remove some redundant tests for NULL subexpressions.
Tom Lane [Fri, 29 Jul 2005 19:30:09 +0000 (19:30 +0000)]
Clean up a number of autovacuum loose ends. Make the stats collector
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly. Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries. Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
Bruce Momjian [Fri, 29 Jul 2005 03:23:00 +0000 (03:23 +0000)]
Done:
< * Consider use of open/fcntl(O_DIRECT) to minimize OS caching,
< especially for WAL writes
> * -Consider use of open/fcntl(O_DIRECT) to minimize OS caching,
> for WAL writes
> If we disable writeback-cache and use open_sync, the per-page writing
> behavior in WAL module will show up as bad result. O_DIRECT is similar
> to O_DSYNC (at least on Linux), so that the benefit of it will disappear
> behind the slow disk revolution.
>
> In the current source, WAL is written as:
> for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
> Is this intentional? Can we rewrite it as follows?
> write(&buffers[0], N * BLCKSZ);
>
> In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff).
> Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
> These two patches are independent, so either or both can be applied.
>
>
> I tested them on my machine and the results are as follows. It shows that
> direct-io and gather-write is the best choice when writeback-cache is off.
> Are these two patches worth trying if they are used together?
>
>
>             | writeback | fsync= | fdata | open_ | fsync_ | open_
> patch       | cache     | false  | sync  | sync  | direct | direct
> ------------+-----------+--------+-------+-------+--------+---------
> direct io   | off       | 124.2  | 105.7 | 48.3  | 48.3   | 48.2
> direct io   | on        | 129.1  | 112.3 | 114.1 | 142.9  | 144.5
> gather-write| off       | 124.3  | 108.7 | 105.4 | (N/A)  | (N/A)
> both        | off       | 131.5  | 115.5 | 114.4 | 145.4  | 145.2
>
> - 20 runs * pgbench -s 100 -c 50 -t 200
> - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
> - using 2 ATA disks:
> - hda(reiserfs) includes system and wal.
> - hdc(jfs) includes database files. writeback-cache is always on.
>
> ---
> ITAGAKI Takahiro
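The single-call form asked about in the message above can be sketched as below. This is not the xlog.gw.diff patch, whose contents are not shown here: the function name write_contiguous_blocks and the BLCKSZ value are assumptions, and the sketch relies on the N blocks being adjacent in memory.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumed block size for this sketch */

/*
 * Hypothetical sketch: write "nblocks" adjacent blocks with as few
 * write() calls as possible instead of one call per block. The loop
 * only repeats for partial writes or EINTR. Returns 0 on success,
 * -1 on error.
 */
static int
write_contiguous_blocks(int fd, const char *first_block, size_t nblocks)
{
    const char *p = first_block;
    size_t      remaining = nblocks * (size_t) BLCKSZ;

    while (remaining > 0)
    {
        ssize_t n = write(fd, p, remaining);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress; treat as an error */
        p += n;
        remaining -= (size_t) n;
    }
    return 0;
}
```

When the blocks to flush are not adjacent (for example, when the range wraps around the end of a circular buffer), one call per contiguous run is still needed.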
Bruce Momjian [Fri, 29 Jul 2005 03:17:55 +0000 (03:17 +0000)]
Thank you for applying patch --- regexp_replace.
An attached patch is a small additional improvement.
This patch uses appendStringInfoText instead of appendStringInfoString.
There is an overhead of PG_TEXT_GET_STR when appendStringInfoString is
executed by text type. This can be reduced by appendStringInfoText.
Tom Lane [Thu, 28 Jul 2005 20:26:22 +0000 (20:26 +0000)]
Fix a bunch of bad interactions between partial indexes and the new
planning logic for bitmap indexscans. Partial indexes create corner
cases in which a scan might be done with no explicit index qual conditions,
and the code wasn't handling those cases nicely. Also be a little
tenser about eliminating redundant clauses in the generated plan.
Per report from Dmitry Karasik.