Teodor Sigaev [Fri, 9 Jun 2006 13:25:59 +0000 (13:25 +0000)]
The ispell dictionary can now read dictionaries in the MySpell format
used by OpenOffice. Such dictionaries are available at
http://lingucomponent.openoffice.org/spell_dic.html
The dictionary automatically recognizes the format of the files.
Warning: the MySpell format has a limitation in its compound-word
support: it is impossible to mark an affix as a
compound-only affix. So for Norwegian, German, and similar
languages it is recommended to use the original ispell format.
For that reason I don't want to remove the my2ispell
scripts; they provide a workaround, at least for Norwegian.
Tom Lane [Thu, 8 Jun 2006 23:55:48 +0000 (23:55 +0000)]
Fix bootstrap.c so that database startup process and bgwriter properly release
LWLocks during a panic exit. This avoids the possible self-deadlock pointed
out by Qingqing Zhou. Also, I noted that an error during LoadFreeSpaceMap()
or BuildFlatFiles() would result in exit(0) which would leave the postmaster
thinking all is well. Added a critical section to ensure such errors don't
allow startup to proceed.
Backpatched to 8.1. The 8.0 code is a bit different and I'm not sure if the
problem exists there; given we've not seen this reported from the field, I'm
going to be conservative about backpatching any further.
Bruce Momjian [Thu, 8 Jun 2006 16:07:23 +0000 (16:07 +0000)]
Use simple URLs rather than text and a URL:
< recovery. See http://archives.postgresql.org/pgsql-patches/2005-04/msg00121.php.
> recovery.
> http://archives.postgresql.org/pgsql-patches/2005-04/msg00121.php
< Right now only one encoding is allowed per database. For a partial
< patch, see http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php.
> Right now only one encoding is allowed per database.
> http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php
459c460
< notify the protocol when a RESET CONNECTION command is used. See
> notify the protocol when a RESET CONNECTION command is used.
461d461
< for a partial implementation.
515c515
< See http://archives.postgresql.org/pgsql-patches/2006-02/msg00168.php.
> http://archives.postgresql.org/pgsql-patches/2006-02/msg00168.php
535c535
< See http://archives.postgresql.org/pgsql-hackers/2006-05/msg00988.php.
> http://archives.postgresql.org/pgsql-hackers/2006-05/msg00988.php
821c821
< See http://archives.postgresql.org/pgsql-patches/2005-07/msg00107.php.
> http://archives.postgresql.org/pgsql-patches/2005-07/msg00107.php
877c877
< Details at http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php.
> http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php
< See partially completed patch and additional work required at
< http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php.
> http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php
1297c1296
< See http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php.
> http://archives.postgresql.org/pgsql-patches/2006-05/msg00040.php
1311c1310,1311
< o Improve signal handling,
> o Improve signal handling
>
1312a1313
>
Bruce Momjian [Thu, 8 Jun 2006 15:41:22 +0000 (15:41 +0000)]
Add URL.
< * Support triggers on columns (Greg Sabino Mullane)
> * Support triggers on columns
>
> See http://archives.postgresql.org/pgsql-patches/2005-07/msg00107.php.
>
Tom Lane [Thu, 8 Jun 2006 14:58:33 +0000 (14:58 +0000)]
Remove obsolete comment about VACUUM FULL: it takes buffer content locks
now, and must do so to ensure bgwriter doesn't write a page that is in
process of being compacted.
Bruce Momjian [Thu, 8 Jun 2006 14:32:11 +0000 (14:32 +0000)]
/contrib/adminpack: More clearly identify renaming of existing backend
functions. I also found that pg_file_length was incorrectly documented
in the README as pg_file_size.
Bruce Momjian [Thu, 8 Jun 2006 02:42:44 +0000 (02:42 +0000)]
Add URL:
< Right now only one encoding is allowed per database.
> Right now only one encoding is allowed per database. For a partial
> patch, see http://archives.postgresql.org/pgsql-hackers/2005-03/msg00932.php.
Bruce Momjian [Thu, 8 Jun 2006 01:02:53 +0000 (01:02 +0000)]
Add entry:
> * Consider allowing control of upper/lower case folding of unquoted
> identifiers
>
> Details at http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php.
Bruce Momjian [Wed, 7 Jun 2006 22:24:46 +0000 (22:24 +0000)]
Prepare code to be built by MSVC:
o remove many WIN32_CLIENT_ONLY defines
o add WIN32_ONLY_COMPILER define
o add 3rd argument to open() for portability
o add include/port/win32_msvc directory for
system includes
Tom Lane [Wed, 7 Jun 2006 18:49:03 +0000 (18:49 +0000)]
Per previous analysis, the most correct notion of SampleOverhead is that
it is just the total time to do INSTR_TIME_SET_CURRENT(), and not any of
the other code involved in InstrStartNode/InstrStopNode. Even though I
fear we may end up reverting this patch altogether, we may as well have
the most correct version in our CVS archive.
Tom Lane [Wed, 7 Jun 2006 17:08:07 +0000 (17:08 +0000)]
Remove "fuzzy comparison" logic in qsort comparison function for
choose_bitmap_and(). It was way too fuzzy --- per comment, it was meant to be
1% relative difference, but was actually coded as 0.01 absolute difference,
thus causing selectivities of say 0.001 and 0.000000000001 to be treated as
equal. I believe this thinko explains Maxim Boguk's recent complaint. While
we could change it to a relative test coded like compare_fuzzy_path_costs(),
there's a bigger problem here, which is that any fuzziness at all renders the
comparison function non-transitive, which could confuse qsort() to the point
of delivering completely wrong results. So forget the whole thing and just
do an exact comparison.
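
To illustrate the two problems described above, here is a minimal Python sketch (not the planner's C code; the function names are hypothetical). It shows that an absolute tolerance of 0.01 treats 0.001 and 0.000000000001 as equal, and that even a properly relative 1% test is non-transitive.

```python
def fuzzy_cmp_absolute(a, b, fuzz=0.01):
    """Three-way compare that treats values within an ABSOLUTE fuzz as equal."""
    if abs(a - b) <= fuzz:
        return 0
    return -1 if a < b else 1


def fuzzy_cmp_relative(a, b, fuzz=0.01):
    """Three-way compare that treats values within 1% of the larger as equal."""
    if abs(a - b) <= fuzz * max(abs(a), abs(b)):
        return 0
    return -1 if a < b else 1


def exact_cmp(a, b):
    """Plain three-way comparison with no tolerance."""
    return (a > b) - (a < b)


def violates_transitivity(cmp, a, b, c):
    """True if cmp says a == b and b == c but does not say a == c."""
    return cmp(a, b) == 0 and cmp(b, c) == 0 and cmp(a, c) != 0


# The absolute test cannot tell these two selectivities apart.
same_under_absolute = fuzzy_cmp_absolute(0.001, 0.000000000001) == 0

# Even the relative test chains "equal" pairs into an unequal one.
relative_is_broken = violates_transitivity(fuzzy_cmp_relative, 1.000, 1.008, 1.016)
exact_is_broken = violates_transitivity(exact_cmp, 1.000, 1.008, 1.016)
```

Only the exact comparison is a consistent ordering, which is what a sort routine needs from its comparator.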
Tom Lane [Tue, 6 Jun 2006 17:59:58 +0000 (17:59 +0000)]
Make the planner estimate costs for nestloop inner indexscans on the basis
that the Mackert-Lohman formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
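
As a rough illustration of why applying the formula across all repetitions lowers the estimate, here is a Python sketch of the Mackert-Lohman page-fetch approximation as it is commonly stated. The commit message does not spell the formula out, so the function name, the parameter names, and the example numbers are assumptions for illustration only.

```python
import math


def ml_pages_fetched(tuples_fetched, table_pages, cache_pages):
    """Estimate pages fetched for `tuples_fetched` tuples from a table of
    `table_pages` pages, given `cache_pages` buffer pages (all hypothetical
    inputs). Follows the usual three-case statement of the approximation."""
    T = max(float(table_pages), 1.0)
    b = max(float(cache_pages), 1.0)
    Ns = float(tuples_fetched)

    if T <= b:
        # Whole table fits in cache: never fetch more than T pages.
        pages = min((2.0 * T * Ns) / (2.0 * T + Ns), T)
    else:
        lim = (2.0 * T * b) / (2.0 * T - b)
        if Ns <= lim:
            pages = (2.0 * T * Ns) / (2.0 * T + Ns)
        else:
            # Cache is full; further fetches miss with probability (T-b)/T.
            pages = b + (Ns - lim) * (T - b) / T
    return math.ceil(pages)


# Hypothetical nestloop: 1000 inner scans, 10 tuples each, 1000-page table.
loops, tuples_per_scan, table_pages, cache_pages = 1000, 10, 1000, 10000

# Old view: every scan is costed independently.
independent_total = loops * ml_pages_fetched(tuples_per_scan, table_pages, cache_pages)

# New view: one application of the formula across all repetitions.
combined_total = ml_pages_fetched(loops * tuples_per_scan, table_pages, cache_pages)
```

With these made-up numbers the independent estimate is ten times the combined one, because the combined estimate can never exceed the table size when the table fits in cache.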
Bruce Momjian [Tue, 6 Jun 2006 16:27:23 +0000 (16:27 +0000)]
Add URL to RESET CONNECTION:
< notify the protocol when a RESET CONNECTION command is used.
> notify the protocol when a RESET CONNECTION command is used. See
> http://archives.postgresql.org/pgsql-patches/2006-04/msg00192.php
> for a partial implementation.
Teodor Sigaev [Tue, 6 Jun 2006 16:25:55 +0000 (16:25 +0000)]
Allow words in the substitution not to be lexized.
Docs will be submitted later; for now they are at
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Thesaurus_dictionary
Tom Lane [Mon, 5 Jun 2006 20:56:33 +0000 (20:56 +0000)]
While making the seq_page_cost changes, I was struck by the fact that
cost_nonsequential_access() is really totally inappropriate for its only
remaining use, namely estimating I/O costs in cost_sort(). The routine
was designed on the assumption that disk caching might eliminate the need
for some re-reads on a random basis, but there's nothing very random in
that sense about sort's access pattern --- it'll always be picking up the
oldest outputs. If we had a good fix on the effective cache size we
might consider charging zero for I/O unless the sort temp file size
exceeds it, but that's probably putting much too much faith in the
parameter. Instead just drop the logic in favor of a fixed compromise
between seq_page_cost and random_page_cost per page of sort I/O.
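
A minimal Python sketch of what a "fixed compromise" per page could look like. The commit message does not give the weighting, so the 75/25 split below is an assumption, as are the function name and the defaults (1.0 and 4.0 are the usual defaults for the two settings).

```python
def sort_io_cost(page_accesses, seq_page_cost=1.0, random_page_cost=4.0,
                 seq_weight=0.75):
    """Charge each page of sort I/O a fixed blend of the two page costs.

    seq_weight is a hypothetical knob: the share of the per-page charge
    taken from seq_page_cost; the rest comes from random_page_cost.
    """
    per_page = seq_weight * seq_page_cost + (1.0 - seq_weight) * random_page_cost
    return page_accesses * per_page


# 100 page accesses at the assumed weighting and default costs.
example_cost = sort_io_cost(100)
```

The point of the design is that the charge no longer depends on any guess about cache hit rates, only on the two page-cost settings.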
Tom Lane [Mon, 5 Jun 2006 03:03:42 +0000 (03:03 +0000)]
Increase the default value of cpu_index_tuple_cost from 0.001 to 0.005.
This shouldn't affect simple indexscans much, while for bitmap scans that
are touching a lot of index rows, this seems to bring the estimates more
in line with reality. Per recent discussion.
Tom Lane [Mon, 5 Jun 2006 02:49:58 +0000 (02:49 +0000)]
Add a GUC parameter seq_page_cost, and use that everywhere we formerly
assumed that a sequential page fetch has cost 1.0. This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs. Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
Bruce Momjian [Sun, 4 Jun 2006 01:33:39 +0000 (01:33 +0000)]
Update:
< o Allow COPY to output from views
> o Allow COPY to output from SELECT
570c570
< Another idea would be to allow actual SELECT statements in a COPY.
> COPY should also be able to output views.
Tom Lane [Sat, 3 Jun 2006 17:36:10 +0000 (17:36 +0000)]
Don't choke during startup if the environment offers an invalid value
for LC_MESSAGES; instead, just press forward, leaving the effective setting
at 'C'. There is not any very good reason to complain when we are going
to replace the value soon with whatever postgresql.conf says. This change
should solve the occasionally-reported problem of initdb failing with
'failed to initialize lc_messages'; the current theory is that that is
a reflection of either wrong LANG/LC_MESSAGES or completely broken locale
support.
Bruce Momjian [Sat, 3 Jun 2006 04:00:01 +0000 (04:00 +0000)]
Record location of partial patch:
> * Allow WAL information to recover corrupted pg_controldata
>
> See partially completed patch and additional work required at
> http://archives.postgresql.org/pgsql-patches/2006-06/msg00025.php.
>
Tom Lane [Thu, 1 Jun 2006 00:15:36 +0000 (00:15 +0000)]
Fix up hack to suppress escape_string_warning so that it actually works
and there's only one place that's a kluge, ie, appendStringLiteralConn.
Note that pg_dump itself doesn't use appendStringLiteralConn, so its
behavior is not affected; only the other utility programs care.
Tom Lane [Wed, 31 May 2006 20:58:09 +0000 (20:58 +0000)]
Make PG_MODULE_MAGIC required in shared libraries that are loaded into
the server. Per discussion, there seems no point in a waiting period
before making this required.
Teodor Sigaev [Wed, 31 May 2006 14:05:31 +0000 (14:05 +0000)]
Add a thesaurus dictionary which can replace N>0 lexemes with M>0 lexemes.
It required some changes in the lexize algorithm, but the interface with
dictionaries stays compatible with old dictionaries.
Funded by Georgia Public Library Service and LibLime, Inc.
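
The idea of replacing N lexemes with M lexemes can be sketched in Python as follows. This is not the tsearch2 implementation; the rule format, the function name, and the longest-match-first policy are all assumptions made for illustration.

```python
def apply_thesaurus(lexemes, rules):
    """Replace runs of lexemes according to `rules`.

    rules maps a tuple of N>0 lexemes to a list of M>0 replacement lexemes.
    At each position the longest matching rule wins (an assumed policy).
    """
    max_len = max((len(key) for key in rules), default=0)
    out = []
    i = 0
    while i < len(lexemes):
        for n in range(min(max_len, len(lexemes) - i), 0, -1):
            key = tuple(lexemes[i:i + n])
            if key in rules:
                out.extend(rules[key])
                i += n
                break
        else:
            # No rule starts here: pass the lexeme through unchanged.
            out.append(lexemes[i])
            i += 1
    return out


# Hypothetical rules: a 2-to-1 and a 1-to-2 substitution.
rules = {
    ("supernova", "star"): ["sn"],
    ("ny",): ["new", "york"],
}
result = apply_thesaurus(["ny", "supernova", "star", "list"], rules)
```

Because a rule may need to look at several input lexemes before deciding, the caller cannot hand lexemes to the dictionary strictly one at a time, which is presumably why the lexize algorithm had to change.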
Bruce Momjian [Wed, 31 May 2006 11:02:42 +0000 (11:02 +0000)]
Escape processing patch:
o turns off escape_string_warning in pg_dumpall.c
o optionally use E'' for \password (undocumented option?)
o honor standard_conforming_strings for \copy (but do not
support literal E'' strings)
o optionally use E'' for \d commands
o turn off escape_string_warning for createdb, createuser,
droplang
Tom Lane [Tue, 30 May 2006 21:21:30 +0000 (21:21 +0000)]
Code review for magic-block patch. Remove separate header file pgmagic.h,
as this seems only likely to create headaches for module developers. Put
the macro in the pre-existing fmgr.h file instead. Avoid being too cute
about how many fields we can cram into a word, and avoid trying to fetch
from a library we've already unlinked.
Along the way, it occurred to me that the magic block really ought to be
'const' so it can be stored in the program text area. Do the same for
the existing data blocks for PG_FUNCTION_INFO_V1 functions.
Tom Lane [Tue, 30 May 2006 19:24:25 +0000 (19:24 +0000)]
Code review for EXPLAIN patch. Fix some typos, make it behave sanely
across multiple loops, get rid of the shaky assumption that exactly one
tuple is returned per node iteration.
Tom Lane [Tue, 30 May 2006 15:48:20 +0000 (15:48 +0000)]
Update ppport.h to not cause warnings with newest Perl versions.
This is just the minimal necessary change; we might want to adopt
later PPPort output instead.
Bruce Momjian [Tue, 30 May 2006 14:09:32 +0000 (14:09 +0000)]
Add pgmagic header block to store compile-time constants:
It now only checks four things:
Major version number (7.4 or 8.1 for example)
NAMEDATALEN
FUNC_MAX_ARGS
INDEX_MAX_KEYS
The three constants were chosen because:
1. We document them in the config page in the docs
2. We mark them as changeable in pg_config_manual.h
3. Changing any of these will break some of the more popular modules:
FUNC_MAX_ARGS changes fmgr interface, every module uses this
NAMEDATALEN changes syscache interface, every PL as well as tsearch uses this
INDEX_MAX_KEYS breaks tsearch and anything using GiST.
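
The four-field check can be modeled in a few lines of Python. The real block is a C struct compiled into each module; the class, field names, and sample values below are hypothetical and only mirror the four items listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MagicBlock:
    """Compile-time constants recorded by a build (illustrative model)."""
    major_version: str   # e.g. "7.4" or "8.1"
    namedatalen: int
    func_max_args: int
    index_max_keys: int


def incompatibilities(server, module):
    """Return the names of the fields on which module and server disagree."""
    return [f.name for f in fields(MagicBlock)
            if getattr(server, f.name) != getattr(module, f.name)]


# Sample values chosen for illustration only.
server = MagicBlock("8.1", 64, 100, 32)
good_module = MagicBlock("8.1", 64, 100, 32)
bad_module = MagicBlock("8.1", 64, 32, 32)

good_result = incompatibilities(server, good_module)
bad_result = incompatibilities(server, bad_module)
```

A loader built on this would refuse any module whose result list is non-empty and could report exactly which constant differs.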
Bruce Momjian [Tue, 30 May 2006 12:56:45 +0000 (12:56 +0000)]
Redefine SHA2 symbols so that they do not conflict with certain
versions of OpenSSL. If your OpenSSL does not contain SHA2, then there
should be no conflict. But of course, if someone upgrades OpenSSL,
the server starts crashing.
Bruce Momjian [Tue, 30 May 2006 11:40:21 +0000 (11:40 +0000)]
Update PL documentation:
An article at WebProNews quoted from the PG docs as to the merits of
stored procedures. I have added a bit more material on their merits,
as well as making a few changes to improve the introductions to
PL/Perl and PL/Tcl.
Delay write of pg_stats file to once every five minutes, during
shutdown, or when requested by a backend:
This changes things so that the file is written only once every 5 minutes
(changeable, of course; I just picked something) instead of once every half second.
It's still written when the stats collector shuts down, just as before.
And it is now also written on backend request. A backend requests a
rewrite by simply sending a special stats message. It operates on the
assumption that the backends aren't actually going to read the
statistics file very often, compared to how frequently it's written today.
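
The three write triggers described above (interval elapsed, shutdown, backend request) can be sketched in Python like this. The class and method names are hypothetical and the clock is injectable for testing; the actual stats collector is C code and is structured differently.

```python
import time


class ThrottledStatsWriter:
    """Write the stats file at most once per interval, unless asked."""

    def __init__(self, write, interval=300.0, clock=time.monotonic):
        self._write = write          # callable that actually writes the file
        self._interval = interval    # seconds between routine writes
        self._clock = clock
        self._last_write = clock()
        self._requested = False

    def request_write(self):
        """Called when a backend's special 'please rewrite' message arrives."""
        self._requested = True

    def tick(self):
        """Called from the main loop; writes only if due or requested."""
        now = self._clock()
        if self._requested or now - self._last_write >= self._interval:
            self._write()
            self._last_write = now
            self._requested = False
            return True
        return False

    def shutdown(self):
        """Always write once more on the way out, as before."""
        self._write()
```

A caller would invoke `tick()` from its message loop and `request_write()` when the special message is received, so readers pay for a fresh file only when they actually ask for one.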