PostgreSQL TODO List
====================
Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
-Last updated: Mon Jul 4 08:32:37 EDT 2005
+Last updated: Mon Jul 4 13:00:23 EDT 2005
The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.TODO.html.
* Remove behavior of postmaster -o after making postmaster/postgres
flags unique
* Allow limits on per-db/role connections
-* Allow server log information to be output as INSERT statements
-
- This would allow server log information to be easily loaded into
- a database for analysis.
-
-* Prevent dropping user that still owns objects, or auto-drop the objects
* Allow pooled connections to list all prepared queries
This would allow an application inheriting a pooled connection to know
* Allow major upgrades without dump/reload, perhaps using pg_upgrade
[pg_upgrade]
-* Allow GRANT/REVOKE permissions to be applied to all schema objects with one
- command
-
- The proposed syntax is:
- GRANT SELECT ON ALL TABLES IN public TO phpuser;
- GRANT SELECT ON NEW TABLES IN public TO phpuser;
-
-* Allow GRANT/REVOKE permissions to be inherited by objects based on
- schema permissions
* Check for unreferenced table files created by transactions that were
in-progress when the server terminated abruptly
-* Allow reporting of which objects are in which tablespaces
+* Allow administrators to safely terminate individual sessions either
+ via an SQL function or SIGTERM
- This item is difficult because a tablespace can contain objects from
- multiple databases. There is a server-side function that returns the
- databases which use a specific tablespace, so this requires a tool
- that will call that function and connect to each database to find the
- objects in each database for that tablespace.
+ Currently SIGTERM of a backend can lead to lock table corruption.
-* Allow a database in tablespace t1 with tables created in tablespace t2
- to be used as a template for a new database created with default
- tablespace t2
+* Prevent dropping user that still owns objects, or auto-drop the objects
+* Set proper permissions on non-system schemas during db creation
- All objects in the default database tablespace must have default tablespace
- specifications. This is because new databases are created by copying
- directories. If you mix default tablespace tables and tablespace-specified
- tables in the same directory, creating a new database from such a mixed
- directory would create a new database with tables that had incorrect
- explicit tablespaces. To fix this would require modifying pg_class in the
- newly copied database, which we don't currently do.
+ Currently all schemas are owned by the super-user because they are
+ copied from the template1 database.
-* Add a GUC variable to control the tablespace for temporary objects and
- sort files
+* -Add the client IP address and port to pg_stat_activity
+* Support table partitioning that allows a single table to be stored
+ in subtables that are partitioned based on the primary key or a WHERE
+ clause
- It could start with a random tablespace from a supplied list and cycle
- through the list.
-* Add ability to monitor the use of temporary sort files
-* Allow WAL replay of CREATE TABLESPACE to work when the directory
- structure on the recovery computer is different from the original
-* Add "include file" functionality in postgresql.conf
-* -Add session start time and last statement time to pg_stat_activity
-* Allow server logs to be remotely read using SQL commands
-* Allow pg_hba.conf settings to be controlled via SQL
+* Improve replication solutions
+ o Load balancing
- This would require a new global table that is dumped to flat file for
- use by the postmaster. We do a similar thing for pg_shadow currently.
+ You can use any of the master/slave replication servers to use a
+ standby server for data warehousing. To allow read/write queries to
+ multiple servers, you need multi-master replication like pgcluster.
-* Allow administrators to safely terminate individual sessions either
- via an SQL function or SIGTERM
+ o Allow replication over unreliable or non-persistent links
- Currently SIGTERM of a backend can lead to lock table corruption.
-* Un-comment all variables in postgresql.conf
+* Configuration files
+ o Add "include file" functionality in postgresql.conf
+ o Allow postgresql.conf values to be set so they can not be changed
+ by the user
+ o Allow commenting of variables in postgresql.conf to restore them
+ to defaults
+ o Allow pg_hba.conf settings to be controlled via SQL
- By not showing commented-out variables, we discourage people from
- thinking that re-commenting a variable returns it to its default.
- This has to address environment variables that are then overridden
- by config file values. Another option is to allow commented values
- to return to their default values.
+ This would require a new global table that is dumped to flat file for
+ use by the postmaster. We do a similar thing for pg_shadow currently.
-* Allow point-in-time recovery to archive partially filled write-ahead
- logs [pitr]
- Currently only full WAL files are archived. This means that the most
- recent transactions aren't available for recovery in case of a disk
- failure. This could be triggered by a user command or a timer.
+* Tablespaces
+ * Allow a database in tablespace t1 with tables created in
+ tablespace t2 to be used as a template for a new database created
+ with default tablespace t2
-* Automatically force archiving of partially-filled WAL files when
- pg_stop_backup() is called or the server is stopped
+ All objects in the default database tablespace must have default
+ tablespace specifications. This is because new databases are
+ created by copying directories. If you mix default tablespace
+ tables and tablespace-specified tables in the same directory,
+ creating a new database from such a mixed directory would create a
+ new database with tables that had incorrect explicit tablespaces.
+ To fix this would require modifying pg_class in the newly copied
+ database, which we don't currently do.
- Doing this will allow administrators to know more easily when the
- archive contins all the files needed for point-in-time recovery.
+ * Allow reporting of which objects are in which tablespaces
-* Create dump tool for write-ahead logs for use in determining
- transaction id for point-in-time recovery
-* Set proper permissions on non-system schemas during db creation
+ This item is difficult because a tablespace can contain objects
+ from multiple databases. There is a server-side function that
+ returns the databases which use a specific tablespace, so this
+ requires a tool that will call that function and connect to each
+ database to find the objects in each database for that tablespace.
- Currently all schemas are owned by the super-user because they are
- copied from the template1 database.
+ o Add a GUC variable to control the tablespace for temporary objects
+ and sort files
-* Add a function that returns the 'uptime' of the postmaster
-* Allow a warm standby system to also allow read-only queries [pitr]
+ It could start with a random tablespace from a supplied list and
+ cycle through the list.
- This is useful for checking PITR recovery.
+ o Allow WAL replay of CREATE TABLESPACE to work when the directory
+ structure on the recovery computer is different from the original
-* Allow the PITR process to be debugged and data examined
-* -Add the client IP address and port to pg_stat_activity
-* Improve replication solutions
- o Load balancing
+ o Allow per-tablespace quotas
- You can use any of the master/slave replication servers to use a
- standby server for data warehousing. To allow read/write queries to
- multiple servers, you need multi-master replication like pgcluster.
- o Allow replication over unreliable or non-persistent links
+* Point-in-time Recovery (PITR)
+ o Allow point-in-time recovery to archive partially filled
+ write-ahead logs [pitr]
-* Support table partitioning that allows a single table to be stored
- in subtables that are partitioned based on the primary key or a WHERE
- clause
-* Allow postgresql.conf values to be set so they can not be changed by
- the user
-* Allow per-tablespace quotas
+ Currently only full WAL files are archived. This means that the
+ most recent transactions aren't available for recovery in case
+ of a disk failure. This could be triggered by a user command or
+ a timer.
+
+ o Automatically force archiving of partially-filled WAL files when
+ pg_stop_backup() is called or the server is stopped
+
+ Doing this will allow administrators to know more easily when
+ the archive contins all the files needed for point-in-time
+ recovery.
+
+ o Create dump tool for write-ahead logs for use in determining
+ transaction id for point-in-time recovery
+ o Allow a warm standby system to also allow read-only queries
+ [pitr]
+
+ This is useful for checking PITR recovery.
+
+ o Allow the PITR process to be debugged and data examined
+
+
+Monitoring
+==========
+
+* Allow server log information to be output as INSERT statements
+
+ This would allow server log information to be easily loaded into
+ a database for analysis.
+
+* Add ability to monitor the use of temporary sort files
+* -Add session start time and last statement time to pg_stat_activity
+* Allow server logs to be remotely read using SQL commands
+* -Add a function that returns the start time of the postmaster
Data Types
The positive modulus result returned by NUMERICs might be considered
inaccurate, in one sense.
-* Add function to return compressed length of TOAST data values
-* Allow INET subnet tests using non-constants to be indexed
-* Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
- functionality
-
- Current CURRENT_TIMESTAMP returns the start time of the current
- transaction, and gettimeofday() returns the wallclock time. This will
- make time reporting more consistent and will allow reporting of
- the statement start time.
-
* Have sequence dependency track use of DEFAULT sequences,
- seqname.nextval (?)
-* Disallow changing default expression of a SERIAL column (?)
+ seqname.nextval?
+* Disallow changing default expression of a SERIAL column?
* Allow infinite dates just like infinite timestamps
-* Have initdb set DateStyle based on locale?
-* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
-* Allow to_char() to print localized month names
-* Allow functions to have a schema search path specified at creation time
-* Allow substring/replace() to get/set bit values
* Add a GUC variable to allow output of interval values in ISO8601 format
* Fix data types where equality comparison isn't intuitive, e.g. box
* Merge hardwired timezone names with the TZ database; allow either kind
* Allow INET + INT4 to increment the host part of the address, or
throw an error on overflow
* Add 'tid != tid ' operator for use in corruption recovery
-* Prevent to_char() on interval from returning meaningless values
-
- For example, to_char('1 month', 'mon') is meaningless. Basically,
- most date-related parameters to to_char() are meaningless for
- intervals because interval is not anchored to a date.
-* Allow to_char() on interval values to accumulate the highest unit
- requested
-
- Some special format flag would be required to request such
- accumulation. Such functionality could also be added to EXTRACT.
- Prevent accumulation that crosses the month/day boundary because of
- the uneven number of days in a month.
-
- o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
- o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
- o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
- o to_char(INTERVAL '3 years 5 months','MM') => 41
* Add ISO INTERVAL handling
o Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO SECOND
INTERVAL '11 months' AS YEAR should return one or zero
o Support precision, CREATE TABLE foo (a INTERVAL MONTH(3))
+
* ARRAYS
o Allow NULLs in arrays
o Allow MIN()/MAX() on arrays
* BINARY DATA
- o Improve vacuum of large objects, like /contrib/vacuumlo (?)
+ o Improve vacuum of large objects, like /contrib/vacuumlo?
o Add security checking for large objects
Currently large objects entries do not have owners. Permissions can
This requires the TOAST column to be stored EXTERNAL.
+Functions
+=========
+
+* Add function to return compressed length of TOAST data values
+* Allow INET subnet tests using non-constants to be indexed
+* Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
+ functionality
+
+ Current CURRENT_TIMESTAMP returns the start time of the current
+ transaction, and gettimeofday() returns the wallclock time. This will
+ make time reporting more consistent and will allow reporting of
+ the statement start time.
+
+* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
+* Allow to_char() to print localized month names
+* Allow functions to have a schema search path specified at creation time
+* Allow substring/replace() to get/set bit values
+* Allow to_char() on interval values to accumulate the highest unit
+ requested
+
+ Some special format flag would be required to request such
+ accumulation. Such functionality could also be added to EXTRACT.
+ Prevent accumulation that crosses the month/day boundary because of
+ the uneven number of days in a month.
+
+ o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
+ o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
+ o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
+ o to_char(INTERVAL '3 years 5 months','MM') => 41
+
+* Prevent to_char() on interval from returning meaningless values
+
+ For example, to_char('1 month', 'mon') is meaningless. Basically,
+ most date-related parameters to to_char() are meaningless for
+ intervals because interval is not anchored to a date.
+
+
Multi-Language Support
======================
Right now only one encoding is allowed per database.
* Support multiple simultaneous character sets, per SQL92
-* Improve UTF8 combined character handling (?)
+* Improve UTF8 combined character handling?
* Add octet_length_server() and octet_length_client()
* Make octet_length_client() the same as octet_length()?
* Allow RULE recompilation
-Indexes
-=======
-
-* Allow inherited tables to inherit index, UNIQUE constraint, and primary
- key, foreign key
-* UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
- inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
- (dup) should fail
-
- The main difficulty with this item is the problem of creating an index
- that can span more than one table.
-
-* Add UNIQUE capability to non-btree indexes
-* Add more gist index support for geometric data types
-* -Use indexes for MIN() and MAX()
-
- MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
- BY col {DESC} LIMIT 1. Completing this item involves doing this
- transformation automatically.
-
-* -Use index to restrict rows returned by multi-key index when used with
- non-consecutive keys to reduce heap accesses
-
- For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
- col3 = 9, spin though the index checking for col1 and col3 matches,
- rather than just col1; also called skip-scanning.
-
-* Prevent index uniqueness checks when UPDATE does not modify the column
-
- Uniqueness (index) checks are done when updating a column even if the
- column is not modified by the UPDATE.
-
-* Fetch heap pages matching index entries in sequential order
-
- Rather than randomly accessing heap pages based on index entries, mark
- heap pages needing access in a bitmap and do the lookups in sequential
- order. Another method would be to sort heap ctids matching the index
- before accessing the heap rows.
-
-* -Allow non-bitmap indexes to be combined by creating bitmaps in memory
-
- This feature allows separate indexes to be ANDed or ORed together. This
- is particularly useful for data warehousing applications that need to
- query the database in an many permutations. This feature scans an index
- and creates an in-memory bitmap, and allows that bitmap to be combined
- with other bitmap created in a similar way. The bitmap can either index
- all TIDs, or be lossy, meaning it records just page numbers and each
- page tuple has to be checked for validity in a separate pass.
-
-* Allow the creation of on-disk bitmap indexes which can be quickly
- combined with other bitmap indexes
-
- Such indexes could be more compact if there are only a few distinct values.
- Such indexes can also be compressed. Keeping such indexes updated can be
- costly.
-
-* Allow use of indexes to search for NULLs
-
- One solution is to create a partial index on an IS NULL expression.
-
-* -Add concurrency to GIST
-* Pack hash index buckets onto disk pages more efficiently
-
- Currently no only one hash bucket can be stored on a page. Ideally
- several hash buckets could be stored on a single page and greater
- granularity used for the hash algorithm.
-
-* Consider sorting hash buckets so entries can be found using a binary
- search, rather than a linear scan
-* In hash indexes, consider storing the hash value with or instead
- of the key itself
-* Allow accurate statistics to be collected on indexes with more than
- one column or expression indexes, perhaps using per-index statistics
-* Add fillfactor to control reserved free space during index creation
-* Allow the creation of indexes with mixed ascending/descending specifiers
-* -Fix incorrect rtree results due to wrong assumptions about "over"
- operator semantics
-* Allow GIST indexes to create certain complex index types, like digital
- trees (see Aoki)
-
-
Commands
========
* Change LIMIT/OFFSET and FETCH/MOVE to use int8
* Allow CREATE TABLE AS to determine column lengths for complex
expressions like SELECT col1 || col2
-* Allow UPDATE to handle complex aggregates [update] (?)
-* Allow backslash handling in quoted strings to be disabled for portability
-
- The use of C-style backslashes (.e.g. \n, \r) in quoted strings is not
- SQL-spec compliant, so allow such handling to be disabled. However,
- disabling backslashes could break many third-party applications and tools.
+* Allow UPDATE to handle complex aggregates [update]?
+* -Add E'' escape string marker so eventually ordinary strings can treat
+ backslashes literally, for portability
* Allow an alias to be provided for the target table in UPDATE/DELETE
functionality in DELETE. It's been agreed that the keyword should
be USING, to avoid anything as confusing as DELETE FROM a FROM b.
+* Have initdb set DateStyle based on locale?
* Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
* -Allow REINDEX to rebuild all database indexes
* Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
Currently the system uses the operating system COPY command to create
a new database.
-* Add C code to copy directories for use in creating new databases
+* Add C code on Unix to copy directories for use in creating new databases
* Have pg_ctl look at PGHOST in case it is a socket directory?
* Allow pg_ctl to work properly with configuration files located outside
the PGDATA directory
allow pg_ctl to read and understand postgresql.conf to find the
data_directory value.
-* Allow column-level GRANT/REVOKE privileges
* Add a GUC variable to warn about non-standard SQL usage in queries
* Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
triggers?)
creation
* Add COMMENT ON for all cluster global objects (roles, databases
and tablespaces)
-* Add an option to automatically use savepoints for each statement in a
+* -Add an option to automatically use savepoints for each statement in a
multi-statement transaction.
When enabled, this would allow errors in multi-statement transactions
* Allow FOR UPDATE queries to do NOWAIT locks
* Add GUC to issue notice about queries that use unjoined tables
+
* ALTER
o Have ALTER TABLE RENAME rename SERIAL sequence names
o Add ALTER DOMAIN TYPE
processed, with ROLLBACK on COPY failure.
o -Allow COPY to understand \x as a hex byte
- o Have COPY return the number of rows loaded/unloaded (?)
+ o Have COPY return the number of rows loaded/unloaded?
o -Allow COPY to optionally include column headings in the first line
o -Allow COPY FROM ... CSV to interpret newlines and carriage
returns in data
+* GRANT/REVOKE
+ o Allow column-level privileges
+ * Allow GRANT/REVOKE permissions to be applied to all schema objects with one
+ command
+
+ The proposed syntax is:
+ GRANT SELECT ON ALL TABLES IN public TO phpuser;
+ GRANT SELECT ON NEW TABLES IN public TO phpuser;
+
+ * Allow GRANT/REVOKE permissions to be inherited by objects based on
+ schema permissions
+
+
* CURSOR
o Allow UPDATE/DELETE WHERE CURRENT OF cursor
and no FOR UPDATE lock.
o Prevent DROP TABLE from dropping a row referenced by its own open
- cursor (?)
+ cursor?
o Allow pooled connections to list all open WITH HOLD cursors
o -Have SHOW ALL show descriptions for server-side variables
o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
ANALYZE, and CLUSTER
- o Add SET PATH for schemas (?)
+ o Add SET PATH for schemas?
This is basically the same as SET search_path.
o Allow function parameters to be passed by name,
get_employee_salary(emp_id => 12345, tax_year => 2001)
o Add Oracle-style packages
- o Add table function support to pltcl, plperl, plpython (?)
+ o Add table function support to pltcl, plperl, plpython?
o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
o -Allow PL/pgSQL EXECUTE query_var INTO record_var;
o Add capability to create and call PROCEDURES
=======
* Add a libpq function to support Parse/DescribeStatement capability
-* Prevent libpq's PQfnumber() from lowercasing the column name (?)
+* Prevent libpq's PQfnumber() from lowercasing the column name?
* Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
This would be used for checking if the server is up.
* Allow psql \pset boolean variables to set to fixed values, rather than toggle
* Consistently display privilege information for all objects in psql
* Improve psql's handling of multi-line queries
+
+
* pg_dump
o Have pg_dump use multi-statement transactions for INSERT dumps
o Allow pg_dump to use multiple -t and -n switches [pg_dump]
o Add CSV output format
o Update pg_dump and psql to use the new COPY libpq API (Christopher)
+
* ECPG
o Docs
Document differences between ecpg and the SQL standard and
information about the Informix-compatibility module.
- o Solve cardinality > 1 for input descriptors / variables (?)
+ o Solve cardinality > 1 for input descriptors / variables?
o Add a semantic check level, e.g. check if a table really exists
o fix handling of DB attributes that are arrays
o Use backend PREPARE/EXECUTE facility for ecpg where possible
* -Implement shared row locks and use them in RI triggers
* Enforce referential integrity for system tables
* Change foreign key constraint for array -> element to mean element
- in array (?)
-* Allow DEFERRABLE UNIQUE constraints (?)
+ in array?
+* Allow DEFERRABLE UNIQUE constraints?
* Allow triggers to be disabled [trigger]
Currently the only way to disable triggers is to modify the system
o Allow nested schemas
-PERFORMANCE
-===========
+Indexes
+=======
+
+* Allow inherited tables to inherit index, UNIQUE constraint, and primary
+ key, foreign key
+* UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
+ inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
+ (dup) should fail
+
+ The main difficulty with this item is the problem of creating an index
+ that can span more than one table.
+
+* Add UNIQUE capability to non-btree indexes
+* Add more GIST index support for geometric data types
+* -Use indexes for MIN() and MAX()
+
+ MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
+ BY col {DESC} LIMIT 1. Completing this item involves doing this
+ transformation automatically.
+
+* -Use index to restrict rows returned by multi-key index when used with
+ non-consecutive keys to reduce heap accesses
+
+ For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
+ col3 = 9, spin though the index checking for col1 and col3 matches,
+ rather than just col1; also called skip-scanning.
+
+* Prevent index uniqueness checks when UPDATE does not modify the column
+
+ Uniqueness (index) checks are done when updating a column even if the
+ column is not modified by the UPDATE.
+
+* Fetch heap pages matching index entries in sequential order
+
+ Rather than randomly accessing heap pages based on index entries, mark
+ heap pages needing access in a bitmap and do the lookups in sequential
+ order. Another method would be to sort heap ctids matching the index
+ before accessing the heap rows.
+
+* -Allow non-bitmap indexes to be combined by creating bitmaps in memory
+
+ This feature allows separate indexes to be ANDed or ORed together. This
+ is particularly useful for data warehousing applications that need to
+ query the database in an many permutations. This feature scans an index
+ and creates an in-memory bitmap, and allows that bitmap to be combined
+ with other bitmap created in a similar way. The bitmap can either index
+ all TIDs, or be lossy, meaning it records just page numbers and each
+ page tuple has to be checked for validity in a separate pass.
+
+* Allow the creation of on-disk bitmap indexes which can be quickly
+ combined with other bitmap indexes
+
+ Such indexes could be more compact if there are only a few distinct values.
+ Such indexes can also be compressed. Keeping such indexes updated can be
+ costly.
+
+* Allow use of indexes to search for NULLs
+
+ One solution is to create a partial index on an IS NULL expression.
+
+* -Add concurrency to GIST
+* Allow accurate statistics to be collected on indexes with more than
+ one column or expression indexes, perhaps using per-index statistics
+* Add fillfactor to control reserved free space during index creation
+* Allow the creation of indexes with mixed ascending/descending specifiers
+* -Fix incorrect rtree results due to wrong assumptions about "over"
+ operator semantics
+* Allow GIST indexes to create certain complex index types, like digital
+ trees (see Aoki)
+
+
+* Hash
+ o Pack hash index buckets onto disk pages more efficiently
+
+ Currently no only one hash bucket can be stored on a page. Ideally
+ several hash buckets could be stored on a single page and greater
+ granularity used for the hash algorithm.
+
+ o Consider sorting hash buckets so entries can be found using a
+ binary search, rather than a linear scan
+
+ o In hash indexes, consider storing the hash value with or instead
+ of the key itself
Fsync
* Add program to test if fsync has a delay compared to non-fsync
-Cache
-=====
+Cache Usage
+===========
* Allow free-behind capability for large sequential scans, perhaps using
posix_fadvise()
* Fix priority ordering of read and write light-weight locks (Neil)
-Startup Time
-============
+Startup Time Improvements
+=========================
* Experiment with multi-threaded backend [thread]
eliminated from point-in-time archive files.
* Reduce WAL traffic so only modified values are written rather than
- entire rows (?)
+ entire rows?
* Turn off full page writes if fsync is disabled
If fsync is off, there is no purpose in writing full pages to WAL
* ANALYZE should record a pg_statistic entry for an all-NULL column
* Log queries where the optimizer row estimates were dramatically
- different from the number of rows actually found (?)
+ different from the number of rows actually found?
-Miscellaneous
-=============
+Miscellaneous Performance
+=========================
* Do async I/O for faster random read-ahead of data
Async I/O allows multiple I/O requests to be sent to the disk with
results coming back asynchronously.
-* Use mmap() rather than SYSV shared memory or to write WAL files (?)
+* Use mmap() rather than SYSV shared memory or to write WAL files?
This would remove the requirement for SYSV SHM but would introduce
portability issues. Anonymous mmap (or mmap to /dev/zero) is required
<body bgcolor="#FFFFFF" text="#000000" link="#FF0000" vlink="#A00000" alink="#0000FF">
<h1><a name="section_1">PostgreSQL TODO List</a></h1>
<p>Current maintainer: Bruce Momjian (<a href="mailto:pgman@candle.pha.pa.us">pgman@candle.pha.pa.us</a>)<br/>
-Last updated: Mon Jul 4 08:32:37 EDT 2005
+Last updated: Mon Jul 4 13:00:23 EDT 2005
</p>
<p>The most recent version of this document can be viewed at<br/>
<a href="http://www.postgresql.org/docs/faqs.TODO.html">http://www.postgresql.org/docs/faqs.TODO.html</a>.
<li>Remove behavior of postmaster -o after making postmaster/postgres
flags unique
</li><li>Allow limits on per-db/role connections
- </li><li>Allow server log information to be output as INSERT statements
-<p> This would allow server log information to be easily loaded into
- a database for analysis.
-</p>
- </li><li>Prevent dropping user that still owns objects, or auto-drop the objects
</li><li>Allow pooled connections to list all prepared queries
<p> This would allow an application inheriting a pooled connection to know
the queries prepared in the current session.
</p>
</li><li>Allow major upgrades without dump/reload, perhaps using pg_upgrade
[<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pg_upgrade">pg_upgrade</a>]
- </li><li>Allow GRANT/REVOKE permissions to be applied to all schema objects with one
- command
-<p> The proposed syntax is:
-</p><p> GRANT SELECT ON ALL TABLES IN public TO phpuser;
- GRANT SELECT ON NEW TABLES IN public TO phpuser;
-</p>
- </li><li>Allow GRANT/REVOKE permissions to be inherited by objects based on
- schema permissions
</li><li>Check for unreferenced table files created by transactions that were
in-progress when the server terminated abruptly
- </li><li>Allow reporting of which objects are in which tablespaces
-<p> This item is difficult because a tablespace can contain objects from
- multiple databases. There is a server-side function that returns the
- databases which use a specific tablespace, so this requires a tool
- that will call that function and connect to each database to find the
- objects in each database for that tablespace.
-</p>
- </li><li>Allow a database in tablespace t1 with tables created in tablespace t2
- to be used as a template for a new database created with default
- tablespace t2
-<p> All objects in the default database tablespace must have default tablespace
- specifications. This is because new databases are created by copying
- directories. If you mix default tablespace tables and tablespace-specified
- tables in the same directory, creating a new database from such a mixed
- directory would create a new database with tables that had incorrect
- explicit tablespaces. To fix this would require modifying pg_class in the
- newly copied database, which we don't currently do.
-</p>
- </li><li>Add a GUC variable to control the tablespace for temporary objects and
- sort files
-<p> It could start with a random tablespace from a supplied list and cycle
- through the list.
-</p>
- </li><li>Add ability to monitor the use of temporary sort files
- </li><li>Allow WAL replay of CREATE TABLESPACE to work when the directory
- structure on the recovery computer is different from the original
- </li><li>Add "include file" functionality in postgresql.conf
- </li><li>-<em>Add session start time and last statement time to pg_stat_activity</em>
- </li><li>Allow server logs to be remotely read using SQL commands
- </li><li>Allow pg_hba.conf settings to be controlled via SQL
-<p> This would require a new global table that is dumped to flat file for
- use by the postmaster. We do a similar thing for pg_shadow currently.
-</p>
</li><li>Allow administrators to safely terminate individual sessions either
via an SQL function or SIGTERM
<p> Currently SIGTERM of a backend can lead to lock table corruption.
</p>
- </li><li>Un-comment all variables in postgresql.conf
-<p> By not showing commented-out variables, we discourage people from
- thinking that re-commenting a variable returns it to its default.
- This has to address environment variables that are then overridden
- by config file values. Another option is to allow commented values
- to return to their default values.
-</p>
- </li><li>Allow point-in-time recovery to archive partially filled write-ahead
- logs [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
-<p> Currently only full WAL files are archived. This means that the most
- recent transactions aren't available for recovery in case of a disk
- failure. This could be triggered by a user command or a timer.
-</p>
- </li><li>Automatically force archiving of partially-filled WAL files when
- pg_stop_backup() is called or the server is stopped
-<p> Doing this will allow administrators to know more easily when the
- archive contins all the files needed for point-in-time recovery.
-</p>
- </li><li>Create dump tool for write-ahead logs for use in determining
- transaction id for point-in-time recovery
+ </li><li>Prevent dropping user that still owns objects, or auto-drop the objects
</li><li>Set proper permissions on non-system schemas during db creation
<p> Currently all schemas are owned by the super-user because they are
copied from the template1 database.
</p>
- </li><li>Add a function that returns the 'uptime' of the postmaster
- </li><li>Allow a warm standby system to also allow read-only queries [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
-<p> This is useful for checking PITR recovery.
-</p>
- </li><li>Allow the PITR process to be debugged and data examined
</li><li>-<em>Add the client IP address and port to pg_stat_activity</em>
+ </li><li>Support table partitioning that allows a single table to be stored
+ in subtables that are partitioned based on the primary key or a WHERE
+ clause
</li><li>Improve replication solutions
<ul>
<li>Load balancing
</p>
</li><li>Allow replication over unreliable or non-persistent links
</li></ul>
- </li><li>Support table partitioning that allows a single table to be stored
- in subtables that are partitioned based on the primary key or a WHERE
- clause
- </li><li>Allow postgresql.conf values to be set so they can not be changed by
- the user
- </li><li>Allow per-tablespace quotas
+ </li><li>Configuration files
+ <ul>
+ <li>Add "include file" functionality in postgresql.conf
+ </li><li>Allow postgresql.conf values to be set so they can not be changed
+ by the user
+ </li><li>Allow commenting of variables in postgresql.conf to restore them
+ to defaults
+ </li><li>Allow pg_hba.conf settings to be controlled via SQL
+<p> This would require a new global table that is dumped to flat file for
+ use by the postmaster. We do a similar thing for pg_shadow currently.
+</p>
+ </li></ul>
+ </li><li>Tablespaces
+ <ul>
+ <li>Allow a database in tablespace t1 with tables created in
+ tablespace t2 to be used as a template for a new database created
+ with default tablespace t2
+<p> All objects in the default database tablespace must have default
+ tablespace specifications. This is because new databases are
+ created by copying directories. If you mix default tablespace
+ tables and tablespace-specified tables in the same directory,
+ creating a new database from such a mixed directory would create a
+ new database with tables that had incorrect explicit tablespaces.
+ To fix this would require modifying pg_class in the newly copied
+ database, which we don't currently do.
+</p>
+ </li><li>Allow reporting of which objects are in which tablespaces
+<p> This item is difficult because a tablespace can contain objects
+ from multiple databases. There is a server-side function that
+ returns the databases which use a specific tablespace, so this
+ requires a tool that will call that function and connect to each
+ database to find the objects in each database for that tablespace.
+</p>
+ <ul>
+ <li>Add a GUC variable to control the tablespace for temporary objects
+ and sort files
+<p> It could start with a random tablespace from a supplied list and
+ cycle through the list.
+</p>
+ </li><li>Allow WAL replay of CREATE TABLESPACE to work when the directory
+ structure on the recovery computer is different from the original
+ </li><li>Allow per-tablespace quotas
+ </li></ul>
+ </li></ul>
+ </li><li>Point-in-time Recovery (PITR)
+ <ul>
+ <li>Allow point-in-time recovery to archive partially filled
+ write-ahead logs [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
+<p> Currently only full WAL files are archived. This means that the
+ most recent transactions aren't available for recovery in case
+ of a disk failure. This could be triggered by a user command or
+ a timer.
+</p>
+ </li><li>Automatically force archiving of partially-filled WAL files when
+ pg_stop_backup() is called or the server is stopped
+<p> Doing this will allow administrators to know more easily when
+ the archive contins all the files needed for point-in-time
+ recovery.
+</p>
+ </li><li>Create dump tool for write-ahead logs for use in determining
+ transaction id for point-in-time recovery
+ </li><li>Allow a warm standby system to also allow read-only queries
+ [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
+<p> This is useful for checking PITR recovery.
+</p>
+ </li><li>Allow the PITR process to be debugged and data examined
+ </li></ul>
+</li></ul>
+<h1><a name="section_3">Monitoring</a></h1>
+
+<ul>
+ <li>Allow server log information to be output as INSERT statements
+<p> This would allow server log information to be easily loaded into
+ a database for analysis.
+</p>
+ </li><li>Add ability to monitor the use of temporary sort files
+ </li><li>-<em>Add session start time and last statement time to pg_stat_activity</em>
+ </li><li>Allow server logs to be remotely read using SQL commands
+ </li><li>-<em>Add a function that returns the start time of the postmaster</em>
</li></ul>
-<h1><a name="section_3">Data Types</a></h1>
+<h1><a name="section_4">Data Types</a></h1>
<ul>
<li>Remove Money type, add money formatting for decimal type
inaccurate, in one sense.
</p>
<ul>
- <li>Add function to return compressed length of TOAST data values
- </li><li>Allow INET subnet tests using non-constants to be indexed
- </li><li>Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
- functionality
-<p> Current CURRENT_TIMESTAMP returns the start time of the current
- transaction, and gettimeofday() returns the wallclock time. This will
- make time reporting more consistent and will allow reporting of
- the statement start time.
-</p>
- </li><li>Have sequence dependency track use of DEFAULT sequences,
- seqname.nextval (?)
- </li><li>Disallow changing default expression of a SERIAL column (?)
+ <li>Have sequence dependency track use of DEFAULT sequences,
+ seqname.nextval?
+ </li><li>Disallow changing default expression of a SERIAL column?
</li><li>Allow infinite dates just like infinite timestamps
- </li><li>Have initdb set DateStyle based on locale?
- </li><li>Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
- </li><li>Allow to_char() to print localized month names
- </li><li>Allow functions to have a schema search path specified at creation time
- </li><li>Allow substring/replace() to get/set bit values
</li><li>Add a GUC variable to allow output of interval values in ISO8601 format
</li><li>Fix data types where equality comparison isn't intuitive, e.g. box
</li><li>Merge hardwired timezone names with the TZ database; allow either kind
</li><li>Allow INET + INT4 to increment the host part of the address, or
throw an error on overflow
</li><li>Add 'tid != tid ' operator for use in corruption recovery
- </li><li>Prevent to_char() on interval from returning meaningless values
-<p> For example, to_char('1 month', 'mon') is meaningless. Basically,
- most date-related parameters to to_char() are meaningless for
- intervals because interval is not anchored to a date.
-</p>
- </li><li>Allow to_char() on interval values to accumulate the highest unit
- requested
-<p> Some special format flag would be required to request such
- accumulation. Such functionality could also be added to EXTRACT.
- Prevent accumulation that crosses the month/day boundary because of
- the uneven number of days in a month.
-</p>
- <ul>
- <li>to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
- </li><li>to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
- </li><li>to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
- </li><li>to_char(INTERVAL '3 years 5 months','MM') => 41
- </li></ul>
</li><li>Add ISO INTERVAL handling
<ul>
<li>Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO SECOND
</li></ul>
</li><li>BINARY DATA
<ul>
- <li>Improve vacuum of large objects, like /contrib/vacuumlo (?)
+ <li>Improve vacuum of large objects, like /contrib/vacuumlo?
</li><li>Add security checking for large objects
<p> Currently large objects entries do not have owners. Permissions can
only be set at the pg_largeobject table level.
</p>
</li></ul>
</li></ul>
-<h1><a name="section_4">Multi-Language Support</a></h1>
+<h1><a name="section_5">Functions</a></h1>
+
+<ul>
+ <li>Add function to return compressed length of TOAST data values
+ </li><li>Allow INET subnet tests using non-constants to be indexed
+ </li><li>Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
+ functionality
+<p> Current CURRENT_TIMESTAMP returns the start time of the current
+ transaction, and gettimeofday() returns the wallclock time. This will
+ make time reporting more consistent and will allow reporting of
+ the statement start time.
+</p>
+ </li><li>Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
+ </li><li>Allow to_char() to print localized month names
+ </li><li>Allow functions to have a schema search path specified at creation time
+ </li><li>Allow substring/replace() to get/set bit values
+ </li><li>Allow to_char() on interval values to accumulate the highest unit
+ requested
+<p> Some special format flag would be required to request such
+ accumulation. Such functionality could also be added to EXTRACT.
+ Prevent accumulation that crosses the month/day boundary because of
+ the uneven number of days in a month.
+</p>
+ <ul>
+ <li>to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
+ </li><li>to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
+ </li><li>to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
+ </li><li>to_char(INTERVAL '3 years 5 months','MM') => 41
+ </li></ul>
+ </li><li>Prevent to_char() on interval from returning meaningless values
+<p> For example, to_char('1 month', 'mon') is meaningless. Basically,
+ most date-related parameters to to_char() are meaningless for
+ intervals because interval is not anchored to a date.
+</p>
+</li></ul>
+<h1><a name="section_6">Multi-Language Support</a></h1>
<ul>
<li>Add NCHAR (as distinguished from ordinary varchar),
<p> Right now only one encoding is allowed per database.
</p>
</li><li>Support multiple simultaneous character sets, per SQL92
- </li><li>Improve UTF8 combined character handling (?)
+ </li><li>Improve UTF8 combined character handling?
</li><li>Add octet_length_server() and octet_length_client()
</li><li>Make octet_length_client() the same as octet_length()?
</li></ul>
-<h1><a name="section_5">Views / Rules</a></h1>
+<h1><a name="section_7">Views / Rules</a></h1>
<ul>
<li>Automatically create rules on views so they are updateable, per SQL99
</li><li>Allow temporary views on non-temporary tables
</li><li>Allow RULE recompilation
</li></ul>
-<h1><a name="section_6">Indexes</a></h1>
-
-<ul>
- <li>Allow inherited tables to inherit index, UNIQUE constraint, and primary
- key, foreign key
- </li><li>UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
- inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
- (dup) should fail
-<p> The main difficulty with this item is the problem of creating an index
- that can span more than one table.
-</p>
- </li><li>Add UNIQUE capability to non-btree indexes
- </li><li>Add more gist index support for geometric data types
- </li><li>-<em>Use indexes for MIN() and MAX()</em>
-<p> MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
- BY col {DESC} LIMIT 1. Completing this item involves doing this
- transformation automatically.
-</p>
- </li><li>-<em>Use index to restrict rows returned by multi-key index when used with</em>
- non-consecutive keys to reduce heap accesses
-<p> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
- col3 = 9, spin though the index checking for col1 and col3 matches,
- rather than just col1; also called skip-scanning.
-</p>
- </li><li>Prevent index uniqueness checks when UPDATE does not modify the column
-<p> Uniqueness (index) checks are done when updating a column even if the
- column is not modified by the UPDATE.
-</p>
- </li><li>Fetch heap pages matching index entries in sequential order
-<p> Rather than randomly accessing heap pages based on index entries, mark
- heap pages needing access in a bitmap and do the lookups in sequential
- order. Another method would be to sort heap ctids matching the index
- before accessing the heap rows.
-</p>
- </li><li>-<em>Allow non-bitmap indexes to be combined by creating bitmaps in memory</em>
-<p> This feature allows separate indexes to be ANDed or ORed together. This
- is particularly useful for data warehousing applications that need to
- query the database in an many permutations. This feature scans an index
- and creates an in-memory bitmap, and allows that bitmap to be combined
- with other bitmap created in a similar way. The bitmap can either index
- all TIDs, or be lossy, meaning it records just page numbers and each
- page tuple has to be checked for validity in a separate pass.
-</p>
- </li><li>Allow the creation of on-disk bitmap indexes which can be quickly
- combined with other bitmap indexes
-<p> Such indexes could be more compact if there are only a few distinct values.
- Such indexes can also be compressed. Keeping such indexes updated can be
- costly.
-</p>
- </li><li>Allow use of indexes to search for NULLs
-<p> One solution is to create a partial index on an IS NULL expression.
-</p>
- </li><li>-<em>Add concurrency to GIST</em>
- </li><li>Pack hash index buckets onto disk pages more efficiently
-<p> Currently no only one hash bucket can be stored on a page. Ideally
- several hash buckets could be stored on a single page and greater
- granularity used for the hash algorithm.
-</p>
- </li><li>Consider sorting hash buckets so entries can be found using a binary
- search, rather than a linear scan
- </li><li>In hash indexes, consider storing the hash value with or instead
- of the key itself
- </li><li>Allow accurate statistics to be collected on indexes with more than
- one column or expression indexes, perhaps using per-index statistics
- </li><li>Add fillfactor to control reserved free space during index creation
- </li><li>Allow the creation of indexes with mixed ascending/descending specifiers
- </li><li>-<em>Fix incorrect rtree results due to wrong assumptions about "over"</em>
- operator semantics
- </li><li>Allow GIST indexes to create certain complex index types, like digital
- trees (see Aoki)
-</li></ul>
-<h1><a name="section_7">Commands</a></h1>
+<h1><a name="section_8">Commands</a></h1>
<ul>
<li>-<em>Add BETWEEN SYMMETRIC/ASYMMETRIC</em>
</li><li>Change LIMIT/OFFSET and FETCH/MOVE to use int8
</li><li>Allow CREATE TABLE AS to determine column lengths for complex
expressions like SELECT col1 || col2
- </li><li>Allow UPDATE to handle complex aggregates [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?update">update</a>] (?)
- </li><li>Allow backslash handling in quoted strings to be disabled for portability
-<p> The use of C-style backslashes (.e.g. \n, \r) in quoted strings is not
- SQL-spec compliant, so allow such handling to be disabled. However,
- disabling backslashes could break many third-party applications and tools.
-</p>
+ </li><li>Allow UPDATE to handle complex aggregates [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?update">update</a>]?
+ </li><li>-<em>Add E'' escape string marker so eventually ordinary strings can treat</em>
+ backslashes literally, for portability
</li><li>Allow an alias to be provided for the target table in UPDATE/DELETE
<p> This is not SQL-spec but many DBMSs allow it.
</p>
functionality in DELETE. It's been agreed that the keyword should
be USING, to avoid anything as confusing as DELETE FROM a FROM b.
</p>
+ </li><li>Have initdb set DateStyle based on locale?
</li><li>Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
</li><li>-<em>Allow REINDEX to rebuild all database indexes</em>
</li><li>Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
<p> Currently the system uses the operating system COPY command to create
a new database.
</p>
- </li><li>Add C code to copy directories for use in creating new databases
+ </li><li>Add C code on Unix to copy directories for use in creating new databases
</li><li>Have pg_ctl look at PGHOST in case it is a socket directory?
</li><li>Allow pg_ctl to work properly with configuration files located outside
the PGDATA directory
allow pg_ctl to read and understand postgresql.conf to find the
data_directory value.
</p>
- </li><li>Allow column-level GRANT/REVOKE privileges
</li><li>Add a GUC variable to warn about non-standard SQL usage in queries
</li><li>Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
triggers?)
creation
</li><li>Add COMMENT ON for all cluster global objects (roles, databases
and tablespaces)
- </li><li>Add an option to automatically use savepoints for each statement in a
+ </li><li>-<em>Add an option to automatically use savepoints for each statement in a</em>
multi-statement transaction.
<p> When enabled, this would allow errors in multi-statement transactions
to be automatically ignored.
processed, with ROLLBACK on COPY failure.
</p>
</li><li>-<em>Allow COPY to understand \x as a hex byte</em>
- </li><li>Have COPY return the number of rows loaded/unloaded (?)
+ </li><li>Have COPY return the number of rows loaded/unloaded?
</li><li>-<em>Allow COPY to optionally include column headings in the first line</em>
</li><li>-<em>Allow COPY FROM ... CSV to interpret newlines and carriage</em>
returns in data
</li></ul>
+ </li><li>GRANT/REVOKE
+ <ul>
+ <li>Allow column-level privileges
+ <ul>
+ <li>Allow GRANT/REVOKE permissions to be applied to all schema objects with one
+ command
+<p> The proposed syntax is:
+</p><p> GRANT SELECT ON ALL TABLES IN public TO phpuser;
+ GRANT SELECT ON NEW TABLES IN public TO phpuser;
+</p>
+ </li><li>Allow GRANT/REVOKE permissions to be inherited by objects based on
+ schema permissions
+ </li></ul>
+ </li></ul>
</li><li>CURSOR
<ul>
<li>Allow UPDATE/DELETE WHERE CURRENT OF cursor
and no FOR UPDATE lock.
</p>
</li><li>Prevent DROP TABLE from dropping a row referenced by its own open
- cursor (?)
+ cursor?
</li><li>Allow pooled connections to list all open WITH HOLD cursors
<p> Because WITH HOLD cursors exist outside transactions, this allows
them to be listed so they can be closed.
<li>-<em>Have SHOW ALL show descriptions for server-side variables</em>
</li><li>Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
ANALYZE, and CLUSTER
- </li><li>Add SET PATH for schemas (?)
+ </li><li>Add SET PATH for schemas?
<p> This is basically the same as SET search_path.
</p>
</li></ul>
</li><li>Allow function parameters to be passed by name,
get_employee_salary(emp_id => 12345, tax_year => 2001)
</li><li>Add Oracle-style packages
- </li><li>Add table function support to pltcl, plperl, plpython (?)
+ </li><li>Add table function support to pltcl, plperl, plpython?
</li><li>Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
</li><li>-<em>Allow PL/pgSQL EXECUTE query_var INTO record_var;</em>
</li><li>Add capability to create and call PROCEDURES
</li><li>Add support for polymorphic arguments and return types to plperl
</li></ul>
</li></ul>
-<h1><a name="section_8">Clients</a></h1>
+<h1><a name="section_9">Clients</a></h1>
<ul>
<li>Add a libpq function to support Parse/DescribeStatement capability
- </li><li>Prevent libpq's PQfnumber() from lowercasing the column name (?)
+ </li><li>Prevent libpq's PQfnumber() from lowercasing the column name?
</li><li>Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
<p> This would be used for checking if the server is up.
</p>
<p> Document differences between ecpg and the SQL standard and
information about the Informix-compatibility module.
</p>
- </li><li>Solve cardinality > 1 for input descriptors / variables (?)
+ </li><li>Solve cardinality > 1 for input descriptors / variables?
</li><li>Add a semantic check level, e.g. check if a table really exists
</li><li>fix handling of DB attributes that are arrays
</li><li>Use backend PREPARE/EXECUTE facility for ecpg where possible
</li><li>Add internationalized message strings
</li></ul>
</li></ul>
-<h1><a name="section_9">Referential Integrity</a></h1>
+<h1><a name="section_10">Referential Integrity</a></h1>
<ul>
<li>Add MATCH PARTIAL referential integrity
</li><li>-<em>Implement shared row locks and use them in RI triggers</em>
</li><li>Enforce referential integrity for system tables
</li><li>Change foreign key constraint for array -> element to mean element
- in array (?)
- </li><li>Allow DEFERRABLE UNIQUE constraints (?)
+ in array?
+ </li><li>Allow DEFERRABLE UNIQUE constraints?
</li><li>Allow triggers to be disabled [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?trigger">trigger</a>]
<p> Currently the only way to disable triggers is to modify the system
tables.
to fire triggers.
</p>
</li></ul>
-<h1><a name="section_10">Dependency Checking</a></h1>
+<h1><a name="section_11">Dependency Checking</a></h1>
<ul>
<li>Flush cached query plans when the dependent objects change
</li><li>Track dependencies in function bodies and recompile/invalidate
</li></ul>
-<h1><a name="section_11">Exotic Features</a></h1>
+<h1><a name="section_12">Exotic Features</a></h1>
<ul>
<li>Add SQL99 WITH clause to SELECT
</li><li>Allow nested schemas
</li></ul>
</li></ul>
-<h2><a name="section_11_1">PERFORMANCE</a></h2>
+<h1><a name="section_13">Indexes</a></h1>
-<h1><a name="section_12">Fsync</a></h1>
+<ul>
+ <li>Allow inherited tables to inherit index, UNIQUE constraint, and primary
+ key, foreign key
+ </li><li>UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
+ inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
+ (dup) should fail
+<p> The main difficulty with this item is the problem of creating an index
+ that can span more than one table.
+</p>
+ </li><li>Add UNIQUE capability to non-btree indexes
+ </li><li>Add more GIST index support for geometric data types
+ </li><li>-<em>Use indexes for MIN() and MAX()</em>
+<p> MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
+ BY col {DESC} LIMIT 1. Completing this item involves doing this
+ transformation automatically.
+</p>
+ </li><li>-<em>Use index to restrict rows returned by multi-key index when used with</em>
+ non-consecutive keys to reduce heap accesses
+<p> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
+ col3 = 9, spin though the index checking for col1 and col3 matches,
+ rather than just col1; also called skip-scanning.
+</p>
+ </li><li>Prevent index uniqueness checks when UPDATE does not modify the column
+<p> Uniqueness (index) checks are done when updating a column even if the
+ column is not modified by the UPDATE.
+</p>
+ </li><li>Fetch heap pages matching index entries in sequential order
+<p> Rather than randomly accessing heap pages based on index entries, mark
+ heap pages needing access in a bitmap and do the lookups in sequential
+ order. Another method would be to sort heap ctids matching the index
+ before accessing the heap rows.
+</p>
+ </li><li>-<em>Allow non-bitmap indexes to be combined by creating bitmaps in memory</em>
+<p> This feature allows separate indexes to be ANDed or ORed together. This
+ is particularly useful for data warehousing applications that need to
+ query the database in an many permutations. This feature scans an index
+ and creates an in-memory bitmap, and allows that bitmap to be combined
+ with other bitmap created in a similar way. The bitmap can either index
+ all TIDs, or be lossy, meaning it records just page numbers and each
+ page tuple has to be checked for validity in a separate pass.
+</p>
+ </li><li>Allow the creation of on-disk bitmap indexes which can be quickly
+ combined with other bitmap indexes
+<p> Such indexes could be more compact if there are only a few distinct values.
+ Such indexes can also be compressed. Keeping such indexes updated can be
+ costly.
+</p>
+ </li><li>Allow use of indexes to search for NULLs
+<p> One solution is to create a partial index on an IS NULL expression.
+</p>
+ </li><li>-<em>Add concurrency to GIST</em>
+ </li><li>Allow accurate statistics to be collected on indexes with more than
+ one column or expression indexes, perhaps using per-index statistics
+ </li><li>Add fillfactor to control reserved free space during index creation
+ </li><li>Allow the creation of indexes with mixed ascending/descending specifiers
+ </li><li>-<em>Fix incorrect rtree results due to wrong assumptions about "over"</em>
+ operator semantics
+ </li><li>Allow GIST indexes to create certain complex index types, like digital
+ trees (see Aoki)
+ </li><li>Hash
+ <ul>
+ <li>Pack hash index buckets onto disk pages more efficiently
+<p> Currently no only one hash bucket can be stored on a page. Ideally
+ several hash buckets could be stored on a single page and greater
+ granularity used for the hash algorithm.
+</p>
+ </li><li>Consider sorting hash buckets so entries can be found using a
+ binary search, rather than a linear scan
+ </li><li>In hash indexes, consider storing the hash value with or instead
+ of the key itself
+ </li></ul>
+</li></ul>
+<h1><a name="section_14">Fsync</a></h1>
<ul>
<li>Improve commit_delay handling to reduce fsync()
</li><li>Add an option to sync() before fsync()'ing checkpoint files
</li><li>Add program to test if fsync has a delay compared to non-fsync
</li></ul>
-<h1><a name="section_13">Cache</a></h1>
+<h1><a name="section_15">Cache Usage</a></h1>
<ul>
<li>Allow free-behind capability for large sequential scans, perhaps using
at the start of the table.
</p>
</li></ul>
-<h1><a name="section_14">Vacuum</a></h1>
+<h1><a name="section_16">Vacuum</a></h1>
<ul>
<li>Improve speed with indexes
</li><li>Do VACUUM FULL if table is nearly empty?
</li></ul>
</li></ul>
-<h1><a name="section_15">Locking</a></h1>
+<h1><a name="section_17">Locking</a></h1>
<ul>
<li>Make locking of shared data structures more fine-grained
</li><li>Research use of sched_yield() for spinlock acquisition failure
</li><li>Fix priority ordering of read and write light-weight locks (Neil)
</li></ul>
-<h1><a name="section_16">Startup Time</a></h1>
+<h1><a name="section_18">Startup Time Improvements</a></h1>
<ul>
<li>Experiment with multi-threaded backend [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?thread">thread</a>]
existing backends is one of the difficulties with a backend approach.
</p>
</li></ul>
-<h1><a name="section_17">Write-Ahead Log</a></h1>
+<h1><a name="section_19">Write-Ahead Log</a></h1>
<ul>
<li>Eliminate need to write full pages to WAL before page modification [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?wal">wal</a>]
eliminated from point-in-time archive files.
</p>
</li><li>Reduce WAL traffic so only modified values are written rather than
- entire rows (?)
+ entire rows?
</li><li>Turn off full page writes if fsync is disabled
<p> If fsync is off, there is no purpose in writing full pages to WAL
</p>
</li><li>Eliminate WAL logging for CREATE TABLE AS when not doing WAL archiving
</li><li>-<em>Change WAL to use 32-bit CRC, for performance reasons</em>
</li></ul>
-<h1><a name="section_18">Optimizer / Executor</a></h1>
+<h1><a name="section_20">Optimizer / Executor</a></h1>
<ul>
<li>Add missing optimizer selectivities for date, r-tree, etc
</p>
</li><li>ANALYZE should record a pg_statistic entry for an all-NULL column
</li><li>Log queries where the optimizer row estimates were dramatically
- different from the number of rows actually found (?)
+ different from the number of rows actually found?
</li></ul>
-<h1><a name="section_19">Miscellaneous</a></h1>
+<h1><a name="section_21">Miscellaneous Performance</a></h1>
<ul>
<li>Do async I/O for faster random read-ahead of data
<p> Async I/O allows multiple I/O requests to be sent to the disk with
results coming back asynchronously.
</p>
- </li><li>Use mmap() rather than SYSV shared memory or to write WAL files (?)
+ </li><li>Use mmap() rather than SYSV shared memory or to write WAL files?
<p> This would remove the requirement for SYSV SHM but would introduce
portability issues. Anonymous mmap (or mmap to /dev/zero) is required
to prevent I/O overhead.
</li><li>Use a phantom command counter for nested subtransactions to reduce
per-tuple overhead
</li></ul>
-<h1><a name="section_20">Source Code</a></h1>
+<h1><a name="section_22">Source Code</a></h1>
<ul>
<li>Add use of 'const' for variables in source tree
</li></ul>
<hr/>
-<h3><a name="section_20_1_1">Developers who have claimed items are:</a></h3>
+<h2><a name="section_22_1">Developers who have claimed items are:</a></h2>
<ul>
<li>Alvaro is Alvaro Herrera <<a href="mailto:alvherre@dcc.uchile.cl">alvherre@dcc.uchile.cl</a>>
</li><li>Andrew is Andrew Dunstan <<a href="mailto:andrew@dunslane.net">andrew@dunslane.net</a>>