   1
   2 PostgreSQL TODO List
   3 ====================
   4 Current maintainer:     Bruce Momjian (pgman@candle.pha.pa.us)
   5 Last updated:           Fri Aug 26 16:38:55 EDT 2005
   6
   7 The most recent version of this document can be viewed at
   8 http://www.postgresql.org/docs/faqs.TODO.html.
   9
  10 #A hyphen, "-", marks changes that will appear in the upcoming 8.1 release.#
  11 #A percent sign, "%", marks items that are easier to implement.#
  12
  13 Bracketed items, "[]", have more detail.
  14
  15 This list contains all known PostgreSQL bugs and feature requests. If
  16 you would like to work on an item, please read the Developer's FAQ
  17 first.
  18
  19
  20 Administration
  21 ==============
  22
  23 * %Remove behavior of postmaster -o after making postmaster/postgres
  24   flags unique
  25 * %Allow pooled connections to list all prepared queries
  26
  27   This would allow an application inheriting a pooled connection to know
  28   the queries prepared in the current session.
  29
  30 * Allow major upgrades without dump/reload, perhaps using pg_upgrade
  31   [pg_upgrade]
  32 * Check for unreferenced table files created by transactions that were
  33   in-progress when the server terminated abruptly
  34 * Allow administrators to safely terminate individual sessions either
  35   via an SQL function or SIGTERM
  36
  37   Lock table corruption following SIGTERM of an individual backend
  38   has been reported in 8.0.  A possible cause was fixed in 8.1, but
  39   it is unknown whether other problems exist.  This item mostly
  40   requires additional testing rather than writing any new code.
  41
  42 * %Set proper permissions on non-system schemas during db creation
  43
  44   Currently all schemas are owned by the super-user because they are
  45   copied from the template1 database.
  46
  47 * Support table partitioning that allows a single table to be stored
  48   in subtables that are partitioned based on the primary key or a WHERE
  49   clause
  50
  51
  52 * Improve replication solutions
  53
  54         o Load balancing
  55
  56           Any of the master/slave replication solutions can provide a
  57           standby server for data warehousing. To allow read/write queries to
  58           multiple servers, you need multi-master replication like pgcluster.
  59
  60         o Allow replication over unreliable or non-persistent links
  61
  62
  63 * Configuration files
  64
  65         o %Add "include file" functionality in postgresql.conf
  66         o %Allow commenting of variables in postgresql.conf to restore them
  67           to defaults
  68
  69           Currently, if a variable is commented out, it keeps the
  70           previous uncommented value until the server is restarted.
  71
  72         o %Allow pg_hba.conf settings to be controlled via SQL
  73
  74           This would add a function to load the SQL table from
  75           pg_hba.conf, and one to write its contents to the flat file.
  76           The table should have a line number that is a float so rows
  77           can be inserted between existing rows, e.g. row 2.5 goes
  78           between row 2 and row 3.
  79
  80         o %Allow postgresql.conf file values to be changed via an SQL
  81           API, perhaps using SET GLOBAL
  82         o Allow the server to be stopped/restarted via an SQL API
  83         o Issue a warning if a change-on-restart-only postgresql.conf value
  84           is modified and the server config files are reloaded
  85         o Mark change-on-restart-only values in postgresql.conf
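
The float line-number idea described above for pg_hba.conf can be sketched
in Python as follows. This is only an illustration under assumed names: the
insert_between helper and the sample rules are hypothetical and are not part
of any PostgreSQL API.

```python
# Hypothetical sketch: pg_hba-style rules keyed by a float line number.
def insert_between(rows, lower, upper, rule):
    """Insert `rule` with a line number midway between two existing keys."""
    if lower not in rows or upper not in rows:
        raise KeyError("both neighbouring line numbers must exist")
    if not lower < upper:
        raise ValueError("lower must be less than upper")
    line_no = (lower + upper) / 2
    if line_no in rows:
        raise ValueError("no free line number between the two rows")
    rows[line_no] = rule
    return line_no


rows = {
    1.0: "local all all trust",
    2.0: "host all all 127.0.0.1/32 md5",
    3.0: "host all all 0.0.0.0/0 reject",
}
new_line = insert_between(rows, 2.0, 3.0, "host all all 10.0.0.0/8 md5")
ordered = [rows[key] for key in sorted(rows)]
print(new_line)  # 2.5
```

Writing the flat file back out would then amount to emitting the rules in
key order, as the `ordered` list does here.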
  86
  87
  88 * Tablespaces
  89
  90         o Allow a database in tablespace t1 with tables created in
  91           tablespace t2 to be used as a template for a new database created
  92           with default tablespace t2
  93
  94           All objects in the default database tablespace must have default
  95           tablespace specifications. This is because new databases are
  96           created by copying directories. If you mix default tablespace
  97           tables and tablespace-specified tables in the same directory,
  98           creating a new database from such a mixed directory would create a
  99           new database with tables that had incorrect explicit tablespaces.
 100           To fix this would require modifying pg_class in the newly copied
 101           database, which we don't currently do.
 102
 103         o Allow reporting of which objects are in which tablespaces
 104
 105           This item is difficult because a tablespace can contain objects
 106           from multiple databases. There is a server-side function that
 107           returns the databases which use a specific tablespace, so this
 108           requires a tool that will call that function and connect to each
 109           database to find the objects in each database for that tablespace.
 110
 111         o %Add a GUC variable to control the tablespace for temporary objects
 112           and sort files
 113
 114           It could start with a random tablespace from a supplied list and
 115           cycle through the list.
 116
 117         o Allow WAL replay of CREATE TABLESPACE to work when the directory
 118           structure on the recovery computer is different from the original
 119
 120         o Allow per-tablespace quotas
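
The "random start, then cycle" idea for the temporary-object tablespace
setting above might look roughly like this in Python. The function name and
the tablespace names are made up for illustration; no such GUC existed when
this list was written.

```python
import itertools
import random


def tablespace_cycle(tablespaces, rng=random):
    """Return an iterator that walks the list round-robin from a random start."""
    if not tablespaces:
        raise ValueError("need at least one tablespace")
    start = rng.randrange(len(tablespaces))
    rotated = tablespaces[start:] + tablespaces[:start]
    return itertools.cycle(rotated)


# A seeded generator keeps the sketch repeatable; the names are hypothetical.
picker = tablespace_cycle(["ts_a", "ts_b", "ts_c"], random.Random(0))
first_six = [next(picker) for _ in range(6)]
```

Each temporary object or sort file would take the next name from the
iterator, so consecutive objects land on different tablespaces.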
 121
 122
 123 * Point-In-Time Recovery (PITR)
 124
 125           o Allow point-in-time recovery to archive partially filled
 126             write-ahead logs [pitr]
 127
 128             Currently only full WAL files are archived. This means that the
 129             most recent transactions aren't available for recovery in case
 130             of a disk failure. This could be triggered by a user command or
 131             a timer.
 132
 133           o Automatically force archiving of partially-filled WAL files when
 134             pg_stop_backup() is called or the server is stopped
 135
 136             Doing this will allow administrators to know more easily when
 137             the archive contains all the files needed for point-in-time
 138             recovery.
 139
 140           o %Create dump tool for write-ahead logs for use in determining
 141             transaction id for point-in-time recovery
 142           o Allow a warm standby system to also allow read-only queries
 143             [pitr]
 144
 145             This is useful for checking PITR recovery.
 146
 147           o Allow the PITR process to be debugged and data examined
 148
 149
 150 Monitoring
 151 ==========
 152
 153 * Allow server log information to be output as INSERT statements
 154
 155   This would allow server log information to be easily loaded into
 156   a database for analysis.
 157
 158 * %Add ability to monitor the use of temporary sort files
 159 * Allow server logs to be remotely read and removed using SQL commands
 160
 161
 162 Data Types
 163 ==========
 164
 165 * Improve the MONEY data type
 166
 167   Change the MONEY data type to use DECIMAL internally, with special
 168   locale-aware output formatting.
 169
 170 * Change NUMERIC to enforce the maximum precision, and increase it
 171 * Add NUMERIC division operator that doesn't round?
 172
 173   Currently NUMERIC _rounds_ the result to the specified precision.
 174   This means division can return a result that multiplied by the
 175   divisor is greater than the dividend, e.g. this returns a value > 10:
 176
 177     SELECT (10::numeric(2,0) / 6::numeric(2,0))::numeric(2,0) * 6;
 178
 179   The positive modulus result returned by NUMERICs might be considered
 180   inaccurate, in one sense.
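
The rounding effect in the SELECT above can be reproduced with a small
Python sketch. Python's decimal module is used here only as a stand-in for
NUMERIC(2,0); it is an assumption that half-up rounding to zero decimal
places is a fair model of the behavior described, and the truncating variant
shows what a non-rounding division operator would return instead.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

dividend, divisor = Decimal(10), Decimal(6)

# Quotient rounded to zero decimal places, as a NUMERIC(2,0) result would be.
rounded = (dividend / divisor).quantize(Decimal(1), rounding=ROUND_HALF_UP)

# Quotient truncated instead of rounded.
truncated = (dividend / divisor).quantize(Decimal(1), rounding=ROUND_DOWN)

print(rounded * divisor)    # 12
print(truncated * divisor)  # 6
```

With rounding, the quotient times the divisor exceeds the dividend; with
truncation it never does.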
 181
 182 * Have sequence dependency track use of DEFAULT sequences,
 183   seqname.nextval?
 184 * %Disallow changing default expression of a SERIAL column?
 185 * Fix data types where equality comparison isn't intuitive, e.g. box
 186 * %Prevent INET cast to CIDR if the unmasked bits are not zero, or
 187   zero the bits
 188 * %Prevent INET cast to CIDR from dropping netmask, SELECT '1.1.1.1'::inet::cidr
 189 * Allow INET + INT4 to increment the host part of the address, or
 190   throw an error on overflow
 191 * %Add 'tid != tid' operator for use in corruption recovery
 192
 193
 194 * Dates and Times
 195
 196         o Allow infinite dates just like infinite timestamps
 197         o Add a GUC variable to allow output of interval values in ISO8601
 198           format
 199         o Merge hardwired timezone names with the TZ database; allow either
 200           kind everywhere a TZ name is currently taken
 201         o Allow customization of the known set of TZ names (generalize the
 202           present australian_timezones hack)
 203         o Allow TIMESTAMP WITH TIME ZONE to store the original timezone
 204           information, either zone name or offset from UTC [timezone]
 205
 206           If the TIMESTAMP value is stored with a time zone name, interval
 207           computations should adjust based on the time zone rules.
 208
 209         o Fix SELECT '0.01 years'::interval, '0.01 months'::interval
 210         o Add ISO INTERVAL handling
 211                 o Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO
 212                   SECOND
 213                 o Add support for year-month syntax, INTERVAL '50-6' YEAR TO MONTH
 214                 o For syntax that isn't uniquely ISO or PG syntax, like '1:30' or
 215                   '1', treat as ISO if there is a range specification clause,
 216                   and as PG if no clause is present, e.g. interpret
 217                           '1:30' MINUTE TO SECOND as '1 minute 30 seconds', and
 218                           interpret '1:30' as '1 hour, 30 minutes'
 219                 o Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1 year' AS
 220                   INTERVAL MONTH), and this should return '12 months'
 221                 o Round or truncate values to the requested precision, e.g.
 222                   INTERVAL '11 months' AS YEAR should return one or zero
 223                 o Support precision, CREATE TABLE foo (a INTERVAL MONTH(3))
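
The disambiguation rule proposed above for strings like '1:30' can be
sketched in Python. The function below is hypothetical and handles only the
two cases mentioned in the list; it is not how the PostgreSQL parser is
structured.

```python
def parse_colon_interval(text, range_clause=None):
    """Interpret an ambiguous 'N:M' interval string."""
    first, second = (int(part) for part in text.split(":"))
    if range_clause is None:
        # No range clause: traditional PostgreSQL reading, hours:minutes.
        return {"hours": first, "minutes": second}
    if range_clause == "MINUTE TO SECOND":
        # Range clause present: ISO reading, minutes:seconds.
        return {"minutes": first, "seconds": second}
    raise ValueError(f"range clause not handled in this sketch: {range_clause}")


iso_reading = parse_colon_interval("1:30", "MINUTE TO SECOND")
pg_reading = parse_colon_interval("1:30")
print(iso_reading)  # {'minutes': 1, 'seconds': 30}
print(pg_reading)   # {'hours': 1, 'minutes': 30}
```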
 224
 225
 226 * Arrays
 227
 228         o Allow NULLs in arrays
 229         o Delay resolution of array expression's data type so assignment
 230           coercion can be performed on empty array expressions
 231
 232
 233 * Binary Data
 234
 235         o Improve vacuum of large objects, like /contrib/vacuumlo?
 236         o Add security checking for large objects
 237         o Auto-delete large objects when referencing row is deleted
 238
 239           /contrib/lo offers this functionality.
 240
 241         o Allow read/write into TOAST values like large objects
 242
 243           This requires the TOAST column to be stored EXTERNAL.
 244
 245
 246 Functions
 247 =========
 248
 249 * Allow INET subnet tests using non-constants to be indexed
 250 * Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
 251   functionality
 252
 253   Currently CURRENT_TIMESTAMP returns the start time of the current
 254   transaction, and gettimeofday() returns the wallclock time. This will
 255   make time reporting more consistent and will allow reporting of
 256   the statement start time.
 257
 258 * %Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
 259 * Allow to_char() to print localized month names
 260 * Allow functions to have a schema search path specified at creation time
 261 * Allow substring/replace() to get/set bit values
 262 * Allow to_char() on interval values to accumulate the highest unit
 263   requested
 264
 265   Some special format flag would be required to request such
 266   accumulation.  Such functionality could also be added to EXTRACT.
 267   Prevent accumulation that crosses the month/day boundary because of
 268   the uneven number of days in a month.
 269
 270         o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
 271         o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
 272         o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
 273         o to_char(INTERVAL '3 years 5 months','MM') => 41
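
The four examples above follow one rule: the highest requested unit absorbs
everything above it. A minimal Python sketch of that arithmetic, assuming
the format lists units from largest to smallest and using hypothetical
helper names, is:

```python
# Minutes per unit for the day/time part of an interval.
MINUTES_PER_UNIT = {"WK": 7 * 24 * 60, "DD": 24 * 60, "HR": 60, "MI": 1}


def accumulate_time(hours, minutes, fmt):
    """Render hours/minutes using only the units in fmt (largest first),
    letting the first unit absorb everything above it."""
    remaining = hours * 60 + minutes
    parts = []
    for unit in fmt.split(":"):
        value, remaining = divmod(remaining, MINUTES_PER_UNIT[unit])
        parts.append(str(value))
    return ":".join(parts)


def accumulate_months(years, months):
    """Year/month part; kept separate so nothing crosses the month/day boundary."""
    return str(years * 12 + months)


results = [
    accumulate_time(1, 5, "MI"),
    accumulate_time(43, 20, "MI"),
    accumulate_time(43, 20, "WK:DD:HR:MI"),
    accumulate_months(3, 5),
]
print(results)  # ['65', '2600', '0:1:19:20', '41']
```

Keeping months and years in a separate function mirrors the restriction
above that accumulation must not cross the month/day boundary.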
 274
 275 * Add sleep() function, remove from regress.c
 276
 277
 278 Multi-Language Support
 279 ======================
 280
 281 * Add NCHAR (as distinguished from ordinary varchar)
 282 * Allow locale to be set at database creation
 283
 284   Currently locale can only be set during initdb.  No global tables have
 285   locale-aware columns.  However, the database template used during
 286   database creation might have locale-aware indexes.  The indexes would
 287   need to be reindexed to match the new locale.
 288
 289 * Allow encoding on a per-column basis
 290
 291   Right now only one encoding is allowed per database.
 292
 293 * Support multiple simultaneous character sets, per SQL92
 294 * Improve UTF8 combined character handling?
 295 * Add octet_length_server() and octet_length_client()
 296 * Make octet_length_client() the same as octet_length()?
 297 * Fix problems with wrong runtime encoding conversion for NLS message files
 298
 299
 300 Views / Rules
 301 =============
 302
 303 * %Automatically create rules on views so they are updateable, per SQL99
 304
 305   We can only auto-create rules for simple views.  For more complex
 306   cases users will still have to write rules.
 307
 308 * Add the functionality for WITH CHECK OPTION clause of CREATE VIEW
 309 * Allow NOTIFY in rules involving conditionals
 310 * Allow VIEW/RULE recompilation when the underlying tables change
 311
 312
 313 SQL Commands
 314 ============
 315
 316 * Change LIMIT/OFFSET and FETCH/MOVE to use int8
 317 * Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
 318 * Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
 319 * %Allow SET CONSTRAINTS to be qualified by schema/table name
 320 * %Allow TRUNCATE ... CASCADE/RESTRICT
 321
 322   This is like DELETE CASCADE, but truncates.
 323
 324 * %Add a separate TRUNCATE permission
 325
 326   Currently only the owner can TRUNCATE a table because triggers are not
 327   called, and the table is locked in exclusive mode.
 328
 329 * Allow PREPARE of cursors
 330 * Allow PREPARE to automatically determine parameter types based on the SQL
 331   statement
 332 * Allow finer control over the caching of prepared query plans
 333
 334   Currently, queries prepared via the libpq API are planned on first
 335   execute using the supplied parameters --- allow SQL PREPARE to do the
 336   same.  Also, allow control over replanning prepared queries either
 337   manually or automatically when statistics for execute parameters
 338   differ dramatically from those used during planning.
 339
 340 * Allow LISTEN/NOTIFY to store info in memory rather than tables?
 341
 342   Currently LISTEN/NOTIFY information is stored in pg_listener. Storing
 343   such information in memory would improve performance.
 344
 345 * Add optional textual message to NOTIFY
 346
 347   This would allow an informational message to be added to the notify
 348   message, perhaps indicating the row modified or other custom
 349   information.
 350
 351 * Add a GUC variable to warn about non-standard SQL usage in queries
 352 * Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
 353   triggers?)
 354 * Add NOVICE output level for helpful messages like automatic sequence/index
 355   creation
 356 * %Add COMMENT ON for all cluster global objects (roles, databases
 357   and tablespaces)
 358 * %Make row-wise comparisons work per SQL spec
 359 * Add RESET CONNECTION command to reset all session state
 360
 361   This would include resetting of all variables (RESET ALL), dropping of
 362   temporary tables, removing any NOTIFYs, cursors, open transactions,
 363   prepared queries, currval()s, etc.  This could be used for connection
 364   pooling.  We could also change RESET ALL to have this functionality.
 365   The difficulty of this feature is allowing RESET ALL to not affect
 366   changes made by the interface driver for its internal use.  One idea
 367   is for this to be a protocol-only feature.  Another approach is to
 368   notify the protocol when a RESET CONNECTION command is used.
 369
 370 * Add GUC to issue notice about queries that use unjoined tables
 371 * Allow EXPLAIN to identify tables that were skipped because of
 372   constraint_exclusion
 373 * Allow EXPLAIN output to be more easily processed by scripts
 374
 375
 376 * CREATE
 377
 378         o Allow CREATE TABLE AS to determine column lengths for complex
 379           expressions like SELECT col1 || col2
 380
 381         o Use more reliable method for CREATE DATABASE to get a consistent
 382           copy of db?
 383
 384         o Add ON COMMIT capability to CREATE TABLE AS ... SELECT
 385
 386
 387 * UPDATE
 388         o Allow UPDATE to handle complex aggregates [update]?
 389         o Allow an alias to be provided for the target table in
 390           UPDATE/DELETE
 391
 392           This is not SQL-spec but many DBMSs allow it.
 393
 394         o Allow UPDATE tab SET ROW (col, ...) = (...) for updating multiple
 395           columns
 396
 397
 398 * ALTER
 399
 400         o %Have ALTER TABLE RENAME rename SERIAL sequence names
 401         o Add ALTER DOMAIN to modify the underlying data type
 402         o %Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME
 403         o %Allow ALTER TABLE to change constraint deferrability and actions
 404         o Add missing object types for ALTER ... SET SCHEMA
 405         o Allow ALTER TABLESPACE to move to different directories
 406         o Allow databases to be moved to different tablespaces
 407         o Allow moving system tables to other tablespaces, where possible
 408
 409           Currently non-global system tables must be in the default database
 410           tablespace. Global system tables can never be moved.
 411
 412         o %Disallow dropping of an inherited constraint
 413         o %Prevent child tables from altering or dropping constraints
 414           like CHECK that were inherited from the parent table
 415
 416
 417 * CLUSTER
 418
 419         o Automatically maintain clustering on a table
 420
 421           This might require some background daemon to maintain clustering
 422           during periods of low usage. It might also require tables to be only
 423           partially filled for easier reorganization.  Another idea would
 424           be to create a merged heap/index data file so an index lookup would
 425           automatically access the heap data too.  A third idea would be to
 426           store heap rows in hashed groups, perhaps using a user-supplied
 427           hash function.
 428
 429         o %Add default clustering to system tables
 430
 431           To do this, determine the ideal cluster index for each system
 432           table and set the cluster setting during initdb.
 433
 434
 435 * COPY
 436
 437         o Allow COPY to report error lines and continue
 438
 439           This requires the use of a savepoint before each COPY line is
 440           processed, with ROLLBACK on COPY failure.
 441
 442         o %Have COPY return the number of rows loaded/unloaded?
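
The savepoint-per-line approach described above for COPY can be illustrated
with Python's sqlite3 module, used here purely as a stand-in for a
PostgreSQL backend. The table, the input lines, and the
copy_with_savepoints helper are all hypothetical.

```python
import sqlite3


def copy_with_savepoints(conn, lines):
    """Load comma-separated lines; roll back only the line that fails."""
    errors = []
    conn.execute("BEGIN")
    for lineno, line in enumerate(lines, start=1):
        conn.execute("SAVEPOINT copy_line")
        try:
            ident, name = line.split(",")
            conn.execute(
                "INSERT INTO t (id, name) VALUES (?, ?)", (int(ident), name)
            )
        except (ValueError, sqlite3.Error) as exc:
            # Undo just this line and remember why it failed.
            conn.execute("ROLLBACK TO SAVEPOINT copy_line")
            errors.append((lineno, str(exc)))
        finally:
            conn.execute("RELEASE SAVEPOINT copy_line")
    conn.execute("COMMIT")
    return errors


# isolation_level=None lets the sketch issue BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
errors = copy_with_savepoints(conn, ["1,alice", "oops", "1,duplicate", "2,bob"])
loaded = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
```

Here the malformed line and the duplicate key are reported in `errors`
while the two good rows are still committed.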
 443
 444
 445 * GRANT/REVOKE
 446
 447         o Allow column-level privileges
 448         o %Allow GRANT/REVOKE permissions to be applied to all schema objects
 449           with one command
 450
 451           The proposed syntax is:
 452                 GRANT SELECT ON ALL TABLES IN public TO phpuser;
 453                 GRANT SELECT ON NEW TABLES IN public TO phpuser;
 454
 455         o Allow GRANT/REVOKE permissions to be inherited by objects based on
 456           schema permissions
 457
 458
 459 * CURSOR
 460
 461         o Allow UPDATE/DELETE WHERE CURRENT OF cursor
 462
 463           This requires using the row ctid to map cursor rows back to the
 464           original heap row. This becomes more complicated if WITH HOLD cursors
 465           are to be supported because WITH HOLD cursors have a copy of the row
 466           and no FOR UPDATE lock.
 467
 468         o Prevent DROP TABLE from dropping a row referenced by its own open
 469           cursor?
 470
 471         o %Allow pooled connections to list all open WITH HOLD cursors
 472
 473           Because WITH HOLD cursors exist outside transactions, this allows
 474           them to be listed so they can be closed.
 475
 476
 477 * INSERT
 478
 479         o Allow INSERT/UPDATE of the system-generated oid value for a row
 480         o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..)
 481         o Allow INSERT/UPDATE ... RETURNING new.col or old.col
 482
 483           This is useful for returning the auto-generated key for an INSERT.
 484           One complication is how to handle rules that run as part of
 485           the insert.
 486
 487
 488 * SHOW/SET
 489
 490         o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
 491           ANALYZE, and CLUSTER
 492         o Add SET PATH for schemas?
 493
 494           This is basically the same as SET search_path.
 495
 496
 497 * Server-Side Languages
 498
 499         o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW
 500         o Allow function parameters to be passed by name,
 501           get_employee_salary(emp_id => 12345, tax_year => 2001)
 502         o Add Oracle-style packages
 503         o Add table function support to pltcl, plpython
 504         o Add capability to create and call PROCEDURES
 505         o Allow PL/pgSQL to handle %TYPE arrays, e.g. tab.col%TYPE[]
 506         o Allow function argument names to be queried from PL/PgSQL
 507         o Add MOVE to PL/pgSQL
 508         o Add support for polymorphic arguments and return types to
 509           languages other than PL/PgSQL
 510         o Add support for OUT and INOUT parameters to languages other
 511           than PL/PgSQL
 512
 513
 514 Clients
 515 =======
 516
 517 * Add a libpq function to support Parse/DescribeStatement capability
 518 * Prevent libpq's PQfnumber() from lowercasing the column name?
 519 * Add PQescapeIdentifier() to libpq
 520 * Have initdb set the input DateStyle (MDY or DMY) based on locale?
 521 * Have pg_ctl look at PGHOST in case it is a socket directory?
 522 * Allow pg_ctl to work properly with configuration files located outside
 523   the PGDATA directory
 524
 525   pg_ctl cannot read the pid file because it isn't located in the
 526   config directory but in the PGDATA directory.  The solution is to
 527   allow pg_ctl to read and understand postgresql.conf to find the
 528   data_directory value.
 529
 530
 531 * psql
 532
 533         o Have psql show current values for a sequence
 534         o Move psql backslash database information into the backend, use
 535           mnemonic commands? [psql]
 536
 537           This would allow non-psql clients to pull the same information out
 538           of the database as psql.
 539
 540         o Fix psql's display of schema information (Neil)
 541         o Allow psql \pset boolean variables to set to fixed values, rather
 542           than toggle
 543         o Consistently display privilege information for all objects in psql
 544         o Improve psql's handling of multi-line queries
 545
 546           Currently, while \e saves a single query as one entry, interactive
 547           queries are saved one line at a time.  Ideally all queries
 548           would be saved like \e does.
 549
 550         o Allow multi-line column values to align in the proper columns
 551
 552           If the second output column value is 'a\nb', the 'b' should appear
 553           in the second display column, rather than the first column as it
 554           does now.
 555
 556
 557 * pg_dump
 558
 559         o %Have pg_dump use multi-statement transactions for INSERT dumps
 560         o %Allow pg_dump to use multiple -t and -n switches [pg_dump]
 561         o %Add dumping of comments on composite type columns
 562         o %Add dumping of comments on index columns
 563         o %Replace crude DELETE FROM method of pg_dumpall --clean for
 564           cleaning of roles with separate DROP commands
 565         o Stop dumping CASCADE on DROP TYPE commands in clean mode
 566         o %Add full object name to the tag field.  e.g. for operators we need
 567           '=(integer, integer)', instead of just '='.
 568         o Add pg_dumpall custom format dumps?
 569         o %Add CSV output format
 570         o Update pg_dump and psql to use the new COPY libpq API (Christopher)
 571         o Remove unnecessary function pointer abstractions in pg_dump source
 572           code
 573
 574
 575 * ecpg
 576
 577         o Docs
 578
 579           Document differences between ecpg and the SQL standard and
 580           information about the Informix-compatibility module.
 581
 582         o Solve cardinality > 1 for input descriptors / variables?
 583         o Add a semantic check level, e.g. check if a table really exists
 584         o Fix handling of DB attributes that are arrays
 585         o Use backend PREPARE/EXECUTE facility for ecpg where possible
 586         o Implement SQLDA
 587         o Fix nested C comments
 588         o %sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified
 589         o Make SET CONNECTION thread-aware, non-standard?
 590         o Allow multidimensional arrays
 591         o Add internationalized message strings
 592
 593
 594 Referential Integrity
 595 =====================
 596
 597 * Add MATCH PARTIAL referential integrity
 598 * Add deferred trigger queue file
 599
 600   Right now all deferred trigger information is stored in backend
 601   memory.  This could exhaust memory for very large trigger queues.
 602   This item involves dumping large queues into files.
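
One way to picture the queue-file idea above is a FIFO that keeps a bounded
number of events in memory and writes the overflow to a temporary file. The
Python class below is a sketch under that assumption; the class name, the
limit, and the event tuples are hypothetical and say nothing about how the
backend would actually lay out such a file.

```python
import pickle
import tempfile


class SpillingQueue:
    """FIFO that holds up to `limit` events in memory and spills the rest."""

    def __init__(self, limit):
        self.limit = limit
        self.memory = []
        self.spill = None   # temp file, created on first overflow
        self.spilled = 0    # number of events written to the file

    def push(self, event):
        if self.spill is None and len(self.memory) < self.limit:
            self.memory.append(event)
            return
        if self.spill is None:
            self.spill = tempfile.TemporaryFile()
        pickle.dump(event, self.spill)
        self.spilled += 1

    def drain(self):
        """Yield every event in insertion order, the in-memory part first."""
        yield from self.memory
        if self.spill is not None:
            self.spill.seek(0)
            for _ in range(self.spilled):
                yield pickle.load(self.spill)
            self.spill.close()
        self.memory, self.spill, self.spilled = [], None, 0


queue = SpillingQueue(limit=2)
for i in range(5):
    queue.push(("fire_trigger", i))
spilled_before_drain = queue.spilled
events = list(queue.drain())
```

Because nothing is appended to memory once spilling has started, draining
memory first and the file second preserves the original order.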
 603
 604 * Change foreign key constraint for array -> element to mean element
 605   in array?
 606 * Allow DEFERRABLE UNIQUE constraints?
 607 * Allow triggers to be disabled in only the current session.
 608
 609   This is currently possible by starting a multi-statement transaction,
 610   modifying the system tables, performing the desired SQL, restoring the
 611   system tables, and committing the transaction.  ALTER TABLE ...
 612   TRIGGER requires a table lock so it is not ideal for this usage.
 613
 614 * With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY
 615
 616   If the dump is known to be valid, allow foreign keys to be added
 617   without revalidating the data.
 618
 619 * Allow statement-level triggers to access modified rows
 620 * Support triggers on columns (Greg Sabino Mullane)
 621 * Enforce referential integrity for system tables
 622 * Allow AFTER triggers on system tables
 623
 624   System tables are modified in many places in the backend without going
 625   through the executor and therefore not causing triggers to fire. To
 626   complete this item, the functions that modify system tables will have
 627   to fire triggers.
 628
 629
 630 Dependency Checking
 631 ===================
 632
 633 * Flush cached query plans when the dependent objects change
 634 * Track dependencies in function bodies and recompile/invalidate
 635
 636   This is particularly important for references to temporary tables
 637   in PL/PgSQL because PL/PgSQL caches query plans.  The only workaround
 638   in PL/PgSQL is to use EXECUTE.  One complexity is that a function
 639   might itself drop and recreate dependent tables, causing it to
 640   invalidate its own query plan.
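
The dependency tracking discussed above can be sketched as a plan cache that
records which objects each cached plan relies on and drops the plan when one
of them changes. The PlanCache class, the query strings, and the plan
strings below are all hypothetical.

```python
class PlanCache:
    """Cache plans by query text and drop them when a dependency changes."""

    def __init__(self):
        self.plans = {}        # query text -> plan
        self.dependents = {}   # object name -> set of query texts

    def store(self, query, plan, depends_on):
        self.plans[query] = plan
        for obj in depends_on:
            self.dependents.setdefault(obj, set()).add(query)

    def invalidate(self, obj):
        """Call when `obj` is dropped, recreated, or altered."""
        for query in self.dependents.pop(obj, set()):
            self.plans.pop(query, None)

    def get(self, query):
        return self.plans.get(query)


cache = PlanCache()
cache.store("SELECT * FROM tmp_orders", "seqscan(tmp_orders)", ["tmp_orders"])
cache.store("SELECT * FROM customers", "seqscan(customers)", ["customers"])
cache.invalidate("tmp_orders")   # e.g. the temporary table was recreated
stale = cache.get("SELECT * FROM tmp_orders")
kept = cache.get("SELECT * FROM customers")
```

A miss after invalidation would force the caller to plan the query again
against the new table.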
 641
 642
 643 Exotic Features
 644 ===============
 645
 646 * Add SQL99 WITH clause to SELECT
 647 * Add SQL99 WITH RECURSIVE to SELECT
 648 * Add pre-parsing phase that converts non-ISO syntax to supported
 649   syntax
 650
 651   This could allow SQL written for other databases to run without
 652   modification.
 653
 654 * Allow plug-in modules to emulate features from other databases
 655 * SQL*Net listener that makes PostgreSQL appear as an Oracle database
 656   to clients
 657 * Allow queries across databases or servers with transaction
 658   semantics
 659
 660   This can be done using dblink and two-phase commit.
 661
 662 * Add the features of packages
 663
 664         o  Make private objects accessible only to objects in the same schema
 665         o  Allow current_schema.objname to access current schema objects
 666         o  Add session variables
 667         o  Allow nested schemas
 668
 669
 670 Indexes
 671 =======
 672
 673 * Allow inherited tables to inherit index, UNIQUE constraint, and primary
 674   key, foreign key
 675 * UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
 676   inherited table:  INSERT INTO inherit_table (unique_index_col) VALUES
 677   (dup) should fail
 678
 679   The main difficulty with this item is the problem of creating an index
 680   that can span more than one table.
 681
 682 * Allow SELECT ... FOR UPDATE on inherited tables
 683 * Add UNIQUE capability to non-btree indexes
 684 * Prevent index uniqueness checks when UPDATE does not modify the column
 685
 686   Uniqueness (index) checks are done when updating a column even if the
 687   column is not modified by the UPDATE.
 688
 689 * Allow the creation of on-disk bitmap indexes which can be quickly
 690   combined with other bitmap indexes
 691
 692   Such indexes could be more compact if there are only a few distinct values.
 693   Such indexes can also be compressed.  Keeping such indexes updated can be
 694   costly.
 695
 696 * Allow use of indexes to search for NULLs
 697
 698   One solution is to create a partial index on an IS NULL expression.
 699
 700 * Allow accurate statistics to be collected on indexes with more than
 701   one column or expression indexes, perhaps using per-index statistics
 702 * Add fillfactor to control reserved free space during index creation
 703 * Allow the creation of indexes with mixed ascending/descending specifiers
 704 * Allow constraint_exclusion to work for UNIONs like it does for
 705   inheritance, allow it to work for UPDATE and DELETE queries, and allow
 706   it to be used for all queries with little performance impact
 707
 708
 709 * GIST
 710
 711         o Add more GIST index support for geometric data types
 712         o Allow GIST indexes to create certain complex index types, like
 713           digital trees (see Aoki)
 714
 715 * Hash
 716
 717         o Pack hash index buckets onto disk pages more efficiently
 718
 719           Currently only one hash bucket can be stored on a page. Ideally
 720           several hash buckets could be stored on a single page and greater
 721           granularity used for the hash algorithm.
 722
 723         o Consider sorting hash buckets so entries can be found using a
 724           binary search, rather than a linear scan
 725
 726         o In hash indexes, consider storing the hash value with or instead
 727           of the key itself
 728
 729         o Add WAL logging for crash recovery
 730         o Allow multi-column hash indexes
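
Two of the hash items above, sorting bucket entries for binary search and
storing the hash value with the entry, can be combined in a short Python
sketch. CRC-32 is used only as a convenient stable hash, and the
SortedBucket class and integer row pointers are hypothetical.

```python
import bisect
import zlib


def stable_hash(key):
    """Stable 32-bit hash (CRC-32 here, purely for illustration)."""
    return zlib.crc32(key.encode("utf-8"))


class SortedBucket:
    """One bucket of (hash value, row pointer) pairs kept sorted by hash."""

    def __init__(self):
        self.entries = []

    def insert(self, key, row_pointer):
        bisect.insort(self.entries, (stable_hash(key), row_pointer))

    def lookup(self, key):
        """Return row pointers whose stored hash matches the key's hash.

        Only hashes are stored, so callers must recheck the key in the heap.
        """
        target = stable_hash(key)
        # (target,) sorts before every (target, pointer) pair.
        i = bisect.bisect_left(self.entries, (target,))
        matches = []
        while i < len(self.entries) and self.entries[i][0] == target:
            matches.append(self.entries[i][1])
            i += 1
        return matches


bucket = SortedBucket()
for pointer, key in enumerate(["apple", "banana", "cherry"], start=1):
    bucket.insert(key, pointer)
banana_rows = bucket.lookup("banana")
```

The lookup costs a binary search plus a short walk over equal hashes rather
than a scan of the whole bucket.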
 731
 732
 733 Fsync
 734 =====
 735
 736 * Improve commit_delay handling to reduce fsync()
 737 * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
 738
 739   Ideally this requires a separate test program that can be run
 740   at initdb time or optionally later.
 741
 742 * %Add an option to sync() before fsync()'ing checkpoint files
 743 * Add program to test if fsync has a delay compared to non-fsync
 744
 745
 746 Cache Usage
 747 ===========
 748
 749 * Allow free-behind capability for large sequential scans, perhaps using
 750   posix_fadvise()
 751
 752   posix_fadvise() can control both sequential/random file caching and
 753   free-behind behavior, but it is unclear how the setting affects other
 754   backends that also have the file open, and the feature is not supported
 755   on all operating systems.
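
  A minimal sketch of free-behind from user space, assuming a Unix
  system where Python exposes os.posix_fadvise().  The function name
  scan_with_free_behind() and the 1 MiB chunk size are choices made for
  this example only.  The advice calls are hints, so the sketch treats
  them as best effort and ignores failures.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary size for this sketch


def _advise(fd, offset, length, advice_name):
    """Issue a posix_fadvise() hint if the platform has it; ignore failures."""
    if not hasattr(os, "posix_fadvise"):
        return
    try:
        os.posix_fadvise(fd, offset, length, getattr(os, advice_name))
    except OSError:
        pass  # the hint is optional, so a failure changes nothing


def scan_with_free_behind(path):
    """Read a file front to back, telling the kernel to drop what was read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 means "to end of file": the whole scan is sequential.
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        offset = 0
        while True:
            data = os.read(fd, CHUNK)
            if not data:
                break
            total += len(data)
            # Free-behind: the range just read is not expected to be reused.
            _advise(fd, offset, len(data), "POSIX_FADV_DONTNEED")
            offset += len(data)
    finally:
        os.close(fd)
    return total
```

  The sketch does not address the open question above: the effect of
  the advice on other processes that have the same file open.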
 756
 757 * Speed up COUNT(*)
 758
 759   We could use a fixed row count and a +/- count to follow MVCC
 760   visibility rules, or a single cached value could be used and
 761   invalidated if anyone modifies the table.  Another idea is to
 762   get a count directly from a unique index, but for this to be
 763   faster than a sequential scan it must avoid access to the heap
 764   to obtain tuple visibility information.
 765
 766 * Allow data to be pulled directly from indexes
 767
 768   Currently indexes do not have enough tuple visibility information
 769   to allow data to be pulled from the index without also accessing
 770   the heap.  One way to allow this is to set a bit on index tuples
 771   to indicate if a tuple is currently visible to all transactions
 772   when the first valid heap lookup happens.  This bit would have to
 773   be cleared when a heap tuple is expired.
 774
 775
 776 * Consider automatic caching of queries at various levels:
 777
 778         o Parsed query tree
 779         o Query execution plan
 780         o Query results
 781
 782 * Allow sequential scans to take advantage of other concurrent
 783   sequential scans, also called "Synchronised Scanning"
 784
 785   One possible implementation is to start sequential scans from the lowest
 786   numbered buffer in the shared cache, and when reaching the end wrap
 787   around to the beginning, rather than always starting sequential scans
 788   at the start of the table.
 789
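  The wrap-around order described above can be sketched as follows.
  The name synchronized_scan_order() is hypothetical, and start_page
  stands in for whatever page the implementation would pick, for
  example the lowest-numbered page of the table already in the shared
  cache.

```python
def synchronized_scan_order(num_pages, start_page):
    """Yield each page number once, starting at start_page and wrapping."""
    for i in range(num_pages):
        yield (start_page + i) % num_pages


# A new scan of a 10-page table joins while page 6 is already cached.
order = list(synchronized_scan_order(10, 6))
```

  Every page is still visited exactly once, so the scan returns the
  same rows, only in a different physical order.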
 790
 791 Vacuum
 792 ======
 793
 794 * Improve speed with indexes
 795
 796   For large table adjustments during VACUUM FULL, it is faster to
 797   reindex rather than update the index.
 798
 799 * Reduce lock time during VACUUM FULL by moving tuples with read lock,
 800   then write lock and truncate table
 801
 802   Moved tuples are invisible to other backends so they don't require a
 803   write lock. However, the read lock promotion to write lock could lead
 804   to deadlock situations.
 805
 806 * Maintain a map of recently-expired rows
 807
 808   This allows vacuum to target specific pages for possible free space
 809   without requiring a sequential scan.
 810
 811 * Auto-fill the free space map by scanning the buffer cache or by
 812   checking pages written by the background writer
 813 * Create a bitmap of pages that need vacuuming
 814
 815   Instead of sequentially scanning the entire table, have the background
 816   writer or some other process record pages that have expired rows, then
 817   VACUUM can look at just those pages rather than the entire table.  In
 818   the event of a system crash, the bitmap would probably be invalidated.
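
  A minimal sketch of such a bitmap, with one bit per heap page.  The
  class VacuumBitmap and its methods are hypothetical names.  Treating
  invalidation as "mark every page", so that the next VACUUM falls back
  to a full scan, is an assumption of this example and not something
  the item specifies.

```python
class VacuumBitmap:
    """One bit per heap page; a set bit means the page may hold expired rows."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.bits = bytearray((num_pages + 7) // 8)

    def mark(self, page):
        """Record that a page has at least one expired row."""
        self.bits[page // 8] |= 1 << (page % 8)

    def clear(self, page):
        """Record that a page has been vacuumed."""
        self.bits[page // 8] &= ~(1 << (page % 8)) & 0xFF

    def pages_to_vacuum(self):
        """Return the pages VACUUM should visit, in page order."""
        return [p for p in range(self.num_pages)
                if self.bits[p // 8] & (1 << (p % 8))]

    def invalidate(self):
        """After a crash the bitmap is untrusted: force a full scan."""
        for i in range(len(self.bits)):
            self.bits[i] = 0xFF


bitmap = VacuumBitmap(20)
bitmap.mark(3)
bitmap.mark(17)
```

  With 8 kB pages, one bit per page costs about 16 kB of bitmap per
  gigabyte of table.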
 819
 820 * %Add system view to show free space map contents
 821
 822
 823 * Auto-vacuum
 824
 825         o Use free-space map information to guide refilling
 826         o %Issue log message to suggest VACUUM FULL if a table is nearly
 827           empty?
 828         o Improve xid wraparound detection by recording per-table rather
 829           than per-database
 830
 831
 832 Locking
 833 =======
 834
 835 * Add code to detect an SMP machine and handle spinlocks accordingly
 836   from distributed.net, http://www1.distributed.net/source,
 837   in client/common/cpucheck.cpp
 838
 839   On SMP machines, it is possible that locks might be released shortly,
 840   while on non-SMP machines, the backend should sleep so the process
 841   holding the lock can complete and release it.
 842
 843 * Research use of sched_yield() for spinlock acquisition failure
 844 * Fix priority ordering of read and write light-weight locks (Neil)
 845
 846
 847 Startup Time Improvements
 848 =========================
 849
 850 * Experiment with multi-threaded backend [thread]
 851
 852   This would prevent the overhead associated with process creation. Most
 853   operating systems have trivial process creation time compared to
 854   database startup overhead, but a few operating systems (Win32,
 855   Solaris) might benefit from threading.  Also explore the idea of
 856   a single session using multiple threads to execute a query faster.
 857
 858 * Add connection pooling
 859
 860   It is unclear if this should be done inside the backend code or done
 861   by something external like pgpool. The passing of file descriptors to
 862   existing backends is one of the difficulties with a backend approach.
 863
 864
 865 Write-Ahead Log
 866 ===============
 867
 868 * Eliminate need to write full pages to WAL before page modification [wal]
 869
 870   Currently, to protect against partial disk page writes, we write
 871   full page images to WAL before they are modified so we can correct any
 872   partial page writes during recovery.  These pages can also be
 873   eliminated from point-in-time archive files.
 874
 875         o  When full page writes are off, write CRC to WAL and check
 876            file system blocks on recovery
 877
 878            If CRC check fails during recovery, remember the page in case
 879            a later CRC for that page properly matches.
 880
 881         o  Write full pages during file system write and not when
 882            the page is modified in the buffer cache
 883
 884            This allows most full page writes to happen in the background
 885            writer.  It might cause problems for applying WAL on recovery
 886            into a partially-written page, but later the full page will be
 887            replaced from WAL.
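
  The CRC check in the first sub-item can be sketched as follows.  This
  is an illustration only: zlib.crc32 stands in for whatever checksum
  WAL would record, and page_crc(), pages_failing_crc() and the
  (page number, CRC) record layout are hypothetical.  A page that fails
  one check is remembered and cleared again if a later CRC recorded for
  the same page matches.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size for this sketch


def page_crc(page):
    """Checksum of one page image; zlib.crc32 is a stand-in choice."""
    return zlib.crc32(page) & 0xFFFFFFFF


def pages_failing_crc(wal_crcs, read_page):
    """Return pages whose on-disk image matched none of their WAL CRCs.

    wal_crcs  -- iterable of (page_no, crc) pairs in WAL order
    read_page -- callable returning the on-disk bytes of a page
    """
    failed, matched = set(), set()
    for page_no, crc in wal_crcs:
        if page_crc(read_page(page_no)) == crc:
            matched.add(page_no)
        else:
            failed.add(page_no)  # remember it; a later CRC may still match
    return failed - matched


# Made-up example: page 1 matches its second CRC, page 2 never matches.
disk = {1: b"a" * PAGE_SIZE, 2: b"b" * PAGE_SIZE}
wal = [
    (1, page_crc(b"old" + b"a" * (PAGE_SIZE - 3))),
    (1, page_crc(disk[1])),
    (2, page_crc(b"x" * PAGE_SIZE)),
]
suspect = pages_failing_crc(wal, disk.__getitem__)
```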
 888
 889 * Reduce WAL traffic so only modified values are written rather than
 890   entire rows?
 891 * Allow the pg_xlog directory location to be specified during initdb
 892   with a symlink back to the /data location
 893 * Allow WAL information to recover corrupted pg_controldata
 894 * Find a way to reduce rotational delay when repeatedly writing
 895   last WAL page
 896
 897   Currently fsync of WAL requires the disk platter to perform a full
 898   rotation to fsync again. One idea is to write the WAL to different
 899   offsets that might reduce the rotational delay.
 900
 901 * Allow buffered WAL writes and fsync
 902
 903   Instead of guaranteeing recovery of all committed transactions, this
 904   would provide improved performance by delaying WAL writes and fsync
 905   so an abrupt operating system restart might lose a few seconds of
 906   committed transactions but still be consistent.  We could perhaps
 907   remove the 'fsync' parameter (which results in an inconsistent
 908   database) in favor of this capability.
 909
 910
 911 Optimizer / Executor
 912 ====================
 913
 914 * Add missing optimizer selectivities for date, r-tree, etc
 915 * Allow ORDER BY ... LIMIT # to select high/low value without sort or
 916   index using a sequential scan for highest/lowest values
 917
 918   Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
 919   all values to return the high/low value.  Instead, the idea is to do a
 920   sequential scan to find the high/low value, thus avoiding the sort.
 921   MIN/MAX already does this, but not for LIMIT > 1.
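
  One way to find the lowest N values in a single sequential pass is a
  bounded heap, sketched below.  The item does not name a technique, so
  this is only one possible reading.  The name lowest_n() is
  hypothetical, and the sketch assumes numeric sort keys so that they
  can be negated.

```python
import heapq


def lowest_n(rows, n, key):
    """Return the n rows with the smallest keys, using one pass over rows."""
    if n <= 0:
        return []
    # Entries are (-key, seq, row), so heap[0] is the largest key kept so
    # far.  seq breaks ties so that rows themselves are never compared.
    heap = []
    for seq, row in enumerate(rows):
        k = key(row)
        if len(heap) < n:
            heapq.heappush(heap, (-k, seq, row))
        elif k < -heap[0][0]:
            heapq.heapreplace(heap, (-k, seq, row))
    # Only the n survivors are sorted, never the full input.
    return [row for _, _, row in sorted(heap, key=lambda e: (-e[0], e[1]))]


rows = [(3, "c"), (1, "a"), (5, "e"), (2, "b"), (4, "d")]
lowest_two = lowest_n(rows, 2, key=lambda r: r[0])
```

  Memory use is bounded by N rather than by the table size, and the
  cost is roughly O(rows * log N) instead of a full sort.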
 922
 923 * Precompile SQL functions to avoid overhead
 924 * Create utility to compute accurate random_page_cost value
 925 * Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
 926 * Have EXPLAIN ANALYZE highlight poor optimizer estimates
 927 * Consider using hash buckets to do DISTINCT, rather than sorting
 928
 929   This would be beneficial when there are few distinct values.
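
  The difference between the two strategies can be sketched as follows.
  Both function names are hypothetical, and a Python set stands in for
  the hash buckets.

```python
def distinct_by_sort(values):
    """Sort everything first, then drop adjacent duplicates."""
    out = []
    for v in sorted(values):
        if not out or out[-1] != v:
            out.append(v)
    return out


def distinct_by_hash(values):
    """Single pass; memory grows with the number of distinct values only."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


data = ["b", "a", "b", "c", "a"]
```

  With few distinct values the hash set stays small however many input
  rows there are, while the sort has to process every row.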
 930
 931 * Log queries where the optimizer row estimates were dramatically
 932   different from the number of rows actually found?
 933
 934
 935 Miscellaneous Performance
 936 =========================
 937
 938 * Do async I/O for faster random read-ahead of data
 939
 940   Async I/O allows multiple I/O requests to be sent to the disk with
 941   results coming back asynchronously.
 942
 943 * Use mmap() rather than SYSV shared memory or to write WAL files?
 944
 945   This would remove the requirement for SYSV SHM but would introduce
 946   portability issues. Anonymous mmap (or mmap to /dev/zero) is required
 947   to prevent I/O overhead.
 948
 949 * Consider mmap()'ing files into a backend?
 950
 951   Doing I/O to large tables would consume a lot of address space or
 952   require frequent mapping/unmapping.  Extending the file also causes
 953   mapping problems that might require mapping only individual pages,
 954   leading to thousands of mappings.  Another problem is that there is no
 955   way to _prevent_ I/O to disk from the dirty shared buffers so changes
 956   could hit disk before WAL is written.
 957
 958 * Add a script to ask system configuration questions and tune postgresql.conf
 959 * Use a phantom command counter for nested subtransactions to reduce
 960   per-tuple overhead
 961 * Research storing disk pages with no alignment/padding
 962
 963 Source Code
 964 ===========
 965
 966 * Add use of 'const' for variables in source tree
 967 * Rename some /contrib modules from pg* to pg_*
 968 * Move some things from /contrib into main tree
 969 * Move some /contrib modules out to their own project sites
 970 * %Remove warnings created by -Wcast-align
 971 * Move platform-specific ps status display info from ps_status.c to ports
 972 * Add optional CRC checksum to heap and index pages
 973 * Improve documentation to build only interfaces (Marc)
 974 * Remove or relicense modules that are not under the BSD license, if possible
 975 * %Remove memory/file descriptor freeing before ereport(ERROR)
 976 * Acquire lock on a relation before building a relcache entry for it
 977 * %Promote debug_query_string into a server-side function current_query()
 978 * %Allow the identifier length to be increased via a configure option
 979 * Remove Win32 rename/unlink looping if unnecessary
 980 * Allow cross-compiling by generating the zic database on the target system
 981 * Improve NLS maintenance of libpgport messages linked into applications
 982 * Allow ecpg to work with MSVC and BCC
 983 * Add xpath_array() to /contrib/xml2 to return results as an array
 984 * Allow building in directories containing spaces
 985
 986   This is probably not possible because 'gmake' and other compiler tools
 987   do not fully support quoting of paths with spaces.
 988
 989 * Allow installing to directories containing spaces
 990
 991   This is possible if proper quoting is added to the makefiles for the
 992   install targets.  Because PostgreSQL supports relocatable installs, it
 993   is already possible to install into a directory that doesn't contain
 994   spaces and then copy the install to a directory with spaces.
 995
 996 * Fix sgmltools so PDFs can be generated with bookmarks
 997 * %Clean up compiler warnings (especially with gcc version 4)
 998
 999
1000 * Win32
1001
1002         o Remove configure.in check for link failure when cause is found
1003         o Remove readdir() errno patch when runtime/mingwex/dirent.c rev
1004           1.4 is released
1005         o Remove psql newline patch when we find out why mingw outputs an
1006           extra newline
1007         o Allow psql to use readline once non-US code pages work with
1008           backslashes
1009         o Re-enable timezone output on log_line_prefix '%t' when a
1010           shorter timezone string is available
1011         o Fix problem with shared memory on the Win32 Terminal Server
1012         o %Add support for Unicode
1013
1014           To fix this, the data needs to be converted to/from UTF16/UTF8
1015           so the Win32 wcscoll() can be used, and perhaps other functions
1016           like towupper().  However, UTF8 already works with normal
1017           locales but provides no ordering or character set classes.
1018
1019
1020 * Wire Protocol Changes
1021
1022         o Allow dynamic character set handling
1023         o Add decoded type, length, precision
1024         o Use compression?
1025         o Update clients to use data types, typmod, schema.table.column names
1026           of result sets using new query protocol
1027
1028
1029 ---------------------------------------------------------------------------
1030
1031
1032 Developers who have claimed items are:
1033 --------------------------------------
1034 * Alvaro is Alvaro Herrera <alvherre@dcc.uchile.cl>
1035 * Andrew is Andrew Dunstan <andrew@dunslane.net>
1036 * Bruce is Bruce Momjian <pgman@candle.pha.pa.us> of Software Research Assoc.
1037 * Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au> of
1038     Family Health Network
1039 * Claudio is Claudio Natoli <claudio.natoli@memetrics.com>
1040 * D'Arcy is D'Arcy J.M. Cain <darcy@druid.net> of The Cain Gang Ltd.
1041 * Fabien is Fabien Coelho <coelho@cri.ensmp.fr>
1042 * Gavin is Gavin Sherry <swm@linuxworld.com.au> of Alcove Systems Engineering
1043 * Greg is Greg Sabino Mullane <greg@turnstep.com>
1044 * Hiroshi is Hiroshi Inoue <Inoue@tpf.co.jp>
1045 * Jan is Jan Wieck <JanWieck@Yahoo.com> of Afilias, Inc.
1046 * Joe is Joe Conway <mail@joeconway.com>
1047 * Karel is Karel Zak <zakkr@zf.jcu.cz>
1048 * Magnus is Magnus Hagander <mha@sollentuna.net>
1049 * Marc is Marc Fournier <scrappy@hub.org> of PostgreSQL, Inc.
1050 * Matthew is Matthew T. O'Connor <matthew@zeut.net>
1051 * Michael is Michael Meskes <meskes@postgresql.org> of Credativ
1052 * Neil is Neil Conway <neilc@samurai.com>
1053 * Oleg is Oleg Bartunov <oleg@sai.msu.su>
1054 * Peter is Peter Eisentraut <peter_e@gmx.net>
1055 * Philip is Philip Warner <pjw@rhyme.com.au> of Albatross Consulting Pty. Ltd.
1056 * Rod is Rod Taylor <pg@rbt.ca>
1057 * Simon is Simon Riggs <simon@2ndquadrant.com>
1058 * Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
1059 * Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp> of Software Research Assoc.
1060 * Tom is Tom Lane <tgl@sss.pgh.pa.us> of Red Hat