   1 TODO list for PostgreSQL
   2 ========================
   3 Last updated:           Fri Feb 14 12:02:42 EST 2003
   4
   5 Current maintainer:     Bruce Momjian (pgman@candle.pha.pa.us)
   6
   7 The most recent version of this document can be viewed at
   8 the PostgreSQL web site, http://www.PostgreSQL.org.
   9
  10 A dash (-) marks changes that will appear in the upcoming 7.4 release.
  11
  12 Bracketed items "[]" have more detail.
  13
  14
  15 Urgent
  16 ======
  17
  18 * Add replication of distributed databases [replication]
  19         o automatic failover
  20         o load balancing
  21         o master/slave replication
  22         o multi-master replication
  23         o partition data across servers
  24         o sample implementation in contrib/rserv
  25         o queries across databases or servers (two-phase commit)
  26         o allow replication over unreliable or non-persistent links
  27         o http://gborg.postgresql.org/project/pgreplication/projdisplay.php
  28 * Point-in-time data recovery using backup and write-ahead log
  29 * Create native Win32 port [win32]
  30
  31
  32 Reporting
  33 =========
  34
  35 * Allow elog() to return error codes, module name, file name, line
  36   number, not just messages (Peter E)
  37 * Add error codes (Peter E)
  38 * Make error messages more consistent [error]
  39 * Show location of syntax error in query [yacc]
  40
  41
  42 Administration
  43 ==============
  44
  45 * Incremental backups
  46 * Remove unreferenced table files and temp tables during database vacuum
  47   or postmaster startup (Bruce)
  48 * Remove behavior of postmaster -o after making postmaster/postgres
  49   flags unique
  50 * Allow easy display of usernames in a group
  51 * Allow configuration files to be specified in a different directory
  52 * Add start time to pg_stat_activity
  53 * Allow limits on per-db/user connections
  54 * Have standalone backend read postgresql.conf
  55 * Add group object ownership, so groups can rename/drop/grant on objects,
  56   so we can implement roles
  57 * Add the concept of dataspaces/tablespaces [tablespaces]
  58 * Allow incremental backups
  59
  60
  61 Data Types
  62 ==========
  63
  64 * Add IPv6 capability to INET/CIDR types
  65 * Remove Money type, add money formatting for decimal type
  66 * Change factorial to return a numeric
  67 * Change NUMERIC data type to use base 10,000 internally
  68 * Change NUMERIC to enforce the maximum precision, and increase it
  69 * Add function to return compressed length of TOAST data values (Tom)
  70 * Allow INET subnet tests using non-constants
  71 * Add now("transaction|statement|clock") functionality
  72 * -Add GUC variables to control floating number output digits (Pedro Ferreira)
  73 * Have sequence dependency track use of DEFAULT sequences, seqname.nextval
  74 * Disallow changing default expression of a SERIAL column
  75 * Allow infinite dates just like infinite timestamps
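
To illustrate the "base 10,000" item above: a minimal sketch, assuming each stored digit packs four decimal digits and that a separate weight records the power of 10,000 of the first digit. The names NBASE, DEC_DIGITS, to_base_10000 and from_base_10000 are hypothetical and chosen only for this example; this is not the backend's actual NUMERIC code.

```python
from fractions import Fraction

NBASE = 10000      # assumed base: one stored digit covers four decimal digits
DEC_DIGITS = 4     # decimal digits per base-10000 digit


def to_base_10000(decimal_text):
    """Split a non-negative decimal string into base-10000 digits.

    Returns (digits, weight); weight is the power of NBASE of digits[0].
    """
    int_part, _, frac_part = decimal_text.partition(".")
    int_part = int_part or "0"
    # Pad the integer part on the left and the fraction on the right
    # so that both are whole groups of four decimal digits.
    int_groups = -(-len(int_part) // DEC_DIGITS)
    frac_groups = -(-len(frac_part) // DEC_DIGITS)
    int_part = int_part.zfill(int_groups * DEC_DIGITS)
    frac_part = frac_part.ljust(frac_groups * DEC_DIGITS, "0")
    text = int_part + frac_part
    digits = [int(text[i:i + DEC_DIGITS])
              for i in range(0, len(text), DEC_DIGITS)]
    return digits, int_groups - 1


def from_base_10000(digits, weight):
    """Rebuild the exact value from base-10000 digits and a weight."""
    return sum(Fraction(d) * Fraction(NBASE) ** (weight - i)
               for i, d in enumerate(digits))


print(to_base_10000("1234567.89"))   # ([123, 4567, 8900], 1)
```

Arithmetic on such digits handles four decimal places per step instead of one, which is the motivation usually given for a larger internal base.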
  76
  77
  78 * CONVERSION
  79         o Allow better handling of numeric constants, type conversion
  80           [typeconv]
  81
  82 * ARRAYS
  83         o Allow nulls in arrays
  84         o Allow arrays to be ORDER'ed
  85         o Support construction of array result values in expressions
  86
  87 * BINARY DATA
  88         o Improve vacuum of large objects, like /contrib/vacuumlo
  89         o Add security checking for large objects
  90         o Make file in/out interface for TOAST columns, similar to large object
  91           interface (force out-of-line storage and no compression)
  92         o Auto-delete large objects when referencing row is deleted
  93
  94
  95 Multi-Language Support
  96 ======================
  97
  98 * Add NCHAR (as distinguished from ordinary varchar)
  99 * Allow LOCALE on a per-column basis, default to ASCII
 100 * Support multiple simultaneous character sets, per SQL92
 101 * Improve Unicode combined character handling
 102 * Optimize locale to have minimal performance impact when not used (Peter E)
 103 * Add octet_length_server() and octet_length_client() (Thomas, Tatsuo)
 104 * Make octet_length_client the same as octet_length() (?)
 105 * Prevent mismatch of frontend/backend encodings from causing bytea
 106   data to be interpreted as encoded strings
 107 * Remove Cyrillic recode support
 108
 109
 110 Views / Rules
 111 =============
 112
 113 * Automatically create rules on views so they are updateable, per SQL92 [view]
 114 * Add the functionality for WITH CHECK OPTION clause of CREATE VIEW
 115 * Allow NOTIFY in rules involving conditionals
 116 * Have views on temporary tables exist in the temporary namespace
 117 * Move psql backslash information into views
 118 * Allow RULE recompilation
 119
 120
 121 Indexes
 122 =======
 123
 124 * Allow CREATE INDEX zman_index ON test (date_trunc( 'day', zman ) datetime_ops);
 125   this currently fails because an index can't store constant parameters
 126 * Order duplicate index entries by tid for faster heap lookups
 127 * Allow inherited tables to inherit index, UNIQUE constraint, and primary
 128   key, foreign key  [inheritance]
 129 * UNIQUE INDEX on base column not honored on inserts from inherited table
 130   INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail
 131   [inheritance]
 132 * Add UNIQUE capability to non-btree indexes
 133 * Add btree index support for reltime, tinterval, regproc
 134 * Add rtree index support for line, lseg, path, point
 135 * Certain indexes will not shrink, e.g. indexes on ever-increasing
 136   columns and indexes with many duplicate keys
 137 * Use indexes for min() and max() or convert to SELECT col FROM tab ORDER
 138   BY col DESC LIMIT 1 if appropriate index exists and WHERE clause acceptable
 139 * Allow LIKE indexing optimization for non-ASCII locales
 140 * Use index to restrict rows returned by multi-key index when used with
 141   non-consecutive keys or OR clauses, so fewer heap accesses
 142 * Be smarter about insertion of already-ordered data into btree index
 143 * Prevent index uniqueness checks when UPDATE does not modify column
 144 * Use bitmaps to fetch heap pages in sequential order [performance]
 145 * Use bitmaps to combine existing indexes [performance]
 146 * Improve handling of index scans for NULL
 147 * Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
 148   float4, numeric/decimal too [optimizer]
 149 * Add FILLFACTOR to btree index creation
 150 * Add concurrency to GIST
 151 * Improve concurrency of hash indexes (Neil)
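
The two bitmap items above can be sketched as follows. This is a minimal illustration, assuming each index scan yields (page, offset) tuple identifiers and that one bit per heap page is enough; the function names and sample TIDs are hypothetical and do not describe any existing executor code.

```python
def build_page_bitmap(tids):
    """Set one bit per heap page number found among (page, offset) TIDs."""
    bitmap = 0
    for page, _offset in tids:
        bitmap |= 1 << page
    return bitmap


def pages_in_order(bitmap):
    """Yield the page numbers whose bits are set, lowest page first."""
    page = 0
    while bitmap:
        if bitmap & 1:
            yield page
        bitmap >>= 1
        page += 1


# Hypothetical TIDs returned by two separate index scans, in index order.
index_a_tids = [(7, 2), (3, 1), (7, 5), (12, 4)]
index_b_tids = [(3, 9), (12, 1), (20, 3)]

a = build_page_bitmap(index_a_tids)
b = build_page_bitmap(index_b_tids)

print(list(pages_in_order(a)))      # [3, 7, 12]      pages visited sequentially
print(list(pages_in_order(a & b)))  # [3, 12]         both conditions (AND)
print(list(pages_in_order(a | b)))  # [3, 7, 12, 20]  either condition (OR)
```

Because this sketch tracks pages rather than individual tuples, an AND result only names candidate pages; the rows on those pages would still need to be rechecked against both conditions.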
 152
 153
 154 Commands
 155 ========
 156
 157 * Add BETWEEN ASYMMETRIC/SYMMETRIC (Christopher)
 158 * Allow LIMIT/OFFSET to use expressions
 159 * CREATE TABLE AS can not determine column lengths from expressions [atttypmod]
 160 * Allow UPDATE to handle complex aggregates [update]
 161 * Allow command blocks to ignore certain types of errors
 162 * Allow backslash handling in quoted strings to be disabled for portability
 163 * Return proper affected tuple count from complex commands [return]
 164 * Allow DELETE to handle table aliases for self-joins [delete]
 165 * Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
 166 * Allow REINDEX to rebuild all indexes, remove /contrib/reindex
 167 * Make a transaction-safe TRUNCATE
 168 * Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
 169 * Add schema option to createlang
 170
 171
 172 * ALTER
 173         o ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT
 174         o ALTER TABLE ADD COLUMN column DEFAULT should fill existing
 175           rows with DEFAULT value
 176         o ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence because
 177           of the item above
 178         o Add ALTER TABLE tab SET WITHOUT OIDS
 179         o Add ALTER SEQUENCE to modify min/max/increment/cache/cycle values
 180
 181 * CLUSTER
 182         o Automatically maintain clustering on a table
 183         o Allow CLUSTER to cluster all tables, remove clusterdb
 184
 185 * COPY
 186         o Allow dump/load of CSV format
 187         o Allow COPY to report error lines and continue;  optionally
 188           allow error codes to be specified; requires savepoints or can
 189           not be run in a multi-statement transaction
 190         o Allow copy to understand \x as hex
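
As an illustration of the "\x as hex" item, here is a minimal sketch of decoding such escapes in one text field. It assumes an escape is a backslash, an "x", and one or two hex digits, and that every other backslash pair is left untouched; the TODO item does not specify these details, and decode_hex_escapes is a hypothetical name.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_hex_escapes(field):
    """Replace \\xH and \\xHH escapes in a text field with the character
    they encode; leave every other backslash pair as it is."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            if nxt == "x":
                # Collect at most two hex digits after "\x".
                j = i + 2
                digits = ""
                while j < len(field) and len(digits) < 2 and field[j] in HEX_DIGITS:
                    digits += field[j]
                    j += 1
                if digits:
                    out.append(chr(int(digits, 16)))
                    i = j
                    continue
            # Not a hex escape: copy the backslash pair unchanged.
            out.append(ch + nxt)
            i += 2
            continue
        out.append(ch)
        i += 1
    return "".join(out)


print(decode_hex_escapes(r"a\x41b"))   # aAb
```

Scanning pair by pair keeps an escaped backslash followed by "x41" from being misread as a hex escape.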
 191
 192 * CURSOR
 193         o Allow BINARY option to SELECT, just like DECLARE
 194         o -MOVE 0 should not move to end of cursor (Bruce)
 195         o Allow UPDATE/DELETE WHERE CURRENT OF cursor using per-cursor tid
 196           stored in the backend
 197         o Prevent DROP of table being referenced by our own open cursor
 198         o Allow cursors outside transactions [cursor]
 199
 200 * INSERT
 201         o Allow INSERT/UPDATE of system-generated oid value for a row
 202         o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..)
 203         o Allow INSERT/UPDATE ... RETURNING new.col or old.col; handle
 204           RULE cases (Philip)
 205
 206 * SHOW/SET
 207         o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
 208           ANALYZE, and CLUSTER
 209         o Add SET SCHEMA
 210         o Allow EXPLAIN EXECUTE to see prepared plans
 211         o Allow SHOW of non-modifiable variables, like pg_controldata
 212         o Add GUC parameter to control the maximum number of rewrite cycles
 213
 214 * SERVER-SIDE LANGUAGES
 215         o Allow PL/PgSQL's RAISE function to take expressions
 216         o Change PL/PgSQL to use palloc() instead of malloc()
 217         o Add untrusted version of plpython
 218         o Allow Java server-side programming, http://pljava.sourceforge.net
 219           [java]
 220         o Fix problems with complex temporary table creation/destruction
 221           without using PL/PgSQL EXECUTE, needs cache prevention/invalidation
 222         o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW
 223         o Improve PL/PgSQL exception handling
 224         o Allow parameters to be specified by name and type during
 225           definition
 226         o Allow function parameters to be passed by name,
 227           get_employee_salary(emp_id => 12345, tax_year => 2001)
 228         o Add PL/PgSQL packages
 229         o Allow array declarations and other data types in PL/PgSQL DECLARE
 230         o Add PL/PgSQL PROCEDURES that can return multiple values
 231         o Add table function support to pltcl, plperl, plpython
 232         o Make PL/PgSQL %TYPE schema-aware
 233         o Allow PL/PgSQL to support array element assignment
 234
 235
 236 Clients
 237 =======
 238
 239 * Allow psql to show transaction status if backend protocol changes made
 240 * Add XML interface:  psql, pg_dump, COPY, separate server (?)
 241 * -Add schema, cast, and conversion backslash commands to psql (Christopher)
 242 * Allow pg_dump to dump a specific schema
 243 * Allow psql to do table completion for SELECT * FROM schema_part and
 244   table completion for SELECT * FROM schema_name.
 245
 246 * JDBC
 247         o Comprehensive test suite. This may be available already.
 248         o JDBC-standard BLOB support
 249         o Error Codes (pending backend implementation)
 250         o Support both 'make' and 'ant'
 251         o Fix LargeObject API to handle OIDs as unsigned ints
 252         o Use cursors implicitly to avoid large results (see setCursorName())
 253         o Add LISTEN/NOTIFY support to the JDBC driver (Barry)
 254
 255 * ECPG
 256         o Implement set descriptor, using descriptor
 257         o Make casts work in variable initializations
 258         o Implement SQLDA
 259         o Allow multi-threaded use of SQLCA
 260         o Solve cardinality > 1 for input descriptors / variables
 261         o Understand structure definitions outside a declare section
 262         o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified
 263         o Improve error handling
 264         o Allow :var[:index] or :var[<integer>] as cvariable for an array var
 265         o Add a semantic check level, e.g. check if a table really exists
 266         o Fix nested C comments
 267         o Add SQLSTATE
 268         o fix handling of DB attributes that are arrays
 269
 270 * Python
 271         o Allow users to register their own types with _pg
 272         o Allow SELECT to return a dictionary of dictionaries
 273         o Allow COPY BINARY FROM
 274
 275
 276 Referential Integrity
 277 =====================
 278
 279 * Add MATCH PARTIAL referential integrity [foreign]
 280 * Add deferred trigger queue file (Jan)
 281 * Implement dirty reads and use them in RI triggers
 282 * Enforce referential integrity for system tables
 283 * Change foreign key constraint for array -> element to mean element
 284   in array
 285 * Allow DEFERRABLE UNIQUE constraints
 286 * Allow triggers to be disabled [trigger]
 287 * -Support statement-level triggers (Neil)
 288 * Support triggers on columns (Neil)
 289
 290
 291 Dependency Checking
 292 ===================
 293
 294 * Flush cached query plans when their underlying catalog data changes
 295 * Use dependency information to dump data in proper order
 296
 297
 298 Transactions
 299 ============
 300
 301 * Overhaul bufmgr/lockmgr/transaction manager
 302 * Allow savepoints / nested transactions [transactions] (Bruce)
 303
 304
 305 Exotic Features
 306 ===============
 307
 308 * Add SQL99 WITH clause to SELECT (Tom, Fernando)
 309 * Add SQL99 WITH RECURSIVE to SELECT (Tom, Fernando)
 310 * Allow queries across multiple databases [crossdb]
 311 * Add pre-parsing phase that converts non-ANSI features to supported features
 312 * Allow plug-in modules to emulate features from other databases
 313 * SQL*Net listener that makes PostgreSQL appear as an Oracle database
 314   to clients
 315 * Two-phase commit to implement distributed transactions
 316
 317
 318 PERFORMANCE
 319 ===========
 320
 321
 322 Fsync
 323 =====
 324
 325 * Delay fsync() when other backends are about to commit too [fsync]
 326         o Determine optimal commit_delay value
 327 * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
 328         o Allow multiple blocks to be written to WAL with one write()
 329
 330
 331 Cache
 332 =====
 333 * Shared catalog cache, reduce lseek()'s by caching table size in shared area
 334 * Add free-behind capability for large sequential scans (Bruce)
 335 * Allow binding query args over FE/BE protocol
 336 * Consider use of open/fcntl(O_DIRECT) to minimize OS caching
 337 * Make blind writes go through the file descriptor cache
 338 * Cache last known per-tuple offsets to speed long tuple access
 339
 340
 341 Vacuum
 342 ======
 343
 344 * Improve speed with indexes (perhaps recreate index instead) [vacuum]
 345 * Reduce lock time by moving tuples with read lock, then write
 346   lock and truncate table [vacuum]
 347 * Provide automatic running of vacuum in the background (Tom) [vacuum]
 348 * Allow free space map to be auto-sized or warn when it is too small
 349
 350
 351 Locking
 352 =======
 353
 354 * Make locking of shared data structures more fine-grained
 355 * Add code to detect an SMP machine and handle spinlocks accordingly
 356   from distributed.net, http://www1.distributed.net/source,
 357   in client/common/cpucheck.cpp
 358 * Research use of sched_yield() for spinlock acquisition failure
 359
 360
 361 Startup Time
 362 ============
 363
 364 * Experiment with multi-threaded backend [thread]
 365 * Add connection pooling [pool]
 366 * Allow persistent backends [persistent]
 367 * Create a transaction processor to aid in persistent connections and
 368   connection pooling
 369 * Do listen() in postmaster and accept() in pre-forked backend
 370 * Have pre-forked backend pre-connect to last requested database or pass
 371   file descriptor to backend pre-forked for matching database
 372
 373
 374 Write-Ahead Log
 375 ===============
 376
 377 * Have after-change WAL write()'s write only modified data to kernel
 378 * Reduce number of after-change WAL writes; they exist only to guard against
 379   partial page writes [wal]
 380 * Turn off after-change writes if fsync is disabled (?)
 381 * Add WAL index reliability improvement to non-btree indexes
 382 * Find proper defaults for postgresql.conf WAL entries
 383 * Add checkpoint_min_warning postgresql.conf option to warn about checkpoints
 384   that are too frequent
 385 * Allow xlog directory location to be specified during initdb, perhaps
 386   using symlinks
 387 * Allow pg_xlog to be moved without symlinks
 388
 389
 390 Optimizer / Executor
 391 ====================
 392
 393 * Improve Subplan list handling
 394 * Allow Subplans to use efficient joins (hash, merge) with upper variable
 395 * -Add hash for evaluating GROUP BY aggregates (Tom)
 396 * Allow merge and hash joins on expressions not just simple variables (Tom)
 397 * Make IN/NOT IN have similar performance to EXISTS/NOT EXISTS [exists]
 398 * Missing optimizer selectivities for date, r-tree, etc. [optimizer]
 399 * Allow ORDER BY ... LIMIT to select top values without sort or index
 400   using a sequential scan for highest/lowest values (Oleg)
 401 * -Inline simple SQL functions to avoid overhead (Tom)
 402 * Precompile SQL functions to avoid overhead (Neil)
 403 * Add utility to compute accurate random_page_cost value
 404 * Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
 405 * Use CHECK constraints to improve optimizer decisions
 406 * Check GUC geqo_threshold to see if it is still accurate
 407 * Allow sorting, temp files, temp tables to use multiple work directories
 408
 409
 410 Miscellaneous
 411 =============
 412
 413 * Do async I/O for faster random read-ahead of data
 414 * -Get faster regex() code from Henry Spencer <henry@zoo.utoronto.ca>
 415 * Use mmap() rather than SYSV shared memory or to write WAL files (?) [mmap]
 416 * Improve caching of attribute offsets when NULLs exist in the row
 417
 418 * Wire Protocol Changes
 419         o Show transaction status in psql
 420         o Allow binding of query parameters, support for prepared queries
 421         o Add optional textual message to NOTIFY
 422         o Remove hard-coded limits on user/db/password names
 423         o Remove unused elements of startup packet (unused, tty, passlength)
 424         o Fix COPY/fastpath protocol?
 425         o Allow fastpath to pass values in portable format
 426         o Replication support?
 427         o Error codes
 428         o Dynamic character set handling
 429         o Special passing of binary values in platform-neutral format (bytea?)
 430         o ecpg improvements?
 431         o Add decoded type, length, precision
 432         o Compression?
 433
 434
 435 Source Code
 436 ===========
 437
 438 * Add use of 'const' for variables in source tree
 439 * Rename some /contrib modules from pg* to pg_*
 440 * Move some things from /contrib into main tree
 441 * Remove warnings created by -Wcast-align
 442 * Move platform-specific ps status display info from ps_status.c to ports
 443 * Modify regression tests to prevent failures due to minor numeric rounding
 444 * -Add OpenBSD's getpeereid() call for local socket authentication
 445 * Improve access-permissions check on data directory in Cygwin (Tom)
 446 * Add --port flag to regression tests
 447 * Add documentation for perl, including mention of DBI/DBD perl location
 448 * Add optional CRC checksum to heap and index pages
 449 * Change representation of whole-tuple parameters to functions
 450 * Clarify use of 'application' and 'command' tags in SGML docs
 451 * Better document ability to build only certain interfaces (Marc)
 452 * Remove or relicense modules that are not under the BSD license, if possible
 453 * Remove memory/file descriptor freeing before elog(ERROR)  (Bruce)
 454 * Acquire lock on a relation before building a relcache entry for it
 455 * Research interaction of setitimer() and sleep() used by statement_timeout
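
The "optional CRC checksum" item above could look roughly like this. It is a minimal sketch, assuming an 8192-byte page with a 4-byte CRC-32 stored at the front; the layout and the names stamp_page and page_is_valid are hypothetical, and zlib's CRC-32 stands in for whatever CRC the backend would actually use.

```python
import struct
import zlib

PAGE_SIZE = 8192                      # assumed page size for this sketch
CRC_FMT = "<I"                        # 4-byte little-endian unsigned CRC
CRC_SIZE = struct.calcsize(CRC_FMT)


def stamp_page(payload):
    """Build a page: a CRC-32 of the zero-padded body, then the body."""
    if len(payload) > PAGE_SIZE - CRC_SIZE:
        raise ValueError("payload does not fit in one page")
    body = payload.ljust(PAGE_SIZE - CRC_SIZE, b"\0")
    return struct.pack(CRC_FMT, zlib.crc32(body)) + body


def page_is_valid(page):
    """Recompute the CRC over the body and compare it with the stored one."""
    if len(page) != PAGE_SIZE:
        return False
    (stored,) = struct.unpack_from(CRC_FMT, page)
    return stored == zlib.crc32(page[CRC_SIZE:])


page = stamp_page(b"tuple data")
damaged = page[:100] + bytes([page[100] ^ 0xFF]) + page[101:]
print(page_is_valid(page), page_is_valid(damaged))   # True False
```

A check like this would run when a page is read from disk, so a torn or corrupted page is reported instead of being interpreted as valid data.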
 456
 457 ---------------------------------------------------------------------------
 458
 459
 460 Developers who have claimed items are:
 461 --------------------------------------
 462 * Barry is Barry Lind <barry@xythos.com>
 463 * Billy is Billy G. Allie <Bill.Allie@mug.org>
 464 * Bruce is Bruce Momjian <pgman@candle.pha.pa.us> of Software Research Assoc.
 465 * Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au> of
 466     Family Health Network
 467 * D'Arcy is D'Arcy J.M. Cain <darcy@druid.net> of The Cain Gang Ltd.
 468 * Dave is Dave Cramer <dave@fastcrypt.com>
 469 * Edmund is Edmund Mergl <E.Mergl@bawue.de>
 470 * Fernando is Fernando Nasser <fnasser@redhat.com> of Red Hat
 471 * Gavin is Gavin Sherry <swm@linuxworld.com.au> of Alcove Systems Engineering
 472 * Hiroshi is Hiroshi Inoue <Inoue@tpf.co.jp>
 473 * Karel is Karel Zak <zakkr@zf.jcu.cz>
 474 * Jan is Jan Wieck <JanWieck@Yahoo.com> of PeerDirect Corp.
 475 * Liam is Liam Stewart <liams@redhat.com> of Red Hat
 476 * Marc is Marc Fournier <scrappy@hub.org> of PostgreSQL, Inc.
 477 * Mark is Mark Hollomon <mhh@mindspring.com>
 478 * Michael is Michael Meskes <meskes@postgresql.org> of Credativ
 479 * Neil is Neil Conway <neilc@samurai.com>
 480 * Oleg is Oleg Bartunov <oleg@sai.msu.su>
 481 * Peter M is Peter T Mount <peter@retep.org.uk> of Retep Software
 482 * Peter E is Peter Eisentraut <peter_e@gmx.net>
 483 * Philip is Philip Warner <pjw@rhyme.com.au> of Albatross Consulting Pty. Ltd.
 484 * Rod is Rod Taylor <rbt@zort.ca>
 485 * Ross is Ross J. Reedstrom <reedstrm@wallace.ece.rice.edu>
 486 * Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
 487 * Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp> of Software Research Assoc.
 488 * Thomas is Thomas Lockhart <lockhart@fourpalms.org> of Jet Propulsion Laboratory
 489 * Tom is Tom Lane <tgl@sss.pgh.pa.us> of Red Hat
 490 * Vadim is Vadim B. Mikheev <vadim4o@email.com> of Sector Data