granicus.if.org Git - postgresql/blob - doc/FAQ_DEV

   1
   2           Developer's Frequently Asked Questions (FAQ) for PostgreSQL
   3
   4    Last updated: Sat May 5 10:33:44 EDT 2007
   5
   6    Current maintainer: Bruce Momjian (bruce@momjian.us)
   7
   8    The most recent version of this document can be viewed at
   9    http://www.postgresql.org/docs/faqs.FAQ_DEV.html.
  10      _________________________________________________________________
  11
  12 General Questions
  13
  14    1.1) How do I get involved in PostgreSQL development?
  15    1.2) What development environment is required to develop code?
  16    1.3) What areas need work?
  17    1.4) What do I do after choosing an item to work on?
  18    1.5) I have developed a patch, what next?
  19    1.6) How is a patch reviewed?
  20    1.7) Where can I learn more about the code?
  21    1.8) How do I download/update the current source tree?
  22    1.9) How do I test my changes?
  23    1.10) What tools are available for developers?
  24    1.11) What books are good for developers?
  25    1.12) What is configure all about?
  26    1.13) How do I add a new port?
  27    1.14) Why don't you use threads, raw devices, async-I/O, <insert your
  28    favorite wizz-bang feature here>?
  29    1.15) How are RPM's packaged?
  30    1.16) How are CVS branches handled?
  31    1.17) Where can I get a copy of the SQL standards?
  32    1.18) Where can I get technical assistance?
  33    1.19) How do I get involved in PostgreSQL web site development?
  34    1.20) Why haven't you replaced CVS with SVN, Git, Monotone, VSS,
  35    <insert your favorite SCM system here>?
  36
  37 Technical Questions
  38
  39    2.1) How do I efficiently access information in tables from the
  40    backend code?
  41    2.2) Why are table, column, type, function, view names sometimes
  42    referenced as Name or NameData, and sometimes as char *?
  43    2.3) Why do we use Node and List to make data structures?
  44    2.4) I just added a field to a structure. What else should I do?
  45    2.5) Why do we use palloc() and pfree() to allocate memory?
  46    2.6) What is ereport()?
  47    2.7) What is CommandCounterIncrement()?
  48    2.8) What debugging features are available?
  49      _________________________________________________________________
  50
  51 General Questions
  52
  53   1.1) How do I get involved in PostgreSQL development?
  54
  55    Download the code and have a look around. See 1.8.
  56
  57    Subscribe to and read the pgsql-hackers mailing list (often termed
  58    'hackers'). This is where the major contributors and core members of
  59    the project discuss development.
  60
  61   1.2) What development environment is required to develop code?
  62
  63    PostgreSQL is developed mostly in the C programming language. It also
  64    makes use of Yacc and Lex.
  65
  66    The source code is targeted at most of the popular Unix platforms and
  67    the Windows environment (XP, Windows 2000, and up).
  68
  69    Most developers make use of the open source development tool chain. If
  70    you have contributed to open source software before, you will probably
  71    be familiar with these tools. They include: GCC (http://gcc.gnu.org,
  72    GDB (www.gnu.org/software/gdb/gdb.html), autoconf
  73    (www.gnu.org/software/autoconf/) AND GNU make
  74    (www.gnu.org/software/make/make.html.
  75
  76    Developers using this tool chain on Windows make use of MingW (see
  77    http://www.mingw.org/).
  78
  79    Some developers use compilers from other software vendors with mixed
  80    results.
  81
  82    Developers who regularly rebuild the source often pass the
  83    --enable-depend flag to configure. The result is that when you make a
  84    modification to a C header file, all files depend upon that file are
  85    also rebuilt.
  86
  87    src/Makefile.custom can be used to set environment variables, like
  88    CUSTOM_COPT, that are used for every compile.
  89
  90   1.3) What areas need work?
  91
  92    Outstanding features are detailed in the TODO list. This is located in
  93    doc/TODO in the source distribution or at
  94    http://www.postgresql.org/docs/faqs.TODO.html.
  95
  96    You can learn more about these features by consulting the archives,
  97    the SQL standards and the recommend texts (see 1.11).
  98
  99   1.4) What do I do after choosing an item to work on?
 100
 101    Send an email to pgsql-hackers with a proposal for what you want to do
 102    (assuming your contribution is not trivial). Working in isolation is
 103    not advisable because others might be working on the same TODO item,
 104    or you might have misunderstood the TODO item. In the email, discuss
 105    both the internal implementation method you plan to use, and any
 106    user-visible changes (new syntax, etc). For complex patches, it is
 107    important to get community feeback on your proposal before starting
 108    work. Failure to do so might mean your patch is rejected. If your work
 109    is being sponsored by a company, read this article for tips on being
 110    more effective.
 111
 112    A web site is maintained for patches awaiting review,
 113    http://momjian.postgresql.org/cgi-bin/pgpatches, and those that are
 114    being kept for the next release,
 115    http://momjian.postgresql.org/cgi-bin/pgpatches_hold.
 116
 117   1.5) I have developed a patch, what next?
 118
 119    You will need to submit the patch to pgsql-patches@postgresql.org. It
 120    will be reviewed by other contributors to the project and will be
 121    either accepted or sent back for further work. To help ensure your
 122    patch is reviewed and committed in a timely fashion, please try to
 123    make sure your submission conforms to the following guidelines:
 124     1. Ensure that your patch is generated against the most recent
 125        version of the code, which for developers is CVS HEAD. For more on
 126        branches in PostgreSQL, see 1.16.
 127     2. Try to make your patch as readable as possible by following the
 128        project's code-layout conventions. This makes it easier for the
 129        reviewer, and there's no point in trying to layout things
 130        differently than pgindent. Also avoid unnecessary whitespace
 131        changes because they just distract the reviewer, and formatting
 132        changes will be removed by the next run of pgindent.
 133     3. The patch should be generated in contextual diff format (diff -c
 134        and should be applicable from the root directory. If you are
 135        unfamiliar with this, you might find the script
 136        src/tools/make_diff/difforig useful. (Unified diffs are only
 137        preferable if the file changes are single-line changes and do not
 138        rely on surrounding lines.)
 139     4. PostgreSQL is licensed under a BSD license. By posting a patch to
 140        the public PostgreSQL mailling lists, you are giving the
 141        PostgreSQL Global Development Group the non-revokable right to
 142        distribute your patch under the BSD license.
 143     5. Confirm that your changes can pass the regression tests. If your
 144        changes are port specific, please list the ports you have tested
 145        it on.
 146     6. If you are adding a new feature, confirm that it has been tested
 147        thoroughly. Try to test the feature in all conceivable scenarios.
 148     7. New feature patches should also be accompanied by documentation
 149        patches. If you need help checking the SQL standard, see 1.17.
 150     8. Provide an implementation overview, preferably in code comments.
 151        Following the surrounding code commenting style is usually a good
 152        approach.
 153     9. If it is a performance patch, please provide confirming test
 154        results to show the benefit of your patch. It is OK to post
 155        patches without this information, though the patch will not be
 156        applied until somebody has tested the patch and found a
 157        significant performance improvement.
 158
 159    Even if you pass all of the above, the patch might still be rejected
 160    for other reasons. Please be prepared to listen to comments and make
 161    modifications.
 162
 163    You will be notified via email when the patch is applied, and your
 164    name will appear in the next version of the release notes.
 165
 166   1.6) How is a patch reviewed?
 167
 168    Patch committers check several things before applying a patch:
 169      * Patch follows the SQL standard or community agreed-upon behavior
 170      * Style merges seamlessly into the surrounding code
 171      * Written as simply and efficiently as possible
 172      * Uses the available PostgreSQL subsystems properly
 173      * Contains sufficient comments
 174      * Contains code that works on all supported operating systems
 175      * Has proper documentation
 176      * Passes all regression tests, and if needed, adds new ones
 177      * Behaves as expected, even under unusual cirumstances
 178      * Contains no reliability risks
 179      * Does not overly complicate the source code
 180      * If performance-related, has a measureable performance benefit
 181      * Is of sufficient usefulness to the average PostgreSQL user
 182      * Follows existing PostgreSQL coding standards
 183
 184   1.7) Where can I learn more about the code?
 185
 186    Other than documentation in the source tree itself, you can find some
 187    papers/presentations discussing the code at
 188    http://www.postgresql.org/developer. An excellent presentation is at
 189    http://neilconway.org/talks/hacking/
 190
 191   1.8) How do I download/update the current source tree?
 192
 193    There are several ways to obtain the source tree. Occasional
 194    developers can just get the most recent source tree snapshot from
 195    ftp://ftp.postgresql.org.
 196
 197    Regular developers might want to take advantage of anonymous access to
 198    our source code management system. The source tree is currently hosted
 199    in CVS. For details of how to obtain the source from CVS see
 200    http://developer.postgresql.org/docs/postgres/cvs.html.
 201
 202   1.9) How do I test my changes?
 203
 204    Basic system testing
 205
 206    The easiest way to test your code is to ensure that it builds against
 207    the latest version of the code and that it does not generate compiler
 208    warnings.
 209
 210    It is worth advised that you pass --enable-cassert to configure. This
 211    will turn on assertions with in the source which will often show us
 212    bugs because they cause data corruption of segmentation violations.
 213    This generally makes debugging much easier.
 214
 215    Then, perform run time testing via psql.
 216
 217    Regression test suite
 218
 219    The next step is to test your changes against the existing regression
 220    test suite. To do this, issue "make check" in the root directory of
 221    the source tree. If any tests failure, investigate.
 222
 223    If you've deliberately changed existing behavior, this change might
 224    cause a regression test failure but not any actual regression. If so,
 225    you should also patch the regression test suite.
 226
 227    Other run time testing
 228
 229    Some developers make use of tools such as valgrind
 230    (http://valgrind.kde.org) for memory testing, gprof (which comes with
 231    the GNU binutils suite) and oprofile
 232    (http://oprofile.sourceforge.net/) for profiling and other related
 233    tools.
 234
 235    What about unit testing, static analysis, model checking...?
 236
 237    There have been a number of discussions about other testing frameworks
 238    and some developers are exploring these ideas.
 239
 240    Keep in mind the Makefiles do not have the proper dependencies for
 241    include files. You have to do a make clean and then another make. If
 242    you are using GCC you can use the --enable-depend option of configure
 243    to have the compiler compute the dependencies automatically.
 244
 245   1.10) What tools are available for developers?
 246
 247    First, all the files in the src/tools directory are designed for
 248    developers.
 249     RELEASE_CHANGES changes we have to make for each release
 250     backend         description/flowchart of the backend directories
 251     ccsym           find standard defines made by your compiler
 252      copyright       fixes copyright notices
 253
 254     entab           converts spaces to tabs, used by pgindent
 255     find_static     finds functions that could be made static
 256     find_typedef    finds typedefs in the source code
 257     find_badmacros  finds macros that use braces incorrectly
 258     fsync           a script to provide information about the cost of cache
 259                      syncing system calls
 260     make_ctags      make vi 'tags' file in each directory
 261     make_diff       make *.orig and diffs of source
 262     make_etags      make emacs 'etags' files
 263     make_keywords   make comparison of our keywords and SQL'92
 264     make_mkid       make mkid ID files
 265     pgcvslog        used to generate a list of changes for each release
 266     pginclude       scripts for adding/removing include files
 267     pgindent        indents source files
 268     pgtest          a semi-automated build system
 269     thread          a thread testing script
 270
 271    In src/include/catalog:
 272     unused_oids     a script which generates unused OIDs for use in system
 273                      catalogs
 274     duplicate_oids  finds duplicate OIDs in system catalog definitions
 275
 276    If you point your browser at the tools/backend/index.html file, you
 277    will see few paragraphs describing the data flow, the backend
 278    components in a flow chart, and a description of the shared memory
 279    area. You can click on any flowchart box to see a description. If you
 280    then click on the directory name, you will be taken to the source
 281    directory, to browse the actual source code behind it. We also have
 282    several README files in some source directories to describe the
 283    function of the module. The browser will display these when you enter
 284    the directory also. The tools/backend directory is also contained on
 285    our web page under the title How PostgreSQL Processes a Query.
 286
 287    Second, you really should have an editor that can handle tags, so you
 288    can tag a function call to see the function definition, and then tag
 289    inside that function to see an even lower-level function, and then
 290    back out twice to return to the original function. Most editors
 291    support this via tags or etags files.
 292
 293    Third, you need to get id-utils from ftp://ftp.gnu.org/gnu/id-utils/
 294
 295    By running tools/make_mkid, an archive of source symbols can be
 296    created that can be rapidly queried.
 297
 298    Some developers make use of cscope, which can be found at
 299    http://cscope.sf.net/. Others use glimpse, which can be found at
 300    http://webglimpse.net/.
 301
 302    tools/make_diff has tools to create patch diff files that can be
 303    applied to the distribution. This produces context diffs, which is our
 304    preferred format.
 305
 306    Our standard format BSD style, with each level of code indented one
 307    tab, where each tab is four spaces. You will need to set your editor
 308    or file viewer to display tabs as four spaces:
 309     vi in ~/.exrc:
 310             set tabstop=4
 311             set sw=4
 312     more:
 313             more -x4
 314     less:
 315             less -x4
 316
 317    The tools/editors directory of the latest sources contains sample
 318    settings that can be used with the emacs, xemacs and vim editors, that
 319    assist in keeping to PostgreSQL coding standards.
 320
 321    pgindent will the format code by specifying flags to your operating
 322    system's utility indent. This article describes the value of a
 323    consistent coding style.
 324
 325    pgindent is run on all source files just before each beta test period.
 326    It auto-formats all source files to make them consistent. Comment
 327    blocks that need specific line breaks should be formatted as block
 328    comments, where the comment starts as /*------. These comments will
 329    not be reformatted in any way.
 330
 331    pginclude contains scripts used to add needed #include's to include
 332    files, and removed unneeded #include's.
 333
 334    When adding system types, you will need to assign oids to them. There
 335    is also a script called unused_oids in pgsql/src/include/catalog that
 336    shows the unused oids.
 337
 338   1.11) What books are good for developers?
 339
 340    I have four good books, An Introduction to Database Systems, by C.J.
 341    Date, Addison, Wesley, A Guide to the SQL Standard, by C.J. Date, et.
 342    al, Addison, Wesley, Fundamentals of Database Systems, by Elmasri and
 343    Navathe, and Transaction Processing, by Jim Gray, Morgan, Kaufmann
 344
 345    There is also a database performance site, with a handbook on-line
 346    written by Jim Gray at http://www.benchmarkresources.com..
 347
 348   1.12) What is configure all about?
 349
 350    The files configure and configure.in are part of the GNU autoconf
 351    package. Configure allows us to test for various capabilities of the
 352    OS, and to set variables that can then be tested in C programs and
 353    Makefiles. Autoconf is installed on the PostgreSQL main server. To add
 354    options to configure, edit configure.in, and then run autoconf to
 355    generate configure.
 356
 357    When configure is run by the user, it tests various OS capabilities,
 358    stores those in config.status and config.cache, and modifies a list of
 359    *.in files. For example, if there exists a Makefile.in, configure
 360    generates a Makefile that contains substitutions for all @var@
 361    parameters found by configure.
 362
 363    When you need to edit files, make sure you don't waste time modifying
 364    files generated by configure. Edit the *.in file, and re-run configure
 365    to recreate the needed file. If you run make distclean from the
 366    top-level source directory, all files derived by configure are
 367    removed, so you see only the file contained in the source
 368    distribution.
 369
 370   1.13) How do I add a new port?
 371
 372    There are a variety of places that need to be modified to add a new
 373    port. First, start in the src/template directory. Add an appropriate
 374    entry for your OS. Also, use src/config.guess to add your OS to
 375    src/template/.similar. You shouldn't match the OS version exactly. The
 376    configure test will look for an exact OS version number, and if not
 377    found, find a match without version number. Edit src/configure.in to
 378    add your new OS. (See configure item above.) You will need to run
 379    autoconf, or patch src/configure too.
 380
 381    Then, check src/include/port and add your new OS file, with
 382    appropriate values. Hopefully, there is already locking code in
 383    src/include/storage/s_lock.h for your CPU. There is also a
 384    src/makefiles directory for port-specific Makefile handling. There is
 385    a backend/port directory if you need special files for your OS.
 386
 387   1.14) Why don't you use threads, raw devices, async-I/O, <insert your
 388   favorite wizz-bang feature here>?
 389
 390    There is always a temptation to use the newest operating system
 391    features as soon as they arrive. We resist that temptation.
 392
 393    First, we support 15+ operating systems, so any new feature has to be
 394    well established before we will consider it. Second, most new
 395    wizz-bang features don't provide dramatic improvements. Third, they
 396    usually have some downside, such as decreased reliability or
 397    additional code required. Therefore, we don't rush to use new features
 398    but rather wait for the feature to be established, then ask for
 399    testing to show that a measurable improvement is possible.
 400
 401    As an example, threads are not currently used in the backend code
 402    because:
 403      * Historically, threads were unsupported and buggy.
 404      * An error in one backend can corrupt other backends.
 405      * Speed improvements using threads are small compared to the
 406        remaining backend startup time.
 407      * The backend code would be more complex.
 408
 409    So, we are not ignorant of new features. It is just that we are
 410    cautious about their adoption. The TODO list often contains links to
 411    discussions showing our reasoning in these areas.
 412
 413   1.15) How are RPMs packaged?
 414
 415    This was written by Lamar Owen and Devrim Gündüz:
 416
 417    2006-10-16
 418
 419    As to how the RPMs are built -- to answer that question sanely
 420    requires us to know how much experience you have with the whole RPM
 421    paradigm. 'How is the RPM built?' is a multifaceted question. The
 422    obvious simple answer is that we maintain:
 423     1. A set of patches to make certain portions of the source tree
 424        'behave' in the different environment of the RPMset;
 425     2. The initscript;
 426     3. Any other ancillary scripts and files;
 427     4. A README.rpm-dist document that tries to adequately document both
 428        the differences between the RPM build and the WHY of the
 429        differences, as well as useful RPM environment operations (like,
 430        using syslog, upgrading, getting postmaster to start at OS boot,
 431        etc);
 432     5. The spec file that throws it all together. This is not a trivial
 433        undertaking in a package of this size.
 434
 435    PGDG RPM Maintainer builds the SRPM and announces the SRPM to the
 436    pgsqlrpms-hackers list. This is a list where package builders are
 437    subscribed. Then, the builders download the SRPM and rebuild it on
 438    their machines.
 439
 440    We try to build on as many different canonical distributions as we
 441    can. Currently we are able to build on Red Hat Linux 9, RHEL 3 and
 442    above, and all Fedora Core Linux releases.
 443
 444    To test the binaries, we install them on our local machines and run
 445    regression tests. If the package builders uses postgres user to build
 446    the rpms, then it is possible to run regression tests during RPM
 447    builds.
 448
 449    Once the build passes these tests, the binary RPMs are sent back to
 450    PGDG RPM Maintainer and they are pushed to main FTP site, followed by
 451    a release announcement to pgsqlrpms-* lists, pgsql-general and
 452    pgsql-announce lists.
 453
 454    You will notice we said 'canonical' distributions above. That simply
 455    means that the machine is as stock 'out of the box' as practical --
 456    that is, everything (except select few programs) on these boxen are
 457    installed by RPM; only official Red Hat released RPMs are used (except
 458    in unusual circumstances involving software that will not alter the
 459    build -- for example, installing a newer non-RedHat version of the Dia
 460    diagramming package is OK -- installing Python 2.1 on the box that has
 461    Python 1.5.2 installed is not, as that alters the PostgreSQL build).
 462    The RPM as uploaded is built to as close to out-of-the-box pristine as
 463    is possible. Only the standard released 'official to that release'
 464    compiler is used -- and only the standard official kernel is used as
 465    well.
 466
 467    PGDG RPM Building Project does not build RPMs for Mandrake .
 468
 469    We usually have only one SRPM for all platforms. This is because of
 470    our limited resources. However, on some cases, we may distribute
 471    different SRPMs for different platforms, depending on possible
 472    compilation problems, especially on older distros.
 473
 474    Please note that this is a volunteered job -- We are doing our best to
 475    keep packages up to date. We, at least, provide SRPMs for all
 476    platforms. For example, if you do not find a RHEL 4 x86_64 RPM in our
 477    FTP site, it means that we do not have a RHEL 4 x86_64 server around.
 478    If you have one and want to help us, please do not hesitate to build
 479    rpms and send to us :-)
 480    http://pgfoundry.org/docman/view.php/1000048/98/PostgreSQL-RPM-Install
 481    ation-PGDG.pdf has some information about building binary RPMs using
 482    an SRPM.
 483
 484    PGDG RPM Building Project is a hosted on pgFoundry :
 485    http://pgfoundry.org/projects/pgsqlrpms. We are an open community,
 486    except one point : Our pgsqlrpms-hackers list is open to package
 487    builders only. Still, its archives are visible to public. We use a CVS
 488    server to save the work we have done so far. This includes spec files
 489    and patches; as well as documents.
 490
 491    As to why all these files aren't part of the source tree, well, unless
 492    there was a large cry for it to happen, we don't believe it should.
 493
 494   1.16) How are CVS branches managed?
 495
 496    This was written by Tom Lane:
 497
 498    2001-05-07
 499
 500    If you just do basic "cvs checkout", "cvs update", "cvs commit", then
 501    you'll always be dealing with the HEAD version of the files in CVS.
 502    That's what you want for development, but if you need to patch past
 503    stable releases then you have to be able to access and update the
 504    "branch" portions of our CVS repository. We normally fork off a branch
 505    for a stable release just before starting the development cycle for
 506    the next release.
 507
 508    The first thing you have to know is the branch name for the branch you
 509    are interested in getting at. To do this, look at some long-lived
 510    file, say the top-level HISTORY file, with "cvs status -v" to see what
 511    the branch names are. (Thanks to Ian Lance Taylor for pointing out
 512    that this is the easiest way to do it.) Typical branch names are:
 513     REL7_1_STABLE
 514     REL7_0_PATCHES
 515     REL6_5_PATCHES
 516
 517    OK, so how do you do work on a branch? By far the best way is to
 518    create a separate checkout tree for the branch and do your work in
 519    that. Not only is that the easiest way to deal with CVS, but you
 520    really need to have the whole past tree available anyway to test your
 521    work. (And you *better* test your work. Never forget that dot-releases
 522    tend to go out with very little beta testing --- so whenever you
 523    commit an update to a stable branch, you'd better be doubly sure that
 524    it's correct.)
 525
 526    Normally, to checkout the head branch, you just cd to the place you
 527    want to contain the toplevel "pgsql" directory and say
 528     cvs ... checkout pgsql
 529
 530    To get a past branch, you cd to wherever you want it and say
 531     cvs ... checkout -r BRANCHNAME pgsql
 532
 533    For example, just a couple days ago I did
 534     mkdir ~postgres/REL7_1
 535     cd ~postgres/REL7_1
 536     cvs ... checkout -r REL7_1_STABLE pgsql
 537
 538    and now I have a maintenance copy of 7.1.*.
 539
 540    When you've done a checkout in this way, the branch name is "sticky":
 541    CVS automatically knows that this directory tree is for the branch,
 542    and whenever you do "cvs update" or "cvs commit" in this tree, you'll
 543    fetch or store the latest version in the branch, not the head version.
 544    Easy as can be.
 545
 546    So, if you have a patch that needs to apply to both the head and a
 547    recent stable branch, you have to make the edits and do the commit
 548    twice, once in your development tree and once in your stable branch
 549    tree. This is kind of a pain, which is why we don't normally fork the
 550    tree right away after a major release --- we wait for a dot-release or
 551    two, so that we won't have to double-patch the first wave of fixes.
 552
 553   1.17) Where can I get a copy of the SQL standards?
 554
 555    There are three versions of the SQL standard: SQL-92, SQL:1999, and
 556    SQL:2003. They are endorsed by ANSI and ISO. Draft versions can be
 557    downloaded from:
 558      * SQL-92 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
 559      * SQL:1999
 560        http://www.cse.iitb.ac.in/dbms/Data/Papers-Other/SQL1999/ansi-iso-
 561        9075-2-1999.pdf
 562      * SQL:2003 http://www.wiscorp.com/sql_2003_standard.zip
 563
 564    Some SQL standards web pages are:
 565      * http://troels.arvin.dk/db/rdbms/links/#standards
 566      * http://www.wiscorp.com/SQLStandards.html
 567      * http://www.contrib.andrew.cmu.edu/~shadow/sql.html#syntax (SQL-92)
 568      * http://dbs.uni-leipzig.de/en/lokal/standards.pdf (paper)
 569
 570   1.18) Where can I get technical assistance?
 571
 572    Many technical questions held by those new to the code have been
 573    answered on the pgsql-hackers mailing list - the archives of which can
 574    be found at http://archives.postgresql.org/pgsql-hackers/.
 575
 576    If you cannot find discussion or your particular question, feel free
 577    to put it to the list.
 578
 579    Major contributors also answer technical questions, including
 580    questions about development of new features, on IRC at
 581    irc.freenode.net in the #postgresql channel.
 582
 583   1.19) How do I get involved in PostgreSQL web site development?
 584
 585    PostgreSQL website development is discussed on the
 586    pgsql-www@postgresql.org mailing list. The is a project page where the
 587    source code is available at
 588    http://gborg.postgresql.org/project/pgweb/projdisplay.php , the code
 589    for the next version of the website is under the "portal" module.
 590
 591   1.20) Why haven't you replaced CVS with SVN, Git, Monotone, VSS, <insert your
 592   favorite SCMS here>?
 593
 594    Currently the core developers see no SCMS that will provide enough
 595    benefit to outwiegh the pain involved in moving to a new SCMS. Typical
 596    problems that must be addressed by any new SCMS include:
 597      * Run natively on all of our supported platforms.
 598      * Integrate into the Buildfarm.
 599      * Import our entire CVS Repository while preserving complete
 600        history.
 601      * Allow for anonymous checkouts.
 602
 603    Currently there is no intention for switching to a new SCMS until at
 604    least the end of the 8.4 development cycle sometime in late 2008. For
 605    more information please refer to the mailing list archives.
 606
 607 Technical Questions
 608
 609   2.1) How do I efficiently access information in tables from the backend code?
 610
 611    You first need to find the tuples(rows) you are interested in. There
 612    are two ways. First, SearchSysCache() and related functions allow you
 613    to query the system catalogs. This is the preferred way to access
 614    system tables, because the first call to the cache loads the needed
 615    rows, and future requests can return the results without accessing the
 616    base table. The caches use system table indexes to look up tuples. A
 617    list of available caches is located in
 618    src/backend/utils/cache/syscache.c.
 619    src/backend/utils/cache/lsyscache.c contains many column-specific
 620    cache lookup functions.
 621
 622    The rows returned are cache-owned versions of the heap rows.
 623    Therefore, you must not modify or delete the tuple returned by
 624    SearchSysCache(). What you should do is release it with
 625    ReleaseSysCache() when you are done using it; this informs the cache
 626    that it can discard that tuple if necessary. If you neglect to call
 627    ReleaseSysCache(), then the cache entry will remain locked in the
 628    cache until end of transaction, which is tolerable but not very
 629    desirable.
 630
 631    If you can't use the system cache, you will need to retrieve the data
 632    directly from the heap table, using the buffer cache that is shared by
 633    all backends. The backend automatically takes care of loading the rows
 634    into the buffer cache.
 635
 636    Open the table with heap_open(). You can then start a table scan with
 637    heap_beginscan(), then use heap_getnext() and continue as long as
 638    HeapTupleIsValid() returns true. Then do a heap_endscan(). Keys can be
 639    assigned to the scan. No indexes are used, so all rows are going to be
 640    compared to the keys, and only the valid rows returned.
 641
 642    You can also use heap_fetch() to fetch rows by block number/offset.
 643    While scans automatically lock/unlock rows from the buffer cache, with
 644    heap_fetch(), you must pass a Buffer pointer, and ReleaseBuffer() it
 645    when completed.
 646
 647    Once you have the row, you can get data that is common to all tuples,
 648    like t_self and t_oid, by merely accessing the HeapTuple structure
 649    entries. If you need a table-specific column, you should take the
 650    HeapTuple pointer, and use the GETSTRUCT() macro to access the
 651    table-specific start of the tuple. You then cast the pointer as a
 652    Form_pg_proc pointer if you are accessing the pg_proc table, or
 653    Form_pg_type if you are accessing pg_type. You can then access the
 654    columns by using a structure pointer:
 655 ((Form_pg_class) GETSTRUCT(tuple))->relnatts
 656
 657    You must not directly change live tuples in this way. The best way is
 658    to use heap_modifytuple() and pass it your original tuple, and the
 659    values you want changed. It returns a palloc'ed tuple, which you pass
 660    to heap_replace(). You can delete tuples by passing the tuple's t_self
 661    to heap_destroy(). You use t_self for heap_update() too. Remember,
 662    tuples can be either system cache copies, which might go away after
 663    you call ReleaseSysCache(), or read directly from disk buffers, which
 664    go away when you heap_getnext(), heap_endscan, or ReleaseBuffer(), in
 665    the heap_fetch() case. Or it may be a palloc'ed tuple, that you must
 666    pfree() when finished.
 667
 668   2.2) Why are table, column, type, function, view names sometimes referenced
 669   as Name or NameData, and sometimes as char *?
 670
 671    Table, column, type, function, and view names are stored in system
 672    tables in columns of type Name. Name is a fixed-length,
 673    null-terminated type of NAMEDATALEN bytes. (The default value for
 674    NAMEDATALEN is 64 bytes.)
 675 typedef struct nameData
 676     {
 677         char        data[NAMEDATALEN];
 678     } NameData;
 679     typedef NameData *Name;
 680
 681    Table, column, type, function, and view names that come into the
 682    backend via user queries are stored as variable-length,
 683    null-terminated character strings.
 684
 685    Many functions are called with both types of names, ie. heap_open().
 686    Because the Name type is null-terminated, it is safe to pass it to a
 687    function expecting a char *. Because there are many cases where
 688    on-disk names(Name) are compared to user-supplied names(char *), there
 689    are many cases where Name and char * are used interchangeably.
 690
 691   2.3) Why do we use Node and List to make data structures?
 692
 693    We do this because this allows a consistent way to pass data inside
 694    the backend in a flexible way. Every node has a NodeTag which
 695    specifies what type of data is inside the Node. Lists are groups of
 696    Nodes chained together as a forward-linked list.
 697
 698    Here are some of the List manipulation commands:
 699
 700    lfirst(i), lfirst_int(i), lfirst_oid(i)
 701           return the data (a pointer, integer or OID respectively) of
 702           list cell i.
 703
 704    lnext(i)
 705           return the next list cell after i.
 706
 707    foreach(i, list)
 708           loop through list, assigning each list cell to i. It is
 709           important to note that i is a ListCell *, not the data in the
 710           List element. You need to use lfirst(i) to get at the data.
 711           Here is a typical code snippet that loops through a List
 712           containing Var *'s and processes each one:
 713
 714
 715     List        *list;
 716     ListCell    *i;
 717
 718     foreach(i, list)
 719     {
 720         Var *var = lfirst(i);
 721
 722         /* process var here */
 723     }
 724
 725    lcons(node, list)
 726           add node to the front of list, or create a new list with node
 727           if list is NIL.
 728
 729    lappend(list, node)
 730           add node to the end of list.
 731
 732    list_concat(list1, list2)
 733           Concatenate list2 on to the end of list1.
 734
 735    list_length(list)
 736           return the length of the list.
 737
 738    list_nth(list, i)
 739           return the i'th element in list, counting from zero.
 740
 741    lcons_int, ...
 742           There are integer versions of these: lcons_int, lappend_int,
 743           etc. Also versions for OID lists: lcons_oid, lappend_oid, etc.
 744
 745    You can print nodes easily inside gdb. First, to disable output
 746    truncation when you use the gdb print command:
 747 (gdb) set print elements 0
 748
 749    Instead of printing values in gdb format, you can use the next two
 750    commands to print out List, Node, and structure contents in a verbose
 751    format that is easier to understand. List's are unrolled into nodes,
 752    and nodes are printed in detail. The first prints in a short format,
 753    and the second in a long format:
 754 (gdb) call print(any_pointer)
 755     (gdb) call pprint(any_pointer)
 756
 757    The output appears in the postmaster log file, or on your screen if
 758    you are running a backend directly without a postmaster.
 759
 760   2.4) I just added a field to a structure. What else should I do?
 761
 762    The structures passed around in the parser, rewriter, optimizer, and
 763    executor require quite a bit of support. Most structures have support
 764    routines in src/backend/nodes used to create, copy, read, and output
 765    those structures (in particular, the files copyfuncs.c and
 766    equalfuncs.c. Make sure you add support for your new field to these
 767    files. Find any other places the structure might need code for your
 768    new field. mkid is helpful with this (see 1.10).
 769
 770   2.5) Why do we use palloc() and pfree() to allocate memory?
 771
 772    palloc() and pfree() are used in place of malloc() and free() because
 773    we find it easier to automatically free all memory allocated when a
 774    query completes. This assures us that all memory that was allocated
 775    gets freed even if we have lost track of where we allocated it. There
 776    are special non-query contexts that memory can be allocated in. These
 777    affect when the allocated memory is freed by the backend.
 778
 779   2.6) What is ereport()?
 780
 781    ereport() is used to send messages to the front-end, and optionally
 782    terminate the current query being processed. The first parameter is an
 783    ereport level of DEBUG (levels 1-5), LOG, INFO, NOTICE, ERROR, FATAL,
 784    or PANIC. NOTICE prints on the user's terminal and the postmaster
 785    logs. INFO prints only to the user's terminal and LOG prints only to
 786    the server logs. (These can be changed from postgresql.conf.) ERROR
 787    prints in both places, and terminates the current query, never
 788    returning from the call. FATAL terminates the backend process. The
 789    remaining parameters of ereport are a printf-style set of parameters
 790    to print.
 791
 792    ereport(ERROR) frees most memory and open file descriptors so you
 793    don't need to clean these up before the call.
 794
 795   2.7) What is CommandCounterIncrement()?
 796
 797    Normally, transactions can not see the rows they modify. This allows
 798    UPDATE foo SET x = x + 1 to work correctly.
 799
 800    However, there are cases where a transactions needs to see rows
 801    affected in previous parts of the transaction. This is accomplished
 802    using a Command Counter. Incrementing the counter allows transactions
 803    to be broken into pieces so each piece can see rows modified by
 804    previous pieces. CommandCounterIncrement() increments the Command
 805    Counter, creating a new part of the transaction.
 806
 807   2.8) What debugging features are available?
 808
 809    First, try running configure with the --enable-cassert option, many
 810    assert()s monitor the progress of the backend and halt the program
 811    when something unexpected occurs.
 812
 813    The postmaster has a -d option that allows even more detailed
 814    information to be reported. The -d option takes a number that
 815    specifies the debug level. Be warned that high debug level values
 816    generate large log files.
 817
 818    If the postmaster is not running, you can actually run the postgres
 819    backend from the command line, and type your SQL statement directly.
 820    This is recommended only for debugging purposes. If you have compiled
 821    with debugging symbols, you can use a debugger to see what is
 822    happening. Because the backend was not started from postmaster, it is
 823    not running in an identical environment and locking/backend
 824    interaction problems might not be duplicated.
 825
 826    If the postmaster is running, start psql in one window, then find the
 827    PID of the postgres process used by psql using SELECT
 828    pg_backend_pid(). Use a debugger to attach to the postgres PID. You
 829    can set breakpoints in the debugger and issue queries from the other.
 830    If you are looking to find the location that is generating an error or
 831    log message, set a breakpoint at errfinish. psql. If you are debugging
 832    postgres startup, you can set PGOPTIONS="-W n", then start psql. This
 833    will cause startup to delay for n seconds so you can attach to the
 834    process with the debugger, set any breakpoints, and continue through
 835    the startup sequence.
 836
 837    You can also compile with profiling to see what functions are taking
 838    execution time. The backend profile files will be deposited in the
 839    pgsql/data directory. The client profile file will be put in the
 840    client's current directory. Linux requires a compile with
 841    -DLINUX_PROFILE for proper profiling.