Developer's Frequently Asked Questions (FAQ) for PostgreSQL

Last updated: Fri Jun 1 00:05:22 EDT 2007

Current maintainer: Bruce Momjian (bruce@momjian.us)

The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.FAQ_DEV.html.
_________________________________________________________________

1.1) How do I get involved in PostgreSQL development?
1.2) What development environment is required to develop code?
1.3) What areas need work?
1.4) What do I do after choosing an item to work on?
1.5) I have developed a patch, what next?
1.6) How is a patch reviewed?
1.7) Where can I learn more about the code?
1.8) How do I download/update the current source tree?
1.9) How do I test my changes?
1.10) What tools are available for developers?
1.11) What books are good for developers?
1.12) What is configure all about?
1.13) How do I add a new port?
1.14) Why don't you use threads, raw devices, async-I/O, <insert your
favorite whiz-bang feature here>?
1.15) How are RPMs packaged?
1.16) How are CVS branches handled?
1.17) Where can I get a copy of the SQL standards?
1.18) Where can I get technical assistance?
1.19) How do I get involved in PostgreSQL web site development?
1.20) Why haven't you replaced CVS with SVN, Git, Monotone, VSS,
<insert your favorite SCM system here>?

2.1) How do I efficiently access information in tables from the
backend code?
2.2) Why are table, column, type, function, view names sometimes
referenced as Name or NameData, and sometimes as char *?
2.3) Why do we use Node and List to make data structures?
2.4) I just added a field to a structure. What else should I do?
2.5) Why do we use palloc() and pfree() to allocate memory?
2.6) What is ereport()?
2.7) What is CommandCounterIncrement()?
2.8) What debugging features are available?
_________________________________________________________________

1.1) How do I get involved in PostgreSQL development?

Download the code and have a look around. See 1.8.

Subscribe to and read the pgsql-hackers mailing list (often termed
'hackers'). This is where the major contributors and core members of
the project discuss development.

1.2) What development environment is required to develop code?

PostgreSQL is developed mostly in the C programming language. It also
makes use of Yacc and Lex.

The source code is targeted at most of the popular Unix platforms and
the Windows environment (XP, Windows 2000, and up).

Most developers make use of the open source development tool chain. If
you have contributed to open source software before, you will probably
be familiar with these tools. They include: GCC (http://gcc.gnu.org),
GDB (www.gnu.org/software/gdb/gdb.html), autoconf
(www.gnu.org/software/autoconf/), and GNU make
(www.gnu.org/software/make/make.html).

Developers using this tool chain on Windows make use of MinGW (see
http://www.mingw.org/).

Some developers use compilers from other software vendors, with mixed
results.

Developers who regularly rebuild the source often pass the
--enable-depend flag to configure. The result is that when you make a
modification to a C header file, all files that depend upon that file
are also rebuilt.

src/Makefile.custom can be used to set environment variables, like
CUSTOM_COPT, that are used for every compile.

1.3) What areas need work?

Outstanding features are detailed in the TODO list. This is located in
doc/TODO in the source distribution or at
http://www.postgresql.org/docs/faqs.TODO.html.

You can learn more about these features by consulting the archives,
the SQL standards, and the recommended texts (see 1.11).

1.4) What do I do after choosing an item to work on?

Send an email to pgsql-hackers with a proposal for what you want to do
(assuming your contribution is not trivial). Working in isolation is
not advisable because others might be working on the same TODO item,
or you might have misunderstood the TODO item. In the email, discuss
both the internal implementation method you plan to use and any
user-visible changes (new syntax, etc.). For complex patches, it is
important to get community feedback on your proposal before starting
work. Failure to do so might mean your patch is rejected. If your work
is being sponsored by a company, read this article for tips on being
a good community participant.

A web site is maintained for patches awaiting review,
http://momjian.postgresql.org/cgi-bin/pgpatches, and for those that
are being kept for the next release,
http://momjian.postgresql.org/cgi-bin/pgpatches_hold.

1.5) I have developed a patch, what next?

You will need to submit the patch to pgsql-patches@postgresql.org. It
will be reviewed by other contributors to the project and will be
either accepted or sent back for further work. To help ensure your
patch is reviewed and committed in a timely fashion, please try to
make sure your submission conforms to the following guidelines:
 1. Ensure that your patch is generated against the most recent
    version of the code, which for developers is CVS HEAD. For more on
    branches in PostgreSQL, see 1.16.
 2. Try to make your patch as readable as possible by following the
    project's code-layout conventions. This makes it easier for the
    reviewer, and there's no point in trying to lay out things
    differently than pgindent. Also avoid unnecessary whitespace
    changes, because they just distract the reviewer, and formatting
    changes will be removed by the next run of pgindent.
 3. The patch should be generated in contextual diff format (diff -c)
    and should be applicable from the root directory. If you are
    unfamiliar with this, you might find the script
    src/tools/make_diff/difforig useful. (Unified diffs are only
    preferable if the file changes are single-line changes and do not
    rely on surrounding lines.)
 4. PostgreSQL is licensed under a BSD license. By posting a patch to
    the public PostgreSQL mailing lists, you are giving the
    PostgreSQL Global Development Group the non-revokable right to
    distribute your patch under the BSD license.
 5. Confirm that your changes can pass the regression tests. If your
    changes are port-specific, please list the ports you have tested
    them on.
 6. If you are adding a new feature, confirm that it has been tested
    thoroughly. Try to test the feature in all conceivable scenarios.
 7. New feature patches should also be accompanied by documentation
    patches. If you need help checking the SQL standard, see 1.17.
 8. Provide an implementation overview, preferably in code comments.
    Following the surrounding code commenting style is usually a good
    approach (also see
    http://www.ibm.com/developerworks/linux/library/l-clear-code/?ca=dgr-FClnxw01linuxcodetips).
 9. If it is a performance patch, please provide confirming test
    results to show the benefit of your patch. It is OK to post
    patches without this information, though the patch will not be
    applied until somebody has tested the patch and found a
    significant performance improvement.
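
As a minimal illustration of the context-diff format asked for in
item 3 (the file names below are made up; against a real checkout you
would instead run "cvs diff -c" from the top of the source tree):

```shell
# Two hypothetical versions of a file, just to show what diff -c emits
mkdir -p /tmp/diffdemo
printf 'int x = 1;\n' > /tmp/diffdemo/file.c.orig
printf 'int x = 2;\n' > /tmp/diffdemo/file.c

# diff exits with status 1 when the files differ, so tolerate that
diff -c /tmp/diffdemo/file.c.orig /tmp/diffdemo/file.c || true
```

In the output, lines changed in place are marked with '!', and the
surrounding context lines are shown, which is what makes context diffs
easier to review when nearby lines matter.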

Even if you pass all of the above, the patch might still be rejected
for other reasons. Please be prepared to listen to comments and make
modifications.

You will be notified via email when the patch is applied, and your
name will appear in the next version of the release notes.

1.6) How is a patch reviewed?

Patch committers check several things before applying a patch:
  * Patch follows the SQL standard or community agreed-upon behavior
  * Style merges seamlessly into the surrounding code
  * Written as simply and efficiently as possible
  * Uses the available PostgreSQL subsystems properly
  * Contains sufficient comments
  * Contains code that works on all supported operating systems
  * Has proper documentation
  * Passes all regression tests, and if needed, adds new ones
  * Behaves as expected, even under unusual circumstances
  * Contains no reliability risks
  * Does not overly complicate the source code
  * If performance-related, has a measurable performance benefit
  * Is of sufficient usefulness to the average PostgreSQL user
  * Follows existing PostgreSQL coding standards

1.7) Where can I learn more about the code?

Other than documentation in the source tree itself, you can find some
papers/presentations discussing the code at
http://www.postgresql.org/developer. An excellent presentation is at
http://neilconway.org/talks/hacking/.

1.8) How do I download/update the current source tree?

There are several ways to obtain the source tree. Occasional
developers can just get the most recent source tree snapshot from
ftp://ftp.postgresql.org.

Regular developers might want to take advantage of anonymous access to
our source code management system. The source tree is currently hosted
in CVS. For details of how to obtain the source from CVS, see
http://developer.postgresql.org/docs/postgres/cvs.html.

1.9) How do I test my changes?

Basic system testing

The easiest way to test your code is to ensure that it builds against
the latest version of the code and that it does not generate compiler
warnings.

It is also advised that you pass --enable-cassert to configure. This
will turn on assertions within the source, which often expose bugs
because they cause data corruption or segmentation violations. This
generally makes debugging much easier.

Then, perform run time testing via psql.

Regression test suite

The next step is to test your changes against the existing regression
test suite. To do this, issue "make check" in the root directory of
the source tree. If any tests fail, investigate.

If you've deliberately changed existing behavior, this change might
cause a regression test failure but not any actual regression. If so,
you should also patch the regression test suite.

Other run time testing

Some developers make use of tools such as valgrind
(http://valgrind.kde.org) for memory testing, and gprof (which comes
with the GNU binutils suite) and oprofile
(http://oprofile.sourceforge.net/) for profiling and other related
activities.

What about unit testing, static analysis, model checking...?

There have been a number of discussions about other testing frameworks
and some developers are exploring these ideas.

Keep in mind the Makefiles do not have the proper dependencies for
include files. You have to do a make clean and then another make. If
you are using GCC you can use the --enable-depend option of configure
to have the compiler compute the dependencies automatically.

1.10) What tools are available for developers?

First, all the files in the src/tools directory are designed for
developers.

    RELEASE_CHANGES changes we have to make for each release
    backend         description/flowchart of the backend directories
    ccsym           find standard defines made by your compiler
    copyright       fixes copyright notices
    entab           converts spaces to tabs, used by pgindent
    find_static     finds functions that could be made static
    find_typedef    finds typedefs in the source code
    find_badmacros  finds macros that use braces incorrectly
    fsync           a script to provide information about the cost of
                    cache syncing system calls
    make_ctags      make vi 'tags' file in each directory
    make_diff       make *.orig and diffs of source
    make_etags      make emacs 'etags' files
    make_keywords   make comparison of our keywords and SQL'92
    make_mkid       make mkid ID files
    pgcvslog        used to generate a list of changes for each release
    pginclude       scripts for adding/removing include files
    pgindent        indents source files
    pgtest          a semi-automated build system
    thread          a thread testing script

In src/include/catalog:

    unused_oids     a script which generates unused OIDs for use in
                    system catalogs
    duplicate_oids  finds duplicate OIDs in system catalog definitions

If you point your browser at the tools/backend/index.html file, you
will see a few paragraphs describing the data flow, the backend
components in a flow chart, and a description of the shared memory
area. You can click on any flowchart box to see a description. If you
then click on the directory name, you will be taken to the source
directory, to browse the actual source code behind it. We also have
several README files in some source directories to describe the
function of the module. The browser will display these when you enter
the directory also. The tools/backend directory is also contained on
our web page under the title How PostgreSQL Processes a Query.

Second, you really should have an editor that can handle tags, so you
can tag a function call to see the function definition, and then tag
inside that function to see an even lower-level function, and then
back out twice to return to the original function. Most editors
support this via tags or etags files.

Third, you need to get id-utils from ftp://ftp.gnu.org/gnu/id-utils/.
By running tools/make_mkid, an archive of source symbols can be
created that can be rapidly queried.

Some developers make use of cscope, which can be found at
http://cscope.sf.net/. Others use glimpse, which can be found at
http://webglimpse.net/.

tools/make_diff has tools to create patch diff files that can be
applied to the distribution. This produces context diffs, which is
our preferred format.

Our standard format is BSD style, with each level of code indented one
tab, where each tab is four spaces. You will need to set your editor
or file viewer to display tabs as four spaces. For example, in vi use
":set tabstop=4", and in less specify the -x4 flag.

The tools/editors directory of the latest sources contains sample
settings that can be used with the emacs, xemacs and vim editors, that
assist in keeping to PostgreSQL coding standards.

pgindent will format the code by specifying flags to your operating
system's utility indent. This article describes the value of a
consistent coding style.

pgindent is run on all source files just before each beta test period.
It auto-formats all source files to make them consistent. Comment
blocks that need specific line breaks should be formatted as block
comments, where the comment starts as /*------. These comments will
not be reformatted in any way.

pginclude contains scripts used to add needed #include's to include
files, and to remove unneeded #include's.

When adding system types, you will need to assign OIDs to them. There
is also a script called unused_oids in pgsql/src/include/catalog that
shows the unused OIDs.

1.11) What books are good for developers?

I have four good books: An Introduction to Database Systems, by C.J.
Date, Addison-Wesley; A Guide to the SQL Standard, by C.J. Date, et
al., Addison-Wesley; Fundamentals of Database Systems, by Elmasri and
Navathe; and Transaction Processing, by Jim Gray, Morgan Kaufmann.

There is also a database performance site, with a handbook on-line
written by Jim Gray, at http://www.benchmarkresources.com.

1.12) What is configure all about?

The files configure and configure.in are part of the GNU autoconf
package. Configure allows us to test for various capabilities of the
OS, and to set variables that can then be tested in C programs and
Makefiles. Autoconf is installed on the PostgreSQL main server. To add
options to configure, edit configure.in, and then run autoconf to
regenerate configure.

When configure is run by the user, it tests various OS capabilities,
stores those in config.status and config.cache, and modifies a list of
*.in files. For example, if there exists a Makefile.in, configure
generates a Makefile that contains substitutions for all @var@
parameters found by configure.
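
A toy illustration of that substitution step, using sed to stand in
for what configure's config.status actually does (the variable values
here are invented):

```shell
# A fragment of a hypothetical Makefile.in containing @var@ placeholders
printf 'CC = @CC@\nCFLAGS = @CFLAGS@\n' > /tmp/Makefile.in

# configure replaces each @var@ with the value it detected for that OS
sed -e 's/@CC@/gcc/' -e 's/@CFLAGS@/-O2/' /tmp/Makefile.in > /tmp/Makefile
cat /tmp/Makefile
```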

When you need to edit files, make sure you don't waste time modifying
files generated by configure. Edit the *.in file, and re-run configure
to recreate the needed file. If you run make distclean from the
top-level source directory, all files derived by configure are
removed, so you see only the files contained in the source
distribution.

1.13) How do I add a new port?

There are a variety of places that need to be modified to add a new
port. First, start in the src/template directory. Add an appropriate
entry for your OS. Also, use src/config.guess to add your OS to
src/template/.similar. You shouldn't match the OS version exactly. The
configure test will look for an exact OS version number, and if not
found, find a match without the version number. Edit src/configure.in
to add your new OS. (See the configure item above.) You will need to
run autoconf, or patch src/configure too.

Then, check src/include/port and add your new OS file, with
appropriate values. Hopefully, there is already locking code in
src/include/storage/s_lock.h for your CPU. There is also a
src/makefiles directory for port-specific Makefile handling. There is
a backend/port directory if you need special files for your OS.

1.14) Why don't you use threads, raw devices, async-I/O, <insert your
favorite whiz-bang feature here>?

There is always a temptation to use the newest operating system
features as soon as they arrive. We resist that temptation.

First, we support 15+ operating systems, so any new feature has to be
well established before we will consider it. Second, most new
whiz-bang features don't provide dramatic improvements. Third, they
usually have some downside, such as decreased reliability or
additional code required. Therefore, we don't rush to use new features
but rather wait for the feature to be established, then ask for
testing to show that a measurable improvement is possible.

As an example, threads are not currently used in the backend code
because:
  * Historically, threads were unsupported and buggy.
  * An error in one backend can corrupt other backends.
  * Speed improvements using threads are small compared to the
    remaining backend startup time.
  * The backend code would be more complex.

So, we are not ignorant of new features. It is just that we are
cautious about their adoption. The TODO list often contains links to
discussions showing our reasoning in these areas.

1.15) How are RPMs packaged?

This was written by Lamar Owen and Devrim Gündüz:

As to how the RPMs are built -- to answer that question sanely
requires us to know how much experience you have with the whole RPM
paradigm. 'How is the RPM built?' is a multifaceted question. The
obvious simple answer is that we maintain:
 1. A set of patches to make certain portions of the source tree
    'behave' in the different environment of the RPMset;
 3. Any other ancillary scripts and files;
 4. A README.rpm-dist document that tries to adequately document both
    the differences between the RPM build and the WHY of the
    differences, as well as useful RPM environment operations (like
    using syslog, upgrading, getting postmaster to start at OS boot,
    etc.);
 5. The spec file that throws it all together. This is not a trivial
    undertaking in a package of this size.

The PGDG RPM maintainer builds the SRPM and announces the SRPM to the
pgsqlrpms-hackers list. This is a list where package builders are
subscribed. Then, the builders download the SRPM and rebuild it on
their platforms.

We try to build on as many different canonical distributions as we
can. Currently we are able to build on Red Hat Linux 9, RHEL 3 and
above, and all Fedora Core Linux releases.

To test the binaries, we install them on our local machines and run
regression tests. If the package builder uses the postgres user to
build the RPMs, then it is possible to run regression tests during RPM
builds.

Once the build passes these tests, the binary RPMs are sent back to
the PGDG RPM maintainer and they are pushed to the main FTP site,
followed by a release announcement to the pgsqlrpms-* lists,
pgsql-general, and pgsql-announce lists.

You will notice we said 'canonical' distributions above. That simply
means that the machine is as stock 'out of the box' as practical --
that is, everything (except a select few programs) on these boxen is
installed by RPM; only official Red Hat released RPMs are used (except
in unusual circumstances involving software that will not alter the
build -- for example, installing a newer non-RedHat version of the Dia
diagramming package is OK -- installing Python 2.1 on the box that has
Python 1.5.2 installed is not, as that alters the PostgreSQL build).
The RPM as uploaded is built to as close to out-of-the-box pristine as
is possible. Only the standard released 'official to that release'
compiler is used -- and only the standard official kernel is used as
well.

The PGDG RPM Building Project does not build RPMs for Mandrake.

We usually have only one SRPM for all platforms. This is because of
our limited resources. However, in some cases, we may distribute
different SRPMs for different platforms, depending on possible
compilation problems, especially on older distros.

Please note that this is a volunteer job -- we are doing our best to
keep packages up to date. We, at least, provide SRPMs for all
platforms. For example, if you do not find a RHEL 4 x86_64 RPM on our
FTP site, it means that we do not have a RHEL 4 x86_64 server around.
If you have one and want to help us, please do not hesitate to build
the RPMs and send them to us :-)
http://pgfoundry.org/docman/view.php/1000048/98/PostgreSQL-RPM-Installation-PGDG.pdf
has some information about building binary RPMs.

The PGDG RPM Building Project is hosted on pgFoundry:
http://pgfoundry.org/projects/pgsqlrpms. We are an open community,
except for one point: our pgsqlrpms-hackers list is open to package
builders only. Still, its archives are visible to the public. We use a
CVS server to save the work we have done so far. This includes spec
files and patches, as well as documents.

As to why all these files aren't part of the source tree, well, unless
there was a large cry for it to happen, we don't believe it should.

1.16) How are CVS branches managed?

This was written by Tom Lane:

If you just do basic "cvs checkout", "cvs update", "cvs commit", then
you'll always be dealing with the HEAD version of the files in CVS.
That's what you want for development, but if you need to patch past
stable releases then you have to be able to access and update the
"branch" portions of our CVS repository. We normally fork off a branch
for a stable release just before starting the development cycle for
the next release.

The first thing you have to know is the branch name for the branch you
are interested in getting at. To do this, look at some long-lived
file, say the top-level HISTORY file, with "cvs status -v" to see what
the branch names are. (Thanks to Ian Lance Taylor for pointing out
that this is the easiest way to do it.) A typical branch name is:

    REL7_1_STABLE

OK, so how do you do work on a branch? By far the best way is to
create a separate checkout tree for the branch and do your work in
that. Not only is that the easiest way to deal with CVS, but you
really need to have the whole past tree available anyway to test your
work. (And you *better* test your work. Never forget that dot-releases
tend to go out with very little beta testing --- so whenever you
commit an update to a stable branch, you'd better be doubly sure that
it is correct.)

Normally, to checkout the head branch, you just cd to the place you
want to contain the toplevel "pgsql" directory and say

    cvs ... checkout pgsql

To get a past branch, you cd to wherever you want it and say

    cvs ... checkout -r BRANCHNAME pgsql

For example, just a couple days ago I did

    mkdir ~postgres/REL7_1
    cd ~postgres/REL7_1
    cvs ... checkout -r REL7_1_STABLE pgsql

and now I have a maintenance copy of 7.1.*.

When you've done a checkout in this way, the branch name is "sticky":
CVS automatically knows that this directory tree is for the branch,
and whenever you do "cvs update" or "cvs commit" in this tree, you'll
fetch or store the latest version in the branch, not the head version.

So, if you have a patch that needs to apply to both the head and a
recent stable branch, you have to make the edits and do the commit
twice, once in your development tree and once in your stable branch
tree. This is kind of a pain, which is why we don't normally fork the
tree right away after a major release --- we wait for a dot-release or
two, so that we won't have to double-patch the first wave of fixes.

1.17) Where can I get a copy of the SQL standards?

There are three versions of the SQL standard: SQL-92, SQL:1999, and
SQL:2003. They are endorsed by ANSI and ISO. Draft versions can be
obtained from:
  * SQL-92 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
  * SQL:1999
    http://www.cse.iitb.ac.in/dbms/Data/Papers-Other/SQL1999/ansi-iso-
  * SQL:2003 http://www.wiscorp.com/sql_2003_standard.zip

Some SQL standards web pages are:
  * http://troels.arvin.dk/db/rdbms/links/#standards
  * http://www.wiscorp.com/SQLStandards.html
  * http://www.contrib.andrew.cmu.edu/~shadow/sql.html#syntax (SQL-92)
  * http://dbs.uni-leipzig.de/en/lokal/standards.pdf (paper)

1.18) Where can I get technical assistance?

Many technical questions held by those new to the code have been
answered on the pgsql-hackers mailing list - the archives of which can
be found at http://archives.postgresql.org/pgsql-hackers/.

If you cannot find discussion of your particular question, feel free
to put it to the list.

Major contributors also answer technical questions, including
questions about development of new features, on IRC at
irc.freenode.net in the #postgresql channel.

1.19) How do I get involved in PostgreSQL web site development?

PostgreSQL web site development is discussed on the
pgsql-www@postgresql.org mailing list. There is a project page where
the source code is available at
http://gborg.postgresql.org/project/pgweb/projdisplay.php; the code
for the next version of the web site is under the "portal" module.

1.20) Why haven't you replaced CVS with SVN, Git, Monotone, VSS,
<insert your favorite SCM system here>?

Currently the core developers see no SCMS that will provide enough
benefit to outweigh the pain involved in moving to a new SCMS. Typical
requirements that must be addressed by any new SCMS include:
  * Run natively on all of our supported platforms.
  * Integrate into the Buildfarm.
  * Import our entire CVS repository while preserving complete
    history.
  * Allow for anonymous checkouts.

Currently there is no intention of switching to a new SCMS until at
least the end of the 8.4 development cycle, sometime in late 2008. For
more information please refer to the mailing list archives.

2.1) How do I efficiently access information in tables from the backend code?

You first need to find the tuples (rows) you are interested in. There
are two ways. First, SearchSysCache() and related functions allow you
to query the system catalogs. This is the preferred way to access
system tables, because the first call to the cache loads the needed
rows, and future requests can return the results without accessing the
base table. The caches use system table indexes to look up tuples. A
list of available caches is located in
src/backend/utils/cache/syscache.c.
src/backend/utils/cache/lsyscache.c contains many column-specific
cache lookup functions.

The rows returned are cache-owned versions of the heap rows.
Therefore, you must not modify or delete the tuple returned by
SearchSysCache(). What you should do is release it with
ReleaseSysCache() when you are done using it; this informs the cache
that it can discard that tuple if necessary. If you neglect to call
ReleaseSysCache(), then the cache entry will remain locked in the
cache until end of transaction, which is tolerable but not very
desirable.
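
A sketch of the usual lookup pattern (this only compiles inside the
backend; the particular cache ID and catalog used here are just an
example):

```c
/* Look up a type by OID in the syscache -- sketch only */
HeapTuple	tup;
Form_pg_type typform;

tup = SearchSysCache(TYPEOID, ObjectIdGetDatum(typeoid), 0, 0, 0);
if (!HeapTupleIsValid(tup))
    elog(ERROR, "cache lookup failed for type %u", typeoid);
typform = (Form_pg_type) GETSTRUCT(tup);

/* read (but never modify) fields through typform here */

ReleaseSysCache(tup);		/* let the cache discard the tuple */
```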

If you can't use the system cache, you will need to retrieve the data
directly from the heap table, using the buffer cache that is shared by
all backends. The backend automatically takes care of loading the rows
into the buffer cache.

Open the table with heap_open(). You can then start a table scan with
heap_beginscan(), then use heap_getnext() and continue as long as
HeapTupleIsValid() returns true. Then do a heap_endscan(). Keys can be
assigned to the scan. No indexes are used, so all rows are going to be
compared to the keys, and only the valid rows returned.
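
Put together, a sequential scan looks roughly like this (a
backend-only sketch; scanning pg_class with no scan keys is just an
example):

```c
Relation	rel;
HeapScanDesc scan;
HeapTuple	tup;

rel = heap_open(RelationRelationId, AccessShareLock);
scan = heap_beginscan(rel, SnapshotNow, 0, NULL);	/* no scan keys */

while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
{
    /* examine tup here */
}

heap_endscan(scan);
heap_close(rel, AccessShareLock);
```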

You can also use heap_fetch() to fetch rows by block number/offset.
While scans automatically lock/unlock rows from the buffer cache, with
heap_fetch() you must pass a Buffer pointer, and ReleaseBuffer() it
when done.

Once you have the row, you can get data that is common to all tuples,
like t_self and t_oid, by merely accessing the HeapTuple structure
entries. If you need a table-specific column, you should take the
HeapTuple pointer, and use the GETSTRUCT() macro to access the
table-specific start of the tuple. You then cast the pointer as a
Form_pg_proc pointer if you are accessing the pg_proc table, or
Form_pg_type if you are accessing pg_type. You can then access the
columns by using a structure pointer:

    ((Form_pg_class) GETSTRUCT(tuple))->relnatts

You must not directly change live tuples in this way. The best way is
to use heap_modifytuple() and pass it your original tuple, and the
values you want changed. It returns a palloc'ed tuple, which you pass
to heap_replace(). You can delete tuples by passing the tuple's t_self
to heap_destroy(). You use t_self for heap_update() too. Remember,
tuples can be either system cache copies, which might go away after
you call ReleaseSysCache(), or read directly from disk buffers, which
go away when you heap_getnext(), heap_endscan(), or ReleaseBuffer() in
the heap_fetch() case. Or it may be a palloc'ed tuple, that you must
pfree() when finished.

2.2) Why are table, column, type, function, view names sometimes referenced
as Name or NameData, and sometimes as char *?

Table, column, type, function, and view names are stored in system
tables in columns of type Name. Name is a fixed-length,
null-terminated type of NAMEDATALEN bytes. (The default value for
NAMEDATALEN is 64 bytes.)

    typedef struct nameData
    {
        char        data[NAMEDATALEN];
    } NameData;
    typedef NameData *Name;

Table, column, type, function, and view names that come into the
backend via user queries are stored as variable-length,
null-terminated character strings.

Many functions are called with both types of names, e.g. heap_open().
Because the Name type is null-terminated, it is safe to pass it to a
function expecting a char *. Because there are many cases where
on-disk names (Name) are compared to user-supplied names (char *),
there are many cases where Name and char * are used interchangeably.

2.3) Why do we use Node and List to make data structures?

We do this because it allows a consistent way to pass data inside the
backend in a flexible way. Every node has a NodeTag which specifies
what type of data is inside the Node. Lists are groups of Nodes
chained together as a forward-linked list.

Here are some of the List manipulation commands:

    lfirst(i), lfirst_int(i), lfirst_oid(i)
        return the data (a pointer, integer or OID respectively) of
        list cell i.

    lnext(i)
        return the next list cell after i.

    foreach(i, list)
        loop through list, assigning each list cell to i. It is
        important to note that i is a ListCell *, not the data in the
        List element. You need to use lfirst(i) to get at the data.
        Here is a typical code snippet that loops through a List
        containing Var *'s and processes each one:

            ListCell   *i;

            foreach(i, list)
            {
                Var        *var = lfirst(i);

                /* process var here */
            }

    lcons(node, list)
        add node to the front of list, or create a new list with node
        if list is NIL.

    lappend(list, node)
        add node to the end of list.

    list_concat(list1, list2)
        Concatenate list2 on to the end of list1.

    list_length(list)
        return the length of the list.

    list_nth(list, i)
        return the i'th element in list, counting from zero.

There are integer versions of these: lcons_int, lappend_int, etc.
There are also versions for OID lists: lcons_oid, lappend_oid, etc.
You can print nodes easily inside gdb. First, to disable output
truncation when you use the gdb print command:

    (gdb) set print elements 0

Instead of printing values in gdb format, you can use the next two
commands to print out List, Node, and structure contents in a verbose
format that is easier to understand. Lists are unrolled into nodes,
and nodes are printed in detail. The first prints in a short format,
and the second in a long format:

    (gdb) call print(any_pointer)
    (gdb) call pprint(any_pointer)

The output appears in the postmaster log file, or on your screen if
you are running a backend directly without a postmaster.

2.4) I just added a field to a structure. What else should I do?

The structures passed around in the parser, rewriter, optimizer, and
executor require quite a bit of support. Most structures have support
routines in src/backend/nodes used to create, copy, read, and output
those structures (in particular, the files copyfuncs.c and
equalfuncs.c). Make sure you add support for your new field to these
files. Find any other places the structure might need code for your
new field. mkid is helpful with this (see 1.10).

2.5) Why do we use palloc() and pfree() to allocate memory?

palloc() and pfree() are used in place of malloc() and free() because
we find it easier to automatically free all memory allocated when a
query completes. This assures us that all memory that was allocated
gets freed even if we have lost track of where we allocated it. There
are special non-query contexts that memory can be allocated in. These
affect when the allocated memory is freed by the backend.

2.6) What is ereport()?

ereport() is used to send messages to the front-end, and optionally
terminate the current query being processed. The first parameter is an
ereport level of DEBUG (levels 1-5), LOG, INFO, NOTICE, ERROR, FATAL,
or PANIC. NOTICE prints on the user's terminal and in the postmaster
logs. INFO prints only to the user's terminal and LOG prints only to
the server logs. (These can be changed from postgresql.conf.) ERROR
prints in both places, and terminates the current query, never
returning from the call. FATAL terminates the backend process. The
remaining parameters of ereport are a printf-style set of parameters.

ereport(ERROR) frees most memory and open file descriptors, so you
don't need to clean these up before the call.

2.7) What is CommandCounterIncrement()?

Normally, transactions cannot see the rows they modify. This allows
UPDATE foo SET x = x + 1 to work correctly.

However, there are cases where a transaction needs to see rows
affected in previous parts of the transaction. This is accomplished
using a Command Counter. Incrementing the counter allows transactions
to be broken into pieces so each piece can see rows modified by
previous pieces. CommandCounterIncrement() increments the Command
Counter, creating a new part of the transaction.

2.8) What debugging features are available?

First, try running configure with the --enable-cassert option; many
assert()s monitor the progress of the backend and halt the program
when something unexpected occurs.

The postmaster has a -d option that allows even more detailed
information to be reported. The -d option takes a number that
specifies the debug level. Be warned that high debug level values
generate large log files.

If the postmaster is not running, you can actually run the postgres
backend from the command line, and type your SQL statement directly.
This is recommended only for debugging purposes. If you have compiled
with debugging symbols, you can use a debugger to see what is
happening. Because the backend was not started from postmaster, it is
not running in an identical environment and locking/backend
interaction problems might not be duplicated.

If the postmaster is running, start psql in one window, then find the
PID of the postgres process used by psql using SELECT
pg_backend_pid(). Use a debugger to attach to the postgres PID. You
can set breakpoints in the debugger and issue queries from the psql
window. If you are looking to find the location that is generating an
error or log message, set a breakpoint at errfinish. If you are
debugging postgres startup, you can set PGOPTIONS="-W n", then start
psql. This will cause startup to delay for n seconds so you can attach
to the process with the debugger, set any breakpoints, and continue
through the startup sequence.

You can also compile with profiling to see what functions are taking
execution time. The backend profile files will be deposited in the
pgsql/data directory. The client profile file will be put in the
client's current directory. Linux requires a compile with
-DLINUX_PROFILE for proper profiling.