$PostgreSQL: pgsql/doc/src/sgml/sources.sgml,v 2.14 2004/05/08 02:13:31 momjian Exp $
<title>PostgreSQL Coding Conventions</title>
<sect1 id="source-format">
<title>Formatting</title>
Source code formatting uses 4-column tab spacing, with
tabs preserved (i.e., tabs are not expanded to spaces).
Each logical indentation level is one additional tab stop.
Layout rules (brace positioning, etc.) follow BSD conventions.
While submitted patches do not absolutely have to follow these formatting
rules, it's a good idea to do so. Your code will get run through
<application>pgindent</>, so there's no point in making it look nice
under some other set of formatting conventions.
For <productname>Emacs</productname>, add the following (or
something similar) to your <filename>~/.emacs</filename>
initialization file:
;; check for files with a path containing "postgres" or "pgsql"
(setq auto-mode-alist
  (cons '("\\(postgres\\|pgsql\\).*\\.[ch]\\'" . pgsql-c-mode)
        auto-mode-alist))
(setq auto-mode-alist
  (cons '("\\(postgres\\|pgsql\\).*\\.cc\\'" . pgsql-c-mode)
        auto-mode-alist))
(defun pgsql-c-mode ()
  ;; sets up formatting for PostgreSQL C code
  (interactive)
  (c-mode)
  (setq-default tab-width 4)
  (c-set-style "bsd")             ; set c-basic-offset to 4, plus other stuff
  (c-set-offset 'case-label '+)   ; tweak case indent to match PG custom
  (setq indent-tabs-mode t))      ; make sure we keep tabs when indenting
For <application>vi</application>, your
<filename>~/.vimrc</filename> or equivalent file should contain
the following:
set tabstop=4
or equivalently from within <application>vi</application>, try
:set ts=4
The text browsing tools <application>more</application> and
<application>less</application> can be invoked as
more -x4
less -x4
to make them show tabs appropriately.
<sect1 id="error-message-reporting">
<title>Reporting Errors Within the Server</title>
<primary>ereport</primary>
<primary>elog</primary>
Error, warning, and log messages generated within the server code
should be created using <function>ereport</>, or its older cousin
<function>elog</>. The use of these functions is complex enough to
require some explanation.
There are two required elements for every message: a severity level
(ranging from <literal>DEBUG</> to <literal>PANIC</>) and a primary
message text. In addition there are optional elements, the most
common of which is an error identifier code that follows the SQL spec's
SQLSTATE conventions.
<function>ereport</> itself is just a shell function that exists
mainly for the syntactic convenience of making message generation
look like a function call in the C source code. The only parameter
accepted directly by <function>ereport</> is the severity level.
The primary message text and any optional message elements are
generated by calling auxiliary functions, such as <function>errmsg</>,
within the <function>ereport</> call.
A typical call to <function>ereport</> might look like this:
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
This specifies error severity level <literal>ERROR</> (a run-of-the-mill
error). The <function>errcode</> call specifies the SQLSTATE error code
using a macro defined in <filename>src/include/utils/errcodes.h</>. The
<function>errmsg</> call provides the primary message text. Notice the
extra set of parentheses surrounding the auxiliary function calls;
these are annoying but syntactically necessary.
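To see why the extra parentheses are needed, it can help to model
<function>ereport</> as a two-argument macro. The following is a minimal,
self-contained sketch under that assumption, not the server's actual
definition: every <literal>toy_</> name is hypothetical, and the real
machinery does considerably more work.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels, in increasing order of severity. */
enum toy_level { TOY_DEBUG, TOY_NOTICE, TOY_WARNING, TOY_ERROR };

/* The report being assembled (a toy stand-in for server-side state). */
static struct
{
	int			level;
	char		sqlstate[6];
	char		message[256];
} toy_report;

/* Begin a report; returning 0 would mean "nothing to report, skip the rest". */
static int
toy_errstart(int level)
{
	toy_report.level = level;
	toy_report.sqlstate[0] = '\0';	/* no code chosen yet */
	toy_report.message[0] = '\0';
	return 1;
}

/* Auxiliary routine: record the SQLSTATE code. */
static int
toy_errcode(const char *sqlstate)
{
	snprintf(toy_report.sqlstate, sizeof(toy_report.sqlstate), "%s", sqlstate);
	return 0;
}

/* Auxiliary routine: format and record the primary message text. */
static int
toy_errmsg(const char *fmt, ...)
{
	va_list		ap;

	va_start(ap, fmt);
	vsnprintf(toy_report.message, sizeof(toy_report.message), fmt, ap);
	va_end(ap);
	return 0;
}

/*
 * Emit the finished report. The parameters are never inspected; they exist
 * only so that the auxiliary calls are evaluated as this call's arguments.
 */
static void
toy_errfinish(int dummy, ...)
{
	(void) dummy;
	fprintf(stderr, "level %d, SQLSTATE %s: %s\n",
			toy_report.level, toy_report.sqlstate, toy_report.message);
}

/* "rest" must arrive as ONE macro argument, hence the caller's parentheses. */
#define toy_ereport(level, rest) \
	(toy_errstart(level) ? (toy_errfinish rest) : (void) 0)
```

With these definitions,
<literal>toy_ereport(TOY_ERROR, (toy_errcode("22012"), toy_errmsg("division by zero")));</>
expands to a call of <function>toy_errfinish</> whose arguments are the
auxiliary calls. Without the inner parentheses the preprocessor would see
three macro arguments instead of two and reject the call.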
Here is a more complex example:
ereport(ERROR,
        (errcode(ERRCODE_AMBIGUOUS_FUNCTION),
         errmsg("function %s is not unique",
                func_signature_string(funcname, nargs,
                                      actual_arg_types)),
         errhint("Unable to choose a best candidate function. "
                 "You may need to add explicit typecasts.")));
This illustrates the use of format codes to embed run-time values into
a message text. Also, an optional <quote>hint</> message is provided.
The available auxiliary routines for <function>ereport</> are:
<function>errcode</>(sqlerrcode) specifies the SQLSTATE error identifier
code for the condition. If this routine is not called, the error
identifier defaults to
<literal>ERRCODE_INTERNAL_ERROR</> when the error severity level is
<literal>ERROR</> or higher, <literal>ERRCODE_WARNING</> when the
error level is <literal>WARNING</>, otherwise (for <literal>NOTICE</>
and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</>.
While these defaults are often convenient, always consider whether they
are appropriate before omitting the <function>errcode</>() call.
<function>errmsg</>(const char *msg, ...) specifies the primary error
message text, and possibly run-time values to insert into it. Insertions
are specified by <function>sprintf</>-style format codes. In addition to
the standard format codes accepted by <function>sprintf</>, the format
code <literal>%m</> can be used to insert the error message returned
by <function>strerror</> for the current value of <literal>errno</>.
(That is, the value that was current when the <function>ereport</> call
was reached; changes of <literal>errno</> within the auxiliary reporting
routines will not affect it. That would not be true if you were to
write <literal>strerror(errno)</> explicitly in <function>errmsg</>'s
parameter list; accordingly, do not do so.)
<literal>%m</> does not require any
corresponding entry in the parameter list for <function>errmsg</>.
Note that the message string will be run through <function>gettext</>
for possible localization before format codes are processed.
<function>errmsg_internal</>(const char *msg, ...) is the same as
<function>errmsg</>, except that the message string will not be
included in the internationalization message dictionary.
This should be used for <quote>can't happen</> cases that are probably
not worth expending translation effort on.
<function>errdetail</>(const char *msg, ...) supplies an optional
<quote>detail</> message; this is to be used when there is additional
information that seems inappropriate to put in the primary message.
The message string is processed in just the same way as for
<function>errmsg</>.
<function>errhint</>(const char *msg, ...) supplies an optional
<quote>hint</> message; this is to be used when offering suggestions
about how to fix the problem, as opposed to factual details about
what went wrong.
The message string is processed in just the same way as for
<function>errmsg</>.
<function>errcontext</>(const char *msg, ...) is not normally called
directly from an <function>ereport</> message site; rather it is used
in <literal>error_context_stack</> callback functions to provide
information about the context in which an error occurred, such as the
current location in a PL function.
The message string is processed in just the same way as for
<function>errmsg</>. Unlike the other auxiliary functions, this can
be called more than once per <function>ereport</> call; the successive
strings thus supplied are concatenated with separating newlines.
<function>errposition</>(int cursorpos) specifies the textual location
of an error within a query string. Currently it is only useful for
errors detected in the lexical and syntactic analysis phases of
query processing.
<function>errcode_for_file_access</>() is a convenience function that
selects an appropriate SQLSTATE error identifier for a failure in a
file-access-related system call. It uses the saved
<literal>errno</> to determine which error code to generate.
Usually this should be used in combination with <literal>%m</> in the
primary error message text.
<function>errcode_for_socket_access</>() is a convenience function that
selects an appropriate SQLSTATE error identifier for a failure in a
socket-related system call.
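The <literal>%m</> behavior described for <function>errmsg</> can be
imitated in plain C. The sketch below is an illustration only, not the
server's code: <function>toy_expand_m</> and <function>toy_errmsg</> are
hypothetical names, and the saved <literal>errno</> value is passed in
explicitly to make the point that it is captured once, before any other
call has a chance to change it.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy fmt into dst, replacing each "%m" with strerror(saved_errno).
 * Any '%' inside the inserted text is doubled so that the later
 * vsnprintf pass treats it literally. Output is truncated to fit.
 */
static void
toy_expand_m(char *dst, size_t dstlen, const char *fmt, int saved_errno)
{
	size_t		used = 0;

	if (dstlen == 0)
		return;
	/* Each iteration writes at most two bytes; keep room for the NUL. */
	while (*fmt != '\0' && used + 2 < dstlen)
	{
		if (fmt[0] == '%' && fmt[1] == 'm')
		{
			const char *text = strerror(saved_errno);

			for (; *text != '\0' && used + 2 < dstlen; text++)
			{
				if (*text == '%')
					dst[used++] = '%';
				dst[used++] = *text;
			}
			fmt += 2;
		}
		else if (fmt[0] == '%' && fmt[1] != '\0')
		{
			/* Copy other two-character sequences, such as "%%", as a unit. */
			dst[used++] = *fmt++;
			dst[used++] = *fmt++;
		}
		else
			dst[used++] = *fmt++;
	}
	dst[used] = '\0';
}

/* Format a message; saved_errno is the value the caller captured earlier. */
static void
toy_errmsg(char *out, size_t outlen, int saved_errno, const char *fmt, ...)
{
	char		expanded[512];
	va_list		ap;

	toy_expand_m(expanded, sizeof(expanded), fmt, saved_errno);
	va_start(ap, fmt);
	vsnprintf(out, outlen, expanded, ap);
	va_end(ap);
}
```

Note that <literal>%m</> consumes no entry from the argument list: a call
such as <literal>toy_errmsg(buf, sizeof(buf), saved, "could not open file \"%s\": %m", path)</>
passes only the file name after the format string.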
There is an older function <function>elog</> that is still heavily used.
An <function>elog</> call
elog(level, "format string", ...);
is exactly equivalent to
ereport(level, (errmsg_internal("format string", ...)));
Notice that the SQLSTATE errcode is always defaulted, and the message
string is not included in the internationalization message dictionary.
Therefore, <function>elog</> should be used only for internal errors and
low-level debug logging. Any message that is likely to be of interest to
ordinary users should go through <function>ereport</>. Nonetheless,
there are enough internal <quote>can't happen</> error checks in the
system that <function>elog</> is still widely used; it is preferred for
those messages for its notational simplicity.
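Because <function>elog</> always takes the default error code, the code a
message ends up with depends only on its severity level. The following
sketch restates the defaulting rule given for <function>errcode</> as a
small function. The <literal>toy_</> names and the enumeration are
hypothetical, and the five-character strings are assumed to be the
SQLSTATE values behind the three macros named in the comments.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical severity levels, in increasing order of severity. */
enum toy_level
{
	TOY_DEBUG, TOY_LOG, TOY_INFO, TOY_NOTICE,
	TOY_WARNING, TOY_ERROR, TOY_FATAL, TOY_PANIC
};

/* Pick the default SQLSTATE for a message that never called errcode(). */
static const char *
toy_default_sqlstate(enum toy_level level)
{
	if (level >= TOY_ERROR)
		return "XX000";			/* ERRCODE_INTERNAL_ERROR */
	if (level == TOY_WARNING)
		return "01000";			/* ERRCODE_WARNING */
	return "00000";				/* ERRCODE_SUCCESSFUL_COMPLETION */
}

/* elog-like helper: format the text and attach the defaulted code. */
static void
toy_elog(char *out, size_t outlen, enum toy_level level, const char *fmt, ...)
{
	char		text[256];
	va_list		ap;

	va_start(ap, fmt);
	vsnprintf(text, sizeof(text), fmt, ap);
	va_end(ap);
	snprintf(out, outlen, "[%s] %s", toy_default_sqlstate(level), text);
}
```

An internal check reported at the error level therefore always carries the
internal-error code, which is one reason user-facing conditions are better
served by <function>ereport</> with an explicit <function>errcode</>.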
Advice about writing good error messages can be found in
<xref linkend="error-style-guide">.
<sect1 id="error-style-guide">
<title>Error Message Style Guide</title>
This style guide is offered in the hope of maintaining a consistent,
user-friendly style throughout all the messages generated by
<productname>PostgreSQL</>.
<title>What goes where</title>
The primary message should be short, factual, and avoid reference to
implementation details such as specific function names.
<quote>Short</quote> means <quote>should fit on one line under normal
conditions</quote>. Use a detail message if needed to keep the primary
message short, or if you feel a need to mention implementation details
such as the particular system call that failed. Both primary and detail
messages should be factual. Use a hint message for suggestions about what
to do to fix the problem, especially if the suggestion might not always be
applicable.
For example, instead of
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
write
Primary: could not create shared memory segment: %m
Detail: Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint: the addendum
Rationale: keeping the primary message short helps keep it to the point,
and lets clients lay out screen space on the assumption that one line is
enough for error messages. Detail and hint messages may be relegated to a
verbose mode, or perhaps a pop-up error-details window. Also, details and
hints would normally be suppressed from the server log to save
space. Reference to implementation details is best avoided since users
don't know the details anyway.
<title>Formatting</title>
Don't put any specific assumptions about formatting into the message
texts. Expect clients and the server log to wrap lines to fit their own
needs. In long messages, newline characters (\n) may be used to indicate
suggested paragraph breaks. Don't end a message with a newline. Don't
use tabs or other formatting characters. (In error context displays,
newlines are automatically added to separate levels of context such as
function calls.)
Rationale: Messages are not necessarily displayed on terminal-type
displays. In GUI displays or browsers these formatting instructions are
at best ignored.
<title>Quotation marks</title>
English text should use double quotes when quoting is appropriate.
Text in other languages should consistently use one kind of quotes that is
consistent with publishing customs and computer output of other programs.
Rationale: The choice of double quotes over single quotes is somewhat
arbitrary, but tends to be the preferred use. Some have suggested
choosing the kind of quotes depending on the type of object according to
SQL conventions (namely, strings single quoted, identifiers double
quoted). But this is a language-internal technical issue that many users
aren't even familiar with, it won't scale to other kinds of quoted terms,
it doesn't translate to other languages, and it's pretty pointless, too.
<title>Use of quotes</title>
Always use quotes to delimit file names, user-supplied identifiers, and
other variables that might contain words. Do not use them to mark up
variables that will not contain words (for example, operator names).
There are functions in the backend that will double-quote their own output
at need (for example, <function>format_type_be</>()). Do not put
additional quotes around the output of such functions.
Rationale: Objects can have names that create ambiguity when embedded in a
message. Be consistent about denoting where a plugged-in name starts and
ends. But don't clutter messages with unnecessary or duplicate quote
marks.
<title>Grammar and punctuation</title>
The rules are different for primary error messages and for detail/hint
messages:
Primary error messages: Do not capitalize the first letter. Do not end a
message with a period. Do not even think about ending a message with an
exclamation point.
Detail and hint messages: Use complete sentences, and end each with
a period. Capitalize the first word of sentences.
Rationale: Avoiding punctuation makes it easier for client applications to
embed the message into a variety of grammatical contexts. Often, primary
messages are not grammatically complete sentences anyway. (And if they're
long enough to be more than one sentence, they should be split into
primary and detail parts.) However, detail and hint messages are longer
and may need to include multiple sentences. For consistency, they should
follow complete-sentence style even when there's only one sentence.
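These rules are mechanical enough that a rough check can be automated. The
helpers below are a hypothetical sketch, not part of the source tree, and
they are heuristics only: a leading capital is tolerated when it is not
followed by a lower-case letter, on the assumption that the message then
starts with an upper-case SQL key word.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Heuristic check of the primary-message rules (hypothetical helper). */
static bool
toy_primary_style_ok(const char *msg)
{
	size_t		len = strlen(msg);

	if (len == 0)
		return false;
	/* Reject sentence-style capitalization such as "Could not ...". */
	if (isupper((unsigned char) msg[0]) && islower((unsigned char) msg[1]))
		return false;
	/* Reject a trailing period or exclamation point. */
	if (msg[len - 1] == '.' || msg[len - 1] == '!')
		return false;
	return true;
}

/* Heuristic check of the detail/hint rules (hypothetical helper). */
static bool
toy_detail_style_ok(const char *msg)
{
	size_t		len = strlen(msg);

	if (len == 0)
		return false;
	/* The first word of the sentence must not start in lower case. */
	if (islower((unsigned char) msg[0]))
		return false;
	/* The last sentence must end with a period. */
	return msg[len - 1] == '.';
}
```

For example, <literal>could not create shared memory segment: %m</> passes
the primary check, while <literal>Failed syscall was shmget(key=%d, size=%u, 0%o).</>
passes only the detail check. Such a check cannot judge whether a detail
message is really a complete sentence, so it complements review rather than
replacing it.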
<title>Upper case vs. lower case</title>
Use lower case for message wording, including the first letter of a
primary error message. Use upper case for SQL commands and key words if
they appear in the message.
Rationale: It's easier to make everything look more consistent this
way, since some messages are complete sentences and some not.
<title>Avoid passive voice</title>
Use the active voice. Use complete sentences when there is an acting
subject (<quote>A could not do B</quote>). Use telegram style without
subject if the subject would be the program itself; do not use
<quote>I</quote> for the program.
Rationale: The program is not human. Don't pretend otherwise.
<title>Present vs. past tense</title>
Use past tense if an attempt to do something failed, but could perhaps
succeed next time (perhaps after fixing some problem). Use present tense
if the failure is certainly permanent.
There is a nontrivial semantic difference between sentences of the form
could not open file "%s": %m
and
cannot open file "%s"
The first one means that the attempt to open the file failed. The
message should give a reason, such as <quote>disk full</quote> or
<quote>file doesn't exist</quote>. The past tense is appropriate because
next time the disk might not be full anymore or the file in question may
exist.
The second form indicates that the functionality of opening the named file
does not exist at all in the program, or that it's conceptually
impossible. The present tense is appropriate because the condition will
persist indefinitely.
Rationale: Granted, the average user will not be able to draw great
conclusions merely from the tense of the message, but since the language
provides us with a grammar we should use it correctly.
<title>Type of the object</title>
When citing the name of an object, state what kind of object it is.
Rationale: Otherwise no one will know what <quote>foo.bar.baz</>
refers to.
<title>Brackets</title>
Square brackets are only to be used (1) in command synopses to denote
optional arguments, or (2) to denote an array subscript.
Rationale: Anything else does not correspond to widely-known customary
usage and will confuse people.
<title>Assembling error messages</title>
When a message includes text that is generated elsewhere, embed it in
this style:
could not open file %s: %m
Rationale: It would be difficult to account for all possible error codes
to paste this into a single smooth sentence, so some sort of punctuation
is needed. Putting the embedded text in parentheses has also been
suggested, but it's unnatural if the embedded text is likely to be the
most important part of the message, as is often the case.
<title>Reasons for errors</title>
Messages should always state the reason why an error occurred.
BAD: could not open file %s
BETTER: could not open file %s (I/O failure)
If no reason is known, you had better fix the code.
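In practice the reason is usually available from <literal>errno</> at the
point of failure, so it only needs to be captured and appended. The
following stand-alone sketch shows the pattern outside the server;
<function>toy_check_readable</> is a hypothetical helper, and server code
would use <function>ereport</> with <literal>%m</> rather than calling
<function>strerror</> itself.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Try to open path for reading. On success return 0 and leave buf empty.
 * On failure return -1 and write a primary-style message into buf that
 * names the file (in double quotes) and states the reason.
 */
static int
toy_check_readable(const char *path, char *buf, size_t buflen)
{
	FILE	   *fp = fopen(path, "r");
	int			saved_errno;

	if (fp != NULL)
	{
		fclose(fp);
		if (buflen > 0)
			buf[0] = '\0';
		return 0;
	}

	/* Capture errno at once, before any other call can overwrite it. */
	saved_errno = errno;
	snprintf(buf, buflen, "could not open file \"%s\": %s",
			 path, strerror(saved_errno));
	return -1;
}
```

The message keeps the generated reason after a colon, in the style shown
under <quote>Assembling error messages</quote>, and uses the past tense
because a later attempt might succeed.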
<title>Function names</title>
Don't include the name of the reporting routine in the error text. We have
other mechanisms for finding that out when needed, and for most users it's
not helpful information. If the error text doesn't make as much sense
without the function name, reword it.
BAD: pg_atoi: error in "z": can't parse "z"
BETTER: invalid input syntax for integer: "z"
Avoid mentioning called function names, either; instead say what the code
was trying to do:
BAD: open() failed: %m
BETTER: could not open file %s: %m
If it really seems necessary, mention the system call in the detail
message. (In some cases, providing the actual values passed to the
system call might be appropriate information for the detail message.)
Rationale: Users don't know what all those functions do.
<title>Tricky words to avoid</title>
<title>Unable</title>
<quote>Unable</quote> is nearly the passive voice. It is better to use
<quote>cannot</quote> or <quote>could not</quote>, as appropriate.
<title>Bad</title>
Error messages like <quote>bad result</quote> are really hard to interpret
intelligently. It's better to write why the result is <quote>bad</quote>,
e.g., <quote>invalid format</quote>.
<title>Illegal</title>
<quote>Illegal</quote> stands for a violation of the law; the rest is
<quote>invalid</quote>. Better yet, say why it's invalid.
<title>Unknown</title>
Try to avoid <quote>unknown</quote>. Consider <quote>error: unknown
response</quote>. If you don't know what the response is, how do you know
it's erroneous? <quote>Unrecognized</quote> is often a better choice.
Also, be sure to include the value being complained of.
BAD: unknown node type
BETTER: unrecognized node type: 42
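A message of this kind typically comes from the default branch of a
<literal>switch</>, where the offending value is at hand and costs nothing
to report. The sketch below is a stand-alone illustration with made-up node
types; server code would normally report this with <function>elog</>.

```c
#include <stdio.h>

/* Hypothetical node types, for illustration only. */
enum toy_node_type { TOY_NODE_SCAN = 1, TOY_NODE_JOIN = 2 };

/*
 * Return a name for node_type, or NULL after writing a message that
 * includes the unrecognized value into errbuf.
 */
static const char *
toy_node_name(int node_type, char *errbuf, size_t errlen)
{
	if (errlen > 0)
		errbuf[0] = '\0';

	switch (node_type)
	{
		case TOY_NODE_SCAN:
			return "scan";
		case TOY_NODE_JOIN:
			return "join";
		default:
			/* Say "unrecognized", and include the value complained of. */
			snprintf(errbuf, errlen, "unrecognized node type: %d", node_type);
			return NULL;
	}
}
```

Including the value turns an otherwise opaque report into something a
developer can act on, for instance by searching for where that value could
have been produced.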
<title>Find vs. Exists</title>
If the program uses a nontrivial algorithm to locate a resource (e.g., a
path search) and that algorithm fails, it is fair to say that the program
couldn't <quote>find</quote> the resource. If, on the other hand, the
expected location of the resource is known but the program cannot access
it there, then say that the resource doesn't <quote>exist</quote>. Using
<quote>find</quote> in this case sounds weak and confuses the issue.
<title>Proper spelling</title>
Spell out words in full. For instance, avoid:
Rationale: This will improve consistency.
<title>Localization</title>
Keep in mind that error message texts need to be translated into other
languages. Follow the guidelines in <xref linkend="nls-guidelines">
to avoid making life difficult for translators.