-<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.73 2005/06/21 04:02:29 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/charset.sgml,v 2.74 2005/10/13 21:43:43 tgl Exp $ -->
<chapter id="charset">
<title>Localization</>
</row>
<row>
<entry><literal>SQL_ASCII</literal></entry>
- <entry><acronym>ASCII</acronym></entry>
- <entry>English</entry>
+ <entry>unspecified (see text)</entry>
+ <entry><emphasis>any</></entry>
<entry>1</entry>
<entry></entry>
</row>
<entry><literal>UTF8</literal></entry>
<entry>Unicode, 8-bit</entry>
<entry><emphasis>all</></entry>
- <entry>1-3</entry>
+ <entry>1-4</entry>
<entry><literal>Unicode</></entry>
</row>
<row>
JDBC driver does not support <literal>MULE_INTERNAL</>, <literal>LATIN6</>,
<literal>LATIN8</>, and <literal>LATIN10</>.
</para>
+
+ <para>
+ The <literal>SQL_ASCII</> setting behaves considerably differently
+ from the other settings. When the server character set is
+ <literal>SQL_ASCII</>, the server interprets byte values 0-127
+ according to the ASCII standard, while byte values 128-255 are taken
+ as uninterpreted characters. No encoding conversion will be done when
+ the setting is <literal>SQL_ASCII</>. Thus, this setting is not so
+ much a declaration that a specific encoding is in use, as a declaration
+    of ignorance about the encoding.  If you are working with any
+    non-ASCII data, using the <literal>SQL_ASCII</> setting is
+    usually unwise, because
+    <productname>PostgreSQL</productname> cannot convert or validate
+    non-ASCII characters under this setting.
+ </para>
</sect2>
<sect2>
</row>
<row>
<entry><literal>SQL_ASCII</literal></entry>
- <entry><emphasis>SQL_ASCII</emphasis>,
- <literal>MULE_INTERNAL</literal>,
- <literal>UTF8</literal>
+ <entry><emphasis>any (no conversion will be performed)</emphasis>
</entry>
</row>
<row>
</table>
<para>
- To enable the automatic character set conversion, you have to
+ To enable automatic character set conversion, you have to
tell <productname>PostgreSQL</productname> the character set
(encoding) you would like to use in the client. There are several
ways to accomplish this:
hexadecimal byte values in parentheses, e.g.,
<literal>(826C)</literal>.
</para>
+
+ <para>
+ If the client character set is defined as <literal>SQL_ASCII</>,
+ encoding conversion is disabled, regardless of the server's character
+ set. Just as for the server, use of <literal>SQL_ASCII</> is unwise
+ unless you are working with all-ASCII data.
+ </para>
</sect2>
<sect2>