From: Rocco Rutte
Date: Tue, 7 Jul 2009 10:53:46 +0000 (+0200)
Subject: Manual: mention terminal setup for charsets, more unicode pros.
X-Git-Tag: mutt-1-5-21-rel~171
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=fbbf3af9413aed1c94590b2fe7a9e8af0f474fc8;p=mutt

Manual: mention terminal setup for charsets, more unicode pros.

Closes #3292.
---

diff --git a/ChangeLog b/ChangeLog
index 75b7b945..950cd0d7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2009-07-06 15:28 +0200 Rocco Rutte (ccab6c56b557)
+
+ * doc/manual.xml.head: Manual: Add a note about when/why to use utf-8
+
+2009-07-05 18:36 -0700 Brendan Cully (118b8fef8aae)
+
+ * buffy.c, buffy.h, mx.c: Suppress new mail notification
+ from mailbox just left. Closes #3290.
+
 2009-07-02 20:42 +0200 Rocco Rutte (042f2ce0b870)

 * doc/manual.xml.head: Manual: minor fixes
diff --git a/doc/manual.xml.head b/doc/manual.xml.head
index bf244cce..6ff51f3f 100644
--- a/doc/manual.xml.head
+++ b/doc/manual.xml.head
@@ -4572,7 +4572,8 @@ A character set is basically a mapping between bytes and glyphs and
 implies a certain character encoding scheme. For example, for the ISO
 8859 family of character sets, an encoding of 8bit per character is
 used. For the Unicode character set, different character encodings
-may be used, UTF-8 being the most popular.
+may be used, UTF-8 being the most popular. In UTF-8, a character is
+represented using a variable number of bytes ranging from 1 to 4.
@@ -4598,28 +4599,49 @@ setup itself.
 If you happen to work with several character sets on a regular basis,
 it's highly advisable to use Unicode and an UTF-8 locale. Unicode can
-represent nearly all characters in a message at the same time, making
-all conversions superfluous which eliminates the risk of conversion
-errors. It also eliminates potentially wrong expectations about the
-character set between Mutt and external programs.
+represent nearly all characters in a message at the same time. When not
+using a Unicode locale, it may happen that you receive messages with
+characters not representable in your locale. When displaying such a
+message, or replying to or forwarding it, information may get lost
+possibly rendering the message unusable (not only for you but also for
+the recipient, this breakage is not reversible as lost information
+cannot be guessed).
+
+
+
+A Unicode locale makes all conversions superfluous which eliminates the
+risk of conversion errors. It also eliminates potentially wrong
+expectations about the character set between Mutt and external programs.
+
+
+
+The terminal emulator used also must be properly configured for the
+current locale. Terminal emulators usually do not
+derive the locale from environment variables, they need to be configured
+separately. If the terminal is incorrectly configured, Mutt may display
+random and unexpected characters (question marks, octal codes, or just
+random glyphs), format strings may not work as expected, you may not be
+abled to enter non-ascii characters, and possible more. Data is always
+represented using bytes and so a correct setup is very important as to
+the machine, all character sets look the same.
 Warning:
 A mismatch between what system and library functions think the locale is
 and what Mutt was told what the locale is may make it behave
-badly with non-ascii input: it will fail at seemingly random
-places. This warning is to be taken seriously since not only local mail
-handling may suffer: sent messages may carry wrong character set
-information the receiver has too deal with. The
-need to set $charset directly in most cases points at
-terminal and environment variable setup problems, not Mutt problems.
+badly with non-ascii input: it will fail at seemingly random places.
+This warning is to be taken seriously since not only local mail handling
+may suffer: sent messages may carry wrong character set information the
+receiver has too deal with. The need to set
+$charset directly in most cases points at terminal
+and environment variable setup problems, not Mutt problems.
 A list of officially assigned and known character sets can be found at IANA,
-a list of locally supported locales can be obtained by
-running locale -a.
+a list of locally supported locales can be obtained by running
+locale -a.
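
Note (illustration only, not part of the patch): the three effects the new manual text describes can be reproduced in a few lines of Python. The sketch below is independent of Mutt and assumes nothing about its internals; the helper names `utf8_len` and `roundtrip` are made up for this example, and ISO-8859-1 merely stands in for "a non-Unicode locale". It shows the 1 to 4 byte range of UTF-8, the irreversible loss when a character is not representable in the target character set, and the wrong glyphs that appear when UTF-8 bytes are read under a mismatched charset.

```python
# Illustrative sketch only; helper names are hypothetical, not Mutt code.

# One sample character per UTF-8 length: 'a', e-acute, euro sign, an emoji.
SAMPLES = ["a", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


def roundtrip(text: str, charset: str) -> str:
    """Convert text to charset and back; unrepresentable characters become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


# 1) UTF-8 uses a variable number of bytes per character.
for ch in SAMPLES:
    # ascii() keeps the output printable even on a non-UTF-8 terminal.
    print(ascii(ch), utf8_len(ch))

# 2) Converting to a non-Unicode charset can lose information for good.
original = "5 \u20ac"
lossy = roundtrip(original, "iso-8859-1")   # the euro sign is not in ISO-8859-1
lossless = roundtrip(original, "utf-8")
print(ascii(lossy), ascii(lossless))

# 3) A charset mismatch: UTF-8 bytes interpreted as ISO-8859-1.
#    To the machine both are just bytes, so no error is raised.
misread = "\u00e9".encode("utf-8").decode("iso-8859-1")
print(ascii(misread))
```

In case 2 the replaced character cannot be recovered from the result, which is the "not reversible" breakage the patch warns about. Case 3 raises no error at all, which matches the remark that all character sets look the same to the machine.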