From 423fe9d7f59c6109adc97be981c04df2e55e2ed0 Mon Sep 17 00:00:00 2001 From: =?utf8?q?Fran=C3=A7ois=20Pinard?= Date: Wed, 27 Feb 2008 17:34:56 -0500 Subject: [PATCH] Bump to 3.7-beta1 --- ChangeLog | 2 + NEWS | 579 ++++++++++++++++++++++++++-------------------- README | 31 +++ configure | 2 +- configure.ac | 2 +- doc/recode.info | 182 +++++++-------- doc/recode.info-1 | 2 +- doc/stamp-vti | 4 +- doc/version.texi | 4 +- src/main.c | 2 +- src/recode.1 | 4 +- 11 files changed, 456 insertions(+), 358 deletions(-) diff --git a/ChangeLog b/ChangeLog index cf44ac9..bd0f43b 100644 --- a/ChangeLog +++ b/ChangeLog @@ -6,6 +6,8 @@ (AC_OUTPUT): Handle it. Reported by Bruno Haible. + * configure.ac: Version 3.7-beta1. + 2008-02-23 François Pinard * configure.ac, Makefile.am: Handle python/. diff --git a/NEWS b/NEWS index af16194..3460350 100644 --- a/NEWS +++ b/NEWS @@ -1,273 +1,338 @@ -Free recode NEWS - User visible changes. -*- outline -*- (allout) -Copyright © 1993-1999, 2000, 2001 Free Software Foundation, Inc. - -* Version 3.6 - François Pinard, Bruno Haible, 2001-01. - -.* General changes -. + The recode manual is now indexed, by charset, by concept, etc. -. + Program messages are also available in Greek, Gallicean and Italian. -. + Bruno Haible's nice portable iconv library has been integrated. -. + RFC 1345 tables and French character names have been updated. -. + The Texinfo charset has been refreshed, and made reversible. - -.* New charsets (most from libiconv) - -. 
+ Japanese - EUC-JP (csEUCPkdFmtJapanese, EUC_JP, - Extended_UNIX_Code_Packed_Format_for_Japanese); - ISO-2022-JP (csISO2022JP); ISO-2022-JP-1; ISO-2022-JP-2 (csISO2022JP2); - JIS_C6220-1969-ro (csISO14JISC6220ro, ISO646-JP, iso-ir-14, jp); - JIS_X0201 (csHalfWidthKatakana, JIS0201, JISX0201-1976, JISX0201.1976-0, - X0201); - JIS_X0208 (csISO87JISX0208, ISO-IR-87, JIS0208, JIS_X0208.1983-0, - JIS_X0208.1983-1, JIS_X0208-1990-0, JIS_X0208.1983-1, X0208); - JIS_X0212 (csISO159JISX02121990, ISO-IR-159, JIS0212, JIS_X0212.1990-0, - JIS_X0212-1990, X0212); - SJIS (csShiftJIS, MS_KANJI, SHIFT-JIS). +======================================= +Free recode NEWS - User visible changes +======================================= + +.. contents:: +.. sectnum:: + +:Copyright: © 1993-1999, 2000, 2001, 2008 Free Software Foundation, Inc. + +Version 3.7-beta1 +================= + +:Author: François Pinard, 2008-02. + ++ Changes are mostly internal, and correct reported bugs. + +Version 3.6 +=========== + +:Author: François Pinard, Bruno Haible, 2001-01. + +General changes +--------------- + ++ The recode manual is now indexed, by charset, by concept, etc. ++ Program messages are also available in Greek, Gallicean and Italian. ++ Bruno Haible's nice portable iconv library has been integrated. ++ RFC 1345 tables and French character names have been updated. ++ The Texinfo charset has been refreshed, and made reversible. 
+ +New charsets +------------ + +(most from libiconv) + ++ Japanese + + + EUC-JP (csEUCPkdFmtJapanese, EUC_JP, + Extended_UNIX_Code_Packed_Format_for_Japanese); + + ISO-2022-JP (csISO2022JP); ISO-2022-JP-1; ISO-2022-JP-2 (csISO2022JP2); + + JIS_C6220-1969-ro (csISO14JISC6220ro, ISO646-JP, iso-ir-14, jp); + + JIS_X0201 (csHalfWidthKatakana, JIS0201, JISX0201-1976, JISX0201.1976-0, + X0201); + + JIS_X0208 (csISO87JISX0208, ISO-IR-87, JIS0208, JIS_X0208.1983-0, + JIS_X0208.1983-1, JIS_X0208-1990-0, JIS_X0208.1983-1, X0208); + + JIS_X0212 (csISO159JISX02121990, ISO-IR-159, JIS0212, JIS_X0212.1990-0, + JIS_X0212-1990, X0212); + + SJIS (csShiftJIS, MS_KANJI, SHIFT-JIS). . + Chinese - BIG5 (BIG-5, BIG-FIVE, BIGFIVE, CN-BIG5 csBig5); BIG5HKSCS; - EUC-CN (CN-GB, csGB2312, EUC_CN, GB2312); EUC-TW (csEUCTW, EUC_TW); - GB18030; HZ (HZ-GB-2312); ISO-2022-CN (csISO2022CN); ISO-2022-CN-EXT; - GB_1988-80 (cn, csISO57GB1988, ISO646-CN, iso-ir-57); - GB_2312-80 (CHINESE, csISO58GB231280, GB2312.1980-0, ISO-IR-58); - ISO-IR-165 (CN-GB-ISOIR165). + + + BIG5 (BIG-5, BIG-FIVE, BIGFIVE, CN-BIG5 csBig5); BIG5HKSCS; + + EUC-CN (CN-GB, csGB2312, EUC_CN, GB2312); EUC-TW (csEUCTW, EUC_TW); + + GB18030; HZ (HZ-GB-2312); ISO-2022-CN (csISO2022CN); ISO-2022-CN-EXT; + + GB_1988-80 (cn, csISO57GB1988, ISO646-CN, iso-ir-57); + + GB_2312-80 (CHINESE, csISO58GB231280, GB2312.1980-0, ISO-IR-58); + + ISO-IR-165 (CN-GB-ISOIR165). . + Korean - JOHAB (CP1361); EUC-KR (csEUCKR, EUC_KR); GBK (CP936); - ISO-2022-KR (csISO2022KR); - KSC_5601 (CP949, csKSC56011987, ISO-IR-149, KOREAN, KSC5601.1987-0, - KS_C_5601-1987, KS_C_5601-1989, KSX1001:1992). + + + JOHAB (CP1361); EUC-KR (csEUCKR, EUC_KR); GBK (CP936); + + ISO-2022-KR (csISO2022KR); + + KSC_5601 (CP949, csKSC56011987, ISO-IR-149, KOREAN, KSC5601.1987-0, + KS_C_5601-1987, KS_C_5601-1989, KSX1001:1992). . + Vietnamese (independently of libiconv) - TCVN; VIQR; VISCII; VNI; VPS. + + + TCVN; VIQR; VISCII; VNI; VPS. . 
+ Other languages - ARMSCII-8; Georgian-Academy; Georgian-PS; WINDOWS-874 (CP874); - MuleLao-1; CP1133 (IBM-CP1133); CP1258 (WINDOWS-1258); - TIS-620 (ISO-IR-166, TIS620, TIS620.2529-1, TIS620-0, TIS620.2533-0, - TIS620.2533-1). + + + ARMSCII-8; Georgian-Academy; Georgian-PS; WINDOWS-874 (CP874); + + MuleLao-1; CP1133 (IBM-CP1133); CP1258 (WINDOWS-1258); + + TIS-620 (ISO-IR-166, TIS620, TIS620.2529-1, TIS620-0, TIS620.2533-0, + TIS620.2533-1). . + Apple specifics - MacArabic; MacCentralEurope; MacCroatian; MacCyrillic; MacGreek; - MacHebrew; MacIceland; MacRomania; MacThai; MacTurkish; MacUkraine + + + MacArabic; MacCentralEurope; MacCroatian; MacCyrillic; MacGreek; + + MacHebrew; MacIceland; MacRomania; MacThai; MacTurkish; MacUkraine . + Unicode - JAVA; UCS-2-INTERNAL; UCS-2LE (UnicodeLITTLE); UCS-2-SWAPPED; UCS-4BE; - UCS-4-INTERNAL; UCS-4LE; UCS-4-SWAPPED; UTF-16BE; UTF-16LE. + + + JAVA; UCS-2-INTERNAL; UCS-2LE (UnicodeLITTLE); UCS-2-SWAPPED; UCS-4BE; + + UCS-4-INTERNAL; UCS-4LE; UCS-4-SWAPPED; UTF-16BE; UTF-16LE. . + Others - CP932; CP949 (UHC); CP950; CP866 (866, csIBM866, IBM866). - ISO-8859-16 (ISO-IR-226, ISO_8859-16:2000). + + + CP932; CP949 (UHC); CP950; CP866 (866, csIBM866, IBM866). + + ISO-8859-16 (ISO-IR-226, ISO_8859-16:2000). . 
+ Recode internal - :libiconv: (:) [so option -x: avoids going through libiconv] - -.* New aliases (from libiconv) [list to be revised] - csASCII (for ANSI_X3.4-1968); csHPRoman8 (for hp-roman8); - csISOLatin1 (for ISO-8859-1); csISOLatin2 (for ISO-8859-2); - csISOLatin3 (for ISO-8859-3); csISOLatin4 (for ISO-8859-4); - csISOLatin5 (for ISO-8859-9); - csISOLatin6 and ISO_8859-10:1992 (for ISO-8859-10); - csISOLatinArabic (for ISO-8859-6); csISOLatinCyrillic (for ISO-8859-5); - csISOLatinGreek (for ISO-8859-7); csISOLatinHebrew (for ISO-8859-8); - csKOI8R (for KOI8-R); csPC850Multilingual (for IBM850); - csUCS4 (for ISO-10646-UCS-4); - csUnicode, csUnicode11, UCS-2BE, UnicodeBIG (for ISO-10646-UCS-2); - csUnicode11UTF7 (for UNICODE-1-1-UTF-7); - csVISCII and VISCII1.1-1 (for VISCII); - ISO-IR-179 (for ISO-8859-13); csMacintosh and MacRoman (for macintosh); - TCVN5712-1, TCVN5712-1:1993 and TCVN-5712 (for TCVN). - -.* New surfaces - tree (experimental). - -* Version 3.5 - François Pinard, 1999-05. - -.* Incompatible changes -. + A double dot `..' should now be used instead of a colon `:'. -. + Option --force (-f) is needed to pursue recoding despite errors. -. + There is no more quoting for special characters within charsets names. -. + Auto check (`-a') and popen (`-o') options have been withdrawn. -. + Some charsets and aliases were deleted, see `Charsets & aliases' below. - -.* Extended features -. + Program messages are available in localised form for many languages. -. + Long character names are available in French, if LANGUAGE is set to `fr'. -. + A new request syntax allows for recode chaining, and for surfaces. -. + Option --header-file (-h) accepts a language parameter, and Perl is new. -. + Full charset listings now show the UCS-2 value for characters. -. + Option --known=PAIRS (-k) also accepts octal and hexadecimal numbers. -. + Option --list (-l) better sorts charsets and aliases, also fully written. -. 
+ Charset `RFC1345' implements mnemonic+ascii+38, and is now reversible. -. + HTML is not limited anymore to Latin-1, HTML 4.0 entities are supported. - -.* New features -. + Euro support. -. + Updated RFC 1345 set of tables, from Keld Simonsen. -. + Some African charsets and transliterated forms. -. + Conversions for ISO 10646 and Unicode. -. + Combining or explosion of UCS-2 diacriticized characters and ligatures. -. + Implementation of surfaces, see `Surfaces & aliases' below. -. + Mixed mode for recoding only comments and strings in C sources or PO files. -. + A stand-alone recoding library gets installed, often as a shared library. -. + Option --find-subsets (-T) lists charsets which are subsets of another. -. + The library may generate testing data, and study character frequencies. - -.* Charsets & aliases - -. + New ISO 10646 and Unicode charsets -. - combined-UCS-2: pseudo-charset. -. - count-characters: pseudo-charset. -. - dump-with-names: pseudo-charset. -. - ISO-10646-UCS-2 (UNICODE-1-1, BMP, rune, u2). -. - ISO-10646-UCS-4 (10646, ISO-10646, UCS-4, u4). -. - UNICODE-1-1-UTF-7 (TF-7, u7). -. - UTF-8 (UTF-2, UTF-FSS, FSS_UTF, TF-8, u8). -. - UTF-16 (Unicode, TF-16, u6). - -. + RFC 1345.bis matters -. - Deleted charsets - dk-us, us-dk (because of &duplicate which `recode' does not handle yet). -. 
- New charsets - baltic (alias is iso-ir-179); CP1250 (1250, ms-ee, windows-1250); - CP1251 (1251, ms-cyrl, windows-1251); - CP1252 (1252, ms-ansi, windows-1252); - CP1253 (1253, ms-greek, windows-1253); - CP1254 (1254, ms-turk, windows-1254); - CP1255 (1255, ms-hebr, windows-1255); - CP1256 (1256, ms-arab, windows-1256); - CP1257 (1257, WinBaltRim, windows-1257); - CWI (CWI-2, cp-hu); EBCDIC-IS-FRISS (friss); - GOST_19768-87 with aliases of previous GOST_19768-74; - IBM256 (256, CP256, EBCDIC-INT1); IBM875 (875, CP875, EBCDIC-Greek); - IBM1004 (1004, CP1004, os2latin1); IBM1047 (1047, CP1047); - ISO-8859-13 (ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7, latin7); - ISO-8859-14 (ISO_8859-14:1998, iso-celtic, iso-ir-199, l8, latin8); - ISO-8859-15 (ISO_8859-15:1998, iso-ir-203, l9, latin9); - KOI-7; KOI-8 (GOST_19768-74); KOI8-R; KOI8-RU; KOI8-U; - macintosh_ce (macce); mac-is; - NeXTSTEP (next) yet previous `recode' had it outside RFC 1345. -. - Alias promoted to charset (with previous charset becoming alias) - ISO-646.basic (with ISO-646.basic:1983); ISO-646.irv (ISO-646.irv:1983); - ISO_5427-ext (ISO_5427:1981); ISO_5428 (ISO_5428:1980); - ISO-8859-1 (ISO_8859-1:1987); ISO-8859-2 (ISO_8859-2:1987); - ISO-8859-3 (ISO_8859-3:1988); ISO-8859-4 (ISO_8859-4:1988); - ISO-8859-5 (ISO_8859-5:1988); ISO-8859-6 (ISO_8859-6:1987); - ISO-8859-7 (ISO_8859-7:1987); ISO-8859-8 (ISO_8859-8:1988); - ISO-8859-9 (ISO_8859-9:1989); ISO-8859-10 (latin6); - NC_NC00-10 (NC_NC00-10:81); sami (latin-lap). -. 
- New aliases - 037 (for charset IBM037); 038 (IBM038); 273 (IBM273); 274 (IBM274); - 275 (IBM275); 278 (IBM278); 280 (IBM280); 281 (IBM281); 284 (IBM284); - 285 (IBM285); 290 (IBM290); 297 (IBM297); 367 (ANSI_X3.4-1968); - 420 (IBM420); 423 (IBM423); 424 (IBM424); 500, 500V1 (IBM500); - 819 (ISO-8859-1); 864 (IBM864); 868 (IBM868); 870 (IBM870); - 871 (IBM871); 880 (IBM880); 891 (IBM891); 903 (IBM903); 905 (IBM905); - 912, CP912, IBM912 (ISO-8859-2); 918 (IBM918); 1026 (IBM1026); - ECMA-113, ECMA-113:1986 (ECMA-Cyrillic); GOST_19768-74 (KOI8); - ISO_8859-N (ISO-8859-N) for N = 1 through 10 and 13 through 15; - ISO_8859-10:1993 (ISO-8869-10); iso-ir-170 (INVARIANT); - KOI8_L2 (CSN_369103); pclatin2, pcl2 (IBM852); SS636127 (SEN_850200_B). - -. + New African charsets - AFRL1-101-BPI_OCIL (t-francais, t-fra); - AFRFUL-102-BPI_OCIL (bambara, bra, ewondo, fulfulde); - AFRFUL-103-BPI_OCIL (t-bambara, t-bra, t-ewondo, t-fulfulde); - AFRLIN-104-BPI_OCIL (lingala, lin, sango, wolof); - AFRLIN-105-BPI_OCIL (t-lingala, t-lin, t-sango, t-wolof). - -. + Extra miscellaneous charsets - KEYBCS2 (Kamenicky); CORK (T1); KOI-8_CS2. - -. + New HTML pseudo-charsets - HTML_1.1 (h1); HTML_2.0 (RFC 1866, 1866, h2); HTML-i18n (RFC 2070); - HTML_3.2 (h3) reimplemented; HTML_4.0 (h4, HTML, h); - deleted aliases HTF, 8859, ISO 8859, Entities, SGML, WWW, w3. - -.* Surfaces & aliases - Base64 (64, b64); Quoted-Printable (qp, Quote-Printable); - 21-Permutation (swabytes); 4321-Permutation; CR; CR-LF (cl); - Decimal-1 (d, d1); Decimal-2 (d2), Decimal-4 (d4); - Hexadecimal-1 (x, x1); Hexadecimal-2 (x2); Hexadecimal-4 (x4); - Octal-1 (o, o1); Octal-2 (o2); Octal-4 (o4). - data; test7; test8; test15; test16. - -* Version 3.4 - François Pinard, 1994-11. - -.* Charset HTML is new, it handles `&...;' sequences for Latin-1. -.* Charset AtariST handling is more general, --list may be used with it. -.* Charset ASCII-BS overstriking has been extended, mainly for German. 
-.* Charset RFC1345 may be a goal, to debug or study RFC 1345 short names. -.* Charset names have been revised. Note that nextstep is now NeXT. -.* Option --force (-f) is accepted, but does not yet protect reversibility. -.* Option --quiet or --silent (-q) silences irreversible recoding messages. -.* Option --known=PAIRS (-k) helps searching through recodings. -.* Option --sequence=pipe (-p) does not fall back on -o anymore. -.* Option --auto-check may narrow its study around one particular charset. -.* An MSDOS port is available, check ftp.iro.umontreal.ca in pub/gnuish. -.* Compilation should now succeed on OS/2 EMX. Thanks to Kai Uwe Rommel. -.* Program initialization is almost three times faster on average. -.* Corrected reported bugs, added small improvements, some aesthetic. - -* Version 3.3 - François Pinard, 1993-12. - -.* Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added. -.* Also, most RFC 1345 charsets and aliases are handled. That's a bunch! -.* Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead. -.* Old maci disappears because of RFC 1345's macintosh, use applemac instead. -.* Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead. -.* Recoding between latin1, ibmpc and applemac is (almost) reversible. -.* The texinfo documentation has been reorganized, this to be continued. -.* Long options are accepted, charset names may be abbreviated. -.* Option --list (-l) displays charsets, aliases and contents in many formats. -.* Option --strict (-s) asks for stricter, non-reversible recodings. -.* Option --graphics (-g) approximates ibmpc rulers with ASCII graphics. -.* Option --header (-h) produces C source for many recoding tables. -.* Option --auto-check (-a) reports about all possible recodings. -.* Option --ignore (-x) prevents a charset from being selected. -.* Execution has been sped up through step merging, hashing for charset names. 
-.* Many various buglets have been eradicated, portability increased. -.* Charsets may be edited out by modifying the Makefile only. -.* Configuration is made through the use of an external config.h file. - -* Version 3.2.4 - François Pinard, 1992-10. - -.* None. - -* Version 3.2.3 - François Pinard, 1992-09. - -.* New -d `diacritics_only' option for LaTeX. -.* A few bugs have been corrected. -.* Documentation reorganization and improvements. -.* Increased portability, now uses Autoconf. -.* A few bugs solved. - -* Version 3.2 - François Pinard, 1991-10. - -.* MSDOS port redone. -.* New check goal at installation time. -.* Add -v option for verbose processing, remove old -q. -.* Add -i, -o and -p for letting the user control the strategy. -.* A few bugs corrected. -.* Embedded NULs should now be transmitted. - -* Version 3.1 - François Pinard, 1990-03. - -.* Rename -V to -C for showing Copyright. -.* Calling sequence changed, said files now recoded on themselves. -.* Add -t option for touching files. -.* Better on-line help. - -* Version 3.0.1 - François Pinard, 1990-02. - -.* Add -q option for quiet processing. -.* Executable file now considerably smaller, also speedier. -.* A few bugs corrected. - -* Version 3.0 - François Pinard, 1989-10. - -.* New Text to Latin1 processing, should be faster. -.* A few bugs corrected. - -* For prior history down to 1980, see at the end of the ChangeLog. 
+ + + :libiconv: (:) [so option -x: avoids going through libiconv] + +New aliases +----------- + +(from libiconv) [list to be revised] + ++ csASCII (for ANSI_X3.4-1968); csHPRoman8 (for hp-roman8); ++ csISOLatin1 (for ISO-8859-1); csISOLatin2 (for ISO-8859-2); ++ csISOLatin3 (for ISO-8859-3); csISOLatin4 (for ISO-8859-4); ++ csISOLatin5 (for ISO-8859-9); ++ csISOLatin6 and ISO_8859-10:1992 (for ISO-8859-10); ++ csISOLatinArabic (for ISO-8859-6); csISOLatinCyrillic (for ISO-8859-5); ++ csISOLatinGreek (for ISO-8859-7); csISOLatinHebrew (for ISO-8859-8); ++ csKOI8R (for KOI8-R); csPC850Multilingual (for IBM850); ++ csUCS4 (for ISO-10646-UCS-4); ++ csUnicode, csUnicode11, UCS-2BE, UnicodeBIG (for ISO-10646-UCS-2); ++ csUnicode11UTF7 (for UNICODE-1-1-UTF-7); ++ csVISCII and VISCII1.1-1 (for VISCII); ++ ISO-IR-179 (for ISO-8859-13); csMacintosh and MacRoman (for macintosh); ++ TCVN5712-1, TCVN5712-1:1993 and TCVN-5712 (for TCVN). + +New surfaces +------------ + ++ tree (experimental). + +Version 3.5 +=========== + +:Author: François Pinard, 1999-05. + +Incompatible changes +-------------------- + ++ A double dot `..' should now be used instead of a colon `:'. ++ Option --force (-f) is needed to pursue recoding despite errors. ++ There is no more quoting for special characters within charsets names. ++ Auto check (`-a') and popen (`-o') options have been withdrawn. ++ Some charsets and aliases were deleted, see `Charsets & aliases' below. + +Extended features +----------------- + ++ Program messages are available in localised form for many languages. ++ Long character names are available in French, if LANGUAGE is set to `fr'. ++ A new request syntax allows for recode chaining, and for surfaces. ++ Option --header-file (-h) accepts a language parameter, and Perl is new. ++ Full charset listings now show the UCS-2 value for characters. ++ Option --known=PAIRS (-k) also accepts octal and hexadecimal numbers. 
++ Option --list (-l) better sorts charsets and aliases, also fully written. ++ Charset `RFC1345' implements mnemonic+ascii+38, and is now reversible. ++ HTML is not limited anymore to Latin-1, HTML 4.0 entities are supported. + +New features +------------ + ++ Euro support. ++ Updated RFC 1345 set of tables, from Keld Simonsen. ++ Some African charsets and transliterated forms. ++ Conversions for ISO 10646 and Unicode. ++ Combining or explosion of UCS-2 diacriticized characters and ligatures. ++ Implementation of surfaces, see `Surfaces & aliases' below. ++ Mixed mode for recoding only comments and strings in C sources or PO files. ++ A stand-alone recoding library gets installed, often as a shared library. ++ Option --find-subsets (-T) lists charsets which are subsets of another. ++ The library may generate testing data, and study character frequencies. + +Charsets & aliases +------------------ + ++ New ISO 10646 and Unicode charsets + + + combined-UCS-2: pseudo-charset. + + count-characters: pseudo-charset. + + dump-with-names: pseudo-charset. + + ISO-10646-UCS-2 (UNICODE-1-1, BMP, rune, u2). + + ISO-10646-UCS-4 (10646, ISO-10646, UCS-4, u4). + + UNICODE-1-1-UTF-7 (TF-7, u7). + + UTF-8 (UTF-2, UTF-FSS, FSS_UTF, TF-8, u8). + + UTF-16 (Unicode, TF-16, u6). + ++ RFC 1345.bis matters + + + Deleted charsets + + + dk-us, us-dk (because of &duplicate which `recode' does not handle yet). 
+ + + New charsets + + + baltic (alias is iso-ir-179); CP1250 (1250, ms-ee, windows-1250); + + CP1251 (1251, ms-cyrl, windows-1251); + + CP1252 (1252, ms-ansi, windows-1252); + + CP1253 (1253, ms-greek, windows-1253); + + CP1254 (1254, ms-turk, windows-1254); + + CP1255 (1255, ms-hebr, windows-1255); + + CP1256 (1256, ms-arab, windows-1256); + + CP1257 (1257, WinBaltRim, windows-1257); + + CWI (CWI-2, cp-hu); EBCDIC-IS-FRISS (friss); + + GOST_19768-87 with aliases of previous GOST_19768-74; + + IBM256 (256, CP256, EBCDIC-INT1); IBM875 (875, CP875, EBCDIC-Greek); + + IBM1004 (1004, CP1004, os2latin1); IBM1047 (1047, CP1047); + + ISO-8859-13 (ISO_8859-13:1998, iso-baltic, iso-ir-179a, l7, latin7); + + ISO-8859-14 (ISO_8859-14:1998, iso-celtic, iso-ir-199, l8, latin8); + + ISO-8859-15 (ISO_8859-15:1998, iso-ir-203, l9, latin9); + + KOI-7; KOI-8 (GOST_19768-74); KOI8-R; KOI8-RU; KOI8-U; + + macintosh_ce (macce); mac-is; + + NeXTSTEP (next) yet previous `recode' had it outside RFC 1345. + + + Alias promoted to charset (with previous charset becoming alias) + + + ISO-646.basic (with ISO-646.basic:1983); ISO-646.irv (ISO-646.irv:1983); + + ISO_5427-ext (ISO_5427:1981); ISO_5428 (ISO_5428:1980); + + ISO-8859-1 (ISO_8859-1:1987); ISO-8859-2 (ISO_8859-2:1987); + + ISO-8859-3 (ISO_8859-3:1988); ISO-8859-4 (ISO_8859-4:1988); + + ISO-8859-5 (ISO_8859-5:1988); ISO-8859-6 (ISO_8859-6:1987); + + ISO-8859-7 (ISO_8859-7:1987); ISO-8859-8 (ISO_8859-8:1988); + + ISO-8859-9 (ISO_8859-9:1989); ISO-8859-10 (latin6); + + NC_NC00-10 (NC_NC00-10:81); sami (latin-lap). 
+ + + New aliases + + + 037 (for charset IBM037); 038 (IBM038); 273 (IBM273); 274 (IBM274); + + 275 (IBM275); 278 (IBM278); 280 (IBM280); 281 (IBM281); 284 (IBM284); + + 285 (IBM285); 290 (IBM290); 297 (IBM297); 367 (ANSI_X3.4-1968); + + 420 (IBM420); 423 (IBM423); 424 (IBM424); 500, 500V1 (IBM500); + + 819 (ISO-8859-1); 864 (IBM864); 868 (IBM868); 870 (IBM870); + + 871 (IBM871); 880 (IBM880); 891 (IBM891); 903 (IBM903); 905 (IBM905); + + 912, CP912, IBM912 (ISO-8859-2); 918 (IBM918); 1026 (IBM1026); + + ECMA-113, ECMA-113:1986 (ECMA-Cyrillic); GOST_19768-74 (KOI8); + + ISO_8859-N (ISO-8859-N) for N = 1 through 10 and 13 through 15; + + ISO_8859-10:1993 (ISO-8869-10); iso-ir-170 (INVARIANT); + + KOI8_L2 (CSN_369103); pclatin2, pcl2 (IBM852); SS636127 (SEN_850200_B). + ++ New African charsets + + + AFRL1-101-BPI_OCIL (t-francais, t-fra); + + AFRFUL-102-BPI_OCIL (bambara, bra, ewondo, fulfulde); + + AFRFUL-103-BPI_OCIL (t-bambara, t-bra, t-ewondo, t-fulfulde); + + AFRLIN-104-BPI_OCIL (lingala, lin, sango, wolof); + + AFRLIN-105-BPI_OCIL (t-lingala, t-lin, t-sango, t-wolof). + ++ Extra miscellaneous charsets + + + KEYBCS2 (Kamenicky); CORK (T1); KOI-8_CS2. + ++ New HTML pseudo-charsets + + + HTML_1.1 (h1); HTML_2.0 (RFC 1866, 1866, h2); HTML-i18n (RFC 2070); + + HTML_3.2 (h3) reimplemented; HTML_4.0 (h4, HTML, h); + + deleted aliases HTF, 8859, ISO 8859, Entities, SGML, WWW, w3. + +Surfaces & aliases +------------------ + ++ Base64 (64, b64); Quoted-Printable (qp, Quote-Printable); ++ 21-Permutation (swabytes); 4321-Permutation; CR; CR-LF (cl); ++ Decimal-1 (d, d1); Decimal-2 (d2), Decimal-4 (d4); ++ Hexadecimal-1 (x, x1); Hexadecimal-2 (x2); Hexadecimal-4 (x4); ++ Octal-1 (o, o1); Octal-2 (o2); Octal-4 (o4). ++ data; test7; test8; test15; test16. + +Version 3.4 +=========== + +:Author: François Pinard, 1994-11. + ++ Charset HTML is new, it handles `&...;' sequences for Latin-1. ++ Charset AtariST handling is more general, --list may be used with it. 
++ Charset ASCII-BS overstriking has been extended, mainly for German. ++ Charset RFC1345 may be a goal, to debug or study RFC 1345 short names. ++ Charset names have been revised. Note that nextstep is now NeXT. ++ Option --force (-f) is accepted, but does not yet protect reversibility. ++ Option --quiet or --silent (-q) silences irreversible recoding messages. ++ Option --known=PAIRS (-k) helps searching through recodings. ++ Option --sequence=pipe (-p) does not fall back on -o anymore. ++ Option --auto-check may narrow its study around one particular charset. ++ An MSDOS port is available, check ftp.iro.umontreal.ca in pub/gnuish. ++ Compilation should now succeed on OS/2 EMX. Thanks to Kai Uwe Rommel. ++ Program initialization is almost three times faster on average. ++ Corrected reported bugs, added small improvements, some aesthetic. + +Version 3.3 +=========== + +:Author: François Pinard, 1993-12. + ++ Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added. ++ Also, most RFC 1345 charsets and aliases are handled. That's a bunch! ++ Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead. ++ Old maci disappears because of RFC 1345's macintosh, use applemac instead. ++ Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead. ++ Recoding between latin1, ibmpc and applemac is (almost) reversible. ++ The texinfo documentation has been reorganized, this to be continued. ++ Long options are accepted, charset names may be abbreviated. ++ Option --list (-l) displays charsets, aliases and contents in many formats. ++ Option --strict (-s) asks for stricter, non-reversible recodings. ++ Option --graphics (-g) approximates ibmpc rulers with ASCII graphics. ++ Option --header (-h) produces C source for many recoding tables. ++ Option --auto-check (-a) reports about all possible recodings. ++ Option --ignore (-x) prevents a charset from being selected. 
++ Execution has been sped up through step merging, hashing for charset names. ++ Many various buglets have been eradicated, portability increased. ++ Charsets may be edited out by modifying the Makefile only. ++ Configuration is made through the use of an external config.h file. ++ New -d `diacritics_only' option for LaTeX. ++ A few bugs have been corrected. ++ Documentation reorganization and improvements. ++ Increased portability, now uses Autoconf. ++ A few bugs solved. + +Version 3.2 +=========== + +:Author: François Pinard, 1991-10. + ++ MSDOS port redone. ++ New check goal at installation time. ++ Add -v option for verbose processing, remove old -q. ++ Add -i, -o and -p for letting the user control the strategy. ++ A few bugs corrected. ++ Embedded NULs should now be transmitted. + +Version 3.1 +=========== + +:Author: François Pinard, 1990-03. + ++ Rename -V to -C for showing Copyright. ++ Calling sequence changed, said files now recoded on themselves. ++ Add -t option for touching files. ++ Better on-line help. ++ Add -q option for quiet processing. ++ Executable file now considerably smaller, also speedier. ++ A few bugs corrected. + +Version 3.0 +=========== + +:Author: François Pinard, 1989-10. + ++ New Text to Latin1 processing, should be faster. ++ A few bugs corrected. + +For prior history down to 1980, see at the end of the ChangeLog. diff --git a/README b/README index f268551..46214c4 100644 --- a/README +++ b/README @@ -113,6 +113,37 @@ least, new features go to Recodec only. __ http://recodec.progiciels-bpi.ca +Notes for version 3.7-beta1 +--------------------------- + +The beta 1 pre-test release for the incoming Recode 3.7 has been made +available for those needing it right away. While it solves some serious +bugs and portability problems, others are meant to be addressed only in +later pre-tests. 
In particular, no charset or surface issues, user +requests, or various suggestions are addressed in this pre-test, nor will +they be in later pre-tests, until all real show-stoppers are solved. +So this is in no way a candidate for a Recode 3.7 release. + +The test suite deserves a few more comments: + ++ The suite is very partial, and should not be thought of as a validation + suite. Before it could be used to establish confidence, it would need + many more tests than it has already. + ++ Testing is notably faster than it used to be. For example, the + previous :code:`bigauto` test, which was not run by default because it + ran for too long, is now executed within the standard test suite, once + in non-strict mode, and a second time in strict mode. + ++ It does not use Autotest anymore, but rather a home-grown test driver + much inspired by the Codespeak project. The link between the test and + the Recode library is established through a Pyrex interface, so you need + to have :code:`python` and :code:`python-devel` installed first. + ++ Beware that the Pyrex interface to the Recode library is only meant + for testing, for now at least. While you may play with it, it would not + be wise to rely on it, as the specifications might change at any time. + Installation ============ diff --git a/configure b/configure index f2d07c8..806bdf6 100755 --- a/configure +++ b/configure @@ -2290,7 +2290,7 @@ fi # Define the identity of the package. PACKAGE=recode - VERSION=3.6 + VERSION=3.7-beta1 cat >>confdefs.h <<_ACEOF diff --git a/configure.ac b/configure.ac index 4095f20..ebfba51 100644 --- a/configure.ac +++ b/configure.ac @@ -5,7 +5,7 @@ AC_INIT(src/recode.c) AC_PREREQ(2.12) AM_CONFIG_HEADER(config.h) -AM_INIT_AUTOMAKE(recode, 3.6) +AM_INIT_AUTOMAKE(recode, 3.7-beta1) AC_PROG_CC AC_AIX diff --git a/doc/recode.info b/doc/recode.info index e85caa3..3c6e031 100644 --- a/doc/recode.info +++ b/doc/recode.info @@ -29,100 +29,100 @@ translation approved by the Foundation.  
Indirect: recode.info-1: 1138 -recode.info-2: 244435 +recode.info-2: 244441  Tag Table: (Indirect) Node: Top1138 -Node: Tutorial5541 -Node: Introduction9778 -Node: Charset overview14017 -Node: Surface overview15812 -Node: Contributing17284 -Ref: Contributing-Footnote-119530 -Node: Invoking recode19664 -Node: Synopsis20621 -Ref: Synopsis-Footnote-123063 -Node: Requests23362 -Ref: Requests-Footnote-129276 -Ref: Requests-Footnote-229343 -Ref: Requests-Footnote-329521 -Node: Listings29980 -Ref: Listings-Footnote-140398 -Node: Recoding40725 -Node: Reversibility43550 -Ref: Reversibility-Footnote-152053 -Node: Sequencing52190 -Node: Mixed54636 -Node: Emacs58029 -Node: Debugging59008 -Node: Library63275 -Node: Outer level64629 -Node: Request level70115 -Node: Task level80584 -Node: Charset level91006 -Node: Errors91848 -Ref: Errors-Footnote-196702 -Ref: Errors-Footnote-296816 -Node: Universal97177 -Ref: Universal-Footnote-1100305 -Ref: Universal-Footnote-2100373 -Node: UCS-2100586 -Node: UCS-4103120 -Node: UTF-7103662 -Node: UTF-8104259 -Node: UTF-16108566 -Node: count-characters109716 -Node: dump-with-names110389 -Node: libiconv112942 -Node: Tabular124499 -Node: ASCII misc146765 -Node: ASCII147131 -Node: ISO 8859147951 -Node: ASCII-BS150249 -Node: flat152088 -Node: IBM and MS152761 -Node: EBCDIC153334 -Node: IBM-PC155448 -Ref: IBM-PC-Footnote-1157570 -Node: Icon-QNX157729 -Node: CDC158156 -Node: Display Code159860 -Ref: Display Code-Footnote-1162144 -Node: CDC-NOS162349 -Node: Bang-Bang164313 -Node: Micros166244 -Node: Apple-Mac166629 -Node: AtariST168685 -Node: Miscellaneous169675 -Node: HTML170412 -Node: LaTeX176440 -Node: Texinfo177216 -Node: Vietnamese177996 -Node: African178976 -Node: Others180332 -Node: Texte181790 -Ref: Texte-Footnote-1186345 -Ref: Texte-Footnote-2186425 -Ref: Texte-Footnote-3186900 -Node: Mule186997 -Ref: Mule-Footnote-1188784 -Node: Surfaces189303 -Ref: Surfaces-Footnote-1192291 -Node: Permutations192397 -Node: End lines193242 -Node: MIME195449 
-Node: Dump196640 -Node: Test200834 -Node: Internals203314 -Node: Main flow204552 -Node: New charsets207672 -Node: New surfaces212215 -Node: Design212943 -Ref: Design-Footnote-1222158 -Node: Concept Index222262 -Node: Option Index237005 -Node: Library Index239858 -Node: Charset and Surface Index244435 +Node: Tutorial5547 +Node: Introduction9784 +Node: Charset overview14023 +Node: Surface overview15818 +Node: Contributing17290 +Ref: Contributing-Footnote-119536 +Node: Invoking recode19670 +Node: Synopsis20627 +Ref: Synopsis-Footnote-123069 +Node: Requests23368 +Ref: Requests-Footnote-129282 +Ref: Requests-Footnote-229349 +Ref: Requests-Footnote-329527 +Node: Listings29986 +Ref: Listings-Footnote-140404 +Node: Recoding40731 +Node: Reversibility43556 +Ref: Reversibility-Footnote-152059 +Node: Sequencing52196 +Node: Mixed54642 +Node: Emacs58035 +Node: Debugging59014 +Node: Library63281 +Node: Outer level64635 +Node: Request level70121 +Node: Task level80590 +Node: Charset level91012 +Node: Errors91854 +Ref: Errors-Footnote-196708 +Ref: Errors-Footnote-296822 +Node: Universal97183 +Ref: Universal-Footnote-1100311 +Ref: Universal-Footnote-2100379 +Node: UCS-2100592 +Node: UCS-4103126 +Node: UTF-7103668 +Node: UTF-8104265 +Node: UTF-16108572 +Node: count-characters109722 +Node: dump-with-names110395 +Node: libiconv112948 +Node: Tabular124505 +Node: ASCII misc146771 +Node: ASCII147137 +Node: ISO 8859147957 +Node: ASCII-BS150255 +Node: flat152094 +Node: IBM and MS152767 +Node: EBCDIC153340 +Node: IBM-PC155454 +Ref: IBM-PC-Footnote-1157576 +Node: Icon-QNX157735 +Node: CDC158162 +Node: Display Code159866 +Ref: Display Code-Footnote-1162150 +Node: CDC-NOS162355 +Node: Bang-Bang164319 +Node: Micros166250 +Node: Apple-Mac166635 +Node: AtariST168691 +Node: Miscellaneous169681 +Node: HTML170418 +Node: LaTeX176446 +Node: Texinfo177222 +Node: Vietnamese178002 +Node: African178982 +Node: Others180338 +Node: Texte181796 +Ref: Texte-Footnote-1186351 +Ref: Texte-Footnote-2186431 +Ref: 
Texte-Footnote-3186906 +Node: Mule187003 +Ref: Mule-Footnote-1188790 +Node: Surfaces189309 +Ref: Surfaces-Footnote-1192297 +Node: Permutations192403 +Node: End lines193248 +Node: MIME195455 +Node: Dump196646 +Node: Test200840 +Node: Internals203320 +Node: Main flow204558 +Node: New charsets207678 +Node: New surfaces212221 +Node: Design212949 +Ref: Design-Footnote-1222164 +Node: Concept Index222268 +Node: Option Index237011 +Node: Library Index239864 +Node: Charset and Surface Index244441  End Tag Table diff --git a/doc/recode.info-1 b/doc/recode.info-1 index 3e8c554..2119663 100644 --- a/doc/recode.info-1 +++ b/doc/recode.info-1 @@ -40,7 +40,7 @@ sets and is able to convert files between almost any pair. Most RFC 1345 character sets, and all `libiconv' character sets, are supported. The `recode' program is a handy front-end to the library. - The current `recode' release is 3.6. + The current `recode' release is 3.7-beta1. * Menu: diff --git a/doc/stamp-vti b/doc/stamp-vti index ed841a5..f6293d8 100644 --- a/doc/stamp-vti +++ b/doc/stamp-vti @@ -1,4 +1,4 @@ @set UPDATED 27 February 2008 @set UPDATED-MONTH February 2008 -@set EDITION 3.6 -@set VERSION 3.6 +@set EDITION 3.7-beta1 +@set VERSION 3.7-beta1 diff --git a/doc/version.texi b/doc/version.texi index ed841a5..f6293d8 100644 --- a/doc/version.texi +++ b/doc/version.texi @@ -1,4 +1,4 @@ @set UPDATED 27 February 2008 @set UPDATED-MONTH February 2008 -@set EDITION 3.6 -@set VERSION 3.6 +@set EDITION 3.7-beta1 +@set VERSION 3.7-beta1 diff --git a/src/main.c b/src/main.c index adfa6d5..e167921 100644 --- a/src/main.c +++ b/src/main.c @@ -622,7 +622,7 @@ Written by Franc,ois Pinard .\n"), stdout); fputs (_("\ \n\ -Copyright (C) 1990, 92, 93, 94, 96, 97, 99 Free Software Foundation, Inc.\n"), +Copyright (C) 1990, 92-94, 96, 97, 99, 08 Free Software Foundation, Inc.\n"), stdout); fputs (_("\ This is free software; see the source for copying conditions. 
There is NO\n\ diff --git a/src/recode.1 b/src/recode.1 index 43fea6c..bbcebbf 100644 --- a/src/recode.1 +++ b/src/recode.1 @@ -1,5 +1,5 @@ .\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.35. -.TH RECODE "1" "February 2008" "recode 3.6" "User Commands" +.TH RECODE "1" "February 2008" "recode 3.7-beta1" "User Commands" .SH NAME recode \- converts files between character sets .SH SYNOPSIS @@ -100,7 +100,7 @@ Written by Franc,ois Pinard . .SH "REPORTING BUGS" Report bugs to . .SH COPYRIGHT -Copyright \(co 1990, 92, 93, 94, 96, 97, 99 Free Software Foundation, Inc. +Copyright \(co 1990, 92-94, 96, 97, 99, 08 Free Software Foundation, Inc. .br This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. -- 2.40.0