From: François Pinard
Date: Sat, 15 Mar 2008 15:53:34 +0000 (-0400)
Subject: README updated
X-Git-Tag: v3.7~219
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=c1a5b4eec096a6f2fd3b36c7d06f811aaac909b5;p=recode

README updated
---

diff --git a/README b/README
index c80d754..e94a72a 100644
--- a/README
+++ b/README
@@ -16,8 +16,6 @@ README file for Recode :--> -.. - .. contents:: .. sectnum::
@@ -34,25 +32,23 @@ configuration. Make sure you read files :file:`ABOUT-NLS` and :file:`INSTALL` if you are not familiar with them already. The Recode library converts files between character sets and usages. -It recognises or produces more than 300 different character sets -and transliterates files between almost any pair. When exact -transliteration are not possible, it gets rid of offending characters -or falls back on approximations. The :code:`recode` program is a handy -front-end to the library. +It recognises or produces over 200 different character sets (or about +300 if combined with an :code:`iconv` library) and transliterates files +between almost any pair. When exact transliteration are not possible, +it gets rid of offending characters or falls back on approximations. +The :code:`recode` program is a handy front-end to the library. The Recode program and library have been written by François Pinard, -yet it significantly reuses works from Keld Simonsen and Bruno Haible. -It is an evolving package, and specifications might change in future -releases. - -Little notes: +yet it significantly reuses tabular works from Keld Simonsen. It is an +evolving package, and specifications might change in future releases. -+ Option ``-f`` is now fairly implemented, yet not fully. -+ there is a `contrib/`__ directory in the distribution. -+ In 1999, I gave a `presentation`__ of Recode in Japan. +On various Unix systems, Recode is usually compiled from sources, +see the `Installation`_ section below. On Linux, it often comes +bundled.
Recode had been ported to other popular systems. See both +`contrib/README`__ and the `Non-Unix ports`_ section below, to find +some more information about these. __ contrib.html -__ m17n99.html Reports and collaboration -------------------------
@@ -124,18 +120,17 @@ Recode 3.7. I publish it to ease later exchanges of patches with testers. names the executable program specifically, or the distribution archive itself. -+ Recode does not include :code:`libiconv` anymore. However, it uses - an external :code:`iconv` library if one is available at installation - time, like :code:`libiconv` or the one provided within GNU :code:`libc`. - The ``-x:`` option to the program, or a new flag to the library - :code:`recode_new_outer` function, inhibits the initialisation and - usage of :code:`iconv`. ++ Recode does not itself include :code:`libiconv` anymore. However, + it uses an external :code:`iconv` library if one is available at + installation time, like :code:`libiconv` or the one provided within GNU + :code:`libc`. The ``-x:`` option to the program, or a new flag to the + library :code:`recode_new_outer` function, inhibits the initialisation + and usage of :code:`iconv`. + The bug about loosing a few characters, here and there, when recoding big files in :code:`iconv` context, seems to have been corrected. A patch for this problem has been floating around for years, but it was - not solving all cases. For this particular problem, I still have to - check a few user-submitted data files demonstrating the problem. + not solving all cases. + Recode installation now uses Python. In particular, it creates file :file:`build/src/iconvdecl.h` from local ``iconv -l`` output.
@@ -231,12 +226,11 @@ these commands:: git clone git://recode%(bpi)s/recode cd recode - sh after-git.sh + sh after-patch.sh -(or ``python after-git.py`` if you miss either :code:`sh` or GNU +(or ``python after-patch.py`` if you miss either :code:`sh` or GNU :code:`touch`).
- Once you have an unpacked distribution, see files: =================== =======================================================
@@ -265,7 +259,7 @@ and :file:`ABOUT-NLS`, a few extra options may be accepted after + Option ``--with-gnu-ld`` - to force the assomption that the C compiler uses GNU ld. + to force the assumption that the C compiler uses GNU ld. + Option ``--with-dmalloc``
@@ -347,3 +341,164 @@ calling ``./configure``. File :file:`INSTALL` explains this. heavily when processes fork. In this case, just before doing ``make``, edit :file:`config.h` and ensure :code:`HAVE_PIPE` is *not* defined. + +External pointers +================= + +Documentation +------------- + ++ IETF references + + + Character Mnemonics & Character Sets + + + ftp://nic.ddn.mil/rfc/rfc1345.txt + + Keld Simonsen , 1992-06. + + + UTF-7 - A Mail-Safe Transformation Format of Unicode + + + ftp://nic.ddn.mil/rfc/rfc1642.txt + + David Goldsmith and Mark Davis + , 1994-07. + + + UTF-8, a transformation format of Unicode and ISO 10646 + + + ftp://nic.ddn.mil/rfc/rfc2044.txt + + François Yergeau , 1997-10. + ++ Various references + + + Unicode charset mappings + + + ftp://ftp.unicode.org:/Public/MAPPINGS/ + + The Unicode consortium makes available plenty of charset mappings + for converting "legacy" charsets to Unicode. + + + Normalisation et internationalisation: Inventaire et prospectives des + normes clefs pour le traitement informatique du français. (392p.) + + This is a report, written in French, discussing charset issues and many + other topics as well. Laurent Bourbeau + and François Pinard , 1995-10. + + + ftp://ftp.iro.umontreal.ca/pub/contrib/pinard/accents/oqil-tome1.ps.gz + + http://www.ceveil.qc.ca/Normes + ++ Recode specific + + + ETL presentation + + In 1999, the organisers of the `m17n99 conference`__ in Tsukuba, + Japan, were kind enough to invite me. This has been for me a + fabulous trip and experience, and I met many extraordinary people in + there.
At the conference, I presented the Translation Project, and + Recode. The Recode `presentation slides`__ are available. + +__ http://www.m17n.org/conference/m17n99_all_but_registration/welcome.en.html +__ m17n99.html + +Programs +-------- + ++ :code:`libiconv` + + This comprehensive charset converter library revolves around Unicode, + and support Asian encodings among many others. Even Recode uses it! + + + http://www.gnu.org/software/libiconv/ + + Bruno Haible + ++ :code:`tcs` + + Here is the main recoding tool from the Plan9 project. + + + ftp://research.att.com/dist/tcs.shar.Z + ++ :code:`yuedit` + + This GUI editor handles many encodings, among which UTF-8. It also + installs uniconv, a recoding program, and uniprint, a printing tool. + + + ftp://sunsite.unc.edu/pub/Linux/apps/editors/X/yudit-1.2.tar.gz + + Gaspar Sinai , 1999-01. + ++ :code:`ucs-fonts` + + These 6x13 fonts, covering Unicode characters besides the Asian sets, + merely replace the Linux fixed 6x13 font. Works nicely with yudit. + + + http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz + + Markus Kuhn , 1998-11. + ++ :code:`MtRecode` + + This charset converter is oriented towards SGML text manipulation. It + may be freely downloaded for non-commercial, non-military use from: + + + http://www.lpl.univ-aix.fr/projects/multext/MtRecode/ + + Pointer given by Jean Véronis , 1996-06. + ++ :code:`sp` + + This quite nice SGML structure analyser contains internal C++ modules + for handling many charsets. + + + ftp://ftp.jclark.com/pub/sp/sp-1.3.tar.gz + + James Clark + ++ :code:`b2c` + + This program is able to generate interpreted + character dumps, but properly embedded within complete C header files. + + + http://research.de.uu.net:8080/~gnu/b2c/b2c-2.1.tar.gz + + Jörg Heitkötter , 1997-11. + ++ :code:`PyRecode` + + This wrapper provides Recode functionality to Python programs. 
+ + http://www.suxers.de/PyRecode.tgz + + Andreas Jung + + Also see: + + + http://www.vex.net/parnassus/apyllo.py?find=recode + + http://www.suxers.de/python/pyrecode.htm + +Non-Unix ports +-------------- + +Please mailto:recode-bugs@iro.umontreal.ca if you are aware of various +ports to non-Unix systems not listed here, or for corrections. Please +provide the goal system, a complete and stable URL, the maintainer name +and address, the Recode version used as a base, and your comments. + ++ IBM/PC (MSDOS) + + + + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34b.zip + + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34s.zip + + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34d.zip + + (for binaries, sources and docs respectively) maintained by Wojciech + Galazka Based on Recode 3.4.1. + + + + http://www.simtel.net/simtel.net/ + + http://www.leo.org/pub/comp/platforms/pc/gnuish (Germany) + + ftp://ftp.simtel.net/simtelnet/gnu + + ftp://ftp.leo.org/pub/comp/platforms/pc/gnuish + + maintained by Darrel Hankerson You get + many GNU tools, not only Recode. The GNUish project is described in + :file:`gnuish_t.htm`.
diff --git a/contrib/README b/contrib/README
index 49c3b21..6f0cea2 100644
--- a/contrib/README
+++ b/contrib/README
@@ -5,12 +5,6 @@ README file for :file:`recode/contrib/` ======================================= -.. contents:: -.. sectnum:: - -Hi, people! -=========== - The :file:`contrib/` directory of the Recode distribution contains a few miscellaneous tools, ports, or such things, which have been collected here and there, a bit randomly, for your possible entertainment or use.
@@ -20,10 +14,9 @@ might have to contact the authors directly to get support. There is no guarantee that any file in this directory will still exist in subsequent releases. Finally, there is no guarantee either that I will accept to include contributions, unless I find them very good or very small.
But -I'm quite willing to give URL pointers to other tools, right here! +I'm quite willing to give `URL pointers`__ to other tools. -Included files -============== +__ /index.html#external-pointers To the best of my knowledge, all included files are free, and already available widely by other means. I did not collect any kind legalistic
@@ -79,151 +72,3 @@ help me at deciding what should be kept and what should go away. without a leading tab. Hence my :file:`/usr/src/redhat/` is still ``root:root``, and yet I can do my :code:`rpm` building as myself. - -External pointers -================= - -Documentation -------------- - -+ IETF references - - + Character Mnemonics & Character Sets - - + ftp://nic.ddn.mil/rfc/rfc1345.txt - - Keld Simonsen , 1992-06. - - + UTF-7 - A Mail-Safe Transformation Format of Unicode - - + ftp://nic.ddn.mil/rfc/rfc1642.txt - - David Goldsmith and Mark Davis - , 1994-07. - - + UTF-8, a transformation format of Unicode and ISO 10646 - - + ftp://nic.ddn.mil/rfc/rfc2044.txt - - François Yergeau , 1997-10. - -+ Various references - - + Unicode charset mappings - - + ftp://ftp.unicode.org:/Public/MAPPINGS/ - - The Unicode consortium makes available plenty of charset mappings - for converting "legacy" charsets to Unicode. - - + Normalisation et internationalisation: Inventaire et prospectives des - normes clefs pour le traitement informatique du français. (392p.) - - This is a report, written in French, discussing charset issues and many - other topics as well. Laurent Bourbeau - and François Pinard , 1995-10. - - + ftp://ftp.iro.umontreal.ca/pub/contrib/pinard/accents/oqil-tome1.ps.gz - + http://www.ceveil.qc.ca/Normes - -Programs --------- - -+ :code:`libiconv` - - This comprehensive charset converter library revolves around Unicode, - and support Asian encodings among many others. Even Recode uses it! - - + http://www.gnu.org/software/libiconv/ - - Bruno Haible - -+ :code:`tcs` - - Here is the main recoding tool from the Plan9 project.
- - + ftp://research.att.com/dist/tcs.shar.Z - -+ :code:`yuedit` - - This GUI editor handles many encodings, among which UTF-8. It also - installs uniconv, a recoding program, and uniprint, a printing tool. - - + ftp://sunsite.unc.edu/pub/Linux/apps/editors/X/yudit-1.2.tar.gz - - Gaspar Sinai , 1999-01. - -+ :code:`ucs-fonts` - - These 6x13 fonts, covering Unicode characters besides the Asian sets, - merely replace the Linux fixed 6x13 font. Works nicely with yudit. - - + http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz - - Markus Kuhn , 1998-11. - -+ :code:`MtRecode` - - This charset converter is oriented towards SGML text manipulation. It - may be freely downloaded for non-commercial, non-military use from: - - + http://www.lpl.univ-aix.fr/projects/multext/MtRecode/ - - Pointer given by Jean Véronis , 1996-06. - -+ :code:`sp` - - This quite nice SGML structure analyser contains internal C++ modules - for handling many charsets. - - + ftp://ftp.jclark.com/pub/sp/sp-1.3.tar.gz - - James Clark - -+ :code:`b2c` - - This program is able to generate interpreted - character dumps, but properly embedded within complete C header files. - - + http://research.de.uu.net:8080/~gnu/b2c/b2c-2.1.tar.gz - - Jörg Heitkötter , 1997-11. - -+ :code:`PyRecode` - - This wrapper provides Recode functionality to Python programs. - - + http://www.suxers.de/PyRecode.tgz - - Andreas Jung - - Also see: - - + http://www.vex.net/parnassus/apyllo.py?find=recode - + http://www.suxers.de/python/pyrecode.htm - -Non-Unix ports or alikes -======================== - -Please mailto:recode-bugs@iro.umontreal.ca if you are aware of various -ports to non-Unix systems not listed here, or for corrections. Please -provide the goal system, a complete and stable URL, the maintainer name -and address, the Recode version used as a base, and your comments. 
- -+ IBM/PC (MSDOS) - - + + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34b.zip - + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34s.zip - + ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode34d.zip - - (for binaries, sources and docs respectively) maintained by Wojciech - Galazka Based on Recode 3.4.1. - - + + http://www.simtel.net/simtel.net/ - + http://www.leo.org/pub/comp/platforms/pc/gnuish (Germany) - + ftp://ftp.simtel.net/simtelnet/gnu - + ftp://ftp.leo.org/pub/comp/platforms/pc/gnuish - - maintained by Darrel Hankerson You get - many GNU tools, not only Recode. The GNUish project is described in - :file:`gnuish_t.htm`.