.. role:: file(literal)

Here is version 3.6 for the Recode program and library. Hereafter,
Recode means the whole package, :code:`recode` means the executable
program. Glance through this :file:`README` file before starting
configuration. Make sure you read files :file:`ABOUT-NLS` and
:file:`INSTALL` if you are not familiar with them already.

The Recode library converts files between character sets and usages.
It recognises or produces over 200 different character sets (or about
300 if combined with an :code:`iconv` library) and transliterates files
between almost any pair. When exact transliteration is not possible,
it gets rid of offending characters or falls back on approximations.
The :code:`recode` program is a handy front-end to the library.

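
As a minimal sketch of the front-end in use, the following Bourne shell
fragment pipes one Latin-1 line through :code:`recode` and captures the
UTF-8 result. It assumes :code:`recode` is on your ``PATH``; the
``latin1..utf8`` request and the sample text are illustrative choices,
not taken from this file. When the program is missing, the fragment
only says so.

```shell
# "caf\351" is "café" in Latin-1 (octal 351 is the byte 0xE9).
if command -v recode >/dev/null 2>&1; then
    # A request has the form BEFORE..AFTER; recode acts as a filter here.
    converted=$(printf 'caf\351\n' | recode latin1..utf8)
    printf '%s\n' "$converted"
else
    converted=''
    echo 'recode not found; nothing converted' >&2
fi
```
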
The Recode program and library were written by François Pinard, and
significantly reuse tabular works from Keld Simonsen. Recode is an
evolving package, and specifications might change in future releases.

On various Unix systems, Recode is usually compiled from sources;
see the `Installation`_ section below. On Linux, it often comes
bundled. Recode has been ported to other popular systems. See both
`contrib/README`__ and the `Non-Unix ports`_ section below to find
more information about these.

Reports and collaboration
-------------------------

Send bug reports to mailto:recode-bugs@iro.umontreal.ca . A bug
report is an adequate description of the problem: your input, what you
expected, what you got, and why this is wrong. Diffs are welcome, but
they only describe a solution, from which the problem might be hard
to infer. If needed, submit actual data files with your report. Small
data files are preferred. Big files may sometimes be necessary, but do
not send them on the mailing list; rather take special arrangement with

Your feedback will help us to make a better and more portable
package. Consider documentation errors as bugs, and report them
as such. If you develop anything pertaining to Recode or have
suggestions, let us know and share your findings by writing to
mailto:recode-forum@iro.umontreal.ca . You may also choose to write
directly to mailto:pinard@iro.umontreal.ca, yet be warned that such
correspondence is often visible for a while through the Recode Web site.

If you feel like receiving releases and pretest announcements for the
Recode package, send a message to mailto:majordomo@iro.umontreal.ca
having, in its body, a line saying::

  subscribe recode-announce

If you would rather participate actively in discussions, pretesting
and development for Recode, do just as above, but this time, use::

  subscribe recode-forum

Visit http://recode.progiciels-bpi.ca/ for releases or pretests, and related
files. In particular, button ``Browse`` gives access to a weekly mirror
of the current unpackaged work files, while button ``Folders`` gives
access to saved or pending correspondence.

Please *do not* widely redistribute releases having a letter after the
version numbers, as these are meant for pretesting only, and might not
be stable enough for other usages.

My plan has long been to end the 3.x series of this package, aiming
instead at 4.0 as a major internal rewrite. As there is still a long
way before 4.0 gets ready, and *especially* because some of my good
collaborators insisted that I do so, there will be a Recode 3.7. That
release is meant to provide a selection of user-contributed patches.

For prototyping what Recode will become and experimenting with new
concepts more easily, I created a subsidiary and standalone project named
Recodec, meant to receive the best part of my development efforts in
this particular area. Once I am happy with the prototype, the plan
is to rewrite it from Python to C, somehow. Visit the Web pages for
this `Recodec project`__ for more information and details. For now at
least, new features go to Recodec only.

__ http://recodec.progiciels-bpi.ca

Notes for version 3.7-beta2
---------------------------

Here are a few notes related to the beta2 pre-test release for the upcoming
Recode 3.7. I publish it to ease later exchanges of patches with testers.

+ The name has been changed from Free recode to Recode -- as "Free" was
  a four letter word to some people :-). :code:`recode` (no capital) still
  names the executable program specifically, or the distribution archive

+ Recode does not itself include :code:`libiconv` anymore. However,
  it uses an external :code:`iconv` library if one is available at
  installation time, like :code:`libiconv` or the one provided within GNU
  :code:`libc`. The ``-x:`` option to the program, or a new flag to the
  library :code:`recode_new_outer` function, inhibits the initialisation
  and usage of :code:`iconv`.

+ The bug about losing a few characters, here and there, when recoding
  big files in :code:`iconv` context, seems to have been corrected. A
  patch for this problem has been floating around for years, but it was
  not solving all cases.

+ Recode installation now uses Python. In particular, it creates
  file :file:`build/src/iconvdecl.h` from local ``iconv -l`` output.
  Recode testing through ``make check`` also needs what people call
  :code:`python-devel`, providing C header files for Python and
  :code:`distutils`. The :file:`Makemore` file has been merged within
  regular Makefiles and is not distributed separately anymore.

+ It is likely that new bugs have been introduced through the above
  changes. In particular, not everything is cosy on the side of release
  engineering. A few files are either spuriously remade, or remade late.
  I'm a bit surprised by how difficult it is to get this right.

+ ``make check`` accepts a ``LIMIT=`` option, for limiting tests to one or
  a few cases. See :file:`tests/Makefile` for more information.

+ PO files have been updated from the Translation Project.

Notes for version 3.7-beta1
---------------------------

The beta 1 pre-test release for the upcoming Recode 3.7 has been made
available for those needing it right away. While it solves some serious
bugs and portability problems, others are meant to be addressed only in
later pre-tests. In particular, no charset or surface issues, user
requests, or other suggestions are addressed in this pre-test, nor will
they be in later pre-tests until all real show-stoppers are solved.
So this is in no way a candidate for a Recode 3.7 release.

The test suite is worth more comments:

+ The suite is very partial, and should not be thought of as a validation
  suite. Before it could be used to ascertain confidence, it would need
  many more tests than it has already.

+ Testing is notably faster than it used to be. For example, the
  previous :code:`bigauto` test, which was not run by default because it
  ran for too long, is now executed within the standard test suite, once
  in non-strict mode, and a second time in strict mode.

+ It does not use Autotest anymore, but rather a home-grown test driver
  much inspired by the Codespeak project. The link between the test and
  the Recode library is established through a Pyrex interface, so you need
  to have :code:`python` and :code:`python-devel` installed first.

+ Beware that the Pyrex interface to the Recode library is only meant
  for testing, for now at least. While you may play with it, it would not
  be wise to rely on it, as the specifications might change at any time.

Simple installation of Recode requires the same tools and facilities as
those needed for most GNU packages. If not already bundled with your
system, you also need to pre-install Python, version 2.2 or better. You

  http://www.python.org

It is also convenient to have some :code:`iconv` library already present
on your system, as this much extends Recode capabilities, especially in
the area of Asian character sets. GNU :code:`libc`, as found on
Linux systems and a few others, already has such an :code:`iconv`
library. Otherwise, you might consider pre-installing the portable
:code:`libiconv`, written by Bruno Haible. You may get it from:

  http://www.gnu.org/software/libiconv/

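
Before configuring, it may help to check which of these prerequisites
are already present. The fragment below is only a sketch: it looks for
:code:`python` and :code:`iconv` programs on the ``PATH``, which is a
hint rather than proof, since a usable :code:`iconv` library can exist
without the command-line program. The variable names are illustrative.

```shell
# Report whether the python and iconv programs can be found.
missing=''
for tool in python iconv; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found"
        missing="$missing $tool"
    fi
done

# Show the Python version, to compare by eye with the 2.2 minimum.
# Older Python versions print it on stderr, hence the redirection.
if command -v python >/dev/null 2>&1; then
    python --version 2>&1
fi
```
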
The canonical distribution point for this version is:

  http://recode.progiciels-bpi.ca/archives/recode.tar.gz

GNU mirrors usually hold a copy of non-pretest releases; the canonical
distribution point for the last such release is:

  ftp://ftp.gnu.org/pub/gnu/recode/recode-3.6.tar.gz

Some older distributions *might* be available in this directory:

  http://recode.progiciels-bpi.ca/archives/

Visit http://github.com/pinard/Recode/tree/dev-3.7 and use the
*Download* button to get a packaged copy of development sources. If you
happen to be a Git lover, you may rather use::

  git clone git://github.com/pinard/Recode.git

and then check out branch ``dev-3.7``. File timestamps after checkout
may trigger Make difficulties. To avoid them, execute
``sh after-patch.sh`` from the top level of the distribution. If you
lack either :code:`sh` or GNU :code:`touch`, try
``python after-patch.py`` instead.

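
Put together, the Git route might look like the following sketch. The
function name ``fetch_recode_dev`` is made up for this illustration, and
the fragment assumes network access, a :code:`git` client, and that the
clone lands in a directory named :file:`Recode`. Defining the function
runs nothing; call it when ready.

```shell
# Hypothetical helper: clone, switch to dev-3.7, then fix timestamps.
# The body runs in a subshell, so the caller's directory is unchanged.
fetch_recode_dev() (
    git clone git://github.com/pinard/Recode.git &&
    cd Recode &&
    git checkout dev-3.7 &&
    sh after-patch.sh
)
```
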
Once you have an unpacked distribution, see files:

=================== =======================================================
File name           Description
=================== =======================================================
:file:`ABOUT-NLS`   how to customise this program to your language
:file:`COPYING`     copying conditions for the program
:file:`COPYING.LIB` copying conditions for the library
:file:`INSTALL`     compilation and installation instructions
:file:`NEWS`        major changes in the current release
:file:`THANKS`      partial list of contributors
=================== =======================================================

Besides those configure options documented in files :file:`INSTALL`
and :file:`ABOUT-NLS`, a few extra options may be accepted after

+ Options ``--disable-shared`` or ``--disable-static``

  to inhibit the building of shared libraries or static libraries; the
  default is to always build static libraries, and to attempt building
  shared libraries if there is some known recipe for this.

+ Option ``--with-gnu-ld``

  to force the assumption that the C compiler uses GNU ld.

+ Option ``--with-dmalloc``

  to trigger a debugging feature for looking at memory management
  problems; it requires Gray Watson's package, which is available as
  ftp://ftp.letters.com/src/dmalloc/dmalloc.tar.gz .

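
For instance, a static-only build could be configured along the lines
of this sketch. The wrapper name ``configure_static_only`` is invented
for the illustration; it is meant to be called from the top level of an
unpacked distribution, and any further arguments are passed through to
``./configure`` unchanged.

```shell
# Hypothetical wrapper around ./configure for a static-only build.
configure_static_only() {
    if [ ! -x ./configure ]; then
        echo 'no ./configure here; run this from the top level' >&2
        return 1
    fi
    ./configure --disable-shared "$@"
}
```

Calling ``configure_static_only --with-gnu-ld`` would then combine two
of the options listed above.
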
For simple modifications to Recode, you should not need special tools
beyond those usual for installing GNU packages. However, if you modify
any :file:`.l` source file, Python and Flex are both needed for remaking

For more comprehensive modifications, you might need more tools. If not
done already, make sure you have a copy of the packages listed in the
following table. You may also choose to establish a link in your build
:file:`doc/` directory, as explained within :file:`doc/Makemore`.

================ ========== ========== =============
Package name     Current    Minimum    Install after
================ ========== ========== =============
:code:`autoconf` 2.61       2.12       :code:`m4`
:code:`automake` 1.10       1.9        :code:`Perl`
:code:`Flex`     2.5.33     2.5.4a
:code:`gettext`  0.16       0.16
:code:`Help2man` 1.36       1.020      :code:`Perl`
:code:`libtool`  1.5.24     1.3.4
:code:`m4`       1.4.10     1.4n
:code:`Perl`     5.8.8      5.005.03
:code:`Python`   2.5.1      2.2
:code:`tar`      1.17       1.12
================ ========== ========== =============

The *current* version numbers just happen to be those used for
development; it is often likely that older versions would work just as
well. The *minimum* version numbers were once acceptable, but they might
not be anymore, as this has not been verified; any updating information is

Here are a few hints which might help installing Recode on some systems.
Many may be applied by temporarily presetting environment variables while
calling ``./configure``. File :file:`INSTALL` explains this.

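
The presetting idiom puts the assignment on the same line as the
command, so the value reaches that one command only. The sketch below
uses a small ``sh -c`` child as a stand-in for ``./configure``; the
variable names ``seen_by_child`` and ``seen_by_parent`` are illustrative.

```shell
unset CFLAGS

# The assignment prefix applies to this single child process only.
seen_by_child=$(CFLAGS=-Ml sh -c 'printf "%s" "$CFLAGS"')

# Back in the calling shell, CFLAGS is still unset.
seen_by_parent=${CFLAGS-unset}

echo "child saw: $seen_by_child"
echo "parent sees: $seen_by_parent"
```
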
Some C compilers, like Apollo's, have a hard time compiling
:file:`merged.c`. If this is your case, avoid compiler
optimisation. From within the Bourne shell, you may use::

But if you want to give a real hard time to your C optimiser on
:file:`merged.c`, to get code that runs only a bit faster, merely

  CPPFLAGS=-DINLINE_HARDER ./configure

For 80286 based systems (do some still exist?!), it has been
reported that some compilers generate wrong code while optimising
for *small* models. So, from within the Bourne shell, do::

  CFLAGS=-Ml LDFLAGS=-Ml ./configure

to force the large memory model. For the 80286 Xenix compiler, the last
time it was tried a while ago, one ought to use::

  CFLAGS='-Ml -F2000' LDFLAGS=-Ml ./configure

Other systems have poor :code:`pipe`/:code:`popen` support or thrash
heavily when processes fork. In this case, just before doing
``make``, edit :file:`config.h` and ensure :code:`HAVE_PIPE` is

+ Character Mnemonics & Character Sets

  + ftp://nic.ddn.mil/rfc/rfc1345.txt

  Keld Simonsen <keld@dkuug.dk>, 1992-06.

+ UTF-7 - A Mail-Safe Transformation Format of Unicode

  + ftp://nic.ddn.mil/rfc/rfc1642.txt

  David Goldsmith <david_goldsmith@taligent.com> and Mark Davis
  <mark_davis@taligent.com>, 1994-07.

+ UTF-8, a transformation format of Unicode and ISO 10646

  + ftp://nic.ddn.mil/rfc/rfc2044.txt

  François Yergeau <yergeau@alis.com>, 1996-10.

+ Unicode charset mappings

  + ftp://ftp.unicode.org:/Public/MAPPINGS/

  The Unicode consortium makes available plenty of charset mappings
  for converting "legacy" charsets to Unicode.

+ Normalisation et internationalisation: Inventaire et prospectives des
  normes clefs pour le traitement informatique du français. (392p.)

  This is a report, written in French, discussing charset issues and many
  other topics as well. Laurent Bourbeau <bourbeau@progiciels-bpi.ca>
  and François Pinard <pinard@iro.umontreal.ca>, 1995-10.

  + ftp://ftp.iro.umontreal.ca/pub/contrib/pinard/accents/oqil-tome1.ps.gz
  + http://www.ceveil.qc.ca/Normes

In 1999, the organisers of the `m17n99 conference`__ in Tsukuba,
Japan, were kind enough to invite me. This has been for me a
fabulous trip and experience, and I met many extraordinary people
there. At the conference, I presented the Translation Project, and
Recode. The Recode `presentation slides`__ are available.

__ http://www.m17n.org/conference/m17n99_all_but_registration/welcome.en.html

This comprehensive charset converter library revolves around Unicode,
and supports Asian encodings among many others. Even Recode uses it!

+ http://www.gnu.org/software/libiconv/

Bruno Haible <haible@ilog.fr>

Here is the main recoding tool from the Plan9 project.

+ ftp://research.att.com/dist/tcs.shar.Z

This GUI editor handles many encodings, among which UTF-8. It also
installs uniconv, a recoding program, and uniprint, a printing tool.

+ ftp://sunsite.unc.edu/pub/Linux/apps/editors/X/yudit-1.2.tar.gz

Gaspar Sinai <gsinai@iname.com>, 1999-01.

These 6x13 fonts, covering Unicode characters besides the Asian sets,
merely replace the Linux fixed 6x13 font. They work nicely with yudit.

+ http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz

Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>, 1998-11.

This charset converter is oriented towards SGML text manipulation. It
may be freely downloaded for non-commercial, non-military use from:

+ http://www.lpl.univ-aix.fr/projects/multext/MtRecode/

Pointer given by Jean Véronis <veronis@univ-aix.fr>, 1996-06.

This quite nice SGML structure analyser contains internal C++ modules
for handling many charsets.

+ ftp://ftp.jclark.com/pub/sp/sp-1.3.tar.gz

James Clark <jjc@jclark.com>

This program is able to generate interpreted
character dumps, but properly embedded within complete C header files.

+ http://research.de.uu.net:8080/~gnu/b2c/b2c-2.1.tar.gz

Jörg Heitkötter <Joerg.Heitkoetter@de.uu.net>, 1997-11.

This wrapper provides Recode functionality to Python programs.

+ http://www.suxers.de/PyRecode.tgz

Andreas Jung <ajung@server.python.net>

+ http://www.vex.net/parnassus/apyllo.py?find=recode
+ http://www.suxers.de/python/pyrecode.htm

Please write to mailto:recode-bugs@iro.umontreal.ca if you are aware of
ports to non-Unix systems not listed here, or for corrections. Please
provide the target system, a complete and stable URL, the maintainer name
and address, the Recode version used as a base, and your comments.

Juan Manuel Guerrero <juan.guerrero@gmx.de> maintains this port,
dated 2001-03 and based on Recode 3.5. The following archives hold
binaries, docs and sources respectively.

+ ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode35b.zip
+ ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode35d.zip
+ ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/rcode35s.zip

See `contrib/DJGPP/README`__ in the Recode distribution for more
information about compiling this port.

Darrel Hankerson <hankedr@mail.auburn.edu> maintains this port, dated
1994-11 and based on Recode 3.4. You get many GNU tools, not only
Recode. The GNUish project is described in :file:`gnuish_t.htm`.

+ http://www.simtel.net/simtel.net/
+ http://www.leo.org/pub/comp/platforms/pc/gnuish (Germany)
+ ftp://ftp.simtel.net/simtelnet/gnu
+ ftp://ftp.leo.org/pub/comp/platforms/pc/gnuish

+ OS/2 (using emx/gcc)

  Maintainer unknown (maybe Kai Uwe Rommel <rommel@ars.de>), dated
  1994-11 and based on Recode 3.4.

  + http://hobbes.nmsu.edu/pub/os2/util/convert/gnurcode.zip