Implemented "character map" system for replacing Unicode
characters (closes #1226009).
::PROBLEM:
The existing manpages mechanism for replacing Unicode symbols and
special characters with roff equivalents is not scalable and not
anywhere near as complete as it should be.
For example, the mechanism currently handles only a (somewhat
arbitrary) selection of fewer than 20 Unicode characters.
But there are potentially more than _800_ Unicode special
characters that have some groff equivalent they can be mapped to.
And there are about 34 symbols in the Latin-1 (ISO-8859-1) block
alone. Users might reasonably expect that if they include any of
those Latin-1 characters in their DocBook source documents, they
will get correctly converted to known roff equivalents in output.
In addition to those common symbols, certain users may have a need
to use symbols from other Unicode blocks.
Say, somebody who is documenting an application related to math
might need to use a bunch of symbols from the "Mathematical
Operators" Unicode block (there are about 65 characters in that
block that have reasonable roff equivalents).
Or somebody else might really like Dingbats -- such as the
checkmark character (I like that one myself) -- and so might use
a bunch of things from the "Dingbats" block (141 characters in
that block that have roff equivalents or that can at least be
"degraded" somewhat gracefully into roff).
So we need a mechanism that is capable of handling all those 800
Unicode characters that have roff equivalents -- and/or of
allowing users to choose which Unicode blocks to use (through
tuning the value of a parameter or something).
::FIX:
Replaced the current Unicode character-substitution mechanism
(replace-entities template) with a completely different
character-substitution mechanism that is based on use of a
"character map" (in a format compliant with the XSLT 2.0 spec and
therefore completely "forward compatible" with XSLT 2.0).
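To illustrate the idea only (this is not the stylesheet code, which
is XSLT), a character map is a lookup table from single Unicode
characters to replacement strings. The sketch below models that in
Python. The name apply_charmap and the three sample entries are
assumptions made for the example; the roff strings are ordinary
groff glyph escapes.

```python
# Illustrative sketch only: a tiny character map modeled as a dict.
# The real map shipped with the stylesheets is far larger and is
# expressed in XSLT, not Python.
CHARMAP = {
    "\u00a9": r"\(co",  # COPYRIGHT SIGN  -> groff copyright glyph
    "\u00ae": r"\(rg",  # REGISTERED SIGN -> groff registered glyph
    "\u2014": r"\(em",  # EM DASH         -> groff em dash
}


def apply_charmap(text, charmap=CHARMAP):
    """Replace every mapped character in text with its roff string.

    Characters that have no entry in the map pass through unchanged.
    """
    return text.translate(str.maketrans(charmap))
```

Because the table is plain data, supporting more characters means
adding entries rather than adding template logic.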
By default, the new "character map" mechanism does replacement of
all Latin-1 symbols, along with most special spaces, dashes, and
quotes (about 75 characters by default, compared to the fewer than
20 special characters that were handled previously). And the
"full" character map provides support for converting about 800
characters.
The mechanism is controlled through the following parameters:
- man.charmap.enabled
  turns character-map support on/off
- man.charmap.use.subset.xml
  specifies that a subset of the character map is used instead
  of the full character map
- man.charmap.subset.profile.xml
  specifies profile of character-map subset
- man.charmap.uri.xml
  specifies an alternate character map to use instead of the
  "standard" character map provided in the distribution
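As a rough mental model of how the first three switches interact,
the Python sketch below tags each map entry with its Unicode block
and filters on that tag. Everything here is hypothetical: the names
FULL_MAP and select_charmap, the use of block names as the selection
criterion, and the default block list are assumptions made for
illustration, and the actual profile syntax is described in the
documentation, not here.

```python
# Hypothetical model only: each entry carries the name of its
# Unicode block so that a "profile" can pick whole blocks.
FULL_MAP = [
    ("\u00a9", r"\(co", "Latin-1 Supplement"),
    ("\u2014", r"\(em", "General Punctuation"),
    ("\u2260", r"\(!=", "Mathematical Operators"),
]

# Assumed default profile for this sketch, not the shipped default.
DEFAULT_BLOCKS = ("Latin-1 Supplement", "General Punctuation")


def select_charmap(enabled=True, use_subset=True,
                   subset_blocks=DEFAULT_BLOCKS):
    """Return the mapping that would be applied for these settings."""
    if not enabled:
        # Character-map support switched off: nothing is replaced.
        return {}
    return {
        char: roff
        for char, roff, block in FULL_MAP
        # With the subset switch off, every entry is kept.
        if not use_subset or block in subset_blocks
    }
```

In this model, a user who needs math symbols either widens the
subset (for example, subset_blocks including "Mathematical
Operators") or turns the subset off to get the full map.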
For more details, see the current documentation at:
http://docbook.sf.net/snapshot/xsl/doc/manpages/charmap.html