Checkpointing before coding and committing final character-map changes.
This change fully implements character-map support. I'll write up
a longer description of that in a later commit. But the brief
description is: The old Unicode character replacement mechanism
(replace-entities template) has been removed; a completely
different character-replacement mechanism is now used instead.
By default, it replaces all Latin-1 symbols, along with most
special spaces, dashes, and quotes (about 75 characters, compared
to the fewer than 20 special characters that were handled
previously). And the "full" character map provides support for
converting about 800 characters. The mechanism uses a "character
map" in a format compliant with the XSLT 2.0 spec and therefore
completely "forward compatible" with XSLT 2.0.
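
For illustration only, here is a minimal Python sketch of the idea
behind a character map: a lookup table from single characters to
replacement strings, applied to text just before output. The table
name, the function name, and the handful of entries are assumptions
made for this sketch; the real map shipped with the stylesheets is
an XSLT character map, is much larger, and may use different
replacement strings.

```python
# Hypothetical excerpt of a character map: Unicode character -> roff string.
# These entries are illustrative; the actual stylesheet map may differ.
CHARACTER_MAP = {
    "\u00a0": "\\ ",    # no-break space -> roff unbreakable space
    "\u00a9": "\\(co",  # copyright sign
    "\u2013": "\\(en",  # en dash
    "\u2014": "\\(em",  # em dash
    "\u201c": "\\(lq",  # left double quotation mark
    "\u201d": "\\(rq",  # right double quotation mark
    "\u2212": "\\-",    # minus sign
}


def apply_character_map(text, character_map=CHARACTER_MAP):
    """Replace each mapped character; pass unmapped characters through."""
    return "".join(character_map.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # The en dash is replaced; the plain hyphen is left untouched.
    print(apply_character_map("pages 3\u20135 of the up-to-date guide"))
```

Characters that are not in the table, including the plain hyphen,
pass through unchanged, which mirrors the "no blanket conversion"
approach described in this commit.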
Other changes made for this commit:
- Changed default output encoding to UTF-8.
THIS DOES NOT MEAN THAT MAN PAGES ARE OUTPUT IN RAW UTF-8,
because the character-map is applied before final output,
causing all UTF-8 characters covered in the map to be
converted to roff equivalents.
- Removed code for adding backslashes before periods/dots and
before hyphens (-); here's why:
* Backslashes in front of periods/dots are needed only in the
very rare case where a period is the very first character in
a line, without any space in front of it. A better way to
deal with that rare case is for authors to add a zero-width
space in front of the offending dot(s) in their source.
* Backslashes in front of hyphens (-) are needed... when?
Myself, I don't know, so the current stylesheet does not add
backslashes in front of them, ever. If there is a specific
case where they are necessary or desirable, then we need to
add code for that case, not just do a blanket conversion.
And, anyway, my understanding from reading the groff docs is
that \- is, specifically, a _minus sign_. So if you have a
place where you want a minus sign to be output instead of a
hyphen (-), then you should use a real minus sign (U+2212,
&#8722; or &#x2212;) in your source instead. And if you have a
place where you want an en dash, use U+2013 (&#8211; or
&#x2013;). Or if there are places where the stylesheets are
internally generating a hyphen (-) where they should be
generating &#8722; or &#8211;, then we need to fix those, not
just do blanket conversion.
- Consolidated all bold and italic formatting so that it is done
by applying the mode="bold" and mode="italic" templates.
- Consolidated handling of all instances where we want to
prevent line breaking; they are all now processed using the
prevent.line.breaking template.
- Removed "quote" template. In output, this was causing anything
marked up with the <quote> element to be preceded by two
backticks and followed by two apostrophes -- that is, that
old-school hack for generating "curly" quotes in Emacs and in
X-Windows fonts. While Emacs still seems to support that,
I don't think X-Windows has for a long time now. And, anyway,
it looks (and has always looked) like complete crap when
viewed on a normal tty/console.
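
As a rough illustration of the leading-dot case described in the
list above, the following Python sketch finds lines that begin with
a period and prefixes them with a zero-width space (U+200B), which
is what authors are asked to do by hand in their source. Both
function names are hypothetical, and nothing like this is part of
the stylesheets; it is only a sketch of the check an author might
run.

```python
ZERO_WIDTH_SPACE = "\u200b"


def find_leading_dot_lines(text):
    """Return the 1-based numbers of lines whose first character is a period."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(".")
    ]


def guard_leading_dots(text):
    """Prefix a zero-width space to every line that starts with a period."""
    return "\n".join(
        ZERO_WIDTH_SPACE + line if line.startswith(".") else line
        for line in text.splitlines()
    )


if __name__ == "__main__":
    sample = "The file\n.profile is read at login.\n Indented .dot is fine."
    print(find_leading_dot_lines(sample))
```

Only a period in the very first column is flagged; a period preceded
by a space, or appearing later in the line, is left alone.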