Michael Smith [Tue, 28 Jun 2005 07:49:51 +0000 (07:49 +0000)]
Added support for generating <rx:meta-field creator="foo"/> field
for XEP. Also, added support for picking up and using contents of
<corpauthor> for XEP <rx:meta-field author="foo"/> field.
Michael Smith [Mon, 27 Jun 2005 10:56:02 +0000 (10:56 +0000)]
Implemented "character map" system for replacing Unicode
characters. (closes #1226009).
::PROBLEM:
The existing manpages mechanism for replacing Unicode symbols and
special characters with roff equivalents is not scalable and not
anywhere near as complete as it should be.
For example, the mechanism currently only handles a (somewhat
arbitrary) selection of fewer than 20 or so Unicode characters.
But there are potentially more than _800_ Unicode special
characters that have some groff equivalent they can be mapped to.
And there are about 34 symbols in the Latin-1 (ISO-8859-1) block
alone. Users might reasonably expect that if they include any of
those Latin-1 characters in their DocBook source documents, they
will get correctly converted to known roff equivalents in output.
In addition to those common symbols, certain users may have a need
to use symbols from other Unicode blocks.
Say, somebody who is documenting an application related to math
might need to use a bunch of symbols from the "Mathematical
Operators" Unicode block (there are about 65 characters in that
block that have reasonable roff equivalents).
Or somebody else might really like Dingbats -- such as the
checkmark character (I like that one myself) and so might use a
bunch of things from the "Dingbat" block (141 characters in that
block that have roff equivalents or that can at least be "degraded"
somewhat gracefully into roff).
So we need a mechanism that is capable of handling all those 800
Unicode characters that have roff equivalents -- and/or of
allowing users to choose which Unicode blocks to use (through
tuning the value of a parameter or something).
::FIX:
Replaced the current Unicode character-substitution mechanism
(replace-entities template) with a completely different
character-substitution mechanism that is based on use of a
"character map" (in a format compliant with the XSLT 2.0 spec and
therefore completely "forward compatible" with XSLT 2.0).
By default, the new "character map" mechanism does replacement of
all Latin-1 symbols, along with most special spaces, dashes, and
quotes (about 75 characters by default, compared to the less than
20 special characters that were handled previously). And the
"full" character map provides support for converting about 800
characters.
The mechanism is controlled through the following parameters:
- man.charmap.enabled
turns character-map support on/off
- man.charmap.use.subset
specifies that a subset of the character map is used instead
of the full character map
- man.charmap.subset.profile
specifies profile of character-map subset
- man.charmap.uri
specifies an alternate character map to use instead of the
"standard" character map provided in the distribution
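As a rough illustration of how such a map and its subset selection
behave, here is a Python sketch (not the XSLT the stylesheets use).
The three sample mappings, the "block" labels, and the function name
apply_character_map are assumptions made up for this example.

```python
# Illustrative sketch only: a tiny stand-in for a character map.
# Each entry maps one Unicode character to a roff string and carries
# a hypothetical "block" label that is used to select a subset.
CHARACTER_MAP = [
    {"character": "\u00a9", "string": r"\(co", "block": "latin1"},       # copyright sign
    {"character": "\u2013", "string": r"\(en", "block": "punctuation"},  # en dash
    {"character": "\u2122", "string": r"\(tm", "block": "letterlike"},   # trademark sign
]


def apply_character_map(text, charmap, enabled=True, blocks=None):
    """Replace mapped characters in text.

    enabled=False mimics switching character-map support off;
    blocks=None means the full map, otherwise only the named blocks.
    """
    if not enabled:
        return text
    table = {
        ord(entry["character"]): entry["string"]
        for entry in charmap
        if blocks is None or entry["block"] in blocks
    }
    return text.translate(table)


sample = "\u00a9 2005 \u2013 Foo\u2122"
print(apply_character_map(sample, CHARACTER_MAP))                     # full map
print(apply_character_map(sample, CHARACTER_MAP, blocks={"latin1"}))  # subset only
```

With the subset restricted to one block, characters from the other
blocks pass through unchanged, which is the trade-off a subset
profile makes.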
Michael Smith [Mon, 27 Jun 2005 08:01:58 +0000 (08:01 +0000)]
Added "man.string.subst.map" parameter for controlling roff string
substitution performed just before applying character map. The
value of this parameter is not really intended to be monkeyed
with, but adding it as a param just in case.
Michael Smith [Mon, 27 Jun 2005 00:50:16 +0000 (00:50 +0000)]
Added an "apply-string-subst-map" function (template). Only
difference is that in the map that it expects, "oldstring" and
"newstring" attributes are used instead of "character" and
"string" attributes.
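A minimal Python sketch of what a string-substitution map of this
shape does, assuming the substitutions are applied one after another
in map order. The two sample entries are hypothetical and are not
taken from the actual parameter value.

```python
def apply_string_subst_map(content, subst_map):
    """Apply each oldstring -> newstring substitution in map order."""
    for entry in subst_map:
        content = content.replace(entry["oldstring"], entry["newstring"])
    return content


# Hypothetical map entries, for illustration only. Order matters:
# backslashes are doubled first, so backslashes introduced by later
# entries are not doubled again.
SUBST_MAP = [
    {"oldstring": "\\", "newstring": "\\\\"},
    {"oldstring": "...", "newstring": r"\&..."},
]

print(apply_string_subst_map("path\\name and so on...", SUBST_MAP))
```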
Michael Smith [Sun, 26 Jun 2005 07:10:38 +0000 (07:10 +0000)]
Checkpointing before coding and committing final character-map changes.
This change fully implements character-map support. I'll write up
a longer description of that in a later commit. But the brief
description is: The old Unicode character replacement mechanism
(replace-entities template) has been removed; a completely
different character-replacement mechanism is now used instead.
By default, it does replacement of all Latin-1 symbols, along with
most special spaces, dashes, and quotes (about 75 characters by
default, compared to the less than 20 special characters that were
handled previously). And the "full" character map provides support
for converting about 800 characters. The mechanism uses a
"character map" (in a format compliant with the XSLT 2.0 spec and
therefore completely "forward compatible" with XSLT 2.0).
Other changes made for this commit:
- Changed default output encoding to UTF-8.
THIS DOES NOT MEAN THAT MAN PAGES ARE OUTPUT IN RAW UTF-8,
because the character-map is applied before final output,
causing all UTF-8 characters covered in the map to be
converted to roff equivalents.
- Removed code for adding backslashes before periods/dots and
before hyphens (-); here's why:
* Backslashes in front of periods/dots are needed only in the
very rare case where a period is the very first character in
a line, without any space in front of it. A better way to
deal with that rare case is for authors to add a zero-width
space in front of the offending dot(s) in their source.
* Backslashes in front of hyphens (&#45;/&#x2D;) are needed... when?
Myself, I don't know, so the current stylesheet does not add
backslashes in front of them, ever. If there is a specific
case where they are necessary or desirable, then we need to
add code for that case, not just do a blanket conversion.
And, anyway, my understanding from reading the groff docs is
that \- is, specifically, a _minus sign_. So if you have a
place where you want a minus sign to be output instead of
a hyphen (-), then you should use (&#8722;/&#x2212;) in your
source instead. And if you have a place where you want an
en dash, (&#8211;/&#x2013;). Or if there are places where
the stylesheets are internally generating a hyphen (-) where
they should be generating a minus sign or an en dash, then we
need to fix those, not just do blanket conversion.
- Consolidated all bold and italic formatting so that it is done
by applying the mode="bold" and mode="italic" templates.
- Consolidated handling of all instances where we want to
prevent line breaking; they are all now processed using the
prevent.line.breaking template.
- Removed "quote" template. In output, this was causing anything
marked up with the <quote> element to be preceded by two
backticks and followed by two apostrophes -- that is, that
old-school hack for generating "curly" quotes in Emacs and in
X-Windows fonts. While Emacs still seems to support that,
I don't think X-Windows has for a long time now. And, anyway,
it looks (and has always looked) like complete crap when
viewed on a normal tty/console.
Michael Smith [Fri, 24 Jun 2005 07:20:40 +0000 (07:20 +0000)]
Added initial, EXPERIMENTAL support for generating content for
HTML "title" attributes from content of the Alt element.
This change adds support for generating HTML "title" attributes
for the following inline elements only (all inlines -- support for
block elements will need to wait for later).
abbrev accel acronym action application authorinitials beginpage
citation citerefentry citetitle city classname code command
comment computeroutput constant country database email envar
errorcode errorname errortext errortype exceptionname fax
filename firstname firstterm foreignphrase function glossterm
guibutton guiicon guilabel guimenu guimenuitem guisubmenu
hardware honorific interface interfacedefinition interfacename
keycap keycode keysym lineage lineannotation literal markup
medialabel methodname mousebutton option optional otheraddr
othername package parameter personname phone pob postcode
productname productnumber prompt property quote refentrytitle
remark replaceable returnvalue sgmltag shortcut state street
structfield structname subscript superscript surname symbol
systemitem tag termdef token trademark type uri userinput
varname wordasword
Implemented by creating a new named template, generate.html.title.
That template is currently called in only eleven places, all in
the inline.xsl file. But it's called by all the inline.* templates
(e.g., inline.boldseq), which in turn are called by other
(element) templates, so it results, currently, in supporting
generation of the HTML "title" attribute for a total of about 93
elements (the list above).
Michael Smith [Fri, 24 Jun 2005 03:08:34 +0000 (03:08 +0000)]
Added support for chunking revhistory into separate file (similar
to the support for doing same with legalnotice). Patch from Thomas
Schraitle (closes #1096574). Controlled through new
generate.revhistory.link parameter.
Michael Smith [Thu, 23 Jun 2005 06:01:19 +0000 (06:01 +0000)]
Changed default for man.output.encoding to UTF-8, because
character-map processing depends on it. Also added note saying
that changing the value to another encoding may break character-map
processing, so set man.charmap.enabled to 0 in that case.
Michael Smith [Thu, 23 Jun 2005 03:28:56 +0000 (03:28 +0000)]
Added several parameters for manpages output.
I have chosen to use a man.* naming strategy for all man-only
params (which these all are), even though in some cases the params
are basically the same as existing params (man.alignment and
man.hyphenate are basically same as the corresponding FO params,
man.output.encoding is same thing as chunker.output.encoding,
man.output.quietly is same as chunk.quietly).
One reason I went with creating new params instead of reusing
existing ones is that this allows default values to be used for
manpages output that are different than the ones for whatever
other output format. For example, I set default for man.alignment
to "left" and man.hyphenate to "false". The corresponding FO
params have defaults of "justify" and "true".
And, as far as the man.* naming strategy, yeah, I guess it is
redundant. If, for example, all the HTML-only params started with
html.*, I guess it might not be such a great idea. Or maybe it would
be. Anyway, I don't think there will probably ever be more than 30
man.* params total (god help us). And for developers at least, I
think it is helpful to be able to sort params by prefix if they
want to. Because, well, we now have 434 separate files in the
params/ directory. Going through those to find, say, just the
ones that relate to manpages output is a whole lot easier if they
all start with the same prefix.
Michael Smith [Sat, 18 Jun 2005 08:54:41 +0000 (08:54 +0000)]
Added initial versions of replace-chars-with-strings() and
apply-character-map() functions. These are intended mainly for use
in the manpages stylesheets but may be useful elsewhere too.
I need to fix the logic in the manpages stylesheet so that the
character-map file is read only once per document. The way it is
now, the character map is read each time a refentry is found,
which is a big waste.
Michael Smith [Fri, 17 Jun 2005 12:50:02 +0000 (12:50 +0000)]
Incorporated slides and website stylesheets into the build.
Note: This currently only affects the "distrib" (doc) build. So if
you don't need to build distrib/doc, you won't be affected by this
change. If you DO need to build distrib/doc, it will break unless
you use the xsl/Makefile from the "build" branch instead of from
the head.
This change alters the distrib build such that:
- an xsl/slides directory is created by copying over the
contents of the slides/xsl directory
- an xsl/website directory is created by copying over the
contents of the website/xsl directory
- the reference.html part of the doc build now adds the slides
and website param reference doc
This is an experiment. If we decide to go ahead with it in the
release build, and everything is found to be OK when it gets out
to users and they test it, then the next step would be to ask SF
admin to move the website/xsl and slides/xsl CVS directories into
xsl/ to create xsl/slides and xsl/website, and they would be
maintained in the xsl/ CVS going forward.
Michael Smith [Fri, 17 Jun 2005 03:46:44 +0000 (03:46 +0000)]
Refined doc build.
- Changed makefiles in docsrc/fo, docsrc/html, and
docsrc/manpages dirs to depend on corresponding param.xsl
instead of on param.xweb. Rationale is that param.xsl gets
rebuilt any time param.xweb changes. In addition, param.xsl
gets rebuilt any time any included params/*.xml file is
changed. So making docsrc/* builds depend on param.xsl
effectively makes them depend on both the param.xweb changes
and on the actual param changes.
- Changed doc/Makefile so that reference.html is rebuilt only
when docsrc/reference.xml changes, not when any of its
included files change. Rationale is that, because we chunk
output for the doc build, reference.html is simply a sort of
TOC page that doesn't need to get remade if the included files
change: output of those included files goes to separate fo,
html, and manpages subdirs, and that output gets generated by
separate make targets.
Michael Smith [Tue, 14 Jun 2005 09:17:23 +0000 (09:17 +0000)]
More charmap reorganization.
- Removed unicodetrans.xsl file (function moved to lib/lib.xsl).
- Removed charmap.groff.xml & charmap.roff.min.xml and created a
single charmap.groff.xsl file that incorporates both (using a
class="default" attribute/value to mark those mappings that
are in the default/minimal set).
- Made charmap.groff.xsl into a "real" (valid) XSLT 2.0 character
map so it can be used as-is for XSLT 2.0-aware processing (e.g.,
it can be imported or included into another XSLT 2.0 stylesheet).
Michael Smith [Mon, 13 Jun 2005 07:25:40 +0000 (07:25 +0000)]
Trademark symbol handling made consistent with handling of same in
HTML stylesheets. (closes #1218286; thanks to Mauritz Jeanson for
reporting the problem)
Prior to this change, if you processed a doc that contained no
value for the Class attribute on the Trademark element, the HTML
stylesheets would default to rendering a superscript TM symbol
after the Trademark contents, but the FO stylesheets would render
nothing. This change alters the FO handling of Trademark such that
it is now identical to the HTML handling.
Michael Smith [Fri, 10 Jun 2005 05:51:37 +0000 (05:51 +0000)]
Reverted some recent build changes.
Reverted build of xref.xsl. Will no longer need it after Unicode
char handling change is made.
Reverted build of single-pass profiling stylesheet (for now). It
doesn't appear to work with manpages, and figuring out if and how I
can get it to work is a very low priority, especially given that
single-pass profiling doesn't work with documents that contain
xref instances. If you want to profile content before converting
to man-page output, please just do a separate profiling pass first.
Michael Smith [Fri, 10 Jun 2005 05:45:35 +0000 (05:45 +0000)]
Made further changes for Unicode character translation.
Renamed roff.charmap.xml to charmap.groff.xml.
Added charmap.roff.min.xml (minimal subset of around 40 "safe"
mappings appropriate for nroff as opposed to groff).
Removed $charmap.file param from unicodetrans.xsl in preparation
for adding it as a real param to param.xweb
Removed the used-in-manpages-stylesheet-only replace-string()
function and replaced all instances where it had been called with
calls to the same string-substitution function used by the HTML
and FO stylesheets: string.subst() from ../lib/lib.xsl
Michael Smith [Wed, 8 Jun 2005 09:53:22 +0000 (09:53 +0000)]
Reworked *info gathering and rethought Refclass handling.
For each Refentry found, we now cache its *info and its parent's
*info as node-sets; we then do all further matches against those
node-sets (rather than re-selecting the original *info nodes each
time we need to check them).
Also, reverted the special handling of Refclass that was added
recently. We eventually need to make Refclass handling consistent
with that of the HTML and FO stylesheets.
Michael Smith [Tue, 7 Jun 2005 10:51:15 +0000 (10:51 +0000)]
Don't render NAME heading for secondary Refnamedivs (closes #1216292)
If a document has multiple Refnamedivs, a NAME heading was getting
rendered for each. But we only need one NAME heading. This change
causes it to be rendered just once. This makes behavior in this
respect consistent with how the HTML and FO stylesheets handle the
generated NAME heading for Refnamediv.
Michael Smith [Tue, 7 Jun 2005 07:38:07 +0000 (07:38 +0000)]
Original changelog from the docbook/contrib/xsl/db2man days.
Adding so that we at least have access to a record of the change
descriptions here (if not the whole CVS history).
Michael Smith [Tue, 7 Jun 2005 06:21:59 +0000 (06:21 +0000)]
Removed unnecessary trailing comma after final term/glossterm
(closes #1215890; thanks to Sam Steingold for reporting the
problem).
::PROBLEM::
If a varlistentry or glossentry contains multiple term or
glossterm elements, a comma is rendered after the final term or
glossterm. A comma should instead be rendered only after every
term or glossterm _except_ the last.
::FIX::
Reworked template logic for term/glossterm. They are now handled
with an xsl:for-each in the varlistentry/glossentry template,
rather than as separate templates.
HTML and FO stylesheets appear to have the same problem, so we
probably need to port this change to those as well.
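The corrected separator rule can be sketched in Python as follows.
This illustrates only the logic (emit a comma after every term except
the last); it is not the xsl:for-each code from the stylesheet, and
the function name render_terms is made up for the example.

```python
def render_terms(terms):
    """Join terms with ", " after every term except the last one."""
    parts = []
    for position, term in enumerate(terms, start=1):
        parts.append(term)
        # Mirrors a "not the last item" check inside a loop over terms.
        if position != len(terms):
            parts.append(", ")
    return "".join(parts)


print(render_terms(["-v", "--verbose"]))
print(render_terms(["--help"]))
```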
Michael Smith [Mon, 6 Jun 2005 09:39:41 +0000 (09:39 +0000)]
Uppercase titles in x-ref to Refentry children (closes #1215547;
thanks to Jens Granseuer for reporting the problem).
::PROBLEM::
Titles of all first-level sections in man pages are always
rendered in uppercase. But cross-references to those titles are
not uppercase.
::FIX::
Cross-references to titles of all first-level sections of Refentry
output are now rendered in uppercase; that is, titles in x-refs to
Refnamediv, Refsynopsisdiv, Refsect1, and any Refsection that is a
direct child of Refentry.
Also, x-ref to Refnamediv now uses the localized "NAME" title
instead of using the first Refname child. This makes the
output inconsistent with HTML and FO output, but for man-page
output, it seems to make better sense to have the "NAME". (It may
actually make better sense to do it that way in HTML and FO output
as well.) That said, I guess it's not likely that most people
would put in an x-ref to a Refnamediv section, so maybe it's kind
of a moot point...
Bob Stayton [Fri, 3 Jun 2005 17:22:31 +0000 (17:22 +0000)]
Fixed bug [ 1212159 ] Missing navigational links with XHTML chunking
which was caused by the following:
1. When chunk.hierarchy is created, it is a collection of
div elements in a variable, and then that is converted to a
node-set by EXSLT.
2. In HTML, there is no namespace, so the div elements are in
no namespace and they work.
3. In XHTML, the div elements are in the xhtml namespace because
it is the default namespace. So they cannot be addressed as
just "div", they must have a namespace prefix.
4. I added an explicit chunkfast namespace to avoid conflict
with the default namespace.
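The namespace effect described in points 2 and 3 can be reproduced
outside XSLT. The following Python sketch uses the standard
xml.etree.ElementTree module purely as an illustration; the element
names and the "x" prefix are chosen for the example and are not taken
from the chunking code.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Same structure twice: once in no namespace, once with XHTML as the
# default namespace (as happens in the XHTML stylesheets).
plain = ET.fromstring("<root><div/><div/></root>")
xhtml = ET.fromstring(f'<root xmlns="{XHTML_NS}"><div/><div/></root>')

unqualified_plain = plain.findall("div")                   # finds both
unqualified_xhtml = xhtml.findall("div")                   # finds nothing
qualified_xhtml = xhtml.findall("x:div", {"x": XHTML_NS})  # finds both

print(len(unqualified_plain), len(unqualified_xhtml), len(qualified_xhtml))
# prints: 2 0 2
```

An unprefixed name in a path matches only elements in no namespace,
which is why the same lookup works for HTML and silently finds
nothing for XHTML.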
Michael Smith [Thu, 2 Jun 2005 06:58:00 +0000 (06:58 +0000)]
Added support for processing funcparams (closes #1213166; thanks
to Barry Rountree for reporting).
::PROBLEM::
The funcparams element was not being processed as expected.
::CAUSE::
No logic existed in manpages stylesheets for handling funcparams.
::FIX::
Fixed by taking old code for handling of funcprototype and
children, and replacing it with code ported over from HTML
templates for ANSI-style output.
::AFFECTS::
This change affects handling of all funcprototype output. Along
with adding support for funcparams, the following changes were
also made:
- removed the space that was being output between funcdef and
paramdef; example:
was: float rand (void);
now: float rand(void);
- turned off bold formatting for the <type> element when it
occurs within a funcdef or paramdef
- moved space -> nobreak-space replacement logic into a separate
template (for potential re-use elsewhere if we need it)
::TODO::
We need to add an option for K&R style funcprototypes.
See #1213277.
Michael Smith [Wed, 1 Jun 2005 17:10:03 +0000 (17:10 +0000)]
Applied patch from David Green for #1211477, to prevent
StackOverflowError encountered when processing tables with ~700
rows with Xalan. Smoke-tested and didn't see any obvious problems
with the fix, so going ahead and committing it so others can test
with snapshot.
Michael Smith [Wed, 1 Jun 2005 11:41:48 +0000 (11:41 +0000)]
Align refnamediv title correctly when refentry.generate.title
is non-zero (closes #1212641).
::Problem:
When refentry.generate.title is non-zero, the title output for
Refnamediv is not aligned flush left.
::Cause:
No code for setting start-indent="" was included in
refentry.title.properties. It should be in order to make the
Refnamediv title output be flush left, as are titles for all other
sectioning children of Refentry
::Fix:
Added code for setting start-indent="" in
refentry.title.properties.
Michael Smith [Wed, 1 Jun 2005 10:45:30 +0000 (10:45 +0000)]
Generate XEP bookmarks for Refentry children. (closes #1212491)
Titled child sections of Refentry (Refsynopsisdiv, Refsection, and
Refsect1 to Refsect3) were not included in the match statement
used for generating XEP bookmarks. Don't know whether that was
intentional for some reason or whether it was just an oversight.
But given that both AXF and Passivetex bookmarks are generated for
those same Refentry children, it seems like XEP ones ought to also
be generated, for the sake of consistency if for no other reason.
Michael Smith [Wed, 1 Jun 2005 10:25:17 +0000 (10:25 +0000)]
Corrected formatting of generated "Name" title in refentry output
(closes #1212396; thanks to Andreas Lalloo for reporting the
problem).
::Problem:
The "Name" title generated for FO output of Refnamediv in Refentry
is not aligned flush left, as all the other subheadings of
Refentry are, and as the generated Name subheading for Refentry is
in HTML output. Also, the Name title is in a larger font size than
the titles of the other first-level children of Refentry.
::Fix:
The "Name" title generated for FO output of Refnamediv in Refentry
is now handled using the same formatting as that used for all
other first-level children of Refentry.
::Affects:
Along with affecting processing for generated titles for
Refnamediv in FO output, it is possible that this change may have
unanticipated side effects on processing of titles for
Refsynopsisdiv, Refsection, and Refsect1 to Refsect3. The reason
is that part of this change takes the template contents formerly
used only for processing Refsynopsisdiv, Refsection, and Refsect1
to Refsect3, and "repurposes" those template contents for use in
processing the generated title for Refnamediv.
Michael Smith [Mon, 30 May 2005 10:59:42 +0000 (10:59 +0000)]
Re-worked construction of .TH title line (closes #1210488).
Also, made comment generated at top of page include version info
(closes #1211254).
Here are the details about the refinements made to the
construction of the .TH title line:
- "extra1" (which shows up in the center footer of each page):
If a date cannot be found in the source, we now automatically
generate a localized "long format" date
- "extra2" (which shows up in the left footer):
We now first search for "product version" info; then, if we
can't find that, a "product name"; if we can't find that, we
look for "other" info to use. And if we can't find that, we leave
it empty. The exact sequence of elements checked is this:
1. productnumber in info or refentryinfo
2. productnumber in info or referenceinfo of parent reference
3. any refmeta/refmiscinfo that has class = 'version'
4. productname in info or refentryinfo
5. productname in info or referenceinfo of parent reference
6. refmeta/refmiscinfo (first one)
7. refnamediv/refclass (first one)
- "extra3" (which shows up in the center header):
The exact sequence of elements checked is now this:
1. title in info or referenceinfo of parent reference
2. refnamediv/refclass (first one)
3. refmeta/refmiscinfo (first one)
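The "extra2" lookup is a first-match-wins fallback chain. A minimal
Python sketch of that idea follows; the dictionary keys are
hypothetical labels for the seven sources listed above, and a real
implementation would select the corresponding nodes from the source
document instead.

```python
# Hypothetical labels for the seven "extra2" sources, in priority order.
EXTRA2_ORDER = [
    "productnumber_in_refentryinfo",
    "productnumber_in_parent_reference",
    "refmiscinfo_class_version",
    "productname_in_refentryinfo",
    "productname_in_parent_reference",
    "first_refmiscinfo",
    "first_refclass",
]


def first_available(candidates):
    """Return the first non-blank value, stripped, or "" if none exists."""
    for value in candidates:
        if value and value.strip():
            return value.strip()
    return ""


def extra2(found):
    """Pick the left-footer text from whatever was found in the source."""
    return first_available(found.get(key) for key in EXTRA2_ORDER)


print(extra2({"productname_in_parent_reference": "Foo", "first_refclass": "bar"}))
```

If nothing in the chain yields a value, the result is an empty
string, matching the "we leave it empty" case.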
Michael Smith [Sun, 29 May 2005 09:00:28 +0000 (09:00 +0000)]
Standalone stylesheet for stripping namespaces from DocBook 5/NG
docs. You currently need to do two-pass processing to use this:
First, transform your DocBook 5/NG source doc using this, then
transform the result as usual with the manpages/docbook.xsl
stylesheet. Of course you can always run the process using a pipe
if you want. Example:
It may be that there is actually some way to set it up as a
single-pass XSLT process, as is done with the HTML stylesheets.
But I've not yet figured out how to get that to work...
Jirka Kosek [Sat, 28 May 2005 11:54:15 +0000 (11:54 +0000)]
The previous change (adding text-align="left") caused article titles to be displayed left-aligned instead of centered. This was a backward-incompatible change in presentation. The attribute set is now conditional and outputs text-align="center" for titles of standalone articles.
A more general fix should probably be made in the future -- either creating a separate article.title.properties, or refactoring the FO property settings between titlepage templates and attribute sets.
Michael Smith [Sat, 28 May 2005 02:55:59 +0000 (02:55 +0000)]
Portability tweaks for the build.
- pull in cvstools/Makefile.incl, mainly so that we can use
cvstools/runtrang
- "trang" -> $(RUNTRANG) so that cvstools/runtrang is used; if
users don't have trang binary installed, that will find
trang.jar and run it. Also allows users to manually specify
what trang they want (e.g., "make RUNTRANG=trang")
- "clean" target now also removes dbforms* files
- "clean" target now also does "make -C build clean"
Michael Smith [Thu, 26 May 2005 23:29:25 +0000 (23:29 +0000)]
Make language codes RFC compliant (closes #1208931; thanks to
Bernd Groh for reporting).
::PROBLEM:
Stylesheets output two-part language codes in the form "zh_CN".
But underscores in language codes are actually neither RFC
compliant nor compliant with the HTML 4.0 rec. The separator
should be a hyphen. To quote the specs:
Section 8.1.1, "Language Codes"[1], in the HTML 4.0 Rec.
states that:
[RFC1766] defines and explains the language codes that MUST be
used in HTML documents.
Briefly, language codes consist of a primary code and a
possibly empty series of subcodes:
language-code = primary-code ( "-" subcode )*
And in RFC 1766, "Tags for the Identification of
Languages"[2], the EBNF for "language tag" is given as:
::CAUSE:
Stylesheets simply pass through language codes unaltered. So if
users put "zh_CN" in their source, they will get "zh_CN" in
their HTML output.
::FIX:
Added a new boolean config parameter, "l10n.lang.value.rfc.compliant",
set to 1 by default. If it is non-zero, any underscore in a
language code will be converted to a hyphen in HTML output. If
it is zero, the language code will be left as-is.
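The conversion rule is simple enough to sketch in a few lines of
Python. This shows only the described behavior, not the stylesheet
code; the function name lang_value is made up, and its rfc_compliant
argument stands in for the l10n.lang.value.rfc.compliant parameter.

```python
def lang_value(code, rfc_compliant=1):
    """Return the language code as it would appear in HTML output.

    A non-zero rfc_compliant converts every underscore to a hyphen;
    zero leaves the code exactly as written in the source.
    """
    if rfc_compliant != 0:
        return code.replace("_", "-")
    return code


print(lang_value("zh_CN"))     # zh-CN
print(lang_value("zh_CN", 0))  # zh_CN
```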
::AFFECTS:
This change affects any HTML output that contains two-part
language codes.