In a sort of Chesterton’s fence¹ sense, I do not understand the existing code.
While I understand the first order motivation (dynamic string buffering inline),
I do not follow the implementation. It manages to hit the trifecta:
1. NULL pointer dereference. Failures of allocation functions were not
checked.
2. Memory leaks. Calls to `agstrdup` and `agstrdup_html` allocate an owned
string internally, yet this code was acting as if they stole the `Sbuf`
backing memory.
3. Not-obviously-correct and repeated computation of the end of the buffer to
append to. `addstr` seemed overly complex and inefficient.
Static initialization of an agxbuf was a bit complicated until 04a248348798a4a32fc9ac7e5d860a84ab5ee49f, though it was still possible.
Following that commit, it becomes a no-brainer.
The code after this change still leaks memory. Perhaps in future we should reset
(`agxbfree(&Sbuf); memset(&Sbuf, 0, sizeof(Sbuf));`) the buffer after `agstrdup`
or `agstrdup_html`, though there would be a performance cost to this.
Magnus Jacobsson [Thu, 21 Jul 2022 11:16:41 +0000 (13:16 +0200)]
tests: SvgAnalyzer: add internal re-creation of the SVG element hierarchy
This will be used in upcoming commits in this series to store
attributes on and to re-create the original SVG from. Also, an
upcoming commit series will extend the SVG analyzer to be aware of
Graphviz graphs, nodes and edges and will use this hierarchy and the
associated attributes to be able to answer questions about their
properties.
tests: svgAnalyzer: remove useless handling of 'any' SVG element
We haven't asked SVG++ to be called back for any element, just for
explicitly specified elements. Therefore we will never receive any
unknown elements that the SVG analyzer could count so this
functionality is also removed.
gvpr colorx: use a local agxbuf instead of GVPR state’s 'tmp'
This makes it clearer to both humans and the compiler that the data written to
this buffer is not required beyond this function. This also gets us closer to
removing an SFIO dependency.
This makes it clearer to both humans and the compiler that the data written to
this buffer has a much more constrained scope than was previously implied. This
also gets us closer to removing an SFIO dependency.
gvpr traverse: use a local agxbuf instead of GVPR state’s 'tmp'
This makes it clearer to both humans and the compiler that the data written to
this buffer is not required beyond this function. This also gets us closer to
removing an SFIO dependency.
`toUpper` worked by using the GVPR state’s temporary buffer, `tmp`, to
incrementally construct an uppercase copy of the input. This works but is
suboptimal. When printing to the temporary buffer, it has no knowledge of what
we are up to and needs to make heuristic-based decisions on how to allocate
memory. But we know the exact size of the result string already. This commit
takes advantage of this knowledge and allocates the destination memory upfront
and then simply writes directly into it. By doing this, we reduce heap pressure
and eliminate an intermediate copying operation.
This commit also renames the `s` parameter to something more descriptive.
`toLower` worked by using the GVPR state’s temporary buffer, `tmp`, to
incrementally construct a lowercase copy of the input. This works but is
suboptimal. When printing to the temporary buffer, it has no knowledge of what
we are up to and needs to make heuristic-based decisions on how to allocate
memory. But we know the exact size of the result string already. This commit
takes advantage of this knowledge and allocates the destination memory upfront
and then simply writes directly into it. By doing this, we reduce heap pressure
and eliminate an intermediate copying operation.
This commit also renames the `s` parameter to something more descriptive.
Roger Nesbitt [Wed, 23 Feb 2022 16:42:25 +0000 (05:42 +1300)]
Use "none" instead of "transparent" when painting in SVG
The SVG 1.1 spec, which SVGs generated by gvrender_core_svg.c are
labeled as, does not have the color "transparent" available which is
causing issues with SVG 1.1 parsers.
When specifying paint in the fill or stroke attributes, use "none"
instead of "transparent". When specifying a gradient color that has
zero opacity, use that color at zero opacity instead of "transparent".
When specifying a gradient with the graphviz transparent color, use
a zero-opacity black as this is how SVG interprets "transparent" in
gradients in SVG 2.0.
cgraph agxbdisown: exit on allocation failure instead of returning NULL
82b7d2021a7faa565065804df8b730a3213a8356 abbreviated some xdot code into a call
to `agxbdisown`. But it failed to account for the fact that `agxbdisown` could
return `NULL` on `strdup` failure, leaving still-allocated memory in the agxbuf.
Thus this commit introduced a new code path that leaked memory. I.e. the removed
call to `agxbfree` should have been retained.
So the obvious fix would be to re-add the `agxbfree` call, right? Well closer
inspection of agxbuf reveals another incongruity. agxbuf takes design decisions
based on the idea that the caller either cannot handle failure or does not want
to handle failure. It calls the alloc.h wrappers that exit on out-of-memory,
alleviating the caller from having to think about such things. Except for one
case: calling `strdup` in `agxbdisown`. Here it assumes the caller _does_ want
to handle failure. Except actually not, because it previously calls `agxbputc`
that conditionally calls `agxbmore` that exits on allocation failure. This is a
bit convoluted, but putting this all together we can see that the caller of
`agxbdisown` actually _cannot_ robustly recover from failure because, even if
they try to, `agxbmore` can still exit before they have a chance.
The way out of this quagmire is to adhere to the original principle more
strongly in `agxbdisown`. We assume, here too, that the caller does not want to
or cannot handle failure. `agxbdisown` now exits on all allocation failure
paths.
This has two coincidental beneficial side effects:
1. In one case, we can leverage `gv_strndup` to avoid writing a trailing
`'\0'` that is only for the purposes of immediately copying it somewhere
else.
2. Any caller of `agxbdisown` no longer needs to call `agxbfree` afterwards.
In an analogy to C++, `agxbdisown` is effectively a move operator and
destructor in one.
We do not usually reformat existing code unless it is otherwise being edited.
But in this case the file was so close to compliant already, it seems worth it.
replace NULL-hinted agxbuf usage with zero initialization
This leads to easy to read code and fewer things for the caller to think about.
I think this should probably become the default pattern for agxbuf usage for the
cases where the data you are printing is unpredictable or you are unsure.
cgraph: rename 'agxbuf.dyna' and flip its polarity
On the surface, this change looks a little odd – why care about what bit means
what in an internal field of a data structure? But this has an interesting side
effect. After this change, the zeroed state of an agxbuf is a valid 0-sized
dynamically allocated buffer, the same as the end state after `agxbdisown`. In
other words, C99 zero initialization (`agxbuf xb = {0}`) now defines a
reasonably “default” agxbuf state. This gives you all the benefits of the
previous “I don’t know how long my data will be so I’ll just use `BUFSIZ` as a
hint” approach with less typing and less hard coding. The next commit will roll
this out for some simplification in agxbuf usage.
cgraph agxbput_n: take an early exit when input string is zero
An upcoming change makes it possible to enter this function with `xb->ptr` as
`NULL`. It looks as if there is nothing wrong with this; a zero-sized string
fits in any agxbuf, including an unallocated one. However UBSan educates us that
calling `memcpy` with a `NULL` destination pointer is undefined behavior, even
when the passed size is 0. So this change avoids reaching `memcpy` when we know
the function will be a no-op.
Note that this is sort of fixing a latent bug, in that an agxbuf user could
already cause the scenario described above by `agxbdisown`ing an agxbuf and then
calling `agxbput_n` on it with size 0. However no existing callers do this.
cgraph: when an agxbuf to be expanded has 0 size, expand by 'BUFSIZ'
When calling `agxbmore` on a 0-sized buffer, the new size to expand to would be
chosen based on the length of the current string being appended. This string
often originated from user’s input graph and hence its length was often
user-defined. There is nothing strictly wrong with this, but it is a little
unpredictable and sometimes leads to pathological outcomes, e.g:
agxbputc(&xb, 'a'); // expands 0-sized buffer to 1
agxbputc(&xb, 'b'); // doubles 1-sized buffer to 2
agxbputc(&xb, 'c'); // doubles 2-sized buffer to 4
agxbputc(&xb, 'd');
agxbputc(&xb, 'e'); // doubles 4-sized buffer to 8
…
This is functionally correct, but that is a lot of `realloc` calls just to
accrue 5 bytes.¹ It is more convenient to enlarge our initial allocation to
something matching the default hint when an agxbuf is initialized.
The situation described above has not typically been a concern in the past
because a 0-sized agxbuf is hard to obtain to begin with. You need to either
call `agxbinit` while supplying your own backing storage that also has 0 size
(why would you do this?) or reuse an agxbuf after `agxbdisown` (technically
legal, but no good reason to do this). Neither of these scenarios occur anywhere
in the Graphviz tree to my knowledge. However an upcoming change will make
0-sized agxbufs more common.
¹ Some of these `realloc`s are likely no-ops, as the system allocator generally
does not give you exactly 1 byte when you ask for 1. It will probably give you
more and then when you ask to double to 2, it will tell you your existing
pointer is just fine to keep on using. But the agxbuf code does not know any
of these details and will still repeatedly call into `realloc`.
Lefty, and its dependent tools Dotty and Lneato, have been suspected unused for
a long time. They were scheduled for removal, but we took a cautious and very
slow approach to this given how pervasive Graphviz is and the limited visibility
we have into its usage:
1. 2021-09-25 make Lefty default to disabled in the build system¹
2. 2022-01-16 disable Lefty in published graphviz.org packages²
3. 2022-03-28 remove build system support³
This completes the final step of removing the code from the repository. It is
still retained in the history, should anyone need access to it or need to build
a past version of Graphviz with Lefty.
So far, we have only heard two notable reactions from the downstream world:
1. Lefty disabling was not quite complete on the first attempt.⁴
2. There has been an offer of assistance to maintain Lefty.⁵ No patches have
yet emerged, but if and when they do we could feasibly reverse direction on
this and reinstate Lefty and friends.
Gitlab: closes #219, closes #552, related to #1836, #1854
expr: remove unused parameters from 'Exdisc_t.matchf'
I guess when I committed 260f650085eb25b5b68c81a5960232d025fb6942 it was not
obvious to me that this is entirely part of private internal interfaces and the
callback signature involved is not exposed publicly.
expr: remove unused parameters from 'Exdisc_t.setf'
I guess when I committed 5b4de3922a2f169c2e773ffe85651898aedd9351 it was not
obvious to me that this is entirely part of private internal interfaces and the
callback signature involved is not exposed publicly.
expr: remove unused parameters from 'Exdisc_t.keyf'
I guess when I committed 2e9753d9c9886f5bc013212420528e9a18bc592b it was not
obvious to me that this is entirely part of private internal interfaces and the
callback signature involved is not exposed publicly.