Dr.Smile [Sun, 19 May 2019 22:01:34 +0000 (01:01 +0300)]
renderer: improve handling of subpixel shift
The integral pixel shift is now extracted in the quantization function,
taking into account the full glyph transformation and not only its
translation part. This makes the program logic more straightforward and
ensures that the subpixel shift in the cache key never exceeds a full pixel.
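As a rough illustration of the split described above (not the renderer's actual code), a 26.6 fixed-point offset can be divided into a whole-pixel part and a subpixel remainder that always stays in [0, 64). The names `Shift` and `split_shift` are hypothetical, and the 26.6 format is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not taken from libass. */
typedef struct {
    int32_t pixel;     /* whole pixels, applied to the bitmap position */
    int32_t subpixel;  /* remainder in 1/64 pixel units, kept in the key */
} Shift;

/* Split a 26.6 fixed-point offset using floor semantics. */
static Shift split_shift(int32_t pos)
{
    Shift s = { pos / 64, pos % 64 };
    if (s.subpixel < 0) {   /* C division truncates toward zero */
        s.subpixel += 64;
        s.pixel--;
    }
    assert(s.subpixel >= 0 && s.subpixel < 64);
    return s;
}
```

With floor semantics a negative offset such as -1 becomes pixel -1 plus subpixel 63, so the remainder never reaches a full pixel in either direction.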
Dr.Smile [Sun, 19 May 2019 21:48:26 +0000 (00:48 +0300)]
Consolidate and quantize all transformations
This commit defers all outline transformations until the rasterization
stage. The combined transformation is then quantized and used as the
bitmap key. That should improve performance of slow animations.
Also, caching of initial and stroked outlines and bitmaps is now separate
in preparation for proper error estimation for the stroker stage.
Note that Z-clipping for perspective transformations is now done
differently compared to VSFilter. That clipping is mostly a safety feature
to protect against overflows and divisions by zero and is almost never
triggered in real-world subtitles.
Dr.Smile [Sun, 19 May 2019 17:24:29 +0000 (20:24 +0300)]
cache: construct cache values only from corresponding keys
This commit forces construction of cache values using only the data
available in their companion keys. That ensures logical correctness:
keys are guaranteed to have all the necessary data, which prevents
accidental collisions.
Most of the cache logic fixes address a minor problem where rendering
is done with a double parameter but the cache key stores only its
approximate fixed-point representation. The only serious problem
is the missing scale of clip drawings. This commit also removes unused
scale parameters from the glyph metrics cache key.
Due to the missing scale, clip shapes that differed only in scale were
treated by the cache system as identical. That could lead to incorrect
reuse of a cached bitmap of a different scale instead of the correct one.
The only hack left is in the glyph metrics cache with its
unicode >= VERTICAL_LOWER_BOUND check.
Dr.Smile [Sun, 21 Jan 2018 17:30:35 +0000 (20:30 +0300)]
render: simplify detection of hard overrides
Previously, each \r triggered a full rescan of the event string.
After this commit, such scanning is done once in init_render_context().
Additionally, some lines have been moved around to correctly account for
state.evt_type (calculated in apply_transition_effects) and
state.explicit (used in reset_render_context).
That should fix cases with incorrectly applied style overrides
for subs with a banner scrolling effect before the first \r.
Oleg Oshmyan [Thu, 4 Jan 2018 01:37:28 +0000 (03:37 +0200)]
parse_tags: handle argumentless \t inside \t() like VSFilter
\t with no parentheses inside \t() resets the animation parameters
of the \t() for subsequent tags, so they are animated as if the \t()
was the single-argument version regardless of the actual number
of arguments the \t() has.
Equivalently, you could say parentheses are implied for \t inside \t().
For example, \t(20,60,\frx0\t\fry0\frz0) animates \frx from 20 to 60 ms
and animates \fry and \frz for the whole duration of the line,
just like \t(20,60,\frx0)\t(\fry0\frz0) or \t(20,60,\frx0\t(\fry0\frz0)).
Technically, VSFilter simply resets the animation parameters for any \t
it encounters but parses the embedded tags only if the \t has the right
number of arguments. However, top-level animation parameters don't matter
because top-level tags are not animated, while any nested \t that has
parentheses terminates the containing \t because they share the closing
parenthesis, so the fact that a nested \t with empty parentheses or with
at least four arguments changes the animation parameters also doesn't
matter because the containing \t immediately ends and the changed
parameters have nothing to apply to. Thus the only situation where
this has a visible effect is a nested \t without parentheses.
Oleg Oshmyan [Thu, 4 Jan 2018 00:42:09 +0000 (02:42 +0200)]
parse_tags: don't recurse for nested \t()
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4892
(stack overflow on deeply nested \t()).
This is possible because parentheses do not nest and the first ')'
terminates the whole tag. Thus something like \t(\t(\t(\t(\t() can be
read in a simple loop with no recursion required. Recursion is also
not required if the ')' is missing entirely and the outermost \t(...
never ends.
See https://github.com/libass/libass/pull/296 for more backstory.
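The observation that parentheses do not nest can be sketched as a plain loop. This is a deliberately simplified illustration, not the parse_tags code: the helper name `count_t_openings` is hypothetical, and arguments that may precede a nested \t are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count consecutive "\t(" openings at the start
 * of a tag string. Stack use stays constant regardless of depth. */
static size_t count_t_openings(const char *p)
{
    assert(p != NULL);
    size_t depth = 0;
    while (strncmp(p, "\\t(", 3) == 0) {
        depth++;
        p += 3;   /* step past this opening and look for the next one */
    }
    return depth;
}
```

Because each iteration only advances a pointer, an input with thousands of nested openings costs no more stack than an input with one.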
Prefer to link against ApplicationServices to maximize the
portability of binaries built on newer versions of macOS.
The symbol kCTFontURLAttribute, which is checked in this commit, was
introduced in Mac OS X 10.6, the latest of any Core Text symbols that
we use. It is essential to our Core Text font provider, so this is the
earliest version of Mac OS X where we can support this font provider.
The TARGET_OS_IPHONE conditional that this commit adds is necessary to
continue supporting iOS in addition to supporting old Mac OS X. On iOS,
CoreText.h *must* be included to use Core Text, whereas on old Mac OS X,
CoreText.h is not directly accessible and ApplicationServices.h must be
used. On modern macOS, either header works. This conditional is also
used in HarfBuzz.
FT_Vector and FT_BBox types are based on FT_Pos, which is an alias of long.
FreeType treats it as a 32-bit integer, but on some platforms long is
64-bit. That leads to wasted memory and suboptimal performance.
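A minimal sketch of the size difference, assuming fixed-width replacements are used instead of long-based types. `Vec32`, `Rect32` and `VecLong` are hypothetical names chosen for this example, not types from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width replacements for illustration. */
typedef struct { int32_t x, y; } Vec32;
typedef struct { int32_t x_min, y_min, x_max, y_max; } Rect32;

/* Same shape as a vector built on a long-based coordinate type. */
typedef struct { long x, y; } VecLong;

/* Bytes saved per stored vector; 0 where long is 32-bit,
 * typically 8 where long is 64-bit. */
static size_t bytes_saved_per_vector(void)
{
    assert(sizeof(VecLong) >= sizeof(Vec32));
    return sizeof(VecLong) - sizeof(Vec32);
}
```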
Allow using shadow offset to adjust size of text background
Text background refers to the libass-only BorderStyle 4, which is
similar to 3, but isn't affected by outline/border size and doesn't
render shadow, so shadow offset can be used.
You can override the horizontal and vertical box size separately
with override tags, just like you can override the color with
shadow color.
Grigori Goronzy [Thu, 1 Jun 2017 09:25:09 +0000 (11:25 +0200)]
directwrite: fix font collections
DirectWrite's FontFileStream does not actually use the data of a specific
font in a collection, which was an expectation of the existing code. It
simply returns a stream to the underlying file, collection or not. So we
need to get the index of the font. This needs to be done lazily as this
information is only available in a FontFace, which is expensive to
initialize.
Add a new optional font provider function for lazy initialization of the
index and use it. This is similar to the check_postscript callback.
Grigori Goronzy [Wed, 10 May 2017 11:39:57 +0000 (13:39 +0200)]
Fix PlayResX/Y calculations
Prevent PlayResY from being set to 0 when only PlayResX is specified and
is set to 1. A PlayResY of 0 results in divide-by-zero errors.
Also fix the PlayResX calculation for the case where only PlayResY is
specified, for completeness.
Oleg Oshmyan [Sat, 4 Feb 2017 02:02:50 +0000 (04:02 +0200)]
Fix decode_font when size % 4 != 0 or data contains illegal bytes
When given a byte c, decode_chars expects that 0 <= c - 33 <= 63,
i. e. that only the six lowest bits of c - 33 are possibly set.
With this assumption, it shifts and adds together multiple c - 33 values.
When c > 96, c - 33 has high nonzero bits, which interferes with other
shifted terms. c < 33 is even worse: c - 33 is negative (if unsigned char
fits in int), and left-shifting negative numbers has undefined behavior.
Even before the shift, on common platforms with a two's complement
representation of negative integers (or if unsigned char does not fit in
int and is promoted to unsigned int), c - 33 has high nonzero bits, which
again interfere with other shifted terms.
To make matters worse, even perfectly valid encoded data is affected when
size % 4 != 0, as decode_font calls decode_chars with '\0', which leads
decode_chars to shift and add -33, causing undefined behavior and/or
incorrect output.
Take our cue from VSFilter and bit-mask c - 33 to keep only the six
relevant bits. To ensure that we get the same bits as VSFilter when
c < 33 and to avoid the undefined behavior of left-shifting negative
numbers, convert the number to unsigned before masking and shifting.
While we are at it, rewrite decode_chars entirely
to get rid of any GPL code from mkvtoolnix.
Related mkvtoolnix bug: https://github.com/mbunkus/mkvtoolnix/issues/1003
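The masking approach described above might look roughly like this. It is a sketch written for this note, not the rewritten decode_chars, and the helper name `decode_group` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 2..4 encoded characters into 1..3 bytes.
 * Returns the number of bytes written to dst. */
static size_t decode_group(const unsigned char *src, size_t cnt,
                           unsigned char *dst)
{
    assert(cnt >= 2 && cnt <= 4);
    uint32_t value = 0;
    for (size_t i = 0; i < cnt; i++) {
        /* Subtract in unsigned arithmetic and keep six bits, so bytes
         * outside 33..96 cannot disturb the neighbouring fields and
         * no negative number is ever shifted. */
        uint32_t bits = ((uint32_t) src[i] - 33u) & 63u;
        value |= bits << (6 * (3 - i));
    }
    dst[0] = (unsigned char) (value >> 16);
    if (cnt >= 3)
        dst[1] = (unsigned char) ((value >> 8) & 0xFF);
    if (cnt == 4)
        dst[2] = (unsigned char) (value & 0xFF);
    return cnt - 1;
}
```

Passing the real number of remaining characters as `cnt` also avoids feeding padding bytes such as '\0' into the arithmetic when size % 4 != 0.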
Oleg Oshmyan [Fri, 3 Feb 2017 19:40:19 +0000 (21:40 +0200)]
string2timecode: don't truncate milliseconds to int
Commit 8c8741fe2000d4b4d89a53f894363a42288cec3e attempted to fix this
expression and make it use the full range of long long, but it missed
the millisecond term.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=522.
The entire timestamp can still overflow long long though.
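A sketch of the widened expression, assuming the four fields have been parsed into int and the last one is in hundredths of a second. The function name and the field units are assumptions made for this example.

```c
/* Hypothetical helper: combine parsed fields into milliseconds.
 * Every term is computed in long long, including the last one,
 * so no intermediate result is truncated to int. */
static long long timecode_to_ms(int h, int m, int s, int cs)
{
    return ((h * 60LL + m) * 60 + s) * 1000 + cs * 10LL;
}
```

Writing `cs * 10` without the `LL` suffix would multiply in int and overflow for large field values, which is the kind of slip the commit describes.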
Oleg Oshmyan [Fri, 3 Feb 2017 19:34:13 +0000 (21:34 +0200)]
Fix parsing of unusual Alignment values in ASS style definitions
Handle large and negative values except INT32_MIN like VSFilter.
This avoids both overflow and inconsistent internal state.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=523.
VSFilter handles INT32_MIN like a mix of \an1, \an2 and \an3:
* Vertical alignment is bottom.
* Lines within the event are center-aligned.
* Without \pos or \move, the center of the event is aligned
with the right edge of the screen minus MarginR.
* With \pos or \move, the left edge of the event is aligned
with the position point.
* Without \org, the rotation origin is aligned
with the horizontal center of the event.
* (With \org, the rotation origin is as specified.)
If we wanted to emulate this in libass, the cleanest way would be to
introduce a new horizontal alignment constant for this purpose that
would be used only for ASS style definitions with Alignment INT32_MIN.
This commit makes no attempt to do this and instead arbitrarily picks
\an2 for style definitions with Alignment -INT_MAX-1, which equals
INT32_MIN if int is int32_t. The fact that int is platform-dependent
is one of the reasons for this. We could change Alignment to be int32_t
instead of int for perfect VSFilter compatibility, but the same applies
to many other fields that currently use platform-dependent types.
Oleg Oshmyan [Tue, 7 Feb 2017 12:14:07 +0000 (14:14 +0200)]
Travis CI: remove HarfBuzz and re-remove Fontconfig from OS X builds
Installing HarfBuzz through Homebrew seems to be consistently slow
whether we use the bottles and disable the Fontconfig cache or build
it from source and drop Fontconfig and other dependencies entirely.
To speed up OS X builds, disable both HarfBuzz and Fontconfig.
We build with HarfBuzz and Fontconfig on Linux, and we should
not have any platform-dependent code that depends on them,
so this should not reduce our code coverage.
Oleg Oshmyan [Mon, 6 Feb 2017 16:46:22 +0000 (18:46 +0200)]
Travis CI: re-enable Fontconfig on OS X but force no cache built
Building HarfBuzz from source works to avoid Fontconfig, but it is still
fairly slow. To further speed up the build, try to use only the prebuilt
bottle packages (which inevitably brings in Fontconfig as a dependency)
but hack the Fontconfig formula to avoid building the font cache.
Adding Fontconfig is not the goal of this commit, as we already have it
on Linux and our Fontconfig-related code "should" work equally well on
other platforms. But since we can now afford it, explicitly ask Homebrew
to install Fontconfig even if the dependency that brings it in disappears
from Homebrew in the future, and enjoy the improved code coverage.
Oleg Oshmyan [Sun, 5 Feb 2017 18:47:48 +0000 (20:47 +0200)]
Travis CI: build with HarfBuzz
On OS X, disable some unnecessary HarfBuzz dependencies. This triggers
a source build of HarfBuzz, but it should be fast and avoids bringing
in Fontconfig through a dependency chain, which we want to avoid as it
wastes a lot of time building its cache when installed.
The dependency that brings in Fontconfig is gobject-introspection, but
we don't need icu4c either, so disable that to save a little more time
that would be spent installing icu4c. We could also disable glib, but
the fribidi formula also has it as a dependency and brings it in anyway.
Oleg Oshmyan [Sun, 5 Feb 2017 02:24:54 +0000 (04:24 +0200)]
Travis CI: run Coverity Scan on every master build
We never remember to push to the coverity_scan branch, so currently
Coverity Scan never runs. Our master builds are not very frequent,
so we should be able to afford running Coverity Scan on every build.
Since https://blog.travis-ci.com/2016-10-04-osx-73-default-image-live/,
this libtool comes preinstalled on Travis CI, thus the hack is no longer needed.
Homebrew bug report possibly relevant to the original problem:
https://github.com/Homebrew/legacy-homebrew/issues/43874
Oleg Oshmyan [Sun, 5 Feb 2017 01:44:34 +0000 (03:44 +0200)]
Travis CI: don't require Fontconfig binaries
Only the library is needed.
In fact, `apt-get install fontconfig` didn't even install the library at
all. Luckily, the package we actually want is preinstalled on Travis CI.
We could continue to rely on this fact and completely remove Fontconfig
from the install list, but it's clearer and possibly more future-proof
to explicitly list it there.
Oleg Oshmyan [Sun, 5 Feb 2017 00:05:20 +0000 (02:05 +0200)]
Travis CI: disable Fontconfig on OS X
Homebrew generates the Fontconfig cache when installing Fontconfig,
which delays the build by several minutes. Disable the Fontconfig
font provider on OS X to avoid this.
Oleg Oshmyan [Mon, 30 Jan 2017 22:11:49 +0000 (00:11 +0200)]
Reduce precision of border width in outline cache keys
The value used to generate outline cache values is 26.6, so there
is no point in storing the more precise 16.16 in the cache key.
Indeed, this can only reduce the efficiency of the cache
and provide an extra opportunity for overflow.
Oleg Oshmyan [Mon, 30 Jan 2017 21:45:43 +0000 (23:45 +0200)]
Reflect border_scale in outline cache keys
border_scale can change, e. g. when ass_render_frame is called twice with
the same renderer but different tracks. Glyphs with equal \bord tag values
but different border_scale values produce different border outlines and
hence should be distinguished in outline cache keys. To this end, store
scaled border widths (which are really used when generating the outlines)
in cache keys instead of \bord tag values.
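To illustrate why the scaled width belongs in the key, here is a small sketch that derives a 26.6 fixed-point key field from the tag value and the scale. `border_key` is a hypothetical helper, and the rounding mode is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the key stores the scaled border width in
 * 26.6 fixed point, i.e. the value outline generation consumes. */
static int32_t border_key(double bord, double border_scale)
{
    double scaled = bord * border_scale;
    assert(scaled >= 0 && scaled < 1e6);    /* keeps the cast in range */
    return (int32_t) (scaled * 64 + 0.5);   /* round to nearest 1/64 */
}
```

Two glyphs with the same \bord value but different border_scale values now produce different key fields, so they no longer share a cached outline.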
Dr.Smile [Mon, 30 Jan 2017 23:47:58 +0000 (02:47 +0300)]
render: remove redundant has_clips
has_clips was a workaround for the case where a new image reused
the same memory address as another image used in the previous frame.
In case of such reuse, comparison by pointer address failed
to distinguish the different images in ass_detect_change().
After commit dd06ca30ea79ce50116a43cc5521d4eaf60a017e,
images in the previous frame are no longer freed before
the comparison with current frame. Thus no such reuse can occur,
and the workaround is redundant.
wm4 [Fri, 13 Jan 2017 08:19:23 +0000 (09:19 +0100)]
render_api: do not discard old images on reconfiguration
I noticed that when resizing the mpv window while playback is ongoing
and subtitles are shown, subtitles could sometimes get "stuck" on the
screen. The stuck subtitle would remain until the next subtitle event,
or until seeking to a position that has subtitles again.
It turned out that this was a libass change detection bug. The following
steps should reproduce the problem:
1. call ass_render_frame() with a time that has subtitles
2. call ass_set_frame_size() with a different size
3. call ass_render_frame() with a time that has no subtitles
The last call (step 3) will return with *detect_change==0.
To make this worse, libass will deallocate image data before the next
ass_render_frame() or ass_renderer_done(), which violates the API and
could possibly make some API users crash. (That the user can rely on
this is not documented though.)
There are two possible solutions:
1. Set a flag in ass_reconfigure(), that makes the next
ass_render_frame() call always return *detect_change==2.
2. Do not discard the previous subtitles (images_root), so change
detection can work reliably.
This commit implements 2. - I prefer this in part because it doesn't
clobber the previously returned image list before the next
ass_render_frame() call. (As pointed out above, this might be unexpected
behavior to the API user.)
This is a regression and was possibly broken by commit dd06ca and later.
I did not check whether it actually behaved sanely before that change,
but it probably did to a degree.
wm4 [Wed, 11 Jan 2017 06:10:13 +0000 (07:10 +0100)]
render: clip BorderStyle=4 against screen
ASS_Images returned by libass are guaranteed to be clipped. Not doing
this will cause invalid memory accesses in applications which try to use
this guarantee.
Oleg Oshmyan [Tue, 3 Jan 2017 19:20:20 +0000 (21:20 +0200)]
Bump ABI version and release 0.13.6
sizeof(ASS_Style) is actually part of the ABI, so adding the Justify field
in commit e54c123d5a08b6212533ddcced2cb1a50fa3d2b2 broke the ABI even
though we tried to avoid it by placing the field at the end of the struct.
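A small sketch of why appending a field still changes the ABI, for example when structs are laid out in an array and client code indexes it with the old element size. `StyleOld` and `StyleNew` are heavily shortened, hypothetical stand-ins for the two versions of the struct.

```c
#include <stddef.h>

/* Hypothetical, shortened stand-ins for two versions of a struct. */
typedef struct { int MarginV; int Encoding; } StyleOld;
typedef struct { int MarginV; int Encoding; int Justify; } StyleNew;

/* Offset of element `index` as seen by code built with the old header. */
static size_t old_stride_offset(size_t index)
{
    return index * sizeof(StyleOld);
}

/* Offset of the same element as laid out by the new library. */
static size_t new_stride_offset(size_t index)
{
    return index * sizeof(StyleNew);
}
```

Element 0 lines up in both views, but every later element is read from the wrong offset by old clients, even though the existing fields kept their positions inside the struct.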
Oleg Oshmyan [Wed, 28 Dec 2016 19:14:21 +0000 (21:14 +0200)]
Fix buffer overread in parse_tag when end points to a space
When parse_tag is invoked recursively to handle the animated tags inside
a \t tag, the `end` argument is taken from the `end` field of a struct arg
in the enclosing parse_tag. When struct arg is filled by push_arg, this
field is always right-trimmed using rskip_spaces. Ultimately, the inner
parse_tag invocation sees its `end` argument point not to the ')' or '}'
of the \t as it expects but rather to the spaces preceding the ')' or '}'.
At this point, when parse_tag calls skip_spaces, which is ignorant of the
end pointer, it happily skips over the spaces preceding the ')', moving the
pointer past `end`. Subsequent `pointer != end` comparisons in parse_tag
fail (as in fact `pointer > end`), and parse_tag thinks it is still inside
the substring to be parsed.
This is harmless in many cases, but given either of the following inputs,
parse_tag reads past the end of the actual buffer that stores the string:
{\t(\ }
{\t(\ )(}
After this commit, parse_tag knows that `end` can point to a sequence of
spaces and avoids calling skip_spaces on `end`, thus avoiding the overread.
Discovered by OSS-Fuzz.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=194.
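One way to picture the problem is a space-skipping helper that is aware of the end pointer. The commit itself avoids calling skip_spaces on `end` instead; the bounded variant below is only an illustrative alternative, and `skip_spaces_bounded` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant: never advances past end. */
static const char *skip_spaces_bounded(const char *p, const char *end)
{
    assert(p != NULL && end != NULL);
    while (p < end && (*p == ' ' || *p == '\t'))
        p++;
    return p;
}
```

With a right-trimmed `end` that points at the first of several spaces, the helper returns `end` itself, so later `pointer != end` comparisons still behave as intended.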
Oleg Oshmyan [Fri, 4 Nov 2016 14:27:44 +0000 (16:27 +0200)]
ass_strtod: correctly convert large negative exponents
Avoid overflow in dblExp that prevents subnormal numbers from being
generated (or small normal numbers if `double` supports many more
negative exponents than positive): if `10**abs(exp)` would overflow
and we actually want a negative exponent, switch to using precomputed
negative powers of 10 rather than positive.
Also avoid underflow for numbers with a large negative exponent where
the exponent alone underflows but the significand has enough digits to
cancel this out, e. g. in `10e-324` with IEEE 754 double.
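The idea of scaling by precomputed negative powers can be sketched as below. This is not ass_strtod's code: the table, the helper name `scale_down` and the cutoff at 511 are assumptions, and the sketch takes IEEE 754 double and a significand below 1e19 for granted.

```c
#include <assert.h>

/* Negative powers of ten, 10^-(2^i), for i = 0..8. */
static const double neg_pow10[] = {
    1e-1, 1e-2, 1e-4, 1e-8, 1e-16, 1e-32, 1e-64, 1e-128, 1e-256
};

/* Compute frac * 10^-e for e >= 0 without ever forming 10^e. */
static double scale_down(double frac, int e)
{
    assert(e >= 0);
    if (e > 511)
        return frac * 0.0;   /* underflows for any |frac| < 1e19 */
    /* Multiply into frac directly: the significand can offset part of
     * the exponent before the running product gets too small. */
    for (int i = 0; i < 9; i++)
        if (e & (1 << i))
            frac *= neg_pow10[i];
    return frac;
}
```

For `10e-324` the call is `scale_down(10.0, 324)`. Forming 10^324 first would overflow to infinity and give 0, and forming 10^-324 first would underflow to 0, whereas the running product stays nonzero.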
Oleg Oshmyan [Sun, 30 Oct 2016 00:26:00 +0000 (03:26 +0300)]
ass_strtod: skip leading zeros in mantissa
ass_strtod reads at most 18 leading digits of the mantissa.
This previously included zeros, even though they are not significant
digits, e. g. 0.000000000000000001e18 was converted to 0.0.
After this commit, leading zeros before and after the decimal point
will be skipped, so the above number will be correctly converted to 1.0.